Class: IiifPrint::SplitPdfs::ChildWorkCreationFromPdfService

Inherits:
Object
Defined in:
lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb

Overview

Encapsulates the methods used to split a PDF into child works.

The primary point of entry is ChildWorkCreationFromPdfService.conditionally_enqueue.

Class Method Details

.conditionally_enqueue(file_set:, file:, user:, import_url: nil, work: nil) ⇒ Symbol, TrueClass

Responsible for conditionally enqueueing the PDF splitting job. The conditions attempt to detect whether the given file is a PDF.

Parameters:

  • file_set (FileSet)

    The file set that contains the provided file.

  • file (#path, #id)
  • user (User)

    Who did the upload?

  • import_url (NilClass, String) (defaults to: nil)

    Set when the file was supplied via a URL.

  • work (Hydra::PCDM::Work) (defaults to: nil)

    Optional. When given, it saves the query for the parent of the given file_set (see IiifPrint.parent_for).

Returns:

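  • (Symbol)

    when we don’t enqueue the job; one of :no_split_for_parent, :no_pdfs_to_split_for_import_url, or :no_pdfs_to_split

  • (TrueClass)

    when we actually enqueue the underlying job. In the implementation shown for this method, this case returns the Symbol :enqueued.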



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 25

def self.conditionally_enqueue(file_set:, file:, user:, import_url: nil, work: nil)
  work ||= IiifPrint.parent_for(file_set)

  return :no_split_for_parent unless iiif_print_split?(work: work)
  return :no_pdfs_to_split_for_import_url if import_url && !pdfs?(paths: [import_url])

  file_locations = if import_url
                     # TODO: Fix this logic, currently unsupported in Bulkrax
                     [Hyrax::WorkingDirectory.find_or_retrieve(file.id, file_set.id)]
                   else
                     pdf_paths(file: file)
                   end
  return :no_pdfs_to_split if file_locations.empty?

  IiifPrint.conditionally_submit_split_for(work: work, file_set: file_set, locations: file_locations, user: user)
  :enqueued
end
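
A caller can branch on the returned symbol. The following is a minimal sketch and not part of IiifPrint: the ENQUEUE_OUTCOMES table and the describe_enqueue_result helper are hypothetical names, and the messages are paraphrased from the guard clauses in the listing above.

```ruby
# Hypothetical helper (not part of IiifPrint): maps each symbol that
# conditionally_enqueue returns in the listing above to a log message.
ENQUEUE_OUTCOMES = {
  no_split_for_parent: "parent work is not configured for PDF splitting",
  no_pdfs_to_split_for_import_url: "import_url does not look like a PDF",
  no_pdfs_to_split: "no PDF file locations were found",
  enqueued: "split submission was attempted"
}.freeze

# Falls back to a generic message for any value not listed above.
def describe_enqueue_result(result)
  ENQUEUE_OUTCOMES.fetch(result) { "unrecognized result: #{result.inspect}" }
end

puts describe_enqueue_result(:no_pdfs_to_split)
```

Using fetch with a fallback block keeps the helper from raising if the set of return values changes.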

.filter_file_ids(input) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 95

def self.filter_file_ids(input)
  Array.wrap(input).select(&:present?)
end
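
The effect of Array.wrap(input).select(&:present?) can be approximated in plain Ruby. This is only a sketch: filter_ids is a hypothetical name, and the blank check below is a simplified stand-in for ActiveSupport's present?, not a faithful copy of it.

```ruby
# Plain-Ruby approximation of Array.wrap(input).select(&:present?).
# `filter_ids` is a hypothetical name; the real method uses ActiveSupport.
def filter_ids(input)
  list =
    if input.nil?
      []
    elsif input.respond_to?(:to_ary)
      input.to_ary
    else
      [input]
    end

  # Drop nil, false, whitespace-only strings, and empty collections.
  list.reject do |value|
    value.nil? || value == false ||
      (value.is_a?(String) && value.strip.empty?) ||
      (value.respond_to?(:empty?) && value.empty?)
  end
end

p filter_ids("")
p filter_ids(["12", " ", nil])
```

This is why pdf_paths can safely pass file.id.to_s: a nil id becomes an empty string, which is filtered out and leaves an empty array.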

.iiif_print_split?(work:) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Is child work splitting defined for the model?

Parameters:

  • work (GenericWork, etc.)

    A valid type of Hyrax work

Returns:

  • (Boolean)


# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 84

def self.iiif_print_split?(work:)
  config = work.try(:iiif_print_config)
  return false unless config
  return false if config.pdf_splitter_service.try(:never_split_pdfs?)
  # defined only if work has include IiifPrint.model_configuration with pdf_split_child_model
  return true if config&.pdf_split_child_model
  false
end
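
The decision order above (no config, then an opt-out on the splitter service, then the presence of a child model) can be exercised without Hyrax. The sketch below assumes nothing about the real classes: DemoWork, DemoConfig, DemoNeverSplit, and split_enabled? are hypothetical stand-ins, and respond_to? replaces ActiveSupport's try.

```ruby
# Hypothetical stand-ins for a work, its iiif_print_config, and a splitter
# service that opts out of splitting. None of these are IiifPrint classes.
DemoConfig = Struct.new(:pdf_split_child_model, :pdf_splitter_service)
DemoWork = Struct.new(:iiif_print_config)

class DemoNeverSplit
  def self.never_split_pdfs?
    true
  end
end

# Mirrors the guard order of iiif_print_split? using plain Ruby.
def split_enabled?(work)
  config = work.respond_to?(:iiif_print_config) ? work.iiif_print_config : nil
  return false unless config

  service = config.pdf_splitter_service
  return false if service.respond_to?(:never_split_pdfs?) && service.never_split_pdfs?

  !!config.pdf_split_child_model
end

p split_enabled?(DemoWork.new(DemoConfig.new(:DemoChildWork, nil)))
p split_enabled?(DemoWork.new(DemoConfig.new(:DemoChildWork, DemoNeverSplit)))
```

Note that the opt-out check runs before the child model check, so a service answering true to never_split_pdfs? wins even when a child model is configured.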

.pdf_paths(file:) ⇒ Array

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Loads an array of paths to PDF files.

Parameters:

  • file (#id, or a Valkyrie::Resource)

    the uploaded file; its id is used to look up Hyrax::UploadedFile records

Returns:

  • (Array<String>)

    file paths to temp directory



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 61

def self.pdf_paths(file:)
  return [] unless file

  if file.class < Valkyrie::Resource
    # assuming that if one PDF is uploaded to a Valkyrie resource then all of them should be
    paths = [Hyrax.storage_adapter.file_path(file.file_identifier)]
    pdfs_only_for(paths)
  else
    upload_ids = filter_file_ids(file.id.to_s)
    return [] if upload_ids.empty?

    uploads = Hyrax::UploadedFile.find(upload_ids)
    paths = uploads.map(&method(:upload_path))
    pdfs_only_for(paths)
  end
end
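
The branch above hinges on file.class < Valkyrie::Resource, which uses Ruby's class comparison operator. As a small illustration with hypothetical classes (DemoResource stands in for Valkyrie::Resource), the operator is true only for strict subclasses:

```ruby
# DemoResource stands in for Valkyrie::Resource; both class names below
# are hypothetical and exist only for this illustration.
class DemoResource; end
class DemoFileMetadata < DemoResource; end

p(DemoFileMetadata < DemoResource) # => true  (strict subclass)
p(DemoResource < DemoResource)     # => false (same class)
p(String < DemoResource)           # => nil   (unrelated class)
```

Both false and nil are falsy, so in pdf_paths any file whose class is not a strict subclass of Valkyrie::Resource falls through to the Hyrax::UploadedFile lookup.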

.pdfs?(paths:) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Are there any PDF files?

Parameters:

  • paths (Array<String>)

    paths to PDFs

Returns:

  • (Boolean)


# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 50

def self.pdfs?(paths:)
  pdf_paths = pdfs_only_for(paths)
  return false unless pdf_paths.count.positive?
  true
end

.pdfs_only_for(paths) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

TODO: Consider other methods to identify a PDF file.

This sub-selection may need to use the MIME type instead if there is a need to support paths that do not end in .pdf (e.g. remote URLs).


# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 116

def self.pdfs_only_for(paths)
  paths.select { |path| IiifPrint.split_for_path_suffix?(path) }
end
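
To illustrate the limitation the TODO describes, here is a sketch of suffix-based filtering. It assumes a case-insensitive check on the file extension; the actual implementation of IiifPrint.split_for_path_suffix? is not shown on this page and may differ. The names pdf_suffix? and pdfs_only are hypothetical.

```ruby
# Assumption: a case-insensitive ".pdf" extension check, similar in spirit
# to IiifPrint.split_for_path_suffix? (whose real logic is not shown here).
def pdf_suffix?(path)
  File.extname(path.to_s).casecmp?(".pdf")
end

# Hypothetical counterpart of pdfs_only_for built on the check above.
def pdfs_only(paths)
  paths.select { |path| pdf_suffix?(path) }
end

paths = [
  "/tmp/report.pdf",
  "/tmp/scan.PDF",
  "/tmp/cover.png",
  "https://example.org/download?id=42" # may serve a PDF, but has no suffix
]
p pdfs_only(paths)
```

A remote URL without a .pdf suffix is dropped by this kind of check even when it serves a PDF, which is the case a MIME-type-based test would need to cover.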

.upload_path(upload) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Given a Hyrax::UploadedFile object, returns the path to the file on the local filesystem.



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 103

def self.upload_path(upload)
  # so many layers to this onion:
  # TODO: Write a recursive function to keep calling file until
  # the file doesn't respond to file then return that file.
  upload.file.file.file
end
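
The TODO in the listing above suggests unwrapping until the object no longer responds to file. One possible shape for that, written as a loop rather than a recursive function, is sketched below. DemoWrapper and innermost_file are hypothetical names, and it is an assumption that the innermost object in a real upload does not itself respond to file.

```ruby
# DemoWrapper is a hypothetical stand-in for each layer that exposes #file.
DemoWrapper = Struct.new(:file)

# Keeps calling #file until the current object no longer responds to it.
def innermost_file(object)
  object = object.file while object.respond_to?(:file)
  object
end

upload = DemoWrapper.new(DemoWrapper.new(DemoWrapper.new("/tmp/uploads/a.pdf")))
p innermost_file(upload)
```

Unlike the fixed upload.file.file.file chain, this does not depend on the number of layers, but it would over-unwrap if the final object also defined file.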