Class: IiifPrint::SplitPdfs::ChildWorkCreationFromPdfService

Inherits:
Object
Defined in:
lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb

Overview

Encapsulates the methods used to split PDFs into child works.

The primary point of entry is ChildWorkCreationFromPdfService.conditionally_enqueue.

Class Method Details

.conditionally_enqueue(file_set:, file:, user:, import_url: nil, work: nil) ⇒ Symbol

Responsible for conditionally enqueueing the PDF splitting job. The conditions attempt to sniff out whether the given file is a PDF.

Parameters:

  • file_set (FileSet)

    The file set that contains the provided file.

  • file (#path, #id)
  • user (User)

    The user who performed the upload.

  • import_url (NilClass, String) (defaults to: nil)

    Provided when the file was supplied via a URL.

  • work (Hydra::PCDM::Work) (defaults to: nil)

    An optional parameter that saves a query for the parent of the given :file_set (see parent_for).

Returns:

  • (Symbol)

    :enqueued when the underlying job is enqueued; otherwise a symbol naming the reason it was not (:no_split_for_parent, :no_pdfs_to_split_for_import_url, :no_pdfs_to_split, or :pdf_job_failed_enqueue).



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 25

def self.conditionally_enqueue(file_set:, file:, user:, import_url: nil, work: nil)
  work ||= IiifPrint.parent_for(file_set)

  return :no_split_for_parent unless iiif_print_split?(work: work)
  return :no_pdfs_to_split_for_import_url if import_url && !pdfs?(paths: [import_url])

  file_locations = if import_url
                     # TODO: Fix this logic, currently unsupported in Bulkrax
                     [Hyrax::WorkingDirectory.find_or_retrieve(file.id, file_set.id)]
                   else
                     pdf_paths(file: file)
                   end
  return :no_pdfs_to_split if file_locations.empty?

  file_set_id = file_set.id.try(:id) || file_set.id
  work_admin_set_id = work.admin_set_id.try(:id) || work.admin_set_id
  job = work.try(:iiif_print_config)&.pdf_splitter_job&.perform_later(
    file_set_id,
    file_locations,
    user,
    work_admin_set_id,
    0 # A no longer used parameter; but we need to preserve the method signature (for now)
  )
  job ? :enqueued : :pdf_job_failed_enqueue
end
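
The method body above returns one of five symbols, so a caller can branch on the result rather than treat it as a boolean. The following is a minimal sketch of that idea; `describe_enqueue_result` and `ENQUEUE_MESSAGES` are hypothetical names that are not part of IiifPrint, and the message wording is illustrative only.

```ruby
# Hypothetical helper (not part of IiifPrint): maps the symbol returned by
# conditionally_enqueue to a log-friendly message. The keys are the symbols
# that appear in the method body; the messages are made up for this sketch.
ENQUEUE_MESSAGES = {
  enqueued: "PDF splitting job enqueued",
  no_split_for_parent: "parent work is not configured for PDF splitting",
  no_pdfs_to_split_for_import_url: "import URL does not look like a PDF",
  no_pdfs_to_split: "no PDF files found for the upload",
  pdf_job_failed_enqueue: "splitter job could not be enqueued"
}.freeze

def describe_enqueue_result(result)
  # Fall back to a generic message for anything outside the known set.
  ENQUEUE_MESSAGES.fetch(result) { "unexpected result: #{result.inspect}" }
end

puts describe_enqueue_result(:no_pdfs_to_split)
```

Because every outcome, including success, is a truthy symbol, a plain `if result` check would not distinguish an enqueued job from a skipped one.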

.filter_file_ids(input) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 103

def self.filter_file_ids(input)
  Array.wrap(input).select(&:present?)
end
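
To illustrate what the filter does, here is a plain-Ruby approximation that runs without ActiveSupport. `filter_ids_sketch` is a hypothetical name; `Array()` stands in for `Array.wrap`, and the blank check only mirrors `present?` for nil and string inputs, which are the cases shown.

```ruby
# Plain-Ruby approximation of Array.wrap(input).select(&:present?).
# Assumption: inputs are nil, a String, or an Array of those. Other types
# (for example false or a Hash) may behave differently under ActiveSupport.
def filter_ids_sketch(input)
  Array(input).reject { |id| id.nil? || id.to_s.strip.empty? }
end

p filter_ids_sketch("42")                 # => ["42"]
p filter_ids_sketch(["7", "", nil, "9"])  # => ["7", "9"]
p filter_ids_sketch("")                   # => []
```

The last case matters for pdf_paths, which passes `file.id.to_s`: a file without an id becomes an empty string and is filtered out.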

.iiif_print_split?(work:) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Is child work splitting defined for the work's model?

Parameters:

  • work (GenericWork, etc.)

    A valid type of Hyrax work.

Returns:

  • (Boolean)


# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 92

def self.iiif_print_split?(work:)
  config = work.try(:iiif_print_config)
  return false unless config
  return false if config.pdf_splitter_service.try(:never_split_pdfs?)
  # defined only if work has include IiifPrint.model_configuration with pdf_split_child_model
  return true if config&.pdf_split_child_model
  false
end
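
The three guard clauses can be exercised with stand-in objects. This is only a sketch: `SketchConfig`, `SketchWork`, `NeverSplitService`, and `split_sketch?` are hypothetical names, and `respond_to?` is used in place of ActiveSupport's `try` so the example runs in plain Ruby.

```ruby
# Hypothetical stand-ins for a work and its iiif_print_config.
SketchConfig = Struct.new(:pdf_splitter_service, :pdf_split_child_model, keyword_init: true)
SketchWork = Struct.new(:iiif_print_config)

# A splitter service that opts out of splitting entirely.
class NeverSplitService
  def self.never_split_pdfs?
    true
  end
end

def split_sketch?(work)
  # No configuration at all: the model never included the IiifPrint setup.
  config = work.respond_to?(:iiif_print_config) ? work.iiif_print_config : nil
  return false unless config

  # The configured service may veto splitting.
  service = config.pdf_splitter_service
  return false if service.respond_to?(:never_split_pdfs?) && service.never_split_pdfs?

  # Splitting needs a child model to create works from.
  config.pdf_split_child_model ? true : false
end

configured = SketchWork.new(SketchConfig.new(pdf_split_child_model: Object))
vetoed = SketchWork.new(SketchConfig.new(pdf_splitter_service: NeverSplitService,
                                         pdf_split_child_model: Object))

puts split_sketch?(configured)  # => true
puts split_sketch?(vetoed)      # => false
puts split_sketch?(Object.new)  # => false
```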

.pdf_paths(file:) ⇒ Array

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Load an array of paths to pdf files

Parameters:

  • file (Array<Hyrax::Upload>)

    file ids

Returns:

  • (Array<String>)

    file paths to temp directory



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 69

def self.pdf_paths(file:)
  return [] unless file

  if file.class < Valkyrie::Resource
    # assuming that if one PDF is uploaded to a Valkyrie resource then all of them should be
    return [] unless file.pdf?
    [file.file.disk_path.to_s]
  else
    upload_ids = filter_file_ids(file.id.to_s)
    return [] if upload_ids.empty?

    uploads = Hyrax::UploadedFile.find(upload_ids)
    paths = uploads.map(&method(:upload_path))
    pdfs_only_for(paths)
  end
end

.pdfs?(paths:) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Are there any PDF files?

Parameters:

  • paths (Array<String>)

    paths to PDFs

Returns:

  • (Boolean)


# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 58

def self.pdfs?(paths:)
  pdf_paths = pdfs_only_for(paths)
  return false unless pdf_paths.count.positive?
  true
end

.pdfs_only_for(paths) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

TODO: Consider other methods to identify a PDF file.

This sub-selection may need to move to a MIME-type check if there is a need to support paths that do not end in .pdf (e.g. remote_urls).


# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 124

def self.pdfs_only_for(paths)
  paths.select { |path| IiifPrint.split_for_path_suffix?(path) }
end
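
As one possible direction for the TODO above, a file that is already on local disk can be identified by its leading bytes instead of its suffix, since PDF files start with the `%PDF-` header. The sketch below is not what IiifPrint does; `pdf_by_header?` and `PDF_MAGIC` are hypothetical names, the check is strict (the header must be the very first bytes), and it does not help with remote URLs that have not been downloaded.

```ruby
require "tempfile"

# The header that opens a PDF file, as a binary string.
PDF_MAGIC = "%PDF-".b.freeze

# Hypothetical helper: true when the file at +path+ starts with the PDF header.
# Unreadable or missing files are treated as "not a PDF".
def pdf_by_header?(path)
  File.open(path, "rb") { |io| io.read(PDF_MAGIC.bytesize) == PDF_MAGIC }
rescue SystemCallError
  false
end

# Usage: the suffix is deliberately not .pdf.
Tempfile.create(["upload", ".bin"]) do |file|
  file.binmode
  file.write("%PDF-1.7\n")
  file.flush
  puts pdf_by_header?(file.path)
end
```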

.upload_path(upload) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Given a Hyrax::Upload object, return the path to the file on the local filesystem.



# File 'lib/iiif_print/split_pdfs/child_work_creation_from_pdf_service.rb', line 111

def self.upload_path(upload)
  # so many layers to this onion:
  # TODO: Write a recursive function to keep calling file until
  # the file doesn't respond to file then return that file.
  upload.file.file.file
end
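
The TODO in the method body could be approached along these lines. This sketch uses a loop rather than recursion, and `innermost_file` and `UploadLayer` are hypothetical names. It assumes the innermost object (a path string here) does not itself respond to `file`, so its result can differ from the fixed three-call chain when the nesting depth is not exactly three.

```ruby
# Hypothetical helper: keep calling #file until the object no longer
# responds to it, then return that innermost object.
def innermost_file(object)
  object = object.file while object.respond_to?(:file)
  object
end

# Stand-in for the nested wrappers around an uploaded file.
UploadLayer = Struct.new(:file)

upload = UploadLayer.new(UploadLayer.new(UploadLayer.new("/tmp/uploads/report.pdf")))
puts innermost_file(upload)  # => /tmp/uploads/report.pdf
```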