Class: Bulkrax::ApplicationParser Abstract

Inherits:
Object
Defined in:
app/parsers/bulkrax/application_parser.rb

Overview

This class is abstract.

Subclass Bulkrax::ApplicationParser to create a parser that handles a specific format (e.g. CSV, BagIt, XML, etc.).

An abstract class that establishes the API for Bulkrax’s import and export parsing.

Direct Known Subclasses

CsvParser, OaiDcParser, XmlParser
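
The subclassing contract described above can be sketched in plain Ruby. This is a minimal illustration, not Bulkrax code: `SketchParser` is a simplified stand-in for Bulkrax::ApplicationParser, and `InMemoryParser` and its `:rows` key are hypothetical. Only the method names (`records`, `total`, `import_supported?`, `export_supported?`) come from this page.

```ruby
# Simplified stand-in for Bulkrax::ApplicationParser (assumption: not the real class).
class SketchParser
  attr_accessor :importerexporter, :headers

  def self.import_supported?
    true
  end

  def self.export_supported?
    false
  end

  def initialize(importerexporter)
    @importerexporter = importerexporter
    @headers = []
  end

  # Abstract: a concrete parser must override this.
  def records(_opts = {})
    raise NotImplementedError, 'must be defined'
  end

  def total
    0
  end
end

# Hypothetical subclass that "parses" an in-memory array of hashes.
class InMemoryParser < SketchParser
  def records(_opts = {})
    @records ||= importerexporter.fetch(:rows)
  end

  def total
    records.size
  end
end

parser = InMemoryParser.new({ rows: [{ 'title' => 'A' }, { 'title' => 'B' }] })
puts parser.total
```

Calling `records` on the stand-in base class raises NotImplementedError, which mirrors how the abstract methods on this page behave until a subclass overrides them.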

Constructor Details

#initialize(importerexporter) ⇒ ApplicationParser

Returns a new instance of ApplicationParser.



# File 'app/parsers/bulkrax/application_parser.rb', line 39

def initialize(importerexporter)
  @importerexporter = importerexporter
  @headers = []
end

Instance Attribute Details

#headers ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 10

def headers
  @headers
end

#importerexporter ⇒ Object Also known as: importer, exporter

# File 'app/parsers/bulkrax/application_parser.rb', line 10

def importerexporter
  @importerexporter
end

Class Method Details

.export_supported? ⇒ TrueClass, FalseClass

TODO:

Convert to `class_attribute :export_supported, default: false, instance_predicate: true` and `class << self; alias export_supported? export_supported; end`

Returns whether this parser supports exports.

Returns:

  • (TrueClass, FalseClass)

    whether this parser supports exports.

# File 'app/parsers/bulkrax/application_parser.rb', line 28

def self.export_supported?
  false
end

.import_supported? ⇒ TrueClass, FalseClass

TODO:

Convert to `class_attribute :import_supported, default: false, instance_predicate: true` and `class << self; alias import_supported? import_supported; end`

Returns whether this parser supports imports.

Returns:

  • (TrueClass, FalseClass)

    whether this parser supports imports.

# File 'app/parsers/bulkrax/application_parser.rb', line 35

def self.import_supported?
  true
end

.parser_fields ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 21

def self.parser_fields
  {}
end

Instance Method Details

#base_path(type = 'import') ⇒ String

Base path for imported and exported files.

Parameters:

  • type (String) (defaults to: 'import')

Returns:

  • (String)

    the base path for files that this parser will “parse”

# File 'app/parsers/bulkrax/application_parser.rb', line 212

def base_path(type = 'import')
  # account for multiple versions of hyku
  is_multitenant = ENV['HYKU_MULTITENANT'] == 'true' || ENV['SETTINGS__MULTITENANCY__ENABLED'] == 'true'
  is_multitenant ? File.join(Bulkrax.send("#{type}_path"), ::Site.instance.account.name) : Bulkrax.send("#{type}_path")
end
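
The branching in #base_path can be sketched as a free-standing method. This is an illustration under stated assumptions: `sketch_base_path` is a hypothetical name, `root` stands in for the configured Bulkrax import or export path, `tenant` stands in for the tenant name, and `env` stands in for ENV so the example can run without touching the real environment.

```ruby
# Hypothetical free-standing version of the logic above.
# root   - stands in for Bulkrax.import_path / Bulkrax.export_path
# tenant - stands in for the name of the current tenant
# env    - stands in for ENV (any object responding to #[])
def sketch_base_path(root, tenant, env = ENV)
  multitenant = env['HYKU_MULTITENANT'] == 'true' ||
                env['SETTINGS__MULTITENANCY__ENABLED'] == 'true'
  # Multitenant installs get one subdirectory per tenant.
  multitenant ? File.join(root, tenant) : root
end

puts sketch_base_path('tmp/imports', 'tenant-a', { 'HYKU_MULTITENANT' => 'true' })
puts sketch_base_path('tmp/imports', 'tenant-a', {})
```

Either environment variable is enough to switch on the per-tenant subdirectory, and only the literal string 'true' counts as enabled.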

#collection_entry_class ⇒ Object

This method is abstract.

Subclass and override #collection_entry_class to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 52

def collection_entry_class
  raise NotImplementedError, 'must be defined'
end

#collections_total ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 331

def collections_total
  0
end

#create_collections ⇒ Object

This method is abstract.

Subclass and override #create_collections to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 176

def create_collections
  raise NotImplementedError, 'must be defined' if importer?
end

#create_file_sets ⇒ Object

This method is abstract.

Subclass and override #create_file_sets to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 186

def create_file_sets
  raise NotImplementedError, 'must be defined' if importer?
end

#create_objects(types = []) ⇒ Object

Parameters:

  • types (Array<Symbol>) (defaults to: [])

    the types of objects that we’ll create.

# File 'app/parsers/bulkrax/application_parser.rb', line 169

def create_objects(types = [])
  types.each do |object_type|
    send("create_#{object_type.pluralize}")
  end
end
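
The dynamic dispatch in #create_objects can be sketched without Rails. This is an illustration only: `DispatchSketch` is a hypothetical class, and the naive `"#{object_type}s"` pluralization is an assumption that replaces ActiveSupport's `pluralize` so the example runs in plain Ruby.

```ruby
# Hypothetical stand-in for a parser; records which create_* methods were called.
class DispatchSketch
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Same shape as the method above, with naive pluralization
  # (assumption: appending "s" is enough for these type names).
  def create_objects(types = [])
    types.each do |object_type|
      send("create_#{object_type}s")
    end
  end

  def create_collections
    @calls << :collections
  end

  def create_works
    @calls << :works
  end
end

sketch = DispatchSketch.new
sketch.create_objects(%w[collection work])
p sketch.calls
```

The order of `types` determines the order in which the `create_*` methods run, so a caller can, for example, create collections before works.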

#create_relationships ⇒ Object

This method is abstract.

Subclass and override #create_relationships to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 191

def create_relationships
  raise NotImplementedError, 'must be defined' if importer?
end

#create_works ⇒ Object

This method is abstract.

Subclass and override #create_works to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 181

def create_works
  raise NotImplementedError, 'must be defined' if importer?
end

#entry_class ⇒ Object

This method is abstract.

Subclass and override #entry_class to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 46

def entry_class
  raise NotImplementedError, 'must be defined'
end

#exporter? ⇒ TrueClass, FalseClass

Returns:

  • (TrueClass, FalseClass)

# File 'app/parsers/bulkrax/application_parser.rb', line 243

def exporter?
  importerexporter.is_a?(Bulkrax::Exporter)
end

#file_set_entry_class ⇒ Object

This method is abstract.

Subclass and override #file_set_entry_class to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 58

def file_set_entry_class
  raise NotImplementedError, 'must be defined'
end

#file_sets_total ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 335

def file_sets_total
  0
end

#find_or_create_entry(entryclass, identifier, type, raw_metadata = nil) ⇒ Object



# File 'app/parsers/bulkrax/application_parser.rb', line 307

def find_or_create_entry(entryclass, identifier, type, raw_metadata = nil)
  entry = entryclass.where(
    importerexporter_id: importerexporter.id,
    importerexporter_type: type,
    identifier: identifier
  ).first_or_create!
  entry.raw_metadata = raw_metadata
  entry.save!
  entry
end

#generated_metadata_mapping ⇒ String

Returns:

  • (String)

# File 'app/parsers/bulkrax/application_parser.rb', line 92

def generated_metadata_mapping
  @generated_metadata_mapping ||= 'generated'
end

#get_field_mapping_hash_for(key) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Raises:

  • (StandardError)


# File 'app/parsers/bulkrax/application_parser.rb', line 121

def get_field_mapping_hash_for(key)
  return instance_variable_get("@#{key}_hash") if instance_variable_get("@#{key}_hash").present?

  mapping = importerexporter.field_mapping.is_a?(Hash) ? importerexporter.field_mapping : {}
  instance_variable_set(
    "@#{key}_hash",
    mapping&.with_indifferent_access&.select { |_, h| h.key?(key) }
  )
  raise StandardError, "more than one #{key} declared: #{instance_variable_get("@#{key}_hash").keys.join(', ')}" if instance_variable_get("@#{key}_hash").length > 1

  instance_variable_get("@#{key}_hash")
end

#import_file_path ⇒ String

Path for the import

Returns:

  • (String)

# File 'app/parsers/bulkrax/application_parser.rb', line 371

def import_file_path
  @import_file_path ||= real_import_file_path
end

#importer? ⇒ TrueClass, FalseClass

Returns:

  • (TrueClass, FalseClass)

# File 'app/parsers/bulkrax/application_parser.rb', line 238

def importer?
  importerexporter.is_a?(Bulkrax::Importer)
end

#invalid_record(message) ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 276

def invalid_record(message)
  current_run.invalid_records ||= ""
  current_run.invalid_records += message
  current_run.save
  ImporterRun.increment_counter(:failed_records, current_run.id)
  ImporterRun.decrement_counter(:enqueued_records, current_run.id) unless ImporterRun.find(current_run.id).enqueued_records <= 0 # rubocop:disable Style/IdenticalConditionalBranches
end

#limit_reached?(limit, index) ⇒ TrueClass, FalseClass

Parameters:

  • limit (Integer)

    limit set on the importerexporter

  • index (Integer)

    index of current iteration

Returns:

  • (TrueClass, FalseClass)


# File 'app/parsers/bulkrax/application_parser.rb', line 250

def limit_reached?(limit, index)
  return false if limit.nil? || limit.zero? # no limit
  index >= limit
end
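
The limit semantics can be illustrated with a small loop. The first method copies the logic shown above as a free-standing method; `take_until_limit` is a hypothetical helper added only for this sketch.

```ruby
# Copy of the logic above as a free-standing method, for illustration.
def limit_reached?(limit, index)
  return false if limit.nil? || limit.zero? # nil or 0 means "no limit"
  index >= limit
end

# Hypothetical helper: collect records until the limit is reached.
def take_until_limit(records, limit)
  processed = []
  records.each_with_index do |record, index|
    break if limit_reached?(limit, index)
    processed << record
  end
  processed
end

p take_until_limit(%w[a b c d], 2)
p take_until_limit(%w[a b c d], 0)
```

Because `index` is zero-based, a limit of 2 stops the loop before the third record, while a limit of `nil` or `0` lets every record through.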

#model_field_mappings ⇒ Array<String>

Returns:

  • (Array<String>)

# File 'app/parsers/bulkrax/application_parser.rb', line 135

def model_field_mappings
  model_mappings = Bulkrax.field_mappings[self.class.to_s]&.dig('model', :from) || []
  model_mappings |= ['model']

  model_mappings
end

#new_entry(entryclass, type) ⇒ Object



# File 'app/parsers/bulkrax/application_parser.rb', line 300

def new_entry(entryclass, type)
  entryclass.new(
    importerexporter_id: importerexporter.id,
    importerexporter_type: type
  )
end

#path_for_import ⇒ String

Path where we’ll store the import metadata and files. This is used for uploaded and cloud files.

Returns:

  • (String)

# File 'app/parsers/bulkrax/application_parser.rb', line 221

def path_for_import
  @path_for_import = File.join(base_path, importerexporter.path_string)
  FileUtils.mkdir_p(@path_for_import) unless File.exist?(@path_for_import)
  @path_for_import
end

#perform_method ⇒ String

Returns:

  • (String)

# File 'app/parsers/bulkrax/application_parser.rb', line 143

def perform_method
  if self.validate_only
    'perform_now'
  else
    'perform_later'
  end
end

#record(identifier, _opts = {}) ⇒ Object



# File 'app/parsers/bulkrax/application_parser.rb', line 319

def record(identifier, _opts = {})
  return @record if @record

  @record = entry_class.new(self, identifier)
  @record.build
  return @record
end

#record_has_source_identifier(record, index) ⇒ TrueClass, FalseClass

Returns:

  • (TrueClass, FalseClass)


# File 'app/parsers/bulkrax/application_parser.rb', line 262

def record_has_source_identifier(record, index)
  if record[source_identifier].blank?
    if Bulkrax.fill_in_blank_source_identifiers.present?
      record[source_identifier] = Bulkrax.fill_in_blank_source_identifiers.call(self, index)
    else
      invalid_record("Missing #{source_identifier} for #{record.to_h}\n")
      false
    end
  else
    true
  end
end
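
The three outcomes of #record_has_source_identifier can be sketched in plain Ruby. Everything here is a simplification under stated assumptions: `ensure_source_identifier` is a hypothetical name, `filler` plays the role of Bulkrax.fill_in_blank_source_identifiers (and takes only the index, whereas the real callable receives the parser and the index), `errors` collects what #invalid_record would log, and a nil-or-empty check replaces ActiveSupport's `blank?`. Unlike the method above, the sketch always returns a boolean.

```ruby
# Hypothetical free-standing version of the check above.
def ensure_source_identifier(record, index, key:, filler: nil, errors: [])
  value = record[key]
  # Plain-Ruby replacement for blank? (assumption: nil or whitespace-only counts as blank).
  return true unless value.nil? || value.to_s.strip.empty?

  if filler
    # Fill in a generated identifier instead of rejecting the record.
    record[key] = filler.call(index)
    true
  else
    errors << "Missing #{key} for #{record}\n"
    false
  end
end

errors = []
filler = ->(index) { "generated-#{index}" }

blank = { source_identifier: '' }
ensure_source_identifier(blank, 3, key: :source_identifier, filler: filler, errors: errors)
p blank[:source_identifier]
```

Without a `filler`, the same blank record is reported in `errors` and the method returns `false`, which corresponds to the record being counted as invalid.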

#records(_opts = {}) ⇒ Object

This method is abstract.

Subclass and override #records to implement behavior for the parser.

Raises:

  • (NotImplementedError)


# File 'app/parsers/bulkrax/application_parser.rb', line 64

def records(_opts = {})
  raise NotImplementedError, 'must be defined'
end

#related_children_parsed_mapping ⇒ String

Returns:

  • (String)

# File 'app/parsers/bulkrax/application_parser.rb', line 116

def related_children_parsed_mapping
  @related_children_parsed_mapping ||= (get_field_mapping_hash_for('related_children_field_mapping')&.keys&.first || 'children')
end

#related_children_raw_mapping ⇒ String, NilClass

Returns:

  • (String, NilClass)

# File 'app/parsers/bulkrax/application_parser.rb', line 110

def related_children_raw_mapping
  @related_children_raw_mapping ||= get_field_mapping_hash_for('related_children_field_mapping')&.values&.first&.[]('from')&.first
end

#related_parents_parsed_mapping ⇒ String

Returns:

  • (String)

See Also:

  • #related_parents_field_mapping

# File 'app/parsers/bulkrax/application_parser.rb', line 104

def related_parents_parsed_mapping
  @related_parents_parsed_mapping ||= (get_field_mapping_hash_for('related_parents_field_mapping')&.keys&.first || 'parents')
end

#related_parents_raw_mapping ⇒ String, NilClass

Returns:

  • (String, NilClass)

# File 'app/parsers/bulkrax/application_parser.rb', line 98

def related_parents_raw_mapping
  @related_parents_raw_mapping ||= get_field_mapping_hash_for('related_parents_field_mapping')&.values&.first&.[]('from')&.first
end

#required_elements ⇒ Array<String>

Returns:

  • (Array<String>)

# File 'app/parsers/bulkrax/application_parser.rb', line 286

def required_elements
  matched_elements = ((importerexporter.mapping.keys || []) & (Bulkrax.required_elements || []))
  unless matched_elements.count == Bulkrax.required_elements.count
    missing_elements = Bulkrax.required_elements - matched_elements
    error_alert = "Missing mapping for at least one required element, missing mappings are: #{missing_elements.join(', ')}"
    raise StandardError, error_alert
  end
  if Bulkrax.fill_in_blank_source_identifiers
    Bulkrax.required_elements
  else
    Bulkrax.required_elements + [source_identifier]
  end
end
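
The mapping check at the start of #required_elements is an array intersection followed by a difference. A minimal sketch, assuming hypothetical inputs: `missing_required_elements` is not a Bulkrax method, `mapping_keys` stands in for `importerexporter.mapping.keys`, and `required` stands in for `Bulkrax.required_elements`.

```ruby
# Hypothetical helper showing the set arithmetic used above.
def missing_required_elements(mapping_keys, required)
  matched = mapping_keys & required # elements that are both mapped and required
  required - matched                # required elements with no mapping
end

p missing_required_elements(%w[title creator], %w[title])
p missing_required_elements(%w[creator], %w[title])
```

An empty result corresponds to the case where the method above proceeds; a non-empty result corresponds to the case where it raises StandardError and lists the missing mappings.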

#retrieve_cloud_files(files) ⇒ Object

Optional; define this if you use Browse Everything for file upload.

# File 'app/parsers/bulkrax/application_parser.rb', line 196

def retrieve_cloud_files(files); end

#setup_export_file ⇒ Object

This method is abstract.

Subclass and override #setup_export_file to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 228

def setup_export_file
  raise NotImplementedError, 'must be defined' if exporter?
end

#source_identifier ⇒ Symbol

Returns the name of the identifying property in the source system from which we’re importing (i.e. not the application that mounts this Bulkrax engine).

Returns:

  • (Symbol)

    the name of the identifying property in the source system from which we’re importing (i.e. not the application that mounts this Bulkrax engine)

# File 'app/parsers/bulkrax/application_parser.rb', line 73

def source_identifier
  @source_identifier ||= get_field_mapping_hash_for('source_identifier')&.values&.first&.[]('from')&.first&.to_sym || :source_identifier
end

#total ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 327

def total
  0
end

#unzip(file_to_unzip) ⇒ Object



# File 'app/parsers/bulkrax/application_parser.rb', line 344

def unzip(file_to_unzip)
  Zip::File.open(file_to_unzip) do |zip_file|
    zip_file.each do |entry|
      entry_path = File.join(importer_unzip_path, entry.name)
      FileUtils.mkdir_p(File.dirname(entry_path))
      zip_file.extract(entry, entry_path) unless File.exist?(entry_path)
    end
  end
end

#valid_import? ⇒ TrueClass, FalseClass

Override to add specific validations

Returns:

  • (TrueClass, FalseClass)

# File 'app/parsers/bulkrax/application_parser.rb', line 257

def valid_import?
  true
end

#visibility ⇒ String

The visibility of the record. Acceptable values are: “open”, “embargo”, “lease”, “authenticated”, “restricted”. The default is “open”.

# File 'app/parsers/bulkrax/application_parser.rb', line 156

def visibility
  @visibility ||= self.parser_fields['visibility'] || 'open'
end

#work_identifier ⇒ Symbol

Returns the name of the identifying property for the system which we’re importing into (e.g. the application that mounts this Bulkrax engine).

Returns:

  • (Symbol)

    the name of the identifying property for the system which we’re importing into (e.g. the application that mounts this Bulkrax engine)

# File 'app/parsers/bulkrax/application_parser.rb', line 80

def work_identifier
  @work_identifier ||= get_field_mapping_hash_for('source_identifier')&.keys&.first&.to_sym || :source
end

#work_identifier_search_field ⇒ Symbol

Returns the Solr property of the source_identifier. Used for searching. Defaults to the work_identifier value + “_sim”.

Returns:

  • (Symbol)

    the Solr property of the source_identifier. Used for searching. Defaults to the work_identifier value + “_sim”

# File 'app/parsers/bulkrax/application_parser.rb', line 87

def work_identifier_search_field
  @work_identifier_search_field ||= get_field_mapping_hash_for('source_identifier')&.values&.first&.[]('search_field')&.first&.to_s || "#{work_identifier}_sim"
end
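
How #source_identifier, #work_identifier and #work_identifier_search_field read the same field-mapping entry can be sketched in plain Ruby. The mapping below is a made-up example rather than a Bulkrax default, and `mapping_hash_for` is a hypothetical simplification of #get_field_mapping_hash_for that omits memoization and `with_indifferent_access`.

```ruby
# Made-up field mapping: the "source" property is flagged as the source identifier.
field_mapping = {
  'title'  => { 'from' => ['title'] },
  'source' => { 'from' => ['source_id'], 'source_identifier' => true }
}

# Hypothetical simplification: keep only the entries that declare `key`,
# and refuse mappings that declare it more than once.
def mapping_hash_for(mapping, key)
  selected = mapping.select { |_, h| h.key?(key) }
  raise StandardError, "more than one #{key} declared: #{selected.keys.join(', ')}" if selected.length > 1
  selected
end

entry = mapping_hash_for(field_mapping, 'source_identifier')

# The key of the flagged entry names the property in the importing application.
work_identifier = entry.keys.first&.to_sym || :source
# The first "from" value names the property in the source system.
source_identifier = entry.values.first&.[]('from')&.first&.to_sym || :source_identifier
# With no "search_field" configured, fall back to "<work_identifier>_sim".
search_field = entry.values.first&.[]('search_field')&.first&.to_s || "#{work_identifier}_sim"

p work_identifier, source_identifier, search_field
```

With an empty mapping, all three values fall back to their defaults: `:source`, `:source_identifier` and `"source_sim"`.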

#write ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 339

def write
  write_files
  zip
end

#write_files ⇒ Object

This method is abstract.

Subclass and override #write_files to implement behavior for the parser.

Raises:

  • (NotImplementedError)

# File 'app/parsers/bulkrax/application_parser.rb', line 233

def write_files
  raise NotImplementedError, 'must be defined' if exporter?
end

#write_import_file(file) ⇒ Object

Parameters:

  • file (#path, #original_filename)

    the file object with the relevant data for the import.

# File 'app/parsers/bulkrax/application_parser.rb', line 200

def write_import_file(file)
  path = File.join(path_for_import, file.original_filename)
  FileUtils.mv(
    file.path,
    path
  )
  path
end

#zip ⇒ Object

# File 'app/parsers/bulkrax/application_parser.rb', line 354

def zip
  FileUtils.mkdir_p(exporter_export_zip_path)

  Dir["#{exporter_export_path}/**"].each do |folder|
    zip_path = "#{exporter_export_zip_path.split('/').last}_#{folder.split('/').last}.zip"
    FileUtils.rm_rf("#{exporter_export_zip_path}/#{zip_path}")

    Zip::File.open(File.join("#{exporter_export_zip_path}/#{zip_path}"), create: true) do |zip_file|
      Dir["#{folder}/**/**"].each do |file|
        zip_file.add(file.sub("#{folder}/", ''), file)
      end
    end
  end
end