Class: Dor::SdrIngestService

Inherits:
Object
Defined in:
lib/dor/services/sdr_ingest_service.rb

Class Method Details

.extract_datastreams(dor_item, workspace) ⇒ Pathname

Pulls all the datastreams specified in the configuration file into the workspace’s metadata directory, overwriting any existing files.

Parameters:

  • dor_item (Dor::Item)

    The representation of the digital object

  • workspace (DruidTools::Druid)

    The representation of the item’s work area

Returns:

  • (Pathname)

    The workspace’s metadata directory, into which the configured datastreams have been written



# File 'lib/dor/services/sdr_ingest_service.rb', line 54

def self.extract_datastreams(dor_item, workspace)
  metadata_dir = Pathname.new(workspace.path('metadata', true))
  Config.sdr.datastreams.to_hash.each_pair do |ds_name, required|
    ds_name = ds_name.to_s
    metadata_file = metadata_dir.join("#{ds_name}.xml")
    metadata_string = get_datastream_content(dor_item, ds_name, required)
    metadata_file.open('w') { |f| f << metadata_string } if metadata_string
  end
  metadata_dir
end
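
The write-if-present pattern above can be tried without a DOR installation. The following is a minimal sketch, assuming plain hashes in place of `Config.sdr.datastreams` and the DOR item; `write_datastreams` and both hashes are hypothetical names used only for illustration.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper: writes one "<name>.xml" file per configured datastream
# that has content, and skips datastreams with no content.
# config:  datastream name => 'required' or 'optional' (stand-in for Config.sdr.datastreams)
# content: datastream name => XML text (stand-in for the DOR item)
def write_datastreams(metadata_dir, config, content)
  config.each_pair do |ds_name, _required|
    ds_name = ds_name.to_s
    xml = content[ds_name]
    metadata_dir.join("#{ds_name}.xml").open('w') { |f| f << xml } if xml
  end
  metadata_dir
end

Dir.mktmpdir do |tmp|
  config  = { contentMetadata: 'required', rightsMetadata: 'optional' }
  content = { 'contentMetadata' => '<contentMetadata/>' }
  dir = write_datastreams(Pathname.new(tmp), config, content)
  puts dir.children.map { |p| p.basename.to_s }.inspect # prints ["contentMetadata.xml"]
end
```

Opening each file in `'w'` mode is what gives the overwrite behavior: an existing file of the same name is truncated before the new text is written.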

.get_content_inventory(metadata_dir, druid, version_id) ⇒ Moab::FileInventory

Parses the contentMetadata and generates a new version inventory object containing a content group.

Parameters:

  • metadata_dir (Pathname)

    The location of the object’s metadata files

  • druid (String)

    The object identifier

  • version_id (Integer)

    The version number

Returns:

  • (Moab::FileInventory)

    A new version inventory object containing a content group generated from the contentMetadata, or an empty version inventory if there is no contentMetadata.xml



# File 'lib/dor/services/sdr_ingest_service.rb', line 122

def self.get_content_inventory(metadata_dir, druid, version_id)
  content_metadata = get_content_metadata(metadata_dir)
  if content_metadata
    Stanford::ContentInventory.new.inventory_from_cm(content_metadata, druid, 'preserve', version_id)
  else
    Moab::FileInventory.new(:type => 'version', :digital_object_id => druid, :version_id => version_id)
  end
end

.get_content_metadata(metadata_dir) ⇒ String

Returns the contents of the contentMetadata.xml file from the metadata directory, or nil if the file does not exist.

Parameters:

  • metadata_dir (Pathname)

    The location of the object’s metadata files

Returns:

  • (String)

    The contents of the contentMetadata.xml file from the metadata directory, or nil if the file does not exist



# File 'lib/dor/services/sdr_ingest_service.rb', line 133

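def self.get_content_metadata(metadata_dir)
  content_metadata_pathname = metadata_dir.join('contentMetadata.xml')
  content_metadata_pathname.read if content_metadata_pathname.exist?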
end
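
Because the read is guarded by a trailing `if`, the method evaluates to nil when the file is absent. A minimal sketch of that behavior, assuming a temporary directory in place of a real workspace; `read_content_metadata` is a hypothetical standalone name.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical standalone helper: returns the file's text if it exists, else nil.
def read_content_metadata(metadata_dir)
  path = metadata_dir.join('contentMetadata.xml')
  path.read if path.exist?
end

Dir.mktmpdir do |tmp|
  metadata_dir = Pathname.new(tmp)
  p read_content_metadata(metadata_dir) # prints nil (file absent)
  metadata_dir.join('contentMetadata.xml').write('<contentMetadata/>')
  puts read_content_metadata(metadata_dir) # prints <contentMetadata/>
end
```

Callers such as `get_content_inventory` can then branch on the truthiness of the result.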

.get_datastream_content(dor_item, ds_name, required) ⇒ String

Returns the XML text of the specified datastream if it exists. If the datastream is not found, returns nil, unless the datastream is required, in which case an exception is raised.

Parameters:

  • dor_item (Dor::Item)

    The representation of the digital object

  • ds_name (String)

    The name of the desired Fedora datastream

  • required (String)

    Enumeration: one of [‘required’, ‘optional’]

Returns:

  • (String)

    The XML text of the specified datastream if it exists. If the datastream is not found, nil, unless the datastream is required, in which case an exception is raised



# File 'lib/dor/services/sdr_ingest_service.rb', line 70

def self.get_datastream_content(dor_item, ds_name, required)
  ds = (ds_name == 'relationshipMetadata' ? 'RELS-EXT' : ds_name)
  if dor_item.datastreams.keys.include?(ds) && !dor_item.datastreams[ds].new?
    return dor_item.datastreams[ds].content
  elsif required == 'optional'
    return nil
  else
    raise "required datastream #{ds_name} not found in DOR"
  end
end
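
The three outcomes (content, nil, exception) can be exercised with stand-in objects. This is a sketch only: `FakeDatastream`, `FakeItem`, and `datastream_content` are hypothetical names, and the structs imitate just the `datastreams`, `content`, and `new?` calls that the method above relies on.

```ruby
# Hypothetical stand-ins for a Fedora datastream and a DOR item.
FakeDatastream = Struct.new(:content, :is_new) do
  def new?
    is_new
  end
end
FakeItem = Struct.new(:datastreams)

# Same decision logic as the documented method, written against the stand-ins.
def datastream_content(item, ds_name, required)
  # relationshipMetadata is stored under the RELS-EXT datastream id
  ds = (ds_name == 'relationshipMetadata' ? 'RELS-EXT' : ds_name)
  stream = item.datastreams[ds]
  return stream.content if stream && !stream.new?
  return nil if required == 'optional'

  raise "required datastream #{ds_name} not found in DOR"
end

item = FakeItem.new({ 'RELS-EXT' => FakeDatastream.new('<rdf/>', false) })

puts datastream_content(item, 'relationshipMetadata', 'required') # prints <rdf/>
p datastream_content(item, 'rightsMetadata', 'optional')          # prints nil
begin
  datastream_content(item, 'contentMetadata', 'required')
rescue RuntimeError => e
  puts e.message # prints required datastream contentMetadata not found in DOR
end
```

Note that any value other than 'optional' is treated as required, since the exception sits in the final `else` branch of the original method.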

.get_metadata_file_group(metadata_dir) ⇒ Moab::FileGroup

Traverses the metadata directory and generates a metadata group.

Parameters:

  • metadata_dir (Pathname)

    The location of the object’s metadata files

Returns:

  • (Moab::FileGroup)

    A metadata group generated by traversing the metadata directory



# File 'lib/dor/services/sdr_ingest_service.rb', line 140

def self.get_metadata_file_group(metadata_dir)
  file_group = Moab::FileGroup.new(:group_id => 'metadata').group_from_directory(metadata_dir)
  file_group
end

.get_signature_catalog(druid) ⇒ Moab::SignatureCatalog

Returns the catalog of all files previously ingested.

Parameters:

  • druid (String)

    The object identifier

Returns:

  • (Moab::SignatureCatalog)

    the catalog of all files previously ingested



# File 'lib/dor/services/sdr_ingest_service.rb', line 46

def self.get_signature_catalog(druid)
  Sdr::Client.get_signature_catalog(druid)
end

.get_version_inventory(metadata_dir, druid, version_id) ⇒ Moab::FileInventory

Generates and returns a version inventory for the object.

Parameters:

  • metadata_dir (Pathname)

    The location of the object’s metadata files

  • druid (String)

    The object identifier

  • version_id (Integer)

    The version number

Returns:

  • (Moab::FileInventory)

    A version inventory for the object



# File 'lib/dor/services/sdr_ingest_service.rb', line 111

def self.get_version_inventory(metadata_dir, druid, version_id)
  version_inventory = get_content_inventory(metadata_dir, druid, version_id)
  version_inventory.groups << get_metadata_file_group(metadata_dir)
  version_inventory
end

.transfer(dor_item, agreement_id = nil) ⇒ void

This method returns an undefined value.

Creates the Moab manifests, exports data to a BagIt bag, and kicks off the SDR ingest workflow.

Parameters:

  • dor_item (Dor::Item)

    The representation of the digital object

  • agreement_id (String) (defaults to: nil)

    Deprecated; included for backward compatibility with common-accessioning



# File 'lib/dor/services/sdr_ingest_service.rb', line 9

def self.transfer(dor_item, agreement_id = nil)
  druid = dor_item.pid
  workspace = DruidTools::Druid.new(druid, Dor::Config.sdr.local_workspace_root)
  signature_catalog = get_signature_catalog(druid)
  new_version_id = signature_catalog.version_id + 1
  metadata_dir = extract_datastreams(dor_item, workspace)
  verify_version_metadata(metadata_dir, new_version_id)
  version_inventory = get_version_inventory(metadata_dir, druid, new_version_id)
  version_addtions = signature_catalog.version_additions(version_inventory)
  content_addtions = version_addtions.group('content')
  if content_addtions.nil? || content_addtions.files.empty?
    content_dir = nil
  else
    new_file_list = content_addtions.path_list
    content_dir = workspace.find_filelist_parent('content', new_file_list)
  end
  content_group = version_inventory.group('content')
  unless content_group.nil? || content_group.files.empty?
    signature_catalog.normalize_group_signatures(content_group, content_dir)
  end
  # export the bag (in tar format)
  bag_dir = Pathname(Dor::Config.sdr.local_export_home).join(druid.sub('druid:', ''))
  bagger = Moab::Bagger.new(version_inventory, signature_catalog, bag_dir)
  bagger.reset_bag
  bagger.create_bag_inventory(:depositor)
  bagger.deposit_group('content', content_dir)
  bagger.deposit_group('metadata', metadata_dir)
  bagger.create_tagfiles
  verify_bag_structure(bag_dir)
  # Now bootstrap the SDR workflow, but do not create the workflows datastream
  dor_item.create_workflow('sdrIngestWF', false)
rescue Exception => e
  raise Dor::Exception, 'Export failure'
end
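
The bag directory is the export home joined with the druid minus its `druid:` prefix. A minimal sketch of that one step, assuming a made-up export root and druid; `bag_dir_for` is a hypothetical helper name and `/dor/export` stands in for `Dor::Config.sdr.local_export_home`.

```ruby
require 'pathname'

# Hypothetical helper isolating the bag-directory computation used in transfer.
def bag_dir_for(export_home, druid)
  Pathname(export_home).join(druid.sub('druid:', ''))
end

puts bag_dir_for('/dor/export', 'druid:ab123cd4567') # prints /dor/export/ab123cd4567
```

Stripping the prefix keeps the colon out of the directory name, so the bag path contains only the bare identifier.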

.verify_bag_structure(bag_dir) ⇒ Boolean

Returns true if all required files exist, raises exception if not.

Parameters:

  • bag_dir (Pathname)

    the location of the bag to be verified

Returns:

  • (Boolean)

    true if all required files exist, raises exception if not



# File 'lib/dor/services/sdr_ingest_service.rb', line 147

def self.verify_bag_structure(bag_dir)
  verify_pathname(bag_dir)
  verify_pathname(bag_dir.join('data'))
  verify_pathname(bag_dir.join('bagit.txt'))
  verify_pathname(bag_dir.join('bag-info.txt'))
  verify_pathname(bag_dir.join('manifest-sha256.txt'))
  verify_pathname(bag_dir.join('tagmanifest-sha256.txt'))
  verify_pathname(bag_dir.join('versionAdditions.xml'))
  verify_pathname(bag_dir.join('versionInventory.xml'))
  verify_pathname(bag_dir.join('data', 'metadata', 'versionMetadata.xml'))
  true
end

.verify_pathname(pathname) ⇒ Boolean

Returns true if file exists, raises exception if not.

Parameters:

  • pathname (Pathname)

    The file whose existence should be verified

Returns:

  • (Boolean)

    true if file exists, raises exception if not



# File 'lib/dor/services/sdr_ingest_service.rb', line 162

def self.verify_pathname(pathname)
  raise "#{pathname.basename} not found at #{pathname}" unless pathname.exist?
  true
end
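
The raise-or-true contract is what lets `verify_bag_structure` chain its checks and stop at the first missing file. A minimal sketch, assuming a temporary directory in place of a real bag; the method body is a standalone copy of the one documented above.

```ruby
require 'pathname'
require 'tmpdir'

# Standalone copy of the documented check: true if the path exists, else raise.
def verify_pathname(pathname)
  raise "#{pathname.basename} not found at #{pathname}" unless pathname.exist?
  true
end

Dir.mktmpdir do |tmp|
  bag_dir = Pathname.new(tmp)
  bag_dir.join('bagit.txt').write('')

  puts verify_pathname(bag_dir.join('bagit.txt')) # prints true
  begin
    verify_pathname(bag_dir.join('bag-info.txt'))
  rescue RuntimeError => e
    puts e.message.start_with?('bag-info.txt not found at') # prints true
  end
end
```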

.verify_version_id(pathname, expected, found) ⇒ Object

Parameters:

  • pathname (Pathname)

    The location of the file containing a version number

  • expected (Integer)

    The version number that should be in the file

  • found (Integer)

    The version number that is actually in the file



# File 'lib/dor/services/sdr_ingest_service.rb', line 92

def self.verify_version_id(pathname, expected, found)
  raise "Version mismatch in #{pathname}, expected #{expected}, found #{found}" unless expected == found
  true
end

.verify_version_metadata(metadata_dir, expected) ⇒ Object

Parameters:

  • metadata_dir (Pathname)

    the location of the metadata directory in the workspace

  • expected (Integer)

    the version identifier expected to be used in the versionMetadata



# File 'lib/dor/services/sdr_ingest_service.rb', line 83

def self.verify_version_metadata(metadata_dir, expected)
  vmfile = metadata_dir.join('versionMetadata.xml')
  verify_version_id(vmfile, expected, vmfile_version_id(vmfile))
  true
end

.vmfile_version_id(pathname) ⇒ Integer

Returns the versionId found in the last version element, or nil if missing.

Parameters:

  • pathname (Pathname)

    the location of the versionMetadata file

Returns:

  • (Integer)

    the versionId found in the last version element, or nil if missing



# File 'lib/dor/services/sdr_ingest_service.rb', line 99

def self.vmfile_version_id(pathname)
  verify_pathname(pathname)
  doc = Nokogiri::XML(File.open(pathname.to_s))
  nodeset = doc.xpath('/versionMetadata/version')
  version_id = nodeset.last['versionId']
  version_id.nil? ? nil : version_id.to_i
end