Class: Moab::Bagger
- Inherits:
- Object
  - Object
    - Moab::Bagger
- Defined in:
- lib/moab/bagger.rb
Overview
Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.
A class used to create a BagIt package from a version inventory and a set of source files. The #fill_bag method is called with a package_mode parameter that specifies whether the bag is being created for deposit into the repository or is to contain the output of a version reconstruction.
-
In :depositor mode, the version inventory is filtered using the digital object’s signature catalog so that only
new files are included
-
In :reconstructor mode, the version inventory and signature catalog are used together to regenerate the complete
set of files for the version.
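The difference between the two modes can be sketched with plain Ruby hashes. This is an illustration only: `select_for_bag`, the hash-based inventory, and the set-based catalog are hypothetical stand-ins, not Moab's FileInventory or SignatureCatalog classes.

```ruby
require 'set'

# Hypothetical stand-ins: an inventory maps relative paths to checksums,
# and the catalog is the set of checksums already held by the repository.
def select_for_bag(package_mode, version_inventory, catalog_checksums)
  case package_mode
  when :depositor
    # Keep only files whose content is not yet known to the catalog.
    version_inventory.reject { |_path, checksum| catalog_checksums.include?(checksum) }
  when :reconstructor
    # Every file of the version is needed to rebuild it.
    version_inventory.dup
  else
    raise ArgumentError, "unknown package_mode: #{package_mode.inspect}"
  end
end

inventory = { 'page-1.jpg' => 'aaa', 'page-2.jpg' => 'bbb', 'title.jpg' => 'ccc' }
catalog = Set['aaa', 'bbb']

p select_for_bag(:depositor, inventory, catalog)
p select_for_bag(:reconstructor, inventory, catalog).keys
```

In this sketch the depositor bag holds only the file whose checksum is missing from the catalog, while the reconstructor bag lists every file of the version.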
Data Model
-
StorageRepository = represents a digital object repository storage node
-
StorageServices = supports application layer access to the repository’s objects, data, and metadata
-
StorageObject = represents a digital object’s repository storage location and ingest/dissemination methods
-
StorageObjectVersion [1..*] = represents a version subdirectory within an object’s home directory
-
Bagger [1] = utility for creating bagit packages for ingest or dissemination
-
-
-
Instance Attribute Summary
-
#bag_inventory ⇒ FileInventory
The actual inventory of the files to be packaged (derived from @version_inventory in #fill_bag).
-
#bag_pathname ⇒ Pathname
The location of the Bagit bag to be created.
-
#package_mode ⇒ Symbol
The operational mode, which controls what gets bagged (in #fill_bag) and the full path of source files (in #fill_payload).
-
#signature_catalog ⇒ SignatureCatalog
The signature catalog, used to specify source paths (in :reconstructor mode), or to filter the version inventory (in :depositor mode).
-
#version_inventory ⇒ FileInventory
The complete inventory of the files comprising a digital object version.
Instance Method Summary
-
#create_bag_info_txt ⇒ void
Generate the bag-info.txt tag file.
-
#create_bag_inventory(package_mode) ⇒ FileInventory
Create, write, and return the inventory of the files that will become the payload.
-
#create_bagit_txt ⇒ void
Generate the bagit.txt tag file.
-
#create_payload_manifests ⇒ void
Using the checksum information from the inventory, create BagIt manifest files for the payload.
-
#create_tagfile_manifests ⇒ void
Create BagIt tag manifest files containing checksums for all files in the bag’s root directory.
-
#create_tagfiles ⇒ Boolean
Create BagIt manifests and tag files.
-
#create_tarfile(tar_pathname = nil) ⇒ Boolean
Create a tar file containing the bag.
-
#delete_bag ⇒ NilClass
Delete the bagit files.
- #delete_tarfile ⇒ Object
-
#deposit_group(group_id, source_dir) ⇒ Boolean
Copy all the files listed in the group inventory to the bag.
-
#fill_bag(package_mode, source_base_pathname) ⇒ Bagger
Perform all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles.
-
#fill_payload(source_base_pathname) ⇒ void
Fills the bag payload by linking to the source files rather than copying them, which greatly speeds up the process.
-
#initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger
constructor
A new instance of Bagger.
-
#reconstuct_group(group_id, storage_object_dir) ⇒ Boolean
Copy all the files listed in the group inventory to the bag.
-
#reset_bag ⇒ void
Delete any existing bag data and re-initialize the bag directory.
-
#shell_execute(command) ⇒ Object
Executes a system command in a subprocess. If the command isn’t successful, captures stdout and stderr and puts them in the Ruby exception message.
Constructor Details
#initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger
Returns a new instance of Bagger.
# File 'lib/moab/bagger.rb', line 26

def initialize(version_inventory, signature_catalog, bag_pathname)
  @version_inventory = version_inventory
  @signature_catalog = signature_catalog
  @bag_pathname = Pathname.new(bag_pathname)
  create_bagit_txt
end
Instance Attribute Details
#bag_inventory ⇒ FileInventory
Returns the actual inventory of the files to be packaged (derived from @version_inventory in #fill_bag).
# File 'lib/moab/bagger.rb', line 44

def bag_inventory
  @bag_inventory
end
#bag_pathname ⇒ Pathname
Returns the location of the BagIt bag to be created.
# File 'lib/moab/bagger.rb', line 41

def bag_pathname
  @bag_pathname
end
#package_mode ⇒ Symbol
Returns the operational mode, which controls what gets bagged (in #fill_bag) and the full path of source files (in #fill_payload).
# File 'lib/moab/bagger.rb', line 48

def package_mode
  @package_mode
end
#signature_catalog ⇒ SignatureCatalog
Returns the signature catalog, used to specify source paths (in :reconstructor mode), or to filter the version inventory (in :depositor mode).
# File 'lib/moab/bagger.rb', line 38

def signature_catalog
  @signature_catalog
end
#version_inventory ⇒ FileInventory
Returns the complete inventory of the files comprising a digital object version.
# File 'lib/moab/bagger.rb', line 34

def version_inventory
  @version_inventory
end
Instance Method Details
#create_bag_info_txt ⇒ void
This method returns an undefined value.
Generates the bag-info.txt tag file.
# File 'lib/moab/bagger.rb', line 219

def create_bag_info_txt
  bag_pathname.join("bag-info.txt").open('w') do |f|
    f.puts "External-Identifier: #{bag_inventory.package_id}"
    f.puts "Payload-Oxum: #{bag_inventory.byte_count}.#{bag_inventory.file_count}"
    f.puts "Bag-Size: #{bag_inventory.human_size}"
  end
end
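The Payload-Oxum value written above has the form `<byte_count>.<file_count>`. As a minimal sketch of how such a value could be recomputed independently from a bag's data directory, the hypothetical helper `payload_oxum` below (not part of Moab) walks the directory with the standard library instead of reading a FileInventory.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper: total the bytes and count the files under data_dir,
# then join the two numbers as "<byte_count>.<file_count>".
def payload_oxum(data_dir)
  files = Pathname.new(data_dir).find.select(&:file?)
  "#{files.sum(&:size)}.#{files.size}"
end

# Build a throwaway payload with two small files and print its Oxum.
Dir.mktmpdir do |dir|
  content = Pathname.new(dir).join('data', 'content')
  content.mkpath
  content.join('a.txt').write('hello') # 5 bytes
  content.join('b.txt').write('bag')   # 3 bytes
  puts payload_oxum(File.join(dir, 'data'))
end
```

Comparing a recomputed value with the one in bag-info.txt gives a quick completeness check before the per-file checksums are verified.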
#create_bag_inventory(package_mode) ⇒ FileInventory
Creates, writes, and returns the inventory of the files that will become the payload.
# File 'lib/moab/bagger.rb', line 107

def create_bag_inventory(package_mode)
  @package_mode = package_mode
  bag_pathname.mkpath
  case package_mode
  when :depositor
    version_inventory.write_xml_file(bag_pathname, 'version')
    @bag_inventory = signature_catalog.version_additions(version_inventory)
    bag_inventory.write_xml_file(bag_pathname, 'additions')
  when :reconstructor
    @bag_inventory = version_inventory
    bag_inventory.write_xml_file(bag_pathname, 'version')
  end
  bag_inventory
end
#create_bagit_txt ⇒ void
This method returns an undefined value.
Generates the bagit.txt tag file.
# File 'lib/moab/bagger.rb', line 59

def create_bagit_txt
  bag_pathname.mkpath
  bag_pathname.join("bagit.txt").open('w') do |f|
    f.puts "Tag-File-Character-Encoding: UTF-8"
    f.puts "BagIt-Version: 0.97"
  end
end
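Both tag files written by this class use simple `Label: value` lines. A minimal sketch of reading such a file back into a hash is shown below; `parse_tag_file` is a hypothetical helper, not a Moab method, and it deliberately ignores the continuation lines that the BagIt format permits.

```ruby
# Hypothetical helper: turn "Label: value" lines into a Hash.
# Lines without a ": " separator are skipped.
def parse_tag_file(text)
  text.each_line.with_object({}) do |line, tags|
    label, value = line.chomp.split(': ', 2)
    tags[label] = value if value
  end
end

bagit_txt = "Tag-File-Character-Encoding: UTF-8\nBagIt-Version: 0.97\n"
p parse_tag_file(bagit_txt)
```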
#create_payload_manifests ⇒ void
This method returns an undefined value.
Using the checksum information from the inventory, creates BagIt manifest files for the payload.
# File 'lib/moab/bagger.rb', line 189

def create_payload_manifests
  manifest_pathname = {}
  manifest_file = {}
  DEFAULT_CHECKSUM_TYPES.each do |type|
    manifest_pathname[type] = bag_pathname.join("manifest-#{type}.txt")
    manifest_file[type] = manifest_pathname[type].open('w')
  end
  bag_inventory.groups.each do |group|
    group.files.each do |file|
      fixity = file.signature.fixity
      file.instances.each do |instance|
        data_path = File.join('data', group.group_id, instance.path)
        DEFAULT_CHECKSUM_TYPES.each do |type|
          manifest_file[type].puts("#{fixity[type]} #{data_path}") if fixity[type]
        end
      end
    end
  end
ensure
  DEFAULT_CHECKSUM_TYPES.each do |type|
    if manifest_file[type]
      manifest_file[type].close
      manifest_pathname[type].delete if manifest_pathname[type].exist? && manifest_pathname[type].size == 0
    end
  end
end
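Each manifest line pairs a checksum with a path relative to the bag root, such as `data/<group_id>/<path>`. The listing above takes checksums from the inventory; the sketch below instead rehashes the files on disk, which is one way a manifest might be cross-checked. `manifest_lines` is a hypothetical helper, and the choice of SHA-256 is an assumption, since the contents of DEFAULT_CHECKSUM_TYPES are not shown on this page.

```ruby
require 'digest'
require 'pathname'

# Hypothetical helper: return "<checksum> <path>" lines for every file under
# bag_dir/data, with paths expressed relative to the bag root.
def manifest_lines(bag_dir, digest_class = Digest::SHA256)
  bag = Pathname.new(bag_dir)
  bag.join('data').find.select(&:file?).sort.map do |file|
    "#{digest_class.file(file.to_s).hexdigest} #{file.relative_path_from(bag)}"
  end
end
```

Sorting the lines of an existing manifest and comparing them with the result of `manifest_lines` would reveal missing, extra, or altered payload files.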
#create_tagfile_manifests ⇒ void
This method returns an undefined value.
Creates BagIt tag manifest files containing checksums for all files in the bag’s root directory.
# File 'lib/moab/bagger.rb', line 229

def create_tagfile_manifests
  manifest_pathname = {}
  manifest_file = {}
  DEFAULT_CHECKSUM_TYPES.each do |type|
    manifest_pathname[type] = bag_pathname.join("tagmanifest-#{type}.txt")
    manifest_file[type] = manifest_pathname[type].open('w')
  end
  bag_pathname.children.each do |file|
    unless file.directory? || file.basename.to_s[0, 11] == 'tagmanifest'
      signature = FileSignature.new.signature_from_file(file)
      fixity = signature.fixity
      DEFAULT_CHECKSUM_TYPES.each do |type|
        manifest_file[type].puts("#{fixity[type]} #{file.basename}") if fixity[type]
      end
    end
  end
ensure
  DEFAULT_CHECKSUM_TYPES.each do |type|
    if manifest_file[type]
      manifest_file[type].close
      manifest_pathname[type].delete if manifest_pathname[type].exist? && manifest_pathname[type].size == 0
    end
  end
end
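A tag manifest makes it possible to detect tag files that changed after the bag was written. The following sketch shows one way to do that check; `tagfile_mismatches` is a hypothetical helper, and it assumes that a `tagmanifest-sha256.txt` file exists, which depends on DEFAULT_CHECKSUM_TYPES (not shown on this page).

```ruby
require 'digest'
require 'pathname'

# Hypothetical helper: return the names of tag files whose current SHA-256
# differs from the value recorded in tagmanifest-sha256.txt.
def tagfile_mismatches(bag_dir)
  bag = Pathname.new(bag_dir)
  mismatches = []
  bag.join('tagmanifest-sha256.txt').each_line do |line|
    expected, name = line.chomp.split(' ', 2)
    actual = Digest::SHA256.file(bag.join(name).to_s).hexdigest
    mismatches << name unless actual == expected
  end
  mismatches
end
```

An empty result means every listed tag file still matches its recorded checksum.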
#create_tagfiles ⇒ Boolean
Creates BagIt manifests and tag files. Returns true if successful.
# File 'lib/moab/bagger.rb', line 179

def create_tagfiles
  create_payload_manifests
  create_bag_info_txt
  create_bagit_txt
  create_tagfile_manifests
  true
end
#create_tarfile(tar_pathname = nil) ⇒ Boolean
Creates a tar file containing the bag. Returns true if successful.
# File 'lib/moab/bagger.rb', line 256

def create_tarfile(tar_pathname = nil)
  bag_name = bag_pathname.basename
  bag_parent = bag_pathname.parent
  tar_pathname ||= bag_parent.join("#{bag_name}.tar")
  tar_cmd = "cd '#{bag_parent}'; tar --dereference --force-local -cf '#{tar_pathname}' '#{bag_name}'"
  begin
    shell_execute(tar_cmd)
  rescue
    shell_execute(tar_cmd.sub('--force-local', ''))
  end
  raise(MoabRuntimeError, "Unable to create tarfile #{tar_pathname}") unless tar_pathname.exist?

  true
end
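The command string above wraps each path in single quotes, and the retry without `--force-local` presumably accommodates tar implementations that do not accept that flag. As a sketch of an alternative way to build the same command, the hypothetical helper `tar_command` below escapes paths with the standard Shellwords library, so that names containing spaces or quotes stay intact. It only builds the string and does not run tar.

```ruby
require 'pathname'
require 'shellwords'

# Hypothetical helper: build the tar command line for a bag directory.
# force_local: false yields the variant used for the retry.
def tar_command(bag_pathname, tar_pathname = nil, force_local: true)
  bag = Pathname.new(bag_pathname)
  tar_pathname ||= bag.parent.join("#{bag.basename}.tar")
  flags = ['--dereference']
  flags << '--force-local' if force_local
  "cd #{bag.parent.to_s.shellescape}; " \
    "tar #{flags.join(' ')} -cf #{tar_pathname.to_s.shellescape} #{bag.basename.to_s.shellescape}"
end

puts tar_command('/tmp/bags/druid_1')
puts tar_command('/tmp/bags/my bag', force_local: false)
```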
#delete_bag ⇒ NilClass
Deletes the bagit files. Returns nil.
# File 'lib/moab/bagger.rb', line 68

def delete_bag
  # make sure this looks like a bag before deleting
  if bag_pathname.join('bagit.txt').exist?
    if bag_pathname.join('data').exist?
      bag_pathname.rmtree
    else
      bag_pathname.children.each(&:delete)
      bag_pathname.rmdir
    end
  end
  nil
end
#delete_tarfile ⇒ Object
# File 'lib/moab/bagger.rb', line 82

def delete_tarfile
  bag_name = bag_pathname.basename
  bag_parent = bag_pathname.parent
  tar_pathname = bag_parent.join("#{bag_name}.tar")
  tar_pathname.delete if tar_pathname.exist?
end
#deposit_group(group_id, source_dir) ⇒ Boolean
Copies all the files listed in the group inventory to the bag. Returns true if successful, or nil if the group was not found in the inventory.
# File 'lib/moab/bagger.rb', line 143

def deposit_group(group_id, source_dir)
  group = bag_inventory.group(group_id)
  # NOTE: `nil?` is called on self here and evaluates to false,
  # so this line returns false rather than the documented nil.
  return nil? if group.nil? || group.files.empty?

  target_dir = bag_pathname.join('data', group_id)
  group.path_list.each do |relative_path|
    source = source_dir.join(relative_path)
    target = target_dir.join(relative_path)
    target.parent.mkpath
    FileUtils.symlink source, target
  end
  true
end
#fill_bag(package_mode, source_base_pathname) ⇒ Bagger
Performs all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles. Returns the Bagger itself.
# File 'lib/moab/bagger.rb', line 96

def fill_bag(package_mode, source_base_pathname)
  create_bag_inventory(package_mode)
  fill_payload(source_base_pathname)
  create_tagfiles
  self
end
#fill_payload(source_base_pathname) ⇒ void
This method returns an undefined value.
Fills the bag payload by linking to the source files rather than copying them, which greatly speeds up the process. The original description refers to Unix hard links, which require the target bag to be created within the same filesystem as the source files; the methods this listing delegates to, #deposit_group and #reconstuct_group, call FileUtils.symlink.
# File 'lib/moab/bagger.rb', line 127

def fill_payload(source_base_pathname)
  bag_inventory.groups.each do |group|
    group_id = group.group_id
    case package_mode
    when :depositor
      deposit_group(group_id, source_base_pathname.join(group_id))
    when :reconstructor
      reconstuct_group(group_id, source_base_pathname)
    end
  end
end
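To make the distinction between the two kinds of link concrete, here is a minimal sketch using only the standard library. `link_into_bag` is a hypothetical helper, not a Moab method: by default it creates a symbolic link with FileUtils.symlink, and with `hard: true` it creates a hard link with FileUtils.ln.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical helper: link source into the bag at target, creating any
# missing parent directories first. Returns the target as a Pathname.
def link_into_bag(source, target, hard: false)
  target = Pathname.new(target)
  target.parent.mkpath
  if hard
    FileUtils.ln(source.to_s, target.to_s)      # hard link: same inode as source
  else
    FileUtils.symlink(source.to_s, target.to_s) # symbolic link: stores the source path
  end
  target
end

Dir.mktmpdir do |dir|
  source = Pathname.new(dir).join('source.txt')
  source.write('payload')
  soft = link_into_bag(source, File.join(dir, 'bag', 'data', 'content', 'soft.txt'))
  hard = link_into_bag(source, File.join(dir, 'bag', 'data', 'content', 'hard.txt'), hard: true)
  puts "soft.symlink? #{soft.symlink?}, hard.symlink? #{hard.symlink?}"
end
```

A symbolic link can point across filesystems but breaks if the source is moved or deleted, whereas a hard link must live on the same filesystem as its source. The `--dereference` flag passed to tar in #create_tarfile archives the files that symbolic links point to rather than the links themselves.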
#reconstuct_group(group_id, storage_object_dir) ⇒ Boolean
Copies all the files listed in the group inventory to the bag. Returns true if successful, or nil if the group was not found in the inventory.
# File 'lib/moab/bagger.rb', line 161

def reconstuct_group(group_id, storage_object_dir)
  group = bag_inventory.group(group_id)
  # NOTE: `nil?` is called on self here and evaluates to false,
  # so this line returns false rather than the documented nil.
  return nil? if group.nil? || group.files.empty?

  target_dir = bag_pathname.join('data', group_id)
  group.files.each do |file|
    catalog_entry = signature_catalog.signature_hash[file.signature]
    source = storage_object_dir.join(catalog_entry.storage_path)
    file.instances.each do |instance|
      target = target_dir.join(instance.path)
      target.parent.mkpath
      FileUtils.symlink source, target unless target.exist?
    end
  end
  true
end
#reset_bag ⇒ void
This method returns an undefined value.
Deletes any existing bag data and re-initializes the bag directory.
# File 'lib/moab/bagger.rb', line 51

def reset_bag
  delete_bag
  delete_tarfile
  create_bagit_txt
end
#shell_execute(command) ⇒ Object
Executes a system command in a subprocess. If the command isn’t successful, captures stdout and stderr and puts them in the Ruby exception message.
# File 'lib/moab/bagger.rb', line 274

def shell_execute(command)
  require 'open3'
  stdout, stderr, status = Open3.capture3(command.chomp)
  if status.success? && status.exitstatus.zero?
    stdout
  else
    msg = "Shell command failed: [#{command}] caused by <STDERR = #{stderr}>"
    msg << " STDOUT = #{stdout}" if stdout&.length&.positive?
    raise(MoabStandardError, msg)
  end
rescue SystemCallError => e
  msg = "Shell command failed: [#{command}] caused by #{e.inspect}"
  raise(MoabStandardError, msg)
end
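The same capture-and-raise pattern can be tried outside Moab with a standalone sketch. `run_command` and `ShellCommandError` are hypothetical names standing in for #shell_execute and MoabStandardError, and the example commands assume a POSIX shell is available.

```ruby
require 'open3'

# Hypothetical error class standing in for MoabStandardError.
class ShellCommandError < StandardError; end

# Hypothetical helper: run a command, return its stdout on success, and
# raise with the captured stderr (or the system error) on failure.
def run_command(command)
  stdout, stderr, status = Open3.capture3(command)
  return stdout if status.success?

  raise ShellCommandError, "Shell command failed: [#{command}] STDERR = #{stderr.strip}"
rescue SystemCallError => e
  # Raised, for example, when the executable cannot be found.
  raise ShellCommandError, "Shell command failed: [#{command}] caused by #{e.inspect}"
end

puts run_command('echo hello')

begin
  run_command('echo oops 1>&2; exit 3')
rescue ShellCommandError => e
  puts e.message
end
```

Folding stderr into the exception message means a caller that only logs the exception still sees why the subprocess failed.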