Class: Moab::Bagger
- Inherits: Object (hierarchy: Object > Moab::Bagger)
- Defined in: lib/moab/bagger.rb
Overview
Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.
A class used to create a BagIt package from a version inventory and a set of source files. The #fill_bag method is called with a package_mode parameter that specifies whether the bag is being created for deposit into the repository or is to contain the output of a version reconstruction.
- In :depositor mode, the version inventory is filtered using the digital object’s signature catalog so that only new files are included.
- In :reconstructor mode, the version inventory and signature catalog are used together to regenerate the complete set of files for the version.
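The difference between the two modes can be illustrated with a minimal, self-contained sketch. It does not use the Moab classes: the two hashes are hypothetical stand-ins for a version inventory (path to checksum) and a signature catalog (checksum to stored path), and `select_for_bag` is an illustrative helper, not part of the library.

```ruby
# Hypothetical stand-ins, not the Moab API:
# a version inventory maps relative paths to content checksums,
# a signature catalog maps already-stored checksums to storage paths.
VERSION_INVENTORY = {
  'content/page-1.jpg' => 'aaa111',
  'content/page-2.jpg' => 'bbb222',
  'metadata/desc.xml'  => 'ccc333'
}.freeze

SIGNATURE_CATALOG = {
  'aaa111' => 'v0001/data/content/page-1.jpg',
  'ccc333' => 'v0001/data/metadata/desc.xml'
}.freeze

# Returns the subset of the inventory that would go into the bag.
def select_for_bag(package_mode, inventory, catalog)
  case package_mode
  when :depositor
    # Only files whose checksum is not yet in the catalog count as new.
    inventory.reject { |_path, checksum| catalog.key?(checksum) }
  when :reconstructor
    # Every file in the version is included.
    inventory
  else
    raise ArgumentError, "unknown package mode: #{package_mode.inspect}"
  end
end

puts select_for_bag(:depositor, VERSION_INVENTORY, SIGNATURE_CATALOG).keys.inspect
puts select_for_bag(:reconstructor, VERSION_INVENTORY, SIGNATURE_CATALOG).size
```

In the library itself the filtering step is delegated to `signature_catalog.version_additions(version_inventory)`, as the source of #create_bag_inventory shows.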
Data Model
- StorageRepository = represents a digital object repository storage node
- StorageServices = supports application layer access to the repository’s objects, data, and metadata
- StorageObject = represents a digital object’s repository storage location and ingest/dissemination methods
- StorageObjectVersion [1..*] = represents a version subdirectory within an object’s home directory
- Bagger [1] = utility for creating BagIt packages for ingest or dissemination
Instance Attribute Summary
- #bag_inventory ⇒ FileInventory
  The actual inventory of the files to be packaged (derived from @version_inventory in #fill_bag).
- #bag_pathname ⇒ Pathname
  The location of the BagIt bag to be created.
- #package_mode ⇒ Symbol
  The operational mode controlling what gets bagged (see #fill_bag) and the full path of source files (see #fill_payload).
- #signature_catalog ⇒ SignatureCatalog
  The signature catalog, used to specify source paths (in :reconstructor mode), or to filter the version inventory (in :depositor mode).
- #version_inventory ⇒ FileInventory
  The complete inventory of the files comprising a digital object version.
Instance Method Summary
- #create_bag_info_txt ⇒ void
  Generate the bag-info.txt tag file.
- #create_bag_inventory(package_mode) ⇒ FileInventory
  Create, write, and return the inventory of the files that will become the payload.
- #create_bagit_txt ⇒ void
  Generate the bagit.txt tag file.
- #create_payload_manifests ⇒ void
  Using the checksum information from the inventory, create BagIt manifest files for the payload.
- #create_tagfile_manifests ⇒ void
  Create BagIt tag manifest files containing checksums for all files in the bag’s root directory.
- #create_tagfiles ⇒ Boolean
  Create BagIt manifests and tag files.
- #create_tarfile(tar_pathname = nil) ⇒ Boolean
  Create a tar file containing the bag.
- #delete_bag ⇒ NilClass
  Delete the BagIt files.
- #delete_tarfile ⇒ Object
- #deposit_group(group_id, source_dir) ⇒ Boolean
  Copy all the files listed in the group inventory to the bag.
- #fill_bag(package_mode, source_base_pathname) ⇒ Bagger
  Perform all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles.
- #fill_payload(source_base_pathname) ⇒ void
  Populate the bag payload by linking to the source files rather than copying them.
- #include_in_tagfile_manifests?(file) ⇒ Boolean
- #initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger (constructor)
  A new instance of Bagger.
- #reconstuct_group(group_id, storage_object_dir) ⇒ Boolean
  Copy all the files listed in the group inventory to the bag.
- #reset_bag ⇒ void
  Delete any existing bag data and re-initialize the bag directory.
- #shell_execute(command) ⇒ Object
  Executes a system command in a subprocess. If the command isn’t successful, captures stdout and stderr and puts them in the Ruby exception message.
Constructor Details
#initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger
Returns a new instance of Bagger.
    # File 'lib/moab/bagger.rb', line 26
    def initialize(version_inventory, signature_catalog, bag_pathname)
      @version_inventory = version_inventory
      @signature_catalog = signature_catalog
      @bag_pathname = Pathname.new(bag_pathname)
      create_bagit_txt
    end
Instance Attribute Details
#bag_inventory ⇒ FileInventory
Returns the actual inventory of the files to be packaged (derived from @version_inventory in #fill_bag).
    # File 'lib/moab/bagger.rb', line 44
    def bag_inventory
      @bag_inventory
    end
#bag_pathname ⇒ Pathname
Returns the location of the BagIt bag to be created.
    # File 'lib/moab/bagger.rb', line 41
    def bag_pathname
      @bag_pathname
    end
#package_mode ⇒ Symbol
Returns the operational mode controlling what gets bagged (see #fill_bag) and the full path of source files (see #fill_payload).
    # File 'lib/moab/bagger.rb', line 48
    def package_mode
      @package_mode
    end
#signature_catalog ⇒ SignatureCatalog
Returns the signature catalog, used to specify source paths (in :reconstructor mode), or to filter the version inventory (in :depositor mode).
    # File 'lib/moab/bagger.rb', line 38
    def signature_catalog
      @signature_catalog
    end
#version_inventory ⇒ FileInventory
Returns the complete inventory of the files comprising a digital object version.
    # File 'lib/moab/bagger.rb', line 34
    def version_inventory
      @version_inventory
    end
Instance Method Details
#create_bag_info_txt ⇒ void
This method returns an undefined value.
Generates the bag-info.txt tag file.
    # File 'lib/moab/bagger.rb', line 212
    def create_bag_info_txt
      bag_pathname.join('bag-info.txt').open('w') do |f|
        f.puts "External-Identifier: #{bag_inventory.package_id}"
        f.puts "Payload-Oxum: #{bag_inventory.byte_count}.#{bag_inventory.file_count}"
        f.puts "Bag-Size: #{bag_inventory.human_size}"
      end
    end
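The Payload-Oxum value written above has the form "total bytes, a dot, then the file count". As a rough illustration, the sketch below derives the same kind of value by walking a directory on disk. `payload_oxum` is a hypothetical helper and not part of Moab, which takes both numbers from the bag inventory instead.

```ruby
require 'pathname'
require 'tmpdir'

# Computes "<total bytes>.<file count>" for the regular files under a
# directory. Illustrative helper only; Moab reads these counts from
# bag_inventory rather than from the filesystem.
def payload_oxum(data_dir)
  files = Pathname.new(data_dir).find.select(&:file?)
  byte_count = files.sum(&:size)
  "#{byte_count}.#{files.size}"
end

Dir.mktmpdir do |dir|
  data = Pathname.new(dir).join('data')
  data.join('content').mkpath
  data.join('content', 'a.txt').write('hello')   # 5 bytes
  data.join('content', 'b.txt').write('bagit!')  # 6 bytes
  puts "Payload-Oxum: #{payload_oxum(data)}"
end
```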
#create_bag_inventory(package_mode) ⇒ FileInventory
Creates, writes, and returns the inventory of the files that will become the payload.
    # File 'lib/moab/bagger.rb', line 100
    def create_bag_inventory(package_mode)
      @package_mode = package_mode
      bag_pathname.mkpath
      case package_mode
      when :depositor
        version_inventory.write_xml_file(bag_pathname, 'version')
        @bag_inventory = signature_catalog.version_additions(version_inventory)
        bag_inventory.write_xml_file(bag_pathname, 'additions')
      when :reconstructor
        @bag_inventory = version_inventory
        bag_inventory.write_xml_file(bag_pathname, 'version')
      end
      bag_inventory
    end
#create_bagit_txt ⇒ void
This method returns an undefined value.
Generates the bagit.txt tag file.
    # File 'lib/moab/bagger.rb', line 59
    def create_bagit_txt
      bag_pathname.mkpath
      bag_pathname.join('bagit.txt').open('w') do |f|
        f.puts 'Tag-File-Character-Encoding: UTF-8'
        f.puts 'BagIt-Version: 0.97'
      end
    end
#create_payload_manifests ⇒ void
This method returns an undefined value.
Using the checksum information from the inventory, creates BagIt manifest files for the payload. Manifest files that end up empty are deleted.
    # File 'lib/moab/bagger.rb', line 182
    def create_payload_manifests
      manifest_pathname = {}
      manifest_file = {}
      DEFAULT_CHECKSUM_TYPES.each do |type|
        manifest_pathname[type] = bag_pathname.join("manifest-#{type}.txt")
        manifest_file[type] = manifest_pathname[type].open('w')
      end
      bag_inventory.groups.each do |group|
        group.files.each do |file|
          fixity = file.signature.fixity
          file.instances.each do |instance|
            data_path = File.join('data', group.group_id, instance.path)
            DEFAULT_CHECKSUM_TYPES.each do |type|
              manifest_file[type].puts("#{fixity[type]} #{data_path}") if fixity[type]
            end
          end
        end
      end
    ensure
      DEFAULT_CHECKSUM_TYPES.each do |type|
        if manifest_file[type]
          manifest_file[type].close
          manifest_pathname[type].delete if manifest_pathname[type].exist? && manifest_pathname[type].empty?
        end
      end
    end
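Each manifest line pairs a checksum with a path relative to the bag root. The following sketch produces the same line shape using only the standard library. It is a simplification under stated assumptions: `write_payload_manifest` is a hypothetical helper, it recomputes digests from the files on disk (the method above reads them from the inventory), and the set of checksum types is assumed here because the value of DEFAULT_CHECKSUM_TYPES is not shown on this page.

```ruby
require 'digest'
require 'pathname'
require 'tmpdir'

# Assumed mapping for this sketch; DEFAULT_CHECKSUM_TYPES is not shown here.
DIGEST_CLASSES = {
  'md5' => Digest::MD5,
  'sha1' => Digest::SHA1,
  'sha256' => Digest::SHA256
}.freeze

# Writes manifest-<type>.txt with one "<checksum> <relative path>" line
# per regular file under <bag>/data, and returns the manifest Pathname.
def write_payload_manifest(bag_dir, type = 'sha256')
  bag = Pathname.new(bag_dir)
  digest_class = DIGEST_CLASSES.fetch(type)
  manifest = bag.join("manifest-#{type}.txt")
  manifest.open('w') do |f|
    bag.join('data').find.select(&:file?).sort.each do |file|
      checksum = digest_class.file(file.to_s).hexdigest
      f.puts "#{checksum} #{file.relative_path_from(bag)}"
    end
  end
  manifest
end

Dir.mktmpdir do |dir|
  bag = Pathname.new(dir)
  bag.join('data', 'content').mkpath
  bag.join('data', 'content', 'a.txt').write('hello')
  puts write_payload_manifest(bag).read
end
```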
#create_tagfile_manifests ⇒ void
This method returns an undefined value.
Creates BagIt tag manifest files containing checksums for all files in the bag’s root directory. Manifest files that end up empty are deleted.
    # File 'lib/moab/bagger.rb', line 222
    def create_tagfile_manifests
      manifest_pathname = {}
      manifest_file = {}
      DEFAULT_CHECKSUM_TYPES.each do |type|
        manifest_pathname[type] = bag_pathname.join("tagmanifest-#{type}.txt")
        manifest_file[type] = manifest_pathname[type].open('w')
      end
      bag_pathname.children.each do |file|
        next unless include_in_tagfile_manifests?(file)

        signature = FileSignature.new.signature_from_file(file)
        fixity = signature.fixity
        DEFAULT_CHECKSUM_TYPES.each do |type|
          manifest_file[type].puts("#{fixity[type]} #{file.basename}") if fixity[type]
        end
      end
    ensure
      DEFAULT_CHECKSUM_TYPES.each do |type|
        if manifest_file[type]
          manifest_file[type].close
          manifest_pathname[type].delete if manifest_pathname[type].exist? && manifest_pathname[type].empty?
        end
      end
    end
#create_tagfiles ⇒ Boolean
Creates BagIt manifests and tag files. Returns true if successful.
    # File 'lib/moab/bagger.rb', line 172
    def create_tagfiles
      create_payload_manifests
      create_bag_info_txt
      create_bagit_txt
      create_tagfile_manifests
      true
    end
#create_tarfile(tar_pathname = nil) ⇒ Boolean
Creates a tar file containing the bag. The tar command is tried first with --force-local and, if that fails, again without it. Returns true if the tar file exists afterwards; otherwise raises MoabRuntimeError.
    # File 'lib/moab/bagger.rb', line 256
    def create_tarfile(tar_pathname = nil)
      bag_name = bag_pathname.basename
      bag_parent = bag_pathname.parent
      tar_pathname ||= bag_parent.join("#{bag_name}.tar")
      tar_cmd = "cd '#{bag_parent}'; tar --dereference --force-local -cf '#{tar_pathname}' '#{bag_name}'"
      begin
        shell_execute(tar_cmd)
      rescue
        shell_execute(tar_cmd.sub('--force-local', ''))
      end
      raise(MoabRuntimeError, "Unable to create tarfile #{tar_pathname}") unless tar_pathname.exist?

      true
    end
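The "try with --force-local, then retry without it" pattern can be isolated in a small sketch. This is one possible restructuring, not the library's code: `tar_command` and `create_tar_with_fallback` are hypothetical helpers, the command is built as an argument list (using tar's `-C` option in place of `cd`) so that paths need no shell quoting, and the command runner is passed in as a callable so the fallback logic can be exercised without invoking tar.

```ruby
require 'shellwords'

# Builds the tar argument list for a bag. Hypothetical helper.
def tar_command(bag_parent, bag_name, tar_path, force_local: true)
  args = ['tar', '--dereference']
  args << '--force-local' if force_local
  args + ['-cf', tar_path.to_s, '-C', bag_parent.to_s, bag_name.to_s]
end

# Tries the form with --force-local first and falls back to the form
# without it. `runner` is any callable that raises when the command fails.
def create_tar_with_fallback(bag_parent, bag_name, tar_path, runner)
  runner.call(tar_command(bag_parent, bag_name, tar_path))
rescue StandardError
  runner.call(tar_command(bag_parent, bag_name, tar_path, force_local: false))
end

# Print the two candidate command lines without running them.
puts Shellwords.join(tar_command('/tmp/bags', 'bag 1', '/tmp/bags/bag 1.tar'))
puts Shellwords.join(tar_command('/tmp/bags', 'bag 1', '/tmp/bags/bag 1.tar', force_local: false))
```

A real runner could wrap `Open3.capture3(*args)` and raise when the exit status is not successful.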
#delete_bag ⇒ NilClass
Deletes the bag directory, but only if it contains a bagit.txt file. Always returns nil.
    # File 'lib/moab/bagger.rb', line 68
    def delete_bag
      # make sure this looks like a bag before deleting
      bag_pathname.rmtree if bag_pathname.join('bagit.txt').exist?
      nil
    end
#delete_tarfile ⇒ Object
    # File 'lib/moab/bagger.rb', line 75
    def delete_tarfile
      bag_name = bag_pathname.basename
      bag_parent = bag_pathname.parent
      tar_pathname = bag_parent.join("#{bag_name}.tar")
      tar_pathname.delete if tar_pathname.exist?
    end
#deposit_group(group_id, source_dir) ⇒ Boolean
Adds all the files listed in the group inventory to the bag as symbolic links to the source files. Returns true if successful. If the group is not found in the inventory, or has no files, the method returns early; note that `return nil?` in the source evaluates to false rather than nil.
    # File 'lib/moab/bagger.rb', line 136
    def deposit_group(group_id, source_dir)
      group = bag_inventory.group(group_id)
      return nil? if group.nil? || group.files.empty?

      target_dir = bag_pathname.join('data', group_id)
      group.path_list.each do |relative_path|
        source = source_dir.join(relative_path)
        target = target_dir.join(relative_path)
        target.parent.mkpath
        FileUtils.symlink source, target
      end
      true
    end
#fill_bag(package_mode, source_base_pathname) ⇒ Bagger
Performs all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles. Returns self.
    # File 'lib/moab/bagger.rb', line 89
    def fill_bag(package_mode, source_base_pathname)
      create_bag_inventory(package_mode)
      fill_payload(source_base_pathname)
      create_tagfiles
      self
    end
#fill_payload(source_base_pathname) ⇒ void
This method returns an undefined value.
Fills the bag payload by linking each source file into the bag’s data directory instead of copying it, which greatly speeds up the process. The original description refers to Unix hard links, which would require the bag to be created within the same filesystem as the source files; the helper methods listed on this page (#deposit_group and #reconstuct_group) call FileUtils.symlink, which creates symbolic links.
    # File 'lib/moab/bagger.rb', line 120
    def fill_payload(source_base_pathname)
      bag_inventory.groups.each do |group|
        group_id = group.group_id
        case package_mode
        when :depositor
          deposit_group(group_id, source_base_pathname.join(group_id))
        when :reconstructor
          reconstuct_group(group_id, source_base_pathname)
        end
      end
    end
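The practical difference between a hard link and a symbolic link can be seen in a short standard-library sketch. `link_into_bag` is a hypothetical helper, not part of Moab; it assumes a Unix-like filesystem, and the hard-link branch additionally assumes source and target are on the same filesystem.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Links `source` into the bag at `target` instead of copying it.
# kind: :symlink creates a symbolic link, :hard creates a hard link.
def link_into_bag(source, target, kind: :symlink)
  target.parent.mkpath
  case kind
  when :symlink then FileUtils.symlink(source, target)
  when :hard    then FileUtils.ln(source, target)
  else raise ArgumentError, "unknown link kind: #{kind.inspect}"
  end
  target
end

Dir.mktmpdir do |dir|
  root = Pathname.new(dir)
  source = root.join('source', 'content', 'page-1.txt')
  source.parent.mkpath
  source.write('payload bytes')

  soft = link_into_bag(source, root.join('bag', 'data', 'content', 'soft.txt'))
  hard = link_into_bag(source, root.join('bag', 'data', 'content', 'hard.txt'), kind: :hard)

  puts "soft is a symlink:        #{soft.symlink?}"
  puts "hard shares source inode: #{hard.stat.ino == source.stat.ino}"
  puts "contents match:           #{soft.read == hard.read}"
end
```

Because symbolic links only record a path, #create_tarfile passes --dereference to tar so that the archive contains the file contents rather than the links.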
#include_in_tagfile_manifests?(file) ⇒ Boolean
    # File 'lib/moab/bagger.rb', line 248
    def include_in_tagfile_manifests?(file)
      basename = file.basename.to_s
      return false if file.directory? || basename.start_with?('tagmanifest') || basename.match?(/\A\.nfs\w+\z/)

      true
    end
#reconstuct_group(group_id, storage_object_dir) ⇒ Boolean
Adds all the files listed in the group inventory to the bag as symbolic links, locating each source file through the signature catalog. Returns true if successful. If the group is not found in the inventory, or has no files, the method returns early; note that `return nil?` in the source evaluates to false rather than nil.
    # File 'lib/moab/bagger.rb', line 154
    def reconstuct_group(group_id, storage_object_dir)
      group = bag_inventory.group(group_id)
      return nil? if group.nil? || group.files.empty?

      target_dir = bag_pathname.join('data', group_id)
      group.files.each do |file|
        catalog_entry = signature_catalog.signature_hash[file.signature]
        source = storage_object_dir.join(catalog_entry.storage_path)
        file.instances.each do |instance|
          target = target_dir.join(instance.path)
          target.parent.mkpath
          FileUtils.symlink source, target unless target.exist?
        end
      end
      true
    end
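The lookup performed here, from a file's signature to its single stored copy and then out to every path where that content appears in the version, can be sketched with plain hashes. Everything below is a hypothetical stand-in for the Moab objects: `files` maps a checksum to the instance paths in the version, and `catalog` maps a checksum to a storage path.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# files:   checksum => paths at which that content appears in the version
# catalog: checksum => path of the stored copy, relative to storage_dir
def reconstruct_sketch(files, catalog, storage_dir, target_dir)
  files.each do |checksum, instance_paths|
    source = storage_dir.join(catalog.fetch(checksum))
    instance_paths.each do |path|
      target = target_dir.join(path)
      target.parent.mkpath
      # Skip targets that already exist, as the method above does.
      FileUtils.symlink(source, target) unless target.exist?
    end
  end
  true
end

Dir.mktmpdir do |dir|
  storage = Pathname.new(dir).join('storage')
  stored = storage.join('v0001', 'data', 'content', 'logo.png')
  stored.parent.mkpath
  stored.write('same bytes')

  files = { 'abc123' => ['logo.png', 'thumbs/logo-copy.png'] }
  catalog = { 'abc123' => 'v0001/data/content/logo.png' }
  target_dir = Pathname.new(dir).join('bag', 'data', 'content')

  reconstruct_sketch(files, catalog, storage, target_dir)
  puts target_dir.join('thumbs', 'logo-copy.png').read
end
```

One stored copy can therefore back several payload paths, which is how duplicate content within a version is regenerated without being stored more than once.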
#reset_bag ⇒ void
This method returns an undefined value.
Deletes any existing bag data and re-initializes the bag directory.
    # File 'lib/moab/bagger.rb', line 51
    def reset_bag
      delete_bag
      delete_tarfile
      create_bagit_txt
    end
#shell_execute(command) ⇒ Object
Executes a system command in a subprocess and returns its stdout. If the command isn’t successful, captures stdout and stderr and puts them in the Ruby exception message.
    # File 'lib/moab/bagger.rb', line 274
    def shell_execute(command)
      require 'open3'
      stdout, stderr, status = Open3.capture3(command.chomp)
      if status.success? && status.exitstatus.zero?
        stdout
      else
        msg = "Shell command failed: [#{command}] caused by <STDERR = #{stderr}>"
        msg << " STDOUT = #{stdout}" if stdout&.length&.positive?
        raise(MoabStandardError, msg)
      end
    rescue SystemCallError => e
      msg = "Shell command failed: [#{command}] caused by #{e.inspect}"
      raise(MoabStandardError, msg)
    end
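The capture-and-raise pattern can be tried in isolation with the sketch below. It is a stand-alone variant under stated assumptions: `run_or_raise` is a hypothetical name, RuntimeError stands in for MoabStandardError (which is not defined outside the library), and the command is passed as an argument list so no shell is involved. The current Ruby interpreter is used as the child process so the example does not depend on any particular system tool.

```ruby
require 'open3'
require 'rbconfig'

# Runs a command and returns its stdout, or raises with stderr (and any
# stdout) folded into the message. RuntimeError stands in for the
# library's MoabStandardError.
def run_or_raise(*command)
  stdout, stderr, status = Open3.capture3(*command)
  return stdout if status.success?

  msg = "Shell command failed: [#{command.join(' ')}] caused by <STDERR = #{stderr.chomp}>"
  msg += " STDOUT = #{stdout.chomp}" unless stdout.empty?
  raise msg
rescue SystemCallError => e
  # Raised, for example, when the executable cannot be found.
  raise "Shell command failed: [#{command.join(' ')}] caused by #{e.inspect}"
end

puts run_or_raise(RbConfig.ruby, '-e', 'puts "ok"')

begin
  run_or_raise(RbConfig.ruby, '-e', 'warn "disk full"; exit 3')
rescue RuntimeError => e
  puts e.message
end
```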