Class: Moab::Bagger
- Inherits: Object
- Defined in: lib/moab/bagger.rb
Overview
Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.
A class used to create a BagIt package from a version inventory and a set of source files. The #fill_bag method is called with a package_mode parameter that specifies whether the bag is being created for deposit into the repository or is to contain the output of a version reconstruction.
- In :depositor mode, the version inventory is filtered using the digital object’s signature catalog so that only new files are included.
- In :reconstructor mode, the version inventory and signature catalog are used together to regenerate the complete set of files for the version.
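The difference between the two modes can be sketched with plain Ruby hashes. This is an illustration only: `select_for_bag`, the hash shapes, and the sample checksums are hypothetical stand-ins rather than the Moab API (in the class itself the filtering is delegated to the signature catalog), and raising on an unknown mode is a choice made for this sketch.

```ruby
require 'set'

# Hypothetical model: an inventory is a { path => checksum } hash and the
# catalog is a Set of checksums already stored in the repository.
def select_for_bag(package_mode, version_inventory, catalog_signatures)
  case package_mode
  when :depositor
    # Keep only files whose content is not yet known to the catalog.
    version_inventory.reject { |_path, checksum| catalog_signatures.include?(checksum) }
  when :reconstructor
    # The whole version is regenerated, so every file is included.
    version_inventory.dup
  else
    # Assumption of this sketch: unknown modes are rejected outright.
    raise ArgumentError, "unknown package_mode: #{package_mode.inspect}"
  end
end

inventory = { 'content/a.txt' => 'aaa111', 'content/b.txt' => 'bbb222' }
catalog   = Set['aaa111']

select_for_bag(:depositor, inventory, catalog)     # only content/b.txt is new
select_for_bag(:reconstructor, inventory, catalog) # both files
```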
Data Model
- StorageRepository = represents a digital object repository storage node
- StorageServices = supports application layer access to the repository’s objects, data, and metadata
- StorageObject = represents a digital object’s repository storage location and ingest/dissemination methods
- StorageObjectVersion [1..*] = represents a version subdirectory within an object’s home directory
- Bagger [1] = utility for creating bagit packages for ingest or dissemination
Instance Attribute Summary
- #bag_inventory ⇒ FileInventory
  The actual inventory of the files to be packaged (derived from @version_inventory in #fill_bag).
- #bag_pathname ⇒ Pathname
  The location of the Bagit bag to be created.
- #package_mode ⇒ Symbol
  The operational mode, which controls what gets bagged (in #fill_bag) and the full path of source files (in #fill_payload).
- #signature_catalog ⇒ SignatureCatalog
  The signature catalog, used to specify source paths (in :reconstructor mode), or to filter the version inventory (in :depositor mode).
- #version_inventory ⇒ FileInventory
  The complete inventory of the files comprising a digital object version.
Instance Method Summary
- #create_bag_info_txt ⇒ void
  Generate the bag-info.txt tag file.
- #create_bag_inventory(package_mode) ⇒ FileInventory
  Create, write, and return the inventory of the files that will become the payload.
- #create_bagit_txt ⇒ void
  Generate the bagit.txt tag file.
- #create_payload_manifests ⇒ void
  Using the checksum information from the inventory, create BagIt manifest files for the payload.
- #create_tagfile_manifests ⇒ void
  Create BagIt tag manifest files containing checksums for all files in the bag’s root directory.
- #create_tagfiles ⇒ Boolean
  Create BagIt manifests and tag files.
- #create_tarfile(tar_pathname = nil) ⇒ Boolean
  Create a tar file containing the bag.
- #delete_bag ⇒ NilClass
  Delete the bagit files.
- #delete_tarfile ⇒ Object
  Delete the bag’s tar file, if it exists.
- #deposit_group(group_id, source_dir) ⇒ Boolean
  Link all the files listed in the group inventory into the bag.
- #fill_bag(package_mode, source_base_pathname) ⇒ Bagger
  Perform all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles.
- #fill_payload(source_base_pathname) ⇒ void
  Fill the bag’s data directory with links to the source files rather than copies, which greatly speeds up the process.
- #initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger (constructor)
  A new instance of Bagger.
- #reconstuct_group(group_id, storage_object_dir) ⇒ Boolean
  Link all the files listed in the group inventory into the bag, using the signature catalog to locate each source file.
- #reset_bag ⇒ void
  Delete any existing bag data and re-initialize the bag directory.
- #shell_execute(command) ⇒ Object
  Executes a system command in a subprocess. If the command is not successful, captures stdout and stderr and puts them in the Ruby exception message.
Constructor Details
#initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger
Returns a new instance of Bagger.
    # File 'lib/moab/bagger.rb', line 24
    def initialize(version_inventory, signature_catalog, bag_pathname)
      @version_inventory = version_inventory
      @signature_catalog = signature_catalog
      @bag_pathname = Pathname.new(bag_pathname)
      create_bagit_txt
    end
Instance Attribute Details
#bag_inventory ⇒ FileInventory
Returns the actual inventory of the files to be packaged (derived from @version_inventory in #fill_bag).
    # File 'lib/moab/bagger.rb', line 42
    def bag_inventory
      @bag_inventory
    end
#bag_pathname ⇒ Pathname
Returns the location of the Bagit bag to be created.
    # File 'lib/moab/bagger.rb', line 39
    def bag_pathname
      @bag_pathname
    end
#package_mode ⇒ Symbol
Returns the operational mode, which controls what gets bagged (in #fill_bag) and the full path of source files (in #fill_payload).
    # File 'lib/moab/bagger.rb', line 46
    def package_mode
      @package_mode
    end
#signature_catalog ⇒ SignatureCatalog
Returns the signature catalog, used to specify source paths (in :reconstructor mode), or to filter the version inventory (in :depositor mode).
    # File 'lib/moab/bagger.rb', line 36
    def signature_catalog
      @signature_catalog
    end
#version_inventory ⇒ FileInventory
Returns the complete inventory of the files comprising a digital object version.
    # File 'lib/moab/bagger.rb', line 32
    def version_inventory
      @version_inventory
    end
Instance Method Details
#create_bag_info_txt ⇒ void
This method returns an undefined value.
Generates the bag-info.txt tag file.
    # File 'lib/moab/bagger.rb', line 215
    def create_bag_info_txt
      bag_pathname.join("bag-info.txt").open('w') do |f|
        f.puts "External-Identifier: #{bag_inventory.package_id}"
        f.puts "Payload-Oxum: #{bag_inventory.byte_count}.#{bag_inventory.file_count}"
        f.puts "Bag-Size: #{bag_inventory.human_size}"
      end
    end
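The Payload-Oxum value written above is the total payload byte count and the payload file count joined by a dot. A minimal sketch of how those lines are assembled, assuming a hypothetical `bag_info_lines` helper that takes the file sizes directly instead of a FileInventory (Bag-Size is left out because the format of `human_size` is not shown here):

```ruby
# Hypothetical helper: package_id is a String, file_sizes an Array of byte counts.
def bag_info_lines(package_id, file_sizes)
  [
    "External-Identifier: #{package_id}",
    # Payload-Oxum is "<total bytes>.<number of files>".
    "Payload-Oxum: #{file_sizes.sum}.#{file_sizes.length}"
  ]
end

puts bag_info_lines('example-object-v0001', [1024, 2048, 512])
```

The identifier `example-object-v0001` is made up for the example.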
#create_bag_inventory(package_mode) ⇒ FileInventory
Creates, writes, and returns the inventory of the files that will become the payload.
    # File 'lib/moab/bagger.rb', line 105
    def create_bag_inventory(package_mode)
      @package_mode = package_mode
      bag_pathname.mkpath
      case package_mode
      when :depositor
        version_inventory.write_xml_file(bag_pathname, 'version')
        @bag_inventory = signature_catalog.version_additions(version_inventory)
        bag_inventory.write_xml_file(bag_pathname, 'additions')
      when :reconstructor
        @bag_inventory = version_inventory
        bag_inventory.write_xml_file(bag_pathname, 'version')
      end
      bag_inventory
    end
#create_bagit_txt ⇒ void
This method returns an undefined value.
Generates the bagit.txt tag file.
    # File 'lib/moab/bagger.rb', line 57
    def create_bagit_txt
      bag_pathname.mkpath
      bag_pathname.join("bagit.txt").open('w') do |f|
        f.puts "Tag-File-Character-Encoding: UTF-8"
        f.puts "BagIt-Version: 0.97"
      end
    end
#create_payload_manifests ⇒ void
This method returns an undefined value.
Using the checksum information from the inventory, creates BagIt manifest files for the payload. A manifest file that ends up empty is deleted.
    # File 'lib/moab/bagger.rb', line 185
    def create_payload_manifests
      manifest_pathname = {}
      manifest_file = {}
      DEFAULT_CHECKSUM_TYPES.each do |type|
        manifest_pathname[type] = bag_pathname.join("manifest-#{type}.txt")
        manifest_file[type] = manifest_pathname[type].open('w')
      end
      bag_inventory.groups.each do |group|
        group.files.each do |file|
          fixity = file.signature.fixity
          file.instances.each do |instance|
            data_path = File.join('data', group.group_id, instance.path)
            DEFAULT_CHECKSUM_TYPES.each do |type|
              manifest_file[type].puts("#{fixity[type]} #{data_path}") if fixity[type]
            end
          end
        end
      end
    ensure
      DEFAULT_CHECKSUM_TYPES.each do |type|
        if manifest_file[type]
          manifest_file[type].close
          manifest_pathname[type].delete if manifest_pathname[type].exist? && manifest_pathname[type].size == 0
        end
      end
    end
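Each manifest line pairs a checksum with a path under data/. The sketch below shows that line format with Ruby’s standard Digest library. It is an assumption-laden illustration: `manifest_lines` and its hash argument are hypothetical, and it hashes in-memory strings, whereas the method above reads precomputed fixity values from the inventory.

```ruby
require 'digest'

# Hypothetical helper: payload maps a path relative to the bag root to the
# file's content; algorithm is any Digest class (MD5, SHA1, SHA256, ...).
def manifest_lines(payload, algorithm: Digest::MD5)
  payload.map do |data_path, content|
    # Same "<checksum> <path>" layout as the puts call in the listing above.
    "#{algorithm.hexdigest(content)} #{data_path}"
  end
end

payload = { 'data/content/a.txt' => 'hello', 'data/metadata/b.xml' => '<b/>' }
puts manifest_lines(payload)
puts manifest_lines(payload, algorithm: Digest::SHA256)
```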
#create_tagfile_manifests ⇒ void
This method returns an undefined value.
Creates BagIt tag manifest files containing checksums for all files in the bag’s root directory. Subdirectories and existing tagmanifest files are skipped, and a tag manifest that ends up empty is deleted.
    # File 'lib/moab/bagger.rb', line 225
    def create_tagfile_manifests
      manifest_pathname = {}
      manifest_file = {}
      DEFAULT_CHECKSUM_TYPES.each do |type|
        manifest_pathname[type] = bag_pathname.join("tagmanifest-#{type}.txt")
        manifest_file[type] = manifest_pathname[type].open('w')
      end
      bag_pathname.children.each do |file|
        unless file.directory? || file.basename.to_s[0, 11] == 'tagmanifest'
          signature = FileSignature.new.signature_from_file(file)
          fixity = signature.fixity
          DEFAULT_CHECKSUM_TYPES.each do |type|
            manifest_file[type].puts("#{fixity[type]} #{file.basename}") if fixity[type]
          end
        end
      end
    ensure
      DEFAULT_CHECKSUM_TYPES.each do |type|
        if manifest_file[type]
          manifest_file[type].close
          manifest_pathname[type].delete if manifest_pathname[type].exist? && manifest_pathname[type].size == 0
        end
      end
    end
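The selection rule in the listing above (skip directories, skip anything whose name starts with "tagmanifest") can be tried in isolation. The following is a sketch using only the standard library; `tagfiles_to_checksum` is a hypothetical name, and it returns file names instead of writing checksums.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical helper: list the root-level files a tag manifest would cover.
def tagfiles_to_checksum(bag_dir)
  Pathname.new(bag_dir).children
          .reject { |child| child.directory? || child.basename.to_s.start_with?('tagmanifest') }
          .map { |child| child.basename.to_s }
          .sort
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir(File.join(dir, 'data'))
  %w[bagit.txt bag-info.txt manifest-md5.txt tagmanifest-md5.txt].each do |name|
    File.write(File.join(dir, name), "placeholder\n")
  end
  # data/ and tagmanifest-md5.txt are left out of the result.
  puts tagfiles_to_checksum(dir)
end
```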
#create_tagfiles ⇒ Boolean
Creates BagIt manifests and tag files. Returns true if successful.
    # File 'lib/moab/bagger.rb', line 175
    def create_tagfiles
      create_payload_manifests
      create_bag_info_txt
      create_bagit_txt
      create_tagfile_manifests
      true
    end
#create_tarfile(tar_pathname = nil) ⇒ Boolean
Creates a tar file containing the bag. If tar_pathname is not given, the tar file is written next to the bag as "<bag name>.tar". If the tar command fails, it is retried once without the --force-local option. Returns true if successful and raises an error if the tar file does not exist afterwards.
    # File 'lib/moab/bagger.rb', line 252
    def create_tarfile(tar_pathname = nil)
      bag_name = bag_pathname.basename
      bag_parent = bag_pathname.parent
      tar_pathname ||= bag_parent.join("#{bag_name}.tar")
      tar_cmd = "cd '#{bag_parent}'; tar --dereference --force-local -cf '#{tar_pathname}' '#{bag_name}'"
      begin
        shell_execute(tar_cmd)
      rescue
        shell_execute(tar_cmd.sub('--force-local', ''))
      end
      raise "Unable to create tarfile #{tar_pathname}" unless tar_pathname.exist?
      true
    end
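The begin/rescue in the listing above retries the command with one option removed. That pattern can be sketched on its own with a stand-in runner, so that no tar binary is needed. `run_with_flag_fallback` and `picky_runner` are hypothetical names for this illustration, not part of the class.

```ruby
# Hypothetical helper: run a command through `runner`; if that raises a
# StandardError, retry once with `flag` removed from the command string.
def run_with_flag_fallback(command, flag, runner)
  runner.call(command)
rescue StandardError
  runner.call(command.sub(flag, ''))
end

# Stand-in for a tar that does not understand --force-local.
picky_runner = lambda do |cmd|
  raise StandardError, 'unrecognized option' if cmd.include?('--force-local')

  cmd
end

command = "tar --dereference --force-local -cf 'bag.tar' 'bag'"
puts run_with_flag_fallback(command, '--force-local', picky_runner)
```

If the retry also raises, the error propagates to the caller, which matches the behaviour of the listing.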
#delete_bag ⇒ NilClass
Deletes the bagit files, but only if the directory contains a bagit.txt file. Returns nil.
    # File 'lib/moab/bagger.rb', line 66
    def delete_bag
      # make sure this looks like a bag before deleting
      if bag_pathname.join('bagit.txt').exist?
        if bag_pathname.join('data').exist?
          bag_pathname.rmtree
        else
          bag_pathname.children.each(&:delete)
          bag_pathname.rmdir
        end
      end
      nil
    end
#delete_tarfile ⇒ Object
    # File 'lib/moab/bagger.rb', line 80
    def delete_tarfile
      bag_name = bag_pathname.basename
      bag_parent = bag_pathname.parent
      tar_pathname = bag_parent.join("#{bag_name}.tar")
      tar_pathname.delete if tar_pathname.exist?
    end
#deposit_group(group_id, source_dir) ⇒ Boolean
Links all the files listed in the group inventory into the bag (the listing uses FileUtils.symlink rather than copying). Returns true if successful. The early exit for a group that is missing from the inventory or has no files is written as `return nil?`, which calls #nil? on the Bagger instance and therefore returns false rather than nil.
    # File 'lib/moab/bagger.rb', line 141
    def deposit_group(group_id, source_dir)
      group = bag_inventory.group(group_id)
      return nil? if group.nil? || group.files.empty?
      target_dir = bag_pathname.join('data', group_id)
      group.path_list.each do |relative_path|
        source = source_dir.join(relative_path)
        target = target_dir.join(relative_path)
        target.parent.mkpath
        FileUtils.symlink source, target
      end
      true
    end
#fill_bag(package_mode, source_base_pathname) ⇒ Bagger
Performs all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles. Returns the Bagger itself.
    # File 'lib/moab/bagger.rb', line 94
    def fill_bag(package_mode, source_base_pathname)
      create_bag_inventory(package_mode)
      fill_payload(source_base_pathname)
      create_tagfiles
      self
    end
#fill_payload(source_base_pathname) ⇒ void
This method returns an undefined value.
This method links files into the bag instead of copying them, which greatly speeds up the process. The helpers it calls (#deposit_group and #reconstuct_group) create symbolic links with FileUtils.symlink, and #create_tarfile passes --dereference to tar so that the archive contains the file contents rather than the links.
    # File 'lib/moab/bagger.rb', line 125
    def fill_payload(source_base_pathname)
      bag_inventory.groups.each do |group|
        group_id = group.group_id
        case package_mode
        when :depositor
          deposit_group(group_id, source_base_pathname.join(group_id))
        when :reconstructor
          reconstuct_group(group_id, source_base_pathname)
        end
      end
    end
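The helper methods listed on this page populate the payload with FileUtils.symlink. A minimal sketch of that step, assuming a POSIX filesystem with symlink support; `link_into_bag` is a hypothetical name, and the temporary directory stands in for both the source location and the bag.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical helper: create target's parent directories, then make target
# a symbolic link pointing at source. Returns the target Pathname.
def link_into_bag(source, target)
  target = Pathname.new(target)
  target.parent.mkpath
  FileUtils.symlink(source.to_s, target.to_s)
  target
end

Dir.mktmpdir do |dir|
  source = Pathname.new(dir).join('source.txt')
  source.write('payload bytes')

  target = link_into_bag(source, Pathname.new(dir).join('bag', 'data', 'content', 'source.txt'))
  puts target.symlink? # the bag entry is a link, not a copy
  puts target.read     # reading follows the link to the source content
end
```

Because the bag entries are links, anything that packages the bag has to follow them to pick up the real content.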
#reconstuct_group(group_id, storage_object_dir) ⇒ Boolean
Links all the files listed in the group inventory into the bag, looking up each file’s storage path in the signature catalog (the listing uses FileUtils.symlink rather than copying). Returns true if successful. The early exit for a group that is missing from the inventory or has no files is written as `return nil?`, which calls #nil? on the Bagger instance and therefore returns false rather than nil.
    # File 'lib/moab/bagger.rb', line 158
    def reconstuct_group(group_id, storage_object_dir)
      group = bag_inventory.group(group_id)
      return nil? if group.nil? || group.files.empty?
      target_dir = bag_pathname.join('data', group_id)
      group.files.each do |file|
        catalog_entry = signature_catalog.signature_hash[file.signature]
        source = storage_object_dir.join(catalog_entry.storage_path)
        file.instances.each do |instance|
          target = target_dir.join(instance.path)
          target.parent.mkpath
          FileUtils.symlink source, target
        end
      end
      true
    end
#reset_bag ⇒ void
This method returns an undefined value.
Deletes any existing bag data and re-initializes the bag directory.
    # File 'lib/moab/bagger.rb', line 49
    def reset_bag
      delete_bag
      delete_tarfile
      create_bagit_txt
    end
#shell_execute(command) ⇒ Object
Executes a system command in a subprocess and returns its stdout. If the command is not successful, captures stdout and stderr and puts them in the Ruby exception message.
    # File 'lib/moab/bagger.rb', line 269
    def shell_execute(command)
      require 'open3'
      stdout, stderr, status = Open3.capture3(command.chomp)
      if status.success? && status.exitstatus.zero?
        stdout
      else
        msg = "Shell command failed: [#{command}] caused by <STDERR = #{stderr}>"
        msg << " STDOUT = #{stdout}" if stdout && stdout.length.positive?
        raise(StandardError, msg)
      end
    rescue SystemCallError => e
      msg = "Shell command failed: [#{command}] caused by #{e.inspect}"
      raise(StandardError, msg)
    end
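The capture-and-raise behaviour can be exercised without the rest of the class. This is a sketch under stated assumptions: `run_command` is a hypothetical name, it takes the program and its arguments as separate values (so no shell is involved, unlike the single command string used above), and it runs the current Ruby interpreter as the child process so that the example does not depend on any particular system tool.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run a program, return its stdout on success, and
# raise with the captured stderr in the message on a non-zero exit.
def run_command(*argv)
  stdout, stderr, status = Open3.capture3(*argv)
  return stdout if status.success?

  raise StandardError, "Command failed: #{argv.inspect} caused by <STDERR = #{stderr}>"
end

puts run_command(RbConfig.ruby, '-e', 'print "ok"')

begin
  run_command(RbConfig.ruby, '-e', 'warn "boom"; exit 3')
rescue StandardError => e
  puts e.message # the message carries the child's stderr text
end
```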