Class: Moab::Bagger

Inherits:
Object
Defined in:
lib/moab/bagger.rb

Overview

Note:

Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.

A class used to create a BagIt package from a version inventory and a set of source files. The #fill_bag method is called with a package_mode parameter that specifies whether the bag is being created for deposit into the repository or is to contain the output of a version reconstruction.

  • In :depositor mode, the version inventory is filtered using the digital object’s signature catalog so that only new files are included.

  • In :reconstructor mode, the version inventory and signature catalog are used together to regenerate the complete set of files for the version.
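
The difference between the two modes can be illustrated with a deliberately simplified model. The sketch below is not the Moab API: it assumes an inventory is a plain Hash of relative path to checksum and the signature catalog is a Set of checksums already known to the repository. The helper name select_payload is hypothetical.

```ruby
require 'set'

# Hypothetical, simplified stand-in for the Moab inventory classes:
# an inventory is a Hash of relative path => checksum, and the
# catalog is a Set of checksums already held by the repository.
def select_payload(package_mode, version_inventory, known_checksums)
  case package_mode
  when :depositor
    # Keep only files whose content is not yet in the catalog.
    version_inventory.reject { |_path, checksum| known_checksums.include?(checksum) }
  when :reconstructor
    # Every file of the version is regenerated.
    version_inventory.dup
  else
    raise ArgumentError, "unknown package_mode: #{package_mode.inspect}"
  end
end

inventory = { 'content/page-1.jpg' => 'aaa111', 'content/page-2.jpg' => 'bbb222' }
catalog   = Set.new(['aaa111'])

puts select_payload(:depositor, inventory, catalog).keys.inspect
puts select_payload(:reconstructor, inventory, catalog).keys.inspect
```

In the real class the filtering is delegated to SignatureCatalog#version_additions, as shown in #create_bag_inventory.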

Data Model

  • StorageRepository = represents a digital object repository storage node

    • StorageServices = supports application layer access to the repository’s objects, data, and metadata

    • StorageObject = represents a digital object’s repository storage location and ingest/dissemination methods

      • StorageObjectVersion [1..*] = represents a version subdirectory within an object’s home directory

        • Bagger [1] = utility for creating BagIt packages for ingest or dissemination


Constructor Details

#initialize(version_inventory, signature_catalog, bag_pathname) ⇒ Bagger



# File 'lib/moab/bagger.rb', line 27

def initialize(version_inventory, signature_catalog, bag_pathname)
  @version_inventory = version_inventory
  @signature_catalog = signature_catalog
  @bag_pathname = Pathname.new(bag_pathname)
  create_bagit_txt()
end

Instance Attribute Details

#bag_inventory ⇒ FileInventory



# File 'lib/moab/bagger.rb', line 45

def bag_inventory
  @bag_inventory
end

#bag_pathname ⇒ Pathname



# File 'lib/moab/bagger.rb', line 42

def bag_pathname
  @bag_pathname
end

#package_mode ⇒ Symbol



# File 'lib/moab/bagger.rb', line 49

def package_mode
  @package_mode
end

#signature_catalog ⇒ SignatureCatalog



# File 'lib/moab/bagger.rb', line 39

def signature_catalog
  @signature_catalog
end

#version_inventory ⇒ FileInventory



# File 'lib/moab/bagger.rb', line 35

def version_inventory
  @version_inventory
end

Instance Method Details

#create_bag_info_txt ⇒ void

This method returns an undefined value.

Generates the bag-info.txt tag file.



# File 'lib/moab/bagger.rb', line 216

def create_bag_info_txt
  @bag_pathname.join("bag-info.txt").open('w') do |f|
    f.puts "External-Identifier: #{@bag_inventory.package_id}"
    f.puts "Payload-Oxum: #{@bag_inventory.byte_count}.#{@bag_inventory.file_count}"
    f.puts "Bag-Size: #{@bag_inventory.human_size}"
  end
end
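
The Payload-Oxum line written above has the form byte count, a dot, then file count. As a rough illustration of what those two numbers mean, the sketch below derives them by walking a payload directory on disk rather than asking an inventory object. The helper name payload_oxum is hypothetical and not part of Moab.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper: derive "<byte count>.<file count>" by walking
# the payload directory instead of asking an inventory object.
def payload_oxum(data_dir)
  files = Pathname.new(data_dir).glob('**/*').select(&:file?)
  "#{files.sum(&:size)}.#{files.size}"
end

Dir.mktmpdir do |dir|
  data = Pathname.new(dir).join('data', 'content')
  data.mkpath
  data.join('a.txt').write('hello')   # 5 bytes
  data.join('b.txt').write('bagit!')  # 6 bytes
  puts "Payload-Oxum: #{payload_oxum(Pathname.new(dir).join('data'))}"
end
```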

#create_bag_inventory(package_mode) ⇒ FileInventory

Creates, writes, and returns the inventory of the files that will become the payload.



# File 'lib/moab/bagger.rb', line 105

def create_bag_inventory(package_mode)
  @package_mode = package_mode
  @bag_pathname.mkpath
  case package_mode
    when :depositor
      @version_inventory.write_xml_file(@bag_pathname, 'version')
      @bag_inventory = @signature_catalog.version_additions(@version_inventory)
      @bag_inventory.write_xml_file(@bag_pathname, 'additions')
    when :reconstructor
      @bag_inventory = @version_inventory
      @bag_inventory.write_xml_file(@bag_pathname, 'version')
  end
  @bag_inventory
end

#create_bagit_txt ⇒ void

This method returns an undefined value.

Generates the bagit.txt tag file.



# File 'lib/moab/bagger.rb', line 60

def create_bagit_txt()
  @bag_pathname.mkpath
  @bag_pathname.join("bagit.txt").open('w') do |f|
    f.puts "Tag-File-Character-Encoding: UTF-8"
    f.puts "BagIt-Version: 0.97"
  end
end

#create_payload_manifests ⇒ void

This method returns an undefined value.

Uses the checksum information from the inventory to create BagIt manifest files for the payload.



# File 'lib/moab/bagger.rb', line 185

def create_payload_manifests
  manifest_pathname = Hash.new
  manifest_file = Hash.new
  manifest_types =  [:md5, :sha1, :sha256]
  manifest_types.each do |type|
    manifest_pathname[type] = @bag_pathname.join("manifest-#{type.to_s}.txt")
    manifest_file[type] = manifest_pathname[type].open('w')
  end
  @bag_inventory.groups.each do |group|
    group.files.each do |file|
      fixity = file.signature.fixity
      file.instances.each do |instance|
        data_path = File.join('data', group.group_id, instance.path)
        manifest_types.each do |type|
          manifest_file[type].puts("#{fixity[type]} #{data_path}") if fixity[type]
        end
      end
    end
  end
ensure
  manifest_types.each do |type|
    if manifest_file[type]
      manifest_file[type].close
      manifest_pathname[type].delete if
          manifest_pathname[type].exist? and manifest_pathname[type].size == 0
    end
  end
end
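
Each manifest line pairs a checksum with a path relative to the bag root. The method above takes the checksums from the inventory's stored fixity values. The sketch below produces lines of the same shape by hashing files on disk with Ruby's Digest library, which is an assumption made only to keep the example self-contained. The helper name manifest_lines is hypothetical.

```ruby
require 'digest'
require 'pathname'
require 'tmpdir'

# Hypothetical helper: build the lines of one payload manifest by
# hashing files on disk (the method above reads stored fixity values
# from the inventory instead).
def manifest_lines(bag_dir, type)
  digests = { md5: Digest::MD5, sha1: Digest::SHA1, sha256: Digest::SHA256 }
  bag = Pathname.new(bag_dir)
  bag.join('data').glob('**/*').select(&:file?).sort.map do |file|
    checksum = digests.fetch(type).file(file.to_s).hexdigest
    "#{checksum} #{file.relative_path_from(bag)}"
  end
end

Dir.mktmpdir do |dir|
  content = Pathname.new(dir).join('data', 'content')
  content.mkpath
  content.join('a.txt').write('hello')
  puts manifest_lines(dir, :md5)
end
```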

#create_tagfile_manifests ⇒ void

This method returns an undefined value.

Creates BagIt tag manifest files containing checksums for all files in the bag’s root directory. Subdirectories and existing tagmanifest files are skipped.



# File 'lib/moab/bagger.rb', line 226

def create_tagfile_manifests()
  manifest_pathname = Hash.new
  manifest_file = Hash.new
  manifest_types =  [:md5, :sha1, :sha256]
  manifest_types.each do |type|
    manifest_pathname[type] = @bag_pathname.join("tagmanifest-#{type.to_s}.txt")
    manifest_file[type] = manifest_pathname[type].open('w')
  end
  @bag_pathname.children.each do |file|
    unless file.directory? || file.basename.to_s[0, 11] == 'tagmanifest'
      signature = FileSignature.new.signature_from_file(file)
      fixity = signature.fixity
      manifest_types.each do |type|
        manifest_file[type].puts("#{fixity[type]} #{file.basename}") if fixity[type]
      end
    end
  end
ensure
  manifest_types.each do |type|
    if manifest_file[type]
      manifest_file[type].close
      manifest_pathname[type].delete if
          manifest_pathname[type].exist? and manifest_pathname[type].size == 0
    end
  end
end
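
The selection rule in the loop above (skip directories and anything whose name begins with "tagmanifest") can be isolated as follows. This is a sketch only, and the helper name tagfile_candidates is hypothetical.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper mirroring the selection rule above: entries in
# the bag root that are not directories and whose names do not start
# with "tagmanifest".
def tagfile_candidates(bag_dir)
  Pathname.new(bag_dir).children.reject do |file|
    file.directory? || file.basename.to_s.start_with?('tagmanifest')
  end.map { |file| file.basename.to_s }.sort
end

Dir.mktmpdir do |dir|
  bag = Pathname.new(dir)
  bag.join('data').mkpath
  %w[bagit.txt bag-info.txt manifest-md5.txt tagmanifest-md5.txt].each do |name|
    bag.join(name).write('')
  end
  puts tagfile_candidates(bag).inspect
end
```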

#create_tagfiles ⇒ Boolean



# File 'lib/moab/bagger.rb', line 175

def create_tagfiles
  create_payload_manifests
  create_bag_info_txt
  create_bagit_txt
  create_tagfile_manifests
  true
end

#create_tarfile(tar_pathname = nil) ⇒ Boolean



# File 'lib/moab/bagger.rb', line 254

def create_tarfile(tar_pathname=nil)
  bag_name = @bag_pathname.basename
  bag_parent = @bag_pathname.parent
  tar_pathname ||= bag_parent.join("#{bag_name}.tar")
  tar_cmd="cd '#{bag_parent}'; tar --dereference --force-local -cf  '#{tar_pathname}' '#{bag_name}'"
  begin
    shell_execute(tar_cmd)
  rescue
    shell_execute(tar_cmd.sub('--force-local',''))
  end
  raise "Unable to create tarfile #{tar_pathname}" unless tar_pathname.exist?
  return true

end
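
The method above assembles a shell command by wrapping each path in single quotes, and retries without --force-local if the first attempt fails, presumably for tar implementations that reject that option. A path containing a single quote would break that quoting. The sketch below shows one possible alternative that escapes each path with Shellwords from the standard library. It only builds the command string, and the helper name tar_command and its force_local keyword are hypothetical.

```ruby
require 'pathname'
require 'shellwords'

# Hypothetical helper: builds the same tar invocation, but escapes
# each path with Shellwords instead of wrapping it in single quotes.
def tar_command(bag_pathname, tar_pathname = nil, force_local: true)
  bag = Pathname.new(bag_pathname)
  tar = Pathname.new(tar_pathname || bag.parent.join("#{bag.basename}.tar"))
  flags = ['--dereference']
  flags << '--force-local' if force_local
  "cd #{Shellwords.escape(bag.parent.to_s)}; " \
    "tar #{flags.join(' ')} -cf #{Shellwords.escape(tar.to_s)} #{Shellwords.escape(bag.basename.to_s)}"
end

puts tar_command('/tmp/bags/druid')
puts tar_command("/tmp/bags/it's a bag", force_local: false)
```

The --dereference flag matters here because the payload consists of symbolic links: with it, tar archives the files the links point to rather than the links themselves.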

#delete_bag ⇒ NilClass



# File 'lib/moab/bagger.rb', line 69

def delete_bag()
  # make sure this looks like a bag before deleting
  if @bag_pathname.join('bagit.txt').exist?
    if @bag_pathname.join('data').exist?
      @bag_pathname.rmtree
    else
      @bag_pathname.children.each {|file| file.delete}
      @bag_pathname.rmdir
    end
  end
  nil
end

#delete_tarfile ⇒ Object



# File 'lib/moab/bagger.rb', line 83

def delete_tarfile()
  bag_name = @bag_pathname.basename
  bag_parent = @bag_pathname.parent
  tar_pathname = bag_parent.join("#{bag_name}.tar")
  tar_pathname.delete if tar_pathname.exist?
end

#deposit_group(group_id, source_dir) ⇒ Boolean



# File 'lib/moab/bagger.rb', line 141

def deposit_group(group_id, source_dir)
  group = @bag_inventory.group(group_id)
  # nil? is evaluated on the Bagger itself, so this early return yields false
  return nil? if group.nil? or group.files.empty?
  target_dir = @bag_pathname.join('data',group_id)
  group.path_list.each do |relative_path|
    source = source_dir.join(relative_path)
    target = target_dir.join(relative_path)
    target.parent.mkpath
    FileUtils.symlink source, target
  end
  true
end

#fill_bag(package_mode, source_base_pathname) ⇒ Bagger

Performs all the operations required to fill the bag payload, write the manifests and tagfiles, and checksum the tagfiles. Returns self.




# File 'lib/moab/bagger.rb', line 95

def fill_bag(package_mode, source_base_pathname)
  create_bag_inventory(package_mode)
  fill_payload(source_base_pathname)
  create_tagfiles
  self
end

#fill_payload(source_base_pathname) ⇒ void

This method returns an undefined value.

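Fills the bag’s data directory by linking to the source files instead of copying them, which greatly speeds up the process. The original description refers to Unix hard links, which require the target bag to be created within the same filesystem as the source files; the #deposit_group and #reconstuct_group methods shown on this page create the links with FileUtils.symlink.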



# File 'lib/moab/bagger.rb', line 125

def fill_payload(source_base_pathname)
  @bag_inventory.groups.each do |group|
    group_id = group.group_id
    case @package_mode
      when :depositor
        deposit_group(group_id, source_base_pathname.join(group_id))
      when :reconstructor
        reconstuct_group(group_id, source_base_pathname)
    end
  end
end
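
The linking step that the group methods perform can be sketched in isolation. The example below assumes a plain list of relative paths in place of a Moab inventory group, and the helper name link_files is hypothetical.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical helper: link each relative path from source_dir into
# target_dir, creating intermediate directories as needed.
def link_files(source_dir, target_dir, relative_paths)
  relative_paths.each do |relative_path|
    source = Pathname.new(source_dir).join(relative_path)
    target = Pathname.new(target_dir).join(relative_path)
    target.parent.mkpath
    FileUtils.symlink(source.to_s, target.to_s)
  end
end

Dir.mktmpdir do |dir|
  root = Pathname.new(dir)
  root.join('source', 'content').mkpath
  root.join('source', 'content', 'a.txt').write('hello')
  link_files(root.join('source'), root.join('bag', 'data'), ['content/a.txt'])
  linked = root.join('bag', 'data', 'content', 'a.txt')
  puts "symlink? #{linked.symlink?}, content: #{linked.read}"
end
```

Because the link target is stored as given, the source directory should be an absolute path; a relative source would be resolved relative to the link's own directory.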

#reconstuct_group(group_id, storage_object_dir) ⇒ Boolean



# File 'lib/moab/bagger.rb', line 158

def reconstuct_group(group_id, storage_object_dir)
  group = @bag_inventory.group(group_id)
  # nil? is evaluated on the Bagger itself, so this early return yields false
  return nil? if group.nil? or group.files.empty?
  target_dir = @bag_pathname.join('data',group_id)
  group.files.each do |file|
    catalog_entry = @signature_catalog.signature_hash[file.signature]
    source = storage_object_dir.join(catalog_entry.storage_path)
    file.instances.each do |instance|
      target = target_dir.join(instance.path)
      target.parent.mkpath
      FileUtils.symlink source, target
    end
  end
  true
end

#reset_bag ⇒ void



# File 'lib/moab/bagger.rb', line 52

def reset_bag
  delete_bag
  delete_tarfile
  create_bagit_txt
end

#shell_execute(command) ⇒ String

Executes a system command in a subprocess. The method returns stdout from the command if execution was successful, and raises an exception if execution fails. The exception’s message contains the explanation of the failure.



# File 'lib/moab/bagger.rb', line 275

def shell_execute(command)
  status, stdout, stderr = systemu(command)
  if (status.exitstatus != 0)
    raise stderr
  end
  return stdout
rescue
  msg = "Command failed to execute: [#{command}] caused by <STDERR = #{stderr.split($/).join('; ')}>"
  msg << " STDOUT = #{stdout.split($/).join('; ')}" if (stdout && (stdout.length > 0))
  raise msg
end
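
The same contract (return stdout on success, raise with stderr details on failure) can be sketched with Open3 from Ruby's standard library. This swaps in Open3.capture3 for the systemu call used above and is not the library's implementation. The method name run_shell is hypothetical.

```ruby
require 'open3'

# Sketch using Open3 from the standard library in place of systemu.
# Returns stdout on success; raises with stderr (and stdout, if any)
# folded into the message on a non-zero exit status.
def run_shell(command)
  stdout, stderr, status = Open3.capture3(command)
  return stdout if status.success?

  msg = "Command failed to execute: [#{command}] caused by <STDERR = #{stderr.split($/).join('; ')}>"
  msg << " STDOUT = #{stdout.split($/).join('; ')}" unless stdout.empty?
  raise msg
end

puts run_shell('echo hello')

begin
  run_shell('echo oops >&2; exit 3')
rescue RuntimeError => e
  puts e.message
end
```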