Class: Moab::SignatureCatalog
- Inherits:
-
Manifest
- Object
- Manifest
- Moab::SignatureCatalog
- Includes:
- HappyMapper
- Defined in:
- lib/moab/signature_catalog.rb
Overview
Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.
A digital object’s Signature Catalog is derived from a filtered aggregation of the file inventories of a digital object’s set of versions (see #update). It has an entry for every file (identified by FileSignature) found in any of the versions, along with a record of the SDR storage location that was used to preserve a single file instance. Once this catalog has been populated, it has multiple uses:
-
The signature index is used to determine which files of a newly submitted object version are new additions and which are duplicates of files previously ingested (see #version_additions). When a new version contains a mixture of added files and files carried over from the previous version, only the files whose signatures are not already in the catalog need to be stored.
-
Reconstruction of an object version (see Moab::StorageObject#reconstruct_version) requires a combination of a full version’s FileInventory and the SignatureCatalog.
-
The catalog can also be used for performing consistency checks between manifest files and storage.
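The reconstruction use case above can be sketched as a lookup from each file in a version's inventory to the single stored copy recorded in the catalog. This is a minimal illustration, not Moab's implementation: `PlanSignature`, `reconstruction_plan`, and the sample paths are hypothetical stand-ins, and the real logic lives in Moab::StorageObject#reconstruct_version.

```ruby
# Hypothetical stand-in for Moab::FileSignature. Struct supplies value-based
# ==, eql? and hash, so instances can be used as Hash keys.
PlanSignature = Struct.new(:size, :sha1)

# catalog:   Hash of signature => storage path of the single preserved copy
# inventory: Hash of path-within-version => signature
# Returns a Hash of path-within-version => storage path to copy from.
def reconstruction_plan(catalog, inventory)
  inventory.each_with_object({}) do |(path, signature), plan|
    plan[path] = catalog.fetch(signature) { raise KeyError, "no catalog entry for #{path}" }
  end
end

sig_a = PlanSignature.new(120, 'aaa')
sig_b = PlanSignature.new(300, 'bbb')

# Illustrative storage paths; the layout shown here is an assumption.
catalog = {
  sig_a => 'v0001/data/content/page-1.jpg',
  sig_b => 'v0002/data/content/page-2.jpg'
}

# Version 2 carries page-1.jpg over from version 1 and adds page-2.jpg.
v2_inventory = { 'content/page-1.jpg' => sig_a, 'content/page-2.jpg' => sig_b }

reconstruction_plan(catalog, v2_inventory).each do |path, source|
  puts "#{path} <- #{source}"
end
```

Because the carried-over file resolves to a path under an earlier version, no second copy of it has to exist in storage.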
Data Model
-
SignatureCatalog = lookup table containing a cumulative collection of all files ever ingested
-
SignatureCatalogEntry [1..*] = a row in the lookup table containing storage information about a single file
-
FileSignature [1] = file fixity information
Instance Attribute Summary
-
#block_count ⇒ Integer
The total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated).
-
#byte_count ⇒ Integer
The total size (in bytes) of all data files (dynamically calculated).
-
#catalog_datetime ⇒ String
The datetime at which the catalog was updated.
-
#digital_object_id ⇒ String
The object ID (druid).
-
#entries ⇒ Array<SignatureCatalogEntry>
The set of catalog entries, one for each unique file signature.
-
#file_count ⇒ Integer
The total number of data files (dynamically calculated).
-
#signature_hash ⇒ Hash
An index having FileSignature objects as keys and SignatureCatalogEntry objects as values.
-
#version_id ⇒ Integer
The ordinal version number.
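The three dynamically calculated totals can be derived from the entries alone. The sketch below uses hypothetical stand-ins (`Fixity`, `Row`, and the three helper methods); the rule of rounding each file up to whole 1 kB blocks is an assumption about how a `du -k` estimate would be produced, not a statement of Moab's code.

```ruby
# Hypothetical stand-ins for FileSignature and SignatureCatalogEntry.
Fixity = Struct.new(:size, :sha1)
Row    = Struct.new(:version_id, :group_id, :path, :signature)

BLOCK_SIZE = 1024 # bytes per block, matching the unit reported by `du -k`

# Number of rows in the lookup table.
def file_count(rows)
  rows.size
end

# Sum of the byte sizes recorded in each row's signature.
def byte_count(rows)
  rows.sum { |row| row.signature.size }
end

# Each file is rounded up to whole blocks before summing (assumed rule).
def block_count(rows)
  rows.sum { |row| (row.signature.size + BLOCK_SIZE - 1) / BLOCK_SIZE }
end

rows = [
  Row.new(1, 'content',  'page-1.jpg',       Fixity.new(1500, 'aaa')),
  Row.new(1, 'metadata', 'descMetadata.xml', Fixity.new(1024, 'bbb')),
  Row.new(2, 'content',  'page-2.jpg',       Fixity.new(1,    'ccc'))
]

puts file_count(rows)  # 3
puts byte_count(rows)  # 2525
puts block_count(rows) # 4 (2 + 1 + 1)
```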
Instance Method Summary
-
#add_entry(entry) ⇒ void
Add a new entry to the catalog and to the #signature_hash index.
-
#catalog_filepath(file_signature) ⇒ String
The object-relative path of the file having the specified signature.
-
#composite_key ⇒ String
The unique identifier concatenating digital object id with version id.
-
#initialize(opts = {}) ⇒ SignatureCatalog
constructor
A new instance of SignatureCatalog.
-
#normalize_group_signatures(group, group_pathname = nil) ⇒ void
Inspect and upgrade the group’s signature data to include all desired checksums.
-
#summary_fields ⇒ Array<String>
The data fields to include in summary reports.
-
#update(version_inventory, data_pathname) ⇒ void
Compares the FileSignature entries in the new version’s FileInventory against the signatures in this catalog and creates new SignatureCatalogEntry additions to the catalog.
-
#version_additions(version_inventory) ⇒ FileInventory
Returns a filtered copy of the input inventory containing only those files that were added in this version.
Constructor Details
#initialize(opts = {}) ⇒ SignatureCatalog
# File 'lib/moab/signature_catalog.rb', line 35
def initialize(opts={})
  @entries = Array.new
  @signature_hash = Hash.new
  super(opts)
end
Instance Attribute Details
#block_count ⇒ Integer
# File 'lib/moab/signature_catalog.rb', line 84
attribute :block_count, Integer, :tag => 'blockCount', :on_save => Proc.new {|t| t.to_s}
#byte_count ⇒ Integer
# File 'lib/moab/signature_catalog.rb', line 76
attribute :byte_count, Integer, :tag => 'byteCount', :on_save => Proc.new {|t| t.to_s}
#catalog_datetime ⇒ String
# File 'lib/moab/signature_catalog.rb', line 56
attribute :catalog_datetime, Time, :tag => 'catalogDatetime'
#digital_object_id ⇒ String
# File 'lib/moab/signature_catalog.rb', line 43
attribute :digital_object_id, String, :tag => 'objectId'
#entries ⇒ Array<SignatureCatalogEntry>
# File 'lib/moab/signature_catalog.rb', line 98
has_many :entries, SignatureCatalogEntry, :tag => 'entry'
#file_count ⇒ Integer
# File 'lib/moab/signature_catalog.rb', line 68
attribute :file_count, Integer, :tag => 'fileCount', :on_save => Proc.new {|t| t.to_s}
#signature_hash ⇒ Hash
# File 'lib/moab/signature_catalog.rb', line 107
def signature_hash
  @signature_hash
end
#version_id ⇒ Integer
# File 'lib/moab/signature_catalog.rb', line 47
attribute :version_id, Integer, :tag => 'versionId', :key => true, :on_save => Proc.new {|n| n.to_s}
Instance Method Details
#add_entry(entry) ⇒ void
This method returns an undefined value.
Adds a new entry to the catalog and to the #signature_hash index.
# File 'lib/moab/signature_catalog.rb', line 112
def add_entry(entry)
  @signature_hash[entry.signature] = entry
  entries << entry
end
#catalog_filepath(file_signature) ⇒ String
Returns the object-relative path of the file having the specified signature.
# File 'lib/moab/signature_catalog.rb', line 119
def catalog_filepath(file_signature)
  catalog_entry = @signature_hash[file_signature]
  raise FileNotFoundException, "catalog entry not found for #{file_signature.fixity.inspect} in #{@digital_object_id} - #{@version_id}" if catalog_entry.nil?
  catalog_entry.storage_path
end
#composite_key ⇒ String
# File 'lib/moab/signature_catalog.rb', line 50
def composite_key
  @digital_object_id + '-' + StorageObject.version_dirname(@version_id)
end
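The shape of the resulting key depends on StorageObject.version_dirname, which is not shown on this page. The sketch below assumes it yields "v" followed by a zero-padded, four-digit version number; both helper methods and the sample druid are hypothetical.

```ruby
# Assumed behaviour of StorageObject.version_dirname: "v" plus a
# zero-padded, four-digit version number.
def version_dirname(version_id)
  format('v%04d', version_id)
end

# Joins the object id and the version directory name with a hyphen,
# mirroring the concatenation in #composite_key.
def composite_key(digital_object_id, version_id)
  "#{digital_object_id}-#{version_dirname(version_id)}"
end

puts composite_key('druid:ab123cd4567', 3) # druid:ab123cd4567-v0003
```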
#normalize_group_signatures(group, group_pathname = nil) ⇒ void
# File 'lib/moab/signature_catalog.rb', line 128
def normalize_group_signatures(group, group_pathname=nil)
  unless group_pathname.nil?
    group_pathname = Pathname(group_pathname)
    raise "Could not locate #{group_pathname}" unless group_pathname.exist?
  end
  group.files.each do |file|
    unless file.signature.complete?
      if @signature_hash.has_key?(file.signature)
        file.signature = @signature_hash.find {|k,v| k == file.signature}[0]
      elsif group_pathname
        file_pathname = group_pathname.join(file.instances[0].path)
        file.signature = file.signature.normalized_signature(file_pathname)
      end
    end
  end
end
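The method above only works if an incomplete signature can locate its complete counterpart in the index, which is why it retrieves the stored key rather than the value. The sketch below shows one way that matching could behave. `SketchSignature` and `normalize` are hypothetical, and the equality rule (same size, and every checksum carried by both sides agrees) is an assumption about FileSignature rather than something this page documents.

```ruby
require 'digest'

# Hypothetical stand-in for Moab::FileSignature.
class SketchSignature
  ALGORITHMS = %i[md5 sha1 sha256].freeze
  attr_reader :size, :checksums

  def initialize(size, checksums)
    @size = size
    @checksums = checksums
  end

  # Complete means every desired checksum is present.
  def complete?
    ALGORITHMS.all? { |name| checksums.key?(name) }
  end

  # Assumed rule: sizes agree and all shared checksums agree.
  def eql?(other)
    return false unless other.is_a?(SketchSignature) && size == other.size
    shared = checksums.keys & other.checksums.keys
    !shared.empty? && shared.all? { |name| checksums[name] == other.checksums[name] }
  end
  alias_method :==, :eql?

  # Hash on size only, so partial and complete signatures share a bucket.
  def hash
    size.hash
  end

  # Build a complete signature by reading the file at the given path.
  def self.from_file(path)
    data = File.binread(path)
    new(data.bytesize, { md5: Digest::MD5.hexdigest(data),
                         sha1: Digest::SHA1.hexdigest(data),
                         sha256: Digest::SHA256.hexdigest(data) })
  end
end

# Prefer the complete signature already in the index; otherwise recompute
# from the file on disk when a path is available.
def normalize(signature, index, file_path = nil)
  return signature if signature.complete?
  return index.keys.find { |known| known == signature } if index.key?(signature)
  return SketchSignature.from_file(file_path) if file_path
  signature
end

complete = SketchSignature.new(5, { md5: 'm', sha1: 's1', sha256: 's256' })
index = { complete => 'catalog entry' }
partial = SketchSignature.new(5, { md5: 'm' })
puts normalize(partial, index).complete? # true
```

Unlike this sketch, a production version would presumably also confirm that the checksums computed from disk agree with the partial signature before replacing it.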
#summary_fields ⇒ Array<String>
92 93 94 |
# File 'lib/moab/signature_catalog.rb', line 92 def summary_fields %w{digital_object_id version_id catalog_datetime file_count byte_count block_count} end |
#update(version_inventory, data_pathname) ⇒ void
This method returns an undefined value.
Compares the FileSignature entries in the new version’s FileInventory against the signatures in this catalog and creates new Moab::SignatureCatalogEntry additions to the catalog.
# File 'lib/moab/signature_catalog.rb', line 151
def update(version_inventory, data_pathname)
  version_inventory.groups.each do |group|
    group.files.each do |file|
      unless @signature_hash.has_key?(file.signature)
        entry = SignatureCatalogEntry.new
        entry.version_id = version_inventory.version_id
        entry.group_id = group.group_id
        entry.path = file.instances[0].path
        if file.signature.complete?
          entry.signature = file.signature
        else
          file_pathname = data_pathname.join(group.group_id,entry.path)
          entry.signature = file.signature.normalized_signature(file_pathname)
        end
        add_entry(entry)
      end
    end
  end
  @version_id = version_inventory.version_id
  @catalog_datetime = Time.now
end
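The effect of calling #update once per version can be sketched with plain hashes. `UpdateSignature`, `UpdateEntry`, `update_index`, and the inventory layout are hypothetical simplifications; the sketch omits signature normalization and the catalog's own version and timestamp bookkeeping.

```ruby
# Hypothetical stand-ins for FileSignature and SignatureCatalogEntry.
UpdateSignature = Struct.new(:size, :sha1)
UpdateEntry     = Struct.new(:version_id, :group_id, :path, :signature)

# index:     Hash of signature => entry, mutated in place
# inventory: { version_id: Integer, groups: { group_id => { path => signature } } }
# Returns the entries this version contributed to the index.
def update_index(index, inventory)
  added = []
  inventory[:groups].each do |group_id, files|
    files.each do |path, signature|
      next if index.key?(signature) # already preserved by an earlier file
      entry = UpdateEntry.new(inventory[:version_id], group_id, path, signature)
      index[signature] = entry
      added << entry
    end
  end
  added
end

index = {}
sig_a = UpdateSignature.new(10, 'a')
sig_b = UpdateSignature.new(20, 'b')

update_index(index, { version_id: 1, groups: { 'content' => { 'one.txt' => sig_a } } })
added = update_index(index, { version_id: 2,
                              groups: { 'content' => { 'renamed.txt' => sig_a,
                                                       'two.txt' => sig_b } } })
puts added.map(&:path).inspect
puts index[sig_a].version_id
```

The renamed file keeps pointing at its version 1 entry, so only `two.txt` is recorded as new in version 2.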
#version_additions(version_inventory) ⇒ FileInventory
Returns a filtered copy of the input inventory containing only those files that were added in this version.
# File 'lib/moab/signature_catalog.rb', line 178
def version_additions(version_inventory)
  version_additions = FileInventory.new(:type=>'additions')
  version_additions.copy_ids(version_inventory)
  version_inventory.groups.each do |group|
    group_addtions = FileGroup.new(:group_id => group.group_id)
    group.files.each do |file|
      unless @signature_hash.has_key?(file.signature)
        group_addtions.add_file_instance(file.signature,file.instances[0])
      end
    end
    version_additions.groups << group_addtions if group_addtions.files.size > 0
  end
  version_additions
end
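The filtering can be illustrated without the Moab classes. In this sketch `AddSignature` and `additions_only` are hypothetical, and groups are modelled as plain hashes keyed by path, which is simpler than the real FileGroup (where files are keyed by signature and may have several instances).

```ruby
# Hypothetical stand-in for Moab::FileSignature.
AddSignature = Struct.new(:size, :sha1)

# index:  Hash whose keys are the signatures already in the catalog
# groups: { group_id => { path => signature } } for the incoming version
# Returns the same shape, keeping only files whose signature is not yet
# catalogued and omitting groups that end up empty. The input is not modified.
def additions_only(index, groups)
  groups.each_with_object({}) do |(group_id, files), result|
    fresh = files.reject { |_path, signature| index.key?(signature) }
    result[group_id] = fresh unless fresh.empty?
  end
end

known = AddSignature.new(1, 'k')
index = { known => 'existing entry' }
incoming = {
  'content'  => { 'old.txt' => known, 'new.txt' => AddSignature.new(2, 'n') },
  'metadata' => { 'unchanged.xml' => known }
}

puts additions_only(index, incoming).inspect
```

Only `content/new.txt` survives the filter; the `metadata` group is dropped because it contributes nothing new.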