Class: Moab::SignatureCatalog

Inherits:
Serializer::Manifest
Includes:
HappyMapper
Defined in:
lib/moab/signature_catalog.rb

Overview

Note:

Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.

A digital object’s Signature Catalog is derived from a filtered aggregation of the file inventories of the object’s set of versions (see #update). It has an entry for every file (identified by FileSignature) found in any of the versions, along with a record of the SDR storage location that was used to preserve a single instance of that file. Once this catalog has been populated, it has multiple uses:

  • The signature index is used to determine which files of a newly submitted object version are new additions and which are duplicates of files previously ingested (see #version_additions). When a new version contains a mixture of added files and files carried over from the previous version, only the files whose signatures are not already in the catalog need to be stored.

  • Reconstruction of an object version (see Moab::StorageObject#reconstruct_version) requires a combination of a full version’s FileInventory and the SignatureCatalog.

  • The catalog can also be used for performing consistency checks between manifest files and storage.
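
As a rough illustration of the reconstruction use listed above, the sketch below resolves each file in a version's inventory to the stored copy recorded in the catalog. `Sig`, `CatalogEntry`, and `reconstruction_sources` are hypothetical stand-ins rather than Moab classes, and the `vNNNN/data/<group>/<path>` storage layout is an assumption made for the example.

```ruby
# Hypothetical stand-ins; the real FileSignature and SignatureCatalogEntry carry more fields.
Sig = Struct.new(:size, :md5)
CatalogEntry = Struct.new(:signature, :version_id, :group_id, :path) do
  # Assumed layout: <version dir>/data/<group>/<path>
  def storage_path
    File.join(format('v%04d', version_id), 'data', group_id, path)
  end
end

# inventory: { logical path => Sig }, index: { Sig => CatalogEntry }
# Returns { logical path => storage path of the single preserved copy }.
def reconstruction_sources(inventory, index)
  inventory.transform_values { |sig| index.fetch(sig).storage_path }
end

sig_a = Sig.new(10, 'aaa')
sig_b = Sig.new(20, 'bbb')
index = {
  sig_a => CatalogEntry.new(sig_a, 1, 'content', 'page-1.jpg'), # first stored in v1
  sig_b => CatalogEntry.new(sig_b, 2, 'content', 'page-2.jpg')  # first stored in v2
}

# Version 2 carries page-1.jpg over from version 1 and adds page-2.jpg.
version_2 = { 'content/page-1.jpg' => sig_a, 'content/page-2.jpg' => sig_b }

reconstruction_sources(version_2, index).each do |logical, stored|
  puts "#{logical} <= #{stored}"
end
```

In this sketch `Hash#fetch` raises `KeyError` for a signature that is missing from the index, which loosely mirrors the FileNotFoundException raised by #catalog_filepath.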

Methods inherited from Serializer::Manifest

read_xml_file, write_xml_file, #write_xml_file, xml_filename, xml_pathname, xml_pathname_exist?

Methods inherited from Serializer::Serializable

#array_to_hash, deep_diff, #diff, #key, #key_name, #summary, #to_hash, #to_json, #to_yaml, #variable_names, #variables

Constructor Details

#initialize(opts = {}) ⇒ SignatureCatalog

Returns a new instance of SignatureCatalog.



# File 'lib/moab/signature_catalog.rb', line 34

def initialize(opts = {})
  @entries = []
  @signature_hash = {}
  super(opts)
end

Instance Attribute Details

#block_count ⇒ Integer

Returns the total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated).

Returns:

  • (Integer)

    The total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated)



# File 'lib/moab/signature_catalog.rb', line 83

attribute :block_count, Integer, :tag => 'blockCount', :on_save => proc { |t| t.to_s }
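
The calculation behind this attribute is not shown on this page. As an assumption only, one way to approximate `du -k` is to round each file's size up to a whole 1 kB block and sum the results. `BLOCK_SIZE` and `estimated_blocks` below are hypothetical names, not part of the documented API.

```ruby
# Assumption: every file occupies a whole number of 1 kB blocks.
BLOCK_SIZE = 1024

# file_sizes: Array of Integer byte counts, one per cataloged file.
def estimated_blocks(file_sizes)
  # Integer division after adding BLOCK_SIZE - 1 rounds each size up.
  file_sizes.sum { |bytes| (bytes + BLOCK_SIZE - 1) / BLOCK_SIZE }
end

sizes = [0, 1, 1024, 1025, 5000]
puts estimated_blocks(sizes) # prints 9 (0 + 1 + 1 + 2 + 5)
```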

#byte_count ⇒ Integer

Returns the total size (in bytes) of all data files (dynamically calculated).

Returns:

  • (Integer)

    The total size (in bytes) of all data files (dynamically calculated)



# File 'lib/moab/signature_catalog.rb', line 75

attribute :byte_count, Integer, :tag => 'byteCount', :on_save => proc { |t| t.to_s }

#catalog_datetime ⇒ String

Returns the datetime at which the catalog was updated.

Returns:

  • (String)

    The datetime at which the catalog was updated



# File 'lib/moab/signature_catalog.rb', line 55

attribute :catalog_datetime, Time, :tag => 'catalogDatetime'

#digital_object_id ⇒ String

Returns the object ID (druid).

Returns:

  • (String)

    The object ID (druid)



# File 'lib/moab/signature_catalog.rb', line 42

attribute :digital_object_id, String, :tag => 'objectId'

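#entries ⇒ Array<SignatureCatalogEntry>

Returns the set of catalog entries, one for each distinct file signature found in any version.

Returns:

  • (Array<SignatureCatalogEntry>)

    The set of catalog entries, one for each distinct file signature found in any version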



# File 'lib/moab/signature_catalog.rb', line 97

has_many :entries, SignatureCatalogEntry, :tag => 'entry'

#file_count ⇒ Integer

Returns the total number of data files (dynamically calculated).

Returns:

  • (Integer)

    The total number of data files (dynamically calculated)



# File 'lib/moab/signature_catalog.rb', line 67

attribute :file_count, Integer, :tag => 'fileCount', :on_save => proc { |t| t.to_s }

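#signature_hash ⇒ Hash

Returns an index having FileSignature objects as keys and Moab::SignatureCatalogEntry objects as values.

Returns:

  • (Hash)

    An index having FileSignature objects as keys and Moab::SignatureCatalogEntry objects as values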



# File 'lib/moab/signature_catalog.rb', line 106

def signature_hash
  @signature_hash
end

#version_id ⇒ Integer

Returns the ordinal version number.

Returns:

  • (Integer)

    The ordinal version number



# File 'lib/moab/signature_catalog.rb', line 46

attribute :version_id, Integer, :tag => 'versionId', :key => true, :on_save => proc { |n| n.to_s }

Instance Method Details

#add_entry(entry) ⇒ void

This method returns an undefined value.

Adds a new entry to the catalog and to the #signature_hash index.

Parameters:

  • entry (SignatureCatalogEntry)

    The catalog entry to add



# File 'lib/moab/signature_catalog.rb', line 111

def add_entry(entry)
  @signature_hash[entry.signature] = entry
  entries << entry
end

#catalog_filepath(file_signature) ⇒ String

Returns the object-relative path of the file having the specified signature.

Parameters:

  • file_signature (FileSignature)

    The signature of the file whose path is sought

Returns:

  • (String)

    The object-relative path of the file having the specified signature



# File 'lib/moab/signature_catalog.rb', line 118

def catalog_filepath(file_signature)
  catalog_entry = @signature_hash[file_signature]
  if catalog_entry.nil?
    msg = "catalog entry not found for #{file_signature.fixity.inspect} in #{@digital_object_id} - #{@version_id}"
    raise FileNotFoundException, msg
  end
  catalog_entry.storage_path
end

#composite_key ⇒ String

Returns the unique identifier concatenating digital object id with version id.

Returns:

  • (String)

    The unique identifier concatenating digital object id with version id



# File 'lib/moab/signature_catalog.rb', line 49

def composite_key
  @digital_object_id + '-' + StorageObject.version_dirname(@version_id)
end
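
To make the shape of the key concrete, here is a minimal sketch. It assumes that `StorageObject.version_dirname` renders a version number as `v` followed by four zero-padded digits; that helper is not shown on this page, so the standalone `version_dirname` below is a hypothetical stand-in, and the druid is an invented example value.

```ruby
# Assumed stand-in for StorageObject.version_dirname: 'v' plus a zero-padded number.
def version_dirname(version_id)
  format('v%04d', version_id)
end

# Joins the object id and the version directory name with a hyphen,
# following the concatenation shown in #composite_key.
def composite_key(digital_object_id, version_id)
  "#{digital_object_id}-#{version_dirname(version_id)}"
end

puts composite_key('druid:ab123cd4567', 3) # prints druid:ab123cd4567-v0003
```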

#normalize_group_signatures(group, group_pathname = nil) ⇒ void

This method returns an undefined value.

Inspects and upgrades the group’s signature data to include all desired checksums.

Parameters:

  • group (FileGroup)

    A group of the files from a file inventory

  • group_pathname (Pathname) (defaults to: nil)

    The location of the directory containing the group’s files



# File 'lib/moab/signature_catalog.rb', line 130

def normalize_group_signatures(group, group_pathname = nil)
  unless group_pathname.nil?
    group_pathname = Pathname(group_pathname)
    raise(MoabRuntimeError, "Could not locate #{group_pathname}") unless group_pathname.exist?
  end
  group.files.each do |file|
    unless file.signature.complete?
      if @signature_hash.key?(file.signature)
        file.signature = @signature_hash.find { |k, _v| k == file.signature }[0]
      elsif group_pathname
        file_pathname = group_pathname.join(file.instances[0].path)
        file.signature = file.signature.normalized_signature(file_pathname)
      end
    end
  end
end
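
The three outcomes in the method above (already complete, completed from the catalog, completed from disk) can be sketched without the Moab classes. `PartialSig` and `normalize` are hypothetical stand-ins. The equality rule used here, matching size plus agreement on every checksum both sides carry, is an assumption about how a partial signature can match a complete one.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical stand-in for FileSignature.
class PartialSig
  attr_reader :size, :md5, :sha1

  def initialize(size:, md5: nil, sha1: nil)
    @size = size
    @md5 = md5
    @sha1 = sha1
  end

  def complete?
    !md5.nil? && !sha1.nil?
  end

  # Equal when sizes match and no checksum present on both sides disagrees.
  def eql?(other)
    size == other.size &&
      [md5, other.md5].compact.uniq.size <= 1 &&
      [sha1, other.sha1].compact.uniq.size <= 1
  end
  alias_method :==, :eql?

  # Hash on size only, the one field every signature carries.
  def hash
    size.hash
  end

  def self.from_file(path)
    data = File.binread(path)
    new(size: data.bytesize,
        md5: Digest::MD5.hexdigest(data),
        sha1: Digest::SHA1.hexdigest(data))
  end
end

# files: { relative path => PartialSig }, index: { PartialSig => anything }
def normalize(files, index, dir = nil)
  files.map do |path, sig|
    full =
      if sig.complete? then sig
      elsif index.key?(sig) then index.keys.find { |known| known == sig }
      elsif dir then PartialSig.from_file(File.join(dir, path))
      else sig # nothing to upgrade from; left as-is
      end
    [path, full]
  end.to_h
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'new.txt'), 'hello')
  cataloged = PartialSig.new(size: 3, md5: 'm1', sha1: 's1')
  files = {
    'old.txt' => PartialSig.new(size: 3, md5: 'm1'), # partial, matches the catalog
    'new.txt' => PartialSig.new(size: 5)             # partial, must be read from disk
  }
  normalize(files, { cataloged => :entry }, dir).each do |path, sig|
    puts "#{path}: complete=#{sig.complete?}"
  end
end
```

Reusing the catalog's key avoids re-reading a file whose full checksums are already known, so only files that are new to the object are read from disk.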

#summary_fields ⇒ Array<String>

Returns the data fields to include in summary reports.

Returns:

  • (Array<String>)

    The data fields to include in summary reports



# File 'lib/moab/signature_catalog.rb', line 91

def summary_fields
  %w[digital_object_id version_id catalog_datetime file_count byte_count block_count]
end

#update(version_inventory, data_pathname) ⇒ void

This method returns an undefined value.

Compares the FileSignature entries in the new version’s FileInventory against the signatures in this catalog and creates new Moab::SignatureCatalogEntry additions to the catalog.

Parameters:

  • version_inventory (FileInventory)

    The complete inventory of the files comprising a digital object version

  • data_pathname (Pathname)

    The location of the object’s data directory



# File 'lib/moab/signature_catalog.rb', line 153

def update(version_inventory, data_pathname)
  version_inventory.groups.each do |group|
    group.files.each do |file|
      unless @signature_hash.key?(file.signature)
        entry = SignatureCatalogEntry.new
        entry.version_id = version_inventory.version_id
        entry.group_id = group.group_id
        entry.path = file.instances[0].path
        if file.signature.complete?
          entry.signature = file.signature
        else
          file_pathname = data_pathname.join(group.group_id, entry.path)
          entry.signature = file.signature.normalized_signature(file_pathname)
        end
        add_entry(entry)
      end
    end
  end
  @version_id = version_inventory.version_id
  @catalog_datetime = Time.now
end
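
A simplified model may help show the effect of repeated updates: each distinct signature is recorded once, tagged with the version that first introduced it. `FileSig`, `MiniEntry`, and `MiniCatalog` are hypothetical stand-ins, inventories are reduced to nested hashes, and signature completion from disk is left out.

```ruby
FileSig = Struct.new(:size, :md5)
MiniEntry = Struct.new(:version_id, :group_id, :path, :signature)

class MiniCatalog
  attr_reader :entries, :version_id

  def initialize
    @entries = []
    @index = {}
  end

  # inventory: { group_id => { path => FileSig } }
  def update(version_id, inventory)
    inventory.each do |group_id, files|
      files.each do |path, signature|
        next if @index.key?(signature) # already preserved by an earlier version

        entry = MiniEntry.new(version_id, group_id, path, signature)
        @index[signature] = entry
        @entries << entry
      end
    end
    @version_id = version_id
  end
end

catalog = MiniCatalog.new
catalog.update(1, { 'content' => {
  'a.txt' => FileSig.new(1, 'aaa'),
  'b.txt' => FileSig.new(2, 'bbb')
} })
catalog.update(2, { 'content' => {
  'a.txt' => FileSig.new(1, 'aaa'), # unchanged
  'b.txt' => FileSig.new(3, 'ccc'), # modified
  'c.txt' => FileSig.new(2, 'bbb')  # same bytes as the version 1 b.txt
} })

catalog.entries.each { |e| puts "v#{e.version_id} #{e.group_id}/#{e.path}" }
```

Here `c.txt` adds no entry because its signature was already cataloged under the version 1 `b.txt`, so the catalog ends up with three entries for four distinct paths.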

#version_additions(version_inventory) ⇒ FileInventory

Returns a filtered copy of the input inventory containing only those files that were added in this version.

Parameters:

  • version_inventory (FileInventory)

    The complete inventory of the files comprising a digital object version

Returns:

  • (FileInventory)

    A filtered copy of the input inventory containing only those files that were added in this version



# File 'lib/moab/signature_catalog.rb', line 180

def version_additions(version_inventory)
  version_additions = FileInventory.new(:type => 'additions')
  version_additions.copy_ids(version_inventory)
  version_inventory.groups.each do |group|
    group_addtions = FileGroup.new(:group_id => group.group_id)
    group.files.each do |file|
      group_addtions.add_file_instance(file.signature, file.instances[0]) unless @signature_hash.key?(file.signature)
    end
    version_additions.groups << group_addtions unless group_addtions.files.empty?
  end
  version_additions
end
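
The filtering can be pictured with plain data structures. In this sketch, an assumption-laden simplification, signatures are reduced to checksum strings, an inventory is a nested hash, and the standalone `version_additions` function takes the set of known signatures as an explicit argument instead of reading the catalog's index.

```ruby
require 'set'

# inventory: { group_id => { path => signature } }
# known_signatures: Set of signatures already in the catalog.
# Returns the same shape, keeping only new files and dropping empty groups.
def version_additions(inventory, known_signatures)
  inventory.each_with_object({}) do |(group_id, files), additions|
    new_files = files.reject { |_path, sig| known_signatures.include?(sig) }
    additions[group_id] = new_files unless new_files.empty?
  end
end

known = Set['aaa', 'bbb']
inventory = {
  'content'  => { 'a.txt' => 'aaa', 'c.txt' => 'ccc' },
  'metadata' => { 'm.xml' => 'bbb' }
}

p version_additions(inventory, known)
```

Only `c.txt` survives the filter, and the `metadata` group is omitted entirely because none of its files are new.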