Class: Moab::FileGroup

Inherits:
Serializer::Serializable
Includes:
HappyMapper
Defined in:
lib/moab/file_group.rb

Overview

Note:

Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.

A container for a standard subset of a digital object's FileManifestation objects. Used to segregate depositor content from repository metadata files. This is a child element of FileInventory, which contains a full example.

Data Model

  • FileInventory = container for recording information about a collection of related files

    • FileGroup [1..*] = subset allowing segregation of content and metadata files

      • FileManifestation [1..*] = snapshot of a file’s filesystem characteristics

        • FileSignature [1] = file fixity information

        • FileInstance [1..*] = filepath and timestamp of any physical file having that signature
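
The nesting above can be pictured with a few plain Ruby structs. This is only a sketch: the Demo* structs are hypothetical stand-ins rather than the Moab classes, and the fixity values, paths, and timestamps are made up.

```ruby
# Hypothetical stand-ins for the Moab classes; only the nesting is illustrated.
DemoSignature     = Struct.new(:size, :md5)
DemoInstance      = Struct.new(:path, :datetime)
DemoManifestation = Struct.new(:signature, :instances)
DemoGroup         = Struct.new(:group_id, :files)
DemoInventory     = Struct.new(:groups)

# One signature (content fingerprint) shared by two physical copies.
signature = DemoSignature.new(12, 'abc123')
manifestation = DemoManifestation.new(
  signature,
  [DemoInstance.new('page-1.jpg', '2012-01-01T00:00:00Z'),
   DemoInstance.new('copy/page-1.jpg', '2012-01-02T00:00:00Z')]
)
inventory = DemoInventory.new([DemoGroup.new('content', [manifestation])])

# Walk inventory -> group -> manifestation -> instance to list every path.
paths = inventory.groups.flat_map do |group|
  group.files.flat_map { |file| file.instances.map(&:path) }
end
```

Here a single manifestation carries two instances, which is how identical content stored at two paths would be represented.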

Methods inherited from Serializer::Serializable

#array_to_hash, deep_diff, #diff, #key, #key_name, #summary, #to_hash, #to_json, #to_yaml, #variable_names, #variables

Constructor Details

#initialize(opts = {}) ⇒ FileGroup

Returns a new instance of FileGroup.



# File 'lib/moab/file_group.rb', line 24

def initialize(opts = {})
  @signature_hash = {}
  @data_source = ""
  @signatures_from_bag = nil # prevents later warning: instance variable @signatures_from_bag not initialized
  super(opts)
end

Instance Attribute Details

#base_directory ⇒ Object

Returns the value of attribute base_directory.



# File 'lib/moab/file_group.rb', line 159

def base_directory
  @base_directory
end

#block_count ⇒ Integer

Returns the total disk usage (in 1 kB blocks) of all data files, estimating the du -k result (dynamically calculated).

Returns:

  • (Integer)

    The total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated)



# File 'lib/moab/file_group.rb', line 57

attribute :block_count, Integer, :tag => 'blockCount', :on_save => proc { |i| i.to_s }
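
The listing does not show how the block estimate is derived. One common way to approximate a du -k style figure is to round each file's size up to whole 1 kB blocks and add the results. The sketch below assumes that approach; BLOCK_SIZE and blocks_for are hypothetical names, not part of Moab.

```ruby
BLOCK_SIZE = 1024 # bytes per block, matching the "1 kB blocks" wording (assumption)

# Ceiling division: a 1-byte file still occupies one whole block.
def blocks_for(byte_size)
  (byte_size + BLOCK_SIZE - 1) / BLOCK_SIZE
end

sizes = [0, 1, 1024, 1025]          # made-up file sizes in bytes
total_blocks = sizes.sum { |size| blocks_for(size) }
```

Real du output also depends on the filesystem's allocation unit, so a figure computed this way is only an estimate.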

#byte_count ⇒ Integer

Returns the total size (in bytes) of all data files (dynamically calculated).

Returns:

  • (Integer)

    The total size (in bytes) of all data files (dynamically calculated)



# File 'lib/moab/file_group.rb', line 49

attribute :byte_count, Integer, :tag => 'byteCount', :on_save => proc { |i| i.to_s }

#data_source ⇒ String

Returns the directory location or other source of this group's file data.

Returns:

  • (String)

    The directory location or other source of this group's file data



# File 'lib/moab/file_group.rb', line 37

attribute :data_source, String, :tag => 'dataSource'

#file_count ⇒ Integer

Returns the total number of data files (dynamically calculated).

Returns:

  • (Integer)

    The total number of data files (dynamically calculated)



# File 'lib/moab/file_group.rb', line 41

attribute :file_count, Integer, :tag => 'fileCount', :on_save => proc { |i| i.to_s }

#files ⇒ Array<FileManifestation>

Returns the set of files comprising the group.

Returns:

  • (Array<FileManifestation>)

    The set of files comprising the group



# File 'lib/moab/file_group.rb', line 70

has_many :files, FileManifestation, :tag => 'file'

#group_id ⇒ String

Returns the name of the file group.

Returns:

  • (String)

    The name of the file group



# File 'lib/moab/file_group.rb', line 33

attribute :group_id, String, :tag => 'groupId', :key => true

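#signature_hash ⇒ Hash<FileSignature, FileManifestation>

Returns the actual in-memory store for the collection of Moab::FileManifestation objects that are contained in this file group.

Returns:

  • (Hash<FileSignature, FileManifestation>)

    The actual in-memory store for the collection of Moab::FileManifestation objects that are contained in this file group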



# File 'lib/moab/file_group.rb', line 78

def signature_hash
  @signature_hash
end

Instance Method Details

#add_file(manifestation) ⇒ void

This method returns an undefined value.

Adds a single Moab::FileManifestation object to this group.

Parameters:

  • manifestation (FileManifestation)

    The file manifestation object to be added

# File 'lib/moab/file_group.rb', line 124

def add_file(manifestation)
  manifestation.instances.each do |instance|
    add_file_instance(manifestation.signature, instance)
  end
end

#add_file_instance(signature, instance) ⇒ void

This method returns an undefined value.

Adds a single Moab::FileSignature / Moab::FileInstance key/value pair to this group. Data is actually stored in the #signature_hash.

Parameters:

  • signature (FileSignature)

    The signature of the file instance to be added

  • instance (FileInstance)

    The pathname and datetime of the file instance to be added



# File 'lib/moab/file_group.rb', line 135

def add_file_instance(signature, instance)
  if signature_hash.key?(signature)
    manifestation = signature_hash[signature]
  else
    manifestation = FileManifestation.new
    manifestation.signature = signature
    signature_hash[signature] = manifestation
  end
  manifestation.instances << instance
end
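
The grouping behaviour of #add_file_instance can be tried out without Moab. The sketch below is an illustration under stated assumptions: Fixity and Entry are hypothetical stand-ins for FileSignature and FileManifestation, and plain strings stand in for FileInstance objects. It relies on Struct comparing and hashing by value, so two equal signatures resolve to the same hash key.

```ruby
# Hypothetical stand-ins; a Struct compares and hashes by value, so equal
# signatures land on the same hash key.
Fixity = Struct.new(:size, :md5)
Entry  = Struct.new(:signature, :instances)

# Find or create the entry for this signature, then append the new path.
def add_instance(store, signature, path)
  entry = (store[signature] ||= Entry.new(signature, []))
  entry.instances << path
  entry
end

store = {}
add_instance(store, Fixity.new(5, 'aaa'), 'a.txt')
add_instance(store, Fixity.new(5, 'aaa'), 'copy/a.txt') # same content, second path
add_instance(store, Fixity.new(9, 'bbb'), 'b.txt')

distinct_signatures = store.size
paths_for_first = store[Fixity.new(5, 'aaa')].instances
```

Three instances are added, but only two entries exist afterwards, because the first two share a signature.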

#add_physical_file(pathname, _validated = nil) ⇒ void

This method returns an undefined value.

Adds a single physical file’s data to the array of files in this group. If fixity data was supplied in bag manifests, then that data is used.

Parameters:

  • pathname (Pathname, String)

    The location of the file to be added

  • _validated (unused; kept here for backwards compatibility) (defaults to: nil)


# File 'lib/moab/file_group.rb', line 225

def add_physical_file(pathname, _validated = nil)
  pathname = Pathname.new(pathname).expand_path
  instance = FileInstance.new.instance_from_file(pathname, @base_directory)
  if @signatures_from_bag && @signatures_from_bag[pathname]
    signature = @signatures_from_bag[pathname]
    signature = signature.normalized_signature(pathname) unless signature.complete?
  else
    signature = FileSignature.new.signature_from_file(pathname)
  end
  add_file_instance(signature, instance)
end

#group_from_bagit_subdir(directory, signatures_from_bag, recursive = true) ⇒ FileGroup

Harvests a directory (using digest hash for fixity data) and adds all files to the file group.

Parameters:

  • directory (Pathname, String)

    The directory whose children are to be added to the file group

  • signatures_from_bag (Hash<Pathname,Signature>)

    The fixity data already calculated for the files

  • recursive (Boolean) (defaults to: true)

    if true, descend into child directories

Returns:

  • (FileGroup)

    Harvest a directory (using digest hash for fixity data) and add all files to the file group



# File 'lib/moab/file_group.rb', line 178

def group_from_bagit_subdir(directory, signatures_from_bag, recursive = true)
  @signatures_from_bag = signatures_from_bag
  group_from_directory(directory, recursive)
end

#group_from_directory(directory, recursive = true) ⇒ FileGroup

Harvests a directory and adds all files to the file group.

Parameters:

  • directory (Pathname, String)

    The location of the files to harvest

  • recursive (Boolean) (defaults to: true)

    if true, descend into child directories

Returns:

  • (FileGroup)

    Harvest a directory and add all files to the file group



# File 'lib/moab/file_group.rb', line 187

def group_from_directory(directory, recursive = true)
  self.base_directory = directory
  @data_source = @base_directory.to_s
  harvest_directory(directory, recursive)
  self
rescue Exception # Errno::ENOENT
  @data_source = directory.to_s
  self
end

#harvest_directory(path, recursive, validated = nil) ⇒ void

This method returns an undefined value.

Traverses a directory tree and adds all files to the file group. Note that unlike Find.find and Dir.glob, Pathname passes through symbolic links.

Parameters:

  • path (Pathname, String)

    pathname of the directory to be harvested

  • recursive (Boolean)

    if true, also harvest subdirectories

  • validated (Boolean) (defaults to: nil)

    if true, path has already been verified to be a descendant of #base_directory



# File 'lib/moab/file_group.rb', line 205

def harvest_directory(path, recursive, validated = nil)
  pathname = Pathname.new(path).expand_path
  validated ||= is_descendent_of_base?(pathname)
  pathname.children.sort.each do |child|
    next if child.basename.to_s == '.DS_Store'

    if child.directory?
      harvest_directory(child, recursive, validated) if recursive
    else
      add_physical_file(child, validated)
    end
  end
  nil
end
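
The traversal pattern used by #harvest_directory (sorted children, skip .DS_Store, recurse only when asked) can be exercised on a temporary directory. This is a minimal sketch: list_files is a hypothetical helper that collects paths instead of adding them to a file group, and it omits the base-directory validation.

```ruby
require 'pathname'
require 'tmpdir'

# Collect regular files under +dir+ in sorted order, skipping .DS_Store.
def list_files(dir, recursive)
  files = []
  Pathname.new(dir).expand_path.children.sort.each do |child|
    next if child.basename.to_s == '.DS_Store'

    if child.directory?
      files.concat(list_files(child, recursive)) if recursive
    else
      files << child
    end
  end
  files
end

found = nil
shallow = nil
Dir.mktmpdir do |tmp|
  root = Pathname.new(tmp).expand_path
  (root + 'sub').mkpath
  (root + 'a.txt').write('a')
  (root + '.DS_Store').write('')
  (root + 'sub' + 'b.txt').write('b')

  found   = list_files(root, true).map { |path| path.relative_path_from(root).to_s }
  shallow = list_files(root, false).map { |path| path.relative_path_from(root).to_s }
end
```

With recursive set to false the subdirectory is skipped entirely, so only the top-level file is collected.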

#is_descendent_of_base?(pathname) ⇒ Boolean

Tests whether the given path is contained within the #base_directory.

Parameters:

  • pathname (Pathname)

    The file path to be tested

Returns:

  • (Boolean)

    Test whether the given path is contained within the #base_directory

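Raises:

  • (MoabRuntimeError)

    if #base_directory has not been set, or if pathname is not a descendent of #base_directory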



# File 'lib/moab/file_group.rb', line 164

def is_descendent_of_base?(pathname)
  raise(MoabRuntimeError, "base_directory has not been set") if @base_directory.nil?

  is_descendent = false
  pathname.expand_path.ascend { |ancestor| is_descendent ||= (ancestor == @base_directory) }
  raise(MoabRuntimeError, "#{pathname} is not a descendent of #{@base_directory}") unless is_descendent

  is_descendent
end
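
The ancestor walk at the heart of this check can be tried in isolation. The sketch below uses only Pathname from the standard library; within? is a hypothetical helper that returns false instead of raising, and the paths are made up. The comparison is purely lexical, so symbolic links are not resolved.

```ruby
require 'pathname'

# True when +base+ is +path+ itself or one of its ancestors.
def within?(path, base)
  base = Pathname.new(base).expand_path
  found = false
  # ascend yields the path itself first, then each parent up to the root.
  Pathname.new(path).expand_path.ascend { |ancestor| found ||= (ancestor == base) }
  found
end

inside  = within?('/data/obj/content/a.txt', '/data/obj')
outside = within?('/data/other/a.txt', '/data/obj')
escaped = within?('/data/obj/../other/a.txt', '/data/obj') # ".." is collapsed first
same    = within?('/data/obj', '/data/obj')
```

Because expand_path collapses ".." segments before the walk, a path that climbs out of the base directory is not reported as a descendant.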

#path_hash ⇒ Hash<String,FileSignature>

Returns an index of file paths, used to test for existence of a filename in this file group.

Returns:

  • (Hash<String,FileSignature>)

    An index of file paths, used to test for existence of a filename in this file group



# File 'lib/moab/file_group.rb', line 83

def path_hash
  path_hash = {}
  signature_hash.each do |signature, manifestation|
    manifestation.instances.each do |instance|
      path_hash[instance.path] = signature
    end
  end
  path_hash
end

#path_hash_subset(signature_subset) ⇒ Hash<String,FileSignature>

Returns a hash of pathname => signature pairs covering a subset of the filenames in this file group.

Parameters:

  • signature_subset (Array<FileSignature>)

    The signatures used to select the entries to return

Returns:

  • (Hash<String,FileSignature>)

    A pathname,signature hash containing a subset of the filenames in this file group



# File 'lib/moab/file_group.rb', line 101

def path_hash_subset(signature_subset)
  path_hash = {}
  signature_subset.each do |signature|
    manifestation = signature_hash[signature]
    manifestation.instances.each do |instance|
      path_hash[instance.path] = signature
    end
  end
  path_hash
end
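
As listed, #path_hash_subset calls instances on whatever #signature_hash returns, so a signature that is not in the group would yield nil and raise NoMethodError. The sketch below shows the same path => signature indexing with a tolerant lookup. It is an illustration only: paths_for is a hypothetical helper, and plain strings stand in for FileSignature and FileInstance objects.

```ruby
# Hypothetical store: signature => paths.
store = {
  'sig-a' => ['a.txt', 'copy/a.txt'],
  'sig-b' => ['b.txt']
}

# Build a path => signature index for the requested signatures only,
# silently skipping any signature the store does not know.
def paths_for(store, signatures)
  signatures.each_with_object({}) do |signature, index|
    store.fetch(signature, []).each { |path| index[path] = signature }
  end
end

subset = paths_for(store, ['sig-a', 'sig-missing'])
```

Whether skipping unknown signatures or failing loudly is preferable depends on the caller; the listed method effectively does the latter.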

#path_list ⇒ Array<String>

Returns the list of file paths in this group.

Returns:

  • (Array<String>)

    The list of file paths in this group



# File 'lib/moab/file_group.rb', line 94

def path_list
  files.collect { |file| file.instances.collect(&:path) }.flatten
end

#remove_file_having_path(path) ⇒ void

This method returns an undefined value.

Removes the file having the given path from this group. For example, the manifest inventory does not contain a file entry for itself.

Parameters:

  • path (String)

    The path of the file to be removed



# File 'lib/moab/file_group.rb', line 149

def remove_file_having_path(path)
  signature = path_hash[path]
  signature_hash.delete(signature)
end
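
Because the listed method deletes the entry keyed by the file's signature, every path sharing that signature is removed along with it. The sketch below illustrates this under simplifying assumptions: a plain hash of signature => paths stands in for #signature_hash, and index_by_path is a hypothetical helper playing the role of #path_hash.

```ruby
# Hypothetical store: signature => paths (strings stand in for Moab objects).
store = {
  'sig-a' => ['a.txt', 'copy/a.txt'],
  'sig-b' => ['b.txt']
}

# Inverted index, path => signature.
def index_by_path(store)
  store.each_with_object({}) do |(signature, paths), index|
    paths.each { |path| index[path] = signature }
  end
end

# Look up the signature for one path and delete that whole entry.
store.delete(index_by_path(store)['a.txt'])
remaining = index_by_path(store).keys

# An unknown path maps to nil, and deleting a nil key leaves the store unchanged.
store.delete(index_by_path(store)['no-such-file.txt'])
```

After removing a.txt, copy/a.txt is gone as well, since both were recorded under the same signature.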

#summary_fields ⇒ Array<String>

Returns the data fields to include in summary reports.

Returns:

  • (Array<String>)

    The data fields to include in summary reports



# File 'lib/moab/file_group.rb', line 64

def summary_fields
  %w[group_id file_count byte_count block_count]
end