Module: OcflTools::Utils

Defined in:
lib/ocfl_tools/utils.rb,
lib/ocfl_tools/utils_file.rb,
lib/ocfl_tools/utils_inventory.rb

Defined Under Namespace

Modules: Files, Inventory

Class Method Summary

Class Method Details

.compare_hash_checksums(disk_checksums:, inventory_checksums:, results: OcflTools::OcflResults.new, context: 'verify_checksums') ⇒ Object

Compares the digests computed from disk with the digests recorded in the inventory and records the outcome in a results instance: a single OK entry when the two hashes are identical, otherwise one error per file that is missing from either side or whose digests differ.

Parameters:

  • disk_checksums (Hash)

    hash of [ filepath => digest ] computed from the files on disk.

  • inventory_checksums (Hash)

    hash of [ filepath => digest ] recorded in the inventory.

  • results (OcflTools::OcflResults) (defaults to: OcflTools::OcflResults.new)

    optional results instance to put results into.

  • context (String) (defaults to: 'verify_checksums')

    context string recorded with each result code.



# File 'lib/ocfl_tools/utils.rb', line 82

def self.compare_hash_checksums(disk_checksums:, inventory_checksums:, results: OcflTools::OcflResults.new, context: 'verify_checksums')
  unless results.is_a?(OcflTools::OcflResults)
    raise 'You need to give me a results instance!'
  end

  # First check: if everything is in order, the two Hashes are identical.
  if inventory_checksums == disk_checksums
    results.ok('O200', context, 'All digests successfully verified.')
    return results
  end

  # If they are NOT the same, iterate through the Hashes to work out what differs:
  # a file in the manifest that is not found on disk,
  # a file on disk that is not in the manifest,
  # or a file that is on disk and in the manifest, but whose checksums don't match.

  disk_files       = disk_checksums.keys
  inventory_files  = inventory_checksums.keys

  missing_from_inventory = disk_files - inventory_files
  missing_from_disk      = inventory_files - disk_files

  unless missing_from_inventory.empty?
    missing_from_inventory.each do |missing|
      results.error('E111', context, "#{missing} found on disk but missing from inventory.json.")
    end
  end

  unless missing_from_disk.empty?
    missing_from_disk.each do |missing|
      results.error('E111', context, "#{missing} in inventory but not found on disk.")
    end
  end

  # Checksum mismatches: only files present in both hashes can be compared.
  inventory_checksums.each do |file, digest|
    next unless disk_checksums.key?(file)

    if disk_checksums[file] != digest
      results.error('E111', context, "#{file} digest in inventory does not match digest computed from disk")
    end
  end
  results
end
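
The comparison above boils down to two array differences plus a per-file digest check. The following is a minimal sketch of that idea only: `classify_checksum_differences` is a hypothetical helper that returns plain Arrays rather than writing to an OcflTools::OcflResults instance, and the file paths and digests are made up for illustration.

```ruby
# Hypothetical stand-in for the comparison logic; not part of OcflTools.
def classify_checksum_differences(disk_checksums, inventory_checksums)
  disk_files      = disk_checksums.keys
  inventory_files = inventory_checksums.keys

  # A mismatch needs the file in both hashes, with different digests.
  shared     = disk_files & inventory_files
  mismatched = shared.reject { |file| disk_checksums[file] == inventory_checksums[file] }

  {
    missing_from_inventory: disk_files - inventory_files,
    missing_from_disk: inventory_files - disk_files,
    mismatched: mismatched
  }
end

# Made-up sample data: one extra file on disk, one file missing from disk,
# and one file whose digests disagree.
disk      = { 'v1/content/a.txt' => 'aaa', 'v1/content/b.txt' => 'bbb', 'v1/content/extra.txt' => 'eee' }
inventory = { 'v1/content/a.txt' => 'aaa', 'v1/content/b.txt' => 'XXX', 'v1/content/c.txt' => 'ccc' }

differences = classify_checksum_differences(disk, inventory)
puts differences.inspect
```

Each of the three lists corresponds to one of the error branches in the method; when all three are empty the two hashes are identical, which is the case the method short-circuits with a single OK result.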

.deep_copy(o) ⇒ Object

We sometimes need to make deep (not shallow) copies of objects, mostly hashes. When we copy state from a prior version, later changes to the copy must not reach back into the prior version's hash, which a shallow copy cannot guarantee because nested objects stay shared. So a deep (serialized) copy is called for.

Parameters:

  • o (Object)

    object to make a deep copy of.

Returns:

  • (Object)

    a new object with no links to the previous one.



# File 'lib/ocfl_tools/utils.rb', line 26

def self.deep_copy(o)
  # Serialize and deserialize the object so the copy shares no references with the original.
  Marshal.load(Marshal.dump(o))
end
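
To illustrate why the Marshal round trip matters, here is a small sketch contrasting it with a shallow `dup`. The standalone `deep_copy` method mirrors the one above; the `prior_state` hash is a made-up example, not real OCFL state.

```ruby
# Same technique as above: round-trip the object through Marshal.
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

# Made-up version state: a digest mapped to a list of file paths.
prior_state = { 'abc123' => ['file1.txt'] }

shallow = prior_state.dup
deep    = deep_copy(prior_state)

# Appending through the shallow copy also changes the original,
# because both hashes point at the same inner Array.
shallow['abc123'] << 'file2.txt'

# Appending through the deep copy leaves the original untouched.
deep['abc123'] << 'file3.txt'

puts prior_state.inspect
puts deep.inspect
```

Note that Marshal can only serialize plain data; objects such as Procs or open IO handles cannot be dumped, so this approach suits hashes, arrays, strings and numbers.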

.generate_file_digest(file, digest) ⇒ String

Given a fully resolvable file path, calculates and returns the checksum of that file using the requested digest algorithm.

Parameters:

  • file (String)

    fully-resolvable filesystem path to a file.

  • digest (String)

    name of the digest algorithm to use: 'md5', 'sha1', 'sha256' or 'sha512'.

Returns:

  • (String)

    checksum of requested file using specified digest algorithm.



# File 'lib/ocfl_tools/utils.rb', line 35

def self.generate_file_digest(file, digest)
  # Pick the Digest class for the requested algorithm; reject anything else.
  computed_hash = case digest
                  when 'md5'    then Digest::MD5.new
                  when 'sha1'   then Digest::SHA1.new
                  when 'sha256' then Digest::SHA256.new
                  when 'sha512' then Digest::SHA512.new
                  else raise 'Unknown digest type!'
                  end

  # Read the file in binary mode, one chunk at a time, so large files are never loaded whole.
  File.open(file, 'rb') do |s|
    while (chunk = s.read(8096))
      computed_hash.update(chunk)
    end
  end
  computed_hash.hexdigest # return as a String, not a Digest object.
end
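
As a quick sanity check of the chunked approach, the sketch below streams a temporary file through a digest and compares the result with hashing the whole file at once. `streamed_digest`, `DIGEST_CLASSES` and the 8192-byte default chunk size are assumptions made for this illustration and are not part of OcflTools.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical lookup table from algorithm name to Digest class.
DIGEST_CLASSES = {
  'md5' => Digest::MD5,
  'sha1' => Digest::SHA1,
  'sha256' => Digest::SHA256,
  'sha512' => Digest::SHA512
}.freeze

# Hypothetical helper: hash a file chunk by chunk and return the hex digest.
def streamed_digest(path, algorithm, chunk_size = 8192)
  digest = DIGEST_CLASSES.fetch(algorithm) do
    raise ArgumentError, "Unknown digest type: #{algorithm}"
  end.new

  File.open(path, 'rb') do |io|
    while (chunk = io.read(chunk_size))
      digest.update(chunk)
    end
  end
  digest.hexdigest
end

# Write a file larger than one chunk, then hash it both ways.
streamed, whole = Tempfile.create('digest-demo') do |f|
  f.write('hello world' * 10_000)
  f.flush
  [streamed_digest(f.path, 'sha256'), Digest::SHA256.hexdigest(File.binread(f.path))]
end

puts streamed == whole
```

Feeding the digest in chunks yields the same value as hashing the full contents, while keeping memory use bounded by the chunk size regardless of file size.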

.version_int_to_string(version) ⇒ String

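Converts an [Integer] version to a [String] in v0001 format. Adjust the configured version format (read from OcflTools.config.version_format in the source shown here) to suit local needs.

Parameters:

  • version (Integer)

    version number to convert.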

Returns:

  • (String)

    of version in desired format, starting with ‘v’.



# File 'lib/ocfl_tools/utils.rb', line 8

def self.version_int_to_string(version)
  OcflTools.config.version_format % version.to_i
end
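
The conversion relies on Ruby's `String#%` formatting. A minimal sketch follows, assuming a format string of 'v%04d' to produce the v0001 style mentioned above; the actual configured value may differ, and the `format_string:` keyword is a hypothetical stand-in for the configuration lookup.

```ruby
# 'v%04d' is an assumed default: a literal 'v' followed by a zero-padded,
# four-digit decimal number.
def version_int_to_string(version, format_string: 'v%04d')
  format_string % version.to_i
end

puts version_int_to_string(1)
puts version_int_to_string(12)
puts version_int_to_string(3, format_string: 'v%d')
```

Because the method calls `to_i` on its argument, a numeric String such as '7' is accepted as well as an Integer.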

.version_string_to_int(version_name) ⇒ Integer

Converts a [String] version name to an [Integer]. The OCFL spec requires version names to start with ‘v’, so the leading ‘v’ is stripped and the remainder is converted to an integer.

Parameters:

  • version_name (String)

    string to convert to an integer.

Returns:

  • (Integer)

    the version as an integer.



# File 'lib/ocfl_tools/utils.rb', line 17

def self.version_string_to_int(version_name)
  version_name.split('v')[1].to_i
end
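
A short sketch of how the split-based parsing behaves on a few inputs, using a standalone copy of the method. The sample version names are illustrative only.

```ruby
# Standalone copy of the parsing logic shown above.
def version_string_to_int(version_name)
  version_name.split('v')[1].to_i
end

puts version_string_to_int('v0001')
puts version_string_to_int('v0010')
puts version_string_to_int('v3')

# A name without the leading 'v' has no second element after the split,
# so the result falls back to nil.to_i.
puts version_string_to_int('0001')
```

`String#to_i` parses in base 10, so zero-padded names are not treated as octal. A name that lacks the leading 'v' silently yields 0 rather than raising, so callers may want to validate the name first.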