Class: ROF::Utility

Inherits:
Object
Defined in:
lib/rof/utility.rb

Overview

A few common utility methods

Constant Summary

WORK_TYPE_WITH_PREFIX_PATTERN =
/^[Ww]ork(-(.+))?/
WORK_TYPES =

Strictly speaking, a Collection is not a Work; it is included here so that it can be culled out and passed down the batch processing pipeline.

{
  # csv name => af-model
  'article' => 'Article',
  'dataset' => 'Dataset',
  'document' => 'Document',
  'collection' => 'Collection',
  'etd' => 'Etd',
  'image' => 'Image',
  'gtar' => 'Gtar',
  'osfarchive' => 'OsfArchive'
}.freeze

Constructor Details

#initialize ⇒ Utility

Returns a new instance of Utility.



# File 'lib/rof/utility.rb', line 9

def initialize
  @seq = 0
  @workdir = '.'
end

Instance Attribute Details

#workdir ⇒ Object (readonly)

Returns the working directory: the base directory of the file last passed to #set_workdir, or '.' if none has been set.



# File 'lib/rof/utility.rb', line 38

def workdir
  @workdir
end

Class Method Details

.check_solr_for_previous(config, osf_project_identifier) ⇒ Object

Queries Solr for a previous version of the OSF project. Returns its Fedora pid if one is found, nil otherwise (including when config has no 'solr_url' entry).



# File 'lib/rof/utility.rb', line 92

def self.check_solr_for_previous(config, osf_project_identifier)
  solr_url = config.fetch('solr_url', nil)
  return nil if solr_url.nil?
  solr = RSolr.connect url: "#{solr_url}"
  query = solr.get 'select', params: {
    q: "desc_metadata__osf_project_identifier_ssi:#{osf_project_identifier}",
    rows: 1,
    sort_by: 'date_archived',
    fl: ['id'],
    wt: 'json'
  }
  return nil if (query['response']['numFound']).zero?
  # should only be 1 SOLR doc (the most recent) in docs[0]
  query['response']['docs'][0]['id']
end
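
The response handling above can be tried without a Solr server. The following is a minimal sketch, not the library's code: `previous_pid` is a hypothetical helper, the response hashes are hand-built stand-ins for what `solr.get` would return, and the pid `und:abc123` and the URL are made-up values.

```ruby
# Hypothetical helper: applies the same guards as the method above
# to an already-parsed Solr response hash (no network access).
def previous_pid(config, response)
  # No Solr URL configured: nothing to look up.
  return nil if config.fetch('solr_url', nil).nil?
  # No matching document.
  return nil if response['response']['numFound'].zero?
  # The first (and only requested) document carries the pid in 'id'.
  response['response']['docs'][0]['id']
end

# Hand-built stand-ins for parsed Solr responses (assumed shape).
hit  = { 'response' => { 'numFound' => 1, 'docs' => [{ 'id' => 'und:abc123' }] } }
miss = { 'response' => { 'numFound' => 0, 'docs' => [] } }

config = { 'solr_url' => 'http://localhost:8983/solr' } # made-up URL

found_pid   = previous_pid(config, hit)
missing_pid = previous_pid(config, miss)
no_config   = previous_pid({}, hit)

p found_pid
p missing_pid
p no_config
```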

.file_from_targz(targzfile, file_name) ⇒ Object

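Extracts the named file from a gzipped tar archive. The file is written beneath the archive's own directory, in a subdirectory matching the entry's path inside the archive. Nothing is written if the archive has no entry with that name.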



# File 'lib/rof/utility.rb', line 109

def self.file_from_targz(targzfile, file_name)
  File.open(targzfile, 'rb') do |file|
    Zlib::GzipReader.wrap(file) do |gz|
      Gem::Package::TarReader.new(gz) do |tar|
        tar.seek(file_name) do |file_entry|
          file_dest_dir = File.join(File.dirname(targzfile),
                                    File.dirname(file_entry.full_name))
          FileUtils.mkdir_p(file_dest_dir)
          File.open(File.join(file_dest_dir, File.basename(file_name)), 'wb') do |file_handle|
            file_handle.write(file_entry.read)
          end
        end
        tar.close
      end
    end
  end
end
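
To see the reader chain in isolation, here is a rough round-trip sketch. It is not the method above: `read_from_targz` is a hypothetical helper that iterates with `TarReader#each` (rather than `seek`) and returns the entry's content in memory instead of writing it to disk. The archive name and entry path are made up.

```ruby
require 'rubygems/package'
require 'zlib'
require 'tmpdir'

# Hypothetical helper: returns the content of the entry named file_name
# inside the gzipped tar at targz_path, or nil if no entry matches.
def read_from_targz(targz_path, file_name)
  Zlib::GzipReader.open(targz_path) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each do |entry|
        return entry.read if entry.full_name == file_name
      end
    end
  end
  nil
end

found, missing = Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.tar.gz')

  # Build a one-entry archive so the sketch is self-contained.
  Zlib::GzipWriter.open(path) do |gz|
    Gem::Package::TarWriter.new(gz) do |tar|
      tar.add_file_simple('data/hello.txt', 0o644, 5) { |io| io.write('hello') }
    end
  end

  [read_from_targz(path, 'data/hello.txt'), read_from_targz(path, 'data/absent.txt')]
end

p found
p missing
```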

.has_embargo_date?(embargo_xml) ⇒ Boolean

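Tests whether the embargo XML carries a date. Returns true only when its 'machine' element has child elements and the nested 'date' element has text; returns false for nil or an empty string.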

Returns:

  • (Boolean)


# File 'lib/rof/utility.rb', line 68

def self.has_embargo_date?(embargo_xml)
  return false if embargo_xml == '' || embargo_xml.nil?
  return false unless embargo_xml.elements['machine'].has_elements? && embargo_xml.elements['machine'].elements['date'].has_text?
  true
end
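
The cases this predicate distinguishes can be sketched with REXML. This is an illustration under assumptions: the method body is copied from the listing above into a free function, and the `<embargo>` root element name and the date value are made up; only the `machine` and `date` element names come from the code.

```ruby
require 'rexml/document'

# Free-function copy of the predicate shown above.
def has_embargo_date?(embargo_xml)
  return false if embargo_xml == '' || embargo_xml.nil?
  machine = embargo_xml.elements['machine']
  return false unless machine.has_elements? && machine.elements['date'].has_text?
  true
end

# Hypothetical documents; the <embargo> wrapper is an assumed shape.
with_date  = REXML::Document.new('<embargo><machine><date>2030-01-01</date></machine></embargo>').root
empty_date = REXML::Document.new('<embargo><machine><date/></machine></embargo>').root
no_date    = REXML::Document.new('<embargo><machine/></embargo>').root

checks = [
  has_embargo_date?(with_date),  # date element has text
  has_embargo_date?(empty_date), # date element present but empty
  has_embargo_date?(no_date),    # machine has no child elements
  has_embargo_date?(''),         # empty string
  has_embargo_date?(nil)         # nil
]
p checks
```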

.load_items_from_json_file(fname, outfile = STDERR) ⇒ Array

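Returns the items in the JSON document, coerced into an Array (if a single item was encountered). If the file is not valid JSON, the error is written to outfile and the process exits with status 1.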

Parameters:

  • fname (String)

    Path to filename

  • outfile (#puts) (defaults to: STDERR)

    Where to write exceptions

Returns:

  • (Array)

    The items in the JSON document, coerced into an Array (if a single item was encountered)



# File 'lib/rof/utility.rb', line 78

def self.load_items_from_json_file(fname, outfile = STDERR)
  items = nil
  File.open(fname, 'r:UTF-8') do |f|
    items = JSON.parse(f.read)
  end
  items = [items] unless items.is_a? Array
  items
rescue JSON::ParserError => e
  outfile.puts("Error reading #{fname}:#{e}")
  exit!(1)
end
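
The coercion step can be sketched on its own. In this illustration `load_items` is a hypothetical helper, not the method above: it omits the rescue and `exit!` so that a parse error would simply propagate, and the JSON payloads are made up.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper mirroring only the parse-and-coerce step.
def load_items(fname)
  items = JSON.parse(File.read(fname, encoding: 'UTF-8'))
  items.is_a?(Array) ? items : [items]
end

# A document holding a single object is wrapped in an Array.
single = Tempfile.create(['single', '.json']) do |f|
  f.write('{"type": "Work", "owner": "jdoe"}')
  f.flush
  load_items(f.path)
end

# A document that is already an Array is returned unchanged.
many = Tempfile.create(['many', '.json']) do |f|
  f.write('[{"type": "article"}, {"type": "dataset"}]')
  f.flush
  load_items(f.path)
end

p single
p many
```

Note that the method in the listing rescues only JSON::ParserError, so a missing file would still raise, and that `exit!` terminates the process without running `at_exit` handlers.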

.prop_ds(owner, representative = nil) ⇒ Object

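Returns the ‘properties’ XML: a fields element containing the depositor (always batch_ingest), the given owner and, when supplied, the representative.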



# File 'lib/rof/utility.rb', line 58

def self.prop_ds(owner, representative = nil)
  s = "<fields><depositor>batch_ingest</depositor>\n<owner>#{owner}</owner>\n"
  if representative
    s += "<representative>#{representative}</representative>\n"
  end
  s += "</fields>\n"
  s
end
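
A short sketch of the two output shapes follows. The function is a condensed standalone copy of the listing above, and the owner `jdoe` and representative `temp:123` are made-up values.

```ruby
# Condensed standalone copy of the method shown above.
def prop_ds(owner, representative = nil)
  s = "<fields><depositor>batch_ingest</depositor>\n<owner>#{owner}</owner>\n"
  s += "<representative>#{representative}</representative>\n" if representative
  s + "</fields>\n"
end

without_rep = prop_ds('jdoe')
with_rep    = prop_ds('jdoe', 'temp:123')

puts without_rep
puts with_rep
```

The values are interpolated as-is, so a caller passing text with XML special characters would need to escape it first.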

Instance Method Details

#decode_work_type(obj) ⇒ Object

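Given an object’s type, determine and return its af-model. A bare Work type yields 'GenericWork', a type of the form Work-Name yields Name, and any other type is looked up case-insensitively in WORK_TYPES (nil if absent).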



# File 'lib/rof/utility.rb', line 41

def decode_work_type(obj)
  if obj['type'] =~ WORK_TYPE_WITH_PREFIX_PATTERN
    return 'GenericWork' if Regexp.last_match(2).nil?
    Regexp.last_match(2)
  else
    # this will return nil if the key does not exist in WORK_TYPES
    work_type = obj['type'].downcase
    WORK_TYPES[work_type]
  end
end
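
The branches of this lookup can be exercised with a standalone sketch. The constants are copied from this page (WORK_TYPES trimmed to two entries for brevity), and the function is a free-function copy of the method, not the class itself.

```ruby
# Constants copied from the Constant Summary; WORK_TYPES is trimmed.
WORK_TYPE_WITH_PREFIX_PATTERN = /^[Ww]ork(-(.+))?/
WORK_TYPES = { 'article' => 'Article', 'osfarchive' => 'OsfArchive' }.freeze

# Free-function copy of #decode_work_type.
def decode_work_type(obj)
  if obj['type'] =~ WORK_TYPE_WITH_PREFIX_PATTERN
    # Group 2 is the text after "Work-", if any.
    Regexp.last_match(2) || 'GenericWork'
  else
    WORK_TYPES[obj['type'].downcase]
  end
end

results = [
  decode_work_type('type' => 'Work'),     # bare prefix
  decode_work_type('type' => 'work-Etd'), # prefix with explicit model
  decode_work_type('type' => 'ARTICLE'),  # csv name, any case
  decode_work_type('type' => 'unknown')   # not in WORK_TYPES
]
p results
```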

#next_label ⇒ Object

Issues the next pid label, of the form $(pid--N), then increments the internal sequence number.



# File 'lib/rof/utility.rb', line 53

def next_label
  "$(pid--#{@seq})".tap { |_| @seq += 1 }
end
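
As a quick illustration of the sequence, the sketch below uses a hypothetical `LabelCounter` class that carries only the counter from the constructor and the method body shown above.

```ruby
# Hypothetical stand-in for ROF::Utility, reduced to the label counter.
class LabelCounter
  def initialize
    @seq = 0
  end

  # Builds the label from the current value, then advances the counter.
  def next_label
    "$(pid--#{@seq})".tap { |_| @seq += 1 }
  end
end

counter = LabelCounter.new
labels = Array.new(3) { counter.next_label }
puts labels
```

Because the string is built before `tap` runs its block, the first label issued uses 0.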

#set_workdir(filename) ⇒ Object

Uses the base directory of the given file as workdir.



# File 'lib/rof/utility.rb', line 33

def set_workdir(filename)
  @workdir = File.dirname(filename)
end
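
How the working directory changes can be sketched with a hypothetical `WorkdirHolder` class that keeps only the default from the constructor, the reader, and the setter shown above. The file paths are made up.

```ruby
# Hypothetical stand-in for ROF::Utility, reduced to workdir handling.
class WorkdirHolder
  attr_reader :workdir

  def initialize
    @workdir = '.'
  end

  def set_workdir(filename)
    @workdir = File.dirname(filename)
  end
end

holder = WorkdirHolder.new
initial = holder.workdir

holder.set_workdir('/data/batch/items.json')
from_absolute = holder.workdir

# A bare file name has no directory part, so File.dirname gives '.'.
holder.set_workdir('items.json')
from_bare = holder.workdir

p [initial, from_absolute, from_bare]
```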