Module: PennMARC::Util

Included in:
Helper
Defined in:
lib/pennmarc/util.rb

Overview

Module holding “utility” methods shared by the MARC parsing helpers.

Instance Method Details

#datafield_and_linked_alternate(record, tag) ⇒ Array

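Returns, for each datafield with the given tag, its subfield values other than 6 and 8 joined into a string, followed by the equivalent strings from any linked 880 fields.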



# File 'lib/pennmarc/util.rb', line 154

def datafield_and_linked_alternate(record, tag)
  record.fields(tag).filter_map { |field|
    join_subfields(field, &subfield_not_in?(%w[6 8]))
  } + linked_alternate_not_6_or_8(record, tag)
end

#join_and_squish(array) ⇒ String

Joins the array elements with spaces and normalizes extraneous whitespace.



# File 'lib/pennmarc/util.rb', line 179

def join_and_squish(array)
  array.join(' ').squish
end
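
A minimal sketch of this behavior in plain Ruby. `squish` comes from ActiveSupport; the hypothetical `squish_plain` helper below approximates it with `strip` and `gsub` (ASCII whitespace only) so the example runs without that dependency.

```ruby
# Hypothetical stand-in for ActiveSupport's String#squish (ASCII whitespace only).
def squish_plain(string)
  string.strip.gsub(/\s+/, ' ')
end

# Same shape as join_and_squish, using the stand-in above.
def join_and_squish_sketch(array)
  squish_plain(array.join(' '))
end

puts join_and_squish_sketch(['  Title  ', 'of   the', "work\n"])
# prints "Title of the work"
```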

#join_subfields(field, &selector) ⇒ String

Joins the values of the subfields selected by the provided proc, skipping blank values. Returns an empty string if the field is nil.



# File 'lib/pennmarc/util.rb', line 12

def join_subfields(field, &selector)
  return '' unless field

  field.select(&selector).filter_map { |sf|
    value = sf.value&.strip
    next if value.blank?

    value
  }.join(' ').squish
end

#linked_alternate(record, subfield6_value, &selector) ⇒ Array

MARC 880 field “Alternate Graphic Representation” contains text “linked” to another field (e.g., 245 [Title]) as an alternate representation, often the same value in another script. A common need is to extract the subfields selected by the passed-in block from each 880 datafield whose subfield 6 value begins with a particular value. See: www.loc.gov/marc/bibliographic/bd880.html



# File 'lib/pennmarc/util.rb', line 129

def linked_alternate(record, subfield6_value, &selector)
  record.fields('880').filter_map do |field|
    next unless subfield_value?(field, '6', /^#{Array.wrap(subfield6_value).join('|')}/)

    field.select(&selector).map(&:value).join(' ')
  end
end
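
A rough sketch of the selection logic for a single linked tag, written without the marc gem or ActiveSupport. `LinkSubfield`, `LinkField` and `linked_alternate_sketch` are hypothetical stand-ins, and `start_with?` is swapped in for the regular expression the method builds.

```ruby
# Hypothetical stand-ins for MARC subfields and datafields.
LinkSubfield = Struct.new(:code, :value)
LinkField = Struct.new(:tag, :subfields)

# Keep 880 fields whose subfield 6 begins with the linked tag, then join
# the values of the subfields accepted by the caller's block.
def linked_alternate_sketch(fields, linked_tag)
  fields.filter_map do |field|
    next unless field.tag == '880'
    next unless field.subfields.any? { |sf| sf.code == '6' && sf.value.start_with?(linked_tag) }

    field.subfields.select { |sf| yield(sf) }.map(&:value).join(' ')
  end
end

fields = [
  LinkField.new('245', [LinkSubfield.new('a', 'Title')]),
  LinkField.new('880', [LinkSubfield.new('6', '245-01'), LinkSubfield.new('a', 'Alternate title')]),
  LinkField.new('880', [LinkSubfield.new('6', '100-02'), LinkSubfield.new('a', 'Alternate name')])
]

# Select everything except the linkage subfields, as linked_alternate_not_6_or_8 does.
puts linked_alternate_sketch(fields, '245') { |sf| !%w[6 8].include?(sf.code) }.inspect
# prints ["Alternate title"]
```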

#linked_alternate_not_6_or_8(record, subfield6_value) ⇒ Array

Common case of wanting to extract all the subfields besides 6 or 8 from an 880 datafield that has a particular subfield 6 value. We exclude 6 because that value is the linkage itself, and 8 because it holds a field link and sequence number rather than descriptive content.



# File 'lib/pennmarc/util.rb', line 143

def linked_alternate_not_6_or_8(record, subfield6_value)
  excluded_subfields = %w[6 8]
  linked_alternate(record, subfield6_value) do |sf|
    excluded_subfields.exclude?(sf.code)
  end
end

#prefixed_subject_and_alternate(record, prefix) ⇒ Array

Note:

11/2018: do not display $5 in PRO or CHR subjects

Gets 650 fields and their linked 880 fields for Provenance and Chronology subjects. prefix should be ‘PRO’ or ‘CHR’; in the data it may be preceded by a ‘%’.



# File 'lib/pennmarc/util.rb', line 217

def prefixed_subject_and_alternate(record, prefix)
  record.fields(%w[650 880]).filter_map do |field|
    next unless field.indicator2 == '4'

    next if field.tag == '880' && subfield_values(field, '6').exclude?('650')

    next unless field.any? { |sf| sf.code == 'a' && sf.value =~ /^(#{prefix}|%#{prefix})/ }

    elements = field.select(&subfield_in?(%w[a])).map { |sf| sf.value.gsub(/^%?#{prefix}/, '') }
    elements << join_subfields(field, &subfield_not_in?(%w[a 6 8 e w 5]))
    join_and_squish elements
  end
end
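
The two regular expressions in this method can be tried in isolation. The helper names `prefixed?` and `strip_subject_prefix` below are hypothetical, as are the sample values; only the patterns are taken from the method.

```ruby
# True when the value starts with the prefix, optionally preceded by '%'.
def prefixed?(value, prefix)
  value.match?(/^(#{prefix}|%#{prefix})/)
end

# Removes the leading prefix (and optional '%'); any space left behind is
# cleaned up later by join_and_squish in the real method.
def strip_subject_prefix(value, prefix)
  value.gsub(/^%?#{prefix}/, '')
end

puts prefixed?('%CHR 1850', 'CHR')                       # prints true
puts prefixed?('History', 'CHR')                         # prints false
puts strip_subject_prefix('PRO Smith, John', 'PRO').inspect # prints " Smith, John"
```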

#remove_paren_value_from_subfield_i(field) ⇒ String

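If there’s a subfield i, take the value of the first one and, if that value contains something in parentheses, remove the parenthesized text. A trailing period and then a trailing colon are trimmed from the result. Returns an empty string if there is no subfield i.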



# File 'lib/pennmarc/util.rb', line 187

def remove_paren_value_from_subfield_i(field)
  val = field.filter_map { |sf|
    next unless sf.code == 'i'

    match = /\((.+?)\)/.match(sf.value)
    if match
      sf.value.sub("(#{match[1]})", '')
    else
      sf.value
    end
  }.first || ''
  trim_trailing(:colon, trim_trailing(:period, val))
end
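
A minimal sketch of the string handling, applied to a bare value instead of a field. `remove_paren_value_sketch` is a hypothetical name, the sample values are invented, and the two trailing `sub` calls inline the period and colon patterns from trim_trailing.

```ruby
def remove_paren_value_sketch(value)
  match = /\((.+?)\)/.match(value)
  # Drop the first parenthesized group, if any.
  stripped = match ? value.sub("(#{match[1]})", '') : value
  # Trim a trailing period first, then a trailing colon.
  stripped.sub(/\.\s*$/, '').sub(/\s*:\s*$/, '')
end

puts remove_paren_value_sketch('Translation of (work):') # prints "Translation of"
puts remove_paren_value_sketch('Sequel to:')             # prints "Sequel to"
```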

#subfield_defined?(field, subfield) ⇒ TrueClass, FalseClass

Check if a field has a given subfield defined



# File 'lib/pennmarc/util.rb', line 71

def subfield_defined?(field, subfield)
  field.any? { |sf| sf.code == subfield.to_s }
end

#subfield_in?(array) ⇒ Proc

Returns a lambda that checks whether the passed-in subfield’s code is a member of array



# File 'lib/pennmarc/util.rb', line 56

def subfield_in?(array)
  ->(subfield) { array.member?(subfield.code) }
end

#subfield_not_in?(array) ⇒ Proc

Returns a lambda that checks whether the passed-in subfield’s code is NOT a member of array



# File 'lib/pennmarc/util.rb', line 63

def subfield_not_in?(array)
  ->(subfield) { !array.member?(subfield.code) }
end
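
A short sketch of how these lambdas are typically passed as blocks. `SketchSubfield` is a hypothetical stand-in for a MARC subfield, and a plain Array stands in for a datafield, which is likewise enumerable over its subfields; the two methods are reproduced so the example is self-contained.

```ruby
# Hypothetical stand-in for a MARC subfield.
SketchSubfield = Struct.new(:code, :value)

def subfield_in?(array)
  ->(subfield) { array.member?(subfield.code) }
end

def subfield_not_in?(array)
  ->(subfield) { !array.member?(subfield.code) }
end

field = [
  SketchSubfield.new('6', '880-01'),
  SketchSubfield.new('a', 'Main title'),
  SketchSubfield.new('b', 'subtitle')
]

# The & operator turns the returned lambda into the block that select expects.
puts field.select(&subfield_not_in?(%w[6 8])).map(&:value).inspect
# prints ["Main title", "subtitle"]
puts field.select(&subfield_in?(%w[a])).map(&:value).inspect
# prints ["Main title"]
```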

#subfield_undefined?(field, subfield) ⇒ TrueClass, FalseClass

Check if a field does not have a given subfield defined



# File 'lib/pennmarc/util.rb', line 79

def subfield_undefined?(field, subfield)
  field.none? { |sf| sf.code == subfield.to_s }
end

#subfield_value?(field, subfield, regex) ⇒ TrueClass, FalseClass

TODO:

example usage

Returns true if the field has the given subfield with a value matching the passed-in regex. Returns nil if the field is nil.



# File 'lib/pennmarc/util.rb', line 30

def subfield_value?(field, subfield, regex)
  field&.any? { |sf| sf.code == subfield.to_s && sf.value =~ regex }
end
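
One possible usage example, sketched with a hypothetical `ValueSubfield` Struct and a plain Array in place of a MARC datafield. The method body is copied from above so the example runs on its own.

```ruby
# Hypothetical stand-in for a MARC subfield.
ValueSubfield = Struct.new(:code, :value)

def subfield_value?(field, subfield, regex)
  field&.any? { |sf| sf.code == subfield.to_s && sf.value =~ regex }
end

field = [ValueSubfield.new('6', '245-01'), ValueSubfield.new('a', 'Title')]

puts subfield_value?(field, '6', /^245/).inspect # prints true
puts subfield_value?(field, :a, /^245/).inspect  # prints false (symbol codes work via to_s)
puts subfield_value?(nil, '6', /^245/).inspect   # prints nil (safe navigation on a nil field)
```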

#subfield_value_in?(field, subfield, array) ⇒ TrueClass, FalseClass

Returns true if the given field has the given subfield with a value that appears in the given array.

TODO: example usage



# File 'lib/pennmarc/util.rb', line 40

def subfield_value_in?(field, subfield, array)
  field.any? { |sf| sf.code == subfield.to_s && sf.value.in?(array) }
end
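
A possible usage example in plain Ruby. `in?` is an ActiveSupport extension, so the hypothetical `subfield_value_in_sketch` below uses `Array#include?` instead; `SourceSubfield` and the `allowed` list are also invented for illustration.

```ruby
# Hypothetical stand-in for a MARC subfield.
SourceSubfield = Struct.new(:code, :value)

# Plain-Ruby sketch: Array#include? replaces ActiveSupport's Object#in?.
def subfield_value_in_sketch(field, subfield, array)
  field.any? { |sf| sf.code == subfield.to_s && array.include?(sf.value) }
end

# Hypothetical list of allowed values, for illustration only.
allowed = %w[lcsh mesh]
field = [SourceSubfield.new('a', 'Cats'), SourceSubfield.new('2', 'mesh')]

puts subfield_value_in_sketch(field, '2', allowed) # prints true
puts subfield_value_in_sketch(field, 'a', allowed) # prints false
```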

#subfield_value_not_in?(field, subfield, array) ⇒ TrueClass, FalseClass

Returns true if the given field has no occurrence of the given subfield with a value that appears in the given array



# File 'lib/pennmarc/util.rb', line 49

def subfield_value_not_in?(field, subfield, array)
  field.none? { |sf| sf.code == subfield.to_s && sf.value.in?(array) }
end

#subfield_values(field, subfield) ⇒ Array

Gets all subfield values for a subfield in a given field



# File 'lib/pennmarc/util.rb', line 87

def subfield_values(field, subfield)
  field.filter_map do |sf|
    next unless sf.code == subfield.to_s

    next if sf.value.blank?

    sf.value
  end
end

#subfield_values_for(tag:, subfield:, record:) ⇒ Array

Get all subfield values for a provided subfield from any occurrence of a provided tag/tags



# File 'lib/pennmarc/util.rb', line 102

def subfield_values_for(tag:, subfield:, record:)
  record.fields(tag).flat_map do |field|
    subfield_values field, subfield
  end
end

#substring_after(string, target) ⇒ String (frozen)

Get the substring of a string after the first occurrence of a target character. Returns an empty string if the target does not occur.



# File 'lib/pennmarc/util.rb', line 172

def substring_after(string, target)
  string.scan(target).present? ? string.split(target, 2).second : ''
end

#substring_before(string, target) ⇒ String (frozen)

Get the substring of a string up to the first occurrence of a given target character. Returns an empty string if the target does not occur.



# File 'lib/pennmarc/util.rb', line 164

def substring_before(string, target)
  string.scan(target).present? ? string.split(target, 2).first : ''
end
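
A sketch of the two substring helpers without ActiveSupport, assuming the target is a String (so `include?` can stand in for the `scan(...).present?` check) and using `[1]` in place of `second`. The `_sketch` method names are hypothetical.

```ruby
def substring_before_sketch(string, target)
  string.include?(target) ? string.split(target, 2).first : ''
end

def substring_after_sketch(string, target)
  string.include?(target) ? string.split(target, 2)[1] : ''
end

puts substring_before_sketch('880-01', '-')        # prints "880"
puts substring_after_sketch('880-01', '-')         # prints "01"
puts substring_after_sketch('a-b-c', '-')          # prints "b-c" (only the first occurrence splits)
puts substring_before_sketch('880', '-').inspect   # prints "" (target absent)
```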

#translate_relator(relator_code, mapping) ⇒ String, NilClass

TODO:

handle case of receiving a URI? E.g., loc.gov/relator/aut

Translate a relator code using mapping



# File 'lib/pennmarc/util.rb', line 206

def translate_relator(relator_code, mapping)
  return if relator_code.blank?

  mapping[relator_code.to_sym]
end
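
A small sketch of the lookup, assuming the mapping is a Hash keyed by symbols (which the `to_sym` call suggests). `RELATOR_SKETCH` is a hypothetical excerpt, and the nil/empty check is a plain-Ruby stand-in for `blank?`.

```ruby
# Hypothetical mapping excerpt; the real mapping is supplied by the caller.
RELATOR_SKETCH = { aut: 'Author', edt: 'Editor' }.freeze

def translate_relator_sketch(relator_code, mapping)
  # nil, empty, or whitespace-only codes yield nil.
  return if relator_code.nil? || relator_code.strip.empty?

  mapping[relator_code.to_sym]
end

puts translate_relator_sketch('aut', RELATOR_SKETCH).inspect # prints "Author"
puts translate_relator_sketch('xyz', RELATOR_SKETCH).inspect # prints nil (unknown code)
puts translate_relator_sketch('', RELATOR_SKETCH).inspect    # prints nil (blank code)
```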

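#trim_trailing(trailer, string) ⇒ Object

Removes the trailing punctuation named by trailer from string. Supported trailers are semicolon, colon, equal, slash, comma and period.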



# File 'lib/pennmarc/util.rb', line 110

def trim_trailing(trailer, string)
  map = { semicolon: /\s*;\s*$/,
          colon: /\s*:\s*$/,
          equal: /=$/,
          slash: %r{\s*/\s*$},
          comma: /\s*,\s*$/,
          period: /\.\s*$/ } # TODO: revise to exclude "etc."
  string.sub map[trailer.to_sym], ''
end
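
A brief usage sketch with the same patterns, runnable on its own. `TRAILER_PATTERNS` and `trim_trailing_sketch` are hypothetical names and the sample strings are invented. The last call shows the case the TODO in the source refers to.

```ruby
TRAILER_PATTERNS = {
  semicolon: /\s*;\s*$/,
  colon: /\s*:\s*$/,
  equal: /=$/,
  slash: %r{\s*/\s*$},
  comma: /\s*,\s*$/,
  period: /\.\s*$/
}.freeze

def trim_trailing_sketch(trailer, string)
  string.sub(TRAILER_PATTERNS[trailer.to_sym], '')
end

puts trim_trailing_sketch(:slash, 'Moby Dick /')         # prints "Moby Dick"
puts trim_trailing_sketch('comma', 'Melville, Herman,')  # prints "Melville, Herman"
puts trim_trailing_sketch(:period, 'vol. 1, etc.')       # prints "vol. 1, etc"
```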

#valid_subject_genre_source_code?(field) ⇒ Boolean

Does the given field specify an allowed source code?



# File 'lib/pennmarc/util.rb', line 235

def valid_subject_genre_source_code?(field)
  subfield_value_in?(field, '2', PennMARC::HeadingControl::ALLOWED_SOURCE_CODES)
end