Module: PennMARC::Util
Included in: Helper
Defined in: lib/pennmarc/util.rb
Overview
Module holding “utility” methods used by the MARC parsing methods.
Instance Method Summary
-
#datafield_and_linked_alternate(record, tag) ⇒ Array
Returns the non-6,8 subfields from a datafield and its 880 link.
-
#join_and_squish(array) ⇒ String
Joins an array and normalizes extraneous spaces.
-
#join_subfields(field, &selector) ⇒ String
Join subfields from a field selected based on a provided proc.
-
#linked_alternate(record, subfield6_value, &selector) ⇒ Array
MARC 880 field “Alternate Graphic Representation” contains text “linked” to another field (e.g., 245 [Title]) and is used as an alternate representation of it.
-
#linked_alternate_not_6_or_8(record, subfield6_value) ⇒ Array
Common case: extract all subfields except 6 and 8 from each 880 datafield that has a particular subfield 6 value.
-
#prefixed_subject_and_alternate(record, prefix) ⇒ Array
Get 650 & 880 for Provenance and Chronology: prefix should be ‘PRO’ or ‘CHR’ and may be preceded by a ‘%’.
-
#remove_paren_value_from_subfield_i(field) ⇒ String
If there’s a subfield i, take its value, remove the first parenthesized portion (if any), then trim a trailing period and colon.
-
#subfield_defined?(field, subfield) ⇒ TrueClass, FalseClass
Check if a field has a given subfield defined.
-
#subfield_in?(array) ⇒ Proc
returns a lambda checking if passed-in subfield’s code is a member of array.
-
#subfield_not_in?(array) ⇒ Proc
returns a lambda checking if passed-in subfield’s code is NOT a member of array.
-
#subfield_undefined?(field, subfield) ⇒ TrueClass, FalseClass
Check if a field does not have a given subfield defined.
-
#subfield_value?(field, subfield, regex) ⇒ TrueClass, FalseClass
Returns true if the field has the given subfield with a value that matches the passed-in regex.
-
#subfield_value_in?(field, subfield, array) ⇒ TrueClass, FalseClass
Returns true if the given field has the given subfield with a value that is in the given array.
-
#subfield_value_not_in?(field, subfield, array) ⇒ TrueClass, FalseClass
Returns true if the given field has no occurrence of the given subfield with a value in the given array.
-
#subfield_values(field, subfield) ⇒ Array
Gets all subfield values for a subfield in a given field.
-
#subfield_values_for(tag:, subfield:, record:) ⇒ Array
Get all subfield values for a provided subfield from any occurrence of a provided tag/tags.
-
#substring_after(string, target) ⇒ String (frozen)
Get the substring of a string after the first occurrence of a target character.
-
#substring_before(string, target) ⇒ String (frozen)
Get the substring of a string up to a given target character.
-
#translate_relator(relator_code, mapping) ⇒ String, NilClass
Translate a relator code using mapping.
- #trim_trailing(trailer, string) ⇒ Object
-
#valid_subject_genre_source_code?(field) ⇒ Boolean
Does the given field specify an allowed source code?
Instance Method Details
#datafield_and_linked_alternate(record, tag) ⇒ Array
Returns the non-6,8 subfields from a datafield and its 880 link.
# File 'lib/pennmarc/util.rb', line 154
def datafield_and_linked_alternate(record, tag)
  record.fields(tag).filter_map { |field|
    join_subfields(field, &subfield_not_in?(%w[6 8]))
  } + linked_alternate_not_6_or_8(record, tag)
end
#join_and_squish(array) ⇒ String
Joins an array and normalizes extraneous spaces.
# File 'lib/pennmarc/util.rb', line 179
def join_and_squish(array)
  array.join(' ').squish
end
#join_subfields(field, &selector) ⇒ String
Join subfields from a field selected based on a provided proc
# File 'lib/pennmarc/util.rb', line 12
def join_subfields(field, &selector)
  return '' unless field

  field.select(&selector).filter_map { |sf|
    value = sf.value&.strip
    next if value.blank?

    value
  }.join(' ').squish
end
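To illustrate how the selector block drives join_subfields, here is a minimal plain-Ruby sketch. It rests on assumptions: `Subfield` is a hypothetical Struct standing in for the MARC library's subfield objects, a plain Array stands in for the field, and ActiveSupport's `blank?` and `squish` are approximated with `empty?` and `gsub`.

```ruby
# Hypothetical stand-in for a MARC subfield (code + value); not the real class.
Subfield = Struct.new(:code, :value)

# Same shape as Util#subfield_not_in?: a lambda that can be passed as a block with `&`.
def subfield_not_in?(array)
  ->(subfield) { !array.member?(subfield.code) }
end

# Plain-Ruby approximation of Util#join_subfields (no ActiveSupport needed).
def join_subfields(field, &selector)
  return '' unless field

  values = field.select(&selector).filter_map do |sf|
    value = sf.value&.strip
    value unless value.nil? || value.empty? # approximates `blank?`
  end
  values.join(' ').gsub(/\s+/, ' ').strip   # approximates `squish`
end

field = [
  Subfield.new('6', '880-01'),
  Subfield.new('a', ' The  Odyssey '),
  Subfield.new('c', 'Homer'),
  Subfield.new('8', nil)
]

puts join_subfields(field, &subfield_not_in?(%w[6 8]))
# prints "The Odyssey Homer"
```

Passing the lambda with `&` keeps the selection rule separate from the joining logic, so the same method can serve any "these subfields, but not those" case.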
#linked_alternate(record, subfield6_value, &selector) ⇒ Array
MARC 880 field “Alternate Graphic Representation” contains text “linked” to another field (e.g., 245 [Title]) and is used as an alternate representation of it. Often used to hold translations of title values. A common need is to extract the subfields selected by the passed-in block from each 880 datafield that has a particular subfield 6 value. See: www.loc.gov/marc/bibliographic/bd880.html
# File 'lib/pennmarc/util.rb', line 129
def linked_alternate(record, subfield6_value, &selector)
  record.fields('880').filter_map do |field|
    next unless subfield_value?(field, '6', /^#{Array.wrap(subfield6_value).join('|')}/)

    field.select(&selector).map(&:value).join(' ')
  end
end
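The linkage matching can be sketched without the MARC library. Everything below is an assumption made for illustration: `Subfield` and `Field` are hypothetical Structs, a plain Array of fields stands in for the record, and `Array()` approximates ActiveSupport's `Array.wrap` for string and array arguments.

```ruby
# Hypothetical stand-ins; the real method receives a MARC record object.
Subfield = Struct.new(:code, :value)
Field = Struct.new(:tag, :subfields)

# Builds the same kind of pattern as linked_alternate: /^tag1|tag2/.
def linkage_pattern(subfield6_value)
  /^#{Array(subfield6_value).join('|')}/
end

# Sketch of linked_alternate over a plain Array of Field structs.
def linked_alternate(fields, subfield6_value, &selector)
  pattern = linkage_pattern(subfield6_value)
  fields.select { |f| f.tag == '880' }.filter_map do |field|
    next unless field.subfields.any? { |sf| sf.code == '6' && sf.value =~ pattern }

    field.subfields.select(&selector).map(&:value).join(' ')
  end
end

fields = [
  Field.new('245', [Subfield.new('6', '880-01'), Subfield.new('a', 'Voina i mir')]),
  Field.new('880', [Subfield.new('6', '245-01'), Subfield.new('a', 'Война и мир')]),
  Field.new('880', [Subfield.new('6', '100-02'), Subfield.new('a', 'Толстой')])
]

titles = linked_alternate(fields, '245') { |sf| sf.code != '6' }
puts titles.inspect
# prints ["Война и мир"]
```

One detail worth noticing: when several tags are passed, the generated pattern has the form `/^245|100/`, where only the first alternative is anchored by `^`. A grouped pattern such as `/^(?:245|100)/` would anchor every alternative; whether the difference matters for real records is not something the source states.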
#linked_alternate_not_6_or_8(record, subfield6_value) ⇒ Array
Common case: extract all subfields except 6 and 8 from each 880 datafield that has a particular subfield 6 value. Subfield 6 is excluded because its value is the linkage ID itself; the original comment gives no reason for excluding subfield 8.
# File 'lib/pennmarc/util.rb', line 143
def linked_alternate_not_6_or_8(record, subfield6_value)
  excluded_subfields = %w[6 8]
  linked_alternate(record, subfield6_value) do |sf|
    excluded_subfields.exclude?(sf.code)
  end
end
#prefixed_subject_and_alternate(record, prefix) ⇒ Array
Get 650 & 880 for Provenance and Chronology: prefix should be ‘PRO’ or ‘CHR’ and may be preceded by a ‘%’. Note (11/2018): $5 is not displayed in PRO or CHR subjects.
# File 'lib/pennmarc/util.rb', line 217
def prefixed_subject_and_alternate(record, prefix)
  record.fields(%w[650 880]).filter_map do |field|
    next unless field.indicator2 == '4'
    next if field.tag == '880' && subfield_values(field, '6').exclude?('650')
    next unless field.any? { |sf| sf.code == 'a' && sf.value =~ /^(#{prefix}|%#{prefix})/ }

    elements = field.select(&subfield_in?(%w[a])).map { |sf| sf.value.gsub(/^%?#{prefix}/, '') }
    elements << join_subfields(field, &subfield_not_in?(%w[a 6 8 e w 5]))
    join_and_squish elements
  end
end
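The prefix handling is the least obvious part of this method, so here is a small sketch of just that step. The helper names `prefixed?` and `display_value` are hypothetical and exist only for this illustration; `gsub(/\s+/, ' ').strip` approximates ActiveSupport's `squish`.

```ruby
# Hypothetical helper: does a subfield a value start with the prefix,
# optionally preceded by '%'? Mirrors the /^(PRO|%PRO)/ style check.
def prefixed?(value, prefix)
  value.match?(/^(#{prefix}|%#{prefix})/)
end

# Hypothetical helper: strip the prefix from the subfield a value and join it
# with the remaining (already joined) subfield text.
def display_value(a_value, rest, prefix)
  stripped = a_value.gsub(/^%?#{prefix}/, '')
  [stripped, rest].join(' ').gsub(/\s+/, ' ').strip
end

puts prefixed?('%PRO Smith, John (donor)', 'PRO')
# prints true
puts prefixed?('Provenance note', 'PRO')
# prints false
puts display_value('%PRO Smith, John (donor)', 'Bookplate', 'PRO')
# prints "Smith, John (donor) Bookplate"
```

The match is case-sensitive, so a value beginning with ‘Pro’ is not treated as a ‘PRO’ heading.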
#remove_paren_value_from_subfield_i(field) ⇒ String
If there’s a subfield i, take its value, remove the first parenthesized portion (if any), then trim a trailing period and colon. Returns an empty string when the field has no subfield i.
# File 'lib/pennmarc/util.rb', line 187
def remove_paren_value_from_subfield_i(field)
  val = field.filter_map { |sf|
    next unless sf.code == 'i'

    match = /\((.+?)\)/.match(sf.value)
    if match
      sf.value.sub("(#{match[1]})", '')
    else
      sf.value
    end
  }.first || ''
  trim_trailing(:colon, trim_trailing(:period, val))
end
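A short worked example may help here. It is a sketch under assumptions: `Subfield` is a hypothetical Struct, a plain Array stands in for the field, and `trim_trailing` is reduced to the two trailers this method actually uses.

```ruby
# Hypothetical stand-in for a MARC subfield; not the real class.
Subfield = Struct.new(:code, :value)

# Reduced copy of trim_trailing covering only :colon and :period.
def trim_trailing(trailer, string)
  map = { colon: /\s*:\s*$/, period: /\.\s*$/ }
  string.sub(map[trailer.to_sym], '')
end

def remove_paren_value_from_subfield_i(field)
  val = field.filter_map { |sf|
    next unless sf.code == 'i'

    match = /\((.+?)\)/.match(sf.value)
    match ? sf.value.sub("(#{match[1]})", '') : sf.value
  }.first || ''
  trim_trailing(:colon, trim_trailing(:period, val))
end

field = [Subfield.new('i', 'Translation of (work):'), Subfield.new('a', 'Some title')]
puts remove_paren_value_from_subfield_i(field)
# prints "Translation of"
```

The intermediate value is "Translation of :"; the colon rule then removes the trailing " :" together with the whitespace before it.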
#subfield_defined?(field, subfield) ⇒ TrueClass, FalseClass
Check if a field has a given subfield defined
# File 'lib/pennmarc/util.rb', line 71
def subfield_defined?(field, subfield)
  field.any? { |sf| sf.code == subfield.to_s }
end
#subfield_in?(array) ⇒ Proc
returns a lambda checking if passed-in subfield’s code is a member of array
# File 'lib/pennmarc/util.rb', line 56
def subfield_in?(array)
  ->(subfield) { array.member?(subfield.code) }
end
#subfield_not_in?(array) ⇒ Proc
returns a lambda checking if passed-in subfield’s code is NOT a member of array
# File 'lib/pennmarc/util.rb', line 63
def subfield_not_in?(array)
  ->(subfield) { !array.member?(subfield.code) }
end
#subfield_undefined?(field, subfield) ⇒ TrueClass, FalseClass
Check if a field does not have a given subfield defined
# File 'lib/pennmarc/util.rb', line 79
def subfield_undefined?(field, subfield)
  field.none? { |sf| sf.code == subfield.to_s }
end
#subfield_value?(field, subfield, regex) ⇒ TrueClass, FalseClass
Returns true if the field has the given subfield with a value that matches the passed-in regex.
# File 'lib/pennmarc/util.rb', line 30
def subfield_value?(field, subfield, regex)
  field&.any? { |sf| sf.code == subfield.to_s && sf.value =~ regex }
end
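A possible usage example, sketched with a hypothetical `Subfield` Struct and a plain Array in place of a real MARC datafield:

```ruby
# Hypothetical stand-in for a MARC subfield; not the real class.
Subfield = Struct.new(:code, :value)

def subfield_value?(field, subfield, regex)
  field&.any? { |sf| sf.code == subfield.to_s && sf.value =~ regex }
end

field = [Subfield.new('6', '245-01'), Subfield.new('a', 'Title')]

puts subfield_value?(field, '6', /^245/).inspect
# prints true
puts subfield_value?(field, :a, /^245/).inspect
# prints false (the subfield code may be given as a symbol)
puts subfield_value?(nil, '6', /^245/).inspect
# prints nil (safe navigation short-circuits on a nil field)
```

Because of the `&.` call, a nil field yields nil rather than false, which is slightly wider than the documented TrueClass, FalseClass return type.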
#subfield_value_in?(field, subfield, array) ⇒ TrueClass, FalseClass
Returns true if the given field has the given subfield with a value that is in the given array.
# File 'lib/pennmarc/util.rb', line 40
def subfield_value_in?(field, subfield, array)
  field.any? { |sf| sf.code == subfield.to_s && sf.value.in?(array) }
end
#subfield_value_not_in?(field, subfield, array) ⇒ TrueClass, FalseClass
Returns true if the given field has no occurrence of the given subfield with a value in the given array.
# File 'lib/pennmarc/util.rb', line 49
def subfield_value_not_in?(field, subfield, array)
  field.none? { |sf| sf.code == subfield.to_s && sf.value.in?(array) }
end
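The two membership checks above can be exercised as follows. This is a sketch under assumptions: `Subfield` is a hypothetical Struct, `include?` replaces ActiveSupport's `in?`, and the `allowed` list is made up for the example.

```ruby
# Hypothetical stand-in for a MARC subfield; not the real class.
Subfield = Struct.new(:code, :value)

def subfield_value_in?(field, subfield, array)
  field.any? { |sf| sf.code == subfield.to_s && array.include?(sf.value) }
end

def subfield_value_not_in?(field, subfield, array)
  field.none? { |sf| sf.code == subfield.to_s && array.include?(sf.value) }
end

allowed = %w[lcsh mesh] # illustrative values only
with_source = [Subfield.new('a', 'Cats'), Subfield.new('2', 'mesh')]
no_source = [Subfield.new('a', 'Cats')]

puts subfield_value_in?(with_source, '2', allowed)
# prints true
puts subfield_value_not_in?(with_source, '2', allowed)
# prints false
puts subfield_value_not_in?(no_source, '2', allowed)
# prints true (a field without the subfield also counts as "not in")
```

The last call shows that subfield_value_not_in? is the exact negation of subfield_value_in?, so it is also true when the subfield is absent altogether.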
#subfield_values(field, subfield) ⇒ Array
Gets all subfield values for a subfield in a given field
# File 'lib/pennmarc/util.rb', line 87
def subfield_values(field, subfield)
  field.filter_map do |sf|
    next unless sf.code == subfield.to_s

    next if sf.value.blank?

    sf.value
  end
end
#subfield_values_for(tag:, subfield:, record:) ⇒ Array
Get all subfield values for a provided subfield from any occurrence of a provided tag/tags
# File 'lib/pennmarc/util.rb', line 102
def subfield_values_for(tag:, subfield:, record:)
  record.fields(tag).flat_map do |field|
    subfield_values field, subfield
  end
end
#substring_after(string, target) ⇒ String (frozen)
Get the substring of a string after the first occurrence of a target character
# File 'lib/pennmarc/util.rb', line 172
def substring_after(string, target)
  string.scan(target).present? ? string.split(target, 2).second : ''
end
#substring_before(string, target) ⇒ String (frozen)
Get the substring of a string up to a given target character
# File 'lib/pennmarc/util.rb', line 164
def substring_before(string, target)
  string.scan(target).present? ? string.split(target, 2).first : ''
end
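The behaviour of substring_before and substring_after can be approximated in plain Ruby. The sketch assumes `target` is a String and uses `include?` and `[1]` where the originals use ActiveSupport's `present?` and `second`.

```ruby
# Plain-Ruby approximations; `target` is assumed to be a String.
def substring_before(string, target)
  string.include?(target) ? string.split(target, 2).first : ''
end

def substring_after(string, target)
  string.include?(target) ? string.split(target, 2)[1] : ''
end

puts substring_before('a:b:c', ':')
# prints "a"
puts substring_after('a:b:c', ':')
# prints "b:c"
puts substring_before('abc', ':').inspect
# prints "" (target not present)
```

Splitting with a limit of 2 means only the first occurrence of the target is used as the boundary, and both methods fall back to an empty string when the target is absent.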
#translate_relator(relator_code, mapping) ⇒ String, NilClass
Translate a relator code using mapping. Returns nil for a blank code. TODO: handle the case of receiving a URI, e.g., loc.gov/relator/aut.
# File 'lib/pennmarc/util.rb', line 206
def translate_relator(relator_code, mapping)
  return if relator_code.blank?

  mapping[relator_code.to_sym]
end
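A minimal sketch of the lookup, assuming a mapping Hash keyed by symbols. The mapping contents below are invented for illustration, and the nil/empty check approximates ActiveSupport's `blank?`.

```ruby
def translate_relator(relator_code, mapping)
  # Approximates `blank?` for nil, empty and whitespace-only strings.
  return if relator_code.nil? || relator_code.to_s.strip.empty?

  mapping[relator_code.to_sym]
end

mapping = { aut: 'Author', edt: 'Editor' } # illustrative mapping only

puts translate_relator('aut', mapping)
# prints "Author"
puts translate_relator('', mapping).inspect
# prints nil
puts translate_relator('loc.gov/relator/aut', mapping).inspect
# prints nil (the whole string becomes the lookup key)
```

The last call shows the case the TODO refers to: a URI is converted to a symbol as a whole and finds no entry in the mapping.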
#trim_trailing(trailer, string) ⇒ Object
# File 'lib/pennmarc/util.rb', line 110
def trim_trailing(trailer, string)
  map = { semicolon: /\s*;\s*$/,
          colon: /\s*:\s*$/,
          equal: /=$/,
          slash: %r{\s*/\s*$},
          comma: /\s*,\s*$/,
          period: /\.\s*$/ } # TODO: revise to exclude "etc."
  string.sub map[trailer.to_sym], ''
end
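Since trim_trailing carries no description, a few calls may clarify what it does. The method body below repeats the listing above; the input strings are made up for the example.

```ruby
def trim_trailing(trailer, string)
  map = { semicolon: /\s*;\s*$/,
          colon: /\s*:\s*$/,
          equal: /=$/,
          slash: %r{\s*/\s*$},
          comma: /\s*,\s*$/,
          period: /\.\s*$/ }
  string.sub(map[trailer.to_sym], '')
end

puts trim_trailing(:period, 'Philadelphia. ')
# prints "Philadelphia"
puts trim_trailing(:slash, 'Hamlet / ')
# prints "Hamlet"
puts trim_trailing('comma', 'Smith, John,')
# prints "Smith, John" (only the trailing comma is removed)
puts trim_trailing(:period, 'Poems, etc.')
# prints "Poems, etc" (the case the TODO wants to exclude)
```

The trailer may be given as a symbol or a string, because the method calls `to_sym` before the lookup.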
#valid_subject_genre_source_code?(field) ⇒ Boolean
Does the given field specify an allowed source code?
# File 'lib/pennmarc/util.rb', line 235
def valid_subject_genre_source_code?(field)
  subfield_value_in?(field, '2', PennMARC::HeadingControl::ALLOWED_SOURCE_CODES)
end