Method: Traject::MarcExtractor.parse_string_spec

Defined in:
lib/traject/marc_extractor.rb

.parse_string_spec(spec_string) ⇒ Object

Converts a MARC spec string such as "008[35]:245abc:700a" into the hash used internally to represent the specification. An array of such strings is also accepted. See the comments at the head of the class for documentation of the string specification format.

Return value

The returned hash is keyed by tag. Each value is an array of 0 or more MarcExtractor::Spec objects representing the extraction operations specified for that tag.

The value is an array because the same tag can be given in more than one extraction, for instance "245a:245abc".

See tests for more examples.
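As a rough illustration of the return shape described above, the following hand-built sketch shows what the hash for "245a:245abc:008[35]" would look like. `SketchSpec` is a hypothetical stand-in for MarcExtractor::Spec (the real class is not reproduced here), and the field names are assumed from the attributes the method sets.

```ruby
# Hypothetical stand-in for MarcExtractor::Spec, for illustration only.
SketchSpec = Struct.new(:tag, :subfields, :indicator1, :indicator2, :bytes,
                        keyword_init: true)

# Hand-built picture of the hash shape for "245a:245abc:008[35]":
# one key per tag, each value an array of specs for that tag.
EXPECTED_SHAPE = {
  "245" => [
    SketchSpec.new(tag: "245", subfields: ["a"]),
    SketchSpec.new(tag: "245", subfields: ["a", "b", "c"]),
  ],
  "008" => [
    SketchSpec.new(tag: "008", bytes: 35),
  ],
}

EXPECTED_SHAPE.each do |tag, specs|
  puts "#{tag}: #{specs.length} spec(s)"
end
```

Two extractions on tag 245 end up in the same array, while 008 gets its own single-element array.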



# File 'lib/traject/marc_extractor.rb', line 197

def self.parse_string_spec(spec_string)
  # plain hash with no default value; per-tag arrays are created on demand below
  hash = Hash.new

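  # accept a single spec string or an array of them; split on ':' with optional surrounding whitespace
  spec_strings = spec_string.is_a?(Array) ? spec_string.map{|s| s.split(/\s*:\s*/)}.flatten : spec_string.split(/\s*:\s*/)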

  spec_strings.each do |part|
    if (part =~ /\A([a-zA-Z0-9]{3})(\|([a-z0-9\ \*]{2})\|)?([a-z0-9]*)?\Z/)
      # variable field
      tag, indicators, subfields = $1, $3, $4

      spec = Spec.new(:tag => tag)

      if subfields and !subfields.empty?
        spec.subfields = subfields.split('')
      end

      if indicators
        # if specified as '*', leave nil
        spec.indicator1 = indicators[0] if indicators[0] != "*"
        spec.indicator2 = indicators[1] if indicators[1] != "*"
      end

      hash[spec.tag] ||= []
      hash[spec.tag] << spec

    elsif (part =~ /\A([a-zA-Z0-9]{3})(\[(\d+)(-(\d+))?\])\Z/) # control field, "005[4-5]"
      tag, byte1, byte2 = $1, $3, $5

      spec = Spec.new(:tag => tag)

      if byte1 && byte2
        spec.bytes = ((byte1.to_i)..(byte2.to_i))
      elsif byte1
        spec.bytes = byte1.to_i
      end

      hash[spec.tag] ||= []
      hash[spec.tag] << spec
    else
      raise ArgumentError.new("Unrecognized marc extract specification: #{part}")
    end
  end

  return hash
end
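
To see how the two patterns in the method above pull a single part apart, here is a minimal standalone sketch. The regular expressions are copied from the listing; `sketch_parse_part` is a hypothetical helper that returns a plain Hash rather than a Spec object, so it does not depend on the rest of the class.

```ruby
# Patterns copied from parse_string_spec.
VARIABLE_FIELD = /\A([a-zA-Z0-9]{3})(\|([a-z0-9\ \*]{2})\|)?([a-z0-9]*)?\Z/
CONTROL_FIELD  = /\A([a-zA-Z0-9]{3})(\[(\d+)(-(\d+))?\])\Z/

# Hypothetical helper, for illustration only: parses one part of a spec
# string and returns a plain Hash instead of a MarcExtractor::Spec.
def sketch_parse_part(part)
  if (m = VARIABLE_FIELD.match(part))
    indicators = m[3]
    subfields  = m[4].to_s
    {
      tag: m[1],
      subfields: subfields.empty? ? nil : subfields.split(''),
      # '*' means "any value", which is represented as nil
      indicator1: (indicators && indicators[0] != "*") ? indicators[0] : nil,
      indicator2: (indicators && indicators[1] != "*") ? indicators[1] : nil,
    }
  elsif (m = CONTROL_FIELD.match(part))
    # "005[4-5]" gives a Range, "008[35]" gives a single Integer
    bytes = m[5] ? (m[3].to_i..m[5].to_i) : m[3].to_i
    { tag: m[1], bytes: bytes }
  else
    raise ArgumentError, "Unrecognized marc extract specification: #{part}"
  end
end

variable = sketch_parse_part("245|1*|ab")  # indicator1 is "1", indicator2 is nil
ranged   = sketch_parse_part("005[4-5]")   # bytes is the range 4..5
single   = sketch_parse_part("008[35]")    # bytes is the integer 35
puts variable[:subfields].join(",")
puts ranged[:bytes].inspect
puts single[:bytes].inspect
```

A part that fits neither pattern, such as "245 abc", raises ArgumentError, which mirrors the final branch of the method.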