Module: StdNum::LCCN

Defined in:
lib/library_stdnums.rb

Overview

Validate and normalize LCCNs

Class Method Details

.normalize(rawlccn) ⇒ String?

Parameters:

  • rawlccn (String)

    The possible LCCN to normalize

Returns:

  • (String, nil)

    the normalized LCCN, or nil if it looks malformed



# File 'lib/library_stdnums.rb', line 282

def self.normalize rawlccn
  lccn = reduce_to_basic(rawlccn)
  # If there's a dash in it, deal with that.
  if lccn =~ /^(.*?)\-(.+)/
    pre =  $1
    post = $2
    return nil unless post =~ /^\d+$/ # must be all digits
    lccn = "%s%06d" % [pre, post.to_i]
  end

  if valid?(lccn, true)
    return lccn
  else
    return nil
  end
end
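
The dash handling above can be tried in isolation. The following is a minimal sketch, not part of the library: `expand_dash` is a hypothetical helper that mirrors only the zero-padding step of `normalize` (the serial number after the dash is left-padded to six digits), and it uses `\A`/`\z` anchors instead of `^`/`$` as a stylistic assumption.

```ruby
# Hypothetical helper (not in StdNum::LCCN): reproduces only the
# dash-expansion step shown in normalize.
def expand_dash(lccn)
  m = /\A(.*?)-(.+)\z/.match(lccn)
  return lccn unless m                    # no dash: nothing to expand

  pre, post = m[1], m[2]
  return nil unless post =~ /\A\d+\z/     # the serial must be all digits

  format('%s%06d', pre, post.to_i)        # left-pad the serial to six digits
end

expand_dash('85-2')         # => "85000002"
expand_dash('2001-627090')  # => "2001627090"
expand_dash('n78-890351')   # => "n78890351"
expand_dash('85-2a')        # => nil
```

In the real method the result is then passed to valid?(lccn, true), so a string that expands cleanly can still come back as nil.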

.reduce_to_basic(str) ⇒ String

Get a string ready for processing as an LCCN

Parameters:

  • str (String)

    The possible LCCN

Returns:

  • (String)

    The munged string, ready for normalization



# File 'lib/library_stdnums.rb', line 272

def self.reduce_to_basic str
  rv = str.gsub(/\s/, '')  # ditch spaces
  rv.gsub!('http://lccn.loc.gov/', '') # remove URI prefix
  rv.gsub!(/\/.*$/, '') # ditch everything after the first '/' (including the slash)
  return rv
end
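
To see what each of the three substitutions does, here is a small standalone sketch. `reduce_to_basic_sketch` is a hypothetical copy of the method above, written without the in-place `gsub!` calls so it can run outside the module; the sample inputs are illustrative.

```ruby
# Hypothetical standalone copy of reduce_to_basic, for experimentation only.
def reduce_to_basic_sketch(str)
  rv = str.gsub(/\s/, '')                   # drop all whitespace
  rv = rv.gsub('http://lccn.loc.gov/', '')  # drop the permalink prefix
  rv.sub(%r{/.*\z}, '')                     # drop the first '/' and everything after it
end

reduce_to_basic_sketch(' 79-139101 /AC/r932')            # => "79-139101"
reduce_to_basic_sketch('http://lccn.loc.gov/n78-890351') # => "n78-890351"
reduce_to_basic_sketch('2001-000002')                    # => "2001-000002"
```

Note that the dash survives this step; it is expanded later, in normalize.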

.valid?(lccn, preprocessed = false) ⇒ Boolean

The rules for validity according to http://www.loc.gov/marc/lccn-namespace.html#syntax:

A normalized LCCN is a character string eight to twelve characters in length. (For purposes of this description characters are ordered from left to right -- "first" means "leftmost".) The rightmost eight characters are always digits. If the length is 9, then the first character must be alphabetic. If the length is 10, then the first two characters must be either both digits or both alphabetic. If the length is 11, then the first character must be alphabetic and the next two characters must be either both digits or both alphabetic. If the length is 12, then the first two characters must be alphabetic and the remaining characters digits.

Parameters:

  • lccn (String)

    The LCCN to validate

  • preprocessed (Boolean) (defaults to: false)

    Set to true if the number has already been normalized

Returns:

  • (Boolean)

    Whether or not the syntax seems ok



# File 'lib/library_stdnums.rb', line 312

def self.valid? lccn, preprocessed = false
  lccn = normalize(lccn) unless preprocessed
  return false unless lccn
  clean = lccn.gsub(/\-/, '')
  suffix = clean[-8..-1] # "the rightmost eight characters are always digits"
  return false unless suffix and suffix =~ /^\d+$/
  case clean.size # "...is a character string eight to twelve characters in length"
  when 8
    return true
  when 9
    return true if clean =~ /^[A-Za-z]/
  when 10
    return true if clean =~ /^\d{2}/ or clean =~ /^[A-Za-z]{2}/
  when 11
    return true if clean =~ /^[A-Za-z](\d{2}|[A-Za-z]{2})/
  when 12
    return true if clean =~ /^[A-Za-z]{2}\d{2}/
  else
    return false
  end

  return false
end
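
The length-based rules quoted above can also be written as one pattern per length. This is a sketch under stated assumptions, not the library's implementation: `LCCN_SHAPES` and `lccn_shape_ok?` are hypothetical names, and the input is assumed to be already normalized (no dashes, no whitespace).

```ruby
# Hypothetical lookup table: total length => full-string pattern.
# The trailing \d{8} encodes "the rightmost eight characters are always digits".
LCCN_SHAPES = {
  8  => /\A\d{8}\z/,
  9  => /\A[A-Za-z]\d{8}\z/,
  10 => /\A(?:\d{2}|[A-Za-z]{2})\d{8}\z/,
  11 => /\A[A-Za-z](?:\d{2}|[A-Za-z]{2})\d{8}\z/,
  12 => /\A[A-Za-z]{2}\d{2}\d{8}\z/
}.freeze

# Returns true when the string has one of the five permitted shapes.
def lccn_shape_ok?(normalized)
  pattern = LCCN_SHAPES[normalized.length]
  !pattern.nil? && pattern.match?(normalized)
end

lccn_shape_ok?('85000002')      # => true  (8 digits)
lccn_shape_ok?('n78890351')     # => true  (1 letter + 8 digits)
lccn_shape_ok?('2001627090')    # => true  (2 digits + 8 digits)
lccn_shape_ok?('agr25000003')   # => true  (1 letter + 2 letters + 8 digits)
lccn_shape_ok?('mm2002084896')  # => true  (2 letters + 10 digits)
lccn_shape_ok?('a1b2345678')    # => false (mixed two-character prefix)
lccn_shape_ok?('1234567')       # => false (too short)
```

Unlike valid?, this sketch neither strips dashes nor calls normalize, so it only answers the narrower question of whether a string already has a normalized shape.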