Module: StdNum::LCCN

Defined in:
lib/library_stdnums.rb

Overview

Validate and normalize LCCNs

Class Method Details

.normalize(rawlccn) ⇒ String?

Parameters:

  • rawlccn (String)

    The possible LCCN to normalize

Returns:

  • (String, nil)

    the normalized LCCN, or nil if it looks malformed



# File 'lib/library_stdnums.rb', line 259

def self.normalize rawlccn
  lccn = reduce_to_basic(rawlccn)
  # If there's a dash in it, deal with that.
  if lccn =~ /^(.*?)\-(.+)/
    pre =  $1
    post = $2
    return nil unless post =~ /^\d+$/ # must be all digits
    lccn = "%s%06d" % [pre, post.to_i]
  end

  # Return the normalized string only if it passes validation
  valid?(lccn, true) ? lccn : nil
end
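
The dash-handling step above can be tried in isolation. The following is a minimal sketch, not part of the library: `expand_dash` is a hypothetical helper that repeats only the regular expression and the `%06d` zero-padding from `normalize`, so it does not run the final `valid?` check.

```ruby
# Hypothetical helper (not part of StdNum::LCCN): mirrors only the
# dash-handling step of normalize, without the final validity check.
def expand_dash(lccn)
  match = /^(.*?)-(.+)/.match(lccn)
  return lccn unless match            # no dash: nothing to expand

  pre = match[1]
  post = match[2]
  return nil unless post =~ /^\d+$/   # text after the dash must be all digits

  format('%s%06d', pre, post.to_i)    # left-pad the serial part to six digits
end

puts expand_dash('n78-890351')   # n78890351
puts expand_dash('85-2')         # 85000002
puts expand_dash('2001-000002')  # 2001000002
p expand_dash('n78-89035a')      # nil
```

Because the first capture group is non-greedy, the split happens at the first dash; a second dash would end up in the part that must be all digits, so the sketch returns nil for such input.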

.reduce_to_basic(str) ⇒ String

Get a string ready for processing as an LCCN

Parameters:

  • str (String)

    The possible LCCN

Returns:

  • (String)

    The munged string, ready for normalization



# File 'lib/library_stdnums.rb', line 249

def self.reduce_to_basic str
  rv = str.gsub(/\s/, '')  # ditch spaces
  rv.gsub!('http://lccn.loc.gov/', '') # remove URI prefix
  rv.gsub!(/\/.*$/, '') # ditch everything after the first '/' (including the slash)
  return rv
end
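
To see what the three cleanup steps do to typical input, here is a small standalone sketch. `basic_lccn` is a hypothetical stand-in for `reduce_to_basic` that applies the same substitutions in the same order; it is meant as an illustration, not as the library's code.

```ruby
# Hypothetical stand-in for reduce_to_basic (not part of the library);
# it applies the same three substitutions in the same order.
def basic_lccn(str)
  str.gsub(/\s/, '')                      # drop all whitespace
     .gsub('http://lccn.loc.gov/', '')    # drop the URI prefix, if present
     .gsub(%r{/.*$}, '')                  # drop the first '/' and everything after it
end

puts basic_lccn(' 79139101 /AC/r932')              # 79139101
puts basic_lccn('75-425165//r75')                  # 75-425165
puts basic_lccn('http://lccn.loc.gov/n78-890351')  # n78-890351
puts basic_lccn('https://lccn.loc.gov/n78890351')  # https:
```

The last call shows a consequence of the ordering: only the exact `http://lccn.loc.gov/` prefix is removed, so an `https` URI reaches the slash-stripping step intact and is cut down to `https:`.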

.valid?(lccn, preprocessed = false) ⇒ Boolean

Checks whether a string is a syntactically valid LCCN, following the rules at http://www.loc.gov/marc/lccn-namespace.html#syntax:

A normalized LCCN is a character string eight to twelve characters in length. (For purposes of this description characters are ordered from left to right -- "first" means "leftmost".) The rightmost eight characters are always digits. If the length is 9, then the first character must be alphabetic. If the length is 10, then the first two characters must be either both digits or both alphabetic. If the length is 11, then the first character must be alphabetic and the next two characters must be either both digits or both alphabetic. If the length is 12, then the first two characters must be alphabetic and the remaining characters digits.

Parameters:

  • lccn (String)

    The LCCN to validate

  • preprocessed (Boolean) (defaults to: false)

    Set to true if the number has already been normalized; the call to normalize is then skipped

Returns:

  • (Boolean)

    Whether or not the syntax seems ok



# File 'lib/library_stdnums.rb', line 289

def self.valid? lccn, preprocessed = false
  lccn = normalize(lccn) unless preprocessed
  return false unless lccn
  clean = lccn.gsub(/\-/, '')
  suffix = clean[-8..-1] # "the rightmost eight characters are always digits"
  return false unless suffix and suffix =~ /^\d+$/
  case clean.size # "...is a character string eight to twelve characters in length"
  when 8
    return true
  when 9
    return true if clean =~ /^[A-Za-z]/
  when 10
    return true if clean =~ /^\d{2}/ or clean =~ /^[A-Za-z]{2}/
  when 11
    return true if clean =~ /^[A-Za-z](\d{2}|[A-Za-z]{2})/
  when 12
    return true if clean =~ /^[A-Za-z]{2}\d{2}/
  else
    return false
  end

  return false
end
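
The length rules quoted in the description of `valid?` can also be written as one anchored pattern per permitted length. This is a sketch under stated assumptions, not the library's implementation: `LCCN_SHAPES` and `shape_ok?` are hypothetical names, and the sketch expects an already normalized string (no dashes, no whitespace) and does not call `normalize`.

```ruby
# Hypothetical lookup table (not part of the library): one anchored pattern
# per permitted length, transcribed from the rules quoted for valid?.
LCCN_SHAPES = {
  8  => /\A\d{8}\z/,
  9  => /\A[A-Za-z]\d{8}\z/,
  10 => /\A(?:\d{2}|[A-Za-z]{2})\d{8}\z/,
  11 => /\A[A-Za-z](?:\d{2}|[A-Za-z]{2})\d{8}\z/,
  12 => /\A[A-Za-z]{2}\d{10}\z/
}.freeze

# Returns true when the string has a permitted length and matches the
# pattern for that length; any other length is rejected.
def shape_ok?(lccn)
  pattern = LCCN_SHAPES[lccn.length]
  !pattern.nil? && pattern.match?(lccn)
end

p shape_ok?('85000002')      # true  (8 digits)
p shape_ok?('n78890351')     # true  (letter + 8 digits)
p shape_ok?('2001000002')    # true  (2 digits + 8 digits)
p shape_ok?('112345678')     # false (length 9 must start with a letter)
p shape_ok?('a1b12345678')   # false (length 11: second and third characters are mixed)
p shape_ok?('1234567')       # false (too short)
```

Anchoring with `\A` and `\z` makes each pattern cover the whole string, so the separate "rightmost eight characters are digits" check is folded into the table.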