marc4j4r

A Ruby wrapper around marc4j.jar (as forked by javamarc), a Java library for dealing with library MARC data.

Note: rdoc.info doesn’t do a great job with this project; I think it’s getting confused by the Java stuff. The classes are listed under “Working with records and fields” below.

Getting a MARC reader

marc4j4r provides four readers out of the box: :strictmarc (binary), :permissivemarc (binary, permissive), :marcxml (MARC-XML), and :alephsequential (Ex Libris’s AlephSequential format).

You can pass either a filename or an open IO object (either a Ruby IO or a java.io.InputStream).

require 'marc4j4r'

binreader = MARC4J4R::Reader.new('test.mrc') # defaults to :strictmarc
binreader = MARC4J4R::Reader.new('test.mrc', :strictmarc)

permissivereader =  MARC4J4R::Reader.new('test.mrc', :permissivemarc)

xmlreader = MARC4J4R::Reader.new('test.xml', :marcxml)
asreader = MARC4J4R::Reader.new('test.seq', :alephsequential)

# Or use a file object

reader = MARC4J4R::Reader.new(File.open('test.mrc'))

# Or a java.io.InputStream

jurl = Java::java.net.URL.new('http://my.machine.com/test.mrc')
istream = jurl.openConnection.getInputStream
reader = MARC4J4R::Reader.new(istream)
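
If you handle files in several formats, a small helper can pick the reader type from the file extension. This is only a sketch: reader_type_for and READER_TYPES are hypothetical names, not part of marc4j4r, and the extension mapping simply mirrors the filenames used in the examples above.

```ruby
# Hypothetical helper, not part of marc4j4r: guess the reader type from a
# filename's extension. The mapping mirrors the example filenames above.
READER_TYPES = {
  '.mrc' => :strictmarc,
  '.xml' => :marcxml,
  '.seq' => :alephsequential
}.freeze

# Fall back to the given default when the extension is not recognized
def reader_type_for(filename, default = :strictmarc)
  READER_TYPES.fetch(File.extname(filename).downcase, default)
end

xml_type     = reader_type_for('test.xml')
seq_type     = reader_type_for('TEST.SEQ')
unknown_type = reader_type_for('records.dat')
```

The result can then be passed as the second argument, as in MARC4J4R::Reader.new(path, reader_type_for(path)).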

Using the reader

A MARC4J4R::Reader is an Enumerable, so you can do:

reader.each do |record|
  # do stuff with the record
end

Or, if you’re using threach:

reader.threach(2) do |record|
  # do stuff with records in two threads
end
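
Because the reader is an Enumerable, the other Enumerable methods (map, select, each_slice, and so on) should be available as well. The sketch below uses a hypothetical StubReader class and plain hashes in place of a real MARC4J4R::Reader and real records, so that it runs without JRuby or a MARC file; only the #each contract is mimicked.

```ruby
# StubReader is a hypothetical stand-in for MARC4J4R::Reader; it only
# mimics the #each contract that Enumerable relies on.
class StubReader
  include Enumerable

  def initialize(records)
    @records = records
  end

  # Yield each record in turn, as a real reader would
  def each
    return enum_for(:each) unless block_given?
    @records.each { |record| yield record }
  end
end

# Plain hashes stand in for MARC records here
reader = StubReader.new([
  { id: '001', title: 'First' },
  { id: '002', title: 'Second' },
  { id: '003', title: 'Third' }
])

# Collect one value per record
ids = reader.map { |record| record[:id] }

# Process records in batches of two
batches = reader.each_slice(2).to_a
```

Unlike this stub, a real reader works through an input stream, so it is safest to assume you get a single pass over the records.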

Using the writer

writer = MARC4J4R::Writer.new(filename, :strictmarc) # binary
# or, for MARC-XML:
# writer = MARC4J4R::Writer.new(filename, :marcxml)

writer.write(record)
# repeat for each record
writer.close
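
Since the writer has to be closed explicitly, it can help to wrap the write loop so that close runs even when something raises. A minimal sketch, assuming only the #write and #close calls shown above: with_writer is a hypothetical helper and StubWriter a hypothetical stand-in for MARC4J4R::Writer, used here so the example runs without JRuby.

```ruby
require 'stringio'

# Hypothetical helper (not part of marc4j4r): yields the writer and
# guarantees #close is called, even if the block raises.
def with_writer(writer)
  yield writer
ensure
  writer.close
end

# StubWriter is a stand-in for MARC4J4R::Writer so the sketch runs anywhere;
# it only mimics the #write and #close calls shown above.
class StubWriter
  attr_reader :closed

  def initialize(io)
    @io = io
    @closed = false
  end

  def write(record)
    @io.puts(record.to_s)
  end

  def close
    @closed = true
  end
end

out = StringIO.new
stub = StubWriter.new(out)

# Strings stand in for MARC records here
with_writer(stub) do |writer|
  %w[record-1 record-2].each { |record| writer.write(record) }
end
```

With the real library, the same helper would presumably be called as with_writer(MARC4J4R::Writer.new(filename, :marcxml)) { |writer| ... }.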

Working with records and fields

In addition to all the normal marc4j methods, MARC4J4R::Record exposes some additional methods and syntaxes.

See the classes themselves and/or the specs for more examples.

  • MARC4J4R::Reader

  • MARC4J4R::Writer

  • MARC4J4R::Record

  • MARC4J4R::ControlField

  • MARC4J4R::DataField

  • MARC4J4R::SubField

    leader = record.leader
    
    # All fields are available via #each or #fields
    
    fields = record.fields
    
    record.each do |field|
      # do something with each controlfield/datafield; returned in the order they were added
    end
    
    # Controlfields have a tag and a value
    
    idfield = record['001']
    idfield.tag # => '001'
    id = idfield.value # or idfield.data, same thing
    
    # Get the first datafield with a given tag
    first700 = record['700'] # Note: need to use strings, not integers
    
    # Stringify a field to get all the subfields joined with spaces
    
    fullTitle = record['245'].to_s
    
    # Get all fields with a given tag, or with any of several tags
    
    all700s = record.find_by_tag '700'
    all700and856s = record.find_by_tag ['700', '856']
    
    # Construct and add a controlfield
    record << MARC4J4R::ControlField.new('001', '0000333234')
    
    # Construct and add a datafield (tag, ind1, and ind2 are your own values)
    df = MARC4J4R::DataField.new(tag, ind1, ind2)
    
    # Read the indicators back
    ind1 = df.ind1
    ind2 = df.ind2
    
    df << MARC4J4R::Subfield.new('a', 'the $a value')
    df << MARC4J4R::Subfield.new('b', 'the $b value')   
    
    # Add it to a record
    
    record << df
    
    # Get subfields or their values
    
    firstSubfieldAValue = df['a']
    
    allSubfields = df.subs
    allSubfieldAs = df.subs('a')
    allSubfieldAorBs = df.subs(['a', 'b'])
    
    allSubfieldAorBValues = df.sub_values(['a', 'b'])
    

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, but do not mess with the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a commit by itself that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 Bill Dueber. See LICENSE for details.