marc4j4r

A Ruby wrapper around marc4j.jar (as forked by javamarc), a Java library for working with library MARC data.

JRuby version alert

MARC4J4R::Record#each throws an error in JRuby versions before 1.7.1 when run in --1.9 mode. Your best bet is to use JRuby 1.7.1 or higher. (The error in question is, I think, JRUBY-6581.)
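If you want a script to warn about this at startup, a version guard along these lines may help. This is only a sketch: `jruby_new_enough?` and `MIN_JRUBY` are hypothetical names, not part of marc4j4r, and the comparison relies on `Gem::Version` from RubyGems.

```ruby
require 'rubygems'

# Hypothetical constant: the first JRuby release where Record#each is
# expected to work in --1.9 mode, per the alert above.
MIN_JRUBY = Gem::Version.new('1.7.1')

# Hypothetical helper (not part of marc4j4r): true when the given
# version string is 1.7.1 or later.
def jruby_new_enough?(version)
  Gem::Version.new(version) >= MIN_JRUBY
end

# JRUBY_VERSION is only defined when running under JRuby, so this check
# is silently skipped on other Ruby implementations.
if defined?(JRUBY_VERSION) && !jruby_new_enough?(JRUBY_VERSION)
  warn "marc4j4r: JRuby #{JRUBY_VERSION} may break Record#each in --1.9 mode; use 1.7.1 or higher"
end
```

Comparing with `Gem::Version` rather than plain strings avoids treating '1.7.10' as older than '1.7.2'.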

Deprecation alert

I'm giving up on this standalone module and focusing my efforts on making a marc4j add-on for the standard ruby-marc distribution.

Getting a MARC reader

marc4j4r provides four readers out of the box: :strictmarc (binary), :permissivemarc (binary, permissive), :marcxml (MARC-XML), and :alephsequential (Ex Libris's AlephSequential format).

You can pass either a filename or an open IO object (either a Ruby IO or a java.io.InputStream).

require 'marc4j4r'

binreader = MARC4J4R::Reader.new('test.mrc') # defaults to :strictmarc
binreader = MARC4J4R::Reader.new('test.mrc', :strictmarc)

permissivereader =  MARC4J4R::Reader.new('test.mrc', :permissivemarc)

xmlreader = MARC4J4R::Reader.new('test.xml', :marcxml)
asreader = MARC4J4R::Reader.new('test.seq', :alephsequential)

# Or use a file object

reader = MARC4J4R::Reader.new(File.open('test.mrc'))

# Or a java.io.InputStream

jurl = Java::java.net.URL.new('http://my.machine.com/test.mrc')
istream = jurl.openConnection.getInputStream
reader = MARC4J4R::Reader.new(istream)
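When filenames follow a predictable convention, the reader type can be chosen from the extension. The sketch below assumes the extensions used in the examples above (.mrc, .xml, .seq); `READER_TYPES` and `reader_type_for` are hypothetical names and not part of marc4j4r.

```ruby
# Hypothetical mapping from file extension to reader type symbol.
# The extensions are a naming convention assumed for this sketch.
READER_TYPES = {
  '.mrc' => :strictmarc,
  '.xml' => :marcxml,
  '.seq' => :alephsequential
}.freeze

# Pick a reader type for a filename, falling back to a default when the
# extension is not recognized.
def reader_type_for(filename, default = :strictmarc)
  READER_TYPES.fetch(File.extname(filename).downcase, default)
end

# Under JRuby with marc4j4r loaded, this could be used as:
#   reader = MARC4J4R::Reader.new(path, reader_type_for(path))
```

Pass :permissivemarc as the default if you would rather tolerate malformed binary records.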

Using the reader

A MARC4J4R::Reader is an Enumerable, so you can do:

reader.each do |record|
  # do stuff with the record
end

Or, if you're using jruby_threach:

reader.threach(2) do |record|
  # do stuff with records in two threads
end
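Because the reader is Enumerable, helpers can be written against anything that responds to #each. The following is a rough sketch: `ids_with_tag` is a hypothetical helper, and the `Fake*` classes are stand-ins so the example runs without JRuby. It assumes record[tag] returns nil when the record has no field with that tag.

```ruby
# Stand-ins so the sketch runs without JRuby; real code would pass a
# MARC4J4R::Reader instead of a FakeReader.
FakeField = Struct.new(:tag, :value)

class FakeRecord
  def initialize(fields)
    @fields = fields
  end

  # Mirrors record['001']: the first field with the tag, or nil.
  def [](tag)
    @fields.find { |f| f.tag == tag }
  end
end

class FakeReader
  include Enumerable

  def initialize(records)
    @records = records
  end

  def each(&block)
    @records.each(&block)
  end
end

# Hypothetical helper: the 001 values of every record that has at least
# one field with the given tag. Only the IDs are kept, not the records.
def ids_with_tag(reader, tag)
  reader.each_with_object([]) do |record, ids|
    ids << record['001'].value if record[tag]
  end
end

reader = FakeReader.new([
  FakeRecord.new([FakeField.new('001', 'a1'), FakeField.new('856', 'url')]),
  FakeRecord.new([FakeField.new('001', 'b2')])
])
puts ids_with_tag(reader, '856').inspect
```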

Using the writer

  writer = MARC4J4R::Writer.new(filename, :strictmarc)
  # or, for MARC-XML output:
  # writer = MARC4J4R::Writer.new(filename, :marcxml)

  writer.write(record)
  # repeat
  writer.close
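To make sure the writer is closed even if a write fails partway through, the write/close sequence can be wrapped in a small helper. This is a minimal sketch: `write_all` is a hypothetical name, and it relies only on the #write and #close methods shown above.

```ruby
# Hypothetical helper (not part of marc4j4r): writes every record to the
# writer and returns how many were written. The ensure clause closes the
# writer whether or not a write raises.
def write_all(writer, records)
  count = 0
  records.each do |record|
    writer.write(record)
    count += 1
  end
  count
ensure
  writer.close
end

# Under JRuby with marc4j4r loaded, this could be used as:
#   write_all(MARC4J4R::Writer.new('out.xml', :marcxml), reader)
```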

Working with records and fields

In addition to all the normal marc4j methods, MARC4J4R::Record exposes some additional methods and syntaxes.

See the classes themselves and/or the specs for more examples.

  • MARC4J4R::Reader
  • MARC4J4R::Writer
  • MARC4J4R::Record
  • MARC4J4R::ControlField
  • MARC4J4R::DataField
  • MARC4J4R::Subfield

    leader = record.leader

    # All fields are available via #each or #fields

    fields = record.fields

    record.each do |field|
      # do something with each controlfield/datafield; returned in the order they were added
    end

    # Controlfields have a tag and a value

    idfield = record['001']
    idfield.tag # => '001'
    id = idfield.value # or idfield.data, same thing

    # Get the first datafield with a given tag
    first700 = record['700'] # Note: need to use strings, not integers

    # Stringify a field to get all the subfields joined with spaces

    fullTitle = record['245'].to_s

    all700s = record.find_by_tag '700'
    all700and856s = record.find_by_tag ['700', '856']

    # Construct and add a controlfield
    record << MARC4J4R::ControlField.new('001', '0000333234')

    # Construct and add a datafield
    df = MARC4J4R::DataField.new(tag, ind1, ind2)

    ind1 = df.ind1
    ind2 = df.ind2

    df << MARC4J4R::Subfield.new('a', 'the $a value')
    df << MARC4J4R::Subfield.new('b', 'the $b value')

    # Add it to a record

    record << df

    # Get subfields or their values

    firstSubfieldAValue = df['a']

    allSubfields = df.subs
    allSubfieldAs = df.subs('a')
    allSubfieldAorBs = df.subs(['a', 'b'])

    allSubfieldAorBValues = df.sub_values(['a', 'b'])
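The calls above can be combined to pull subfield values out of several fields at once. The sketch below is illustrative only: `subfield_values` is a hypothetical helper, and it assumes that find_by_tag returns an array of fields and that sub_values returns an array of strings. It is meant for datafield tags, since controlfields have no subfields.

```ruby
# Hypothetical helper (not part of marc4j4r): the values of the given
# subfield codes from every datafield whose tag is in tags. Assumes
# find_by_tag returns an array and sub_values returns an array of strings.
def subfield_values(record, tags, codes)
  record.find_by_tag(tags).flat_map { |field| field.sub_values(codes) }
end

# Under JRuby with marc4j4r loaded, this could be used as:
#   names = subfield_values(record, ['100', '700'], ['a'])
```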

Install

$ gem install marc4j4r

Note on Patches/Pull Requests

  • Fork the project.
  • Make your feature addition or bug fix.
  • Add tests for it. This is important so I don't break it in a future version unintentionally.
  • Commit, but do not touch the Rakefile, version, or history. (If you want your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)
  • Send me a pull request. Bonus points for topic branches.

Copyright (c) 2012 Bill Dueber

See LICENSE for details.