Class: MARC::UnsafeXMLWriter

Inherits:
XMLWriter
Defined in:
lib/marc/unsafe_xmlwriter.rb

Overview

UnsafeXMLWriter bypasses real XML handlers like REXML or Nokogiri and simply concatenates strings to produce the XML document. It offers no guarantee of validity if the MARC record you're encoding isn't valid, and it won't do things like entity expansion. It does escape text content using Ruby's String#encode(xml: :text), and it is much faster: 4-5 times faster than using Nokogiri, and 15-20 times faster than the REXML version.
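As a quick illustration of the escaping mentioned above, the following sketch shows what String#encode does with the :text and :attr XML options. The helper names `escape_text` and `escape_attr` are made up for this example and are not part of the library.

```ruby
# Hypothetical helper: escape a string for use as XML element content.
# String#encode(xml: :text) replaces &, < and > with entity references.
def escape_text(str)
  str.encode(xml: :text)
end

# Hypothetical helper: escape a string for use as an XML attribute value.
# String#encode(xml: :attr) also escapes double quotes and wraps the
# result in double quotes.
def escape_attr(str)
  str.encode(xml: :attr)
end

puts escape_text("Smith & Sons <1999>")
puts escape_attr('say "hi"')
```

Note that :text leaves double quotes untouched, which is fine for element content but not sufficient for attribute values.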

Constant Summary

XML_HEADER =
'<?xml version="1.0" encoding="UTF-8"?>'
NS_ATTRS =
%(xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.loc.gov/MARC21/slim" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd")
NS_COLLECTION =
"<collection #{NS_ATTRS}>".freeze
COLLECTION =
"<collection>".freeze
NS_RECORD =
"<record #{NS_ATTRS}>".freeze
RECORD =
"<record>".freeze

Constants inherited from XMLWriter

XMLWriter::COLLECTION_TAG

Class Method Summary

Instance Method Summary

Methods inherited from XMLWriter

#close, fix_leader, #initialize, #stylesheet_tag

Constructor Details

This class inherits a constructor from MARC::XMLWriter

Class Method Details

.encode(record, include_namespace: true) ⇒ String

Take a record and turn it into a MARC-XML string. The result is valid MARC-XML only if the record itself is valid. Note that this is an XML snippet, without an XML header or <collection> enclosure.

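Parameters:

  • record (MARC::Record)
  • include_namespace (Boolean) (defaults to: true)

    Whether to namespace the resulting XML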

Returns:

  • (String)

    The XML snippet of the record in MARC-XML



# File 'lib/marc/unsafe_xmlwriter.rb', line 58

def encode(record, include_namespace: true)
  xml = open_record(include_namespace: include_namespace).dup

  # MARCXML only allows alphanumerics or spaces in the leader
  lead = fix_leader(record.leader)

  xml << "<leader>" << lead.encode(xml: :text) << "</leader>"
  record.each do |f|
    if f.instance_of?(MARC::DataField)
      xml << open_datafield(f.tag, f.indicator1, f.indicator2)
      f.each do |sf|
        xml << open_subfield(sf.code) << sf.value.encode(xml: :text) << "</subfield>"
      end
      xml << "</datafield>"
    elsif f.instance_of?(MARC::ControlField)
      xml << open_controlfield(f.tag) << f.value.encode(xml: :text) << "</controlfield>"
    end
  end
  xml << "</record>"
  xml.force_encoding("utf-8")
end
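The concatenation approach used by encode can be tried without the marc gem. The sketch below is only an approximation under stated assumptions: `ControlStub`, `DataStub`, and `sketch_encode` are hypothetical stand-ins invented for this example, not the gem's classes or methods, and the leader is passed in directly rather than being cleaned by fix_leader.

```ruby
# Hypothetical stand-ins for MARC::ControlField and MARC::DataField.
ControlStub = Struct.new(:tag, :value)
DataStub = Struct.new(:tag, :ind1, :ind2, :subfields)

# Build a <record> snippet by appending to a single mutable string.
# Only text content is escaped; attribute values are interpolated as-is,
# mirroring the method shown above.
def sketch_encode(leader, fields)
  xml = +"<record>"
  xml << "<leader>" << leader.encode(xml: :text) << "</leader>"
  fields.each do |f|
    case f
    when DataStub
      xml << %(<datafield tag="#{f.tag}" ind1="#{f.ind1}" ind2="#{f.ind2}">)
      f.subfields.each do |code, value|
        xml << %(<subfield code="#{code}">) << value.encode(xml: :text) << "</subfield>"
      end
      xml << "</datafield>"
    when ControlStub
      xml << %(<controlfield tag="#{f.tag}">) << f.value.encode(xml: :text) << "</controlfield>"
    end
  end
  xml << "</record>"
end

fields = [
  ControlStub.new("001", "12345"),
  DataStub.new("245", "1", "0", [["a", "War & Peace"]])
]
puts sketch_encode("00000nam a2200000 a 4500", fields)
```

Appending to one string with << avoids building an intermediate document tree, which is the likely source of the speed advantage described in the overview.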

.open_collection(include_namespace: true) ⇒ Object

Open the <collection> tag, with or without the namespace attributes



# File 'lib/marc/unsafe_xmlwriter.rb', line 26

def open_collection(include_namespace: true)
  if include_namespace
    NS_COLLECTION
  else
    COLLECTION
  end
end

.open_controlfield(tag) ⇒ Object



# File 'lib/marc/unsafe_xmlwriter.rb', line 88

def open_controlfield(tag)
  "<controlfield tag=\"#{tag}\">"
end

.open_datafield(tag, ind1, ind2) ⇒ Object



# File 'lib/marc/unsafe_xmlwriter.rb', line 80

def open_datafield(tag, ind1, ind2)
  "<datafield tag=\"#{tag}\" ind1=\"#{ind1}\" ind2=\"#{ind2}\">"
end
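Because the tag and indicators are interpolated directly, a value containing a double quote would produce malformed XML. The sketch below contrasts that behavior with one possible alternative. Both `raw_datafield` and `escaped_datafield` are hypothetical names for this example; the escaped variant is an assumption about how one might harden the output, not something the library does.

```ruby
# Mirrors the interpolation in open_datafield: values are inserted as-is.
def raw_datafield(tag, ind1, ind2)
  "<datafield tag=\"#{tag}\" ind1=\"#{ind1}\" ind2=\"#{ind2}\">"
end

# Hypothetical alternative (not part of the gem): String#encode(xml: :attr)
# escapes &, <, > and " and wraps each value in double quotes.
def escaped_datafield(tag, ind1, ind2)
  "<datafield tag=#{tag.encode(xml: :attr)} " \
    "ind1=#{ind1.encode(xml: :attr)} ind2=#{ind2.encode(xml: :attr)}>"
end

puts raw_datafield("245", '"', "0")
puts escaped_datafield("245", '"', "0")
```

For well-formed MARC records, where tags and indicators are plain alphanumerics or spaces, the two variants produce the same output.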

.open_record(include_namespace: true) ⇒ Object



# File 'lib/marc/unsafe_xmlwriter.rb', line 34

def open_record(include_namespace: true)
  if include_namespace
    NS_RECORD
  else
    RECORD
  end
end

.open_subfield(code) ⇒ Object



# File 'lib/marc/unsafe_xmlwriter.rb', line 84

def open_subfield(code)
  "<subfield code=\"#{code}\">"
end

.single_record_document(record, include_namespace: true) ⇒ Object

Produce a complete XML document string (XML header included) containing a single record inside a <collection>

Parameters:

  • record (MARC::Record)
  • include_namespace (Boolean) (defaults to: true)

    Whether to namespace the resulting XML



# File 'lib/marc/unsafe_xmlwriter.rb', line 45

def single_record_document(record, include_namespace: true)
  xml = XML_HEADER.dup
  xml << open_collection(include_namespace: include_namespace)
  xml << encode(record, include_namespace: false)
  xml << "</collection>".freeze
  xml
end
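The method above passes include_namespace: false to encode, so the namespace is declared once on the enclosing <collection> and not repeated on the <record>. A minimal sketch of that assembly follows. `SKETCH_HEADER`, `SKETCH_NS`, and `sketch_document` are hypothetical names, the namespace attributes are shortened to the default xmlns declaration only, and `record_xml` stands in for the output of encode.

```ruby
SKETCH_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'
# Shortened for illustration; the real NS_ATTRS also carries the xsi attributes.
SKETCH_NS = 'xmlns="http://www.loc.gov/MARC21/slim"'

# Wrap an already-encoded, un-namespaced <record> snippet in a collection.
def sketch_document(record_xml, include_namespace: true)
  xml = SKETCH_HEADER.dup
  xml << (include_namespace ? "<collection #{SKETCH_NS}>" : "<collection>")
  xml << record_xml
  xml << "</collection>"
end

puts sketch_document("<record><leader>x</leader></record>")
```

Child elements inherit the default namespace from <collection>, so declaring it on the outer element alone is enough for a namespaced document.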

Instance Method Details

#write(record) ⇒ Object

Write the record to the target

Parameters:

  • record (MARC::Record)



# File 'lib/marc/unsafe_xmlwriter.rb', line 20

def write(record)
  @fh.write(self.class.encode(record))
end