XML::Mixup: A mixin for XML markup

require 'xml-mixup'

class Anything
  include XML::Mixup
end

something = Anything.new

# generate a structure
node = something.markup spec: [
  { '#pi'   => 'xml-stylesheet', type: 'text/xsl', href: '/transform' },
  { '#dtd'  => :html },
  { '#html' => [
    { '#head' => [
      { '#title' => 'look ma, title' },
      { '#elem'  => :base, href: 'http://the.base/url' },
    ] },
    { '#body' => [
      { '#h1' => 'Illustrious Heading' },
      { '#p'  => :lolwut },
    ] },
  ], xmlns: 'http://www.w3.org/1999/xhtml' }
]

# `node` will correspond to the last thing generated. In this
# case, it will be a text node containing 'lolwut'.

doc = node.document
puts doc.to_xml

Yet another XML markup generator?

Some time ago, I wrote a Perl module called Role::Markup::XML. I did this because I had a lot of XML to generate, and was dissatisfied with what was on offer at the time. Now I have a lot of XML to generate using Ruby, and I have run into many of the same problems:

Structure is generated by procedure calls

Granted it's a lot nicer to do this sort of thing in Ruby, but at the end of the day, the thing generating the XML is a nested list of method calls — not a declarative data structure.

Document has to be generated all in one shot

It's not super-easy to generate a piece of the target document and then go back and generate some more (although Nokogiri::XML::Builder.with is a nice start). This plus the last point leads to all sorts of cockamamy constructs which are almost as life-sucking as writing raw DOM routines.

Hard to do surgery on existing documents

This comes up a lot: you have an existing document and you want to add even just a single node to it — say, in between two nodes just for fun. Good luck with that.

Enter XML::Mixup

  • The input consists of ordinary Ruby data objects, so you can build them up ahead of time, in bulk, transform them, and so on.
  • You can sprinkle pre-built XML subtrees anywhere into the spec, so you can memoize repeating elements or otherwise compile a document incrementally.
  • You can attach newly generated content anywhere: underneath a parent node, or before, after, or instead of a node at the sibling level.
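
Because a spec is just nested hashes and arrays, it can be assembled with ordinary collection methods before anything touches the XML library. Here is a minimal sketch, assuming a hypothetical `links` table and the made-up names `items` and `nav_spec` (none of them are part of XML::Mixup); only the hash keys follow the spec format shown in the synopsis:

```ruby
# Hypothetical input data; in practice this could come from anywhere.
links = { 'Home' => '/', 'About' => '/about' }

# Build one list-item spec per entry using a plain map.
items = links.map do |label, href|
  { '#li' => [{ '#a' => label, href: href }] }
end

# Wrap the items in a parent element; still nothing but Ruby data.
nav_spec = { '#ul' => items, class: 'nav' }

# The finished structure would then be handed to markup, e.g.:
#   something.markup spec: nav_spec
```

Since `nav_spec` is plain data, it can be cached, merged, or transformed like any other Ruby structure before it is rendered.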

The spec format

At the heart of this module is a single method called markup, which, among other things, takes a :spec. The spec can be any composite of the following objects:

Hashes

The principal construct in XML::Mixup is the Hash. You can generate pretty much any node with it:

Elements

{ '#tag' => 'foo' }                 # => <foo/>

# or
{ '#elem' => 'foo' }                # => <foo/>

# or, with the element name as a symbol
{ '#element' => :foo }              # => <foo/>

# or, with nil as a key
{ nil => :foo }                     # => <foo/>

# or, with attributes
{ nil => :foo, bar: :hi }           # => <foo bar="hi"/>

# or, with namespaces
{ nil => :foo, xmlns: 'urn:x-bar' } # => <foo xmlns="urn:x-bar"/>

# or, with more namespaces
{ nil => :foo, xmlns: 'urn:x-bar', 'xmlns:hurr' => 'urn:x-durr' }
# => <foo xmlns="urn:x-bar" xmlns:hurr="urn:x-durr"/>

# or, with content
{ nil => [:foo, :hi] }              # => <foo>hi</foo>

# or, shove your child nodes into an otherwise content-less key
{ [:hi] => :foo, bar: :hurr }       # => <foo bar="hurr">hi</foo>

Attributes are sorted lexically. Composite attribute values get flattened like this:

{ nil => :foo, array: [:a, :b], hash: { e: :f, c: :d } }
# => <foo array="a b" hash="c: d e: f"/>
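
The flattening rule can be illustrated in plain Ruby. This is a sketch inferred from the example above, not the library's own code: `flatten_attr` is a hypothetical helper, and the assumption that hash pairs are ordered by key comes only from the output shown.

```ruby
# Hypothetical helper mirroring the flattening shown above.
def flatten_attr(value)
  case value
  when Hash
    # Assumed: pairs are sorted by key and rendered as "key: value".
    value.sort_by { |k, _| k.to_s }
         .map { |k, v| "#{k}: #{flatten_attr(v)}" }
         .join(' ')
  when Array
    # Array members are joined with single spaces.
    value.map { |v| flatten_attr(v) }.join(' ')
  else
    # Scalars (symbols, strings, numbers) are simply stringified.
    value.to_s
  end
end

flatten_attr([:a, :b])         # => "a b"
flatten_attr({ e: :f, c: :d }) # => "c: d e: f"
```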

Processing instructions

{ '#pi' => 'xml-stylesheet', type: 'text/xsl', href: '/transform' }
# => <?xml-stylesheet type="text/xsl" href="/transform"?>

# or, if you like typing
{ '#processing-instruction' => :hurr } # => <?hurr?>

DOCTYPE declarations

{ '#dtd' => :html } # => <!DOCTYPE html>

# or (note either :public or :system can be nil)
{ '#dtd' => [:html, :public, :system] }
# => <!DOCTYPE html PUBLIC "public" "system">

# or, same thing
{ '#doctype' => :html, public: :public, system: :system }

Comments and CDATA sections

Comments and CDATA are flattened into string literals:

{ '#comment' => :whatever }     # => <!-- whatever -->

{ '#cdata' => '<what-everrr>' } # => <![CDATA[<what-everrr>]]>

Pretty straightforward?

Arrays

Parts of a spec that are arrays (or really anything that can be turned into one) are attached at the same level of the document in the sequence given, as you might expect.

Nokogiri::XML::Node objects

These are automatically cloned, but otherwise passed in as-is.

Procs, lambdas etc.

These are executed with any supplied :args, and then markup is run again over the result. (Take care not to supply a Proc that produces another Proc.)
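
Here is a sketch of how a Proc might sit in a spec. The `expand` helper below is hypothetical and only imitates the behaviour described above (call the Proc with the arguments, then process the result again); the real work happens inside markup.

```ruby
# A lambda that produces a list-item spec from its arguments.
item = ->(label, href) { { '#li' => [{ '#a' => label, href: href }] } }

# Hypothetical stand-in for the Proc-handling step; not the library's code.
def expand(spec, args = [])
  case spec
  when Proc  then expand(spec.call(*args), args) # call, then process the result
  when Array then spec.map { |s| expand(s, args) }
  else spec
  end
end

expand(item, ['Home', '/'])
# => { '#li' => [{ '#a' => 'Home', href: '/' }] }
```

With the library itself, the equivalent would presumably be `something.markup spec: item, args: ['Home', '/']`, with the arguments supplied through :args as described above.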

Everything else

Anything else is turned into a text node.

Installation

Come on, you know how to do this:

$ gem install xml-mixup

Contributing

Bug reports and pull requests are welcome at https://github.com/doriantaylor/rb-xml-mixup.

License

This software is provided under the Apache License, 2.0.