XML::Mixup: A mixin for XML markup
```ruby
require 'xml-mixup'

class Anything
  include XML::Mixup
end

something = Anything.new

# generate a structure
node = something.markup spec: [
  { '#pi'   => 'xml-stylesheet', type: 'text/xsl', href: '/transform' },
  { '#dtd'  => :html },
  { '#html' => [
    { '#head' => [
      { '#title' => 'look ma, title' },
      { '#elem'  => :base, href: 'http://the.base/url' },
    ] },
    { '#body' => [
      { '#h1' => 'Illustrious Heading' },
      { '#p'  => :lolwut },
    ] },
  ], xmlns: 'http://www.w3.org/1999/xhtml' }
]

# `node` will correspond to the last thing generated. In this
# case, it will be a text node containing 'lolwut'.

doc = node.document
puts doc.to_xml

# => <?xml version="1.0"?>
# => <?xml-stylesheet type="text/xsl" href="/transform"?>
# => <!DOCTYPE html>
# => <html xmlns="http://www.w3.org/1999/xhtml">
# =>   <head>
# =>     <title>look ma, title</title>
# =>     <base href="http://the.base/url"/>
# =>   </head>
# =>   <body>
# =>     <h1>Illustrious Heading</h1>
# =>     <p>lolwut</p>
# =>   </body>
# => </html>
```
Yet another XML markup generator?
Some time ago, I wrote a Perl module called Role::Markup::XML. I did this because I had a lot of XML to generate, and was dissatisfied with what was currently on offer. Now I have a lot of XML to generate using Ruby, and found a lot of the same things:
Structure is generated by procedure calls
Granted it’s a lot nicer to do this sort of thing in Ruby, but at the end of the day, the thing generating the XML is a nested list of method calls — not a declarative data structure.
Document has to be generated all in one shot
It’s not super-easy to generate a piece of the target document and then go back and generate some more (although Nokogiri::XML::Builder.with is a nice start). This plus the last point leads to all sorts of cockamamy constructs which are almost as life-sucking as writing raw DOM routines.
Hard to do surgery on existing documents
This comes up a lot: you have an existing document and you want to add even just a single node to it — say, in between two nodes just for fun. Good luck with that.
Enter XML::Mixup
- The input consists of ordinary Ruby data objects so you can build them up ahead of time, in bulk, transform them using familiar operations, etc.,
- Sprinkle pre-built XML subtrees anywhere into the spec so you can memoize repeating elements, or otherwise compile a document incrementally,
- Attach new generated content anywhere: underneath a parent node, or before, after, or instead of a node at the sibling level.
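As a minimal sketch of the first point, the spec below is assembled with an ordinary `map` before any XML exists. The `links` data and the `nav_spec` name are made up for illustration, and the final `markup` call is left commented out because it assumes an object that includes XML::Mixup, as in the synopsis.

```ruby
# Plain data: a list of [label, href] pairs to turn into navigation links.
links = [
  ['Home',  '/'],
  ['About', '/about'],
  ['Blog',  '/blog'],
]

# Build the list items with an ordinary map; each item is a Hash spec.
items = links.map do |label, href|
  { '#li' => { '#a' => label, href: href } }
end

# Wrap them in a parent element; attributes sit alongside the element key.
nav_spec = { '#ul' => items, class: :nav }

# Assumption: `something` includes XML::Mixup, as in the synopsis.
# node = something.markup spec: nav_spec
```

Because the spec is just nested Hashes and Arrays, it can be filtered, merged, or cached like any other Ruby data before it is handed to markup.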
The tree spec format
At the heart of this module is a single method called markup, which, among other things, takes a :spec. The spec can be any composite of these objects, and will behave as described:
Hashes
The principal construct in XML::Mixup is the Hash. You can generate pretty much any node with it:
Elements
```ruby
{ '#tag' => 'foo' }                      # => <foo/>

# or, with the element name as a symbol
{ '#element' => :foo }                   # => <foo/>

# or
{ '#elem' => 'foo' }                     # => <foo/>

# or, with nil as a key
{ nil => :foo }                          # => <foo/>

# or, with attributes
{ nil => :foo, bar: :hi }                # => <foo bar="hi"/>

# or, with namespaces
{ nil => :foo, xmlns: 'urn:x-bar' }      # => <foo xmlns="urn:x-bar"/>

# or, with more namespaces
{ nil => :foo, xmlns: 'urn:x-bar', 'xmlns:hurr' => 'urn:x-durr' }
# => <foo xmlns="urn:x-bar" xmlns:hurr="urn:x-durr"/>

# or, with content
{ nil => [:foo, :hi] }                   # => <foo>hi</foo>

# or, shove your child nodes into an otherwise content-less key
{ [:hi] => :foo, bar: :hurr }            # => <foo bar="hurr">hi</foo>

# or, if you have content and the element name is not a reserved word
{ '#html' => { '#head' => { '#title' => :hi } } }
# => <html><head><title>hi</title></head></html>

# also works with namespaces
{ '#atom:feed' => nil, 'xmlns:atom' => 'http://www.w3.org/2005/Atom' }
# => <atom:feed xmlns:atom="http://www.w3.org/2005/Atom"/>
```
Reserved hash keywords are: #comment, #cdata, #doctype, #dtd, #elem, #element, #pi, #processing-instruction, #tag. Note that the constructs { nil => :foo }, { nil => 'foo' }, and { '#foo' => nil }, plus [] anywhere you see nil, are all equivalent.
Attributes are sorted lexically. Composite attribute values get flattened like this:
```ruby
{ nil => :foo, array: [:a, :b], hash: { e: :f, c: :d } }
# => <foo array="a b" hash="c: d e: f"/>
```
Note that attribute values can also be Procs, which are fed arbitrary arguments from the markup method. A Proc is expected to return something which can subsequently be flattened. If an attribute value is nil or ultimately resolves to nil, or to an empty Array or Hash, that attribute will be omitted. nil values in arrays or hashes will also be skipped, as will empty-string values in arrays. This is different behaviour from versions prior to 0.1.10, where nil (or, e.g., []) would produce an attribute containing the empty string.
This change was made to eliminate a lot of clunky logic in application code to determine whether or not to include a given attribute. If you need to render attributes explicitly with empty strings, then explicitly pass in the empty string.
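The flattening rules above can be approximated in plain Ruby. This is an illustrative sketch only, not the library's implementation: `flatten_attr_sketch` is a hypothetical name, and the `key: value` rendering of hashes is an assumption about the output format.

```ruby
# Illustrative only: a stand-in for the rules described above,
# not XML::Mixup's own code. Returns a String, or nil to mean
# "omit this attribute".
def flatten_attr_sketch(value, *args)
  case value
  when nil
    nil
  when Proc
    # Procs are called with the supplied arguments, then flattened again.
    flatten_attr_sketch(value.call(*args), *args)
  when Hash
    # Assumed format: keys sorted lexically, rendered as "key: value".
    parts = value.sort_by { |k, _| k.to_s }.map { |k, v|
      flat = flatten_attr_sketch(v, *args)
      flat.nil? ? nil : "#{k}: #{flat}"
    }.compact
    parts.empty? ? nil : parts.join(' ')
  when Array
    # nil and empty-string members are skipped; the rest are space-joined.
    parts = value.map { |v| flatten_attr_sketch(v, *args) }
                 .reject { |v| v.nil? || v.empty? }
    parts.empty? ? nil : parts.join(' ')
  else
    # Everything else, including an explicit empty string, is stringified.
    value.to_s
  end
end

flatten_attr_sketch([:a, :b])              # "a b"
flatten_attr_sketch({ e: :f, c: :d })      # "c: d e: f"
flatten_attr_sketch([nil, ''])             # nil, so the attribute is omitted
flatten_attr_sketch('')                    # "", so the attribute is kept
```

Returning nil rather than an empty string is what lets calling code pass possibly-empty values straight through without first checking whether the attribute should be present.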
Processing instructions
```ruby
{ '#pi' => 'xml-stylesheet', type: 'text/xsl', href: '/transform' }
# => <?xml-stylesheet type="text/xsl" href="/transform"?>

# or, if you like typing
{ '#processing-instruction' => :hurr } # => <?hurr?>
```
DOCTYPE declarations
```ruby
{ '#dtd' => :html } # => <!DOCTYPE html>

# or (note either :public or :system can be nil)
{ '#dtd' => [:html, :public, :system] }
# => <!DOCTYPE html PUBLIC "public" "system">

# or, same thing
{ '#doctype' => :html, public: :public, system: :system }
```
Comments and CDATA sections
Comments and CDATA are flattened into string literals:
```ruby
{ '#comment' => :whatever }     # => <!-- whatever -->
{ '#cdata' => '<what-everrr>' } # => <![CDATA[<what-everrr>]]>
```
Pretty straightforward?
Arrays
Parts of a spec that are arrays (or really anything that can be turned into one) are attached at the same level of the document in the sequence given, as you might expect.
Nokogiri::XML::Node objects
These are automatically cloned, but otherwise passed in as-is.
Procs, lambdas etc.
These are executed with any supplied :args, and then markup is run again over the result. (Take care not to supply a Proc that produces another Proc.)
Everything else
Turned into a text node.
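To make the dispatch rules concrete, here is a toy renderer that walks a spec and emits a string. It is a sketch under heavy assumptions, not XML::Mixup: `render_sketch` and `escape_xml` are hypothetical names, only the `{ '#name' => content }` element form is handled, reserved keywords and Nokogiri nodes are ignored, and output is a plain String instead of a document tree.

```ruby
# Toy renderer: NOT XML::Mixup. It only mirrors the dispatch described
# above (Hash, Array, Proc, everything else) on a simplified spec.
def escape_xml(text)
  text.to_s.gsub('&', '&amp;').gsub('<', '&lt;')
      .gsub('>', '&gt;').gsub('"', '&quot;')
end

def render_sketch(spec, args = [])
  case spec
  when nil
    ''
  when Array
    # Arrays become siblings, in the order given.
    spec.map { |item| render_sketch(item, args) }.join
  when Proc
    # Procs are called with the args, then the result is rendered again.
    render_sketch(spec.call(*args), args)
  when Hash
    # The '#name' key names the element; other keys are attributes.
    tag_key = spec.keys.find { |k| k.is_a?(String) && k.start_with?('#') }
    raise ArgumentError, 'sketch only handles { "#name" => content }' unless tag_key

    name  = tag_key.delete_prefix('#')
    attrs = spec.reject { |k, _| k == tag_key }
                .sort_by { |k, _| k.to_s }
                .map { |k, v| %( #{k}="#{escape_xml(v)}") }.join
    body  = render_sketch(spec[tag_key], args)
    body.empty? ? "<#{name}#{attrs}/>" : "<#{name}#{attrs}>#{body}</#{name}>"
  else
    # Everything else becomes text.
    escape_xml(spec)
  end
end

spec = [
  { '#p' => ->(who) { ['Hello, ', who] }, class: :greeting },
  'tail & more',
]

puts render_sketch(spec, ['world'])
# prints: <p class="greeting">Hello, world</p>tail &amp; more
```

The real method builds Nokogiri nodes and returns the last one generated, so this string-based version is only useful for seeing how each kind of object in a spec is treated.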
Documentation
Generated and deposited in the usual place.
Installation
Come on, you know how to do this:
$ gem install xml-mixup
Or, download it off rubygems.org.
Contributing
Bug reports and pull requests are welcome at the GitHub repository.
The Future
As mentioned, this is pretty much a straight-across port of Role::Markup::XML, where it makes sense in Perl to bolt a bunch of related pseudo-private _FOO-looking instance methods onto an object so you can use them to make more streamlined methods. This may or may not make the same kind of sense with Ruby.
In particular, these methods do not touch the calling object’s state. In fact they should be completely stateless and side-effect free. Likewise, they are really meant to be private. As such, it may make sense to simply bundle them as class methods and use them as such. I don’t know, I haven’t decided yet.
License
This software is provided under the Apache License, 2.0.