Class: Libis::Tools::XmlDocument

Inherits:
Object
Defined in:
lib/libis/tools/xml_document.rb

Overview

This class combines the most commonly used features of Nokogiri, Nori and Gyoku in one convenience class. The Nokogiri document is stored in the instance attribute ‘document’ and can be accessed and manipulated directly if required.

In the examples we assume the following XML code:

<?xml version="1.0" encoding="utf-8"?>
<patron>
    <name>Harry Potter</name>
    <barcode library='Hogwarts Library'>1234567890</barcode>
    <access_level>student</access_level>
    <email>[email protected]</email>
    <email>[email protected]</email>
</patron>

Direct Known Subclasses

Metadata::DublinCoreRecord

Constructor Details

#initialize(encoding = 'utf-8') ⇒ XmlDocument

Create a new XmlDocument instance. The object will contain a new and empty Nokogiri XML Document. The object will not be valid until a root node is added.

Parameters:

  • encoding (String) (defaults to: 'utf-8')

    character encoding for the XML content; default value is ‘utf-8’



# File 'lib/libis/tools/xml_document.rb', line 45

def initialize(encoding = 'utf-8')
  @document = Nokogiri::XML::Document.new
  @document.encoding = encoding
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(method, *args, &block) ⇒ Object

Node access by method name.

Nodes can be accessed through a method whose name is the tag name of the node. There are several ways to use this shorthand method:

* without arguments it returns the first node found; if no such node exists, a new one is added
* with one argument it retrieves the node's attribute
* with one argument and an '=' sign it sets the content of the node
* with two arguments it sets the value of the node's attribute
* with a code block it implements the build pattern
* with a trailing '!' it always adds a new node first

Examples:

     xml_doc.email
     # => "[email protected]"
     p xml_doc.barcode 'library'
     # => "Hogwarts Library"
     xml_doc.access_level = 'postgraduate'
     xml_doc.barcode 'library', 'Hogwarts Dumbledore Library'
     xml_doc.dates do |dates|
       dates.birth_date '1980-07-31'
       dates.member_since '1991-09-01'
     end
     p xml_doc.to_xml
     # =>  <patron>
               ...
               <barcode library='Hogwarts Dumbledore Library'>1234567890</barcode>
               <access_level>postgraduate</access_level>
               ...
               <dates>
                 <birth_date>1980-07-31</birth_date>
                 <member_since>1991-09-01</member_since>
               </dates>
           </patron>


# File 'lib/libis/tools/xml_document.rb', line 544

def method_missing(method, *args, &block)
  super unless method.to_s =~ /^([a-z_][a-z_0-9]*)(!|=)?$/i
  node = get_node($1)
  node = add_node($1) if node.nil? || $2 == '!'
  case args.size
    when 0
      if block_given?
        build(node, &block)
      end
    when 1
      if $2.blank?
        return node[args.first.to_s]
      else
        node.content = args.first.to_s
      end
    when 2
      node[args.first.to_s] = args[1].to_s
      return node[args.first.to_s]
    else
      raise ArgumentError, 'Too many arguments.'
  end
  node
end
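
The dispatch in the listing above depends on the regular expression that splits the called method name into a tag name and an optional '!' or '=' suffix. The following is a minimal sketch of that split in plain Ruby, without Nokogiri. The helper split_method_name is a hypothetical name used only for illustration and is not part of XmlDocument.

```ruby
# Hypothetical helper, not part of XmlDocument. It applies the pattern from
# the method_missing listing and returns [tag, suffix], or nil when the name
# does not match (in which case the real method hands the call to super).
def split_method_name(method_name)
  pattern = /^([a-z_][a-z_0-9]*)(!|=)?$/i
  match = pattern.match(method_name.to_s)
  match && [match[1], match[2]]
end

p split_method_name('email')          # tag 'email', no suffix: plain node access
p split_method_name('access_level=')  # '=' suffix: set the node content
p split_method_name('dates!')         # '!' suffix: always add a new node
p split_method_name('bad-name')       # no match: nil
```

Tag names containing characters outside letters, digits and underscores (for example a hyphen or a namespace prefix with a colon) do not match the pattern, so such nodes have to be reached through #xpath or #get_node instead.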

Instance Attribute Details

#document ⇒ Object

Returns the value of attribute document.



# File 'lib/libis/tools/xml_document.rb', line 29

def document
  @document
end

Class Method Details

.add_attributes(node, attributes) ⇒ {Nokogiri::XML::Node}

Note:

The Nokogiri method Node#[]= is probably easier to use if you only want to add a single attribute; the main purpose of this method is to make it easier to add attributes in bulk or when you already have them available as a Hash.

Add attributes to a node. Example:

xml_doc.add_attributes xml_doc.root, status: 'active', id: '123456'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron id="123456" status="active">
        ...
    </patron>

Parameters:

  • node (Nokogiri::XML::Node)

    node to add the attributes to

  • attributes (Hash)

    a Hash with name-value pairs for each attribute

Returns:

  • ({Nokogiri::XML::Node})

    the node



# File 'lib/libis/tools/xml_document.rb', line 342

def self.add_attributes(node, attributes)

  attributes.each do |name, value|
    node.set_attribute name.to_s, value
  end

  node

end

.add_namespaces(node, namespaces) ⇒ Object

Add namespace information to a node.

Example:

xml_doc.add_namespaces xml_doc.root, jkr: 'http://JKRowling.com', node_ns: 'jkr'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <jkr:patron xmlns:jkr="http://JKRowling.com">
        ...
    </jkr:patron>

xml_doc.add_namespaces xml_doc.root, nil => 'http://JKRowling.com'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron xmlns="http://JKRowling.com">
        ...
    </patron>

Example:

node = xml_doc.create_text_node 'address', 'content'
xml_doc.add_namespaces node, node_ns: 'abc', abc: 'http://abc.org', xyz: 'http://xyz.org'
# node => <abc:address xmlns:abc="http://abc.org" xmlns:xyz="http://xyz.org">content</abc:address>

Parameters:

  • node (Nokogiri::XML::Node)

    the node to which the namespace info should be added

  • namespaces (Hash)

    a Hash with prefix-URI pairs for each namespace definition that should be added. The special key :node_ns is reserved for specifying the prefix for the node itself. To set the default namespace, use the prefix nil.



# File 'lib/libis/tools/xml_document.rb', line 385

def self.add_namespaces(node, namespaces)

  node_ns = namespaces.delete :node_ns
  default_ns = namespaces.delete nil

  namespaces.each do |prefix, prefix_uri|
    node.add_namespace prefix.to_s, prefix_uri
  end

  node.namespace_scopes.each do |ns|
    node.namespace = ns if ns.prefix == node_ns.to_s
  end if node_ns

  node.default_namespace = default_ns if default_ns

  node

end
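
One consequence visible in the listing above is that the two delete calls remove the :node_ns and nil keys from the Hash object the caller passed in. The sketch below reproduces only that behaviour in plain Ruby, without Nokogiri. The helper take_reserved_keys is hypothetical and not part of XmlDocument.

```ruby
# Hypothetical helper, not part of XmlDocument: performs the same two delete
# calls as add_namespaces and reports what is left of the Hash.
def take_reserved_keys(namespaces)
  node_ns = namespaces.delete(:node_ns)
  default_ns = namespaces.delete(nil)
  { node_ns: node_ns, default_ns: default_ns, remaining: namespaces }
end

shared = { node_ns: 'jkr', jkr: 'http://JKRowling.com' }

take_reserved_keys(shared.dup)  # operates on a copy
p shared.key?(:node_ns)         # the original Hash still contains :node_ns

take_reserved_keys(shared)      # operates on the caller's Hash
p shared.key?(:node_ns)         # :node_ns has now been removed
```

If the same namespace Hash is reused for several nodes, passing a copy (for example with dup) may therefore be the safer choice.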

.build(&block) ⇒ XmlDocument

Creates a new XML document with contents supplied in Nokogiri build short syntax.

Example:

xml_doc = ::Libis::Tools::XmlDocument.build do
  patron {
    name 'Harry Potter'
    barcode( '1234567890', library: 'Hogwarts Library')
    access_level 'student'
    email '[email protected]'
    email '[email protected]'
    books {
      book title: 'Quidditch Through the Ages', author: 'Kennilworthy Whisp', due_date: '1992-4-23'
    }
  }
end
p xml_doc.to_xml
# =>
      <?xml version="1.0" encoding="utf-8"?>
      <patron>
        <name>Harry Potter</name>
        <barcode library="Hogwarts Library">1234567890</barcode>
        <access_level>student</access_level>
        <email>[email protected]</email>
        <email>[email protected]</email>
        <books>
          <book title="Quidditch Through the Ages" author="Kennilworthy Whisp" due_date="1992-4-23"/>
        </books>
      </patron>

Parameters:

  • block (Code block)

    Build instructions

Returns:

  • (XmlDocument)

    the new document



# File 'lib/libis/tools/xml_document.rb', line 257

def self.build(&block)
  self.new.build(nil, &block)
end

.from_hash(hash, options = {}) ⇒ XmlDocument

Note:

The Hash will be converted with Gyoku. See the Gyoku documentation for the Hash format requirements.

Create a new instance initialized with a Hash.

Parameters:

  • hash (Hash)

    the content

  • options (Hash) (defaults to: {})

    options passed to Gyoku upon parsing the Hash into XML

Returns:

  • (XmlDocument)

    the new document



# File 'lib/libis/tools/xml_document.rb', line 75

def self.from_hash(hash, options = {})
  doc = XmlDocument.new
  doc.document = Nokogiri::XML(Gyoku.xml(hash, options))
  doc.document.encoding = 'utf-8'
  doc
end

.get_content(nodelist) ⇒ String

Return the content of the first element in the set of nodes.

Example:

::Libis::Tools::XmlDocument.get_content(xml_doc.xpath('//email')) # => "[email protected]"

Parameters:

  • nodelist ({Nokogiri::XML::NodeSet})

    set of nodes to get content from

Returns:

  • (String)

    content of the first node; always returns at least an empty string



# File 'lib/libis/tools/xml_document.rb', line 477

def self.get_content(nodelist)
  (nodelist.first && nodelist.first.content) || ''
end
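
The expression in the listing above guarantees a String result even for an empty node set. As a minimal sketch in plain Ruby, with a Struct standing in for a Nokogiri node, the same pattern can be tried without Nokogiri. The helper first_content and the stand-in fake_node are hypothetical names used only for illustration.

```ruby
# Hypothetical helper mirroring get_content: content of the first element,
# or an empty string when there is no first element.
def first_content(nodelist)
  (nodelist.first && nodelist.first.content) || ''
end

# Stand-in for a node; any object that responds to #content will do.
fake_node = Struct.new(:content)

p first_content([fake_node.new('first'), fake_node.new('second')])
p first_content([])
```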

.open(file) ⇒ XmlDocument

Create a new instance initialized with the content of an XML file.

Parameters:

  • file (String)

    path to the XML file

Returns:

  • (XmlDocument)

    the new document



# File 'lib/libis/tools/xml_document.rb', line 53

def self.open(file)
  doc = XmlDocument.new
  doc.document = Nokogiri::XML(File.open(file))
  # doc.document = Nokogiri::XML(File.open(file), &:noblanks)
  doc
end

.parse(xml) ⇒ XmlDocument

Create a new instance initialized with an XML String.

Parameters:

  • xml (String)

    the XML content to parse

Returns:

  • (XmlDocument)

    the new document



# File 'lib/libis/tools/xml_document.rb', line 63

def self.parse(xml)
  doc = XmlDocument.new
  doc.document = Nokogiri::XML.parse(xml)
  # doc.document = Nokogiri::XML.parse(xml, &:noblanks)
  doc
end

Instance Method Details

#[](path) ⇒ String

Return the content of the first element found.

Example:

xml_doc['email'] # => "[email protected]"

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

Returns:

  • (String)

    content or nil if not found



# File 'lib/libis/tools/xml_document.rb', line 454

def [](path)
  xpath(path).first.content rescue nil
end

#[]=(path, value) ⇒ String

Find a node and set its content.

Example:

xml_doc['//access_level'] = 'postgraduate'
p xml_doc.to_xml
# =>
      <?xml version="1.0" encoding="utf-8"?>
      <patron>
        ...
        <access_level>postgraduate</access_level>
        ...
      </patron>

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

  • value (String)

    the content

Returns:

  • (String)

    the new content; nil if the content could not be set



# File 'lib/libis/tools/xml_document.rb', line 498

def []=(path, value)
  begin
    nodes = xpath(path)
    nodes.first.content = value
  rescue
    # ignored
  end
end

#add_attributes(node, attributes) ⇒ {Nokogiri::XML::Node}

Note:

The Nokogiri method Node#[]= is probably easier to use if you only want to add a single attribute; the main purpose of this method is to make it easier to add attributes in bulk or when you already have them available as a Hash.

Add attributes to a node. Example:

xml_doc.add_attributes xml_doc.root, status: 'active', id: '123456'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron id="123456" status="active">
        ...
    </patron>

Parameters:

  • node (Nokogiri::XML::Node)

    node to add the attributes to

  • attributes (Hash)

    a Hash with name-value pairs for each attribute

Returns:

  • ({Nokogiri::XML::Node})

    the node



# File 'lib/libis/tools/xml_document.rb', line 337

def add_attributes(node, attributes)
  XmlDocument.add_attributes node, attributes
end

#add_namespaces(node, namespaces) ⇒ Object

Add namespace information to a node.

Example:

xml_doc.add_namespaces xml_doc.root, jkr: 'http://JKRowling.com', node_ns: 'jkr'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <jkr:patron xmlns:jkr="http://JKRowling.com">
        ...
    </jkr:patron>

xml_doc.add_namespaces xml_doc.root, nil => 'http://JKRowling.com'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron xmlns="http://JKRowling.com">
        ...
    </patron>

Example:

node = xml_doc.create_text_node 'address', 'content'
xml_doc.add_namespaces node, node_ns: 'abc', abc: 'http://abc.org', xyz: 'http://xyz.org'
# node => <abc:address xmlns:abc="http://abc.org" xmlns:xyz="http://xyz.org">content</abc:address>

Parameters:

  • node (Nokogiri::XML::Node)

    the node to which the namespace info should be added

  • namespaces (Hash)

    a Hash with prefix-URI pairs for each namespace definition that should be added. The special key :node_ns is reserved for specifying the prefix for the node itself. To set the default namespace, use the prefix nil.



# File 'lib/libis/tools/xml_document.rb', line 380

def add_namespaces(node, namespaces)
  XmlDocument.add_namespaces node, namespaces
end

#add_node(*args) ⇒ Nokogiri::XML::Node

Adds a new XML node to the document.

Example:

xml_doc = ::Libis::Tools::XmlDocument.new
xml_doc.valid? # => false
xml_doc.add_node :patron
xml_doc.add_node :name, 'Harry Potter'
books = xml_doc.add_node :books, nil, nil, namespaces: { jkr: 'http://JKRowling.com', node_ns: 'jkr' }
xml_doc.add_node :book, nil, books,
    title: 'Quidditch Through the Ages', author: 'Kennilworthy Whisp', due_date: '1992-4-23',
    namespaces: { node_ns: 'jkr' }
p xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron>
        <name>Harry Potter</name>
        <jkr:books xmlns:jkr="http://JKRowling.com">
          <jkr:book author="Kennilworthy Whisp" due_date="1992-4-23" title="Quidditch Through the Ages"/>
        </jkr:books>
    </patron>

Parameters:

  • args (Array)

    arguments being:

    • tag for the new node

    • optional content for new node; empty if nil or not present

    • optional parent node for new node; root if nil or not present; xml document if root is not defined

    • a Hash containing tag-value pairs for each attribute; the special key ‘:namespaces’ contains a Hash of namespace definitions as in #add_namespaces

Returns:

  • (Nokogiri::XML::Node)

    the new node



# File 'lib/libis/tools/xml_document.rb', line 290

def add_node(*args)
  attributes = {}
  attributes = args.pop if args.last.is_a? Hash
  name, value, parent = *args

  return nil if name.nil?

  node = Nokogiri::XML::Node.new name.to_s, @document
  node.content = value

  if !parent.nil?
    parent << node
  elsif !self.root.nil?
    self.root << node
  else
    self.root = node
  end

  return node if attributes.empty?

  namespaces = attributes.delete :namespaces
  add_namespaces(node, namespaces) if namespaces

  add_attributes(node, attributes) if attributes

  node

end
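
The listing above accepts a variable argument list: a trailing Hash is taken as the attributes, the remaining arguments are read positionally as tag, content and parent, and the :namespaces entry is separated from the ordinary attributes. The sketch below shows only that argument handling in plain Ruby, without creating any nodes. The helper split_add_node_args is hypothetical and not part of XmlDocument.

```ruby
# Hypothetical helper, not part of XmlDocument: unpacks the arguments the
# same way add_node does and returns the pieces in a Hash.
def split_add_node_args(*args)
  # A trailing Hash holds the attributes; otherwise there are none.
  attributes = args.last.is_a?(Hash) ? args.pop : {}
  # Missing positional arguments simply become nil.
  name, value, parent = *args
  # The reserved :namespaces key is not an ordinary attribute.
  namespaces = attributes.delete(:namespaces)
  { name: name, value: value, parent: parent,
    attributes: attributes, namespaces: namespaces }
end

p split_add_node_args(:name, 'Harry Potter')
p split_add_node_args(:book, nil, nil,
                      title: 'Quidditch Through the Ages',
                      namespaces: { node_ns: 'jkr' })
```

Because content and parent are positional, both have to be given (as nil if unused) whenever only a later argument is needed, as in the :books and :book calls of the example.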

#add_processing_instruction(name, content) ⇒ Nokogiri::XML::Node

Note:

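Unlike regular nodes, these nodes are automatically added to the document. The document must already have a root node, because the instruction is inserted as the previous sibling of the root.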

Add a processing instruction to the current XML document.

Parameters:

  • name (String)

    instruction name

  • content (String)

    instruction content

Returns:

  • (Nokogiri::XML::Node)

    the processing instruction node



# File 'lib/libis/tools/xml_document.rb', line 151

def add_processing_instruction(name, content)
  processing_instruction = Nokogiri::XML::ProcessingInstruction.new(@document, name, content)
  @document.root.add_previous_sibling processing_instruction
  processing_instruction
end

#build(at_node = nil, options = {}, &block) ⇒ XmlDocument

Creates nodes using the Nokogiri build short syntax.

Example:

xml_doc.build(xml_doc.root) do |xml|
  xml.books do
    xml.book title: 'Quidditch Through the Ages', author: 'Kennilworthy Whisp', due_date: '1992-4-23'
  end
end
p xml_doc.to_xml
# =>
      <?xml version="1.0" encoding="utf-8"?>
      <patron>
          ...
          <books>
            <book title="Quidditch Through the Ages" author="Kennilworthy Whisp" due_date="1992-4-23"/>
          </books>
      </patron>

Parameters:

  • at_node (Nokogiri::XML::Node) (defaults to: nil)

    the node to attach the new nodes to; optional - if missing or nil the new nodes will replace the entire document

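  • options (Hash) (defaults to: {})

    options passed to Nokogiri::XML::Builder; the encoding defaults to ‘utf-8’

  • block (Code block)

    Build instructions

Returns:

  • (XmlDocument)

    the XmlDocument itself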



# File 'lib/libis/tools/xml_document.rb', line 215

def build(at_node = nil, options = {}, &block)
  options = {encoding: 'utf-8' }.merge options
  if at_node
      Nokogiri::XML::Builder.new(options,at_node, &block)
  else
    xml = Nokogiri::XML::Builder.new(options, &block)
    @document = xml.doc
  end
  self
end

#get_node(tag, parent = nil) ⇒ Object

Get the first node matching the tag. The node is searched with the XPath term “//<tag>”; a tag that already starts with ‘/’ is used as the XPath term unchanged.

Parameters:

  • tag (String)

    XML tag to look for; XPath syntax is allowed

  • parent (Node) (defaults to: nil)

    node to search from; the root node if nil


# File 'lib/libis/tools/xml_document.rb', line 572

def get_node(tag, parent = nil)
  get_nodes(tag, parent).first
end

#get_nodes(tag, parent = nil) ⇒ Object

Get all the nodes matching the tag. The nodes are searched with the XPath term “//<tag>”; a tag that already starts with ‘/’ is used as the XPath term unchanged.

Parameters:

  • tag (String)

    XML tag to look for; XPath syntax is allowed

  • parent (Node) (defaults to: nil)

    node to search from; the root node if nil


# File 'lib/libis/tools/xml_document.rb', line 580

def get_nodes(tag, parent = nil)
  parent ||= root
  term = "#{tag.to_s =~ /^\// ? '' : '//'}#{tag.to_s}"
  parent.xpath(term)
end
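
The search term built in the listing above can be tried out on its own. The following is a minimal sketch in plain Ruby of how a tag is turned into an XPath term. The helper xpath_term is a hypothetical name and not part of XmlDocument.

```ruby
# Hypothetical helper, not part of XmlDocument: prefixes the tag with '//'
# unless it already starts with a slash, as get_nodes does.
def xpath_term(tag)
  tag = tag.to_s
  tag.start_with?('/') ? tag : "//#{tag}"
end

puts xpath_term('email')         # //email
puts xpath_term(:barcode)        # //barcode
puts xpath_term('/patron/name')  # /patron/name
puts xpath_term('//email')       # //email
```

Note that a bare tag name is always searched with '//', which in XPath selects from the whole document rather than only below the given parent node.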

#has_element?(element_name) ⇒ Integer

Check if the XML document contains certain element(s) anywhere in the XML document.

Example:

xml_doc.has_element? 'barcode[@library="Hogwarts Library"]' # => 1

Parameters:

  • element_name (String)

    name of the element(s) to search

Returns:

  • (Integer)

    number of elements found



# File 'lib/libis/tools/xml_document.rb', line 427

def has_element?(element_name)
  list = xpath("//#{element_name}")
  list.nil? ? 0 : list.size
end
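
Because the method returns a count rather than a Boolean, and the Integer 0 is truthy in Ruby, a plain `if` test on the result succeeds even when nothing was found. The sketch below illustrates this in plain Ruby. The helper found? is hypothetical and not part of XmlDocument.

```ruby
# Hypothetical helper, not part of XmlDocument: turns a match count, such as
# the one returned by has_element?, into a real Boolean.
def found?(count)
  count.positive?
end

count = 0  # the value has_element? returns when nothing matches

puts 'plain test succeeds' if count             # printed, since 0 is truthy
puts 'explicit test succeeds' if found?(count)  # not printed
```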

#invalid? ⇒ Boolean

Check if the embedded XML document is not present or invalid.

Returns:

  • (Boolean)


# File 'lib/libis/tools/xml_document.rb', line 32

def invalid?
  @document.nil? or !document.is_a?(::Nokogiri::XML::Document) or @document.root.nil?
end

#root ⇒ {Nokogiri::XML::Node}

Get the root node of the XML Document.

Example:

puts xml_doc.root.to_xml
# =>
    <patron>
      ...
    </patron>

Returns:

  • ({Nokogiri::XML::Node})

    the root node of the XML Document

Raises:

  • (ArgumentError)


# File 'lib/libis/tools/xml_document.rb', line 168

def root
  raise ArgumentError, 'XML document not valid.' if @document.nil?
  @document.root
end

#root=(node) ⇒ {Nokogiri::XML::Node}

Set the root node of the XML Document.

Example:

patron = ::Nokogiri::XML::Node.new 'patron', xml_doc.document
xml_doc.root = patron
puts xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron/>

Parameters:

  • node ({Nokogiri::XML::Node})

    new root node

Returns:

  • ({Nokogiri::XML::Node})

    the new root node

Raises:

  • (ArgumentError)


# File 'lib/libis/tools/xml_document.rb', line 186

def root=(node)
  raise ArgumentError, 'XML document not valid.' if @document.nil?
  #noinspection RubyArgCount
  @document.root = node
end

#save(file, indent = 2, encoding = 'utf-8') ⇒ nil

Save the XML Document to a given XML file.

Parameters:

  • file (String)

    name of the file to save to

  • indent (Integer) (defaults to: 2)

    amount of space for indenting; default 2

  • encoding (String) (defaults to: 'utf-8')

    character encoding; default ‘utf-8’

Returns:

  • (nil)


# File 'lib/libis/tools/xml_document.rb', line 87

def save(file, indent = 2, encoding = 'utf-8')
  fd = File.open(file, 'w')
  @document.write_xml_to(fd, :indent => indent, :encoding => encoding)
  fd.close
end

#to_hash(options = {}) ⇒ Hash

Note:

The hash is generated using the Nori gem. The options passed to this call are used to configure Nori in the constructor call. For content and syntax see the Nori documentation. Nori also uses an enhanced String class with an extra method #attributes that will return a Hash containing tag-value pairs for each attribute of the XML element.

Export the XML Document to a Hash.

Example:

h = xml_doc.to_hash
# => { "patron" =>
        { "name" => "Harry Potter",
          "barcode" => "1234567890",
          "access_level" => "student",
          "email" => ["[email protected]", "[email protected]"]
        }
     }
h['patron']['barcode']
# => "1234567890"
h['patron']['barcode'].attributes
# => {"library" => "Hogwarts Library"}
h['patron']['barcode'].attributes['library']
# => "Hogwarts Library"

Parameters:

  • options (Hash) (defaults to: {})

Returns:

  • (Hash)

    the document content as a Hash



# File 'lib/libis/tools/xml_document.rb', line 127

def to_hash(options = {})
  Nori.new(options).parse(to_xml)
end

#to_xml(options = {}) ⇒ String

Export the XML Document to an XML string.

Parameters:

  • options (Hash) (defaults to: {})

    options passed to the underlying Nokogiri::XML::Document#to_xml; the defaults are indent: 2 and encoding: ‘utf-8’

Returns:

  • (String)

    the XML string



# File 'lib/libis/tools/xml_document.rb', line 97

def to_xml(options = {})
  options = {indent: 2, encoding: 'utf-8', save_with: Nokogiri::XML::Node::SaveOptions::DEFAULT_XML}.merge(options)
  @document.to_xml(options)
end
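
The listing above merges the caller's options over a Hash of defaults, so any key the caller supplies wins and the other defaults stay in place. The sketch below shows that merge in plain Ruby. It leaves out the save_with default because that constant requires Nokogiri, and the helper effective_options is hypothetical and not part of XmlDocument.

```ruby
# Hypothetical helper, not part of XmlDocument: the defaults come first, so
# values passed by the caller override them in the merge.
def effective_options(options = {})
  { indent: 2, encoding: 'utf-8' }.merge(options)
end

p effective_options                          # both defaults
p effective_options(indent: 4)               # indent overridden, encoding kept
p effective_options(encoding: 'iso-8859-1')  # encoding overridden, indent kept
```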

#valid? ⇒ Boolean

Check if the embedded XML document is present and valid.

Returns:

  • (Boolean)


# File 'lib/libis/tools/xml_document.rb', line 37

def valid?
  !invalid?
end

#validate(schema) ⇒ Array<{Nokogiri::XML::SyntaxError}>

Validate the document against a given XML schema file and collect the validation errors.

Parameters:

  • schema (String)

    the file path of the XML schema

Returns:

  • (Array<{Nokogiri::XML::SyntaxError}>)

    a list of validation errors



# File 'lib/libis/tools/xml_document.rb', line 141

def validate(schema)
  schema_doc = Nokogiri::XML::Schema.new(File.open(schema))
  schema_doc.validate(@document)
end

#validates_against?(schema) ⇒ Boolean

Check if the document validates against a given XML schema file.

Parameters:

  • schema (String)

    the file path of the XML schema

Returns:

  • (Boolean)


# File 'lib/libis/tools/xml_document.rb', line 134

def validates_against?(schema)
  schema_doc = Nokogiri::XML::Schema.new(File.open(schema))
  schema_doc.valid?(@document)
end

#value(path, parent = nil) ⇒ String

Return the content of the first element found.

Example:

xml_doc.value('//email') # => "[email protected]"

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

  • parent (Node) (defaults to: nil)

    parent node; document if nil

Returns:

  • (String)

    content or nil if not found



# File 'lib/libis/tools/xml_document.rb', line 441

def value(path, parent = nil)
  parent ||= document
  parent.xpath(path).first.content rescue nil
end

#values(path) ⇒ Array<String>

Return the content of all elements found. Example:

xml_doc.values('//email') # => [ "[email protected]", "[email protected]" ]

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

Returns:

  • (Array<String>)

    the content of each node found



# File 'lib/libis/tools/xml_document.rb', line 465

def values(path)
  xpath(path).map &:content
end

#xpath(path) ⇒ {Nokogiri::XML::NodeSet}

Search for nodes in the current document root.

Example:

nodes = xml_doc.xpath('//email')
nodes.size # => 2
nodes.map(&:content) # => ["[email protected]", "[email protected]"]

Parameters:

  • path (String)

    XPath search string

Returns:

  • ({Nokogiri::XML::NodeSet})

    set of nodes found

Raises:

  • (ArgumentError)


# File 'lib/libis/tools/xml_document.rb', line 414

def xpath(path)
  raise ArgumentError, 'XML document not valid.' if self.invalid?
  @document.xpath(path.to_s)
end