Class: Libis::Tools::XmlDocument

Inherits:
Object
  • Object
Defined in:
lib/libis/tools/xml_document.rb

Overview

This class wraps the most used features of Nokogiri, Nori and Gyoku in one convenience class. The Nokogiri document is stored in the instance attribute ‘document’ and can be accessed and manipulated directly if required.

In the examples we assume the following XML code:

<?xml version="1.0" encoding="utf-8"?>
<patron>
  <name>Harry Potter</name>
  <barcode library='Hogwarts Library'>1234567890</barcode>
  <access_level>student</access_level>
  <email>[email protected]</email>
  <email>[email protected]</email>
</patron>

Constructor Details

#initialize(encoding = 'utf-8') ⇒ XmlDocument

Create a new XmlDocument instance. The object will contain a new and empty Nokogiri XML Document. The object will not be valid until a root node is added.

Parameters:

  • encoding (String) (defaults to: 'utf-8')

    character encoding for the XML content; default value is ‘utf-8’



# File 'lib/libis/tools/xml_document.rb', line 44

def initialize(encoding = 'utf-8')
  @document = Nokogiri::XML::Document.new
  @document.encoding = encoding
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method

#method_missing(method, *args, &block) ⇒ Object

Node access by method name.

Nodes can be accessed through a method whose name is the tag name of the node. If no node with that tag exists yet, one is added. There are several ways to use this shorthand method:

* without arguments it simply returns the first node found
* with one argument it retrieves the node's attribute
* with one argument and '=' sign it sets the content of the node
* with two arguments it sets the value of the node's attribute
* with a code block it implements the build pattern
* with a '!' suffix it always adds a new node, even if a node with that tag already exists

Examples:

     xml_doc.email.content
     # => "[email protected]"
     p xml_doc.barcode 'library'
     # => "Hogwarts Library"
     xml_doc.access_level = 'postgraduate'
     xml_doc.barcode 'library', 'Hogwarts Dumbledore Library'
     xml_doc.dates do |dates|
       dates.birth_date '1980-07-31'
       dates.member_since '1991-09-01'
     end
     p xml_doc.to_xml
     # =>  <patron>
               ...
               <barcode library='Hogwarts Dumbledore Library'>1234567890</barcode>
               <access_level>postgraduate</access_level>
               ...
               <dates>
                 <birth_date>1980-07-31</birth_date>
                 <member_since>1991-09-01</member_since>
               </dates>
           </patron>


# File 'lib/libis/tools/xml_document.rb', line 536

def method_missing(method, *args, &block)
  super unless method.to_s =~ /^([a-z_][a-z_0-9]*)(!|=)?$/i
  node = get_node($1)
  node = add_node($1) if node.nil? || $2 == '!'
  case args.size
    when 0
      if block_given?
        build(node, &block)
      end
    when 1
      if $2.blank?
        return node[args.first.to_s]
      else
        node.content = args.first.to_s
      end
    when 2
      node[args.first.to_s] = args[1].to_s
      return node[args.first.to_s]
    else
      raise ArgumentError, 'Too many arguments.'
  end
  node
end
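
The name matching at the top of method_missing can be tried in isolation. The following is a minimal sketch in plain Ruby, not the library's code: split_method_name is a hypothetical helper that applies the same regular expression and reports the tag name and the optional '!' or '=' suffix.

```ruby
# Hypothetical helper (not part of Libis::Tools::XmlDocument). It applies the
# same pattern that method_missing uses to decide whether a call is a node
# access and which suffix, if any, was given.
def split_method_name(method)
  match = /^([a-z_][a-z_0-9]*)(!|=)?$/i.match(method.to_s)
  return nil unless match # method_missing would call super in this case

  { tag: match[1], suffix: match[2] }
end

plain  = split_method_name(:email)          # tag "email", no suffix
setter = split_method_name(:access_level=)  # tag "access_level", suffix "="
forced = split_method_name(:dates!)         # tag "dates", suffix "!"
other  = split_method_name(:'dc:title')     # nil: a colon is not matched

puts plain[:tag], setter[:suffix], forced[:suffix], other.inspect
```

Because the pattern has no provision for ':' or '-', tags with a namespace prefix or a hyphen cannot be reached through this shorthand; #xpath and #get_node accept XPath terms and remain available for those.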

Instance Attribute Details

#document ⇒ Object

Returns the value of attribute document.



# File 'lib/libis/tools/xml_document.rb', line 27

def document
  @document
end

Class Method Details

.add_attributes(node, **attributes) ⇒ Nokogiri::XML::Node

Note:

The Nokogiri method Node#[]= is probably easier to use if you only want to add a single attribute; the main purpose of this method is to make it easier to add attributes in bulk or if you already have them available as a Hash.

Add attributes to a node. Example:

xml_doc.add_attributes xml_doc.root, status: 'active', id: '123456'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron id="123456" status="active">
        ...
    </patron>

Parameters:

  • node (Nokogiri::XML::Node)

    node to add the attributes to

  • attributes (Hash)

    a Hash with name-value pairs for each attribute

Returns:

  • (Nokogiri::XML::Node)

    the node



# File 'lib/libis/tools/xml_document.rb', line 338

def self.add_attributes(node, **attributes)

  attributes.each do |name, value|
    node.set_attribute name.to_s, value
  end

  node

end
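
The note above mentions attributes that are already available as a Hash. The sketch below shows how such a Hash can be passed with the double splat. It is only an illustration under stated assumptions: StubNode is a hypothetical stand-in for a Nokogiri node so the example runs without Nokogiri, and add_attributes here is a local copy of the loop in the listing, not the library method itself.

```ruby
# Hypothetical stand-in for Nokogiri::XML::Node; it only records attributes.
class StubNode
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  def set_attribute(name, value)
    @attributes[name] = value
  end
end

# Local copy of the loop shown in the listing above.
def add_attributes(node, **attributes)
  attributes.each { |name, value| node.set_attribute(name.to_s, value) }
  node
end

# A Hash that already exists is expanded into keyword arguments with **.
existing = { status: 'active', id: '123456' }
node = add_attributes(StubNode.new, **existing)

puts node.attributes['status'] # prints "active"
puts node.attributes['id']     # prints "123456"
```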

.add_namespaces(node, **namespaces) ⇒ Object

Add namespace information to a node

Example:

xml_doc.add_namespaces xml_doc.root, jkr: 'http://JKRowling.com', node_ns: 'jkr'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <jkr:patron xmlns:jkr="http://JKRowling.com">
        ...
    </jkr:patron>

xml_doc.add_namespaces xml_doc.root, nil => 'http://JKRowling.com'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron xmlns="http://JKRowling.com">
        ...
    </patron>

Parameters:

  • node (Nokogiri::XML::Node)

    the node to add the namespace information to

  • namespaces (Hash)

    a Hash with prefix-URI pairs for each namespace definition that should be added. The special key :node_ns is reserved for specifying the prefix for the node itself. To set the default namespace, use the prefix nil.



# File 'lib/libis/tools/xml_document.rb', line 377

def self.add_namespaces(node, **namespaces)

  node_ns = namespaces.delete :node_ns
  default_ns = namespaces.delete nil

  namespaces.each do |prefix, prefix_uri|
    node.add_namespace prefix.to_s, prefix_uri
  end

  node.namespace_scopes.each do |ns|
    node.namespace = ns if ns.prefix == node_ns.to_s
  end if node_ns

  node.default_namespace = default_ns if default_ns

  node

end
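
The way the two special keys are separated from the ordinary prefix definitions can be sketched without Nokogiri. split_namespace_options is a hypothetical helper that mirrors the first lines of the listing above; unlike the listing it works on a copy, so the caller's Hash is left untouched.

```ruby
# Hypothetical helper: separates the reserved keys (:node_ns and nil) from
# the ordinary prefix => URI pairs, as the listing above does with delete.
def split_namespace_options(namespaces)
  remaining = namespaces.dup
  node_ns = remaining.delete(:node_ns) # prefix to apply to the node itself
  default_ns = remaining.delete(nil)   # URI of the default namespace
  [node_ns, default_ns, remaining]
end

node_ns, default_ns, prefixed = split_namespace_options(
  { jkr: 'http://JKRowling.com', node_ns: 'jkr', nil => 'http://JKRowling.com' }
)

puts node_ns       # prints "jkr"
puts default_ns    # prints "http://JKRowling.com"
puts prefixed.size # prints 1: only the jkr definition remains
```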

.build(&block) ⇒ XmlDocument

Creates a new XML document with contents supplied in Nokogiri build short syntax.

Example:

xml_doc = ::Libis::Tools::XmlDocument.build do
  patron {
    name 'Harry Potter'
    barcode( '1234567890', library: 'Hogwarts Library')
    access_level 'student'
    email '[email protected]'
    email '[email protected]'
    books {
      book title: 'Quidditch Through the Ages', author: 'Kennilworthy Whisp', due_date: '1992-4-23'
    }
  }
end
p xml_doc.to_xml
# =>
      <?xml version="1.0" encoding="utf-8"?>
      <patron>
        <name>Harry Potter</name>
        <barcode library="Hogwarts Library">1234567890</barcode>
        <access_level>student</access_level>
        <email>[email protected]</email>
        <email>[email protected]</email>
        <books>
          <book title="Quidditch Through the Ages" author="Kennilworthy Whisp" due_date="1992-4-23"/>
        </books>
      </patron>

Parameters:

  • block (Code block)

    Build instructions

Returns:

  • (XmlDocument)

# File 'lib/libis/tools/xml_document.rb', line 255

def self.build(&block)
  self.new.build(nil, &block)
end

.from_hash(hash, options = {}) ⇒ XmlDocument

Note:

The Hash will be converted with Gyoku. See the Gyoku documentation for the Hash format requirements.

Create a new instance initialized with a Hash.

Parameters:

  • hash (Hash)

    the content

  • options (Hash) (defaults to: {})

    options passed to Gyoku upon parsing the Hash into XML

Returns:

  • (XmlDocument)

# File 'lib/libis/tools/xml_document.rb', line 72

def self.from_hash(hash, options = {})
  doc = XmlDocument.new
  doc.document = Nokogiri::XML(Gyoku.xml(hash, **options))
  doc.document.encoding = 'utf-8'
  doc
end

.get_content(nodelist) ⇒ String

Return the content of the first element in the set of nodes.

Example:

::Libis::Tools::XmlDocument.get_content(xml_doc.xpath('//email')) # => "[email protected]"

Parameters:

  • nodelist ({Nokogiri::XML::NodeSet})

    set of nodes to get content from

Returns:

  • (String)

    content of the first node; always returns at least an empty string



# File 'lib/libis/tools/xml_document.rb', line 469

def self.get_content(nodelist)
  (nodelist.first && nodelist.first.content) || ''
end
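
The guarantee that a String is always returned comes from the `|| ''` fallback. A minimal sketch of that expression, using a hypothetical TextNode struct and plain arrays in place of a Nokogiri node set:

```ruby
# Hypothetical stand-in for a node; only #content is needed here.
TextNode = Struct.new(:content)

# Same expression as in the listing above, applied to any list-like object.
def first_content(nodelist)
  (nodelist.first && nodelist.first.content) || ''
end

puts first_content([TextNode.new('first'), TextNode.new('second')]) # prints "first"
puts first_content([]).inspect                                      # prints ""
```

An empty set therefore yields an empty String rather than nil, which differs from #[] and #value.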

.open(file) ⇒ XmlDocument

Create a new instance initialized with the content of an XML file.

Parameters:

  • file (String)

    path to the XML file

Returns:

  • (XmlDocument)

# File 'lib/libis/tools/xml_document.rb', line 52

def self.open(file)
  doc = XmlDocument.new
  doc.document = Nokogiri::XML(File.open(file))
  doc
end

.parse(xml) ⇒ XmlDocument

Create a new instance initialized with an XML String.

Parameters:

  • xml (String)

    the XML content

Returns:

  • (XmlDocument)

# File 'lib/libis/tools/xml_document.rb', line 61

def self.parse(xml)
  doc = XmlDocument.new
  doc.document = Nokogiri::XML.parse(xml)
  doc
end

Instance Method Details

#[](path) ⇒ String

Return the content of the first element found.

Example:

xml_doc['email'] # => "[email protected]"

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

Returns:

  • (String)

    content or nil if not found



# File 'lib/libis/tools/xml_document.rb', line 446

def [](path)
  xpath(path).first.content rescue nil
end
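
The modifier `rescue nil` in the listing above catches every StandardError, not only the case where no node is found. The sketch below reproduces just that control flow; LookupSketch and ContentNode are hypothetical stand-ins so it runs without Nokogiri.

```ruby
# Hypothetical stand-ins; only the control flow of #[] is reproduced.
ContentNode = Struct.new(:content)

class LookupSketch
  def initialize(nodes_by_path)
    @nodes_by_path = nodes_by_path # nil plays the role of an invalid document
  end

  def xpath(path)
    raise ArgumentError, 'XML document not valid.' if @nodes_by_path.nil?

    @nodes_by_path.fetch(path, [])
  end

  def [](path)
    xpath(path).first.content rescue nil
  end
end

doc = LookupSketch.new('//email' => [ContentNode.new('first'), ContentNode.new('second')])
puts doc['//email']                           # prints "first"
puts doc['//missing'].inspect                 # prints nil: NoMethodError on nil is rescued
puts LookupSketch.new(nil)['//email'].inspect # prints nil: the ArgumentError is rescued too
```

A nil result therefore does not distinguish between a missing element and an invalid document; calling #xpath directly surfaces the ArgumentError.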

#[]=(path, value) ⇒ String

Find a node and set its content.

Example:

xml_doc['//access_level'] = 'postgraduate'
p xml_doc.to_xml
# =>
      <?xml version="1.0" encoding="utf-8"?>
      <patron>
        ...
        <access_level>postgraduate</access_level>
        ...
      </patron>

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

  • value (String)

    the content

Returns:

  • (String)

# File 'lib/libis/tools/xml_document.rb', line 490

def []=(path, value)
  begin
    nodes = xpath(path)
    nodes.first.content = value
  rescue
    # ignored
  end
end

#add_attributes(node, **attributes) ⇒ Nokogiri::XML::Node

Note:

The Nokogiri method Node#[]= is probably easier to use if you only want to add a single attribute; the main purpose of this method is to make it easier to add attributes in bulk or if you already have them available as a Hash.

Add attributes to a node. Example:

xml_doc.add_attributes xml_doc.root, status: 'active', id: '123456'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron id="123456" status="active">
        ...
    </patron>

Parameters:

  • node (Nokogiri::XML::Node)

    node to add the attributes to

  • attributes (Hash)

    a Hash with name-value pairs for each attribute

Returns:

  • (Nokogiri::XML::Node)

    the node



# File 'lib/libis/tools/xml_document.rb', line 333

def add_attributes(node, **attributes)
  XmlDocument.add_attributes node, **attributes
end

#add_namespaces(node, **namespaces) ⇒ Object

Add namespace information to a node

Example:

xml_doc.add_namespaces xml_doc.root, jkr: 'http://JKRowling.com', node_ns: 'jkr'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <jkr:patron xmlns:jkr="http://JKRowling.com">
        ...
    </jkr:patron>

xml_doc.add_namespaces xml_doc.root, nil => 'http://JKRowling.com'
xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron xmlns="http://JKRowling.com">
        ...
    </patron>

Parameters:

  • node (Nokogiri::XML::Node)

    the node to add the namespace information to

  • namespaces (Hash)

    a Hash with prefix-URI pairs for each namespace definition that should be added. The special key :node_ns is reserved for specifying the prefix for the node itself. To set the default namespace, use the prefix nil.



# File 'lib/libis/tools/xml_document.rb', line 372

def add_namespaces(node, **namespaces)
  XmlDocument.add_namespaces node, **namespaces
end

#add_node(*args, **attributes) ⇒ Nokogiri::XML::Node

Adds a new XML node to the document.

Example:

xml_doc = ::Libis::Tools::XmlDocument.new
xml_doc.valid? # => false
xml_doc.add_node :patron
xml_doc.add_node :name, 'Harry Potter'
books = xml_doc.add_node :books, nil, nil, namespaces: { jkr: 'http://JKRowling.com', node_ns: 'jkr' }
xml_doc.add_node :book, nil, books,
    title: 'Quidditch Through the Ages', author: 'Kennilworthy Whisp', due_date: '1992-4-23',
    namespaces: { node_ns: 'jkr' }
p xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron>
        <name>Harry Potter</name>
        <jkr:books xmlns:jkr="http://JKRowling.com">
          <jkr:book author="Kennilworthy Whisp" due_date="1992-4-23" title="Quidditch Through the Ages"/>
        </jkr:books>
    </patron>

Parameters:

  • args (Array)

    positional arguments, in order:

    • tag for the new node

    • optional content for the new node; empty if nil or not present

    • optional parent node for the new node; the root node if nil or not present; the XML document itself if no root is defined yet

  • attributes (Hash)

    name-value pairs for each attribute; the special key ‘:namespaces’ contains a Hash of namespace definitions as in #add_namespaces

Returns:

  • (Nokogiri::XML::Node)

    the new node



# File 'lib/libis/tools/xml_document.rb', line 288

def add_node(*args, **attributes)
  name, value, parent = *args

  return nil if name.nil?

  node = Nokogiri::XML::Node.new name.to_s, @document
  node.content = value

  if !parent.nil?
    parent << node
  elsif !self.root.nil?
    self.root << node
  else
    self.root = node
  end

  return node if attributes.empty?

  namespaces = attributes.delete :namespaces
  add_namespaces(node, **namespaces) if namespaces

  add_attributes(node, **attributes) if attributes

  node

end
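
How the positional and keyword arguments of add_node are taken apart can be sketched in plain Ruby. unpack_add_node_args is a hypothetical helper that repeats only the destructuring steps of the listing above; the symbol :books_node stands in for a real parent node.

```ruby
# Hypothetical helper: repeats the argument handling of add_node without
# creating any nodes.
def unpack_add_node_args(*args, **attributes)
  name, value, parent = *args # missing positions become nil
  namespaces = attributes.delete(:namespaces)
  { name: name, value: value, parent: parent,
    namespaces: namespaces, attributes: attributes }
end

call = unpack_add_node_args(:book, nil, :books_node,
                            title: 'Quidditch Through the Ages',
                            namespaces: { node_ns: 'jkr' })

puts call[:name]               # prints "book"
puts call[:parent]             # prints "books_node"
puts call[:namespaces].inspect # the Hash given under :namespaces
puts call[:attributes].inspect # only the title remains as a plain attribute
```

Since content and parent are positional, a parent can only be given after an explicit content argument, which is why the example above passes nil for the content.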

#add_processing_instruction(name, content) ⇒ Nokogiri::XML::Node

Note:

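Unlike regular nodes, processing instructions are added to the document automatically, immediately before the root node. The document therefore needs a root node before this method is called.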

Add a processing instruction to the current XML document.

Parameters:

  • name (String)

    instruction name

  • content (String)

    instruction content

Returns:

  • (Nokogiri::XML::Node)

    the processing instruction node



# File 'lib/libis/tools/xml_document.rb', line 148

def add_processing_instruction(name, content)
  processing_instruction = Nokogiri::XML::ProcessingInstruction.new(@document, name, content)
  @document.root.add_previous_sibling processing_instruction
  processing_instruction
end

#build(at_node = nil, options = {}, &block) ⇒ XmlDocument

Creates nodes using the Nokogiri build short syntax.

Example:

xml_doc.build(xml_doc.root) do |xml|
  xml.books do
    xml.book title: 'Quidditch Through the Ages', author: 'Kennilworthy Whisp', due_date: '1992-4-23'
  end
end
p xml_doc.to_xml
# =>
      <?xml version="1.0" encoding="utf-8"?>
      <patron>
          ...
          <books>
            <book title="Quidditch Through the Ages" author="Kennilworthy Whisp" due_date="1992-4-23"/>
          </books>
      </patron>

Parameters:

  • at_node (Nokogiri::XML::Node) (defaults to: nil)

    the node to attach the new nodes to; optional. If missing or nil, the new nodes replace the entire document

  • options (Hash) (defaults to: {})

    options passed to Nokogiri::XML::Builder; the encoding defaults to ‘utf-8’

  • block (Code block)

    Build instructions

Returns:

  • (XmlDocument)

    the XmlDocument itself

# File 'lib/libis/tools/xml_document.rb', line 212

def build(at_node = nil, options = {}, &block)
  options = { encoding: 'utf-8' }.merge options
  if at_node
    Nokogiri::XML::Builder.new(options, at_node, &block)
  else
    xml = Nokogiri::XML::Builder.new(options, &block)
    @document = xml.doc
  end
  self
end

#get_node(tag, parent = nil) ⇒ Object

Get the first node matching the tag. The node is searched with the XPath search term “//#{tag}”, unless the tag already starts with ‘/’.

Parameters:

  • tag (String)

    XML tag to look for; XPath syntax is allowed

  • parent (Node) (defaults to: nil)

    node to search from; the root node if nil

# File 'lib/libis/tools/xml_document.rb', line 564

def get_node(tag, parent = nil)
  get_nodes(tag, parent).first
end

#get_nodes(tag, parent = nil) ⇒ Object

Get all the nodes matching the tag. The nodes are searched with the XPath search term “//#{tag}”, unless the tag already starts with ‘/’.

Parameters:

  • tag (String)

    XML tag to look for; XPath syntax is allowed

  • parent (Node) (defaults to: nil)

    node to search from; the root node if nil

# File 'lib/libis/tools/xml_document.rb', line 572

def get_nodes(tag, parent = nil)
  parent ||= root
  term = "#{tag.to_s =~ /^\// ? '' : '//'}#{tag.to_s}"
  parent.xpath(term)
end
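
The search term construction can be checked on its own. xpath_term is a hypothetical helper that builds the term the same way as the listing above, using start_with? instead of the regular expression:

```ruby
# Hypothetical helper: prefixes '//' unless the tag already starts with '/'.
def xpath_term(tag)
  prefix = tag.to_s.start_with?('/') ? '' : '//'
  "#{prefix}#{tag}"
end

puts xpath_term(:email)          # prints "//email"
puts xpath_term('/patron/name')  # prints "/patron/name"
puts xpath_term('//email')       # prints "//email"
puts xpath_term('barcode[@library="Hogwarts Library"]')
# prints //barcode[@library="Hogwarts Library"]
```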

#has_element?(element_name) ⇒ Integer

Check if the XML document contains certain element(s) anywhere in the XML document.

Example:

xml_doc.has_element? 'barcode[@library="Hogwarts Library"]' # => 1

Parameters:

  • element_name (String)

    name of the element(s) to search

Returns:

  • (Integer)

    number of elements found



# File 'lib/libis/tools/xml_document.rb', line 419

def has_element?(element_name)
  list = xpath("//#{element_name}")
  list.nil? ? 0 : list.size
end
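
Since the method returns a count rather than a Boolean, a result of 0 is still truthy in Ruby. The sketch below illustrates the pitfall with a hypothetical element_count helper that repeats the last line of the listing on a plain array:

```ruby
# Hypothetical helper: same expression as the last line of has_element?.
def element_count(list)
  list.nil? ? 0 : list.size
end

count = element_count([])           # an empty result set gives 0
as_condition = count ? true : false # true, because 0 is truthy in Ruby
found = count.positive?             # false, the explicit comparison

puts count        # prints 0
puts as_condition # prints true
puts found        # prints false
```

When the result is used in a condition, comparing it explicitly (for example with positive?) avoids treating an absent element as present.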

#invalid? ⇒ Boolean

Check if the embedded XML document is not present or invalid.

Returns:

  • (Boolean)


# File 'lib/libis/tools/xml_document.rb', line 30

def invalid?
  @document.nil? or !document.is_a?(::Nokogiri::XML::Document) or @document.root.nil?
end

#root ⇒ {Nokogiri::XML::Node}

Get the root node of the XML Document.

Example:

puts xml_doc.root.to_xml
# =>
    <patron>
      ...
    </patron>

Returns:

  • ({Nokogiri::XML::Node})

    the root node of the XML Document

Raises:

  • (ArgumentError)


# File 'lib/libis/tools/xml_document.rb', line 165

def root
  raise ArgumentError, 'XML document not valid.' if @document.nil?
  @document.root
end

#root=(node) ⇒ {Nokogiri::XML::Node}

Set the root node of the XML Document.

Example:

patron = ::Nokogiri::XML::Node.new 'patron', xml_doc.document
xml_doc.root = patron
puts xml_doc.to_xml
# =>
    <?xml version="1.0" encoding="utf-8"?>
    <patron/>

Parameters:

  • node ({Nokogiri::XML::Node})

    new root node

Returns:

  • ({Nokogiri::XML::Node})

    the new root node

Raises:

  • (ArgumentError)


# File 'lib/libis/tools/xml_document.rb', line 183

def root=(node)
  raise ArgumentError, 'XML document not valid.' if @document.nil?
  #noinspection RubyArgCount
  @document.root = node
end

#save(file, indent = 2, encoding = 'utf-8') ⇒ Object

Save the XML Document to a given XML file.

Parameters:

  • file (String)

    name of the file to save to

  • indent (Integer) (defaults to: 2)

    amount of space for indenting; default 2

  • encoding (String) (defaults to: 'utf-8')

    character encoding; default ‘utf-8’



# File 'lib/libis/tools/xml_document.rb', line 83

def save(file, indent = 2, encoding = 'utf-8')
  fd = File.open(file, 'w')
  @document.write_xml_to(fd, :indent => indent, :encoding => encoding)
  fd.close
end
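
The listing above opens and closes the file handle explicitly. As an alternative sketch, not the library's code, the block form of File.open closes the handle even when writing raises an error. write_text is a hypothetical helper that writes a plain String, so the example runs without Nokogiri:

```ruby
require 'tmpdir'

# Hypothetical helper: the block form of File.open closes the handle when
# the block exits, whether normally or through an exception.
def write_text(file, text)
  File.open(file, 'w') { |fd| fd.write(text) }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'patron.xml')
  write_text(path, "<patron/>\n")
  puts File.read(path) # prints "<patron/>"
end
```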

#to_hash(options = {}) ⇒ Hash

Note:

The hash is generated using the Nori gem. The options passed to this call are used to configure Nori in the constructor call. For content and syntax see the Nori documentation. Nori also uses an enhanced String class with an extra method #attributes that will return a Hash containing tag-value pairs for each attribute of the XML element.

Export the XML Document to a Hash.

Example:

h = xml_doc.to_hash
# => { "patron" =>
        { "name" => "Harry Potter",
          "barcode" => "1234567890",
          "access_level" => "student",
          "email" => ["[email protected]", "[email protected]"]
        }
      }
h['patron']['barcode']
# => "1234567890"
h['patron']['barcode'].attributes
# => {"library" => "Hogwarts Library"}
h['patron']['barcode'].attributes['library']
# => "Hogwarts Library"

Parameters:

  • options (Hash) (defaults to: {})

    options passed to the Nori constructor

Returns:

  • (Hash)

# File 'lib/libis/tools/xml_document.rb', line 124

def to_hash(options = {})
  Nori.new(**options).parse(to_xml)
end

#to_xml(options = {}) ⇒ String

Export the XML Document to an XML string.

Parameters:

  • options (Hash) (defaults to: {})

    options passed to the underlying Nokogiri::XML::Document#to_xml; the defaults are indent: 2, encoding: ‘utf-8’ and save_with: Nokogiri::XML::Node::SaveOptions::DEFAULT_XML

Returns:

  • (String)

# File 'lib/libis/tools/xml_document.rb', line 93

def to_xml(options = {})
  options = {indent: 2, encoding: 'utf-8', save_with: Nokogiri::XML::Node::SaveOptions::DEFAULT_XML}.merge(options)
  @document.to_xml(**options)
end
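
The defaults are applied with Hash#merge, so any key given by the caller wins over the default with the same key. A minimal sketch of that pattern: effective_options is a hypothetical helper, and the save_with default is left out here because it needs a Nokogiri constant.

```ruby
# Hypothetical helper: defaults first, caller-supplied options override them.
def effective_options(options = {})
  { indent: 2, encoding: 'utf-8' }.merge(options)
end

defaults = effective_options
custom = effective_options(indent: 4)

puts defaults[:indent]  # prints 2
puts custom[:indent]    # prints 4
puts custom[:encoding]  # prints "utf-8"
```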

#valid? ⇒ Boolean

Check if the embedded XML document is present and valid

Returns:

  • (Boolean)


# File 'lib/libis/tools/xml_document.rb', line 35

def valid?
  !invalid?
end

#validate(schema) ⇒ Array<{Nokogiri::XML::SyntaxError}>

Validate the document against a given XML schema file and return the validation errors.

Returns:

  • (Array<{Nokogiri::XML::SyntaxError}>)

    a list of validation errors



# File 'lib/libis/tools/xml_document.rb', line 138

def validate(schema)
  schema_doc = Nokogiri::XML::Schema.new(File.open(schema))
  schema_doc.validate(@document)
end

#validates_against?(schema) ⇒ Boolean

Check if the document validates against a given XML schema file.

Parameters:

  • schema (String)

    the file path of the XML schema

Returns:

  • (Boolean)


# File 'lib/libis/tools/xml_document.rb', line 131

def validates_against?(schema)
  schema_doc = Nokogiri::XML::Schema.new(File.open(schema))
  schema_doc.valid?(@document)
end

#value(path, parent = nil) ⇒ String

Return the content of the first element found.

Example:

xml_doc.value('//email') # => "[email protected]"

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

  • parent (Node) (defaults to: nil)

    parent node; document if nil

Returns:

  • (String)

    content or nil if not found



# File 'lib/libis/tools/xml_document.rb', line 433

def value(path, parent = nil)
  parent ||= document
  parent.xpath(path).first.content rescue nil
end

#values(path) ⇒ Array<String>

Return the content of all elements found. Example:

xml_doc.values('//email') # => [ "[email protected]", "[email protected]" ]

Parameters:

  • path (String)

    the name or XPath term to search the node(s)

Returns:

  • (Array<String>)


def values(path)
  xpath(path).map &:content
end

#xpath(path) ⇒ {Nokogiri::XML::NodeSet}

Search for nodes in the current document root.

Example:

nodes = xml_doc.xpath('//email')
nodes.size # => 2
nodes.map(&:content) # => ["[email protected]", "[email protected]"]

Parameters:

  • path (String)

    XPath search string

Returns:

  • ({Nokogiri::XML::NodeSet})

    set of nodes found

Raises:

  • (ArgumentError)


# File 'lib/libis/tools/xml_document.rb', line 406

def xpath(path)
  raise ArgumentError, 'XML document not valid.' if self.invalid?
  @document.xpath(path.to_s)
end