Class: Nokogiri::XML::Document

Inherits:
Node
  • Object
Defined in:
lib/nokogiri/xml/document.rb,
ext/nokogiri/xml_document.c

Overview

Nokogiri::XML::Document is the main entry point for dealing with XML documents. The Document is created by parsing an XML document. See Nokogiri::XML::Document.parse() for more information on parsing.

For searching a Document, see Nokogiri::XML::Searchable#css and Nokogiri::XML::Searchable#xpath

Direct Known Subclasses

HTML4::Document, HTML::Document

Constant Summary

NCNAME_START_CHAR =
"A-Za-z_"

See www.w3.org/TR/REC-xml-names/#ns-decl for more details. Note that we’re not attempting to handle Unicode characters, partly because libxml2 doesn’t handle Unicode characters in NCNAMEs.

NCNAME_CHAR =
NCNAME_START_CHAR + "\\-\\.0-9"
NCNAME_RE =
/^xmlns(?::([#{NCNAME_START_CHAR}][#{NCNAME_CHAR}]*))?$/

Constants inherited from Node

Node::ATTRIBUTE_DECL, Node::ATTRIBUTE_NODE, Node::CDATA_SECTION_NODE, Node::COMMENT_NODE, Node::DOCB_DOCUMENT_NODE, Node::DOCUMENT_FRAG_NODE, Node::DOCUMENT_NODE, Node::DOCUMENT_TYPE_NODE, Node::DTD_NODE, Node::ELEMENT_DECL, Node::ELEMENT_NODE, Node::ENTITY_DECL, Node::ENTITY_NODE, Node::ENTITY_REF_NODE, Node::HTML_DOCUMENT_NODE, Node::NAMESPACE_DECL, Node::NOTATION_NODE, Node::PI_NODE, Node::TEXT_NODE, Node::XINCLUDE_END, Node::XINCLUDE_START

Constants included from Searchable

Searchable::LOOKS_LIKE_XPATH


Methods inherited from Node

#<=>, #==, #>, #[], #[]=, #accept, #add_class, #add_namespace_definition, #add_next_sibling, #add_previous_sibling, #after, #ancestors, #append_class, #attribute, #attribute_nodes, #attribute_with_ns, #attributes, #before, #blank?, #cdata?, #child, #children, #children=, #classes, #comment?, #content, #content=, #create_external_subset, #create_internal_subset, #css_path, #decorate!, #default_namespace=, #description, #do_xinclude, #document?, #each, #element?, #element_children, #encode_special_chars, #external_subset, #first_element_child, #fragment?, #html?, #inner_html, #inner_html=, #internal_subset, #key?, #keys, #kwattr_add, #kwattr_append, #kwattr_remove, #kwattr_values, #lang, #lang=, #last_element_child, #line, #line=, #matches?, #namespace, #namespace=, #namespace_definitions, #namespace_scopes, #namespaced_key?, #native_content=, #next_element, #next_sibling, #node_name, #node_name=, #node_type, #parent, #parent=, #parse, #path, #pointer_id, #prepend_child, #previous_element, #previous_sibling, #processing_instruction?, #read_only?, #remove_attribute, #remove_class, #replace, #serialize, #swap, #text?, #to_html, #to_s, #to_xhtml, #traverse, #unlink, #value?, #values, #wrap, #write_html_to, #write_to, #write_xhtml_to, #write_xml_to, #xml?

Methods included from Searchable

#at, #at_css, #at_xpath, #css, #search, #xpath

Methods included from PP::Node

#inspect, #pretty_print

Methods included from HTML5::Node

#inner_html, #write_to

Constructor Details

#initialize(*args) ⇒ Document

:nodoc:



# File 'lib/nokogiri/xml/document.rb', line 161

def initialize *args # :nodoc:
  @errors     = []
  @decorators = nil
  @namespace_inheritance = false
end

Instance Attribute Details

#errors ⇒ Object

A list of Nokogiri::XML::SyntaxError found when parsing a document



# File 'lib/nokogiri/xml/document.rb', line 114

def errors
  @errors
end

#namespace_inheritance ⇒ Boolean

When true, reparented elements without a namespace will inherit their new parent’s namespace (if one exists). Defaults to false.

Examples:

Default behavior of namespace inheritance

xml = <<~EOF
        <root xmlns:foo="http://nokogiri.org/default_ns/test/foo">
          <foo:parent>
          </foo:parent>
        </root>
      EOF
doc = Nokogiri::XML(xml)
parent = doc.at_xpath("//foo:parent", "foo" => "http://nokogiri.org/default_ns/test/foo")
parent.add_child("<child></child>")
doc.to_xml
# => <?xml version="1.0"?>
#    <root xmlns:foo="http://nokogiri.org/default_ns/test/foo">
#      <foo:parent>
#        <child/>
#      </foo:parent>
#    </root>

Setting namespace inheritance to true

xml = <<~EOF
        <root xmlns:foo="http://nokogiri.org/default_ns/test/foo">
          <foo:parent>
          </foo:parent>
        </root>
      EOF
doc = Nokogiri::XML(xml)
doc.namespace_inheritance = true
parent = doc.at_xpath("//foo:parent", "foo" => "http://nokogiri.org/default_ns/test/foo")
parent.add_child("<child></child>")
doc.to_xml
# => <?xml version="1.0"?>
#    <root xmlns:foo="http://nokogiri.org/default_ns/test/foo">
#      <foo:parent>
#        <foo:child/>
#      </foo:parent>
#    </root>

Returns:

  • (Boolean)

Since:

  • v1.12.4



# File 'lib/nokogiri/xml/document.rb', line 159

def namespace_inheritance
  @namespace_inheritance
end

Class Method Details

.new(version = default) ⇒ Object

Create a new document with version (defaults to “1.0”)



# File 'ext/nokogiri/xml_document.c', line 390

static VALUE
new (int argc, VALUE *argv, VALUE klass)
{
  xmlDocPtr doc;
  VALUE version, rest, rb_doc ;

  rb_scan_args(argc, argv, "0*", &rest);
  version = rb_ary_entry(rest, (long)0);
  if (NIL_P(version)) { version = rb_str_new2("1.0"); }

  doc = xmlNewDoc((xmlChar *)StringValueCStr(version));
  rb_doc = noko_xml_document_wrap_with_init_args(klass, doc, argc, argv);
  return rb_doc ;
}

.parse(string_or_io, url = nil, encoding = nil, options = ParseOptions::DEFAULT_XML) {|options| ... } ⇒ Object

Parse an XML document from a String or an IO.

string_or_io may be a String, or any object that responds to read and close, such as an IO or StringIO.

url (optional) is the URI where this document is located.

encoding (optional) is the encoding that should be used when processing the document.

options (optional) is a configuration object that sets options during parsing, such as Nokogiri::XML::ParseOptions::RECOVER. See Nokogiri::XML::ParseOptions for more information.

block (optional) is passed a configuration object on which parse options may be set.

By default, Nokogiri treats documents as untrusted, and so does not attempt to load DTDs or access the network. See Nokogiri::XML::ParseOptions for a complete list of options, and that class’s DEFAULT_XML constant for what’s set (and not set) by default.

Nokogiri.XML() is a convenience method which will call this method.

Yields:

  • (options)


# File 'lib/nokogiri/xml/document.rb', line 50

def self.parse string_or_io, url = nil, encoding = nil, options = ParseOptions::DEFAULT_XML
  options = Nokogiri::XML::ParseOptions.new(options) if Integer === options

  yield options if block_given?

  url ||= string_or_io.respond_to?(:path) ? string_or_io.path : nil

  if empty_doc?(string_or_io)
    if options.strict?
      raise Nokogiri::XML::SyntaxError.new("Empty document")
    else
      return encoding ? new.tap { |i| i.encoding = encoding } : new
    end
  end

  doc = if string_or_io.respond_to?(:read)
          if string_or_io.is_a?(Pathname)
            # resolve the Pathname to the file and open it as an IO object, see #2110
            string_or_io = string_or_io.expand_path.open
            url ||= string_or_io.path
          end

          read_io(string_or_io, url, encoding, options.to_i)
        else
          # read_memory pukes on empty docs
          read_memory(string_or_io, url, encoding, options.to_i)
        end

  # do xinclude processing
  doc.do_xinclude(options) if options.xinclude?

  return doc
end

.read_io(io, url, encoding, options) ⇒ Object

Create a new document from an IO object



# File 'ext/nokogiri/xml_document.c', line 262

static VALUE
read_io(VALUE klass,
        VALUE io,
        VALUE url,
        VALUE encoding,
        VALUE options)
{
  const char *c_url    = NIL_P(url)      ? NULL : StringValueCStr(url);
  const char *c_enc    = NIL_P(encoding) ? NULL : StringValueCStr(encoding);
  VALUE error_list      = rb_ary_new();
  VALUE document;
  xmlDocPtr doc;

  xmlResetLastError();
  xmlSetStructuredErrorFunc((void *)error_list, Nokogiri_error_array_pusher);

  doc = xmlReadIO(
          (xmlInputReadCallback)noko_io_read,
          (xmlInputCloseCallback)noko_io_close,
          (void *)io,
          c_url,
          c_enc,
          (int)NUM2INT(options)
        );
  xmlSetStructuredErrorFunc(NULL, NULL);

  if (doc == NULL) {
    xmlErrorPtr error;

    xmlFreeDoc(doc);

    error = xmlGetLastError();
    if (error) {
      rb_exc_raise(Nokogiri_wrap_xml_syntax_error(error));
    } else {
      rb_raise(rb_eRuntimeError, "Could not parse document");
    }

    return Qnil;
  }

  document = noko_xml_document_wrap(klass, doc);
  rb_iv_set(document, "@errors", error_list);
  return document;
}

.read_memory(string, url, encoding, options) ⇒ Object

Create a new document from a String



# File 'ext/nokogiri/xml_document.c', line 314

static VALUE
read_memory(VALUE klass,
            VALUE string,
            VALUE url,
            VALUE encoding,
            VALUE options)
{
  const char *c_buffer = StringValuePtr(string);
  const char *c_url    = NIL_P(url)      ? NULL : StringValueCStr(url);
  const char *c_enc    = NIL_P(encoding) ? NULL : StringValueCStr(encoding);
  int len               = (int)RSTRING_LEN(string);
  VALUE error_list      = rb_ary_new();
  VALUE document;
  xmlDocPtr doc;

  xmlResetLastError();
  xmlSetStructuredErrorFunc((void *)error_list, Nokogiri_error_array_pusher);
  doc = xmlReadMemory(c_buffer, len, c_url, c_enc, (int)NUM2INT(options));
  xmlSetStructuredErrorFunc(NULL, NULL);

  if (doc == NULL) {
    xmlErrorPtr error;

    xmlFreeDoc(doc);

    error = xmlGetLastError();
    if (error) {
      rb_exc_raise(Nokogiri_wrap_xml_syntax_error(error));
    } else {
      rb_raise(rb_eRuntimeError, "Could not parse document");
    }

    return Qnil;
  }

  document = noko_xml_document_wrap(klass, doc);
  rb_iv_set(document, "@errors", error_list);
  return document;
}

.wrap(java_document) ⇒ Nokogiri::XML::Document

Note:

This method is only available when running JRuby.

Note:

The class Java::OrgW3cDom::Document is also accessible as org.w3c.dom.Document.

Create a Nokogiri::XML::Document using an existing Java DOM document object.

The returned Nokogiri::XML::Document shares the same underlying data structure as the Java object, so changes in one are reflected in the other.

Parameters:

  • java_document (Java::OrgW3cDom::Document)

Returns:

  • (Nokogiri::XML::Document)

See Also:

  • #to_java



# File 'lib/nokogiri/xml/document.rb', line 84

Instance Method Details

#add_child(node_or_tags) ⇒ Object Also known as: <<



# File 'lib/nokogiri/xml/document.rb', line 355

def add_child node_or_tags
  raise "A document may not have multiple root nodes." if (root && root.name != 'nokogiri_text_wrapper') && !(node_or_tags.comment? || node_or_tags.processing_instruction?)
  node_or_tags = coerce(node_or_tags)
  if node_or_tags.is_a?(XML::NodeSet)
    raise "A document may not have multiple root nodes." if node_or_tags.size > 1
    super(node_or_tags.first)
  else
    super
  end
end

#canonicalize(mode = XML_C14N_1_0, inclusive_namespaces = nil, with_comments = false) ⇒ Object
#canonicalize {|obj, parent| ... } ⇒ Object

Canonicalize a document and return the results. Takes an optional block that takes two parameters: the obj and that node’s parent. The obj will be either a Nokogiri::XML::Node or a Nokogiri::XML::Namespace. The block must return a non-nil, non-false value if the obj passed in should be included in the canonicalized document.

Overloads:

  • #canonicalize {|obj, parent| ... } ⇒ Object

    Yields:

      • (obj, parent)



# File 'ext/nokogiri/xml_document.c', line 533

static VALUE
rb_xml_document_canonicalize(int argc, VALUE *argv, VALUE self)
{
  VALUE mode;
  VALUE incl_ns;
  VALUE with_comments;
  xmlChar **ns;
  long ns_len, i;

  xmlDocPtr doc;
  xmlOutputBufferPtr buf;
  xmlC14NIsVisibleCallback cb = NULL;
  void *ctx = NULL;

  VALUE rb_cStringIO;
  VALUE io;

  rb_scan_args(argc, argv, "03", &mode, &incl_ns, &with_comments);

  Data_Get_Struct(self, xmlDoc, doc);

  rb_cStringIO = rb_const_get_at(rb_cObject, rb_intern("StringIO"));
  io           = rb_class_new_instance(0, 0, rb_cStringIO);
  buf          = xmlAllocOutputBuffer(NULL);

  buf->writecallback = (xmlOutputWriteCallback)noko_io_write;
  buf->closecallback = (xmlOutputCloseCallback)noko_io_close;
  buf->context       = (void *)io;

  if (rb_block_given_p()) {
    cb = block_caller;
    ctx = (void *)rb_block_proc();
  }

  if (NIL_P(incl_ns)) {
    ns = NULL;
  } else {
    Check_Type(incl_ns, T_ARRAY);
    ns_len = RARRAY_LEN(incl_ns);
    ns = calloc((size_t)ns_len + 1, sizeof(xmlChar *));
    for (i = 0 ; i < ns_len ; i++) {
      VALUE entry = rb_ary_entry(incl_ns, i);
      ns[i] = (xmlChar *)StringValueCStr(entry);
    }
  }


  xmlC14NExecute(doc, cb, ctx,
                 (int)(NIL_P(mode)        ? 0 : NUM2INT(mode)),
                 ns,
                 (int)      RTEST(with_comments),
                 buf);

  xmlOutputBufferClose(buf);

  return rb_funcall(io, rb_intern("string"), 0);
}

#collect_namespaces ⇒ Object

Recursively get all namespaces from this node and its subtree and return them as a hash.

For example, given this document:

<root xmlns:foo="bar">
  <bar xmlns:hello="world" />
</root>

This method will return:

{ 'xmlns:foo' => 'bar', 'xmlns:hello' => 'world' }

WARNING: this method will clobber duplicate names in the keys. For example, given this document:

<root xmlns:foo="bar">
  <bar xmlns:foo="baz" />
</root>

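The hash returned will look like this, with only one of the two values surviving: { 'xmlns:foo' => 'bar' }

Non-prefixed default namespaces (as in “xmlns=”) are stored under the bare key 'xmlns', because a nil prefix is dropped when the key is built.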

Note that this method does an xpath lookup for nodes with namespaces, and as a result the order may be dependent on the implementation of the underlying XML library.



# File 'lib/nokogiri/xml/document.rb', line 280

def collect_namespaces
  xpath("//namespace::*").inject({}) do |hash, ns|
    hash[["xmlns",ns.prefix].compact.join(":")] = ns.href if ns.prefix != "xml"
    hash
  end
end

#create_cdata(string, &block) ⇒ Object

Create a CDATA Node containing string



# File 'lib/nokogiri/xml/document.rb', line 231

def create_cdata string, &block
  Nokogiri::XML::CDATA.new self, string.to_s, &block
end

#create_comment(string, &block) ⇒ Object

Create a Comment Node containing string



# File 'lib/nokogiri/xml/document.rb', line 236

def create_comment string, &block
  Nokogiri::XML::Comment.new self, string.to_s, &block
end

#create_element(name, *contents_or_attrs) {|node| ... } ⇒ Nokogiri::XML::Element

Create a new Element named name that shares its GC lifecycle with the document, optionally setting contents or attributes.

Arguments may be passed to initialize the element:

  • a Hash argument will be used to set attributes

  • a non-Hash object that responds to #to_s will be used to set the new node’s contents

A block may be passed to mutate the node.

Examples:

An empty element without attributes

doc.create_element("div")
# => <div></div>

An element with contents

doc.create_element("div", "contents")
# => <div>contents</div>

An element with attributes

doc.create_element("div", {"class" => "container"})
# => <div class='container'></div>

An element with contents and attributes

doc.create_element("div", "contents", {"class" => "container"})
# => <div class='container'>contents</div>

Passing a block to mutate the element

doc.create_element("div") { |node| node["class"] = "blue" if before_noon? }

Parameters:

  • name (String)
  • contents_or_attrs (#to_s, Hash)

Yield Parameters:

  • node (Nokogiri::XML::Element)

Returns:

  • (Nokogiri::XML::Element)



# File 'lib/nokogiri/xml/document.rb', line 201

def create_element(name, *contents_or_attrs, &block)
  elm = Nokogiri::XML::Element.new(name, self, &block)
  contents_or_attrs.each do |arg|
    case arg
    when Hash
      arg.each do |k, v|
        key = k.to_s
        if key =~ NCNAME_RE
          ns_name = Regexp.last_match(1)
          elm.add_namespace_definition(ns_name, v)
        else
          elm[k.to_s] = v.to_s
        end
      end
    else
      elm.content = arg
    end
  end
  if ns = elm.namespace_definitions.find { |n| n.prefix.nil? || (n.prefix == '') }
    elm.namespace = ns
  end
  elm
end

#create_entity(name, type, external_id, system_id, content) ⇒ Object

Create a new entity named name.

type is an integer representing the type of entity to be created, and it defaults to Nokogiri::XML::EntityDecl::INTERNAL_GENERAL. See the constants on Nokogiri::XML::EntityDecl for more information.

external_id, system_id, and content set the External ID, System ID, and content respectively. All of these parameters are optional.



# File 'ext/nokogiri/xml_document.c', line 463

static VALUE
create_entity(int argc, VALUE *argv, VALUE self)
{
  VALUE name;
  VALUE type;
  VALUE external_id;
  VALUE system_id;
  VALUE content;
  xmlEntityPtr ptr;
  xmlDocPtr doc ;

  Data_Get_Struct(self, xmlDoc, doc);

  rb_scan_args(argc, argv, "14", &name, &type, &external_id, &system_id,
               &content);

  xmlResetLastError();
  ptr = xmlAddDocEntity(
          doc,
          (xmlChar *)(NIL_P(name)        ? NULL                        : StringValueCStr(name)),
          (int)(NIL_P(type)        ? XML_INTERNAL_GENERAL_ENTITY : NUM2INT(type)),
          (xmlChar *)(NIL_P(external_id) ? NULL                        : StringValueCStr(external_id)),
          (xmlChar *)(NIL_P(system_id)   ? NULL                        : StringValueCStr(system_id)),
          (xmlChar *)(NIL_P(content)     ? NULL                        : StringValueCStr(content))
        );

  if (NULL == ptr) {
    xmlErrorPtr error = xmlGetLastError();
    if (error) {
      rb_exc_raise(Nokogiri_wrap_xml_syntax_error(error));
    } else {
      rb_raise(rb_eRuntimeError, "Could not create entity");
    }

    return Qnil;
  }

  return noko_xml_node_wrap(cNokogiriXmlEntityDecl, (xmlNodePtr)ptr);
}

#create_text_node(string, &block) ⇒ Object

Create a Text Node with string



# File 'lib/nokogiri/xml/document.rb', line 226

def create_text_node string, &block
  Nokogiri::XML::Text.new string.to_s, self, &block
end

#decorate(node) ⇒ Object

Apply any decorators to node



# File 'lib/nokogiri/xml/document.rb', line 328

def decorate node
  return unless @decorators
  @decorators.each { |klass,list|
    next unless node.is_a?(klass)
    list.each { |moodule| node.extend(moodule) }
  }
end

#decorators(key) ⇒ Object

Get the list of decorators given key



# File 'lib/nokogiri/xml/document.rb', line 288

def decorators key
  @decorators ||= Hash.new
  @decorators[key] ||= []
end

#document ⇒ Object

A reference to self



# File 'lib/nokogiri/xml/document.rb', line 246

def document
  self
end

#dup ⇒ Object Also known as: clone

Copy this Document. An optional depth may be passed in, but it defaults to a deep copy. 0 is a shallow copy, 1 is a deep copy.



# File 'ext/nokogiri/xml_document.c', line 361

static VALUE
duplicate_document(int argc, VALUE *argv, VALUE self)
{
  xmlDocPtr doc, dup;
  VALUE copy;
  VALUE level;

  if (rb_scan_args(argc, argv, "01", &level) == 0) {
    level = INT2NUM((long)1);
  }

  Data_Get_Struct(self, xmlDoc, doc);

  dup = xmlCopyDoc(doc, (int)NUM2INT(level));

  if (dup == NULL) { return Qnil; }

  dup->type = doc->type;
  copy = noko_xml_document_wrap(rb_obj_class(self), dup);
  rb_iv_set(copy, "@errors", rb_iv_get(self, "@errors"));
  return copy ;
}

#encoding ⇒ Object

Get the encoding for this Document



# File 'ext/nokogiri/xml_document.c', line 230

static VALUE
encoding(VALUE self)
{
  xmlDocPtr doc;
  Data_Get_Struct(self, xmlDoc, doc);

  if (!doc->encoding) { return Qnil; }
  return NOKOGIRI_STR_NEW2(doc->encoding);
}

#encoding=(encoding) ⇒ Object

Set the encoding string for this Document



# File 'ext/nokogiri/xml_document.c', line 209

static VALUE
set_encoding(VALUE self, VALUE encoding)
{
  xmlDocPtr doc;
  Data_Get_Struct(self, xmlDoc, doc);

  if (doc->encoding) {
    xmlFree(DISCARD_CONST_QUAL_XMLCHAR(doc->encoding));
  }

  doc->encoding = xmlStrdup((xmlChar *)StringValueCStr(encoding));

  return encoding;
}

#fragment(tags = nil) ⇒ Object

Create a Nokogiri::XML::DocumentFragment from tags. Returns an empty fragment if tags is nil.



# File 'lib/nokogiri/xml/document.rb', line 347

def fragment tags = nil
  DocumentFragment.new(self, tags, self.root)
end

#name ⇒ Object

The name of this document. Always returns “document”.



# File 'lib/nokogiri/xml/document.rb', line 241

def name
  'document'
end

#namespaces ⇒ Object

Get the hash of namespaces on the root Nokogiri::XML::Node



# File 'lib/nokogiri/xml/document.rb', line 340

def namespaces
  root ? root.namespaces : {}
end

#remove_namespaces! ⇒ Object

Remove all namespaces from all nodes in the document.

This could be useful for developers who either don’t understand namespaces or don’t care about them.

The following example shows a use case, and you can decide for yourself whether this is a good thing or not:

doc = Nokogiri::XML <<-EOXML
   <root>
     <car xmlns:part="http://general-motors.com/">
       <part:tire>Michelin Model XGV</part:tire>
     </car>
     <bicycle xmlns:part="http://schwinn.com/">
       <part:tire>I'm a bicycle tire!</part:tire>
     </bicycle>
   </root>
   EOXML

doc.xpath("//tire").to_s # => ""
doc.xpath("//part:tire", "part" => "http://general-motors.com/").to_s # => "<part:tire>Michelin Model XGV</part:tire>"
doc.xpath("//part:tire", "part" => "http://schwinn.com/").to_s # => "<part:tire>I'm a bicycle tire!</part:tire>"

doc.remove_namespaces!

doc.xpath("//tire").to_s # => "<tire>Michelin Model XGV</tire><tire>I'm a bicycle tire!</tire>"
doc.xpath("//part:tire", "part" => "http://general-motors.com/").to_s # => ""
doc.xpath("//part:tire", "part" => "http://schwinn.com/").to_s # => ""

For more information on why this probably is not a good thing in general, please direct your browser to tenderlovemaking.com/2009/04/23/namespaces-in-xml.html



# File 'ext/nokogiri/xml_document.c', line 442

static VALUE
remove_namespaces_bang(VALUE self)
{
  xmlDocPtr doc ;
  Data_Get_Struct(self, xmlDoc, doc);

  recursively_remove_namespaces_from_node((xmlNodePtr)doc);
  return self;
}

#root ⇒ Object

Get the root node for this document.



# File 'ext/nokogiri/xml_document.c', line 187

static VALUE
rb_xml_document_root(VALUE self)
{
  xmlDocPtr c_document;
  xmlNodePtr c_root;

  Data_Get_Struct(self, xmlDoc, c_document);

  c_root = xmlDocGetRootElement(c_document);
  if (!c_root) {
    return Qnil;
  }

  return noko_xml_node_wrap(Qnil, c_root) ;
}

#root= ⇒ Object

Set the root element on this document



# File 'ext/nokogiri/xml_document.c', line 143

static VALUE
rb_xml_document_root_set(VALUE self, VALUE rb_new_root)
{
  xmlDocPtr c_document;
  xmlNodePtr c_new_root = NULL, c_current_root;

  Data_Get_Struct(self, xmlDoc, c_document);

  c_current_root = xmlDocGetRootElement(c_document);
  if (c_current_root) {
    xmlUnlinkNode(c_current_root);
    noko_xml_document_pin_node(c_current_root);
  }

  if (!NIL_P(rb_new_root)) {
    if (!rb_obj_is_kind_of(rb_new_root, cNokogiriXmlNode)) {
      rb_raise(rb_eArgError,
               "expected Nokogiri::XML::Node but received %"PRIsVALUE,
               rb_obj_class(rb_new_root));
    }

    Data_Get_Struct(rb_new_root, xmlNode, c_new_root);

    /* If the new root's document is not the same as the current document,
     * then we need to dup the node in to this document. */
    if (c_new_root->doc != c_document) {
      c_new_root = xmlDocCopyNode(c_new_root, c_document, 1);
      if (!c_new_root) {
        rb_raise(rb_eRuntimeError, "Could not reparent node (xmlDocCopyNode)");
      }
    }
  }

  xmlDocSetRootElement(c_document, c_new_root);

  return rb_new_root;
}

#slop! ⇒ Object

Explore a document with shortcut methods. See Nokogiri::Slop for details.

Note that any nodes that have been instantiated before #slop! is called will not be decorated with sloppy behavior. So, if you’re in irb, the preferred idiom is:

irb> doc = Nokogiri::Slop my_markup

and not

irb> doc = Nokogiri::HTML my_markup
... followed by irb's implicit inspect (and therefore instantiation of every node) ...
irb> doc.slop!
... which does absolutely nothing.


# File 'lib/nokogiri/xml/document.rb', line 317

def slop!
  unless decorators(XML::Node).include? Nokogiri::Decorators::Slop
    decorators(XML::Node) << Nokogiri::Decorators::Slop
    decorate!
  end

  self
end

#to_java ⇒ Java::OrgW3cDom::Document

Note:

This method is only available when running JRuby.

Note:

The class Java::OrgW3cDom::Document is also accessible as org.w3c.dom.Document.

Returns the underlying Java DOM document object for the Nokogiri::XML::Document.

The returned Java object shares the same underlying data structure as the Nokogiri::XML::Document, so changes in one are reflected in the other.

Returns:

  • (Java::OrgW3cDom::Document)

See Also:

  • .wrap



# File 'lib/nokogiri/xml/document.rb', line 99

#url ⇒ Object

Get the URL for this document.



# File 'ext/nokogiri/xml_document.c', line 126

static VALUE
url(VALUE self)
{
  xmlDocPtr doc;
  Data_Get_Struct(self, xmlDoc, doc);

  if (doc->URL) { return NOKOGIRI_STR_NEW2(doc->URL); }

  return Qnil;
}

#validate ⇒ Object

Validate this Document against its DTD. Returns a list of errors on the document or nil when there is no DTD.



# File 'lib/nokogiri/xml/document.rb', line 296

def validate
  return nil unless internal_subset
  internal_subset.validate self
end

#version ⇒ Object

Get the XML version for this Document



# File 'ext/nokogiri/xml_document.c', line 246

static VALUE
version(VALUE self)
{
  xmlDocPtr doc;
  Data_Get_Struct(self, xmlDoc, doc);

  if (!doc->version) { return Qnil; }
  return NOKOGIRI_STR_NEW2(doc->version);
}