Class: Oga::XML::PullParser
- Defined in: lib/oga/xml/pull_parser.rb
Overview
The PullParser class can be used to parse an XML document incrementally instead of parsing it as a whole. This results in lower memory usage and potentially faster parsing times. The downside is that pull parsers are typically more difficult to use compared to DOM parsers.
Basic parsing using this class works as follows:
parser = Oga::XML::PullParser.new('... xml here ...')
parser.parse do |node|
  if node.is_a?(Oga::XML::Element)
    # handle the element here
  end
end
This parser yields proper XML node instances such as Element. Doctypes and XML declarations are ignored.
Constant Summary
- DISABLED_CALLBACKS =
[ :on_document, :on_doctype, :on_xml_decl, :on_element_children ]
- BLOCK_CALLBACKS =
[ :on_cdata, :on_comment, :on_text, :on_proc_ins ]
- NODE_SHORTHANDS =
Returns the shorthands that can be used for various node classes.
{ :text => XML::Text, :node => XML::Node, :cdata => XML::Cdata, :element => XML::Element, :doctype => XML::Doctype, :comment => XML::Comment, :xml_declaration => XML::XmlDeclaration }
Constants inherited from Parser
Oga::XML::Parser::CONFIG, Oga::XML::Parser::TOKEN_ERROR_MAPPING
Instance Attribute Summary
- #nesting ⇒ Array readonly
Array containing the names of the currently nested elements.
- #node ⇒ Oga::XML::Node readonly
Instance Method Summary
- #after_element(*args) ⇒ Object
- #on(type, nesting = []) ⇒ Object
Calls the supplied block if the current node type and optionally the nesting match.
- #on_element(*args) ⇒ Object
- #parse {|| ... } ⇒ Object
Parses the input and yields every node to the supplied block.
- #reset ⇒ Object
Methods inherited from Parser
#_rule_0, #_rule_1, #_rule_10, #_rule_11, #_rule_12, #_rule_13, #_rule_14, #_rule_15, #_rule_16, #_rule_17, #_rule_18, #_rule_19, #_rule_2, #_rule_20, #_rule_21, #_rule_22, #_rule_23, #_rule_24, #_rule_25, #_rule_26, #_rule_27, #_rule_28, #_rule_29, #_rule_3, #_rule_30, #_rule_31, #_rule_32, #_rule_33, #_rule_34, #_rule_35, #_rule_36, #_rule_37, #_rule_38, #_rule_39, #_rule_4, #_rule_40, #_rule_41, #_rule_42, #_rule_43, #_rule_44, #_rule_45, #_rule_5, #_rule_6, #_rule_7, #_rule_8, #_rule_9, #each_token, #initialize, #on_attribute, #on_attributes, #on_cdata, #on_comment, #on_doctype, #on_document, #on_element_children, #on_proc_ins, #on_text, #on_xml_decl, #parser_error
Constructor Details
This class inherits a constructor from Oga::XML::Parser
Instance Attribute Details
#nesting ⇒ Array (readonly)
Array containing the names of the currently nested elements.
# File 'lib/oga/xml/pull_parser.rb', line 28
def nesting
  @nesting
end
#node ⇒ Oga::XML::Node (readonly)
# File 'lib/oga/xml/pull_parser.rb', line 24
def node
  @node
end
Instance Method Details
#after_element(*args) ⇒ Object
# File 'lib/oga/xml/pull_parser.rb', line 168
def after_element(*args)
  nesting.pop

  return
end
#on(type, nesting = []) ⇒ Object
Calls the supplied block if the current node type and optionally the nesting match. This method allows you to write this:
parser.parse do |node|
  parser.on(:text, %w{people person name}) do
    puts node.text
  end
end
Instead of this:
parser.parse do |node|
  if node.is_a?(Oga::XML::Text) and parser.nesting == %w{people person name}
    puts node.text
  end
end
When calling this method you can specify the following node types:
- :cdata
- :comment
- :element
- :text
# File 'lib/oga/xml/pull_parser.rb', line 124
def on(type, nesting = [])
  if node.is_a?(NODE_SHORTHANDS[type])
    if nesting.empty? or nesting == self.nesting
      yield
    end
  end
end
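The matching rule can be tried out without Oga itself. The following is a minimal sketch, assuming stand-in classes (FakeText, FakeElement and FakeParser are hypothetical and not part of Oga) that copy only the #on logic from the source listing: the block runs when the current node is an instance of the class mapped to type, and the nesting argument is either empty or equal to the parser's current nesting.

```ruby
# Hypothetical stand-ins for Oga::XML::Text and Oga::XML::Element.
FakeText    = Struct.new(:text)
FakeElement = Struct.new(:name)

# Hypothetical parser stub that mirrors only the #on logic of PullParser.
class FakeParser
  NODE_SHORTHANDS = { :text => FakeText, :element => FakeElement }

  attr_reader :node, :nesting

  def initialize(node, nesting)
    @node    = node
    @nesting = nesting
  end

  # Same body as PullParser#on in the listing for this method.
  def on(type, nesting = [])
    if node.is_a?(NODE_SHORTHANDS[type])
      if nesting.empty? or nesting == self.nesting
        yield
      end
    end
  end
end

# Pretend the parser is currently at a text node inside people/person/name.
parser  = FakeParser.new(FakeText.new('Alice'), %w{people person name})
matches = []

parser.on(:text) { matches << :any_text }                       # type only
parser.on(:text, %w{people person name}) { matches << :exact }  # type and full nesting
parser.on(:text, %w{people person}) { matches << :wrong_path }  # nesting differs
parser.on(:element) { matches << :element }                     # type differs
```

Only the first two blocks run, so matches ends up as [:any_text, :exact]. The nesting check compares the whole path with ==, so a partial path such as %w{people person} does not match a node nested one level deeper.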
#on_element(*args) ⇒ Object
# File 'lib/oga/xml/pull_parser.rb', line 155
def on_element(*args)
  @node = super

  nesting << @node.name

  @block.call(@node)

  return
end
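Read together with #after_element, this method keeps nesting as a stack: the element name is pushed before the node is yielded, and popped again once the element closes. A minimal sketch of that bookkeeping, assuming a hand-written event list in place of the real parser (the events array and its :open and :close symbols are hypothetical and exist only for illustration):

```ruby
# Hypothetical event stream for <people><person><name/></person></people>.
events = [
  [:open, 'people'], [:open, 'person'], [:open, 'name'],
  [:close, 'name'], [:close, 'person'], [:close, 'people']
]

nesting = []  # plays the role of PullParser#nesting
seen    = []  # snapshots of nesting at the moment each element would be yielded

events.each do |kind, name|
  if kind == :open
    nesting << name      # what on_element does before calling the block
    seen << nesting.dup  # the path a block would observe for this element
  else
    nesting.pop          # what after_element does
  end
end
```

Each element is observed with its own name already at the end of the path, and the stack is empty again once the outermost element has closed.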
#parse {|| ... } ⇒ Object
Parses the input and yields every node to the supplied block.
# File 'lib/oga/xml/pull_parser.rb', line 81
def parse(&block)
  @block = block

  super

  return
end
#reset ⇒ Object
# File 'lib/oga/xml/pull_parser.rb', line 68
def reset
  super

  @block   = nil
  @nesting = []
  @node    = nil
end