Class: Gammo::Parser

Inherits:
Object
Includes:
Constants, Foreign
Defined in:
lib/gammo/parser.rb,
lib/gammo/parser/foreign.rb,
lib/gammo/parser/constants.rb,
lib/gammo/parser/node_stack.rb,
lib/gammo/parser/insertion_mode.rb,
lib/gammo/parser/insertion_mode/text.rb,
lib/gammo/parser/insertion_mode_stack.rb,
lib/gammo/parser/insertion_mode/in_row.rb,
lib/gammo/parser/insertion_mode/in_body.rb,
lib/gammo/parser/insertion_mode/in_cell.rb,
lib/gammo/parser/insertion_mode/in_head.rb,
lib/gammo/parser/insertion_mode/initial.rb,
lib/gammo/parser/insertion_mode/in_table.rb,
lib/gammo/parser/insertion_mode/in_select.rb,
lib/gammo/parser/insertion_mode/after_body.rb,
lib/gammo/parser/insertion_mode/after_head.rb,
lib/gammo/parser/insertion_mode/in_caption.rb,
lib/gammo/parser/insertion_mode/before_head.rb,
lib/gammo/parser/insertion_mode/before_html.rb,
lib/gammo/parser/insertion_mode/in_frameset.rb,
lib/gammo/parser/insertion_mode/in_template.rb,
lib/gammo/parser/insertion_mode/in_table_body.rb,
lib/gammo/parser/insertion_mode/after_frameset.rb,
lib/gammo/parser/insertion_mode/in_column_group.rb,
lib/gammo/parser/insertion_mode/after_after_body.rb,
lib/gammo/parser/insertion_mode/in_head_noscript.rb,
lib/gammo/parser/insertion_mode/in_select_in_table.rb,
lib/gammo/parser/insertion_mode/after_after_frameset.rb

Overview

Parses HTML input and builds an HTML tree from it.

Direct Known Subclasses

FragmentParser

Defined Under Namespace

Modules: Constants, Foreign

Classes: AfterAfterBody, AfterAfterFrameset, AfterBody, AfterFrameset, AfterHead, BeforeHTML, BeforeHead, InBody, InCaption, InCell, InColumnGroup, InFrameset, InHead, InHeadNoscript, InRow, InSelect, InSelectInTable, InTable, InTableBody, InTemplate, Initial, InsertionMode, Text

Constant Summary

ParseError =

Raised if anything goes wrong while parsing HTML input.

Class.new(ArgumentError)
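
Because ParseError is built with Class.new(ArgumentError), it is an ordinary subclass of ArgumentError, so a caller can rescue it under either name. The following is a minimal, self-contained sketch of that pattern only; the module SketchParser and the helpers fail_parse and rescue_as_argument_error are hypothetical stand-ins and are not part of Gammo.

```ruby
# Hypothetical stand-in for a parser namespace; not Gammo's code.
module SketchParser
  # Class.new(ArgumentError) builds an anonymous subclass; assigning it
  # to a constant gives the class its name.
  ParseError = Class.new(ArgumentError)

  # Hypothetical helper that always fails, to demonstrate rescuing.
  def self.fail_parse(input)
    raise ParseError, "cannot parse #{input.inspect}"
  end
end

# Rescues through the parent class and returns the caught exception.
def rescue_as_argument_error
  SketchParser.fail_parse('<p')
rescue ArgumentError => e
  e
end

error = rescue_as_argument_error
puts error.class    # SketchParser::ParseError
puts error.message  # cannot parse "<p"
```

Rescuing the more specific class first is usually preferable when the caller needs to tell parse failures apart from other argument errors.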

Constants included from Constants

Constants::SPECIAL_ELEMENTS

Constants included from Foreign

Foreign::BREAKOUT, Foreign::MATH_ML_ATTRIBUTE_ADJUSTMENTS, Foreign::SVG_ATTRIBUTE_ADJUSTMENTS, Foreign::SVG_TAG_NAME_ADJUSTMENTS

Instance Attribute Summary

Instance Method Summary

Methods included from Foreign

#adjust_attribute_names, #adjust_foreign_attributes, #html_integration_point?, #in_foreign_content?, #math_ml_text_integration_point?, #parse_foreign_content

Constructor Details

#initialize(input, scripting: true, frameset_ok: true, insertion_mode: Initial, context: nil) ⇒ Gammo::Parser

Constructs a parser for the given HTML input.

Parameters:

  • input (String)
  • scripting (TrueClass, FalseClass) (defaults to: true)
  • frameset_ok (TrueClass, FalseClass) (defaults to: true)
  • insertion_mode (InsertionMode) (defaults to: Initial)
  • context (Gammo::Node) (defaults to: nil)


# File 'lib/gammo/parser.rb', line 135

def initialize(input, scripting: true, frameset_ok: true, insertion_mode: Initial, context: nil)
  @input                      = input
  @scripting                  = scripting
  @frameset_ok                = frameset_ok
  @context                    = context
  @insertion_mode             = insertion_mode
  @token                      = nil
  @tokenizer                  = Tokenizer.new(input)
  @document                   = Node::Document.new
  @open_elements              = Parser::NodeStack.new([])
  @active_formatting_elements = Parser::NodeStack.new([])
  @template_stack             = InsertionModeStack.new([])
  @foster_parenting           = false
  @has_self_closing_token     = false
  @quirks                     = false
  @form                       = nil
  @head                       = nil
end

Instance Attribute Details

#context ⇒ Object

The context element used when parsing an HTML fragment, as defined in 12.2.4.2: html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments



# File 'lib/gammo/parser.rb', line 122

def context
  @context
end

#document ⇒ Object

The document node at the root of the HTML tree.



# File 'lib/gammo/parser.rb', line 100

def document
  @document
end

#form ⇒ Object



# File 'lib/gammo/parser.rb', line 91

def form
  @form
end

#frameset_ok ⇒ Object
Also known as: frameset_ok?

The frameset-ok flag, one of the other parsing state flags defined in 12.2.4.5: html.spec.whatwg.org/multipage/parsing.html#other-parsing-state-flags



# File 'lib/gammo/parser.rb', line 95

def frameset_ok
  @frameset_ok
end

#head ⇒ Object



# File 'lib/gammo/parser.rb', line 91

def head
  @head
end

#scripting ⇒ Object
Also known as: scripting?

The scripting flag, one of the other parsing state flags defined in 12.2.4.5: html.spec.whatwg.org/multipage/parsing.html#other-parsing-state-flags



# File 'lib/gammo/parser.rb', line 95

def scripting
  @scripting
end
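
The frameset_ok and scripting attributes are each documented with a predicate alias (frameset_ok?, scripting?). One common way to get that shape in Ruby is an accessor plus alias_method. The sketch below shows the pattern in isolation; the class SketchFlags is hypothetical, and this is an assumption about the general idiom, not a copy of how Gammo declares these attributes.

```ruby
# Hypothetical class illustrating a flag reader with a predicate alias;
# this is not Gammo's source.
class SketchFlags
  attr_accessor :scripting, :frameset_ok

  # Expose each flag under a predicate name as well.
  alias_method :scripting?, :scripting
  alias_method :frameset_ok?, :frameset_ok

  def initialize(scripting: true, frameset_ok: true)
    @scripting   = scripting
    @frameset_ok = frameset_ok
  end
end

flags = SketchFlags.new(frameset_ok: false)
puts flags.scripting?    # true
puts flags.frameset_ok?  # false
```

The alias returns whatever the reader returns, so the predicate is only strictly boolean if the stored value is.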

Instance Method Details

#parse ⇒ Gammo::Node::Document?

Parses the current input and builds an HTML tree from it.

Returns:

  • (Gammo::Node::Document, nil)

Raises:

  • (Gammo::ParseError)

    Raised if the parser encounters an error while parsing.



# File 'lib/gammo/parser.rb', line 157

def parse
  while self.token != Tokenizer::EOS
    # CDATA sections are allowed only in foreign content.
    node = open_elements.last
    tokenizer.allow_cdata!(node && node.namespace)
    self.token = tokenizer.next_token
    return if self.token.instance_of?(Tokenizer::ErrorToken) && self.token != Tokenizer::EOS
    parse_current_token
    break if self.token == Tokenizer::EOS
  end
  self.document
end
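
As listed, #parse keeps pulling tokens until it sees the end-of-stream marker, returns early (yielding nil) on an error token that is not that marker, and otherwise returns the document. The sketch below mirrors only that control flow with simplified, hypothetical stand-ins: the module ParseLoopSketch, its ErrorToken, EOS and Tokenizer, and the array used as a "document" are illustrative assumptions, not Gammo's classes.

```ruby
# Hypothetical, simplified stand-ins; none of these are Gammo's classes.
module ParseLoopSketch
  ErrorToken = Struct.new(:message)
  # Assumption for this sketch: the end marker is itself an error token,
  # which is why the loop must exclude it from the early return.
  EOS = ErrorToken.new('end of stream')

  class Tokenizer
    def initialize(tokens)
      @tokens = tokens.dup
    end

    # Returns the next token, or EOS once the input is exhausted.
    def next_token
      @tokens.shift || EOS
    end
  end

  # Returns nil on a non-EOS error token, otherwise the collected
  # "document" (a plain array here) once EOS is reached.
  def self.parse(tokens)
    tokenizer = Tokenizer.new(tokens)
    document  = []
    token     = nil
    until token.equal?(EOS)
      token = tokenizer.next_token
      return nil if token.instance_of?(ErrorToken) && !token.equal?(EOS)
      # The sketch only records ordinary tokens; EOS ends the loop.
      document << token unless token.equal?(EOS)
    end
    document
  end
end

p ParseLoopSketch.parse(%w[html body p])
# ["html", "body", "p"]
p ParseLoopSketch.parse(['html', ParseLoopSketch::ErrorToken.new('bad'), 'p'])
# nil
```

A caller of a method with this shape should check for nil before using the result.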