Class: ArticleJSON::Import::GoogleDoc::HTML::Parser

Inherits:
  • Object
Defined in:
lib/article_json/import/google_doc/html/parser.rb

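Instance Method Summary

  • #initialize(html) ⇒ Parser (constructor): Returns a new instance of Parser.
  • #parsed_content ⇒ Array[ArticleJSON::Elements::Base]: Parse the body of the document and return the result.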

Constructor Details

#initialize(html) ⇒ Parser

Returns a new instance of Parser.

Parameters:

  • html (String)


# File 'lib/article_json/import/google_doc/html/parser.rb', line 7

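def initialize(html)
  doc = Nokogiri::HTML(html)
  # Prefer <div> nodes directly under <body> when present;
  # otherwise fall back to <body> itself.
  selection = if doc.xpath('//body/div').empty?
                doc.xpath('//body')
              else
                doc.xpath('//body/div')
              end
  # Enumerate the child nodes of the last selected node.
  @body_enumerator = selection.last.children.to_enum

  # The last <style> node in <head> feeds the CSS analyzer;
  # `&.` passes nil when the document has no <style> node.
  css_node = doc.xpath('//head/style').last
  @css_analyzer = CSSAnalyzer.new(css_node&.inner_text)
end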

Instance Method Details

#parsed_content ⇒ Array[ArticleJSON::Elements::Base]

Parses the body of the document and returns the resulting elements. The result is memoized, so the body is parsed at most once per Parser instance.

Returns:

  • (Array[ArticleJSON::Elements::Base])

# File 'lib/article_json/import/google_doc/html/parser.rb', line 22

def parsed_content
  @parsed_content ||= parse_body
end
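
The `||=` in `parsed_content` caches the first result on the instance. The sketch below illustrates that pattern in isolation. `MemoizedParser`, `parse_count`, and the string-stripping `parse_body` are hypothetical stand-ins, not part of ArticleJSON; the real `parse_body` is not shown in this page.

```ruby
# Hypothetical stand-in for the parser; only the `||=` memoization pattern
# mirrors the source above.
class MemoizedParser
  attr_reader :parse_count

  def initialize(lines)
    @lines = lines
    @parse_count = 0
  end

  # Same shape as Parser#parsed_content: compute once, then reuse.
  def parsed_content
    @parsed_content ||= parse_body
  end

  private

  # Placeholder for the real parsing work; counts how often it runs.
  def parse_body
    @parse_count += 1
    @lines.map(&:strip)
  end
end

parser = MemoizedParser.new([' a ', ' b '])
first  = parser.parsed_content
second = parser.parsed_content # served from @parsed_content, no re-parse
```

Because `||=` only assigns when the variable is `nil` or `false`, this pattern caches reliably when the computed value is always truthy, as an Array is in Ruby (even an empty one).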