Class: ArticleJSON::Import::GoogleDoc::HTML::NodeAnalyzer
- Inherits: Object
- Defined in: lib/article_json/import/google_doc/html/node_analyzer.rb
Instance Attribute Summary
- #node ⇒ Object (readonly)
  Returns the value of attribute node.
Instance Method Summary
- #br? ⇒ Boolean
  Check if the node is a linebreak.
- #embed? ⇒ Boolean
  Check if the node contains an embedded element.
- #empty? ⇒ Boolean
  Check if the node is empty, i.e. contains no text.
- #has_text?(text) ⇒ Boolean
  Check if the node's text equals a certain text.
- #heading? ⇒ Boolean
  Check if the node is a heading tag from <h1> to <h5>.
- #hr? ⇒ Boolean
  Check if the node is a horizontal line (i.e. `<hr>`).
- #image? ⇒ Boolean
  Check if the node contains an image.
- #initialize(node) ⇒ NodeAnalyzer (constructor)
  A new instance of NodeAnalyzer.
- #list? ⇒ Boolean
  Check if the node contains an ordered or unordered list.
- #paragraph? ⇒ Boolean
  Check if the node is a normal text paragraph.
- #quote? ⇒ Boolean
  Check if the node starts a quote. Quotes start with a single line saying “Quote:”.
- #text_box? ⇒ Boolean
  Check if the node starts a text box. Text boxes start with a single line saying “Textbox:” or “Highlight:”.
- #type ⇒ Symbol
  Determine the type of this node. The type is one of the elements supported by article_json.
Constructor Details
#initialize(node) ⇒ NodeAnalyzer
Returns a new instance of NodeAnalyzer.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 9
    def initialize(node)
      @node = node
    end
Instance Attribute Details
#node ⇒ Object (readonly)
Returns the value of attribute node.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 6
    def node
      @node
    end
Instance Method Details
#br? ⇒ Boolean
Check if the node is a linebreak. A span containing only whitespace and <br> tags is considered a linebreak.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 96
    def br?
      return @is_br if defined? @is_br
      @is_br = node.name == 'br' || only_includes_brs?
    end
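The `defined?` guard used here (and in most predicates below) memoizes the result even when it is `false`, which `||=` would not do. A minimal sketch of the difference, using a hypothetical `MemoDemo` class that is not part of article_json:

```ruby
# Hypothetical class for illustration only; not part of article_json.
class MemoDemo
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Caches any result, including false: `defined?` only asks whether the
  # instance variable has been assigned yet.
  def with_defined
    return @with_defined if defined? @with_defined
    @with_defined = expensive_false
  end

  # Recomputes on every call whenever the cached value is false or nil.
  def with_or_equals
    @with_or_equals ||= expensive_false
  end

  private

  # Stands in for a costly check that happens to return false.
  def expensive_false
    @calls += 1
    false
  end
end

guarded = MemoDemo.new
3.times { guarded.with_defined }
guarded.calls # => 1

unguarded = MemoDemo.new
3.times { unguarded.with_or_equals }
unguarded.calls # => 3
```

Since every predicate here returns a Boolean, roughly half of all results would never be cached with `||=`.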
#embed? ⇒ Boolean
Check if the node contains an embedded element.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 88
    def embed?
      return @is_embed if defined? @is_embed
      @is_embed = EmbeddedParser.supported?(node)
    end
#empty? ⇒ Boolean
Check if the node is empty, i.e. contains no text. Images, horizontal lines and linebreaks carry no text either, so the check also makes sure the node is none of these.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 24
    def empty?
      return @is_empty if defined? @is_empty
      @is_empty = node.inner_text.strip.empty? && !image? && !hr? && !br?
    end
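The rule can be sketched on a stand-in object. The `BlankCheck` struct below is hypothetical: its Boolean fields replace the real `image?`, `hr?` and `br?` predicates, which the actual class derives from the parsed node.

```ruby
# Hypothetical stand-in; the real class inspects a parsed HTML node instead.
BlankCheck = Struct.new(:inner_text, :image, :hr, :br) do
  # Empty means: no visible text AND not one of the text-less element kinds.
  def empty?
    inner_text.strip.empty? && !image && !hr && !br
  end
end

BlankCheck.new(" \n ", false, false, false).empty? # => true
BlankCheck.new('',     true,  false, false).empty? # => false (holds an image)
BlankCheck.new('Hi',   false, false, false).empty? # => false (has text)
```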
#has_text?(text) ⇒ Boolean
Check if the node's text equals a certain text. The comparison ignores case and surrounding whitespace.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 16
    def has_text?(text)
      node.inner_text.strip.downcase == text.strip.downcase
    end
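A short sketch of what this comparison accepts and rejects. `TextNode` and `same_text?` are hypothetical names used only for illustration; the struct stands in for a parsed node that responds to `inner_text`.

```ruby
# Hypothetical stand-in for a parsed HTML node; only `inner_text` is needed.
TextNode = Struct.new(:inner_text)

# Same comparison as above, written as a free-standing helper.
def same_text?(node, text)
  node.inner_text.strip.downcase == text.strip.downcase
end

same_text?(TextNode.new("  Quote: \n"), 'quote:')          # => true
same_text?(TextNode.new('QUOTE:'), ' quote: ')             # => true
same_text?(TextNode.new('Quote: to be or not'), 'quote:')  # => false
```

The match is on the whole text, not a prefix, so a marker only counts when it stands alone on its line.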
#heading? ⇒ Boolean
Check if the node is a heading tag from <h1> to <h5>. A heading whose text is a quote or text box marker does not count as a heading.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 31
    def heading?
      return @is_heading if defined? @is_heading
      @is_heading =
        !quote? && !text_box? && %w(h1 h2 h3 h4 h5).include?(node.name)
    end
#hr? ⇒ Boolean
Check if the node is a horizontal line (i.e. `<hr>`).
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 39
    def hr?
      node.name == 'hr'
    end
#image? ⇒ Boolean
Check if the node contains an image.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 81
    def image?
      return @is_image if defined? @is_image
      @is_image = node.xpath('.//img').length > 0
    end
#list? ⇒ Boolean
Check if the node contains an ordered or unordered list.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 58
    def list?
      return @is_list if defined? @is_list
      @is_list = %w(ul ol).include?(node.name)
    end
#paragraph? ⇒ Boolean
Check if the node is a normal text paragraph.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 45
    def paragraph?
      return @is_paragraph if defined? @is_paragraph
      @is_paragraph =
        node.name == 'p' &&
          !empty? &&
          !image? &&
          !text_box? &&
          !quote? &&
          !embed?
    end
#quote? ⇒ Boolean
Check if the node starts a quote. Quotes start with a single line saying “Quote:”.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 74
    def quote?
      return @is_quote if defined? @is_quote
      @is_quote = has_text?('quote:')
    end
#text_box? ⇒ Boolean
Check if the node starts a text box. Text boxes start with a single line saying “Textbox:” or “Highlight:”.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 66
    def text_box?
      return @is_text_box if defined? @is_text_box
      @is_text_box = has_text?('textbox:') || has_text?('highlight:')
    end
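The quote and text box markers can be viewed as one lookup from a normalized line to an element type. The sketch below is an illustration of that idea, not how the class is implemented; `MARKERS` and `marker_type` are hypothetical names.

```ruby
# Hypothetical lookup table: normalized marker line => element type.
MARKERS = {
  'quote:'     => :quote,
  'textbox:'   => :text_box,
  'highlight:' => :text_box
}.freeze

# Returns the element a line starts, or nil if the line is not a marker.
# Normalization mirrors has_text?: strip whitespace, ignore case.
def marker_type(line)
  MARKERS[line.strip.downcase]
end

marker_type(' Highlight: ')   # => :text_box
marker_type('QUOTE:')         # => :quote
marker_type('Quote: hello')   # => nil
```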
#type ⇒ Symbol
Determine the type of this node. The type is one of the elements supported by article_json.
    # File 'lib/article_json/import/google_doc/html/node_analyzer.rb', line 104
    def type
      return :empty if empty?
      return :hr if hr?
      return :heading if heading?
      return :paragraph if paragraph?
      return :list if list?
      return :text_box if text_box?
      return :quote if quote?
      return :image if image?
      return :embed if embed?
      :unknown
    end
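Because the checks run top to bottom and return on the first hit, their order decides the result whenever more than one predicate is true. A simplified sketch of that first-match rule follows. `TYPE_ORDER` and `first_matching_type` are hypothetical, and the input is just a list of predicate names assumed to be true, whereas the real class evaluates each predicate against the node.

```ruby
# Check order as listed in #type above.
TYPE_ORDER = %i[empty hr heading paragraph list text_box quote image embed].freeze

# Returns the first type in TYPE_ORDER that appears in `true_predicates`,
# or :unknown when none matches.
def first_matching_type(true_predicates)
  TYPE_ORDER.find { |type| true_predicates.include?(type) } || :unknown
end

first_matching_type(%i[image list]) # => :list (a list that contains an image)
first_matching_type(%i[embed])      # => :embed
first_matching_type([])             # => :unknown
```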