Class: Llmsherpa::Paragraph
Overview
A paragraph is a block of text. It may have children, such as lists, and its tag is ‘para’.
Instance Attribute Summary
Attributes inherited from Block
#bbox, #block_idx, #block_json, #children, #left, #level, #page_idx, #parent, #sentences, #tag, #top
Instance Method Summary
- #to_html(include_children = false, recurse = false) ⇒ Object
- #to_text(include_children = false, recurse = false) ⇒ Object
Methods inherited from Block
#add_child, #chunks, #initialize, #iter_children, #paragraphs, #parent_chain, #parent_text, #sections, #tables, #to_context_text
Constructor Details
This class inherits a constructor from Llmsherpa::Block
Instance Method Details
#to_html(include_children = false, recurse = false) ⇒ Object
# File 'lib/llmsherpa/blocks.rb', line 133

    def to_html(include_children = false, recurse = false)
      html_str = "<p>"
      html_str += @sentences.join("\n")
      if include_children && !@children.empty?
        html_str += "<ul>"
        @children.each do |child|
          html_str += child.to_html(include_children: recurse, recurse: recurse)
        end
        html_str += "</ul>"
      end
      html_str += "</p>"
      html_str
    end
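The rendering performed by #to_html can be illustrated with a small stand-alone sketch. The classes SketchParagraph and SketchListItem below are hypothetical stand-ins, not part of the gem. They exist only so the example runs without a parsed document, and the list-item markup is an assumption about how a child block might render.

```ruby
# Hypothetical stand-in for Llmsherpa::Paragraph (not library code).
# The to_html body mirrors the method documented above.
class SketchParagraph
  def initialize(sentences, children = [])
    @sentences = sentences
    @children = children
  end

  def to_html(include_children = false, recurse = false)
    html_str = "<p>"
    html_str += @sentences.join("\n")
    if include_children && !@children.empty?
      html_str += "<ul>"
      @children.each do |child|
        html_str += child.to_html(include_children: recurse, recurse: recurse)
      end
      html_str += "</ul>"
    end
    html_str += "</p>"
    html_str
  end
end

# Hypothetical child block; the <li> wrapper is an assumption for illustration.
class SketchListItem < SketchParagraph
  def to_html(include_children = false, recurse = false)
    "<li>#{@sentences.join("\n")}</li>"
  end
end

item = SketchListItem.new(["First point."])
para = SketchParagraph.new(["Intro sentence.", "Second sentence."], [item])

# Default call: sentences only, joined with newlines inside <p>...</p>.
puts para.to_html

# With include_children set, children are wrapped in a <ul> inside the paragraph.
puts para.to_html(true)
```

With the defaults, only the paragraph's own sentences are emitted. Children appear only when include_children is truthy and the paragraph actually has children, so an empty <ul> is never produced.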
#to_text(include_children = false, recurse = false) ⇒ Object
# File 'lib/llmsherpa/blocks.rb', line 123

    def to_text(include_children = false, recurse = false)
      para_text = @sentences.join("\n")
      if include_children
        @children.each do |child|
          para_text += "\n#{child.to_text(include_children: recurse, recurse: recurse)}"
        end
      end
      para_text
    end
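One detail of the listing above may be worth checking before relying on the recurse flag. The method declares positional optional parameters, but the child call uses keyword-style arguments. Ruby packs such arguments into a single Hash and binds it to the first positional parameter, and a Hash is always truthy. The sketch below uses a hypothetical SketchBlock class (not library code) and assumes that child blocks share the same positional signature. The to_text_positional variant is likewise an illustration, not something the gem provides.

```ruby
# Hypothetical stand-in for a block with sentences and children (not library code).
class SketchBlock
  def initialize(sentences, children = [])
    @sentences = sentences
    @children = children
  end

  # Same body as the documented #to_text, including the keyword-style child call.
  def to_text(include_children = false, recurse = false)
    para_text = @sentences.join("\n")
    if include_children
      @children.each do |child|
        para_text += "\n#{child.to_text(include_children: recurse, recurse: recurse)}"
      end
    end
    para_text
  end

  # Illustrative variant (an assumption) that forwards the flags positionally.
  def to_text_positional(include_children = false, recurse = false)
    para_text = @sentences.join("\n")
    if include_children
      @children.each do |child|
        para_text += "\n#{child.to_text_positional(recurse, recurse)}"
      end
    end
    para_text
  end
end

grandchild = SketchBlock.new(["Nested detail."])
child = SketchBlock.new(["List item."], [grandchild])
para = SketchBlock.new(["Intro."], [child])

p para.to_text                          # "Intro."
p para.to_text(true)                    # "Intro.\nList item.\nNested detail."
p para.to_text_positional(true)         # "Intro.\nList item."
p para.to_text_positional(true, true)   # "Intro.\nList item.\nNested detail."
```

Under these assumptions, to_text(true) already descends through every level even though recurse is false, because each child receives a Hash as include_children. The same call pattern appears in #to_html. Whether this matters in practice depends on how the other Block subclasses declare their parameters, which this page does not show.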