Class: Pismo::Document
- Inherits:
- Object
- Includes:
- ExternalAttributes, InternalAttributes
- Defined in:
- lib/pismo/document.rb
Overview
Pismo::Document represents a single HTML document within Pismo.
Constant Summary
- ATTRIBUTE_METHODS =
InternalAttributes.instance_methods + ExternalAttributes.instance_methods
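As a rough illustration of how a constant like this is assembled, the sketch below concatenates the instance methods of two stand-in modules. The module names, method names, and return values are hypothetical and are not Pismo's real attribute modules.

```ruby
# Hypothetical stand-ins for the two attribute modules (not Pismo's code).
module InternalAttributesSketch
  def title; "t"; end
  def author; "a"; end
end

module ExternalAttributesSketch
  def external_value; "e"; end
end

# Module#instance_methods returns an Array of Symbols, so + concatenates them.
ATTRIBUTE_METHODS_SKETCH =
  InternalAttributesSketch.instance_methods + ExternalAttributesSketch.instance_methods

# A class that includes both modules responds to every listed method.
class DocumentSketch
  include InternalAttributesSketch
  include ExternalAttributesSketch
end
```

The order of the symbols is not something to rely on, so sort the list before comparing it with anything.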
Instance Attribute Summary
- #doc ⇒ Object (readonly)
  Returns the value of attribute doc.
- #url ⇒ Object (readonly)
  Returns the value of attribute url.
Instance Method Summary
- #clean_html(html) ⇒ Object
- #html ⇒ Object
  An HTML representation of the document.
- #initialize(handle, url = nil) ⇒ Document (constructor)
  A new instance of Document.
- #load(handle, url = nil) ⇒ Object
- #match(args = [], all = false) ⇒ Object
Methods included from InternalAttributes
#author, #authors, #body, #datetime, #description, #favicon, #feed, #feeds, #html_title, #keywords, #lede, #ledes, #title, #titles
Constructor Details
#initialize(handle, url = nil) ⇒ Document
Returns a new instance of Document.
# File 'lib/pismo/document.rb', line 15
def initialize(handle, url = nil)
  load(handle, url)
end
Instance Attribute Details
#doc ⇒ Object (readonly)
Returns the value of attribute doc.
# File 'lib/pismo/document.rb', line 8
def doc
  @doc
end
#url ⇒ Object (readonly)
Returns the value of attribute url.
# File 'lib/pismo/document.rb', line 8
def url
  @url
end
Instance Method Details
#clean_html(html) ⇒ Object
# File 'lib/pismo/document.rb', line 45
def clean_html(html)
  html.gsub!('’', '\'')
  html.gsub!('”', '"')
  html.gsub!('–', '-')
  html.gsub!('“', '"')
  html.gsub!(' ', ' ')
  html
end
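The listing above uses gsub!, which modifies the string it is given. A minimal standalone sketch of that behavior follows. The helper name is hypothetical, the typographic characters are written as Unicode escapes on the assumption that they match the rendered literals, and the final whitespace substitution is left out because its first literal cannot be recovered from the rendered listing.

```ruby
# Standalone sketch of the substitutions (hypothetical helper, not Pismo's source).
def clean_html_sketch(html)
  html.gsub!("\u2019", "'")  # right single quotation mark -> apostrophe
  html.gsub!("\u201D", '"')  # right double quotation mark -> straight quote
  html.gsub!("\u2013", '-')  # en dash -> hyphen
  html.gsub!("\u201C", '"')  # left double quotation mark -> straight quote
  html                       # gsub! returns nil when nothing matched, so return html explicitly
end

# Unary + yields an unfrozen copy, so in-place modification is allowed.
input  = +"\u201CIt\u2019s 2010\u20132011\u201D"
result = clean_html_sketch(input)

# The caller's string is the one that was modified.
same_object = result.equal?(input)
```

Because the argument is changed in place, passing a frozen string would raise FrozenError. Calling dup on the input first is one way around that.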
#html ⇒ Object
An HTML representation of the document.
# File 'lib/pismo/document.rb', line 20
def html
  @doc.to_s
end
#load(handle, url = nil) ⇒ Object
# File 'lib/pismo/document.rb', line 24
def load(handle, url = nil)
  @url = url if url
  @url = handle if handle =~ /\Ahttp/

  @html = if handle =~ /\Ahttp/
    open(handle).read
  elsif handle.is_a?(StringIO) || handle.is_a?(IO) || handle.is_a?(Tempfile)
    handle.read
  else
    handle
  end

  @html = clean_html(@html)

  @doc = Nokogiri::HTML(@html)
end
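The handle dispatch in #load can be sketched with the standard library alone. The helper names below are hypothetical, and the URL branch is reduced to a predicate so that nothing is fetched. This is an illustration of the branching, not Pismo's implementation.

```ruby
require "stringio"
require "tempfile"

# Variation on the listing's URL test: check for String first, so the regexp
# is never applied to an IO-like object.
def url_handle_sketch?(handle)
  handle.is_a?(String) && handle.match?(/\Ahttp/)
end

# Non-URL branches: read from IO-like handles, otherwise treat the
# handle itself as the HTML.
def read_handle_sketch(handle)
  if handle.is_a?(StringIO) || handle.is_a?(IO) || handle.is_a?(Tempfile)
    handle.read
  else
    handle
  end
end

from_string   = read_handle_sketch("<p>hi</p>")
from_stringio = read_handle_sketch(StringIO.new("<p>hi</p>"))

tempfile = Tempfile.new("pismo-sketch")
tempfile.write("<p>hi</p>")
tempfile.rewind
from_tempfile = read_handle_sketch(tempfile)
tempfile.close!
```

Neither StringIO nor Tempfile is a subclass of IO, which is presumably why the listing checks each class explicitly. Two caveats may apply on newer Rubies: Object#=~ was removed in Ruby 3.2, so applying =~ to a non-String handle would fail there, and Kernel#open stopped opening URLs through open-uri in Ruby 3.0, where URI.open is the replacement.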
#match(args = [], all = false) ⇒ Object
# File 'lib/pismo/document.rb', line 41
def match(args = [], all = false)
  @doc.match([*args], all)
end
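The [*args] expression in #match normalizes its argument to an Array before delegating. A small sketch of just that normalization follows. The helper name and the sample selector strings are hypothetical, and the call to @doc.match itself is not reproduced.

```ruby
# Hypothetical helper isolating the splat normalization used by #match.
def normalize_args_sketch(args = [])
  [*args]
end

single   = normalize_args_sketch("title")          # a lone value is wrapped in an Array
multiple = normalize_args_sketch(["title", "h1"])  # an Array passes through unchanged
none     = normalize_args_sketch(nil)              # nil becomes an empty Array
default  = normalize_args_sketch                   # the default [] stays empty
```

This lets callers pass either one value or a list without the method having to branch on the argument type.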