Class: Hypermicrodata::Document
- Inherits: Object
- Object
- Hypermicrodata::Document
- Defined in: lib/hypermicrodata/document.rb
Instance Attribute Summary
- #doc ⇒ Object (readonly)
  Returns the value of attribute doc.
- #items ⇒ Object (readonly)
  Returns the value of attribute items.
Instance Method Summary
- #extract_items ⇒ Object
- #initialize(content, options = {}) ⇒ Document (constructor)
  A new instance of Document.
Constructor Details
#initialize(content, options = {}) ⇒ Document
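Returns a new instance of Document. The content is parsed as HTML with Nokogiri, and the items are extracted immediately. Recognized options: :force_encoding (the encoding passed to the parser), :page_url (passed on to each Item), and :filter_xpath_attr (an XPath predicate used to narrow the itemscope elements).
    # File 'lib/hypermicrodata/document.rb', line 6
    def initialize(content, options = {})
      encoding = options[:force_encoding] || nil
      @doc = Nokogiri::HTML(content, nil, encoding)
      @page_url = options[:page_url]
      @filter_xpath_attr = options[:filter_xpath_attr]
      @items = extract_items
    end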
Instance Attribute Details
#doc ⇒ Object (readonly)
Returns the value of attribute doc, the Nokogiri HTML document parsed from the content passed to the constructor.
    # File 'lib/hypermicrodata/document.rb', line 4
    def doc
      @doc
    end
#items ⇒ Object (readonly)
Returns the value of attribute items, the array of Item objects built by #extract_items.
    # File 'lib/hypermicrodata/document.rb', line 4
    def items
      @items
    end
Instance Method Details
#extract_items ⇒ Object
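Selects the itemscope elements of the document and wraps each one in an Item. If the :filter_xpath_attr option was given, only itemscope elements matching that XPath predicate are used. When none match, a notice is printed to standard output and the method falls back to every itemscope element that has no itemprop attribute, that is, the top-level items.
    # File 'lib/hypermicrodata/document.rb', line 14
    def extract_items
      itemscopes = []
      if @filter_xpath_attr
        itemscopes = @doc.xpath("//*[#{@filter_xpath_attr} and @itemscope]")
        puts "XPath //*[#{@filter_xpath_attr}] is not found. root node is used." if itemscopes.empty?
      end
      itemscopes = @doc.xpath('self::*[@itemscope] | .//*[@itemscope and not(@itemprop)]') if itemscopes.empty?
      itemscopes.collect do |itemscope|
        Item.new(itemscope, @page_url)
      end
    end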