Class: Mida::Document

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/mida/document.rb

Overview

Class that holds the extracted Microdata

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(target, page_url = nil) ⇒ Document

Creates a new Document holding the Microdata extracted from target.

target

The string containing the HTML that you want to parse, or an already-parsed Nokogiri::XML::Document.

page_url

The URL of target, used to form absolute URLs. This must include the filename, e.g. index.html.



# File 'lib/mida/document.rb', line 18

def initialize(target, page_url=nil)
  @doc = target.kind_of?(Nokogiri::XML::Document) ? target : Nokogiri(target)
  @page_url = page_url
  @items = extract_items
end
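
The requirement that page_url include the filename can be illustrated with Ruby's standard URI library. This is only a sketch, assuming relative URLs are resolved by the usual RFC 3986 rules; the source shown here does not reveal how Mida resolves them. The helper absolute_url and the example.com URLs are hypothetical.

```ruby
require "uri"

# Hypothetical helper, not part of Mida: resolves a relative link
# against the URL of the page it appears on.
def absolute_url(page_url, link)
  URI.join(page_url, link).to_s
end

# Standard resolution replaces the last path segment of the base URL,
# so a base without a filename loses its final directory.
puts absolute_url("http://example.com/recipes/index.html", "images/cake.jpg")
# http://example.com/recipes/images/cake.jpg
puts absolute_url("http://example.com/recipes", "images/cake.jpg")
# http://example.com/images/cake.jpg
```

Under that assumption, omitting the filename causes "recipes" itself to be treated as the segment to replace, which is why the second result points at the wrong directory.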

Instance Attribute Details

#items ⇒ Object (readonly)

An Array of Mida::Item objects. These are all top-level items and hence not properties of other Items.



# File 'lib/mida/document.rb', line 11

def items
  @items
end

Instance Method Details

#each ⇒ Object

Implements each, as required by Enumerable, yielding each top-level item in turn.



# File 'lib/mida/document.rb', line 25

def each
  @items.each {|item| yield(item)}
end
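
Because the class includes Enumerable and defines each, the usual Enumerable methods should become available on it. The sketch below shows the pattern with a stand-in class; ItemHolder is hypothetical and holds plain strings rather than Mida::Item objects.

```ruby
# Hypothetical stand-in for Mida::Document, for illustration only.
class ItemHolder
  include Enumerable

  def initialize(items)
    @items = items
  end

  # Enumerable only needs #each; map, select, count and the rest
  # are built on top of it.
  def each
    @items.each { |item| yield(item) }
  end
end

holder = ItemHolder.new(%w[Person Organization Person])
puts holder.count("Person")              # 2
puts holder.map(&:upcase).inspect        # ["PERSON", "ORGANIZATION", "PERSON"]
puts holder.select { |t| t.start_with?("O") }.inspect  # ["Organization"]
```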

#search(itemtype, items = @items) ⇒ Object

Returns an array of matching Mida::Item objects.

This drills down through each Item, including Items held as property values, to find matching items.

itemtype

A regexp to match the item types against

items

An array of items to search. If no argument is supplied, all items in the document are searched.



# File 'lib/mida/document.rb', line 36

def search(itemtype, items=@items)
  items.each_with_object([]) do |item, found_items|
    # Allows matching against empty string, otherwise couldn't match
    # as item.type can be nil
    if (item.type.nil? && "" =~ itemtype) || (item.type =~ itemtype)
      found_items << item
    end
    found_items.concat(search_values(item.properties.values, itemtype))
  end
end
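
The recursive search above can be tried out in isolation with stand-in objects. This is a minimal sketch, not Mida's own code: Item is a hypothetical Struct standing in for Mida::Item, and the body of search_values is an assumption, since the source calls that helper without showing it.

```ruby
# Hypothetical stand-in for Mida::Item: a type and a hash mapping
# property names to arrays of values.
Item = Struct.new(:type, :properties)

def search(itemtype, items)
  items.each_with_object([]) do |item, found|
    # An untyped item (type nil) matches only if "" matches the regexp.
    found << item if (item.type.nil? && "" =~ itemtype) || (item.type =~ itemtype)
    found.concat(search_values(item.properties.values, itemtype))
  end
end

# Assumed behaviour: recurse into nested Items and arrays,
# ignoring plain values such as strings.
def search_values(values, itemtype)
  values.each_with_object([]) do |value, found|
    case value
    when Item  then found.concat(search(itemtype, [value]))
    when Array then found.concat(search_values(value, itemtype))
    end
  end
end

address = Item.new("http://schema.org/PostalAddress", { "locality" => ["Leeds"] })
person  = Item.new("http://schema.org/Person", { "name" => ["Ann"], "address" => [address] })
untyped = Item.new(nil, { "note" => ["no itemtype"] })

puts search(/PostalAddress/, [person, untyped]).map(&:type).inspect
# ["http://schema.org/PostalAddress"]
puts search(//, [person, untyped]).length
# 3
```

The empty regexp matches every item, including the untyped one, which is the case the comment in the original method describes.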