DocWrapper is a simple DSL for creating wrappers around DOM objects.
Usage
% gem install doc_wrapper
Example Usages
DocWrapper allows you to easily create a declarative wrapper to access data from HTML Document Object Model (DOM) or XML DOM documents and optionally transform them.
DocWrapper works with any underlying "document" that has a search method, such as a DOM generated by Nokogiri or Hpricot. Because DocWrapper passes each property's search path to that method, it supports any selector your DOM library does. With Nokogiri, you can use either XPath or CSS selectors, which makes property definitions very flexible.
DocWrapper works by declaring properties with a name, type, and the search path to find the raw data in the DOM.
Basic Example
require 'nokogiri'
require 'doc_wrapper'
html = %{
<html>
<body>
<p class="first_name">Mark</p>
<p class="last_name">Menard</p>
</body>
</html>
}
class PersonWrapper
include DocWrapper::Base
include DocWrapper::Properties
# In XPath, attributes are tested with @, and // searches the whole document.
property :first_name, :string, '//p[@class="first_name"]'
property :last_name, :string, '//p[@class="last_name"]'
end
person_wrapper = PersonWrapper.new(Nokogiri::HTML(html))
person_wrapper.first_name # => 'Mark'
person_wrapper.last_name # => 'Menard'
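To make the declaration mechanism more concrete, here is a minimal sketch of how a class-level property macro could sit on top of any object that responds to search. It is not DocWrapper's actual source: TinyWrapper, FakeDocument, FakeNode and PersonSketch are hypothetical names invented for illustration, and type handling is left out entirely.

```ruby
# Hypothetical sketch, NOT DocWrapper's implementation.
# Any object that responds to `search(path)` can act as the document.
class FakeNode
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

class FakeDocument
  def initialize(data)
    @data = data # maps a search path to an array of node texts
  end

  # Mimics the DOM library contract: return the nodes matching the path.
  def search(path)
    @data.fetch(path, []).map { |text| FakeNode.new(text) }
  end
end

module TinyWrapper
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Defines a reader that searches for `path` and returns the
    # stripped text of the first match, or nil when nothing matches.
    def property(name, path)
      define_method(name) do
        node = @document.search(path).first
        node && node.text.strip
      end
    end
  end

  def initialize(document)
    @document = document
  end
end

class PersonSketch
  include TinyWrapper

  property :first_name, 'p.first_name'
  property :last_name, 'p.last_name'
end

doc = FakeDocument.new('p.first_name' => [' Mark '], 'p.last_name' => ['Menard'])
person = PersonSketch.new(doc)
puts person.first_name # prints "Mark"
puts person.last_name  # prints "Menard"
```

The wrapper only ever calls search on the document and text on the result, which is why the selector syntax can be left entirely to the DOM library.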
Supported Property Types
DocWrapper currently supports the :string, :date, :time, :boolean, :float and :raw property types. It also supports embedded wrappers through has_one and has_many, which work much like their ActiveRecord counterparts. See the specs for example usage.
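As a rough illustration of what a typed property implies, the sketch below converts raw node text into Ruby values using only the standard library. The conversion rules are assumptions made for this example, in particular which strings count as true and the idea that :raw returns its input untouched. DocWrapper's own rules may differ, so check the specs for the real behavior. CONVERTERS and convert are hypothetical names.

```ruby
require 'date'
require 'time'

# Hypothetical converters keyed by property type. These rules are
# assumptions for illustration, not DocWrapper's documented behavior.
CONVERTERS = {
  string:  ->(raw) { raw.strip },
  date:    ->(raw) { Date.parse(raw) },
  time:    ->(raw) { Time.parse(raw) },
  boolean: ->(raw) { %w[true yes 1].include?(raw.strip.downcase) },
  float:   ->(raw) { Float(raw.strip) },
  raw:     ->(raw) { raw } # assumed: handed back without conversion
}.freeze

# Looks up the converter for `type` and applies it to the raw text.
def convert(type, raw)
  CONVERTERS.fetch(type).call(raw)
end

puts convert(:string, "  Mark \n")     # prints "Mark"
puts convert(:date, '2010-03-15').year # prints 2010
puts convert(:float, '3.5') + 1        # prints 4.5
puts convert(:boolean, 'Yes')          # prints true
```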
Access to Node Attributes
:string, :date, :time and :boolean properties can read their value from an attribute on the matched node instead of from its text.
Given the following XML document:
<?xml version="1.0" encoding="UTF-8"?>
<feed>
<link type="text/html" href="http://search.twitter.com/search?q=yahoo.com" rel="alternate"/>
</feed>
You can access the link href with the following property definition.
class FeedWrapper
include DocWrapper::Base
include DocWrapper::Properties
property :link, :string, '//feed/link', :use_attribute => :href
end
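The sketch below shows one way the :use_attribute option could be resolved: read the named attribute when the option is present, and fall back to the node's text otherwise. StubNode and raw_value are hypothetical stand-ins written for this example, not part of DocWrapper. The only assumption about the real DOM node is that attributes can be read with node['href'], as Nokogiri nodes allow.

```ruby
# Stand-in for a DOM node: `text` returns the content and `[]` reads
# an attribute by name, similar to how Nokogiri nodes behave.
class StubNode
  attr_reader :text

  def initialize(text, attributes = {})
    @text = text
    @attributes = attributes
  end

  def [](name)
    @attributes[name.to_s]
  end
end

# Hypothetical helper: use the attribute named by :use_attribute when
# the option is given, otherwise use the node's text.
def raw_value(node, options = {})
  attribute = options[:use_attribute]
  attribute ? node[attribute] : node.text
end

link = StubNode.new('', 'href' => 'http://search.twitter.com/search?q=yahoo.com',
                        'rel' => 'alternate')
puts raw_value(link, use_attribute: :href) # prints the href URL
puts raw_value(link, use_attribute: :rel)  # prints "alternate"
```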