Mercury Web Parser

A simple Ruby wrapper for the Mercury Web Parser API.


Installation

Add this line to your application's Gemfile:

gem 'mercury_web_parser'

And then execute:

$ bundle

Or install it yourself as:

$ gem install mercury_web_parser

Configuration

Before making requests to the Web Parser API, you must obtain an API token from Mercury.

Single token usage:

MercuryWebParser.api_token = API_TOKEN

Or set multiple options with a block:

MercuryWebParser.configure do |parser|
  parser.api_token = API_TOKEN
end
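
To keep the token out of source control, it can be read from the environment before it is assigned. The following is a minimal sketch, not part of the gem: the helper name `mercury_api_token` and the environment variable `MERCURY_API_TOKEN` are both assumptions made for illustration.

```ruby
# Hypothetical helper: reads the API token from an environment-like hash.
# MERCURY_API_TOKEN is an assumed variable name, not something the gem defines.
def mercury_api_token(env = ENV)
  token = env["MERCURY_API_TOKEN"].to_s.strip
  raise ArgumentError, "MERCURY_API_TOKEN is not set" if token.empty?
  token
end

# With the gem loaded, the token can then be assigned as in the examples above:
#   MercuryWebParser.api_token = mercury_api_token
```

Failing early with a clear error makes a missing token easier to diagnose than a rejected API request later on.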

Multiple tokens or multithreaded usage:

client = MercuryWebParser::Client.new(api_token: API_TOKEN)
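
For multithreaded use, one possible pattern is to build a separate client inside each thread so that no state is shared. The sketch below is illustrative only: `parse_concurrently` is a hypothetical helper, and it assumes the client object responds to `parse(url)` in the same way as `MercuryWebParser.parse`, which this README does not confirm.

```ruby
# Parses each URL in its own thread, building a separate client per thread.
# `build_client` is any callable returning an object that responds to
# `parse(url)`; that MercuryWebParser::Client exposes `parse` is an assumption.
def parse_concurrently(urls, build_client)
  threads = urls.map do |url|
    Thread.new do
      client = build_client.call # one client per thread, no shared state
      [url, client.parse(url)]
    end
  end
  # Thread#value waits for the thread and returns its block's result.
  threads.map(&:value).to_h
end

# With the gem loaded, usage might look like:
#   articles = parse_concurrently(
#     urls,
#     -> { MercuryWebParser::Client.new(api_token: API_TOKEN) }
#   )
```

Passing the client constructor as a callable keeps the helper independent of the gem, so it can be exercised with a stand-in client in tests.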

Usage

Parse

Parse a webpage and return its main content:

article = MercuryWebParser.parse("http://sethgodin.typepad.com/seths_blog/2016/11/all-we-have-is-each-other.html")
=> #<MercuryWebParser::Article title="Seth's Blog", author=nil, date_published=nil, dek=nil, lead_image_url="http://www.sethgodin.com/sg/images/og.jpg", content="<div id=\"alpha-inner\" class=\"pkg\"> <div class=\"module-typelist module\">...", next_page_url="http://sethgodin.typepad.com/seths_blog/2016/11/choose-better.html", url="http://sethgodin.typepad.com/seths_blog/2016/11/all-we-have-is-each-other.html", domain="sethgodin.typepad.com", excerpt="", word_count=462, direction="ltr", total_pages=4, pages_rendered=4>

article.title
article.content
article.author
article.date_published
article.lead_image_url
article.dek
article.next_page_url
article.url
article.domain
article.excerpt
article.word_count
article.direction
article.total_pages
article.rendered_pages
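
The readers above can be combined into a short summary line. This is a sketch under stated assumptions: `article_summary` is a hypothetical helper, it works with any object that responds to `title`, `domain`, and `word_count`, and the reading speed of 200 words per minute is an arbitrary illustrative default.

```ruby
# Builds a short plain-text summary from an article-like object.
# Works with anything that responds to title, domain, and word_count;
# the 200 words-per-minute reading speed is an illustrative assumption.
def article_summary(article, words_per_minute: 200)
  minutes = (article.word_count.to_f / words_per_minute).ceil
  title   = article.title.to_s.empty? ? "(untitled)" : article.title
  "#{title} (#{article.domain}), #{article.word_count} words, about #{minutes} min read"
end

# With the gem loaded and configured, usage might look like:
#   puts article_summary(MercuryWebParser.parse(url))
```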

Inspiration

A clone of the readability_parser gem.