Pagedump
Installation
Add this line to your application's Gemfile:
gem 'pagedump'
And then execute:
$ bundle
Or install it yourself as:
$ gem install pagedump
Usage
Create a page driver:
require "pagedump"
class LeMonde < Pagedump::Driver
  URL = "http://www.lemonde.fr/"
  def headlines page
    head 3, page.css(".titre_une a")[0]['href']
    page.css(".titres_hauts article").each do |e|
      head 1, e.css('a')[0]["href"]
    end
  end
end
Then scrape its links:
require "pagedump"
# @driver is assumed to hold an instance of the LeMonde driver defined above.
headlines = @driver.scrap
headlines.each do |headline, w|
  puts "%3d\t%-s" % [w, headline]
end
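The return value of scrap is not documented here, but the loop above suggests it yields pairs of a headline link and its weight. As a rough sketch of what the printing step produces, the snippet below uses a plain Hash as a stand-in for that result. The sample URLs and the descending sort are assumptions made for illustration, not part of pagedump.

```ruby
# Hypothetical sample data standing in for the result of scrap:
# each key is a headline link, each value is the weight given to head.
headlines = {
  "http://example.com/side-story" => 1,
  "http://example.com/top-story"  => 3,
}

# Sort by descending weight (an illustrative choice), then format each
# pair the same way as the loop above: weight right-aligned in three
# columns, a tab, then the link.
lines = headlines.sort_by { |_url, weight| -weight }.map do |url, weight|
  "%3d\t%-s" % [weight, url]
end

puts lines
```

With this format string the weight is padded to three characters, so weights up to 999 line up in a single column before the tab.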
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/pompadour/pagedump.
License
The gem is available as open source under the terms of the MIT License.