Class: RestaurantWeekBoston::Scraper

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/restaurant_week_boston/scraper.rb

Overview

Scrapes the Restaurant Week Boston (RWB) site.

Constructor Details

#initialize(opts = {}) ⇒ Scraper

opts is a hash of options, with the following keys:

:neighborhood

dorchester, back-bay, etc. (default: :all)

:meal

lunch, dinner, both, or any (default: :any)

Creating a Scraper writes a file in your home directory called ".restaurant_week_boston.cache", which contains the HTML from the RWB site, so that the page does not have to be fetched on every run. You can delete that file; the next run will just take longer, since it has to fetch the HTML again.



# File 'lib/restaurant_week_boston/scraper.rb', line 17

def initialize(opts = {})
  @url = create_url(opts)
  @dump = File.expand_path('~/.restaurant_week_boston.cache')
  entries = doc().css('.restaurantEntry')
  @restaurants = entries.map{ |entry| Restaurant.new(entry) }
end

Instance Method Details

#create_url(opts = {}) ⇒ Object

opts is a hash of options, with the following keys:

:neighborhood

dorchester, back-bay, etc. (default: :all)

:meal

lunch, dinner, both, or any (default: :any)



# File 'lib/restaurant_week_boston/scraper.rb', line 27

def create_url(opts = {})
  # meal: any/lunch/dinner/both
  # &view=all
  default_opts = {:neighborhood => :all,
                  :meal => :any }
  opts = default_opts.merge!(opts)
  sprintf('http://www.restaurantweekboston.com/?neighborhood=%s&meal=%s&view=all',
          opts[:neighborhood].to_s, opts[:meal].to_s)
end
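
The defaults-then-override behaviour described above can be tried in isolation. The following is a minimal standalone sketch, not the gem's code: build_url and DEFAULT_OPTS are hypothetical names, and it swaps in the non-mutating Hash#merge and the standard library's URI.encode_www_form in place of merge! and sprintf.

```ruby
require 'uri'

# Hypothetical stand-in for Scraper#create_url; the names are illustrative only.
DEFAULT_OPTS = { neighborhood: :all, meal: :any }.freeze

def build_url(opts = {})
  # merge (rather than merge!) leaves the frozen defaults untouched
  merged = DEFAULT_OPTS.merge(opts)
  # encode_www_form escapes values that are not URL-safe
  query = URI.encode_www_form(
    neighborhood: merged[:neighborhood].to_s,
    meal: merged[:meal].to_s,
    view: 'all'
  )
  "http://www.restaurantweekboston.com/?#{query}"
end

puts build_url
puts build_url(neighborhood: 'back-bay', meal: :lunch)
```

Symbols and strings are interchangeable as option values here, because each value is converted with to_s before it is placed in the query string.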

#doc ⇒ Object

Returns a Nokogiri::HTML::Document parsed from the result of get_html. Prints status messages along the way.



# File 'lib/restaurant_week_boston/scraper.rb', line 60

def doc
  # get_html beforehand for good output messages
  html = get_html
  print "Parsing doc..."
  doc = Nokogiri::HTML(html)
  puts "done."
  puts
  doc
end

#each(&blk) ⇒ Object

Iterates over @restaurants. All methods in Enumerable work.



# File 'lib/restaurant_week_boston/scraper.rb', line 39

def each(&blk)
  @restaurants.each(&blk)
end
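
To illustrate why defining each is enough for "all methods in Enumerable" to work, here is a small self-contained sketch. Listing is a hypothetical class that mirrors the pattern, holding plain strings rather than Restaurant objects.

```ruby
# Hypothetical miniature of the Scraper pattern: define #each, include Enumerable.
class Listing
  include Enumerable

  def initialize(names)
    @names = names
  end

  # Delegating to Array#each is all that Enumerable requires.
  def each(&blk)
    @names.each(&blk)
  end
end

listing = Listing.new(['Bond', 'Artu', '224 Boston Street'])
puts listing.count
puts listing.select { |name| name.start_with?('A') }.inspect
puts listing.map(&:upcase).inspect
```

With a real Scraper the same calls would yield Restaurant objects instead of strings.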

#get_html ⇒ Object

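Returns the page HTML as a String. If the cache file exists and is non-empty, its contents are returned; otherwise the URL from create_url() is opened and the result is written to the cache file before being returned. Prints status messages along the way. Because the cache path does not depend on the URL, a cache written for one set of options is reused for every later call until the file is deleted.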



# File 'lib/restaurant_week_boston/scraper.rb', line 44

def get_html
  print "Getting doc..."
  if File.size? @dump
    html = File.read(@dump)
  else
    # Kernel#open no longer opens URLs as of Ruby 3.0; URI.open (from open-uri) does.
    html = URI.open(@url).read
    f = File.new(@dump, 'w')
    f.write(html)
    f.close()
  end
  puts "done."
  html
end
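
The read-from-cache-or-fetch logic above can be exercised without touching the network. The sketch below is an illustration under stated assumptions: cached_fetch is a hypothetical helper, the fetch step is a stub lambda instead of a real HTTP request, and the cache lives in a temporary directory rather than the home directory.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the cached text if the cache file is non-empty,
# otherwise calls the block, stores its result, and returns it.
def cached_fetch(cache_path)
  return File.read(cache_path) if File.size?(cache_path)

  html = yield
  File.write(cache_path, html)
  html
end

Dir.mktmpdir do |dir|
  cache = File.join(dir, 'example.cache')
  calls = 0
  # Stub standing in for the real download
  fetch = -> { calls += 1; '<html>stub page</html>' }

  first  = cached_fetch(cache, &fetch)
  second = cached_fetch(cache, &fetch)
  puts first == second
  puts "fetch ran #{calls} time(s)"
end
```

File.size? returns nil both for a missing file and for an empty one, so an empty cache file triggers a fresh fetch instead of returning an empty string.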

#special_find(array) ⇒ Object

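Pass in an array of name fragments, and this returns the restaurants whose names match any of them. Matching is case-insensitive, and each fragment is matched literally (it is escaped before being turned into a regular expression). So, if you're thinking of Bond, 224 Boston Street, and Artu, you can pass in ['bond', 'boston street', 'artu']. The matches are returned as a single String, separated by lines of 80 dashes. Note that the array you pass in is modified in place: each string is replaced by its regular expression.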



# File 'lib/restaurant_week_boston/scraper.rb', line 74

def special_find(array)
  array.map! do |name|
    /#{Regexp.escape(name)}/i
  end
  results = @restaurants.find_all do |restaurant|
    array.detect{ |regexp| regexp =~ restaurant.name }
  end
  # Add separators
  results.join("\n" + '-' * 80 + "\n")
end
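
The matching step can be sketched on its own as follows. This is an illustrative variant, not the gem's method: Place is a hypothetical stand-in for Restaurant (only name is needed), find_by_fragments is a hypothetical name, and it uses map rather than map! so the caller's array is left unchanged.

```ruby
# Hypothetical stand-in for Restaurant; only #name is needed for matching.
Place = Struct.new(:name)

# Non-mutating variant: builds the patterns with map rather than map!.
def find_by_fragments(places, fragments)
  patterns = fragments.map { |fragment| /#{Regexp.escape(fragment)}/i }
  places.select { |place| patterns.any? { |re| re =~ place.name } }
end

places  = ['Bond', '224 Boston Street', 'Artu', 'Mistral'].map { |n| Place.new(n) }
queries = ['bond', 'boston street', 'artu']

matches = find_by_fragments(places, queries)
# Join with the same 80-dash separator that special_find uses
puts matches.map(&:name).join("\n" + '-' * 80 + "\n")
```

Returning the matching objects and joining them only at print time keeps the search result usable for further filtering.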