Class: FindArt::WalMart
- Defined in:
- lib/FindArt/scrapers/walmart.rb
Instance Method Summary
- #extract_art(doc) ⇒ Object
Extracts the album art URL from a Walmart product page.
- #scrape(artist, title, opts = {}) ⇒ Object
Methods inherited from Scraper
#find_art, register_scraper, registerd_sites, #scrapers, unregister_scrapers!
Instance Method Details
#extract_art(doc) ⇒ Object
Extracts the album art URL from a Walmart product page.
# File 'lib/FindArt/scrapers/walmart.rb', line 28

def extract_art(doc)
  url = nil
  element = doc.at("* div[@class='LargeItemPhoto150'] a[@href^=javascript]")
  href = element["href"] if !element.nil? && !element["href"].nil?
  if href
    match, url = *href.match(/javascript:photo_opener\('(http:\/\/.*.jpg)&/)
  end
  url
end
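The regular expression step in #extract_art can be tried in isolation, without Hpricot. The following is a minimal sketch, not the library's code: the helper name art_url_from_href and the sample href are hypothetical, and the dot before "jpg" is escaped here, which makes the pattern slightly stricter than the one in the listing.

```ruby
# Pattern modeled on the one in #extract_art; the dot before "jpg" is
# escaped here (an assumption, the original leaves it unescaped).
PHOTO_OPENER = /javascript:photo_opener\('(http:\/\/.*\.jpg)&/

# Hypothetical helper: returns the captured image URL, or nil when the
# href is missing or does not match the photo_opener pattern.
def art_url_from_href(href)
  return nil if href.nil?
  match = href.match(PHOTO_OPENER)
  match && match[1]
end

# Made-up href value for illustration; real page markup may differ.
sample = "javascript:photo_opener('http://i.example.com/covers/12345_500X500.jpg&product_id=12345')"
puts art_url_from_href(sample)
```

Returning nil for a non-matching href mirrors the listing, where url stays nil unless the pattern matches.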
#scrape(artist, title, opts = {}) ⇒ Object
# File 'lib/FindArt/scrapers/walmart.rb', line 6

def scrape(artist,title,opts={})
  url = nil
  search_url = "#{@@url}#{CGI.escape("#{artist} #{title}")}"
  browser = WWW::Mechanize.new
  browser.get(search_url) do |page|
    doc = Hpricot(page.body)
    # check if there are multiple results and get the top result
    element = doc.at("* .firstRow a")
    if !element.nil?
      # extract and fetch item page
      item_page = browser.get(element.attributes["href"])
      doc = Hpricot(item_page.body)
    end
    #extract art from product page
    url = extract_art(doc)
  end
  url
end
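The search URL that #scrape builds is the class-level base URL followed by the CGI-escaped "artist title" string. A minimal sketch of that step follows. The value of @@url does not appear on this page, so BASE_URL below is a placeholder assumption, and build_search_url is a hypothetical helper name.

```ruby
require "cgi"

# Placeholder base URL; the real value of @@url is not shown in this page.
BASE_URL = "http://www.example.com/search?query="

# Hypothetical helper mirroring the string interpolation in #scrape:
# joins artist and title with a space, then escapes the result for use
# in a query string.
def build_search_url(artist, title, base_url = BASE_URL)
  "#{base_url}#{CGI.escape("#{artist} #{title}")}"
end

puts build_search_url("AC/DC", "Back in Black")
```

CGI.escape encodes spaces as "+" and reserved characters such as "/" as percent escapes, so artist names containing punctuation do not break the query string.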