Class: Grabber::Site
Instance Method Summary
- #crawl ⇒ Object
- #initialize(url, path) ⇒ Site (constructor): A new instance of Site.
- #process_page(url) ⇒ Object
Methods included from Util
#format_url, #strip_non_url_parts, #uri, #with_url_protocol
Constructor Details
#initialize(url, path) ⇒ Site
Returns a new instance of Site.
    # File 'lib/grabber/site.rb', line 5
    def initialize(url, path)
      @url = with_url_protocol(url)
      @download_path = path
    end
Instance Method Details
#crawl ⇒ Object
    # File 'lib/grabber/site.rb', line 10
    def crawl
      index = 0
      page_urls = [format_url(@url)]

      while (url = page_urls[index])
        page = process_page(url)
        other_urls = page.links.map { |link| format_url(link) }.select do |link|
          URI.parse(link).host == uri.host
        end
        page_urls = page_urls | other_urls.compact
        index += 1
      end
    end
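
The traversal in #crawl can be illustrated in isolation. The following is a minimal sketch, not the gem's own code: it assumes a hypothetical in-memory PAGES hash in place of process_page(url).links, and a hypothetical helper named crawl_order. It shows how walking an array by index while appending unseen URLs with Array#| visits every same-host page once and stops when the index runs past the end of the list.

```ruby
require 'uri'

# Hypothetical stand-in for process_page(url).links: each URL maps to the
# links found on that page. The real method's behavior is not shown here.
PAGES = {
  'http://example.com/'  => ['http://example.com/a', 'http://other.test/x'],
  'http://example.com/a' => ['http://example.com/', 'http://example.com/b'],
  'http://example.com/b' => []
}.freeze

# Hypothetical helper mirroring the loop structure of #crawl.
def crawl_order(start_url, pages)
  root_host = URI.parse(start_url).host
  index = 0
  page_urls = [start_url]

  # page_urls[index] is nil once every queued URL has been processed,
  # which ends the loop.
  while (url = page_urls[index])
    links = pages.fetch(url, [])
    # Keep only links on the same host as the starting URL.
    same_host = links.select { |link| URI.parse(link).host == root_host }
    # Array#| is a set union: already-queued URLs are not added again.
    page_urls |= same_host
    index += 1
  end

  page_urls
end

puts crawl_order('http://example.com/', PAGES)
```

Because the union never re-adds a URL that is already queued, pages that link back to each other do not cause an endless loop, and the off-host link is never queued.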