Class: DaimonSkycrawlers::SitemapParser
- Inherits: Object
  - Object
  - DaimonSkycrawlers::SitemapParser
- Defined in: lib/daimon_skycrawlers/sitemap_parser.rb
Overview
Based on github.com/benbalter/sitemap-parser
Instance Method Summary
- #initialize(urls, options = {}) ⇒ SitemapParser (constructor)
  A new instance of SitemapParser.
- #parse ⇒ Object
Constructor Details
#initialize(urls, options = {}) ⇒ SitemapParser
    # File 'lib/daimon_skycrawlers/sitemap_parser.rb', line 9
    def initialize(urls, options = {})
      @urls = urls
    end
Instance Method Details
#parse ⇒ Object
    # File 'lib/daimon_skycrawlers/sitemap_parser.rb', line 13
    def parse
      hydra = Typhoeus::Hydra.new(max_concurrency: 1)
      sitemap_urls = []
      @urls.each do |url|
        if URI(url).scheme.start_with?("http")
          request = Typhoeus::Request.new(url, followlocation: true)
          request.on_complete do |response|
            sitemap_urls.concat(on_complete(response))
          end
          hydra.queue(request)
        else
          if File.exist?(url)
            extract_urls(File.read(url))
          end
        end
      end
      hydra.run
      sitemap_urls
    end
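A note on the remote/local dispatch in #parse: a bare filesystem path such as "sitemap.xml" has no URI scheme, so `URI(url).scheme` returns nil and calling `start_with?("http")` on it raises NoMethodError before the `File.exist?` branch is reached. The following is a minimal sketch of a nil-tolerant version of that check, using only the standard library. The helper name `sitemap_source_kind` and its return symbols are hypothetical and are not part of DaimonSkycrawlers.

```ruby
require "uri"

# Hypothetical helper, not part of DaimonSkycrawlers: classifies a sitemap
# source as remote (http/https), local (an existing path), or unknown.
def sitemap_source_kind(url)
  scheme = URI(url).scheme
  # Safe navigation (&.) avoids NoMethodError when the scheme is nil,
  # which is the case for bare filesystem paths.
  if scheme&.start_with?("http")
    :remote
  elsif File.exist?(url)
    :local
  else
    :unknown
  end
end

sitemap_source_kind("https://example.com/sitemap.xml") # => :remote
```

A caller could use the result to decide whether to queue an HTTP request or read the file directly. Separately, the quoted source does not append the return value of `extract_urls` to `sitemap_urls` in the local-file branch. Since `extract_urls` is not shown on this page, it is unclear whether that is intentional.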