Class: Kudzu::Agent::UrlExtractor::ForXML

Inherits:
Object
Defined in:
lib/kudzu/agent/url_extractor.rb

Instance Method Summary

Constructor Details

#initialize(config) ⇒ ForXML

Returns a new instance of ForXML.


# File 'lib/kudzu/agent/url_extractor.rb', line 109

def initialize(config)
  @config = config
end

Instance Method Details

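#extract(response) ⇒ Object

Returns the references gathered from the RSS and Atom content of the response's parsed document, omitting any whose URL is nil or empty.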


# File 'lib/kudzu/agent/url_extractor.rb', line 113

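def extract(response)
  # Work on a copy: remove_namespaces! modifies the document in place
  doc = response.parsed_doc.dup
  doc.remove_namespaces!

  # Collect references from both feed formats, then drop those without a URL
  refs = from_rss(doc) + from_atom(doc)
  refs.reject { |ref| ref.url.nil? || ref.url.empty? }
end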
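
The filtering step at the end of #extract can be tried in isolation. The following is a minimal sketch, not the library's own code: Ref is a hypothetical stand-in for whatever objects from_rss and from_atom return (assumed here only to respond to url), and combine_refs is an illustrative helper name that does not appear in the source.

```ruby
# Hypothetical stand-in for the reference objects produced by from_rss and
# from_atom; the real class is not shown in this section.
Ref = Struct.new(:url)

# Illustrative helper (not part of Kudzu) mirroring the last two lines of
# #extract: concatenate both lists, then drop entries whose URL is nil or empty.
def combine_refs(rss_refs, atom_refs)
  refs = rss_refs + atom_refs
  refs.reject { |ref| ref.url.nil? || ref.url.empty? }
end

rss_refs  = [Ref.new('http://example.com/a'), Ref.new(nil)]
atom_refs = [Ref.new(''), Ref.new('http://example.com/b')]

kept = combine_refs(rss_refs, atom_refs)
puts kept.map(&:url).inspect
```

Note that this filter only removes nil and empty strings. A URL made up of whitespace, or a URL that appears in both lists, would pass through unchanged in this sketch.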