Class: DaimonSkycrawlers::Processor::Spider
Defined in: lib/daimon_skycrawlers/processor/spider.rb
Instance Attribute Summary
- #enqueue ⇒ Object
  Returns the value of attribute enqueue.
Instance Method Summary
- #append_link_filter(filter = nil, &block) ⇒ Object
- #call(message) ⇒ Object
- #initialize ⇒ Spider (constructor)
  A new instance of Spider.
Methods inherited from Base
#before_process, #process, #storage
Methods included from LoggerMixin
Constructor Details
#initialize ⇒ Spider
Returns a new instance of Spider.
# File 'lib/daimon_skycrawlers/processor/spider.rb', line 9
def initialize
  super
  @link_filters = []  # callables registered via #append_link_filter
  @doc = nil
  @links = nil
  @enqueue = true
end
Instance Attribute Details
#enqueue ⇒ Object
Returns the value of attribute enqueue.
# File 'lib/daimon_skycrawlers/processor/spider.rb', line 7
def enqueue
  @enqueue
end
Instance Method Details
#append_link_filter(filter = nil, &block) ⇒ Object
# File 'lib/daimon_skycrawlers/processor/spider.rb', line 17
def append_link_filter(filter = nil, &block)
  if block_given?
    # A block takes precedence over the filter argument.
    @link_filters << block
  else
    # Objects that do not respond to #call are silently ignored.
    @link_filters << filter if filter.respond_to?(:call)
  end
end
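The listing above accepts either a block or any object that responds to `call`. A block wins when both are given, and a non-callable argument is dropped without an error. The sketch below reproduces only that registration logic in a hypothetical `FilterRegistry` stand-in, so it runs without the gem. This page does not show how the real `Spider` applies the filters, so the `all?` check at the end is an assumption made for illustration.

```ruby
# Hypothetical stand-in, not part of daimon_skycrawlers. It copies only the
# registration logic of Spider#append_link_filter shown above.
class FilterRegistry
  attr_reader :link_filters

  def initialize
    @link_filters = []
  end

  def append_link_filter(filter = nil, &block)
    if block_given?
      @link_filters << block
    else
      @link_filters << filter if filter.respond_to?(:call)
    end
  end
end

registry = FilterRegistry.new

# Register a filter as a block.
registry.append_link_filter { |url| url.start_with?("https://") }

# Register a filter as a callable object (a lambda here).
registry.append_link_filter(->(url) { !url.end_with?(".pdf") })

# A String does not respond to #call, so nothing is registered.
registry.append_link_filter("not callable")

urls = [
  "https://example.com/a",
  "http://example.com/b",
  "https://example.com/c.pdf",
]

# Assumption: a URL is kept only if every registered filter returns truthy.
kept = urls.select { |url| registry.link_filters.all? { |f| f.call(url) } }
```

Because only `respond_to?(:call)` is checked, a `Proc`, a lambda, a `Method` object, or an instance of a class that defines `call` can all serve as filters.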
#call(message) ⇒ Object
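# File 'lib/daimon_skycrawlers/processor/spider.rb', line 28
def call(message)
  key_url = message[:url]
  # Depth defaults to 2 when the message carries none.
  depth = Integer(message[:depth] || 2)
  # Heartbeat messages carry no page to process.
  return if message[:heartbeat]
  # Stop following links once the depth budget is used up.
  return if depth <= 1
  page = storage.find(key_url)
  @doc = Nokogiri::HTML(page.body)
  new_message = {
    depth: depth - 1,
  }
  # Each link is enqueued with the depth reduced by one.
  links.each do |url|
    enqueue_url(url, new_message)
  end
end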
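The guard clauses in `#call` decide whether a page's links are followed at all. The sketch below isolates that decision in a hypothetical helper, `next_messages`, which takes an already extracted list of links in place of the storage lookup and Nokogiri parsing. Returning an array of message hashes is also an assumption: the real method calls `enqueue_url`, whose behaviour is not shown on this page.

```ruby
# Hypothetical helper, not part of daimon_skycrawlers. It mirrors only the
# depth and heartbeat guards of Spider#call and returns the messages that
# would be passed on, instead of enqueueing them.
def next_messages(message, links)
  depth = Integer(message[:depth] || 2)  # same default as Spider#call
  return [] if message[:heartbeat]
  return [] if depth <= 1
  links.map { |url| { url: url, depth: depth - 1 } }
end

links = ["https://example.com/a", "https://example.com/b"]

# No :depth given: the default of 2 applies, so links go out with depth 1.
default_depth = next_messages({ url: "https://example.com/" }, links)

# Depth 1 means the budget is exhausted and nothing is followed.
exhausted = next_messages({ url: "https://example.com/", depth: 1 }, links)

# A String depth is accepted because Integer() converts it.
from_string = next_messages({ url: "https://example.com/", depth: "3" }, links)

# Heartbeat messages are skipped regardless of depth.
heartbeat = next_messages({ heartbeat: true, depth: 5 }, links)
```

With the default depth of 2, links found on the starting page are enqueued with depth 1, and those pages in turn follow no further links. Note that `Integer()` runs before the heartbeat check, so a non-numeric `:depth` raises `ArgumentError` even for a heartbeat message.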