Class: Iudex::Core::VisitQueue

Inherits:
Object
Defined in:
lib/iudex-core/visit_queue.rb

Overview

Configuration extensions for Java::iudex.core.VisitQueue.

Instance Method Details

#config(opts = {}) ⇒ Object

Configures general defaults, a specific domain, or a specific domain,type pair via an options Hash.

Options

:domain

Registration-level domain String. If not specified, :type is ignored and the other options apply as general defaults for all otherwise unconfigured domains and types.

:type

An optional type (e.g. PAGE). If specified, this :domain,:type pair is given its own HostQueue, with the other options applying exclusively to it.

:rate

Target maximum crawl rate, as a Float in requests/second, for this :domain(,:type) or the default for any not otherwise configured. Resource limits, including :cons and HTTP client connections, may hold the actual rate below this value. (Initial default: 2.0 requests/second)

:delay

Alternative to :rate, expressed as its inverse: Integer milliseconds to delay between scheduling visits. If specified, takes precedence over :rate.

:cons

Maximum number of concurrent requests for this :domain(,:type) or the default for any not otherwise configured. Note that the HTTP clients have their own per host:port destination connection limit which should generally be set higher than this value. (Initial default: 1)
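
The inverse relationship between :rate and :delay can be sketched as below. The helper names match those called in the method source, but their bodies are not shown on this page, so the implementations here (in particular the rounding to whole milliseconds) are assumptions for illustration only.

```ruby
# Hypothetical stand-ins for the rate/delay helpers. The real
# implementations are not shown on this page; the rounding is an assumption.

# Convert a rate in requests/second to a delay in whole milliseconds.
def rate_to_delay(rate)
  (1000.0 / rate).round
end

# Convert a delay in milliseconds to a rate in requests/second.
def delay_to_rate(delay)
  1000.0 / delay
end

puts rate_to_delay(2.0)  # initial default rate of 2.0 req/second; prints 500
puts delay_to_rate(500)  # prints 2.0
```

Under this reading, a :rate of 2.0 and a :delay of 500 describe the same schedule, which is why only one of the two needs to be given.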



# File 'lib/iudex-core/visit_queue.rb', line 55

def config( opts = {} )

  if opts[ :domain ]
    opts = { :rate => delay_to_rate( default_min_host_delay ),
             :cons => default_max_access_per_host }.merge( opts )
    configure_host( opts[ :domain ],
                    opts[ :type ], # includes nil
                    opts[ :delay ] || rate_to_delay( opts[ :rate ] ),
                    opts[ :cons ] )
  else
    if opts[ :rate ]
      self.default_min_host_delay = rate_to_delay( opts[ :rate ] )
    end
    self.default_min_host_delay = opts[ :delay ] if opts[ :delay ]
    self.default_max_access_per_host = opts[ :cons ] if opts[ :cons ]
  end

end
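
The precedence rules in the method above can be exercised without JRuby using a small stand-in. This is a sketch only: SketchQueue, its recorded hosts list, the helper bodies, the initial values, and the example domains and type string are all hypothetical. The real class is backed by the Java VisitQueue and creates HostQueue instances instead of recording arguments.

```ruby
# Hypothetical stand-in for illustration; not the real VisitQueue.
class SketchQueue
  attr_accessor :default_min_host_delay, :default_max_access_per_host
  attr_reader :hosts

  def initialize
    @default_min_host_delay = 500       # ms, assumed from the 2.0 req/s default
    @default_max_access_per_host = 1    # assumed from the documented :cons default
    @hosts = []
  end

  # Records its arguments instead of configuring a HostQueue.
  def configure_host(domain, type, delay, cons)
    @hosts << [domain, type, delay, cons]
  end

  # Assumed conversions; the real helper bodies are not shown on this page.
  def rate_to_delay(rate)
    (1000.0 / rate).round
  end

  def delay_to_rate(delay)
    1000.0 / delay
  end

  # Same logic as the config method listed above.
  def config(opts = {})
    if opts[:domain]
      opts = { :rate => delay_to_rate(default_min_host_delay),
               :cons => default_max_access_per_host }.merge(opts)
      configure_host(opts[:domain],
                     opts[:type], # includes nil
                     opts[:delay] || rate_to_delay(opts[:rate]),
                     opts[:cons])
    else
      self.default_min_host_delay = rate_to_delay(opts[:rate]) if opts[:rate]
      self.default_min_host_delay = opts[:delay] if opts[:delay]
      self.default_max_access_per_host = opts[:cons] if opts[:cons]
    end
  end
end

queue = SketchQueue.new

# No :domain, so these become the general defaults.
queue.config(:rate => 4.0, :cons => 2)

# :delay wins over the inherited default rate; :cons falls back to the default.
queue.config(:domain => 'example.com', :type => 'PAGE', :delay => 1000)

# Domain-wide setting without a :type; :rate is converted to a delay.
queue.config(:domain => 'example.org', :rate => 0.5)

p queue.default_min_host_delay
p queue.hosts
```

Note that a domain-specific call inherits whatever defaults are in effect at the time it is made, so in this sketch the order of config calls matters.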