Method: WebRobots#initialize
Defined in: lib/webrobots.rb
#initialize(user_agent, options = nil) ⇒ WebRobots
Creates a WebRobots object for a robot named user_agent, with an optional options hash.
- :http_get => a custom method, proc, or anything that responds to .call(uri), used for fetching robots.txt. It must return the response body on success, return an empty string if the resource is not found, and return nil or raise an error on failure. Redirects should be handled within this proc (see the first sketch after this list).
- :crawl_delay => determines how to react to Crawl-delay directives. If :sleep is given, WebRobots sleeps as demanded when allowed?(url)/disallowed?(url) is called; this is the default behavior. If :ignore is given, WebRobots does nothing. If a custom method, proc, or anything that responds to .call(delay, last_checked_at) is given, it is called instead (see the second sketch after this list).
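For :http_get, any callable that satisfies the contract above will do. The following is a minimal sketch (not part of the library) of a fetcher built on Net::HTTP that follows redirects itself; the name fetch_robots_txt and the bot name are placeholders:

require 'net/http'
require 'uri'

# Hypothetical :http_get callable: returns the body on success, "" when
# robots.txt does not exist, nil on other failures, and follows up to
# five redirects itself.
fetch_robots_txt = lambda do |uri|
  uri = URI(uri.to_s)
  5.times do
    response = Net::HTTP.get_response(uri)
    case response
    when Net::HTTPSuccess     then return response.body
    when Net::HTTPNotFound    then return ''
    when Net::HTTPRedirection then uri = URI.join(uri.to_s, response['location'])
    else return nil
    end
  end
  nil
end

robots = WebRobots.new('MyBot/1.0', :http_get => fetch_robots_txt)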
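For :crawl_delay, a custom handler receives the requested delay in seconds and the time the site was last checked. A hypothetical handler that enforces the delay itself, roughly mirroring the default :sleep behavior, could look like:

# Hypothetical handler: sleep off whatever portion of the delay has
# not yet elapsed since the last request to the site.
throttle = lambda do |delay, last_checked_at|
  if last_checked_at
    wait = delay - (Time.now - last_checked_at)
    sleep(wait) if wait > 0
  end
end

robots = WebRobots.new('MyBot/1.0', :crawl_delay => throttle)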
# File 'lib/webrobots.rb', line 28

def initialize(user_agent, options = nil)
  @user_agent = user_agent
  options ||= {}
  @http_get = options[:http_get] || method(:http_get)
  crawl_delay_handler =
    case value = options[:crawl_delay] || :sleep
    when :ignore
      nil
    when :sleep
      method(:crawl_delay_handler)
    else
      if value.respond_to?(:call)
        value
      else
        raise ArgumentError, "invalid Crawl-delay handler: #{value.inspect}"
      end
    end
  @parser = RobotsTxt::Parser.new(user_agent, crawl_delay_handler)
  @parser_mutex = Mutex.new
  @robotstxt = create_cache()
end
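In use, construction and a permission check might look like the following sketch; the bot name and URL are placeholders:

robots = WebRobots.new('MyBot/1.0')

url = 'http://www.example.com/some/page'
if robots.allowed?(url)   # with the default :sleep handler this may pause for Crawl-delay
  # safe to fetch url
end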