Class: SpiderBot::Http::Client

Inherits:
Object
Defined in:
lib/spider_bot/http/client.rb

Constant Summary

USER_AGENT =

Supported User-Agent aliases

  • bot (default)
  • Linux Firefox (3.6.1)
  • Linux Mozilla
  • Linux Chrome
  • Mac Firefox
  • Mac Safari
  • Mac Chrome
  • Windows IE 6
  • Windows IE 7
  • Windows IE 8
  • Windows IE 9
  • Windows Mozilla
  • iPhone (3.0)
  • iPad
  • Android
{
  'bot' => "bot/#{SpiderBot::VERSION}",
  'Linux Firefox' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1',
  'Linux Mozilla' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624',
  'Linux Chrome' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624  Chrome/26.0.1410.43',
  'Mac Firefox' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) Gecko/20100101 Firefox/35.0',
  'Mac Safari' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/8.0.3 Safari/600.3.18',
  'Mac Chrome' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36',
  'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
  'Windows IE 7' => 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
  'Windows IE 8' => 'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
  'Windows IE 9' => 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',
  'Windows Mozilla' => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
  'iPhone' => 'Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1C28 Safari/419.3',
  'iPad' => 'Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10',
  'Android' => 'Mozilla/5.0 (Linux; U; Android 3.0; en-us) AppleWebKit/534.13 (KHTML, like Gecko) Version/4.0 Safari/534.13'
}
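
The table above maps a short alias to a full User-Agent header value. A minimal sketch of how such a lookup could fall back to the default 'bot' entry follows. The names AGENT_ALIASES and resolve_user_agent are hypothetical and not part of SpiderBot, and the 'bot/0.0.0' string is a placeholder for the real version string.

```ruby
# Hypothetical subset of the alias table, copied from USER_AGENT above.
AGENT_ALIASES = {
  'bot'          => 'bot/0.0.0', # placeholder, not the real SpiderBot::VERSION
  'Mac Safari'   => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/8.0.3 Safari/600.3.18',
  'Windows IE 9' => 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'
}.freeze

# Hypothetical helper: resolve an alias to its header value,
# falling back to the 'bot' entry when the alias is unknown.
def resolve_user_agent(alias_name)
  AGENT_ALIASES.fetch(alias_name) { AGENT_ALIASES.fetch('bot') }
end

puts resolve_user_agent('Windows IE 9')
puts resolve_user_agent('Mac Safri') # misspelled alias, falls back to the bot string
```

Whether the library itself resolves aliases this way is not shown in this page; the constructor only reads USER_AGENT['bot'] directly.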

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(uri = nil, options = nil) {|builder| ... } ⇒ Client

Initializes a new HttpClient.

Examples:

http = HttpClient.new

http = HttpClient.new do |http|
  http.user_agent = "Mac Safari"
  http.url = "http://example.com"
end

Parameters:

  • uri (String) (defaults to: nil)

    the URL the client is created with

  • options (Hash) (defaults to: nil)

    the options used to configure the HTTP client

Options Hash (options):

  • :header (String)

    sets the HTTP request headers

Yields:

  • (Client)

    the client being initialized, so the block can configure it



# File 'lib/spider_bot/http/client.rb', line 76

def initialize(uri = nil, options = nil, &block)
  @url = uri
  @options = options
  @user_agent ||= USER_AGENT['bot']
  yield self if block_given?
end
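
The constructor sets its defaults first and then yields the instance, so a block can override them. The pattern can be sketched in isolation as below. SketchClient and DEFAULT_AGENT are hypothetical stand-ins, and the writer methods are an assumption based on the setters used in the Examples above.

```ruby
# Hypothetical stand-in for the client, reduced to the constructor pattern.
class SketchClient
  DEFAULT_AGENT = 'bot/0.0.0' # placeholder for USER_AGENT['bot']

  # Writers are assumed here because the documented example calls setters.
  attr_accessor :url, :user_agent, :options

  def initialize(uri = nil, options = nil)
    @url = uri
    @options = options
    @user_agent ||= DEFAULT_AGENT
    yield self if block_given? # the block runs last and may override defaults
  end
end

plain = SketchClient.new('http://example.com')
custom = SketchClient.new do |http|
  http.user_agent = 'Mac Safari'
  http.url = 'http://example.com'
end

puts plain.user_agent
puts custom.user_agent
```

Because the block runs after the assignments, anything set inside it wins over both the positional arguments and the default agent.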

Instance Attribute Details

#conn_build ⇒ Object

Returns the value of attribute conn_build.



# File 'lib/spider_bot/http/client.rb', line 21

def conn_build
  @conn_build
end

#connection ⇒ connection

The Faraday connection object

Returns:

  • (connection)



# File 'lib/spider_bot/http/client.rb', line 19

def connection
  @connection
end

#headers ⇒ Object

Returns the value of attribute headers.



# File 'lib/spider_bot/http/client.rb', line 13

def headers
  @headers
end

#options ⇒ Object

Returns the value of attribute options.



# File 'lib/spider_bot/http/client.rb', line 16

def options
  @options
end

#url ⇒ Object

Returns the URL for the HttpClient.



# File 'lib/spider_bot/http/client.rb', line 8

def url
  @url
end

#user_agent ⇒ Object

Returns the HTTP User-Agent for the HttpClient.



# File 'lib/spider_bot/http/client.rb', line 11

def user_agent
  @user_agent
end

Instance Method Details

#builder(&block) ⇒ Object

Stores the given block as conn_build.



# File 'lib/spider_bot/http/client.rb', line 83

def builder(&block)
  @conn_build = block
end

#get(uri, query = {}, &block) ⇒ Object

Sends a GET request with the HttpClient.

Parameters:

  • uri (String)

    URL path for request

  • query (Hash) (defaults to: {})

    additional query parameters for the URL of the request



# File 'lib/spider_bot/http/client.rb', line 155

def get(uri, query = {}, &block) 
  request(:get, uri, query, &block)
end

#post(uri, query = {}, &block) ⇒ Object

Sends a POST request with the HttpClient.

Parameters:

  • uri (String)

    URL path for request

  • query (Hash) (defaults to: {})

    parameters for the request; #request passes them as the request body rather than appending them to the URL



# File 'lib/spider_bot/http/client.rb', line 161

def post(uri, query = {}, &block)
  request(:post, uri, query, &block)
end
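
Both #get and #post delegate to #request, which routes the parameter hash by verb: for :get it goes into the URL, for any other verb it is passed as the body. A simplified sketch of that split, using only the standard library, is shown here. The helper name split_params is hypothetical, and unlike a real connection object it overwrites any query string already present on the URL.

```ruby
require 'uri'

# Hypothetical helper mirroring the split made inside #request:
# GET parameters are encoded into the URL, any other verb keeps them as the body.
# Returns a two-element array: [url, body].
def split_params(verb, url, params = {})
  if verb == :get
    uri = URI(url)
    # Simplification: replaces an existing query string instead of merging.
    uri.query = URI.encode_www_form(params) unless params.empty?
    [uri.to_s, nil]
  else
    [url, params]
  end
end

p split_params(:get, 'http://example.com/search', { q: 'ruby' })
p split_params(:post, 'http://example.com/search', { q: 'ruby' })
```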

#request(verb, uri, query = {}) ⇒ Object

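Makes a request with the HttpClient. Responses with status 301, 302, 303, or 307 are followed by re-issuing the request against the Location header. Any other 2xx or 3xx response is returned wrapped in a Response; every other status returns nil.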

Parameters:

  • verb (Symbol)

    one of :get, :post, :put, :delete

  • uri (String)

    URL path for request

  • query (Hash) (defaults to: {})

    parameters for the request: appended to the URL as a query string for :get, sent as the request body for every other verb



# File 'lib/spider_bot/http/client.rb', line 133

def request(verb, uri, query={})
  verb == :get ? query_get = query : query_post = query
  uri = connection.build_url(uri, query_get)

  response = connection.run_request(verb, uri, query_post, headers) do |request|
    yield request if block_given?
  end
  response = Response.new(response)
  
  case response.status
  when 301, 302, 303, 307
    request(verb, response.headers['location'], query)
  when 200..299, 300..399
    response
  end
end
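
The redirect branch above recurses without a bound, so a redirect loop would never terminate. The sketch below shows the same status handling with a hop limit, run against a stubbed transport instead of a real connection. FakeResponse, follow, and the limit parameter are all hypothetical additions of this sketch, not part of SpiderBot.

```ruby
# Hypothetical response type standing in for the library's Response wrapper.
FakeResponse = Struct.new(:status, :headers)

REDIRECT_STATUSES = [301, 302, 303, 307].freeze

# Follow redirects through `transport` (any callable taking verb and url).
# Raises once more than `limit` redirects have been followed; the hop limit
# is an addition of this sketch and does not exist in the listing above.
def follow(transport, verb, url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit.negative?

  response = transport.call(verb, url)
  if REDIRECT_STATUSES.include?(response.status)
    follow(transport, verb, response.headers['location'], limit - 1)
  elsif (200..399).cover?(response.status)
    response
  end
  # Any other status falls through and returns nil, as in the original.
end

# Stub transport: /old redirects once to /new, anything else is a 404.
routes = {
  'http://example.com/old' => FakeResponse.new(302, { 'location' => 'http://example.com/new' }),
  'http://example.com/new' => FakeResponse.new(200, {})
}
transport = ->(_verb, url) { routes.fetch(url) { FakeResponse.new(404, {}) } }

p follow(transport, :get, 'http://example.com/old').status
p follow(transport, :get, 'http://example.com/missing')
```

Raising on the limit is one design choice among several; returning the last redirect response would also be reasonable, depending on how callers are expected to handle loops.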