Class: SiteMapper::Request

Inherits:
Object
  • Object
Defined in:
lib/site_mapper/request.rb

Overview

Get webpage wrapper.

Constant Summary

INFO_LINK =

'https://rubygems.org/gems/site_mapper'
USER_AGENT =

Request User-Agent

"SiteMapper/#{SiteMapper::VERSION} (+#{INFO_LINK})"

Class Method Summary

Class Method Details

.document(url, options = {}) ⇒ Nokogiri::HTML

Given a URL, fetch it and parse the response body with Nokogiri::HTML.

Parameters:

  • url (String)
  • options (Hash) (defaults to: {})

Returns:

  • (Nokogiri::HTML)

    a nokogiri HTML object



# File 'lib/site_mapper/request.rb', line 16

def document(url, options = {})
  Nokogiri::HTML(Request.response_body(url, options))
end

.resolve_url(url) ⇒ String

Resolves a URL string, following redirects. If the URL can't be resolved, the original URL is returned.

Examples:

Resolve google.com

resolve_url('google.com')
# => 'https://www.google.com'

Parameters:

  • url (String)

    to resolve

Returns:

  • (String)

    a URL string that potentially is a redirected URL



# File 'lib/site_mapper/request.rb', line 60

def resolve_url(url)
  resolved = UrlResolver.resolve(url)
  resolved = resolved.prepend('http://') unless has_protocol?(resolved)
  resolved
end
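
The protocol fallback in resolve_url can be tried in isolation. The following is a minimal sketch, not the gem's implementation: has_protocol? is not shown on this page, so the version here is a hypothetical stand-in that only recognises http:// and https://, and ensure_protocol is an illustrative name. It builds a new string instead of calling prepend, which would fail on a frozen string.

```ruby
# Hypothetical stand-in for the has_protocol? helper, which this page does not show.
def has_protocol?(url)
  url.start_with?('http://', 'https://')
end

# Mirrors only the fallback step of resolve_url; performs no network calls
# and does not follow redirects.
def ensure_protocol(url)
  has_protocol?(url) ? url : "http://#{url}"
end

ensure_protocol('example.com')         # => "http://example.com"
ensure_protocol('https://example.com') # => "https://example.com"
```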

.response(url, options = {}) ⇒ Net::HTTPOK

Given a URL, perform a GET request and return the response.

Examples:

Get example.com (no protocol given) and resolve the URL

Request.response('example.com', resolve: true)

Get http://example.com and do not resolve the URL

Request.response('http://example.com')

Get http://example.com and resolve the URL

Request.response('http://example.com', resolve: true)

Get http://example.com, resolve the URL, and use a custom User-Agent

Request.response('http://example.com', resolve: true, user_agent: 'MyUserAgent')

Parameters:

  • url (String)
  • options (Hash) (defaults to: {})

Returns:

  • (Net::HTTPOK)

    if the response is successful; raises an error otherwise



# File 'lib/site_mapper/request.rb', line 32

def response(url, options = {})
  options = {
    resolve: false,
    user_agent: SiteMapper::USER_AGENT
  }.merge(options)
  resolved_url = options[:resolve] ? resolve_url(url) : url
  uri          = URI.parse(resolved_url)
  http         = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true if resolved_url.start_with?('https://')

  request = Net::HTTP::Get.new(uri.request_uri)
  request['User-Agent'] = options[:user_agent]
  http.request(request)
end
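
To see how the options hash and the URL shape the request without touching the network, the construction steps can be separated from the send. This is a sketch under assumptions: build_request is a hypothetical helper, 'ExampleAgent/1.0' is a placeholder default rather than the gem's USER_AGENT, and SSL is decided from the parsed URI scheme instead of a string prefix check.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds, but does not send, the HTTP client and GET request.
def build_request(url, options = {})
  # Caller-supplied options override the placeholder default.
  options = { user_agent: 'ExampleAgent/1.0' }.merge(options)
  uri  = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port) # no connection is opened here
  http.use_ssl = (uri.scheme == 'https')
  request = Net::HTTP::Get.new(uri.request_uri)
  request['User-Agent'] = options[:user_agent]
  [http, request]
end

http, request = build_request('https://example.com/about?lang=en', user_agent: 'MyUserAgent')
http.use_ssl?         # => true
http.port             # => 443
request.path          # => "/about?lang=en"
request['User-Agent'] # => "MyUserAgent"
```

Calling http.request(request) on the returned pair would then perform the actual GET.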

.response_body(*args) ⇒ Object

Gets the response body; rescues with nil if an exception is raised.

See Also:



# File 'lib/site_mapper/request.rb', line 49

def response_body(*args)
  response(*args).body
end
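
The description mentions rescuing with nil, while the listing above contains no rescue clause. A minimal sketch of the described behaviour, assuming only StandardError subclasses are rescued, could look like this; body_or_nil is a hypothetical helper and is not part of the gem.

```ruby
# Hypothetical helper: returns the block's value, or nil if the block
# raises a StandardError (or any subclass of it).
def body_or_nil
  yield
rescue StandardError
  nil
end

body_or_nil { '<html></html>' }            # => "<html></html>"
body_or_nil { raise IOError, 'timed out' } # => nil
```

Rescuing StandardError rather than Exception leaves signals such as Interrupt untouched, so a long crawl can still be cancelled.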