Class: OpenGraphReader::Fetcher Private

Inherits:
Object
  • Object
Defined in:
lib/open_graph_reader/fetcher.rb

Overview

This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.

Fetch a URI to retrieve its HTML body, if available.

Constant Summary

HEADERS =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

{
  "Accept"     => "text/html",
  "User-Agent" => "OpenGraphReader/#{OpenGraphReader::VERSION} (+https://github.com/jhass/open_graph_reader)"
}.freeze

Instance Method Summary

Constructor Details

#initialize(uri) ⇒ Fetcher

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Create a new fetcher.

Raises:

  • (ArgumentError)


# File 'lib/open_graph_reader/fetcher.rb', line 26

def initialize uri
  raise ArgumentError, "url needs to be an instance of URI" unless uri.is_a? URI
  @uri = uri
  @connection = Faraday.default_connection.dup
  @connection.headers.replace(HEADERS)

  prepend_middleware Faraday::CookieJar if defined? Faraday::CookieJar
  prepend_middleware FaradayMiddleware::FollowRedirects if defined? FaradayMiddleware
end
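
The guard in the constructor works because `URI` is a module that the standard library mixes into its parsed URI classes, so a parsed URI passes while a plain string does not. A minimal sketch using only the standard library; `require_uri` is a hypothetical helper written for illustration and is not part of the library:

```ruby
require "uri"

# Hypothetical helper mirroring the type check in Fetcher#initialize.
def require_uri(value)
  raise ArgumentError, "url needs to be an instance of URI" unless value.is_a? URI
  value
end

# A parsed URI (here a URI::HTTP) includes the URI module, so it passes.
parsed = URI.parse("http://example.com/article")
require_uri(parsed)

# A String is rejected, even if it contains a valid URL.
begin
  require_uri("http://example.com/article")
rescue ArgumentError => e
  rejected_message = e.message
end
```

Callers that hold a string would therefore need to parse it first, for example with `URI.parse`, before handing it to the fetcher.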

Instance Method Details

#body ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

TODO:

Custom error class

Retrieve the body, fetching the page first if it has not been fetched yet.

Raises:

  • (ArgumentError)

    The received content does not seem to be HTML.



# File 'lib/open_graph_reader/fetcher.rb', line 65

def body
  fetch_body unless fetched?
  raise ArgumentError, "Did not receive a HTML site at #{@uri}" unless html?
  @get_response.body
end

#fetch ⇒ Faraday::Response? Also known as: fetch_body

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Fetch the full page.



# File 'lib/open_graph_reader/fetcher.rb', line 46

def fetch
  @get_response = @connection.get(@uri)
rescue Faraday::Error
  # Transport errors are swallowed; the method then returns nil.
end
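
The `?` in the return type follows from the empty `rescue` clause: when the request raises, the method body yields no value and Ruby returns `nil`. A small sketch of that behaviour, where `TransportError` and `attempt` are hypothetical stand-ins for `Faraday::Error` and the fetch call:

```ruby
# Hypothetical error class standing in for Faraday::Error.
class TransportError < StandardError; end

# A method-level rescue with an empty body makes the method return nil
# whenever the rescued error is raised.
def attempt
  yield
rescue TransportError
end

ok     = attempt { :response }                               # value of the block
failed = attempt { raise TransportError, "connection refused" } # nil
```

Code that calls `#fetch` directly should therefore be prepared for a `nil` result rather than an exception.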

#fetch_headers ⇒ Faraday::Response?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Fetch just the headers, using a HEAD request.



# File 'lib/open_graph_reader/fetcher.rb', line 55

def fetch_headers
  @head_response = @connection.head(@uri)
rescue Faraday::Error
  # Transport errors are swallowed; the method then returns nil.
end

#fetched? ⇒ Bool Also known as: fetched_body?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Whether the target URI was fetched.



# File 'lib/open_graph_reader/fetcher.rb', line 86

def fetched?
  !@get_response.nil?
end

#fetched_headers? ⇒ Bool

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Whether the headers of the target URI were fetched, either by a full GET or by a HEAD request.



# File 'lib/open_graph_reader/fetcher.rb', line 94

def fetched_headers?
  !@get_response.nil? || !@head_response.nil?
end

#html? ⇒ Bool

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Whether the target URI seems to return HTML.



# File 'lib/open_graph_reader/fetcher.rb', line 74

def html?
  fetch_headers unless fetched_headers?
  response = @get_response || @head_response
  return false unless response
  return false unless response.success?
  return false unless response["content-type"]
  response["content-type"].include? "text/html"
end
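
The sequence of checks above can be tried in isolation. The following is a sketch only: `FakeResponse` is a hypothetical stand-in for a `Faraday::Response` (it uses a plain, case-sensitive Hash for headers), and `looks_like_html?` repeats the checks without the lazy header fetch:

```ruby
# Hypothetical stand-in for a Faraday::Response, used only for illustration.
FakeResponse = Struct.new(:status, :headers) do
  # Treat any 2xx status as success.
  def success?
    (200..299).cover?(status)
  end

  # Look up a header by its exact name.
  def [](name)
    headers[name]
  end
end

# Same order of checks as Fetcher#html?: a response must exist, be
# successful, carry a content type, and that type must mention text/html.
def looks_like_html?(response)
  return false unless response
  return false unless response.success?
  return false unless response["content-type"]
  response["content-type"].include? "text/html"
end

page  = FakeResponse.new(200, { "content-type" => "text/html; charset=utf-8" })
image = FakeResponse.new(200, { "content-type" => "image/png" })
gone  = FakeResponse.new(404, { "content-type" => "text/html" })
```

Because the final check is a substring match, a content type with parameters such as `; charset=utf-8` is still accepted.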

#url ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

The URL to fetch, as a string.



# File 'lib/open_graph_reader/fetcher.rb', line 39

def url
  @uri.to_s
end