Method: Wgit::Url#initialize

Defined in:
lib/wgit/url.rb

#initialize(url_or_obj, crawled: false, date_crawled: nil, crawl_duration: nil) ⇒ Url

Initializes a new instance of Wgit::Url, which models a web-based HTTP URL.

Parameters:

  • url_or_obj (String, Wgit::Url, #fetch#[])

    Either a String-based URL or an object representing a Database record, e.g. a MongoDB document/object.

  • crawled (Boolean) (defaults to: false)

    Whether the HTML of the URL's web page has been crawled. Only used if url_or_obj is a String.

  • date_crawled (Time) (defaults to: nil)

    Should only be provided if crawled is true. A suitable object can be returned from Wgit::Utils.time_stamp. Only used if url_or_obj is a String.

  • crawl_duration (Float) (defaults to: nil)

    Should only be provided if crawled is true. The duration of the crawl for this Url (in seconds).

Raises:

  • (StandardError)

    If url_or_obj is an object that doesn't respond to the required methods (e.g. #fetch).
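
When url_or_obj isn't a String, the initializer reads each field from the record with #fetch, so any Hash-like object works. The sketch below uses a plain Hash standing in for a hypothetical MongoDB document (the key names match the source listing; the record's values are made up for illustration):

```ruby
# A Hash standing in for a Database record; any object
# responding to #fetch is accepted by the initializer.
record = {
  "url"     => "https://example.com",
  "crawled" => true
}

url            = record.fetch("url")                 # required key; raises KeyError if absent
crawled        = record.fetch("crawled", false)      # defaults to false when missing
crawl_duration = record.fetch("crawl_duration", nil) # absent key falls back to nil

puts url                    # https://example.com
puts crawled                # true
puts crawl_duration.inspect # nil
```

Using #fetch with a default (rather than #[]) makes the required "url" key fail loudly while the optional fields degrade gracefully.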



# File 'lib/wgit/url.rb', line 48

def initialize(
  url_or_obj, crawled: false, date_crawled: nil, crawl_duration: nil
)
  # Init from a URL String.
  if url_or_obj.is_a?(String)
    url = url_or_obj.to_s
  # Else init from a Hash like object e.g. database object.
  else
    obj = url_or_obj
    assert_respond_to(obj, :fetch)

    url            = obj.fetch("url") # Should always be present.
    crawled        = obj.fetch("crawled", false)
    date_crawled   = obj.fetch("date_crawled", nil)
    crawl_duration = obj.fetch("crawl_duration", nil)
    redirects      = obj.fetch("redirects", {})
  end

  @uri            = Addressable::URI.parse(url)
  @crawled        = crawled
  @date_crawled   = date_crawled
  @crawl_duration = crawl_duration
  @redirects      = redirects || {}

  super(url)
end
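
A subtle point in the listing above: when initializing from a String, the redirects local is only ever assigned in the else branch, yet `@redirects = redirects || {}` doesn't raise a NameError. Ruby defines a local as nil once its assignment has been parsed, even if that branch never runs. A minimal, standalone illustration:

```ruby
condition = false
redirects = { "a" => "b" } if condition

# The assignment was parsed, so `redirects` is a defined local
# holding nil rather than an undefined name.
puts redirects.inspect         # nil
puts (redirects || {}).inspect # {}
```

This is why the `|| {}` fallback is needed: it normalizes the String-init path (nil) and a missing "redirects" key to an empty Hash.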