Class: SiteInspector::Domain

Inherits:
Object
Defined in:
lib/site-inspector/domain.rb

Constructor Details

#initialize(host) ⇒ Domain

Returns a new instance of Domain.



# File 'lib/site-inspector/domain.rb', line 5

def initialize(host)
  host = host.downcase
  host = host.sub(/^https?\:/, '')
  host = host.sub(%r{^/+}, '')
  host = host.sub(/^www\./, '')
  uri = Addressable::URI.parse "//#{host}"
  @host = uri.host
end
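
The normalization steps above can be tried in isolation. This is a minimal sketch, assuming Ruby's standard `URI` in place of `Addressable::URI`; `normalize_host` is a hypothetical helper and not part of SiteInspector.

```ruby
require 'uri'

# Hypothetical helper mirroring the steps in #initialize.
# Assumption: stdlib URI stands in for Addressable::URI.
def normalize_host(input)
  host = input.downcase
  host = host.sub(/^https?:/, '')  # drop the scheme, if any
  host = host.sub(%r{^/+}, '')     # drop the leading slashes left behind
  host = host.sub(/^www\./, '')    # drop a leading "www."
  URI.parse("//#{host}").host      # parse as a scheme-less authority
end

normalize_host('HTTPS://www.Example.com/about') # => "example.com"
```

Because the `www.` prefix is stripped here, the domain always stores the root host, and the `www` variants are rebuilt later in `#endpoints`.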

Instance Attribute Details

#host ⇒ Object (readonly)

Returns the value of attribute host.



# File 'lib/site-inspector/domain.rb', line 3

def host
  @host
end

Instance Method Details

#canonical_endpoint ⇒ Object



# File 'lib/site-inspector/domain.rb', line 23

def canonical_endpoint
  @canonical_endpoint ||= begin
    prefetch
    endpoints.find do |e|
      e.https? == canonically_https? && e.www? == canonically_www?
    end
  end
end
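
The selection step can be sketched without any network access. `UriEndpoint` and `pick_canonical` below are hypothetical stand-ins that implement only the two predicates the lookup needs; they are not the library's `Endpoint` class.

```ruby
# Hypothetical stand-in for SiteInspector::Endpoint (illustration only).
UriEndpoint = Struct.new(:uri) do
  def https?
    uri.start_with?('https://')
  end

  def www?
    uri.include?('://www.')
  end
end

# Pick the endpoint whose scheme and www-ness match the two canonical flags.
def pick_canonical(endpoints, canonically_https:, canonically_www:)
  endpoints.find { |e| e.https? == canonically_https && e.www? == canonically_www }
end

endpoints = %w[
  https://example.com https://www.example.com
  http://example.com  http://www.example.com
].map { |u| UriEndpoint.new(u) }

pick_canonical(endpoints, canonically_https: true, canonically_www: false).uri
# => "https://example.com"
```

Since the four endpoints cover every combination of scheme and `www`, exactly one of them matches any pair of flags.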

#canonically_https? ⇒ Boolean

A domain is “canonically” at https if:

* at least one of its https endpoints is live and
  doesn't have an invalid hostname
* both http endpoints are either down or redirect *somewhere*
* at least one http endpoint redirects immediately to
  an *internal* https endpoint

This is meant to affirm situations like:

http:// -> http://www -> https://
https:// -> http:// -> https://www

and meant to avoid affirming situations like:

http:// -> http://non-www
http://www -> http://non-www

or:

http:// -> 200, http://www -> https://www

It allows a site to be canonically HTTPS if the cert has a valid hostname but invalid chain issues.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 152

def canonically_https?
  # Does any endpoint respond?
  return false unless up?

  # At least one of its https endpoints is live and doesn't have an invalid hostname
  return false unless https?

  # Both http endpoints are down
  return true if endpoints.select(&:http?).all? { |e| !e.up? }

  # at least one http endpoint redirects immediately to https
  endpoints.select(&:http?).any? { |e| e.redirect && e.redirect.https? }
end
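
To see the order of the checks in the method above on concrete data, here is a minimal sketch. `ProbeResult` is a hypothetical stand-in for an endpoint, and `valid_cert` is an assumed field that replaces the real `e.https.valid?` call.

```ruby
# Hypothetical stand-in for SiteInspector::Endpoint (illustration only).
ProbeResult = Struct.new(:scheme, :up, :valid_cert, :redirect, keyword_init: true) do
  def https?
    scheme == 'https'
  end

  def http?
    scheme == 'http'
  end

  def up?
    up
  end
end

# Same decision order as the method above, applied to plain data.
def canonically_https?(endpoints)
  return false unless endpoints.any?(&:up?)
  return false unless endpoints.any? { |e| e.https? && e.up? && e.valid_cert }

  http = endpoints.select(&:http?)
  return true if http.none?(&:up?)

  http.any? { |e| e.redirect && e.redirect.https? }
end

https_root = ProbeResult.new(scheme: 'https', up: true, valid_cert: true)
http_redir = ProbeResult.new(scheme: 'http', up: true, redirect: https_root)
http_plain = ProbeResult.new(scheme: 'http', up: true)

canonically_https?([https_root, http_redir]) # => true
canonically_https?([https_root, http_plain]) # => false
canonically_https?([http_plain])             # => false
```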

#canonically_www? ⇒ Boolean

A domain is “canonically” at www if:

* at least one of its www endpoints responds, and
* either both root endpoints are down, or at least one root
  endpoint redirects immediately to an *internal* www endpoint

This is meant to affirm situations like:

http:// -> https:// -> https://www
https:// -> http:// -> https://www

and meant to avoid affirming situations like:

http:// -> http://non-www
http://www -> http://non-www

or like:

https:// -> 200, http:// -> http://www

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 121

def canonically_www?
  # Does any endpoint respond?
  return false unless up?

  # Does at least one www endpoint respond?
  return false unless www?

  # Are both root endpoints down?
  return true if endpoints.select(&:root?).all? { |e| !e.up? }

  # Does either root endpoint redirect to a www endpoint?
  endpoints.select(&:root?).any? { |e| e.redirect && e.redirect.www? }
end

#defaults_https? ⇒ Boolean

A canonically HTTPS site can be said to “default” to HTTPS even if it doesn’t strictly enforce it (e.g. when the www subdomain first redirects to the HTTP root before reaching the HTTPS root).

TODO: not implemented.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 93

def defaults_https?
  fail 'Not implemented. Halp?'
end

#downgrades_https? ⇒ Boolean

HTTPS is “downgraded” if both:

  • HTTPS is supported, and

  • The ‘canonical’ endpoint gets an immediate internal redirect to HTTP.

TODO: the redirect must be internal.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 103

def downgrades_https?
  return false unless https?
  canonical_endpoint.redirect? && canonical_endpoint.redirect.http?
end

#endpoints ⇒ Object



# File 'lib/site-inspector/domain.rb', line 14

def endpoints
  @endpoints ||= [
    Endpoint.new("https://#{host}", domain: self),
    Endpoint.new("https://www.#{host}", domain: self),
    Endpoint.new("http://#{host}", domain: self),
    Endpoint.new("http://www.#{host}", domain: self)
  ]
end

#enforces_https? ⇒ Boolean

HTTPS is enforced if one of the HTTPS endpoints is “up”, and if both HTTP endpoints are either:

* down, or
* redirect immediately to HTTPS.

This is different from whether a domain is “canonically” HTTPS:

  • an HTTP redirect can go to HTTPS on another domain, as long as it’s immediate.

  • a domain with an invalid cert can still be enforcing HTTPS.

TODO: need to ensure the redirect immediately goes to HTTPS.

TODO: don’t need to require that the HTTPS cert is valid for this purpose.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 83

def enforces_https?
  return false unless https?
  endpoints.select(&:http?).all? { |e| !e.up? || (e.redirect && e.redirect.https?) }
end
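
The "every HTTP endpoint" requirement is what separates enforcement from the canonical check. A minimal sketch on stub data follows; `EnforceProbe`, its `redirect_scheme` field, and the `https_supported:` argument are all hypothetical and stand in for the real endpoint objects and the `https?` call.

```ruby
# Hypothetical stand-in for an HTTP endpoint probe (illustration only).
EnforceProbe = Struct.new(:scheme, :up, :redirect_scheme, keyword_init: true) do
  def http?
    scheme == 'http'
  end

  def up?
    up
  end

  def redirects_to_https?
    redirect_scheme == 'https'
  end
end

# Enforced only if HTTPS is supported and *every* HTTP endpoint is
# either down or redirects to HTTPS.
def enforces_https?(endpoints, https_supported:)
  return false unless https_supported

  endpoints.select(&:http?).all? { |e| !e.up? || e.redirects_to_https? }
end

redirecting = EnforceProbe.new(scheme: 'http', up: true, redirect_scheme: 'https')
down        = EnforceProbe.new(scheme: 'http', up: false)
serving     = EnforceProbe.new(scheme: 'http', up: true)

enforces_https?([redirecting, down], https_supported: true)    # => true
enforces_https?([redirecting, serving], https_supported: true) # => false
```

In the second call a single HTTP endpoint that serves content without redirecting is enough to make the result false, even though the other endpoint redirects to HTTPS.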

#government? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 32

def government?
  require 'gman'
  Gman.valid? host
end

#hsts? ⇒ Boolean

HSTS on the canonical domain?

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 180

def hsts?
  canonical_endpoint.hsts && canonical_endpoint.hsts.enabled?
end

#hsts_preload_ready? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 188

def hsts_preload_ready?
  return false unless hsts_subdomains?
  endpoints.find { |e| e.root? && e.https? }.hsts.preload_ready?
end

#hsts_subdomains? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 184

def hsts_subdomains?
  endpoints.find { |e| e.root? && e.https? }.hsts.include_subdomains?
end

#https? ⇒ Boolean

HTTPS is “supported” (different than “canonical” or “enforced”) if:

  • Either of the HTTPS endpoints is listening, and doesn’t have an invalid hostname.

TODO: needs to allow an invalid chain.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 65

def https?
  endpoints.any? { |e| e.https? && e.up? && e.https.valid? }
end

#inspect ⇒ Object



# File 'lib/site-inspector/domain.rb', line 197

def inspect
  "#<SiteInspector::Domain host=\"#{host}\">"
end

#prefetch ⇒ Object

We know most API calls to the domain model are going to require that the root of all four endpoints be called. Rather than process them in serial, let's grab them in parallel and cache the results to speed up later calls.



# File 'lib/site-inspector/domain.rb', line 205

def prefetch
  endpoints.each do |endpoint|
    request = Typhoeus::Request.new(endpoint.uri, SiteInspector.typhoeus_defaults)
    SiteInspector.hydra.queue(request)
  end
  SiteInspector.hydra.run
end
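
The method above relies on Typhoeus's Hydra for the parallel requests. To show the same fetch-in-parallel-then-cache idea without a network or extra gems, the sketch below swaps in plain Ruby threads and a hash cache; `prefetch_all` and its block argument are hypothetical, not part of SiteInspector.

```ruby
# Hypothetical helper: run one fetch per uncached URL in its own thread,
# then store each result in the cache so later lookups are free.
def prefetch_all(urls, cache = {}, &fetch)
  pending = urls.reject { |u| cache.key?(u) }
  threads = pending.map { |u| [u, Thread.new { fetch.call(u) }] }
  threads.each { |u, t| cache[u] = t.value } # Thread#value waits for the thread
  cache
end

calls = []
urls  = %w[http://example.com https://example.com]

# The block stands in for a real HTTP request.
cache = prefetch_all(urls) { |u| calls << u; "response for #{u}" }

# A second pass finds both URLs cached and performs no new fetches.
prefetch_all(urls, cache) { |u| calls << u; 'unused' }
```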

#redirect ⇒ Object

The first endpoint to respond with an external redirect.



# File 'lib/site-inspector/domain.rb', line 175

def redirect
  endpoints.find(&:external_redirect?)
end

#redirect? ⇒ Boolean

A domain redirects if

  1. At least one endpoint is an external redirect, and

  2. All endpoints are either down or an external redirect

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 169

def redirect?
  return false unless redirect
  endpoints.all? { |e| !e.up? || e.external_redirect? }
end
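
The two conditions can be checked on stub data. This is a minimal sketch; `RedirectProbe` and `domain_redirects?` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for an endpoint (illustration only).
RedirectProbe = Struct.new(:up, :external_redirect, keyword_init: true) do
  def up?
    up
  end

  def external_redirect?
    external_redirect
  end
end

# 1. At least one endpoint is an external redirect, and
# 2. every endpoint is either down or an external redirect.
def domain_redirects?(endpoints)
  return false unless endpoints.any?(&:external_redirect?)

  endpoints.all? { |e| !e.up? || e.external_redirect? }
end

external = RedirectProbe.new(up: true, external_redirect: true)
down     = RedirectProbe.new(up: false, external_redirect: false)
serving  = RedirectProbe.new(up: true, external_redirect: false)

domain_redirects?([external, down])    # => true
domain_redirects?([external, serving]) # => false
domain_redirects?([down, down])        # => false
```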

#responds? ⇒ Boolean

Does any endpoint respond to HTTP?

TODO: needs to allow an invalid chain.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 44

def responds?
  endpoints.any?(&:responds?)
end

#root? ⇒ Boolean

Can you connect without www?

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 55

def root?
  endpoints.any? { |e| e.root? && e.up? }
end

#to_h(options = {}) ⇒ Object

Converts the domain to a hash

By default, it only returns domain-wide information and information about the canonical endpoint

It will also pass options along to each endpoint’s to_h method

options:

:all - return information about all endpoints

Returns a complete hash of the domain’s information



# File 'lib/site-inspector/domain.rb', line 224

def to_h(options = {})
  prefetch

  hash = {
    host:               host,
    up:                 up?,
    responds:           responds?,
    www:                www?,
    root:               root?,
    https:              https?,
    enforces_https:     enforces_https?,
    downgrades_https:   downgrades_https?,
    canonically_www:    canonically_www?,
    canonically_https:  canonically_https?,
    redirect:           redirect?,
    hsts:               hsts?,
    hsts_subdomains:    hsts_subdomains?,
    hsts_preload_ready: hsts_preload_ready?,
    canonical_endpoint: canonical_endpoint.to_h(options)
  }

  # Accept the documented :all key as well as the string key 'all'
  if options[:all] || options['all']
    hash.merge!(endpoints: {
                  https: {
                    root: endpoints[0].to_h(options),
                    www:  endpoints[1].to_h(options)
                  },
                  http:  {
                    root: endpoints[2].to_h(options),
                    www:  endpoints[3].to_h(options)
                  }
                })
  end

  hash
end
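
The shape of the optional `endpoints` entry depends on the order of the array returned by `#endpoints` (https root, https www, http root, http www). The sketch below shows that mapping on plain hashes; `domain_hash` is a hypothetical helper, and accepting both a symbol and a string key for the option is an assumption made for the illustration.

```ruby
# Hypothetical helper: build the domain hash from already-computed pieces.
# `summary` is the domain-wide hash; `per_endpoint` holds the four endpoint
# hashes in the order https root, https www, http root, http www.
def domain_hash(summary, per_endpoint, options = {})
  hash = summary.dup
  # Assumption: accept either a symbol or a string key for the option.
  if options[:all] || options['all']
    hash[:endpoints] = {
      https: { root: per_endpoint[0], www: per_endpoint[1] },
      http:  { root: per_endpoint[2], www: per_endpoint[3] }
    }
  end
  hash
end

summary = { host: 'example.com', up: true }
per_endpoint = [
  { uri: 'https://example.com' }, { uri: 'https://www.example.com' },
  { uri: 'http://example.com' },  { uri: 'http://www.example.com' }
]

domain_hash(summary, per_endpoint).key?(:endpoints)                        # => false
domain_hash(summary, per_endpoint, all: true)[:endpoints][:http][:www][:uri] # => "http://www.example.com"
```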

#to_json ⇒ Object



# File 'lib/site-inspector/domain.rb', line 261

def to_json
  to_h.to_json
end

#to_s ⇒ Object



# File 'lib/site-inspector/domain.rb', line 193

def to_s
  host
end

#up? ⇒ Boolean

Does any endpoint return a 2xx or 3xx response code?

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 38

def up?
  endpoints.any?(&:up?)
end

#www? ⇒ Boolean

Can you connect to www?

TODO: These weren’t present before, and may not be useful.

Returns:

  • (Boolean)


# File 'lib/site-inspector/domain.rb', line 50

def www?
  endpoints.any? { |e| e.www? && e.up? }
end