Class: SiteInspector::Domain
- Inherits: Object (SiteInspector::Domain < Object)
- Defined in: lib/site-inspector/domain.rb
Instance Attribute Summary
- #host ⇒ Object (readonly)
  Returns the value of attribute host.
Instance Method Summary
- #canonical_endpoint ⇒ Object
- #canonically_https? ⇒ Boolean
  A domain is “canonically” at https if at least one of its https endpoints is live and doesn’t have an invalid hostname, both http endpoints are either down or redirect somewhere, and at least one http endpoint redirects immediately to an internal https endpoint.
- #canonically_www? ⇒ Boolean
  A domain is “canonically” at www if at least one of its www endpoints responds, and either both root endpoints are down or at least one root endpoint redirects immediately to an internal www endpoint.
- #defaults_https? ⇒ Boolean
  We can say that a canonical HTTPS site “defaults” to HTTPS even if it doesn’t strictly enforce it. TODO: not implemented.
- #downgrades_https? ⇒ Boolean
  HTTPS is “downgraded” if HTTPS is supported and the “canonical” endpoint gets an immediate internal redirect to HTTP.
- #endpoints ⇒ Object
- #enforces_https? ⇒ Boolean
  HTTPS is enforced if one of the HTTPS endpoints is “up”, and if both HTTP endpoints are either down or redirect immediately to HTTPS.
- #government? ⇒ Boolean
- #hsts? ⇒ Boolean
  HSTS on the canonical domain?
- #hsts_preload_ready? ⇒ Boolean
- #hsts_subdomains? ⇒ Boolean
- #https? ⇒ Boolean
  HTTPS is “supported” (different from “canonical” or “enforced”) if either of the HTTPS endpoints is listening and doesn’t have an invalid hostname.
- #initialize(host) ⇒ Domain (constructor)
  A new instance of Domain.
- #inspect ⇒ Object
- #prefetch ⇒ Object
  Requests all four endpoints in parallel and caches the results to speed up later calls.
- #redirect ⇒ Object
  The first endpoint to respond with a redirect.
- #redirect? ⇒ Boolean
  A domain redirects if at least one endpoint is an external redirect and all endpoints are either down or an external redirect.
- #responds? ⇒ Boolean
  Does any endpoint respond to HTTP? TODO: needs to allow an invalid chain.
- #root? ⇒ Boolean
  Can you connect without www?
- #to_h(options = {}) ⇒ Object
  Converts the domain to a hash.
- #to_json(*_args) ⇒ Object
- #to_s ⇒ Object
- #up? ⇒ Boolean
  Does any endpoint return a 200 or 300 response code?
- #www? ⇒ Boolean
  Can you connect to www? TODO: These weren’t present before, and may not be useful.
Constructor Details
#initialize(host) ⇒ Domain
Returns a new instance of Domain.
# File 'lib/site-inspector/domain.rb', line 7
def initialize(host)
  host = host.downcase
  host = host.sub(/^https?:/, '')
  host = host.sub(%r{^/+}, '')
  host = host.sub(/^www\./, '')
  uri = Addressable::URI.parse "//#{host}"
  @host = uri.host
end
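The normalization steps in the constructor can be tried in isolation. The following is a minimal sketch, not the library's code: `normalize_host` is a hypothetical helper, and the standard library's `URI` stands in for `Addressable::URI` so the example has no gem dependencies.

```ruby
require 'uri'

# Hypothetical helper mirroring the constructor's normalization steps.
# Assumption: stdlib URI is used here instead of Addressable::URI.
def normalize_host(input)
  host = input.downcase
  host = host.sub(/^https?:/, '') # drop the scheme
  host = host.sub(%r{^/+}, '')    # drop leading slashes
  host = host.sub(/^www\./, '')   # drop a leading "www."
  URI.parse("//#{host}").host     # keep only the host component
end

puts normalize_host('HTTPS://WWW.Example.com/path') # prints "example.com"
puts normalize_host('example.com')                  # prints "example.com"
```

Because the scheme and the `www.` prefix are both stripped, the same bare host results regardless of which of the four endpoint URLs is passed in.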
Instance Attribute Details
#host ⇒ Object (readonly)
Returns the value of attribute host.
# File 'lib/site-inspector/domain.rb', line 5
def host
  @host
end
Instance Method Details
#canonical_endpoint ⇒ Object
# File 'lib/site-inspector/domain.rb', line 25
def canonical_endpoint
  @canonical_endpoint ||= begin
    prefetch
    endpoints.find do |e|
      e.https? == canonically_https? && e.www? == canonically_www?
    end
  end
end
#canonically_https? ⇒ Boolean
A domain is “canonically” at https if:
* at least one of its https endpoints is live and
doesn't have an invalid hostname
* both http endpoints are either down or redirect *somewhere*
* at least one http endpoint redirects immediately to
an *internal* https endpoint
This is meant to affirm situations like:
http:// -> http://www -> https://
https:// -> http:// -> https://www
and meant to avoid affirming situations like:
http:// -> http://non-www
http://www -> http://non-www
or:
http:// -> 200, http://www -> https://www
It allows a site to be canonically HTTPS if the cert has a valid hostname but invalid chain issues.
# File 'lib/site-inspector/domain.rb', line 156
def canonically_https?
  # Does any endpoint respond?
  return false unless up?

  # At least one of its https endpoints is live and doesn't have an invalid hostname
  return false unless https?

  # Both http endpoints are down
  return true if endpoints.select(&:http?).all? { |e| !e.up? }

  # at least one http endpoint redirects immediately to https
  endpoints.select(&:http?).any? { |e| e.redirect&.https? }
end
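A minimal sketch of the decision order in the method body above, using a hypothetical `HttpsProbe` struct in place of the real endpoint objects. Its fields `up`, `cert_ok`, and `redirect_https` are assumptions made for illustration and are not part of the library. The sketch follows the listed code rather than the prose conditions, so it checks only whether any HTTP endpoint redirects to HTTPS.

```ruby
# Hypothetical stand-in for an endpoint; only the fields the decision needs.
# redirect_https is nil when the endpoint does not redirect to HTTPS.
HttpsProbe = Struct.new(:scheme, :up, :cert_ok, :redirect_https, keyword_init: true) do
  def http?
    scheme == 'http'
  end

  def https?
    scheme == 'https'
  end
end

def canonically_https_sketch(endpoints)
  # Does any endpoint respond?
  return false unless endpoints.any?(&:up)
  # Is at least one HTTPS endpoint live with an acceptable hostname?
  return false unless endpoints.any? { |e| e.https? && e.up && e.cert_ok }

  http = endpoints.select(&:http?)
  # Are all HTTP endpoints down?
  return true if http.none?(&:up)

  # Does at least one HTTP endpoint redirect to HTTPS?
  http.any? { |e| e.redirect_https == true }
end

redirecting = [
  HttpsProbe.new(scheme: 'https', up: true, cert_ok: true),
  HttpsProbe.new(scheme: 'http', up: true, redirect_https: true)
]
plain_http = [
  HttpsProbe.new(scheme: 'https', up: true, cert_ok: true),
  HttpsProbe.new(scheme: 'http', up: true)
]

puts canonically_https_sketch(redirecting) # prints "true"
puts canonically_https_sketch(plain_http)  # prints "false"
```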
#canonically_www? ⇒ Boolean
A domain is “canonically” at www if:
* at least one of its www endpoints responds, and either
* both root endpoints are down, or
* at least one root endpoint redirects immediately to
an *internal* www endpoint
This is meant to affirm situations like:
http:// -> https:// -> https://www
https:// -> http:// -> https://www
and meant to avoid affirming situations like:
http:// -> http://non-www,
http://www -> http://non-www
or like:
https:// -> 200, http:// -> http://www
# File 'lib/site-inspector/domain.rb', line 125
def canonically_www?
  # Does any endpoint respond?
  return false unless up?

  # Does at least one www endpoint respond?
  return false unless www?

  # Are both root endpoints down?
  return true if endpoints.select(&:root?).all? { |e| !e.up? }

  # Does either root endpoint redirect to a www endpoint?
  endpoints.select(&:root?).any? { |e| e.redirect&.www? }
end
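The same decision order can be sketched for the www case. `WwwProbe` and its fields (`www`, `up`, `redirects_to_www`) are hypothetical stand-ins introduced for this example only; the real method works on endpoint objects that perform network requests.

```ruby
# Hypothetical stand-in for an endpoint; www is true for www.<host> endpoints.
# redirects_to_www is nil when the endpoint does not redirect to a www endpoint.
WwwProbe = Struct.new(:www, :up, :redirects_to_www, keyword_init: true) do
  def www?
    www == true
  end

  def root?
    !www?
  end
end

def canonically_www_sketch(endpoints)
  # Does any endpoint respond?
  return false unless endpoints.any?(&:up)
  # Does at least one www endpoint respond?
  return false unless endpoints.any? { |e| e.www? && e.up }

  roots = endpoints.select(&:root?)
  # Are all root endpoints down?
  return true if roots.none?(&:up)

  # Does at least one root endpoint redirect to a www endpoint?
  roots.any? { |e| e.redirects_to_www == true }
end

root_redirects = [
  WwwProbe.new(www: true, up: true),
  WwwProbe.new(www: false, up: true, redirects_to_www: true)
]
root_serves = [
  WwwProbe.new(www: true, up: true),
  WwwProbe.new(www: false, up: true)
]

puts canonically_www_sketch(root_redirects) # prints "true"
puts canonically_www_sketch(root_serves)    # prints "false"
```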
#defaults_https? ⇒ Boolean
We can say that a canonical HTTPS site “defaults” to HTTPS even if it doesn’t strictly enforce it (e.g. when a www subdomain first redirects to the HTTP root before the HTTPS root).
TODO: not implemented.
# File 'lib/site-inspector/domain.rb', line 96
def defaults_https?
  raise 'Not implemented. Halp?'
end
#downgrades_https? ⇒ Boolean
HTTPS is “downgraded” if both:
- HTTPS is supported, and
- The “canonical” endpoint gets an immediate internal redirect to HTTP.
TODO: the redirect must be internal.
# File 'lib/site-inspector/domain.rb', line 106
def downgrades_https?
  return false unless https?

  canonical_endpoint.redirect? && canonical_endpoint.redirect.http?
end
#endpoints ⇒ Object
# File 'lib/site-inspector/domain.rb', line 16
def endpoints
  @endpoints ||= [
    Endpoint.new("https://#{host}", domain: self),
    Endpoint.new("https://www.#{host}", domain: self),
    Endpoint.new("http://#{host}", domain: self),
    Endpoint.new("http://www.#{host}", domain: self)
  ]
end
#enforces_https? ⇒ Boolean
HTTPS is enforced if one of the HTTPS endpoints is “up”, and if both HTTP endpoints are either:
* down, or
* redirect immediately to HTTPS.
This is different from whether a domain is “canonically” HTTPS:
- An HTTP redirect can go to HTTPS on another domain, as long as it’s immediate.
- A domain with an invalid cert can still be enforcing HTTPS.
TODO: need to ensure the redirect immediately goes to HTTPS.
TODO: don’t need to require that the HTTPS cert is valid for this purpose.
# File 'lib/site-inspector/domain.rb', line 85
def enforces_https?
  return false unless https?

  endpoints.select(&:http?).all? { |e| !e.up? || e.redirect&.https? }
end
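The key difference from the canonical check is the quantifier: every HTTP endpoint must be down or redirect to HTTPS, not just one. The sketch below illustrates this with a hypothetical `EnforceProbe` struct whose fields are assumptions for the example and not library API.

```ruby
# Hypothetical stand-in for an endpoint.
# redirect_https is nil when the endpoint does not redirect to HTTPS.
EnforceProbe = Struct.new(:scheme, :up, :cert_ok, :redirect_https, keyword_init: true)

def enforces_https_sketch(endpoints)
  # HTTPS must be supported at all before it can be enforced.
  supported = endpoints.any? { |e| e.scheme == 'https' && e.up && e.cert_ok }
  return false unless supported

  # Every HTTP endpoint must be down or redirect to HTTPS.
  endpoints.select { |e| e.scheme == 'http' }
           .all? { |e| !e.up || e.redirect_https == true }
end

https_ok = EnforceProbe.new(scheme: 'https', up: true, cert_ok: true)

# One HTTP endpoint redirects, the other still serves content over HTTP.
mixed = [
  https_ok,
  EnforceProbe.new(scheme: 'http', up: true, redirect_https: true),
  EnforceProbe.new(scheme: 'http', up: true)
]

# One HTTP endpoint redirects, the other is down.
strict = [
  https_ok,
  EnforceProbe.new(scheme: 'http', up: true, redirect_https: true),
  EnforceProbe.new(scheme: 'http', up: false)
]

puts enforces_https_sketch(mixed)  # prints "false"
puts enforces_https_sketch(strict) # prints "true"
```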
#government? ⇒ Boolean
# File 'lib/site-inspector/domain.rb', line 34
def government?
  require 'gman'
  Gman.valid? host
end
#hsts? ⇒ Boolean
HSTS on the canonical domain?
# File 'lib/site-inspector/domain.rb', line 185
def hsts?
  canonical_endpoint.hsts&.enabled?
end
#hsts_preload_ready? ⇒ Boolean
# File 'lib/site-inspector/domain.rb', line 193
def hsts_preload_ready?
  return false unless hsts_subdomains?

  endpoints.find { |e| e.root? && e.https? }.hsts.preload_ready?
end
#hsts_subdomains? ⇒ Boolean
# File 'lib/site-inspector/domain.rb', line 189
def hsts_subdomains?
  endpoints.find { |e| e.root? && e.https? }.hsts.include_subdomains?
end
#https? ⇒ Boolean
HTTPS is “supported” (different from “canonical” or “enforced”) if:
- Either of the HTTPS endpoints is listening, and doesn’t have an invalid hostname.
TODO: needs to allow an invalid chain.
# File 'lib/site-inspector/domain.rb', line 67
def https?
  endpoints.any? { |e| e.https? && e.up? && e.https.valid? }
end
#inspect ⇒ Object
# File 'lib/site-inspector/domain.rb', line 203
def inspect
  "#<SiteInspector::Domain host=\"#{host}\">"
end
#prefetch ⇒ Object
We know most API calls to the domain model are going to require that the root of all four endpoints be called. Rather than process them in serial, let’s grab them in parallel and cache the results to speed up later calls.
# File 'lib/site-inspector/domain.rb', line 211
def prefetch
  endpoints.each do |endpoint|
    request = Typhoeus::Request.new(endpoint.uri, SiteInspector.typhoeus_defaults)
    SiteInspector.hydra.queue(request)
  end
  SiteInspector.hydra.run
end
#redirect ⇒ Object
The first endpoint to respond with an external redirect.
# File 'lib/site-inspector/domain.rb', line 180
def redirect
  endpoints.find(&:external_redirect?)
end
#redirect? ⇒ Boolean
A domain redirects if:
- At least one endpoint is an external redirect, and
- All endpoints are either down or an external redirect.
# File 'lib/site-inspector/domain.rb', line 173
def redirect?
  return false unless redirect

  endpoints.all? { |e| !e.up? || e.external_redirect? }
end
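The two conditions above can be exercised without network access. This is a sketch under stated assumptions: `RedirectProbe` and `domain_redirects?` are hypothetical names for illustration, and the `external_redirect` flag stands in for whatever the real endpoint's `external_redirect?` computes.

```ruby
# Hypothetical stand-in for an endpoint.
RedirectProbe = Struct.new(:up, :external_redirect, keyword_init: true)

def domain_redirects?(endpoints)
  # At least one endpoint must be an external redirect.
  return false unless endpoints.any?(&:external_redirect)

  # Every endpoint must be either down or an external redirect.
  endpoints.all? { |e| !e.up || e.external_redirect == true }
end

# One endpoint redirects off-domain; the rest are down.
moved = [
  RedirectProbe.new(up: true, external_redirect: true),
  RedirectProbe.new(up: false, external_redirect: false)
]

# One endpoint redirects off-domain, but another still serves content.
partial = [
  RedirectProbe.new(up: true, external_redirect: true),
  RedirectProbe.new(up: true, external_redirect: false)
]

puts domain_redirects?(moved)   # prints "true"
puts domain_redirects?(partial) # prints "false"
```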
#responds? ⇒ Boolean
Does any endpoint respond to HTTP? TODO: needs to allow an invalid chain.
# File 'lib/site-inspector/domain.rb', line 46
def responds?
  endpoints.any?(&:responds?)
end
#root? ⇒ Boolean
Can you connect without www?
# File 'lib/site-inspector/domain.rb', line 57
def root?
  endpoints.any? { |e| e.root? && e.up? }
end
#to_h(options = {}) ⇒ Object
Converts the domain to a hash.
By default, it only returns domain-wide information and information about the canonical endpoint.
It will also pass options along to each endpoint’s to_h method.
options:
:all - return information about all endpoints
Returns a complete hash of the domain’s information.
# File 'lib/site-inspector/domain.rb', line 230
def to_h(options = {})
  prefetch

  hash = {
    host: host,
    up: up?,
    responds: responds?,
    www: www?,
    root: root?,
    https: https?,
    enforces_https: enforces_https?,
    downgrades_https: downgrades_https?,
    canonically_www: canonically_www?,
    canonically_https: canonically_https?,
    redirect: redirect?,
    hsts: hsts?,
    hsts_subdomains: hsts_subdomains?,
    hsts_preload_ready: hsts_preload_ready?,
    canonical_endpoint: canonical_endpoint.to_h(options)
  }

  if options['all']
    hash[:endpoints] = {
      https: {
        root: endpoints[0].to_h(options),
        www: endpoints[1].to_h(options)
      },
      http: {
        root: endpoints[2].to_h(options),
        www: endpoints[3].to_h(options)
      }
    }
  end

  hash
end
#to_json(*_args) ⇒ Object
# File 'lib/site-inspector/domain.rb', line 267
def to_json(*_args)
  to_h.to_json
end
#to_s ⇒ Object
# File 'lib/site-inspector/domain.rb', line 199
def to_s
  host
end
#up? ⇒ Boolean
Does any endpoint return a 200 or 300 response code?
# File 'lib/site-inspector/domain.rb', line 40
def up?
  endpoints.any?(&:up?)
end
#www? ⇒ Boolean
TODO: These weren’t present before, and may not be useful. Can you connect to www?
# File 'lib/site-inspector/domain.rb', line 52
def www?
  endpoints.any? { |e| e.www? && e.up? }
end