Class: HTTP::URI

Inherits:
Object
Defined in:
lib/http/uri.rb,
lib/http/uri/parsing.rb,
lib/http/uri/normalizer.rb

Overview

URI normalization and dot-segment removal

Defined Under Namespace

Classes: InvalidError

Constant Summary

HTTP_SCHEME =

HTTP scheme string

"http"
HTTPS_SCHEME =

HTTPS scheme string

"https"
PERCENT_ENCODE =

Pattern matching characters requiring percent-encoding

/[^\x21-\x7E]+/
DEFAULT_PORTS =

Default ports for supported URI schemes

{
  "http"  => 80,
  "https" => 443,
  "ws"    => 80,
  "wss"   => 443
}.freeze
NEEDS_ADDRESSABLE =

Pattern for characters that stdlib’s URI.parse silently modifies

/[^\x20-\x7E]/
NORMALIZER =

Default URI normalizer

lambda do |uri|
  uri = HTTP::URI.parse uri
  scheme = uri.scheme&.downcase
  host = uri.normalized_host
  host = "[#{host}]" if host&.include?(":")
  default_port = scheme == HTTPS_SCHEME ? 443 : 80

  HTTP::URI.new(
    scheme:   scheme,
    user:     uri.user,
    password: uri.password,
    host:     host,
    port:     (uri.port == default_port ? nil : uri.port),
    path:     uri.path.empty? ? "/" : percent_encode(remove_dot_segments(uri.path)),
    query:    percent_encode(uri.query),
    fragment: uri.fragment
  )
end
DOT_SEGMENTS =

Standalone dot segments that terminate the algorithm

%w[. ..].freeze
SINGLE_DOT_SEGMENT =

Matches “/.” followed by “/” or end-of-string

%r{\A/\.(?:/|\z)}
DOUBLE_DOT_SEGMENT =

Matches “/..” followed by “/” or end-of-string

%r{\A/\.\.(?:/|\z)}
LAST_SEGMENT =

Matches the last segment in a path (everything after the final “/”)

%r{/[^/]*\z}
FIRST_SEGMENT =

Matches the first path segment, with or without a leading “/”

%r{\A/?[^/]*}
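
The five dot-segment constants above map onto the steps of the "remove dot segments" routine in RFC 3986, section 5.2.4. The library's own implementation is not shown on this page, so the following is only a sketch of how such a routine could use them. The method name `remove_dot_segments` is taken from the NORMALIZER source; the body is an assumption.

```ruby
# Constants copied from the summary above.
DOT_SEGMENTS       = %w[. ..].freeze
SINGLE_DOT_SEGMENT = %r{\A/\.(?:/|\z)}
DOUBLE_DOT_SEGMENT = %r{\A/\.\.(?:/|\z)}
LAST_SEGMENT       = %r{/[^/]*\z}
FIRST_SEGMENT      = %r{\A/?[^/]*}

# Assumed implementation following RFC 3986, section 5.2.4.
def remove_dot_segments(path)
  input  = path.dup
  output = +""

  until input.empty?
    if input.start_with?("../")
      input = input[3..]                          # step A: drop leading "../"
    elsif input.start_with?("./")
      input = input[2..]                          # step A: drop leading "./"
    elsif input.match?(SINGLE_DOT_SEGMENT)
      input = input.sub(SINGLE_DOT_SEGMENT, "/")  # step B: "/./" or "/." becomes "/"
    elsif input.match?(DOUBLE_DOT_SEGMENT)
      input = input.sub(DOUBLE_DOT_SEGMENT, "/")  # step C: "/../" or "/.." becomes "/"
      output.sub!(LAST_SEGMENT, "")               # ...and the last output segment is dropped
    elsif DOT_SEGMENTS.include?(input)
      input = +""                                 # step D: a bare "." or ".." ends the loop
    else
      segment = input[FIRST_SEGMENT]              # step E: move one segment to the output
      output << segment
      input = input[segment.length..]
    end
  end

  output
end

puts remove_dot_segments("/a/b/c/./../../g") # /a/g
puts remove_dot_segments("mid/content=5/../6") # mid/6
```

Both inputs are the worked examples from the RFC text, which makes them convenient regression cases.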

Constructor Details

#initialize(scheme: nil, user: nil, password: nil, host: nil, port: nil, path: nil, query: nil, fragment: nil) ⇒ HTTP::URI

Creates an HTTP::URI instance from the given keyword arguments

Examples:

HTTP::URI.new(scheme: "http", host: "example.com")

Parameters:

  • scheme (String, nil) (defaults to: nil)

    URI scheme

  • user (String, nil) (defaults to: nil)

    for basic authentication

  • password (String, nil) (defaults to: nil)

    for basic authentication

  • host (String, nil) (defaults to: nil)

    name component (IPv6 addresses must be bracketed)

  • port (Integer, nil) (defaults to: nil)

    network port to connect to

  • path (String, nil) (defaults to: nil)

    component to request

  • query (String, nil) (defaults to: nil)

    component distinct from path

  • fragment (String, nil) (defaults to: nil)

    component at the end of the URI



# File 'lib/http/uri.rb', line 127

def initialize(scheme: nil, user: nil, password: nil, host: nil,
               port: nil, path: nil, query: nil, fragment: nil)
  @scheme   = scheme
  @user     = user
  @password = password
  @raw_host = host
  @host     = process_ipv6_brackets(host)
  @normalized_host = normalize_host(@host)
  @port     = port
  @path     = path || ""
  @query    = query
  @fragment = fragment
end

Instance Attribute Details

#fragment ⇒ String? (readonly)

URI fragment

Examples:

uri.fragment # => "section1"

Returns:

  • (String, nil)

    The fragment component



# File 'lib/http/uri.rb', line 84

def fragment
  @fragment
end

#host ⇒ String?

Host, either a domain name or IP address

Examples:

uri.host # => "example.com"

Returns:

  • (String, nil)

    The host of the URI



# File 'lib/http/uri.rb', line 48

def host
  @host
end

#normalized_host ⇒ String? (readonly)

Normalized host

Examples:

uri.normalized_host # => "example.com"

Returns:

  • (String, nil)

    The normalized host of the URI



# File 'lib/http/uri.rb', line 57

def normalized_host
  @normalized_host
end

#password ⇒ String? (readonly)

Password component for authentication

Examples:

uri.password # => "secret"

Returns:

  • (String, nil)

    The password component



# File 'lib/http/uri.rb', line 39

def password
  @password
end

#path ⇒ String

URI path component

Examples:

uri.path # => "/foo"

Returns:

  • (String)

    The path component



# File 'lib/http/uri.rb', line 66

def path
  @path
end

#query ⇒ String?

URI query string

Examples:

uri.query # => "q=1"

Returns:

  • (String, nil)

    The query component



# File 'lib/http/uri.rb', line 75

def query
  @query
end

#scheme ⇒ String? (readonly)

URI scheme (e.g. “http”, “https”)

Examples:

uri.scheme # => "http"

Returns:

  • (String, nil)

    The URI scheme



# File 'lib/http/uri.rb', line 21

def scheme
  @scheme
end

#user ⇒ String? (readonly)

User component for authentication

Examples:

uri.user # => "admin"

Returns:

  • (String, nil)

    The user component



# File 'lib/http/uri.rb', line 30

def user
  @user
end

Class Method Details

.form_encode(form_values, sort: false) ⇒ String

Encodes key/value pairs as application/x-www-form-urlencoded

Examples:

HTTP::URI.form_encode(foo: "bar")

Parameters:

  • form_values (#to_hash, #to_ary)

    to encode

  • sort (TrueClass, FalseClass) (defaults to: false)

    should key/value pairs be sorted first?

Returns:

  • (String)

    encoded value



# File 'lib/http/uri/parsing.rb', line 37

def self.form_encode(form_values, sort: false)
  return ::URI.encode_www_form(form_values) unless sort

  ::URI.encode_www_form(form_values.sort_by { |k, _| String(k) })
end
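
Since .form_encode delegates to the standard library's `URI.encode_www_form`, its behavior can be tried without loading this class. The sketch below copies the method body into a top-level helper to show the effect of `sort:`. The sample hash is illustrative.

```ruby
require "uri"

# Top-level copy of the method body shown above.
def form_encode(form_values, sort: false)
  return URI.encode_www_form(form_values) unless sort

  URI.encode_www_form(form_values.sort_by { |k, _| String(k) })
end

values = { b: 2, a: "x y" }

puts form_encode(values)             # b=2&a=x+y
puts form_encode(values, sort: true) # a=x+y&b=2
```

Sorting compares keys as strings, so symbol and string keys can be mixed in one call.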

.idna_to_ascii(host) ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Convert a hostname to ASCII via IDNA (requires addressable)

Parameters:

  • host (String)

    hostname to encode

Returns:

  • (String)

    ASCII-encoded hostname



# File 'lib/http/uri/parsing.rb', line 72

def self.idna_to_ascii(host)
  return host if host.ascii_only?

  require_addressable
  Addressable::IDNA.to_ascii(host) # steep:ignore
end

.parse(uri) ⇒ HTTP::URI

Parse the given URI string, returning an HTTP::URI object

Examples:

HTTP::URI.parse("http://example.com/path")

Parameters:

Returns:

Raises:



# File 'lib/http/uri/parsing.rb', line 15

def self.parse(uri)
  return uri if uri.is_a?(self)
  raise InvalidError, "invalid URI: nil" if uri.nil?

  uri_string = begin
    String(uri)
  rescue TypeError, NoMethodError
    raise InvalidError, "invalid URI: #{uri.inspect}"
  end
  new(**parse_components(uri_string))
end
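
The explicit nil check in .parse matters because `Kernel#String` converts nil to an empty string instead of raising. The sketch below isolates that guard. `coerce_uri_string` is a hypothetical helper, and the `InvalidError` defined here is a stand-in whose superclass is an assumption.

```ruby
# Stand-in for HTTP::URI::InvalidError; the real superclass is not shown on this page.
class InvalidError < StandardError; end

# Hypothetical helper mirroring the input checks in .parse.
def coerce_uri_string(uri)
  raise InvalidError, "invalid URI: nil" if uri.nil?

  begin
    String(uri)
  rescue TypeError, NoMethodError
    raise InvalidError, "invalid URI: #{uri.inspect}"
  end
end

p String(nil)                                  # "" (no error, hence the explicit guard)
p coerce_uri_string("http://example.com/path") # "http://example.com/path"

begin
  coerce_uri_string(nil)
rescue InvalidError => e
  puts e.message # invalid URI: nil
end
```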

.percent_encode(string) ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Percent-encode matching characters in a string

Parameters:

  • string (String)

    raw string

Returns:

  • (String)

    encoded value



# File 'lib/http/uri/parsing.rb', line 49

def self.percent_encode(string)
  string&.gsub(PERCENT_ENCODE) do |substr|
    substr.bytes.map { |c| format("%%%02X", c) }.join
  end
end
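
To see what PERCENT_ENCODE actually selects, the method can be run as a standalone function. This sketch copies the constant and method body from this page. The sample strings are illustrative.

```ruby
# Copied from the constant summary: anything outside printable, non-space ASCII.
PERCENT_ENCODE = /[^\x21-\x7E]+/

# Top-level copy of the method body shown above.
def percent_encode(string)
  string&.gsub(PERCENT_ENCODE) do |substr|
    substr.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

puts percent_encode("/caf\u00E9 menu") # /caf%C3%A9%20menu
puts percent_encode("/a%20b")          # /a%20b
p percent_encode(nil)                  # nil
```

Multibyte characters are encoded byte by byte, and an existing "%" is left alone because it falls inside the \x21-\x7E range, so already-encoded input is not double-encoded.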

.require_addressable ⇒ void

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

This method returns an undefined value.

Loads the addressable gem on first use

Raises:

  • (LoadError)

    if addressable gem is not installed



# File 'lib/http/uri/parsing.rb', line 60

def self.require_addressable
  return if defined?(@addressable_loaded)

  require "addressable/uri"
  @addressable_loaded = true
end

Instance Method Details

#==(other) ⇒ TrueClass, FalseClass

Are these URI objects equal after normalization

Examples:

HTTP::URI.parse("http://example.com") == HTTP::URI.parse("http://example.com")

Parameters:

  • other (Object)

    URI to compare this one with

Returns:

  • (TrueClass, FalseClass)

    are the URIs equivalent (after normalization)?



# File 'lib/http/uri.rb', line 150

def ==(other)
  other.is_a?(URI) && String(normalize).eql?(String(other.normalize))
end

#deconstruct_keys(keys) ⇒ Hash{Symbol => Object}

Pattern matching interface

Examples:

uri.deconstruct_keys(%i[scheme host])

Parameters:

  • keys (Array<Symbol>, nil)

    keys to extract, or nil for all

Returns:

  • (Hash{Symbol => Object})


# File 'lib/http/uri.rb', line 367

def deconstruct_keys(keys)
  hash = { scheme: @scheme, host: @host, port: port, path: @path,
           query: @query, fragment: @fragment, user: @user, password: @password }
  keys ? hash.slice(*keys) : hash
end
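
Defining `deconstruct_keys` is what lets an object appear in a `case`/`in` hash pattern (Ruby 3.0 or later). The sketch below uses a hypothetical stand-in class, `MatchableURI`, with the same method shape, so it runs without this library. The `describe` helper is also an assumption made for illustration.

```ruby
# Hypothetical stand-in with the same deconstruct_keys shape as above.
class MatchableURI
  def initialize(scheme:, host:, port:, path:)
    @scheme = scheme
    @host   = host
    @port   = port
    @path   = path
  end

  def deconstruct_keys(keys)
    hash = { scheme: @scheme, host: @host, port: @port, path: @path }
    keys ? hash.slice(*keys) : hash
  end
end

# Illustrative helper: branches on the scheme and binds other components.
def describe(uri)
  case uri
  in { scheme: "https", host: }
    "secure connection to #{host}"
  in { scheme: "http", host:, port: }
    "plain connection to #{host}:#{port}"
  else
    "unsupported"
  end
end

puts describe(MatchableURI.new(scheme: "https", host: "example.com", port: 443, path: "/"))
puts describe(MatchableURI.new(scheme: "http", host: "example.com", port: 80, path: "/"))
```

Ruby passes only the keys named in the pattern, which is why the method slices the hash instead of always returning every component.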

#default_port ⇒ Integer?

Default port for the URI scheme

Examples:

HTTP::URI.parse("http://example.com").default_port # => 80

Returns:

  • (Integer, nil)

    default port or nil for unknown schemes



# File 'lib/http/uri.rb', line 212

def default_port
  DEFAULT_PORTS[@scheme&.downcase]
end

#dup ⇒ HTTP::URI

Duplicates the URI object

Examples:

HTTP::URI.parse("http://example.com").dup

Returns:

  • (HTTP::URI)

    a new URI with the same components



# File 'lib/http/uri.rb', line 323

def dup
  self.class.new(
    scheme: @scheme, user: @user, password: @password, host: @raw_host,
    port: @port, path: @path, query: @query, fragment: @fragment
  )
end

#eql?(other) ⇒ TrueClass, FalseClass

Are these URI objects equal without normalization

Examples:

uri = HTTP::URI.parse("http://example.com")
uri.eql?(HTTP::URI.parse("http://example.com"))

Parameters:

  • other (Object)

    URI to compare this one with

Returns:

  • (TrueClass, FalseClass)

    are the URIs equivalent?



# File 'lib/http/uri.rb', line 164

def eql?(other)
  other.is_a?(URI) && String(self).eql?(String(other))
end

#hash ⇒ Integer

Hash value based on the class and the serialized (not normalized) form of the URI, consistent with #eql?

Examples:

HTTP::URI.parse("http://example.com").hash

Returns:

  • (Integer)

    A hash of the URI



# File 'lib/http/uri.rb', line 175

def hash
  @hash ||= [self.class, String(self)].hash
end
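
The split between #== (normalized comparison) and #eql?/#hash (exact string comparison) decides how URIs behave as Hash keys. The sketch below reproduces that split with a hypothetical stand-in class, `MiniURI`. Its `normalize` simply downcases the whole string, which is a deliberate simplification and not what this library does.

```ruby
# Hypothetical stand-in that copies the ==, eql? and hash definitions above.
class MiniURI
  def initialize(string)
    @string = string
  end

  def to_s
    @string
  end

  # Simplification: the real #normalize only lowercases scheme and host.
  def normalize
    self.class.new(@string.downcase)
  end

  def ==(other)
    other.is_a?(MiniURI) && String(normalize).eql?(String(other.normalize))
  end

  def eql?(other)
    other.is_a?(MiniURI) && String(self).eql?(String(other))
  end

  def hash
    @hash ||= [self.class, String(self)].hash
  end
end

a = MiniURI.new("http://EXAMPLE.com/")
b = MiniURI.new("http://example.com/")

p a == b                  # true
p a.eql?(b)               # false
p({ a => 1, b => 2 }.size) # 2
```

Callers that want equivalent URIs to share one Hash entry would need to normalize before using them as keys.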

#http? ⇒ True, False

Checks whether the URI scheme is HTTP

Examples:

HTTP::URI.parse("http://example.com").http?

Returns:

  • (True)

    if URI is HTTP

  • (False)

    otherwise



# File 'lib/http/uri.rb', line 300

def http?
  HTTP_SCHEME.eql?(@scheme)
end

#https? ⇒ True, False

Checks whether the URI scheme is HTTPS

Examples:

HTTP::URI.parse("https://example.com").https?

Returns:

  • (True)

    if URI is HTTPS

  • (False)

    otherwise



# File 'lib/http/uri.rb', line 312

def https?
  HTTPS_SCHEME.eql?(@scheme)
end

#inspect ⇒ String

Returns human-readable representation of URI

Examples:

HTTP::URI.parse("http://example.com").inspect

Returns:

  • (String)

    human-readable representation of URI



# File 'lib/http/uri.rb', line 355

def inspect
  format("#<%s:0x%014x URI:%s>", self.class, object_id << 1, self)
end

#join(other) ⇒ HTTP::URI

Resolves another URI against this one per RFC 3986

Examples:

HTTP::URI.parse("http://example.com/foo/").join("bar")

Parameters:

  • other (String, URI)

    the URI to resolve

Returns:

  • (HTTP::URI)

    the resolved URI



# File 'lib/http/uri.rb', line 263

def join(other)
  base = self.class.percent_encode(String(self))
  ref  = self.class.percent_encode(String(other))
  self.class.parse(::URI.join(base, ref))
end
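
The resolution itself is done by the standard library's `URI.join`, so the RFC 3986 rules can be explored with stdlib alone. In the sketch below, `resolve` is a hypothetical helper that returns the joined result as a String. It skips the percent-encoding step that #join applies first, so it only accepts ASCII input.

```ruby
require "uri"

# Hypothetical helper: stdlib-only reference resolution, ASCII input only.
def resolve(base, ref)
  URI.join(base, ref).to_s
end

puts resolve("http://example.com/foo/", "bar")      # http://example.com/foo/bar
puts resolve("http://example.com/foo", "bar")       # http://example.com/bar
puts resolve("http://example.com/foo/", "../baz")   # http://example.com/baz
puts resolve("http://example.com/foo/", "/abs?q=1") # http://example.com/abs?q=1
```

The trailing slash on the base matters: without it, the last path segment is replaced rather than extended.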

#normalize ⇒ HTTP::URI

Returns a normalized copy of the URI

Lowercases the scheme and host, strips the port when it equals the scheme's default, and sets an empty path to "/" when a host is present. Used by #== to compare URIs for equivalence.

Examples:

HTTP::URI.parse("HTTP://EXAMPLE.COM:80").normalize

Returns:

  • (HTTP::URI)

    normalized URI



# File 'lib/http/uri.rb', line 279

def normalize
  self.class.new(
    scheme:   @scheme&.downcase,
    user:     @user,
    password: @password,
    host:     @raw_host&.downcase,
    port:     (@port unless port.eql?(default_port)),
    path:     @path.empty? && @raw_host ? "/" : @path,
    query:    @query,
    fragment: @fragment
  )
end

#omit(*components) ⇒ HTTP::URI

Returns a new URI with the specified components removed

Examples:

HTTP::URI.parse("http://example.com#frag").omit(:fragment)

Parameters:

  • components (Symbol)

    URI components to remove

Returns:

  • (HTTP::URI)

    new URI without the specified components



# File 'lib/http/uri.rb', line 247

def omit(*components)
  self.class.new(
    **{ scheme: @scheme, user: @user, password: @password, host: @raw_host,
        port: @port, path: @path, query: @query, fragment: @fragment }.except(*components)
  )
end

#origin ⇒ String

The origin (scheme + host + port) per RFC 6454

Examples:

HTTP::URI.parse("http://example.com").origin # => "http://example.com"

Returns:

  • (String)

    origin of the URI



# File 'lib/http/uri.rb', line 223

def origin
  port_suffix = ":#{port}" unless port.eql?(default_port)
  "#{String(@scheme).downcase}://#{String(@raw_host).downcase}#{port_suffix}"
end
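
The origin logic combines #port, #default_port and DEFAULT_PORTS. As a rough illustration, the sketch below folds those three pieces into a hypothetical top-level function, `origin_for`, which takes the components as plain arguments instead of reading instance variables.

```ruby
# Copied from the constant summary.
DEFAULT_PORTS = { "http" => 80, "https" => 443, "ws" => 80, "wss" => 443 }.freeze

# Hypothetical function combining the #port, #default_port and #origin bodies.
def origin_for(scheme, host, explicit_port = nil)
  default_port = DEFAULT_PORTS[scheme&.downcase]
  port         = explicit_port || default_port

  # The suffix stays nil (and interpolates as "") when the port is the default.
  port_suffix = ":#{port}" unless port.eql?(default_port)
  "#{String(scheme).downcase}://#{String(host).downcase}#{port_suffix}"
end

puts origin_for("HTTP", "Example.COM", 80)     # http://example.com
puts origin_for("https", "example.com", 8443)  # https://example.com:8443
puts origin_for("ws", "example.com")           # ws://example.com
```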

#port ⇒ Integer?

Port number, either as specified or the default

Examples:

HTTP::URI.parse("http://example.com").port

Returns:

  • (Integer, nil)

    port number



# File 'lib/http/uri.rb', line 201

def port
  @port || default_port
end

#request_uri ⇒ String

The path and query for use in an HTTP request line

Examples:

HTTP::URI.parse("http://example.com/path?q=1").request_uri # => "/path?q=1"

Returns:

  • (String)

    request URI string



# File 'lib/http/uri.rb', line 235

def request_uri
  "#{'/' if @path.empty?}#{@path}#{"?#{@query}" if @query}"
end
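
The one-line body packs two conditionals into a single interpolated string. The sketch below lifts it into a hypothetical top-level function, `request_uri_for`, that takes `path` and `query` as arguments, to make the edge cases easy to check.

```ruby
# Hypothetical function with the same interpolation as the method body above.
def request_uri_for(path, query)
  "#{'/' if path.empty?}#{path}#{"?#{query}" if query}"
end

puts request_uri_for("/path", "q=1") # /path?q=1
puts request_uri_for("", nil)        # /
puts request_uri_for("", "q=1")      # /?q=1
puts request_uri_for("/path", "")    # /path?
```

An empty path is replaced by "/", and an empty (but non-nil) query still produces the "?" separator.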

#to_s ⇒ String Also known as: to_str

Convert an HTTP::URI to a String

Examples:

HTTP::URI.parse("http://example.com").to_s

Returns:

  • (String)

    URI serialized as a String



# File 'lib/http/uri.rb', line 337

def to_s
  str = +""
  str << "#{@scheme}:" if @scheme
  str << authority_string if @raw_host
  str << @path
  str << "?#{@query}" if @query
  str << "##{@fragment}" if @fragment
  str
end