Class: Fuzzyurl

Inherits:
Object
Defined in:
lib/fuzzyurl.rb,
lib/fuzzyurl/fields.rb,
lib/fuzzyurl/version.rb

Overview

Fuzzyurl provides two related functions: non-strict parsing of URLs or URL-like strings into their component pieces (protocol, username, password, hostname, port, path, query, and fragment), and fuzzy matching of URLs and URL patterns.

Specifically, it handles URLs of this form:

[protocol ://] [username [: password] @] [hostname] [: port] [/ path] [? query] [# fragment]

Fuzzyurls can be constructed using some or all of the above fields, optionally replacing some or all of those fields with a `*` wildcard if you wish to use the Fuzzyurl as a URL mask.

## Parsing URLs

irb> Fuzzyurl.from_string("https://api.example.com/users/123?full=true")
#=> #<Fuzzyurl:0x007ff55b914f58 @protocol="https", @username=nil, @password=nil, @hostname="api.example.com", @port=nil, @path="/users/123", @query="full=true", @fragment=nil>
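
The non-strict parsing shown above can be approximated with a single regular expression. The following is a minimal sketch under that assumption, not Fuzzyurl's actual implementation; `LENIENT_URL` and `lenient_parse` are hypothetical names introduced here for illustration.

```ruby
# Hypothetical sketch: every component of the URL is optional, so partial
# strings such as "example.com:8080" or "/foo/bar" still parse.
LENIENT_URL = %r{
  \A
  (?: (?<protocol> [^:/?@\#]+ ) :// )?
  (?: (?<username> [^:@/?\#]+ ) (?: : (?<password> [^@/?\#]* ) )? @ )?
  (?<hostname> [^:/?\#]+ )?
  (?: : (?<port> \d+ | \* ) )?
  (?<path> / [^?\#]* )?
  (?: \? (?<query> [^\#]* ) )?
  (?: \# (?<fragment> .* ) )?
  \z
}x

# Returns a Hash with one Symbol key per named group, or nil if the
# string does not fit the pattern at all.
def lenient_parse(str)
  m = LENIENT_URL.match(str)
  return nil unless m
  m.names.each_with_object({}) { |name, h| h[name.to_sym] = m[name] }
end

parsed = lenient_parse("https://api.example.com/users/123?full=true")
parsed[:hostname]  # "api.example.com"
parsed[:fragment]  # nil, because the string has no fragment
```

Fields that are absent from the input come back as nil, which mirrors the `@port=nil` and `@fragment=nil` values in the irb output above.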

## Constructing URLs

irb> f = Fuzzyurl.new(hostname: "example.com", protocol: "http", port: "8080")
irb> f.to_s
#=> "http://example.com:8080"

## Matching URLs

Fuzzyurl supports wildcard matching:

  • `*` matches anything, including `nil`.

  • `foo*` matches `foo`, `foobar`, `foo/bar`, etc.

  • `*bar` matches `bar`, `foobar`, `foo/bar`, etc.

Path and hostname matching allows the use of a greedier wildcard `**` in addition to the naive wildcard `*`:

  • `*.example.com` matches `filsrv-01.corp.example.com` but not `example.com`.

  • `**.example.com` matches `filsrv-01.corp.example.com` and `example.com`.

  • `/some/path/*` matches `/some/path/foo/bar` and `/some/path/` but not `/some/path`.

  • `/some/path/**` matches `/some/path/foo/bar` and `/some/path/` and `/some/path`.

The `Fuzzyurl.mask` function aids in the creation of URL masks.

irb> Fuzzyurl.mask
#=> #<Fuzzyurl:0x007ff55b039578 @protocol="*", @username="*", @password="*", @hostname="*", @port="*", @path="*", @query="*", @fragment="*">

irb> Fuzzyurl.matches?(Fuzzyurl.mask, "http://example.com:8080/foo/bar")
#=> true

irb> mask = Fuzzyurl.mask(path: "/a/b/**")
irb> Fuzzyurl.matches?(mask, "https://example.com/a/b/")
#=> true
irb> Fuzzyurl.matches?(mask, "git+ssh://[email protected]/a/b/")
#=> true
irb> Fuzzyurl.matches?(mask, "https://example.com/a/bar")
#=> false

`Fuzzyurl.best_match`, given a list of URL masks and a URL, returns the mask that most closely matches the URL:

irb> masks = ["/foo/*", "/foo/bar", Fuzzyurl.mask]
irb> Fuzzyurl.best_match(masks, "http://example.com/foo/bar")
#=> "/foo/bar"

If you'd prefer the array index instead of the matching mask itself, use `Fuzzyurl.best_match_index` instead:

irb> Fuzzyurl.best_match_index(masks, "http://example.com/foo/bar")
#=> 1

Defined Under Namespace

Classes: Match, Protocols, Strings

Constant Summary

FIELDS =
[
  :protocol,
  :username,
  :password,
  :hostname,
  :port,
  :path,
  :query,
  :fragment
]
VERSION =
"0.9.0"
VERSION_DATE =
"2016-06-28"

Constructor Details

#initialize(params = {}) ⇒ Fuzzyurl

Creates a new Fuzzyurl object from the given params or URL string. Keys of `params` should be symbols.

Parameters:

  • params (Hash|String|nil) (defaults to: {})

    URL string or parameter hash.



# File 'lib/fuzzyurl.rb', line 90

def initialize(params={})
  p = params.kind_of?(String) ? Fuzzyurl.from_string(params).to_hash : params
  (FIELDS & p.keys).each do |f|
    self.send("#{f}=", p[f])
  end
end
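
The requirement that keys be symbols follows from the `FIELDS & p.keys` intersection in the constructor. A minimal sketch of that idiom, using a hypothetical stand-in class `MiniUrl` with a reduced field list (not the real Fuzzyurl class):

```ruby
# Hypothetical stand-in class, used only to show the key-filtering idiom.
class MiniUrl
  FIELDS = [:protocol, :hostname, :port]
  attr_accessor(*FIELDS)

  def initialize(params = {})
    # Array#& keeps only elements present in both arrays, so unknown keys
    # and String keys never reach a setter.
    (FIELDS & params.keys).each { |f| send("#{f}=", params[f]) }
  end
end

u = MiniUrl.new(hostname: "example.com", "port" => "8080", bogus: 1)
u.hostname  # "example.com"
u.port      # nil, because the key was the String "port", not :port
```

In this sketch, unrecognized keys are dropped silently rather than raising, which is the behavior the intersection implies.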

Class Method Details

.best_match(masks, url) ⇒ Fuzzyurl|String|nil

Given an array of URL masks, returns the one which most closely matches `url`, or nil if none match.

`url` and each element of `masks` may be Fuzzyurl or String format.

Parameters:

  • masks (Array)

    Array of URL masks.

  • url (Fuzzyurl|String)

    URL.

Returns:

  • (Fuzzyurl|String|nil)

    Best-matching given mask, or nil for no match.



# File 'lib/fuzzyurl.rb', line 240

def best_match(masks, url)
  index = best_match_index(masks, url)
  index && masks[index]
end

.best_match_index(masks, url) ⇒ Integer|nil

Given an array of URL masks, returns the array index of the one which most closely matches `url`, or nil if none match.

`url` and each element of `masks` may be Fuzzyurl or String format.

Parameters:

  • masks (Array)

    Array of URL masks.

  • url (Fuzzyurl|String)

    URL.

Returns:

  • (Integer|nil)

    Array index of best-matching mask, or nil for no match.



# File 'lib/fuzzyurl.rb', line 226

def best_match_index(masks, url)
  ms = masks.map {|m| m.kind_of?(Fuzzyurl) ? m : Fuzzyurl.mask(m)}
  u = url.kind_of?(Fuzzyurl) ? url : Fuzzyurl.from_string(url)
  Fuzzyurl::Match.best_match_index(ms, u)
end
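
The selection step delegated to `Fuzzyurl::Match.best_match_index` is not shown on this page. One plausible shape, sketched under the assumption that the highest non-nil score wins and that ties go to the earliest mask; `sketch_best_match_index` and the toy scorer are hypothetical:

```ruby
# Hypothetical helper: `score` is any callable returning an Integer
# (higher is closer) or nil for no match.
def sketch_best_match_index(masks, url, score)
  best_index = nil
  best_score = nil
  masks.each_with_index do |mask, i|
    s = score.call(mask, url)
    next if s.nil?                      # this mask does not match at all
    if best_score.nil? || s > best_score
      best_index = i                    # strict > keeps the earliest on ties
      best_score = s
    end
  end
  best_index
end

# Toy scorer for illustration: "*" is a wildcard, equality is exact.
toy_score = ->(mask, url) { mask == "*" ? 0 : (mask == url ? 1 : nil) }

sketch_best_match_index(["/foo", "*", "/bar"], "/bar", toy_score)  # 2
sketch_best_match_index(["/foo", "*", "/bar"], "/baz", toy_score)  # 1
```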

.from_string(str, opts = {}) ⇒ Fuzzyurl

Returns a Fuzzyurl representation of the given URL string. Any fields not present in `str` will be assigned the value of `opts[:default]` (defaults to nil).

Parameters:

  • str (String)

    String URL to convert to Fuzzyurl.

  • opts (Hash|nil) (defaults to: {})

    Options.

Returns:

  • (Fuzzyurl)

    Fuzzyurl representation of `str`.



# File 'lib/fuzzyurl.rb', line 168

def from_string(str, opts={})
  Fuzzyurl::Strings.from_string(str, opts)
end

.fuzzy_match(mask, value) ⇒ Object

Matches `mask` (which may contain `*` wildcards) against `value` (which may not). Returns 1 if `mask` and `value` match exactly, 0 if they match via a wildcard, or nil otherwise.

Wildcard language:

*              matches anything
foo/*          matches "foo/" and "foo/bar/baz" but not "foo"
foo/**         matches "foo/" and "foo/bar/baz" and "foo"
*.example.com  matches "api.v1.example.com" but not "example.com"
**.example.com matches "api.v1.example.com" and "example.com"

Any other form is treated as a literal match.

Parameters:

  • mask (String)

    String mask to match with (may contain wildcards).

  • value (String)

    String value to match.



# File 'lib/fuzzyurl.rb', line 262

def fuzzy_match(mask, value)
  Fuzzyurl::Match.fuzzy_match(mask, value)
end
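
The wildcard language described for `fuzzy_match` can be expressed with a handful of prefix and suffix checks. This is a sketch written from the rules listed above, not the body of `Fuzzyurl::Match.fuzzy_match`; the name `sketch_fuzzy_match` is hypothetical.

```ruby
# Hypothetical sketch: returns 1 for an exact match, 0 for a wildcard
# match, and nil for no match.
def sketch_fuzzy_match(mask, value)
  return 0 if mask == "*"        # bare wildcard matches anything, even nil
  return 1 if mask == value      # literal match
  return nil if mask.nil? || value.nil?

  if mask.start_with?("**.")
    # "**.example.com" matches the bare domain and any subdomain.
    suffix = mask[3..-1]
    return 0 if value == suffix || value.end_with?("." + suffix)
  elsif mask.start_with?("*")
    # "*.example.com" requires the value to end with ".example.com".
    return 0 if value.end_with?(mask[1..-1])
  end

  if mask.end_with?("/**")
    # "foo/**" matches "foo" itself and anything under "foo/".
    prefix = mask[0...-3]
    return 0 if value == prefix || value.start_with?(prefix + "/")
  elsif mask.end_with?("*")
    # "foo/*" requires the value to start with "foo/".
    return 0 if value.start_with?(mask[0...-1])
  end

  nil
end

sketch_fuzzy_match("foo/*", "foo")                  # nil
sketch_fuzzy_match("foo/**", "foo")                 # 0
sketch_fuzzy_match("**.example.com", "example.com") # 0
sketch_fuzzy_match("example.com", "example.com")    # 1
```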

.mask(params = {}) ⇒ Fuzzyurl

Returns a Fuzzyurl suitable for use as a URL mask, with the given values optionally set from `params` (Hash or String).

Parameters:

  • params (Hash|String|nil) (defaults to: {})

    Parameters to set.

Returns:

  • (Fuzzyurl)

    Fuzzyurl mask, with any field not given in `params` set to `*`.



# File 'lib/fuzzyurl.rb', line 142

def mask(params={})
  params ||= {}
  return from_string(params, default: "*") if params.kind_of?(String)

  m = Fuzzyurl.new
  FIELDS.each do |f|
    m.send("#{f}=", params.has_key?(f) ? params[f].to_s : "*")
  end
  m
end
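
The Hash branch of `mask` fills every unspecified field with `*` and stringifies the rest. A reduced sketch of that defaulting step using plain Hashes; `MASK_SKETCH_FIELDS` and `sketch_mask` are hypothetical names, and the field list is shortened for illustration:

```ruby
# Hypothetical, reduced field list for illustration.
MASK_SKETCH_FIELDS = [:hostname, :port, :path]

# Builds a Hash mask: given fields are converted with to_s, all others
# default to the "*" wildcard.
def sketch_mask(params = {})
  params ||= {}
  MASK_SKETCH_FIELDS.each_with_object({}) do |field, mask|
    mask[field] = params.key?(field) ? params[field].to_s : "*"
  end
end

mask = sketch_mask(port: 8080)
mask[:port]      # "8080" (the Integer was converted with to_s)
mask[:hostname]  # "*"
```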

.match(mask, url) ⇒ Integer|nil

Returns an integer representing how closely `mask` matches `url` (0 means wildcard match, higher is closer), or nil for no match.

`mask` and `url` may each be Fuzzyurl or String format.

Parameters:

  • mask (Fuzzyurl|String)

    URL mask.

  • url (Fuzzyurl|String)

    URL.

Returns:

  • (Integer|nil)

    0 for a pure wildcard match, a higher integer for a closer match, or nil for no match.



# File 'lib/fuzzyurl.rb', line 180

def match(mask, url)
  m = mask.kind_of?(Fuzzyurl) ? mask : Fuzzyurl.mask(mask)
  u = url.kind_of?(Fuzzyurl) ? url : Fuzzyurl.from_string(url)
  Fuzzyurl::Match.match(m, u)
end

.match_scores(mask, url) ⇒ Object

Returns a Hash of match scores for each field of `mask` and `url`, indicating the closeness of the match. Values are from `fuzzy_match`: 0 indicates wildcard match, 1 indicates perfect match, and nil indicates no match.

`mask` and `url` may each be Fuzzyurl or String format.

Parameters:

  • mask (Fuzzyurl|String)

    URL mask.

  • url (Fuzzyurl|String)

    URL.



# File 'lib/fuzzyurl.rb', line 210

def match_scores(mask, url)
  # Normalize both arguments to Fuzzyurl objects.
  m = mask.kind_of?(Fuzzyurl) ? mask : Fuzzyurl.mask(mask)
  u = url.kind_of?(Fuzzyurl) ? url : Fuzzyurl.from_string(url)
  Fuzzyurl::Match.match_scores(m, u)
end
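
Per-field scores have to be combined into the single integer that `match` returns. The sketch below assumes the overall score is the sum of the field scores, and nil as soon as any field fails; that combination rule is an assumption, since `Fuzzyurl::Match` is not shown here. All names are hypothetical, and the field scorer is simplified to `*` and equality only.

```ruby
# Hypothetical, reduced field list for illustration.
SCORE_SKETCH_FIELDS = [:protocol, :hostname, :path]

# Simplified per-field scorer: "*" is a wildcard (0), equality is exact (1).
def sketch_field_score(mask, value)
  return 0 if mask == "*"
  mask == value ? 1 : nil
end

# One score per field, keyed by field name.
def sketch_match_scores(mask, url)
  SCORE_SKETCH_FIELDS.each_with_object({}) do |f, scores|
    scores[f] = sketch_field_score(mask[f], url[f])
  end
end

# Assumption: overall score is the sum of field scores, nil if any field fails.
def sketch_match(mask, url)
  scores = sketch_match_scores(mask, url).values
  scores.include?(nil) ? nil : scores.sum
end

mask_h = { protocol: "*", hostname: "example.com", path: "*" }
url_h  = { protocol: "http", hostname: "example.com", path: "/foo" }

sketch_match(mask_h, url_h)                               # 1
sketch_match(mask_h, url_h.merge(hostname: "other.org"))  # nil
```

Under this assumption, a mask with more literal fields scores higher than a mask of wildcards, which is consistent with "higher is closer".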

.matches?(mask, url) ⇒ Boolean

Returns true if `mask` matches `url`, false otherwise.

`mask` and `url` may each be Fuzzyurl or String format.

Parameters:

  • mask (Fuzzyurl|String)

    URL mask.

  • url (Fuzzyurl|String)

    URL.

Returns:

  • (Boolean)

    Whether `mask` matches `url`.



# File 'lib/fuzzyurl.rb', line 193

def matches?(mask, url)
  # Normalize both arguments to Fuzzyurl objects.
  m = mask.kind_of?(Fuzzyurl) ? mask : Fuzzyurl.mask(mask)
  u = url.kind_of?(Fuzzyurl) ? url : Fuzzyurl.from_string(url)
  Fuzzyurl::Match.matches?(m, u)
end

.to_string(fuzzyurl) ⇒ String

Returns a string representation of `fuzzyurl`.

Parameters:

  • fuzzyurl (Fuzzyurl)

    Fuzzyurl to convert to string.

Returns:

  • (String)

    String representation of `fuzzyurl`.



# File 'lib/fuzzyurl.rb', line 157

def to_string(fuzzyurl)
  Fuzzyurl::Strings.to_string(fuzzyurl)
end

Instance Method Details

#==(other) ⇒ Object



# File 'lib/fuzzyurl.rb', line 130

def ==(other)
  self.to_hash == other.to_hash
end

#to_hash ⇒ Hash

Returns a hash representation of this Fuzzyurl, with one key/value pair for each of `Fuzzyurl::FIELDS`.

Returns:

  • (Hash)

    Hash representation of this Fuzzyurl.



# File 'lib/fuzzyurl.rb', line 101

def to_hash
  FIELDS.reduce({}) do |hash, f|
    val = self.send(f)
    val = val.to_s if val
    hash[f] = val
    hash
  end
end

#to_s ⇒ String

Returns a string representation of this Fuzzyurl.

Returns:

  • (String)

    String representation of this Fuzzyurl.



# File 'lib/fuzzyurl.rb', line 125

def to_s
  Fuzzyurl::Strings.to_string(self)
end

#with(params = {}) ⇒ Fuzzyurl

Returns a new copy of this Fuzzyurl, with the given params changed.

Parameters:

  • params (Hash|nil) (defaults to: {})

    New parameter values.

Returns:

  • (Fuzzyurl)

    Copy of `self` with the given parameters changed.



# File 'lib/fuzzyurl.rb', line 114

def with(params={})
  fu = Fuzzyurl.new(self.to_hash)
  (FIELDS & params.keys).each do |f|
    fu.send("#{f}=", params[f].to_s)
  end
  fu
end
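
Because `with` builds a fresh object from `to_hash`, the receiver is left untouched, and `==` compares by field values rather than identity. A small sketch of those two properties, using a hypothetical two-field stand-in class `TinyUrl` rather than Fuzzyurl itself:

```ruby
# Hypothetical stand-in class that mirrors the with/to_hash/== pattern above.
class TinyUrl
  FIELDS = [:hostname, :path]
  attr_accessor(*FIELDS)

  def initialize(params = {})
    (FIELDS & params.keys).each { |f| send("#{f}=", params[f]) }
  end

  def to_hash
    FIELDS.each_with_object({}) { |f, h| h[f] = send(f) }
  end

  # Copies self via to_hash, then applies the changes to the copy only.
  def with(params = {})
    copy = TinyUrl.new(to_hash)
    (FIELDS & params.keys).each { |f| copy.send("#{f}=", params[f].to_s) }
    copy
  end

  # Value equality: two objects are equal when all fields are equal.
  def ==(other)
    to_hash == other.to_hash
  end
end

a = TinyUrl.new(hostname: "example.com", path: "/a")
b = a.with(path: "/b")

a.path                   # still "/a"; the original is not mutated
b.path                   # "/b"
a == b.with(path: "/a")  # true, equality is by field values
```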