Class: PublicSuffix::List
- Inherits:
- Object
- Includes:
- Enumerable
- Defined in:
- lib/public_suffix/list.rb
Overview
A List is a collection of one or more Rules.
Given a List, you can add or remove rules, iterate over all items in the list, or search for the first rule that matches a specific domain name.
# Create a new list
list = PublicSuffix::List.new
# Push two rules to the list
list << PublicSuffix::Rule.factory("it")
list << PublicSuffix::Rule.factory("com")
# Get the size of the list
list.size
# => 2
# Search for the rule matching given domain
list.find("example.com")
# => #<PublicSuffix::Rule::Normal>
list.find("example.org")
# => nil
You can create as many List instances as you want. The List.default rule list is used to tokenize and validate a domain.
List implements the Enumerable module.
Constant Summary
- DEFAULT_DEFINITION_PATH =
File.join(File.dirname(__FILE__), "..", "..", "data", "definitions.txt")
Class Attribute Summary
-
.default_definition ⇒ File
Gets the default definition list.
Instance Attribute Summary
-
#indexes ⇒ Hash
readonly
Gets the naive index, a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @rules).
-
#rules ⇒ Array<PublicSuffix::Rule::*>
readonly
Gets the array of rules.
Class Method Summary
-
.clear ⇒ self
Sets the default rule list to nil.
-
.default ⇒ PublicSuffix::List
Gets the default rule list.
-
.default=(value) ⇒ PublicSuffix::List
Sets the default rule list to value.
-
.parse(input) ⇒ Array<PublicSuffix::Rule::*>
Parses the given input, treating the content as a Public Suffix List.
-
.private_domains=(value) ⇒ PublicSuffix::List
Enables or disables support for private (non-ICANN) domains. Implicitly reloads the list.
-
.private_domains? ⇒ Boolean
Shows if support for private (non-ICANN) domains is enabled or not.
-
.reload ⇒ PublicSuffix::List
Resets the default rule list and reinitializes it by parsing the content of List.default_definition.
Instance Method Summary
-
#==(other) ⇒ Boolean
(also: #eql?)
Checks whether two lists are equal.
-
#add(rule, index = true) ⇒ self
(also: #<<)
Adds the given object to the list and optionally refreshes the rule index.
-
#clear ⇒ self
Removes all elements.
-
#create_index! ⇒ Object
Creates a naive index for @rules.
-
#each(*args, &block) ⇒ Object
Iterates each rule in the list.
-
#empty? ⇒ Boolean
Checks whether the list is empty.
-
#find(domain) ⇒ PublicSuffix::Rule::*?
Returns the most appropriate rule for domain.
-
#initialize {|self| ... } ⇒ List
constructor
Initializes an empty List.
-
#select(domain) ⇒ Array<PublicSuffix::Rule::*>
Selects all the rules matching given domain.
-
#size ⇒ Integer
(also: #length)
Gets the number of elements in the list.
-
#to_a ⇒ Array<PublicSuffix::Rule::*>
Gets the list as array.
Constructor Details
#initialize {|self| ... } ⇒ List
Initializes an empty PublicSuffix::List.
# File 'lib/public_suffix/list.rb', line 157
def initialize(&block)
  @rules = []
  @indexes = {}
  yield(self) if block_given?
  create_index!
end
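The constructor yields the new list to an optional block and builds the index once, after the block returns. The following is a minimal sketch of that pattern only; `BlockList`, `push` and `index_builds` are hypothetical names used for illustration and are not part of the library.

```ruby
# Hypothetical stand-in for illustration; not the library class.
class BlockList
  attr_reader :rules, :index_builds

  def initialize
    @rules = []
    @index_builds = 0
    yield(self) if block_given? # let the caller populate the list
    rebuild_index               # index once, after all rules are in
  end

  def push(rule)
    @rules << rule
    self
  end

  private

  # Only counts calls, so the single rebuild is observable.
  def rebuild_index
    @index_builds += 1
  end
end

list = BlockList.new do |l|
  l.push("com")
  l.push("it")
end
```

In this sketch, `list.rules` holds both entries and `list.index_builds` is 1, because the index step runs a single time regardless of how many rules the block adds.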
Class Attribute Details
.default_definition ⇒ File
Gets the default definition list. Can be any IO stream, including a File or a simple String. The object must respond to #each_line.
# File 'lib/public_suffix/list.rb', line 111
def self.default_definition
  @default_definition || File.new(DEFAULT_DEFINITION_PATH, "r:utf-8")
end
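Because the only requirement is #each_line, a String and an IO object can be consumed by the same code. A small sketch of that duck typing, assuming a hypothetical helper named `definition_lines` (not a library method):

```ruby
require "stringio"

# Hypothetical helper: collects the non-empty lines of any object
# that responds to #each_line (String, StringIO, File, ...).
def definition_lines(source)
  lines = []
  source.each_line do |line|
    stripped = line.strip
    lines << stripped unless stripped.empty?
  end
  lines
end

from_string = definition_lines("com\nit\n")
from_io     = definition_lines(StringIO.new("com\nit\n"))
```

Both calls produce the same array of lines, which is why a definition can be supplied either way.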
Instance Attribute Details
#indexes ⇒ Hash (readonly)
Gets the naive index, a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @rules).
# File 'lib/public_suffix/list.rb', line 150
def indexes
  @indexes
end
#rules ⇒ Array<PublicSuffix::Rule::*> (readonly)
Gets the array of rules.
# File 'lib/public_suffix/list.rb', line 144
def rules
  @rules
end
Class Method Details
.clear ⇒ self
Sets the default rule list to nil.
# File 'lib/public_suffix/list.rb', line 90
def self.clear
  self.default = nil
  self
end
.default ⇒ PublicSuffix::List
Gets the default rule list. If required, initializes a new PublicSuffix::List by parsing the content of default_definition.
# File 'lib/public_suffix/list.rb', line 54
def self.default
  @default ||= parse(default_definition)
end
.default=(value) ⇒ PublicSuffix::List
Sets the default rule list to value.
# File 'lib/public_suffix/list.rb', line 64
def self.default=(value)
  @default = value
end
.parse(input) ⇒ Array<PublicSuffix::Rule::*>
Parses the given input, treating the content as a Public Suffix List.
See publicsuffix.org/format/ for more details about the input format.
# File 'lib/public_suffix/list.rb', line 122
def self.parse(input)
  new do |list|
    input.each_line do |line|
      line.strip!
      break if !private_domains? && line.include?('===BEGIN PRIVATE DOMAINS===')
      # strip blank lines
      if line.empty?
        next
      # strip comments
      elsif line =~ %r{^//}
        next
      # append rule
      else
        list.add(Rule.factory(line), false)
      end
    end
  end
end
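The filtering steps in the listing above (stop at the private-domains marker when private domains are disabled, skip blank lines, skip `//` comments) can be tried in isolation. This is a sketch under stated assumptions: `rule_lines` is a hypothetical helper that returns plain strings instead of building Rule objects, and the sample input is made up.

```ruby
# Hypothetical helper mirroring the filtering steps; it does not
# create Rule objects, it only returns the surviving lines.
def rule_lines(input, private_domains: true)
  marker = "===BEGIN PRIVATE DOMAINS==="
  rules = []
  input.each_line do |line|
    line = line.strip
    break if !private_domains && line.include?(marker)
    next if line.empty?            # skip blank lines
    next if line.start_with?("//") # skip comments
    rules << line
  end
  rules
end

# Made-up sample in the Public Suffix List text format.
sample = <<~LIST
  // ===BEGIN ICANN DOMAINS===
  com

  *.uk
  // ===BEGIN PRIVATE DOMAINS===
  blogspot.com
LIST

icann_only = rule_lines(sample, private_domains: false)
everything = rule_lines(sample)
```

Note that the marker check runs before the comment check, so the marker line ends parsing even though it is itself a comment.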
.private_domains=(value) ⇒ PublicSuffix::List
Enables or disables support for private (non-ICANN) domains. Implicitly reloads the list.
# File 'lib/public_suffix/list.rb', line 82
def self.private_domains=(value)
  @private_domains = !!value
  self.clear
end
.private_domains? ⇒ Boolean
Shows whether support for private (non-ICANN) domains is enabled.
# File 'lib/public_suffix/list.rb', line 71
def self.private_domains?
  @private_domains != false
end
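Taken together, the setter's `!!value` coercion and the getter's `!= false` comparison mean the flag reads as enabled until it is explicitly turned off. A minimal sketch of just that flag logic, using a hypothetical `SketchSettings` module rather than the library (the list reset performed by the real setter is omitted):

```ruby
# Hypothetical module; only the flag logic is illustrated.
module SketchSettings
  def self.private_domains=(value)
    @private_domains = !!value # coerce any value to true or false
  end

  def self.private_domains?
    @private_domains != false  # nil (never assigned) counts as enabled
  end
end

default_state = SketchSettings.private_domains? # flag never assigned
SketchSettings.private_domains = nil
after_nil = SketchSettings.private_domains?     # nil was coerced to false
SketchSettings.private_domains = "yes"
after_truthy = SketchSettings.private_domains?  # any truthy value enables
```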
.reload ⇒ PublicSuffix::List
Resets the default rule list and reinitializes it by parsing the content of default_definition.
# File 'lib/public_suffix/list.rb', line 99
def self.reload
  self.clear.default
end
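The class methods clear, default, default= and reload together form a lazily initialized, resettable cache. The sketch below reproduces that interplay with a hypothetical `SketchRegistry` class; `load_default` and `load_count` are invented here purely to make the number of (re)loads observable.

```ruby
# Hypothetical registry; not the library class.
class SketchRegistry
  class << self
    attr_reader :load_count

    def default
      @default ||= load_default # built on first access, then cached
    end

    def default=(value)
      @default = value
    end

    def clear
      self.default = nil
      self
    end

    def reload
      clear.default
    end

    private

    # Stand-in for parsing the default definition.
    def load_default
      @load_count = (@load_count || 0) + 1
      ["com", "it"]
    end
  end
end

first  = SketchRegistry.default
second = SketchRegistry.default
loads_before_reload = SketchRegistry.load_count
SketchRegistry.reload
loads_after_reload = SketchRegistry.load_count
```

In this sketch the second call to `default` returns the cached object without loading again, and `reload` forces exactly one more load.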
Instance Method Details
#==(other) ⇒ Boolean Also known as: eql?
Checks whether two lists are equal.
List one is equal to two, if two is an instance of PublicSuffix::List and each PublicSuffix::Rule::* in list one is available in list two, in the same order.
# File 'lib/public_suffix/list.rb', line 191
def ==(other)
  return false unless other.is_a?(List)
  self.equal?(other) ||
  self.rules == other.rules
end
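The equality rule (same class, same rules, same order) can be illustrated with a small stand-in. `OrderedList` is a hypothetical class whose rules are plain strings; it is a sketch of the comparison only, not the library implementation.

```ruby
# Hypothetical class; rules are plain strings here.
class OrderedList
  attr_reader :rules

  def initialize(*rules)
    @rules = rules
  end

  def ==(other)
    return false unless other.is_a?(OrderedList)
    equal?(other) || rules == other.rules # Array#== is order-sensitive
  end
  alias_method :eql?, :==
end

same      = OrderedList.new("com", "it") == OrderedList.new("com", "it")
reordered = OrderedList.new("com", "it") == OrderedList.new("it", "com")
other_cls = OrderedList.new("com", "it") == ["com", "it"]
```

Here only `same` is true: a different order or a different class makes the comparison fail.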
#add(rule, index = true) ⇒ self Also known as: <<
Adds the given object to the list and optionally refreshes the rule index.
# File 'lib/public_suffix/list.rb', line 223
def add(rule, index = true)
  @rules << rule
  create_index! if index == true
  self
end
#clear ⇒ self
Removes all elements.
# File 'lib/public_suffix/list.rb', line 248
def clear
  @rules.clear
  self
end
#create_index! ⇒ Object
Creates a naive index for @rules: a hash that records where the elements of @rules are, keyed by each rule's first Rule::Base#labels element.
For instance, if @rules[5] and @rules[4] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [5,4]. That way, select can avoid matching every single rule against the candidate domain.
# File 'lib/public_suffix/list.rb', line 171
def create_index!
  @rules.map { |l| l.labels.first }.each_with_index do |elm, inx|
    if !@indexes.has_key?(elm)
      @indexes[elm] = [inx]
    else
      @indexes[elm] << inx
    end
  end
end
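To see what such an index looks like, here is a sketch that builds one from plain rule names. It assumes that a rule's labels are its dot-separated parts in reverse order, so the first label is the top-level one; `first_label` and `build_index` are hypothetical helpers, and this version starts from an empty hash on every call.

```ruby
# Assumption: labels are the dot-separated parts in reverse order,
# so "co.uk" has labels ["uk", "co"] and its first label is "uk".
def first_label(rule_name)
  rule_name.split(".").reverse.first
end

# Maps each first label to the positions of the rules that share it.
def build_index(rule_names)
  index = {}
  rule_names.each_with_index do |name, position|
    (index[first_label(name)] ||= []) << position
  end
  index
end

rules = ["com", "us", "co.uk", "ak.us", "uk"]
index = build_index(rules)
```

With this input, "us" maps to positions 1 and 3 and "uk" to positions 2 and 4, so a lookup for a ".uk" domain only needs to examine two of the five rules.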
#each(*args, &block) ⇒ Object
Iterates each rule in the list.
# File 'lib/public_suffix/list.rb', line 199
def each(*args, &block)
  @rules.each(*args, &block)
end
#empty? ⇒ Boolean
Checks whether the list is empty.
# File 'lib/public_suffix/list.rb', line 241
def empty?
  @rules.empty?
end
#find(domain) ⇒ PublicSuffix::Rule::*?
Returns the most appropriate rule for domain.
From the Public Suffix List documentation:
-
If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.
-
An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.
Algorithm description
-
Match domain against all rules and take note of the matching ones.
-
If no rules match, the prevailing rule is “*”.
-
If more than one rule matches, the prevailing rule is the one which is an exception rule.
-
If there is no matching exception rule, the prevailing rule is the one with the most labels.
-
If the prevailing rule is an exception rule, modify it by removing the leftmost label.
-
The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).
-
The registered domain is the public suffix plus one additional label.
# File 'lib/public_suffix/list.rb', line 276
def find(domain)
  rules = select(domain)
  rules.detect   { |r|   r.type == :exception } ||
  rules.inject { |t,r| t.length > r.length ? t : r }
end
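The choice of the prevailing rule in the listing above (an exception rule wins, otherwise the rule with the most labels) can be exercised on its own. In this sketch `SketchRule` and `prevailing_rule` are hypothetical, and the `length` values are supplied by hand rather than computed the way the library does. As in the listing, an empty set of matches yields nil.

```ruby
# Hypothetical rule record; `length` stands for the rule's label count.
SketchRule = Struct.new(:name, :type, :length)

# Picks the exception rule if one matched, else the longest rule.
def prevailing_rule(matching)
  matching.detect { |rule| rule.type == :exception } ||
    matching.inject { |best, rule| best.length > rule.length ? best : rule }
end

uk        = SketchRule.new("uk", :normal, 1)
co_uk     = SketchRule.new("co.uk", :normal, 2)
wildcard  = SketchRule.new("*.ck", :wildcard, 2)
exception = SketchRule.new("!www.ck", :exception, 2)

longest  = prevailing_rule([uk, co_uk])
excepted = prevailing_rule([wildcard, exception])
none     = prevailing_rule([])
```

Because the comparison uses `>`, a tie on length resolves to the rule that appears later in the array.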
#select(domain) ⇒ Array<PublicSuffix::Rule::*>
Selects all the rules matching given domain.
Uses @indexes to try only the rules that share the same first label, which speeds things up when List.find('foo') is called a lot.
# File 'lib/public_suffix/list.rb', line 290
def select(domain)
  # raise DomainInvalid, "Blank domain"
  return [] if domain.to_s =~ /\A\s*\z/
  # raise DomainInvalid, "`#{domain}' is not expected to contain a scheme"
  return [] if domain.include?("://")

  indices = (@indexes[Domain.domain_to_labels(domain).first] || [])
  @rules.values_at(*indices).select { |rule| rule.match?(domain) }
end
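A sketch of the indexed lookup follows. It makes two simplifying assumptions that are not taken from the library: labels are the dot-separated parts in reverse order, and a rule matches when its labels are a prefix of the domain's labels (wildcard and exception rules are ignored). `labels` and `select_candidates` are hypothetical helpers, and the index is written out by hand.

```ruby
# Assumption: labels are the dot-separated parts in reverse order.
def labels(name)
  name.to_s.split(".").reverse
end

# Looks up candidate positions by first label, then filters them.
def select_candidates(rules, index, domain)
  return [] if domain.to_s.strip.empty? # blank domain
  return [] if domain.include?("://")   # scheme is not expected

  domain_labels = labels(domain)
  positions = index[domain_labels.first] || []
  rules.values_at(*positions).select do |rule|
    rule_labels = labels(rule)
    domain_labels.first(rule_labels.length) == rule_labels
  end
end

rules = ["com", "uk", "co.uk", "org.uk"]
index = { "com" => [0], "uk" => [1, 2, 3] }

matches = select_candidates(rules, index, "example.co.uk")
```

Only the three rules indexed under "uk" are examined for this domain, and two of them match; the "com" rule is never touched.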
#size ⇒ Integer Also known as: length
Gets the number of elements in the list.
# File 'lib/public_suffix/list.rb', line 233
def size
  @rules.size
end
#to_a ⇒ Array<PublicSuffix::Rule::*>
Gets the list as array.
# File 'lib/public_suffix/list.rb', line 206
def to_a
  @rules
end