Class: PublicSuffix::List

Inherits:
Object
  • Object
Includes:
Enumerable
Defined in:
lib/public_suffix/list.rb

Overview

A List is a collection of one or more Rule objects.

Given a List, you can add or remove rules, iterate over all items in the list, or search for the rule that matches a specific domain name.

# Create a new list
list = PublicSuffix::List.new

# Push two rules to the list
list << PublicSuffix::Rule.factory("it")
list << PublicSuffix::Rule.factory("com")

# Get the size of the list
list.size
# => 2

# Search for the rule matching given domain
list.find("example.com")
# => #<PublicSuffix::Rule::Normal>
list.find("example.org")
# => nil

You can create as many List instances as you want. The List.default rule list is used to tokenize and validate a domain.

List includes the Enumerable module.
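
Because the class includes Enumerable and defines #each, the usual collection helpers (map, include?, count, and so on) become available. The sketch below illustrates that mechanism with a hypothetical stand-in class, TinyList, which stores plain strings instead of rules; it is not the library's code.

```ruby
# Hypothetical stand-in for PublicSuffix::List, used only to illustrate
# how Enumerable builds on #each. Rules are plain strings here.
class TinyList
  include Enumerable

  def initialize
    @rules = []
  end

  # Append a rule and return self so calls can be chained.
  def <<(rule)
    @rules << rule
    self
  end

  # Enumerable only needs #each; the other helpers derive from it.
  def each(&block)
    @rules.each(&block)
  end
end

list = TinyList.new
list << "it" << "com"
list.map(&:upcase)    # => ["IT", "COM"]
list.include?("com")  # => true
```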

Constructor Details

#initialize {|self| ... } ⇒ List

Initializes an empty PublicSuffix::List.

Yields:

  • (self)

    Yields on self.

Yield Parameters:

  • self (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 157

def initialize(&block)
  @rules   = []
  @indexes = {}
  yield(self) if block_given?
  create_index!
end

Class Attribute Details

.default ⇒ PublicSuffix::List

Gets the default rule list. If required, initializes a new PublicSuffix::List by parsing the content of default_definition.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 56

def default
  @default
end
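
The "initializes if required" behavior described above is a lazy, memoized class-level attribute. The following is a minimal sketch of that pattern under stated assumptions: LazyDefault and its private build method are hypothetical names, and build merely stands in for parsing default_definition.

```ruby
# Hypothetical sketch of a lazily initialized class-level default.
# `build` stands in for parsing default_definition; it is not library code.
class LazyDefault
  class << self
    attr_writer :default

    # Build the default on first access, then reuse the cached object.
    def default
      @default ||= build
    end

    # Drop the cached object so the next call to .default rebuilds it.
    def clear
      self.default = nil
      self
    end

    private

    def build
      new
    end
  end
end

first = LazyDefault.default
first.equal?(LazyDefault.default)        # => true (cached)
first.equal?(LazyDefault.clear.default)  # => false (rebuilt after clear)
```

Chaining clear.default in this way mirrors how a reload can be expressed as "clear, then read the default again".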

.default_definition ⇒ File

Gets the default definition list. It can be any IO-like object, including a File or a plain String; the object must respond to #each_line.

Returns:

  • (File)


# File 'lib/public_suffix/list.rb', line 109

def default_definition
  @default_definition
end
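
The only requirement stated above is that the definition responds to #each_line. As a small illustration, the hypothetical helper count_rules below (not part of the library) accepts both a String and a StringIO, because both implement that method.

```ruby
require "stringio"

# Count the non-blank lines of any source that responds to #each_line.
# `count_rules` is a hypothetical helper, not part of the library.
def count_rules(source)
  source.each_line.count { |line| !line.strip.empty? }
end

count_rules("com\nit\n")                # => 2
count_rules(StringIO.new("com\n\nit"))  # => 2
```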

Instance Attribute Details

#indexes ⇒ Hash (readonly)

Gets the naive index: a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @rules).

Returns:

  • (Hash)


# File 'lib/public_suffix/list.rb', line 149

def indexes
  @indexes
end

#rules ⇒ Array<PublicSuffix::Rule::*> (readonly)

Gets the array of rules.

Returns:

  • (Array<PublicSuffix::Rule::*>)

# File 'lib/public_suffix/list.rb', line 143

def rules
  @rules
end

Class Method Details

.clear ⇒ self

Sets the default rule list to nil.

Returns:

  • (self)


# File 'lib/public_suffix/list.rb', line 90

def self.clear
  self.default = nil
  self
end

.parse(input) ⇒ PublicSuffix::List

Parses the given input, treating the content as a Public Suffix List.

See publicsuffix.org/format/ for more details about the input format.

Parameters:

  • input (String)

    The rule list to parse.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 120

def self.parse(input)
  new do |list|
    input.each_line do |line|
      line.strip!
      break if !private_domains? && line.include?('===BEGIN PRIVATE DOMAINS===')
      # strip blank lines
      if line.empty?
        next
      # strip comments
      elsif line =~ %r{^//}
        next
      # append rule
      else
        list.add(Rule.factory(line), false)
      end
    end
  end
end
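
To see the filtering rules in isolation, here is a simplified sketch of the same loop. It is an illustration only: parse_rules, its private_domains keyword, and SAMPLE_LIST are hypothetical, and rules are kept as plain strings rather than PublicSuffix::Rule objects.

```ruby
# Hypothetical, simplified version of the parsing loop: rules are kept as
# plain strings rather than PublicSuffix::Rule objects.
def parse_rules(input, private_domains: true)
  rules = []
  input.each_line do |line|
    line = line.strip
    # Stop at the private section marker when private domains are disabled.
    break if !private_domains && line.include?("===BEGIN PRIVATE DOMAINS===")
    # Skip blank lines and comments.
    next if line.empty? || line.start_with?("//")
    rules << line
  end
  rules
end

SAMPLE_LIST = <<~LIST
  // ===BEGIN ICANN DOMAINS===
  com
  *.uk

  // ===BEGIN PRIVATE DOMAINS===
  blogspot.com
LIST

parse_rules(SAMPLE_LIST)                          # => ["com", "*.uk", "blogspot.com"]
parse_rules(SAMPLE_LIST, private_domains: false)  # => ["com", "*.uk"]
```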

.private_domains=(value) ⇒ PublicSuffix::List

Enables or disables support for private (non-ICANN) domains. Implicitly reloads the list.

Parameters:

  • value (Boolean)

    true to enable support, false to disable it.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 82

def self.private_domains=(value)
  @private_domains = !!value
  self.clear
end

.private_domains? ⇒ Boolean

Shows whether support for private (non-ICANN) domains is enabled.

Returns:

  • (Boolean)


# File 'lib/public_suffix/list.rb', line 73

def self.private_domains?
  @private_domains != false
end
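
Taken together, the setter and the predicate above mean that support is enabled until it is explicitly switched off: an unset flag (nil) is not equal to false, while any falsy value passed to the setter is coerced to false. A minimal sketch of that behavior, using a hypothetical FlagDemo class:

```ruby
# Hypothetical sketch of a class-level flag that defaults to enabled.
class FlagDemo
  class << self
    # Coerce any value to a strict boolean before storing it.
    def private_domains=(value)
      @private_domains = !!value
    end

    # Unset (nil) counts as enabled; only an explicit false disables it.
    def private_domains?
      @private_domains != false
    end
  end
end

FlagDemo.private_domains?       # => true (never set)
FlagDemo.private_domains = nil
FlagDemo.private_domains?       # => false (nil was coerced to false)
```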

.reload ⇒ PublicSuffix::List

Resets the default rule list and reinitializes it by parsing the content of default_definition.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 99

def self.reload
  self.clear.default
end

Instance Method Details

#==(other) ⇒ Boolean Also known as: eql?

Checks whether two lists are equal.

List one is equal to list two if two is an instance of PublicSuffix::List and both lists contain the same PublicSuffix::Rule::* objects in the same order.

Parameters:

  • other (PublicSuffix::List)

    The list to compare.

Returns:

  • (Boolean)


# File 'lib/public_suffix/list.rb', line 191

def ==(other)
  return false unless other.is_a?(List)
  self.equal?(other) ||
  self.rules == other.rules
end
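
The comparison is order-sensitive because it delegates to Array#==. The sketch below shows the same shape with a hypothetical RuleBag class that holds plain strings; it is an illustration, not the library's implementation.

```ruby
# Hypothetical stand-in showing order-sensitive equality between lists.
class RuleBag
  attr_reader :rules

  def initialize(*rules)
    @rules = rules
  end

  # Equal when other is a RuleBag holding the same rules in the same order.
  def ==(other)
    return false unless other.is_a?(RuleBag)
    equal?(other) || rules == other.rules
  end
  alias eql? ==
end

RuleBag.new("com", "it") == RuleBag.new("com", "it")  # => true
RuleBag.new("com", "it") == RuleBag.new("it", "com")  # => false
RuleBag.new("com") == ["com"]                         # => false
```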

#add(rule, index = true) ⇒ self Also known as: <<

Adds the given object to the list and optionally refreshes the rule index.

Parameters:

  • rule (PublicSuffix::Rule::*)

    The rule to add to the list.

  • index (Boolean) (defaults to: true)

    Set to true to recreate the rule index after the rule has been added to the list.

Returns:

  • (self)

# File 'lib/public_suffix/list.rb', line 223

def add(rule, index = true)
  @rules << rule
  create_index! if index == true
  self
end
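
The index flag matters mostly for bulk loading: passing false for each rule and rebuilding once at the end avoids one rebuild per rule. The sketch below makes that visible with a hypothetical CountingList whose reindex method only counts how often it is called.

```ruby
# Hypothetical stand-in that counts index rebuilds, to show why bulk
# loading passes index = false and rebuilds once at the end.
class CountingList
  attr_reader :rebuilds

  def initialize
    @rules = []
    @rebuilds = 0
  end

  def add(rule, index = true)
    @rules << rule
    reindex if index == true
    self
  end

  # Placeholder for the real index rebuild; here it only counts calls.
  def reindex
    @rebuilds += 1
  end
end

eager = CountingList.new
%w[com it org].each { |rule| eager.add(rule) }
eager.rebuilds  # => 3

bulk = CountingList.new
%w[com it org].each { |rule| bulk.add(rule, false) }
bulk.reindex
bulk.rebuilds   # => 1
```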

#clear ⇒ self

Removes all elements.

Returns:

  • (self)


# File 'lib/public_suffix/list.rb', line 248

def clear
  @rules.clear
  self
end

#create_index! ⇒ Object

Creates a naive index for @rules: a hash that records where the elements of @rules are, keyed by their first Rule::Base#labels element.

For instance, if @rules[5] and @rules[4] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [5,4]. That way, select can avoid matching every single rule against the candidate domain.



# File 'lib/public_suffix/list.rb', line 171

def create_index!
  @rules.map { |l| l.labels.first }.each_with_index do |elm, inx|
    if !@indexes.has_key?(elm)
      @indexes[elm] = [inx]
    else
      @indexes[elm] << inx
    end
  end
end
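
As a standalone illustration of this kind of index, the sketch below maps each first label to the positions of the rules that start with it. The build_index helper is hypothetical, and it assumes each rule is represented by its labels array with the rightmost label of the rule first (for example ["us", "ak"] for "ak.us"); real rules are Rule objects.

```ruby
# Build a hash from first label to the positions of the rules using it.
# Assumption: each rule is given as its labels array, rightmost label first.
def build_index(rules)
  index = Hash.new { |hash, key| hash[key] = [] }
  rules.each_with_index do |labels, position|
    index[labels.first] << position
  end
  index
end

rules = [["com"], ["us"], ["us", "ak"], ["it"]]
index = build_index(rules)
index["us"]   # => [1, 2]
index["com"]  # => [0]
```

This sketch starts from a fresh hash on every call, so calling it repeatedly never accumulates duplicate positions.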

#each(*args, &block) ⇒ Object

Iterates each rule in the list.



# File 'lib/public_suffix/list.rb', line 199

def each(*args, &block)
  @rules.each(*args, &block)
end

#empty? ⇒ Boolean

Checks whether the list is empty.

Returns:

  • (Boolean)


# File 'lib/public_suffix/list.rb', line 241

def empty?
  @rules.empty?
end

#find(domain) ⇒ PublicSuffix::Rule::*?

Returns the most appropriate rule for domain.

From the Public Suffix List documentation:

  • If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.

  • An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.

Algorithm description

  • Match domain against all rules and take note of the matching ones.

  • If no rules match, the prevailing rule is “*”.

  • If more than one rule matches, the prevailing rule is the one which is an exception rule.

  • If there is no matching exception rule, the prevailing rule is the one with the most labels.

  • If the prevailing rule is an exception rule, modify it by removing the leftmost label.

  • The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).

  • The registered domain is the public suffix plus one additional label.

Parameters:

  • domain (String, #to_s)

    The domain name.

Returns:

  • (PublicSuffix::Rule::*, nil)

# File 'lib/public_suffix/list.rb', line 277

def find(domain)
  rules = select(domain)
  rules.select { |r|   r.type == :exception }.first ||
  rules.inject { |t,r| t.length > r.length ? t : r }
end
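
The selection step (exception rule first, otherwise the rule with the most labels) can be tried out on its own. The sketch below is an illustration under stated assumptions: SketchRule and prevailing_rule are hypothetical names, and the input is assumed to be the list of rules that already matched the domain.

```ruby
# Hypothetical rule representation for illustration only.
SketchRule = Struct.new(:value, :type) do
  # Number of labels, ignoring a leading "!" on exception rules.
  def length
    value.delete_prefix("!").split(".").size
  end
end

# Prefer a matching exception rule, otherwise the rule with the most labels.
# Returns nil when `matching` is empty.
def prevailing_rule(matching)
  matching.find { |rule| rule.type == :exception } ||
    matching.max_by(&:length)
end

matches = [
  SketchRule.new("uk", :normal),
  SketchRule.new("*.uk", :wildcard),
  SketchRule.new("!british-library.uk", :exception)
]
prevailing_rule(matches).value           # => "!british-library.uk"
prevailing_rule(matches.first(2)).value  # => "*.uk"
prevailing_rule([])                      # => nil
```

One difference from the listing above: when two rules have the same length, max_by keeps the first of them, whereas the inject form keeps the last.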

#select(domain) ⇒ Array<PublicSuffix::Rule::*>

Selects all the rules matching given domain.

Uses @indexes to try only the rules that share the domain's first label, which speeds up repeated calls such as list.find('foo').

Parameters:

  • domain (String, #to_s)

    The domain name.

Returns:

  • (Array<PublicSuffix::Rule::*>)

# File 'lib/public_suffix/list.rb', line 291

def select(domain)
  # raise DomainInvalid, "Blank domain"
  return [] if domain.to_s !~ /[^[:space:]]/
  # raise DomainInvalid, "`#{domain}' is not expected to contain a scheme"
  return [] if domain.include?("://")

  indices = (@indexes[Domain.domain_to_labels(domain).first] || [])
  @rules.values_at(*indices).select { |rule| rule.match?(domain) }
end
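
The two guard clauses and the index lookup can be sketched without the library. In the illustration below, select_rules is a hypothetical helper, rules are plain strings, the index is keyed by the rightmost label, and a rule is assumed to match when the domain equals it or ends with a dot followed by it; the real matching is done by Rule#match?.

```ruby
# Hypothetical sketch of index-assisted selection. Rules are plain strings,
# and a rule "matches" when the domain equals it or ends with "." + rule.
def select_rules(rules, index, domain)
  name = domain.to_s
  return [] if name !~ /[^[:space:]]/  # blank or whitespace only
  return [] if name.include?("://")    # looks like a URL, not a domain

  tld = name.split(".").last
  positions = index[tld] || []
  rules.values_at(*positions).select do |rule|
    name == rule || name.end_with?(".#{rule}")
  end
end

rules = ["com", "co.uk", "uk"]
index = { "com" => [0], "uk" => [1, 2] }
select_rules(rules, index, "example.co.uk")       # => ["co.uk", "uk"]
select_rules(rules, index, "http://example.com")  # => []
select_rules(rules, index, "  ")                  # => []
```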

#size ⇒ Integer Also known as: length

Gets the number of elements in the list.

Returns:

  • (Integer)


# File 'lib/public_suffix/list.rb', line 233

def size
  @rules.size
end

#to_a ⇒ Array<PublicSuffix::Rule::*>

Gets the list as array.

Returns:

  • (Array<PublicSuffix::Rule::*>)

# File 'lib/public_suffix/list.rb', line 206

def to_a
  @rules
end