Class: PublicSuffix::List

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/public_suffix/list.rb

Overview

A List is a collection of one or more Rule objects.

Given a List, you can add or remove rules, iterate over all items in the list, or search for the first rule that matches a specific domain name.

# Create a new list
list = PublicSuffix::List.new

# Push two rules to the list
list << PublicSuffix::Rule.factory("it")
list << PublicSuffix::Rule.factory("com")

# Get the size of the list
list.size
# => 2

# Search for the rule matching given domain
list.find("example.com")
# => #<PublicSuffix::Rule::Normal>
list.find("example.org")
# => nil

You can create as many List instances as you want. The List.default rule list is used to tokenize and validate a domain.

List implements the Enumerable module.
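
The sketch below illustrates the pattern of including Enumerable on top of a single each method. TinyList is a hypothetical stand-in that stores plain strings; it is not part of PublicSuffix and only shows the mechanism.

```ruby
# TinyList is a hypothetical stand-in, not the real PublicSuffix::List.
class TinyList
  include Enumerable

  def initialize
    @rules = []
  end

  # Append a rule and return self so calls can be chained.
  def <<(rule)
    @rules << rule
    self
  end

  # Enumerable only requires #each; map, sort, include? etc. derive from it.
  def each(&block)
    @rules.each(&block)
  end
end

tiny = TinyList.new
tiny << "it" << "com"

tiny.map(&:upcase)   # => ["IT", "COM"]
tiny.include?("com") # => true
tiny.sort            # => ["com", "it"]
```

Note that List defines its own find and select methods, both taking a domain, so those two names do not behave like their Enumerable counterparts.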

Constant Summary

@@default =
nil

Constructor Details

#initialize {|self| ... } ⇒ List

Initializes an empty PublicSuffix::List.

Yields:

  • (self)

    Yields on self.

Yield Parameters:

  • self (PublicSuffix::List)

    The newly created instance.


# File 'lib/public_suffix/list.rb', line 63

def initialize(&block)
  @rules   = []
  @indexes = {}
  yield(self) if block_given?
  create_index!
end
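
The constructor builds the index once, after the block has run. A minimal sketch of why that matters when loading many rules, using a hypothetical BatchList stand-in that only counts how often the index would be rebuilt (the counter and the class name are assumptions for illustration):

```ruby
# BatchList is a hypothetical stand-in that counts index rebuilds.
class BatchList
  attr_reader :rules, :index_builds

  def initialize
    @rules = []
    @index_builds = 0
    yield(self) if block_given?
    rebuild_index # one rebuild after the block has added its rules
  end

  # Pass index = false to skip the rebuild for this call.
  def add(rule, index = true)
    @rules << rule
    rebuild_index if index
    self
  end

  private

  # Stands in for the real index construction.
  def rebuild_index
    @index_builds += 1
  end
end

batch = BatchList.new do |list|
  %w[com it org].each { |rule| list.add(rule, false) }
end
batch.index_builds # => 1

eager = BatchList.new
%w[com it org].each { |rule| eager.add(rule) }
eager.index_builds # => 4 (once in the constructor, once per add)
```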

Instance Attribute Details

#indexes ⇒ Hash (readonly)

Gets the naive index: a hash whose keys are the first label of every rule, each pointing to an array of integers (the positions of the matching rules in @rules).

Returns:

  • (Hash)


# File 'lib/public_suffix/list.rb', line 55

def indexes
  @indexes
end

#rules ⇒ Array<PublicSuffix::Rule::*> (readonly)

Gets the array of rules.

Returns:

  • (Array<PublicSuffix::Rule::*>)

# File 'lib/public_suffix/list.rb', line 49

def rules
  @rules
end

Class Method Details

.clear ⇒ self

Sets the default rule list to nil.

Returns:

  • (self)


# File 'lib/public_suffix/list.rb', line 229

def clear
  self.default = nil
  self
end

.default ⇒ PublicSuffix::List

Gets the default rule list. If no default list is set, initializes a new PublicSuffix::List by parsing the content of default_definition.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 212

def default
  @@default ||= parse(default_definition)
end
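
The default, clear and reload class methods together form a lazily built, resettable default. The following sketch shows that pattern in isolation. Registry, load_definition and the loads counter are hypothetical, and the sketch uses a class-level instance variable rather than the @@default class variable of the listing.

```ruby
# Registry is a hypothetical stand-in for the class-level default handling.
class Registry
  @default = nil
  @loads = 0

  class << self
    attr_writer :default
    attr_reader :loads

    # Build the value on first access only, then reuse it.
    def default
      @default ||= load_definition
    end

    # Forget the cached value; the next call to default rebuilds it.
    def clear
      self.default = nil
      self
    end

    def reload
      clear.default
    end

    private

    # Stands in for parsing the definition file; counts how often it runs.
    def load_definition
      @loads += 1
      ["com", "it"]
    end
  end
end

Registry.default
Registry.default
Registry.loads  # => 1
Registry.reload
Registry.loads  # => 2
```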

.default=(value) ⇒ PublicSuffix::List

Sets the default rule list to value.

Parameters:

  • value (PublicSuffix::List, nil)

    The new default rule list.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 222

def default=(value)
  @@default = value
end

.default_definition ⇒ File

Gets the default definition list. It can be any IO stream, including a File, or a simple String. The object must respond to #each_line.

Returns:

  • (File)


# File 'lib/public_suffix/list.rb', line 248

def default_definition
  File.new(File.join(File.dirname(__FILE__), "definitions.txt"), "r:utf-8")
end

.parse(input) ⇒ PublicSuffix::List

Parses the given input, treating the content as a Public Suffix List.

See publicsuffix.org/format/ for more details about the input format.

Parameters:

  • input (String, #each_line)

    The rule list to parse.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 260

def parse(input)
  new do |list|
    input.each_line do |line|
      line.strip!

      # strip blank lines
      if line.empty?
        next
      # strip comments
      elsif line =~ %r{^//}
        next
      # append rule
      else
        list.add(Rule.factory(line), false)
      end
    end
  end
end
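
The line filtering performed by parse can be tried without the library. In this sketch, parse_rule_lines is a hypothetical helper that collects the surviving rule strings instead of building Rule objects, and the sample definition is made up for illustration.

```ruby
# Hypothetical helper: returns rule strings instead of Rule objects.
def parse_rule_lines(input)
  rules = []
  input.each_line do |line|
    line = line.strip
    next if line.empty?            # skip blank lines
    next if line.start_with?("//") # skip comments
    rules << line
  end
  rules
end

# Sample input, not the real definition file.
definition = <<~LIST
  // Sample definition
  com

  *.uk
  !british-library.uk
LIST

parse_rule_lines(definition)
# => ["com", "*.uk", "!british-library.uk"]
```

Because the helper only calls each_line, it accepts a String, a File or any other object that responds to that method.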

.reload ⇒ PublicSuffix::List

Resets the default rule list and reinitializes it by parsing the content of default_definition.

Returns:

  • (PublicSuffix::List)

# File 'lib/public_suffix/list.rb', line 238

def reload
  self.clear.default
end

Instance Method Details

#==(other) ⇒ Boolean Also known as: eql?

Checks whether two lists are equal.

Two lists are equal if other is an instance of PublicSuffix::List and both lists contain equal PublicSuffix::Rule::* objects in the same order.

Parameters:

  • other (PublicSuffix::List)

    The list to compare.

Returns:

  • (Boolean)


# File 'lib/public_suffix/list.rb', line 97

def ==(other)
  return false unless other.is_a?(List)
  self.equal?(other) ||
  self.rules == other.rules
end

#add(rule, index = true) ⇒ self Also known as: <<

Adds the given object to the list and optionally refreshes the rule index.

Parameters:

  • rule (PublicSuffix::Rule::*)

    The rule to add to the list.

  • index (Boolean) (defaults to: true)

    Set to true to recreate the rule index after the rule has been added to the list.

Returns:

  • (self)



# File 'lib/public_suffix/list.rb', line 129

def add(rule, index = true)
  @rules << rule
  create_index! if index == true
  self
end

#clear ⇒ self

Removes all elements.

Returns:

  • (self)


# File 'lib/public_suffix/list.rb', line 154

def clear
  @rules.clear
  self
end

#create_index! ⇒ Object

Creates a naive index for @rules: a hash that maps the first element of each rule's Rule::Base#labels to the positions in @rules of the rules that start with that label.

For instance, if @rules[4] and @rules[5] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [4, 5]. That way, select can avoid matching every single rule against the candidate domain.



# File 'lib/public_suffix/list.rb', line 77

def create_index!
  @rules.map { |l| l.labels.first }.each_with_index do |elm, inx|
    if !@indexes.has_key?(elm)
      @indexes[elm] = [inx]
    else
      @indexes[elm] << inx
    end
  end
end
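
The shape of the index can be illustrated with a self-contained sketch. SketchRule and build_index are hypothetical names, and the labels are assumed to be the dot-separated parts of a rule in reverse order, so the first label is the top-level one.

```ruby
# SketchRule is a hypothetical stand-in for a rule object.
# Assumption: labels are ordered from the top-level label inwards.
SketchRule = Struct.new(:name, :labels)

sketch_rules = [
  SketchRule.new("com",   ["com"]),
  SketchRule.new("uk",    ["uk"]),
  SketchRule.new("co.uk", ["uk", "co"]),
  SketchRule.new("it",    ["it"])
]

# Map each first label to the positions of the rules that start with it.
# The hash is built from scratch on every call.
def build_index(rules)
  index = {}
  rules.each_with_index do |rule, position|
    (index[rule.labels.first] ||= []) << position
  end
  index
end

build_index(sketch_rules)
# => {"com"=>[0], "uk"=>[1, 2], "it"=>[3]}
```

Unlike the listing above, which appends to the existing @indexes hash, this sketch starts from an empty hash, so calling it repeatedly yields the same result each time.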

#each(*args, &block) ⇒ Object

Iterates each rule in the list.



# File 'lib/public_suffix/list.rb', line 105

def each(*args, &block)
  @rules.each(*args, &block)
end

#empty? ⇒ Boolean

Checks whether the list is empty.

Returns:

  • (Boolean)


# File 'lib/public_suffix/list.rb', line 147

def empty?
  @rules.empty?
end

#find(domain) ⇒ PublicSuffix::Rule::*?

Returns the most appropriate rule for domain.

From the Public Suffix List documentation:

  • If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.

  • An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.

Algorithm description

  • Match domain against all rules and take note of the matching ones.

  • If no rules match, the prevailing rule is “*”.

  • If more than one rule matches, the prevailing rule is the one which is an exception rule.

  • If there is no matching exception rule, the prevailing rule is the one with the most labels.

  • If the prevailing rule is an exception rule, modify it by removing the leftmost label.

  • The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).

  • The registered domain is the public suffix plus one additional label.

Parameters:

  • domain (String, #to_s)

    The domain name.

Returns:

  • (PublicSuffix::Rule::*, nil)

# File 'lib/public_suffix/list.rb', line 183

def find(domain)
  rules = select(domain)
  rules.select { |r|   r.type == :exception }.first ||
  rules.inject { |t,r| t.length > r.length ? t : r }
end
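
The choice of the prevailing rule among the matching ones can be sketched on its own. MatchedRule and prevailing are hypothetical names, the rules are plain structs, and the length values are illustrative label counts rather than values computed by the library.

```ruby
# MatchedRule is a hypothetical stand-in for a rule that already matched.
MatchedRule = Struct.new(:name, :type, :length)

# An exception rule wins outright; otherwise the longest rule prevails.
# Returns nil when nothing matched.
def prevailing(matches)
  matches.find { |rule| rule.type == :exception } ||
    matches.inject { |best, rule| best.length > rule.length ? best : rule }
end

uk      = MatchedRule.new("uk",      :normal,    1)
co_uk   = MatchedRule.new("co.uk",   :normal,    2)
wild_ck = MatchedRule.new("*.ck",    :wildcard,  2)
www_ck  = MatchedRule.new("!www.ck", :exception, 2)

prevailing([uk, co_uk]).name       # => "co.uk"
prevailing([wild_ck, www_ck]).name # => "!www.ck"
prevailing([])                     # => nil
```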

#select(domain) ⇒ Array<PublicSuffix::Rule::*>

Selects all the rules matching given domain.

Uses @indexes to try only the rules that share the domain's first label, which speeds things up when find('foo') is called many times.

Parameters:

  • domain (String, #to_s)

    The domain name.

Returns:

  • (Array<PublicSuffix::Rule::*>)

# File 'lib/public_suffix/list.rb', line 197

def select(domain)
  indices = (@indexes[Domain.domain_to_labels(domain).first] || [])
  @rules.values_at(*indices).select { |rule| rule.match?(domain) }
end
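
A minimal sketch of the index-assisted lookup, under stated assumptions: IndexedRule, labels_of and candidates are hypothetical names, labels are the dot-separated parts in reverse order, and a rule is taken to match when each of its labels equals the domain label at the same position. The real matching logic lives in the Rule classes and may differ.

```ruby
# IndexedRule is a hypothetical stand-in for a rule object.
IndexedRule = Struct.new(:name, :labels) do
  # Assumed matching: every rule label equals the domain label at that position.
  def match?(domain_labels)
    labels.each_with_index.all? { |label, i| domain_labels[i] == label }
  end
end

# Assumption: labels are the dot-separated parts, top-level label first.
def labels_of(domain)
  domain.to_s.split(".").reverse
end

# Look up only the rules indexed under the domain's first label.
def candidates(rules, index, domain)
  labels = labels_of(domain)
  positions = index[labels.first] || []
  rules.values_at(*positions).select { |rule| rule.match?(labels) }
end

indexed_rules = [
  IndexedRule.new("com",   ["com"]),
  IndexedRule.new("uk",    ["uk"]),
  IndexedRule.new("co.uk", ["uk", "co"]),
  IndexedRule.new("it",    ["it"])
]
rule_index = { "com" => [0], "uk" => [1, 2], "it" => [3] }

candidates(indexed_rules, rule_index, "example.co.uk").map(&:name)
# => ["uk", "co.uk"]
candidates(indexed_rules, rule_index, "example.org")
# => []
```

For "example.co.uk" only the two rules stored under "uk" are tested; the "com" and "it" rules are never touched.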

#size ⇒ Integer Also known as: length

Gets the number of elements in the list.

Returns:

  • (Integer)


# File 'lib/public_suffix/list.rb', line 139

def size
  @rules.size
end

#to_a ⇒ Array<PublicSuffix::Rule::*>

Gets the list as array.

Returns:

  • (Array<PublicSuffix::Rule::*>)

# File 'lib/public_suffix/list.rb', line 112

def to_a
  @rules
end