Class: Ai4r::Classifiers::Hyperpipes

Inherits:
Classifier
Defined in:
lib/ai4r/classifiers/hyperpipes.rb

Overview

Introduction

A fast classifier algorithm, created by Lucio de Souza Coelho and Len Trigg.

Instance Attribute Summary

Instance Method Summary

Methods included from Data::Parameterizable

#get_parameters, included, #set_parameters

Constructor Details

#initialize ⇒ Object



# File 'lib/ai4r/classifiers/hyperpipes.rb', line 36

def initialize
  super()
  @tie_break = :last
  @margin = 0
  @random_seed = nil
  @rng = nil
end

Instance Attribute Details

#data_set ⇒ Object (readonly)

Returns the value of attribute data_set.



# File 'lib/ai4r/classifiers/hyperpipes.rb', line 27

def data_set
  @data_set
end

#pipes ⇒ Object (readonly)

Returns the value of attribute pipes.



# File 'lib/ai4r/classifiers/hyperpipes.rb', line 27

def pipes
  @pipes
end

Instance Method Details

#build(data_set) ⇒ Object

Builds a new Hyperpipes classifier. You must provide a DataSet instance as the parameter. The last attribute of each item is treated as the item's class. One pipe is created per class, and each training item then updates the pipe of its own class.

Parameters:

  • data_set (Object)

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/hyperpipes.rb', line 49

def build(data_set)
  data_set.check_not_empty
  @data_set = data_set
  @domains = data_set.build_domains

  @pipes = {}
  @domains.last.each { |cat| @pipes[cat] = build_pipe(@data_set) }
  @data_set.data_items.each { |item| update_pipe(@pipes[item.last], item) }

  self
end
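
The private helpers build_pipe and update_pipe are not shown on this page. The following is a minimal, self-contained sketch of what building pipes could look like, assuming the pipe layout that #eval and #pipes_summary read: a :min/:max hash for numeric attributes and a value => true hash for nominal ones. The names empty_pipe and update_pipe below are illustrative stand-ins, not the library's implementation.

```ruby
# Toy training items; the last element of each item is its class.
items = [
  ['New York', 25, 'Y'],
  ['Chicago',  40, 'N'],
  ['New York', 31, 'Y']
]

# Hypothetical helper: one empty pipe with a slot per non-class attribute.
# Numeric attributes start with inverted bounds, nominal ones with no seen values.
def empty_pipe(sample)
  sample[0...-1].map do |value|
    if value.is_a?(Numeric)
      { min: Float::INFINITY, max: -Float::INFINITY }
    else
      Hash.new(false)
    end
  end
end

# Hypothetical helper: widen the pipe so that it contains the given item.
def update_pipe(pipe, item)
  item[0...-1].each_with_index do |value, i|
    bounds = pipe[i]
    if value.is_a?(Numeric)
      bounds[:min] = value if value < bounds[:min]
      bounds[:max] = value if value > bounds[:max]
    else
      bounds[value] = true
    end
  end
end

# Same two passes as #build: create one pipe per class, then feed every item
# into the pipe of its own class.
pipes = {}
items.map(&:last).uniq.each { |category| pipes[category] = empty_pipe(items.first) }
items.each { |item| update_pipe(pipes[item.last], item) }
```

After this runs, the 'Y' pipe covers the city 'New York' and ages 25 through 31, while the 'N' pipe covers 'Chicago' and the single age 40.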

#eval(data) ⇒ Object

You can evaluate new data, predicting its class. e.g.

classifier.eval(['New York',  '<30', 'F'])  # => 'Y'

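Each category receives one vote for every attribute of the item that falls inside that category's pipe, and the category with the most votes wins. Tie resolution is controlled by the tie_break parameter.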

Parameters:

  • data (Object)

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/hyperpipes.rb', line 67

def eval(data)
  votes = Votes.new
  @pipes.each do |category, pipe|
    pipe.each_with_index do |bounds, i|
      if data[i].is_a? Numeric
        votes.increment_category(category) if data[i].between?(bounds[:min], bounds[:max])
      elsif bounds[data[i]]
        votes.increment_category(category)
      end
    end
  end
  rng = @rng || (@random_seed.nil? ? Random.new : Random.new(@random_seed))
  votes.get_winner(@tie_break, rng: rng)
end
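
The voting loop above can be tried in isolation. This sketch assumes hand-written pipes and replaces the library's Votes class with a plain Hash counter, so tie-breaking and the random number generator are not modeled; count_votes is an illustrative name, not part of the library.

```ruby
# Pipes in the shape that #eval reads: one entry per attribute, per category.
pipes = {
  'Y' => [{ 'New York' => true }, { min: 18, max: 30 }],
  'N' => [{ 'Chicago' => true },  { min: 25, max: 60 }]
}

# Count, for each category, how many attributes of the item fall inside its pipe.
def count_votes(pipes, data)
  votes = Hash.new(0)
  pipes.each do |category, pipe|
    pipe.each_with_index do |bounds, i|
      hit = if data[i].is_a?(Numeric)
              data[i].between?(bounds[:min], bounds[:max])
            else
              bounds[data[i]]
            end
      votes[category] += 1 if hit
    end
  end
  votes
end

votes = count_votes(pipes, ['New York', 28])
# 'Y' matches on city and age (2 votes); 'N' matches on age only (1 vote).
winner = votes.max_by { |_category, count| count }.first
```

Here winner is 'Y'. When two categories collect the same number of votes, the real classifier defers to Votes#get_winner with the configured tie_break strategy, which this sketch leaves out.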

#get_rules ⇒ Object

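This method returns the generated rules as a string of Ruby code. For Hyperpipes the rules are a sequence of voting statements, one per category and attribute, followed by an assignment of the winning category. In the example, {...} stands for the inspected hash of nominal values:

classifier.get_rules
  # =>  votes = Votes.new
        votes.increment_category('Y') if {...}[city]
        votes.increment_category('Y') if {...}[age_range]
        ...
        marketing_target = votes.get_winner(:last)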

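It is a nice way to inspect induction results, and also to execute them. The attribute variables named in the rules (for example city and age_range) must be defined before evaluation: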

marketing_target = nil
eval classifier.get_rules
puts marketing_target
  # =>  'Y'

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/hyperpipes.rb', line 99

def get_rules
  rules = []
  rules << 'votes = Votes.new'
  data = @data_set.data_items.first
  labels = @data_set.data_labels.collect(&:to_s)
  @pipes.each do |category, pipe|
    pipe.each_with_index do |bounds, i|
      rule = "votes.increment_category('#{category}') "
      rule += if data[i].is_a? Numeric
                "if #{labels[i]} >= #{bounds[:min]} && #{labels[i]} <= #{bounds[:max]}"
              else
                "if #{bounds.inspect}[#{labels[i]}]"
              end
      rules << rule
    end
  end
  rules << "#{labels.last} = votes.get_winner(:#{@tie_break})"
  rules.join("\n")
end
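
To show how a rule string of this shape can be executed, here is a self-contained sketch. The Votes class below is a simplified stand-in written for this example (the library's own Votes class is not shown on this page, and this version ignores the tie-break strategy), and the rule string is hand-written in the format that get_rules emits rather than produced by a trained classifier.

```ruby
# Simplified, hypothetical stand-in for the library's Votes class.
class Votes
  def initialize
    @counts = Hash.new(0)
  end

  def increment_category(category)
    @counts[category] += 1
  end

  # Ignores the tie-break strategy: returns the category with the most votes.
  def get_winner(_tie_break = :last)
    @counts.max_by { |_category, count| count }&.first
  end
end

# Hand-written rules in the format get_rules produces: a numeric test per
# numeric attribute and a hash lookup per nominal attribute.
rules = <<~RUBY
  votes = Votes.new
  votes.increment_category('Y') if age >= 18 && age <= 30
  votes.increment_category('N') if age >= 25 && age <= 60
  votes.increment_category('Y') if {"New York"=>true}[city]
  marketing_target = votes.get_winner(:last)
RUBY

# The rules read the attribute variables and assign the class variable,
# so all of them must exist as locals before eval is called.
age = 28
city = 'New York'
marketing_target = nil
eval(rules)
```

After eval, marketing_target holds 'Y', since that category collects two votes against one for 'N'. Because eval runs arbitrary code, this pattern is best kept to rule strings you generated yourself.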

#pipes_summary(margin: 0) ⇒ Object

Return a summary representation of all pipes.

The returned hash maps each category to another hash where the keys are attribute labels and the values are either numeric ranges [min, max] (including the optional margin) or a Set of nominal values.

classifier.pipes_summary
  # => { "Y" => { "city" => #{Set['New York', 'Chicago']},
                 "age" => [18, 85],
                 "gender" => #{Set['M', 'F']} },
       "N" => { ... } }

The optional margin parameter expands numeric bounds by the given fraction of the range width on each side. A value of 0.1 lowers min and raises max by 10% of (max - min) each, so the range becomes 20% wider overall.

Parameters:

  • margin (Object) (defaults to: 0)

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/hyperpipes.rb', line 138

def pipes_summary(margin: 0)
  raise 'Model not built yet' unless @data_set && @pipes

  labels = @data_set.data_labels[0...-1]
  summary = {}
  @pipes.each do |category, pipe|
    attr_summary = {}
    pipe.each_with_index do |bounds, i|
      if bounds.is_a?(Hash) && bounds.key?(:min) && bounds.key?(:max)
        min = bounds[:min]
        max = bounds[:max]
        range_margin = (max - min) * margin
        attr_summary[labels[i]] = [min - range_margin, max + range_margin]
      else
        attr_summary[labels[i]] = bounds.select { |_k, v| v }.keys.to_set
      end
    end
    summary[category] = attr_summary
  end
  summary
end
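
The margin arithmetic in the numeric branch can be checked on its own. This small sketch copies that calculation into a helper named expand, a name chosen for this example only.

```ruby
# Same arithmetic as the numeric branch of pipes_summary: the margin is a
# fraction of the range width and is applied to both ends.
def expand(min, max, margin)
  range_margin = (max - min) * margin
  [min - range_margin, max + range_margin]
end

expand(20, 60, 0.25)  # => [10.0, 70.0]
expand(20, 60, 0)     # => [20, 60]
```

With a margin of 0.25 the width grows from 40 to 60, because a quarter of the original width is added on each side.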