Class: Ai4r::Classifiers::Hyperpipes
- Inherits: Classifier
  - Object
  - Classifier
  - Ai4r::Classifiers::Hyperpipes
- Defined in: lib/ai4r/classifiers/hyperpipes.rb
Overview
Introduction
A fast classifier algorithm, created by Lucio de Souza Coelho and Len Trigg.
Instance Attribute Summary
- #data_set ⇒ Object (readonly): Returns the value of attribute data_set.
- #pipes ⇒ Object (readonly): Returns the value of attribute pipes.
Instance Method Summary
- #build(data_set) ⇒ Object: Build a new Hyperpipes classifier.
- #eval(data) ⇒ Object: Evaluate new data, predicting its class.
- #get_rules ⇒ Object: Return the generated rules as Ruby code.
- #initialize ⇒ Object (constructor)
- #pipes_summary(margin: 0) ⇒ Object: Return a summary representation of all pipes.
Methods included from Data::Parameterizable
#get_parameters, included, #set_parameters
Constructor Details
#initialize ⇒ Object
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 36
def initialize
  super()
  @tie_break = :last
  @margin = 0
  @random_seed = nil
  @rng = nil
end
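The constructor stores a random seed alongside the tie-break setting. As a minimal sketch of why a seed matters, assuming ties can be resolved by a random pick (this page does not document the available tie_break values), a seeded generator makes that pick repeatable. The helper name `pick_tied` is hypothetical and not part of the library.

```ruby
# Hypothetical helper: choose one of several tied categories at random.
# With a seed the generator is deterministic; without one it is not.
def pick_tied(categories, seed)
  rng = seed.nil? ? Random.new : Random.new(seed)
  categories.sample(random: rng)
end

tied = %w[Y N]
first_pick = pick_tied(tied, 42)
second_pick = pick_tied(tied, 42)

# Two generators built from the same seed make the same choice.
puts first_pick == second_pick
```

Leaving the seed as nil gives a fresh generator on every call, so repeated picks may differ between runs.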
Instance Attribute Details
#data_set ⇒ Object (readonly)
Returns the value of attribute data_set.
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 27
def data_set
  @data_set
end
#pipes ⇒ Object (readonly)
Returns the value of attribute pipes.
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 27
def pipes
  @pipes
end
Instance Method Details
#build(data_set) ⇒ Object
Build a new Hyperpipes classifier. You must provide a non-empty DataSet instance as the parameter. The last attribute of each item is treated as the item's class. Returns the classifier itself, so calls can be chained.
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 49
def build(data_set)
  data_set.check_not_empty
  @data_set = data_set
  @domains = data_set.build_domains

  @pipes = {}
  @domains.last.each { |cat| @pipes[cat] = build_pipe(@data_set) }
  @data_set.data_items.each { |item| update_pipe(@pipes[item.last], item) }

  self
end
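The listing above delegates to build_pipe and update_pipe, whose bodies are not shown on this page. The following is a rough, self-contained sketch of the idea, assuming each pipe keeps a min/max pair for numeric attributes and a set of seen values for nominal ones (consistent with how eval and pipes_summary read the pipes). The function name `build_pipes` and the sample rows are illustrative, not the library's.

```ruby
# Sketch only: one pipe per category, one bounds entry per attribute.
# Numeric attributes track :min/:max; nominal ones record seen values.
def build_pipes(items)
  attr_count = items.first.size - 1 # the last column is the class
  pipes = {}
  items.each do |item|
    category = item.last
    pipe = (pipes[category] ||= Array.new(attr_count) { {} })
    item[0...-1].each_with_index do |value, i|
      bounds = pipe[i]
      if value.is_a?(Numeric)
        bounds[:min] = value if bounds[:min].nil? || value < bounds[:min]
        bounds[:max] = value if bounds[:max].nil? || value > bounds[:max]
      else
        bounds[value] = true
      end
    end
  end
  pipes
end

items = [
  ['New York', 25, 'Y'],
  ['Chicago', 40, 'Y'],
  ['Chicago', 60, 'N']
]
pipes = build_pipes(items)
puts pipes['Y'][1].inspect
```

Each training item only ever widens the pipe of its own category, which is why a single pass over the data is enough.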
#eval(data) ⇒ Object
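Evaluates new data and predicts its class. Each category receives one vote for every attribute whose value falls inside that category's pipe (within [min, max] for numeric attributes, or among the recorded values for nominal ones), e.g.
classifier.eval(['New York', '<30', 'F']) # => 'Y'
Tie resolution is controlled by the tie_break parameter.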
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 67
def eval(data)
  votes = Votes.new
  @pipes.each do |category, pipe|
    pipe.each_with_index do |bounds, i|
      if data[i].is_a? Numeric
        votes.increment_category(category) if data[i].between?(bounds[:min], bounds[:max])
      elsif bounds[data[i]]
        votes.increment_category(category)
      end
    end
  end
  rng = @rng || (@random_seed.nil? ? Random.new : Random.new(@random_seed))
  votes.get_winner(@tie_break, rng: rng)
end
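The voting step can be illustrated without the library. This sketch swaps the Votes helper for a plain hash of counts and assumes that a :last tie-break means "the last tied category in insertion order", which this page does not confirm. The names `count_votes` and `winner` and the pipe contents are made up for the example.

```ruby
# Sketch only: count, per category, how many attributes of `data`
# fall inside that category's pipe.
def count_votes(pipes, data)
  pipes.transform_values do |pipe|
    pipe.each_with_index.count do |bounds, i|
      value = data[i]
      if value.is_a?(Numeric)
        value.between?(bounds[:min], bounds[:max])
      else
        bounds[value]
      end
    end
  end
end

# Assumed tie-break: among the top-scoring categories, take the last one.
def winner(votes)
  best = votes.values.max
  votes.select { |_category, count| count == best }.keys.last
end

pipes = {
  'Y' => [{ 'New York' => true, 'Chicago' => true }, { min: 25, max: 40 }],
  'N' => [{ 'Chicago' => true }, { min: 35, max: 60 }]
}

clear_case = count_votes(pipes, ['New York', 30])
tied_case = count_votes(pipes, ['Chicago', 38])
puts winner(clear_case)
puts winner(tied_case)
```

In the second call both categories match on every attribute, so the outcome depends entirely on the tie-break rule.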
#get_rules ⇒ Object
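This method returns the generated rules as a string of Ruby code. The string creates a Votes tally, adds one conditional vote per category and attribute, and finally assigns the winning category to the class label, e.g.
classifier.get_rules
# => votes = Votes.new
#    votes.increment_category('Y') if age >= 18 && age <= 85
#    ...
#    marketing_target = votes.get_winner(:last)
It is a convenient way to inspect induction results, and also to execute them. The generated code reads each attribute through a variable named after its label, so those variables (and the Votes class) must be available where the code is evaluated:
age = 25
marketing_target = nil
eval classifier.get_rules
puts marketing_target
# => 'Y'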
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 99
def get_rules
  rules = []
  rules << 'votes = Votes.new'
  data = @data_set.data_items.first
  labels = @data_set.data_labels.collect(&:to_s)
  @pipes.each do |category, pipe|
    pipe.each_with_index do |bounds, i|
      rule = "votes.increment_category('#{category}') "
      rule += if data[i].is_a? Numeric
                "if #{labels[i]} >= #{bounds[:min]} && #{labels[i]} <= #{bounds[:max]}"
              else
                "if #{bounds.inspect}[#{labels[i]}]"
              end
      rules << rule
    end
  end
  rules << "#{labels.last} = votes.get_winner(:#{@tie_break})"
  rules.join("\n")
end
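To show how rule strings of this kind can be generated and then executed, here is a simplified sketch for numeric attributes only. It deliberately replaces the library's Votes object with a plain Hash so that it runs on its own; the function name `numeric_rules`, the pipe values, and the winner selection via max_by are all assumptions made for the example.

```ruby
# Sketch only: emit Ruby source that tallies votes in a Hash
# instead of the library's Votes class.
def numeric_rules(pipes, labels, class_label)
  rules = ['votes = Hash.new(0)']
  pipes.each do |category, pipe|
    pipe.each_with_index do |bounds, i|
      condition = "#{labels[i]} >= #{bounds[:min]} && #{labels[i]} <= #{bounds[:max]}"
      rules << "votes['#{category}'] += 1 if #{condition}"
    end
  end
  rules << "#{class_label} = votes.max_by { |_category, count| count }&.first"
  rules.join("\n")
end

pipes = { 'Y' => [{ min: 18, max: 30 }], 'N' => [{ min: 31, max: 85 }] }
source = numeric_rules(pipes, ['age'], 'marketing_target')

# The generated code reads `age` and writes `marketing_target`,
# so both must exist as local variables before eval runs.
age = 25
marketing_target = nil
eval(source)
puts marketing_target
```

Evaluating generated source is only appropriate when the labels and categories come from trusted data, since they are interpolated into code verbatim.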
#pipes_summary(margin: 0) ⇒ Object
Return a summary representation of all pipes.
The returned hash maps each category to another hash where the keys are attribute labels and the values are either numeric ranges `[min, max]` (including the optional margin) or a Set of nominal values.
classifier.pipes_summary
# => { "Y" => { "city" => Set['New York', 'Chicago'],
"age" => [18, 85],
"gender" => Set['M', 'F'] },
"N" => { ... } }
The optional margin parameter expands numeric bounds by the given fraction of the range width on each side. A value of 0.1 lowers the minimum and raises the maximum by 10% of the original width, so the range becomes 20% wider overall.
# File 'lib/ai4r/classifiers/hyperpipes.rb', line 138
def pipes_summary(margin: 0)
  raise 'Model not built yet' unless @data_set && @pipes

  labels = @data_set.data_labels[0...-1]
  summary = {}
  @pipes.each do |category, pipe|
    attr_summary = {}
    pipe.each_with_index do |bounds, i|
      if bounds.is_a?(Hash) && bounds.key?(:min) && bounds.key?(:max)
        min = bounds[:min]
        max = bounds[:max]
        range_margin = (max - min) * margin
        attr_summary[labels[i]] = [min - range_margin, max + range_margin]
      else
        attr_summary[labels[i]] = bounds.select { |_k, v| v }.keys.to_set
      end
    end
    summary[category] = attr_summary
  end
  summary
end
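The margin arithmetic in the listing can be checked in isolation. This small sketch repeats the same two lines of computation in a standalone function; the name `expand_range` and the sample bounds are illustrative only.

```ruby
# Same computation as in pipes_summary: widen both ends of the range
# by `margin` times the original width.
def expand_range(min, max, margin)
  range_margin = (max - min) * margin
  [min - range_margin, max + range_margin]
end

unchanged = expand_range(20, 40, 0)
expanded = expand_range(20, 40, 0.1)

original_width = 40 - 20
expanded_width = expanded[1] - expanded[0]
puts unchanged.inspect
puts expanded.inspect
puts expanded_width / original_width
```

Because the margin is applied to both ends, the expanded width is (1 + 2 * margin) times the original width.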