Class: Ai4r::Classifiers::NaiveBayes

Inherits:
Classifier
Defined in:
lib/ai4r/classifiers/naive_bayes.rb

Overview

Probabilistic classifier based on Bayes’ theorem.
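
For reference, the standard formulation behind a naive Bayes classifier can be written as follows. This is the textbook form, not a statement taken from the library source; the symbols c (a class) and x_1, …, x_n (the attribute values of one entry) are notation introduced here. The proportionality step relies on the "naive" assumption that attributes are conditionally independent given the class.

```latex
\[
P(c \mid x_1, \dots, x_n)
  = \frac{P(c)\, P(x_1, \dots, x_n \mid c)}{P(x_1, \dots, x_n)}
  \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
\]
```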

Defined Under Namespace

Classes: DataEntry

Instance Attribute Summary

Instance Method Summary

Methods included from Data::Parameterizable

#get_parameters, included, #set_parameters

Constructor Details

#initialize ⇒ Object



# File 'lib/ai4r/classifiers/naive_bayes.rb', line 68

def initialize
  super()
  @m = 0
  @unknown_value_strategy = :ignore
  @class_counts = []
  @class_prob = [] # stores the probability of the classes
  @pcc = [] # stores the number of instances divided into attribute/value/class
  @pcp = [] # stores the conditional probabilities of the values of an attribute
  @klass_index = {} # hashmap for quick lookup of all the used klasses and their indices
  @values = {} # hashmap for quick lookup of all the values
end

Instance Attribute Details

#class_prob ⇒ Object (readonly)

Returns the value of attribute class_prob.



# File 'lib/ai4r/classifiers/naive_bayes.rb', line 60

def class_prob
  @class_prob
end

#pcc ⇒ Object (readonly)

Returns the value of attribute pcc.



# File 'lib/ai4r/classifiers/naive_bayes.rb', line 60

def pcc
  @pcc
end

#pcp ⇒ Object (readonly)

Returns the value of attribute pcp.



# File 'lib/ai4r/classifiers/naive_bayes.rb', line 60

def pcp
  @pcp
end

Instance Method Details

#build(data) ⇒ Object

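Counts the values of the attribute instances and calculates the class probabilities and the conditional probabilities. The data parameter must be an instance of Ai4r::Data::DataSet and must contain at least one data item; otherwise an error is raised. Returns the classifier itself.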

Parameters:

  • data (Object)

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/naive_bayes.rb', line 118

def build(data)
  raise 'Error instance must be passed' unless data.is_a?(Ai4r::Data::DataSet)
  raise 'Data should not be empty' if data.data_items.empty?

  initialize_domain_data(data)
  initialize_klass_index
  initialize_pc
  calculate_probabilities

  self
end
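
The counting and probability steps described above can be illustrated with a minimal, self-contained sketch. It does not use the library and is not its implementation: the training rows and the names class_counts, class_prob, value_counts and cond_prob are hypothetical, and no smoothing is applied.

```ruby
# Hypothetical training rows: attribute values followed by the class label.
rows = [
  %w[Red Sports Yes],
  %w[Red SUV No],
  %w[Yellow SUV No],
  %w[Yellow Sports Yes]
]

# Count how often each class occurs.
class_counts = Hash.new(0)
rows.each { |row| class_counts[row.last] += 1 }

# Class priors: P(class) = count(class) / number of rows.
class_prob = class_counts.transform_values { |count| count.to_f / rows.size }

# Count every (attribute index, value, class) combination.
value_counts = Hash.new(0)
rows.each do |row|
  klass = row.last
  row[0...-1].each_with_index do |value, attr|
    value_counts[[attr, value, klass]] += 1
  end
end

# Conditional probabilities:
# P(value | class) = count(attr, value, class) / count(class).
cond_prob = value_counts.to_h do |(attr, value, klass), count|
  [[attr, value, klass], count.to_f / class_counts[klass]]
end
```

Combinations that never occur in the training rows have no entry in cond_prob here; how a real classifier treats such cases is a separate design decision.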

#eval(data) ⇒ Object

Evaluates new data and predicts its category, e.g.:

b.eval(["Red", "SUV", "Domestic"])
  => 'No'

Parameters:

  • data (Object)

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/naive_bayes.rb', line 86

def eval(data)
  prob = @class_prob.dup
  prob = calculate_class_probabilities_for_entry(data, prob)
  index_to_klass(prob.index(prob.max))
end
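
The listing above starts from the class probabilities, adjusts them for the entry, and returns the class with the highest score. A minimal standalone sketch of that idea follows. The method classify, the table layout and all probability values are hypothetical, and skipping unseen values is an assumed policy rather than a description of the library's behavior.

```ruby
# Hypothetical model values, not taken from the library.
klasses = %w[Yes No]
class_prob = [0.5, 0.5]
# cond_prob[attribute_index][value][class_index] = P(value | class)
cond_prob = [
  { 'Red' => [0.6, 0.4], 'Yellow' => [0.4, 0.6] },
  { 'SUV' => [0.2, 0.6], 'Sports' => [0.8, 0.4] }
]

# Multiply each class prior by the likelihood of every attribute value,
# then return the class with the largest resulting score.
def classify(entry, klasses, class_prob, cond_prob)
  scores = class_prob.dup
  entry.each_with_index do |value, attr|
    likelihoods = cond_prob[attr][value]
    next if likelihoods.nil? # assumed policy: skip values never seen in training

    scores = scores.each_with_index.map { |score, k| score * likelihoods[k] }
  end
  klasses[scores.index(scores.max)]
end

prediction = classify(%w[Red SUV], klasses, class_prob, cond_prob)
```

With these made-up numbers, the entry ["Red", "SUV"] scores 0.06 for "Yes" and 0.12 for "No", so the sketch picks "No".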

#get_probability_map(data) ⇒ Object

Calculates the class probabilities for a data entry. data must be an array with the same dimension as the training data, minus the class column. Returns a map containing all classes as keys: {Class_1 => probability_1, Class_2 => probability_2, …}. Each probability is a Float <= 1. For example:

b.get_probability_map(["Red", "SUV", "Domestic"])
  => {"Yes"=>0.4166666666666667, "No"=>0.5833333333333334}

Parameters:

  • data (Object)

Returns:

  • (Object)


# File 'lib/ai4r/classifiers/naive_bayes.rb', line 103

def get_probability_map(data)
  prob = @class_prob.dup
  prob = calculate_class_probabilities_for_entry(data, prob)
  prob = normalize_class_probability prob
  probability_map = {}
  prob.each_with_index { |p, i| probability_map[index_to_klass(i)] = p }

  probability_map
end
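
The normalization step and the construction of the map can be sketched on their own. The sketch below assumes that normalizing means dividing each score by the sum of all scores, which is the usual approach but is not confirmed by the listing above; the scores and class names are hypothetical.

```ruby
klasses = %w[Yes No]
scores = [0.05, 0.07] # hypothetical unnormalized class scores

# Divide each score by the total so the results sum to 1.
total = scores.sum
normalized = scores.map { |score| score / total }

# Build the class => probability map, index by index.
probability_map = {}
normalized.each_with_index { |prob, i| probability_map[klasses[i]] = prob }
```

A production version would also need to decide what to return when every score is zero, since the division is undefined in that case.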

#get_rules ⇒ Object

Naive Bayes classifiers cannot generate human-readable rules. This method returns a descriptive string explaining that rule extraction is not supported for this algorithm.



# File 'lib/ai4r/classifiers/naive_bayes.rb', line 133

def get_rules
  'NaiveBayes does not support rule extraction.'
end