Module: Nimbus::LossFunctions

Defined in:
lib/nimbus/loss_functions.rb

Overview

Math functions.

The LossFunctions module provides mathematical helper functions as class methods. Tree and Forest use them to estimate predictions, errors and loss values for training and testing data.


Class Method Details

.average(ids, value_table) ⇒ Object

Simple average of the values of the given individuals: sum(values) / n, where n is the number of ids.



# File 'lib/nimbus/loss_functions.rb', line 17

def average(ids, value_table)
  ids.inject(0.0){|sum, i| sum + value_table[i]} / ids.size
end
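
A minimal usage sketch for average. With the nimbus gem loaded the call would presumably be Nimbus::LossFunctions.average; so that the snippet runs on its own, it uses LossDemo, a hypothetical stand-in module that copies the method body shown above. The Hash passed as value_table is also an assumption: the method only requires an object that responds to [] with an id.

```ruby
# LossDemo is a hypothetical stand-in that copies the method body shown
# above, so the snippet runs without the nimbus gem installed.
module LossDemo
  def self.average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + value_table[i] } / ids.size
  end
end

# Assumed data layout: individual id => observed value.
value_table = { 1 => 2.0, 2 => 4.0, 3 => 9.0 }

avg_all    = LossDemo.average([1, 2, 3], value_table) # (2 + 4 + 9) / 3
avg_subset = LossDemo.average([1, 2], value_table)    # (2 + 4) / 2
avg_empty  = LossDemo.average([], value_table)        # 0.0 / 0 gives NaN, no exception

puts avg_all, avg_subset, avg_empty
```

Because the accumulator starts at 0.0, an empty list of ids yields NaN rather than raising ZeroDivisionError, so callers may want to guard against empty nodes themselves.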

.class_sizes(ids, value_table, classes) ⇒ Object

Array with the number of individuals in each class, counted over the given list of individual ids. The counts follow the order of classes.



# File 'lib/nimbus/loss_functions.rb', line 81

def class_sizes(ids, value_table, classes)
  classes.map{|c| ids.count{|i| value_table[i] == c}}
end
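
A short sketch of how class_sizes counts individuals per class. LossDemo is a hypothetical stand-in module copying the method body shown above, and the Hash layout and class labels are invented for illustration.

```ruby
# LossDemo is a hypothetical stand-in that copies the method body shown above.
module LossDemo
  def self.class_sizes(ids, value_table, classes)
    classes.map { |c| ids.count { |i| value_table[i] == c } }
  end
end

# Assumed data layout: individual id => class label.
value_table = { 1 => "a", 2 => "b", 3 => "a", 4 => "c" }
classes = ["a", "b", "c", "d"]

# Only ids 1, 2 and 3 are counted; id 4 is not in the list.
sizes = LossDemo.class_sizes([1, 2, 3], value_table, classes)
p sizes
```

The result has one entry per element of classes, in the same order, and a class with no members among the ids contributes 0.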

.class_sizes_in_list(list, classes) ⇒ Object

Array with the number of occurrences of each class in the given list of class labels. The counts follow the order of classes.



# File 'lib/nimbus/loss_functions.rb', line 86

def class_sizes_in_list(list, classes)
  classes.map{|c| list.count{|i| i == c}}
end

.gini_index(ids, value_table, classes) ⇒ Object

Gini index of a list of classified individuals.

If a dataset T contains examples from n classes, then gini(T) = 1 - Sum (Pj)^2, where Pj is the relative frequency of class j in T. The result is rounded to 5 decimal places.



# File 'lib/nimbus/loss_functions.rb', line 57

def gini_index(ids, value_table, classes)
  total_size = ids.size.to_f
  gini = 1 - class_sizes(ids, value_table, classes).inject(0.0){|sum, size|
    sum + (size/total_size)**2}
  gini.round(5)
end
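
A worked sketch of the Gini formula on three small nodes. LossDemo is a hypothetical stand-in module that copies the bodies of class_sizes and gini_index shown above, and the data is invented for illustration.

```ruby
# LossDemo is a hypothetical stand-in that copies the method bodies shown above.
module LossDemo
  def self.class_sizes(ids, value_table, classes)
    classes.map { |c| ids.count { |i| value_table[i] == c } }
  end

  def self.gini_index(ids, value_table, classes)
    total_size = ids.size.to_f
    gini = 1 - class_sizes(ids, value_table, classes).inject(0.0) { |sum, size|
      sum + (size / total_size)**2
    }
    gini.round(5)
  end
end

# Assumed data layout: individual id => class label.
value_table = { 1 => "a", 2 => "a", 3 => "a", 4 => "b", 5 => "b" }
classes = ["a", "b"]

pure     = LossDemo.gini_index([1, 2, 3], value_table, classes)    # 1 - 1.0**2
balanced = LossDemo.gini_index([2, 3, 4, 5], value_table, classes) # 1 - (0.5**2 + 0.5**2)
skewed   = LossDemo.gini_index([1, 2, 3, 4], value_table, classes) # 1 - (0.75**2 + 0.25**2)

puts pure, balanced, skewed
```

A node containing a single class scores 0, and with two classes the index peaks at 0.5 when both are equally frequent.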

.majority_class(ids, value_table, classes) ⇒ Object

Majority class of a list of classified individuals. If several classes tie for the largest number of individuals, one of them is selected at random.



# File 'lib/nimbus/loss_functions.rb', line 67

def majority_class(ids, value_table, classes)
  sizes = class_sizes(ids, value_table, classes)
  Hash[classes.zip sizes].keep_if{|k,v| v == sizes.max}.keys.sample
end
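
A sketch of the tie-breaking behaviour of majority_class. LossDemo is a hypothetical stand-in module copying the method bodies shown above, and the data is invented for illustration.

```ruby
# LossDemo is a hypothetical stand-in that copies the method bodies shown above.
module LossDemo
  def self.class_sizes(ids, value_table, classes)
    classes.map { |c| ids.count { |i| value_table[i] == c } }
  end

  def self.majority_class(ids, value_table, classes)
    sizes = class_sizes(ids, value_table, classes)
    Hash[classes.zip(sizes)].keep_if { |k, v| v == sizes.max }.keys.sample
  end
end

# Assumed data layout: individual id => class label.
value_table = { 1 => "a", 2 => "a", 3 => "a", 4 => "b", 5 => "b" }
classes = ["a", "b"]

# Three "a" against one "b": the result is deterministic.
clear = LossDemo.majority_class([1, 2, 3, 4], value_table, classes)

# Two "a" against two "b": either label may come back on a given call.
tied = LossDemo.majority_class([2, 3, 4, 5], value_table, classes)

puts clear, tied
```

Since ties are resolved with Array#sample, repeated calls on the same tied node can return different classes unless the caller controls the global random seed.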

.majority_class_in_list(list, classes) ⇒ Object

Majority class of a list of class labels. If several classes tie for the largest number of occurrences, one of them is selected at random.



# File 'lib/nimbus/loss_functions.rb', line 75

def majority_class_in_list(list, classes)
  sizes = classes.map{|c| list.count{|i| i == c}}
  Hash[classes.zip sizes].keep_if{|k,v| v == sizes.max}.keys.sample
end
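
The two list variants take class labels directly instead of ids plus a value table. A minimal sketch, where LossDemo is a hypothetical stand-in module copying the bodies of class_sizes_in_list and majority_class_in_list shown above and the labels are invented:

```ruby
# LossDemo is a hypothetical stand-in that copies the method bodies shown above.
module LossDemo
  def self.class_sizes_in_list(list, classes)
    classes.map { |c| list.count { |i| i == c } }
  end

  def self.majority_class_in_list(list, classes)
    sizes = classes.map { |c| list.count { |i| i == c } }
    Hash[classes.zip(sizes)].keep_if { |k, v| v == sizes.max }.keys.sample
  end
end

# Hypothetical list of class labels.
labels = ["b", "a", "b", "c", "b"]
classes = ["a", "b", "c"]

sizes  = LossDemo.class_sizes_in_list(labels, classes)    # counts in the order of classes
winner = LossDemo.majority_class_in_list(labels, classes) # "b" occurs most often

p sizes
puts winner
```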

.mean_squared_error(ids, value_table, mean = nil) ⇒ Object

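Mean squared error: sum (x-y)^2, where x is the value of each individual and y is the mean (computed with average when not given). Despite the name, the sum is not divided by n; quadratic_loss returns the averaged form.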



# File 'lib/nimbus/loss_functions.rb', line 22

def mean_squared_error(ids, value_table, mean = nil)
  mean ||= self.average ids, value_table
  ids.inject(0.0){|sum, i| sum + ((value_table[i] - mean)**2) }
end

.pseudo_huber_error(ids, value_table, mean = nil) ⇒ Object

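Simplified Huber function: sum log(cosh(x-y)), where x is the value of each individual and y is the mean (computed with average when not given).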



# File 'lib/nimbus/loss_functions.rb', line 40

def pseudo_huber_error(ids, value_table, mean = nil)
  mean ||= self.average ids, value_table
  ids.inject(0.0){|sum, i| sum + (Math.log(Math.cosh(value_table[i] - mean))) }
end

.pseudo_huber_loss(ids, value_table, mean = nil) ⇒ Object

Simplified Huber loss function: pseudo_huber_error / n



# File 'lib/nimbus/loss_functions.rb', line 46

def pseudo_huber_loss(ids, value_table, mean = nil)
  self.pseudo_huber_error(ids, value_table, mean) / ids.size
end
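
A sketch of why log(cosh(d)) behaves like a Huber-style penalty: for a small deviation d it is close to d^2 / 2, and for a large one it grows roughly like |d| - log(2), so outliers are penalised linearly rather than quadratically. LossDemo is a hypothetical stand-in module copying the method bodies shown above, and the data and the explicit mean of 0.0 are invented for illustration.

```ruby
# LossDemo is a hypothetical stand-in that copies the method bodies shown above.
module LossDemo
  def self.average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + value_table[i] } / ids.size
  end

  def self.pseudo_huber_error(ids, value_table, mean = nil)
    mean ||= average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + Math.log(Math.cosh(value_table[i] - mean)) }
  end

  def self.pseudo_huber_loss(ids, value_table, mean = nil)
    pseudo_huber_error(ids, value_table, mean) / ids.size
  end
end

# Assumed data layout: individual id => observed value.
value_table = { 1 => 0.0, 2 => 0.1, 3 => 10.0 }

# An explicit mean of 0.0 is passed so each deviation equals the value itself.
small = LossDemo.pseudo_huber_error([2], value_table, 0.0) # close to 0.1**2 / 2
large = LossDemo.pseudo_huber_error([3], value_table, 0.0) # close to 10 - log(2)
loss  = LossDemo.pseudo_huber_loss([1, 2, 3], value_table, 0.0)

puts small, large, loss
```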

.quadratic_loss(ids, value_table, mean = nil) ⇒ Object

Quadratic loss: the result of mean_squared_error averaged over the individuals: sum (x-y)^2 / n

Default loss function for regression forests.



# File 'lib/nimbus/loss_functions.rb', line 30

def quadratic_loss(ids, value_table, mean = nil)
  self.mean_squared_error(ids, value_table, mean) / ids.size
end
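
A worked sketch relating mean_squared_error and quadratic_loss. LossDemo is a hypothetical stand-in module copying the method bodies shown above, and the data is invented for illustration.

```ruby
# LossDemo is a hypothetical stand-in that copies the method bodies shown above.
module LossDemo
  def self.average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + value_table[i] } / ids.size
  end

  def self.mean_squared_error(ids, value_table, mean = nil)
    mean ||= average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + ((value_table[i] - mean)**2) }
  end

  def self.quadratic_loss(ids, value_table, mean = nil)
    mean_squared_error(ids, value_table, mean) / ids.size
  end
end

# Assumed data layout: individual id => observed value. The mean is 3.0.
value_table = { 1 => 1.0, 2 => 2.0, 3 => 3.0, 4 => 6.0 }
ids = [1, 2, 3, 4]

summed   = LossDemo.mean_squared_error(ids, value_table)      # 4 + 1 + 0 + 9
averaged = LossDemo.quadratic_loss(ids, value_table)          # summed / 4
around_0 = LossDemo.mean_squared_error(ids, value_table, 0.0) # 1 + 4 + 9 + 36

puts summed, averaged, around_0
```

Passing the optional mean measures the deviation from a reference value other than the mean of the listed individuals.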

.squared_difference(x, y) ⇒ Object

Squared difference between two values: (x-y)^2. Adding 0.0 makes the result a Float for Integer or Float arguments.



# File 'lib/nimbus/loss_functions.rb', line 35

def squared_difference(x,y)
  0.0 + (x-y)**2
end
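
A brief sketch of squared_difference, using LossDemo as a hypothetical stand-in module that copies the method body shown above:

```ruby
# LossDemo is a hypothetical stand-in that copies the method body shown above.
module LossDemo
  def self.squared_difference(x, y)
    0.0 + (x - y)**2
  end
end

from_ints   = LossDemo.squared_difference(5, 2)     # Integer inputs, Float result
swapped     = LossDemo.squared_difference(2, 5)     # argument order does not matter
from_floats = LossDemo.squared_difference(1.5, 1.0)

puts from_ints, swapped, from_floats
```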