Module: Ai4r::Data::Statistics
Defined in: lib/ai4r/data/statistics.rb
Overview
This module provides basic statistics functions, each operating on a single attribute of a data set.
Class Method Summary
- .max(data_set, attribute) ⇒ Object
  Get the maximum value of an attribute in the data set.
- .mean(data_set, attribute) ⇒ Object
  Get the sample mean.
- .min(data_set, attribute) ⇒ Object
  Get the minimum value of an attribute in the data set.
- .mode(data_set, attribute) ⇒ Object
  Get the sample mode.
- .standard_deviation(data_set, attribute, variance = nil) ⇒ Object
  Get the standard deviation.
- .variance(data_set, attribute, mean = nil) ⇒ Object
  Get the variance.
Class Method Details
.max(data_set, attribute) ⇒ Object
Get the maximum value of an attribute in the data set. Returns -Float::INFINITY if the data set has no items.
    # File 'lib/ai4r/data/statistics.rb', line 71
    def self.max(data_set, attribute)
      index = data_set.get_index(attribute)
      item = data_set.data_items.max_by { |item| item[index] }
      item ? item[index] : -Float::INFINITY
    end
.mean(data_set, attribute) ⇒ Object
Get the sample mean of an attribute. For a data set with no items the result is NaN (0.0 / 0) rather than an error.
    # File 'lib/ai4r/data/statistics.rb', line 21
    def self.mean(data_set, attribute)
      index = data_set.get_index(attribute)
      sum = 0.0
      data_set.data_items.each { |item| sum += item[index] }
      sum / data_set.data_items.length
    end
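The listing can be exercised without the gem by standing in for the data set. The sketch below is only an illustration: `ToyDataSet` is a hypothetical stand-in, not an Ai4r class, and assumes nothing beyond the two methods the listing calls (`get_index` and `data_items`). `sample_mean` is likewise an illustrative name that repeats the body of `.mean`.

```ruby
# Hypothetical stand-in for a data set (NOT the Ai4r class): it models only
# the two methods the listing calls, get_index and data_items.
ToyDataSet = Struct.new(:data_labels, :data_items) do
  def get_index(attribute)
    data_labels.index(attribute)
  end
end

# Same logic as the .mean listing, copied here so the sketch runs standalone.
def sample_mean(data_set, attribute)
  index = data_set.get_index(attribute)
  sum = 0.0
  data_set.data_items.each { |item| sum += item[index] }
  sum / data_set.data_items.length
end

people = ToyDataSet.new(%w[age height], [[30, 1.70], [40, 1.80], [50, 1.75]])
empty  = ToyDataSet.new(%w[age height], [])

puts sample_mean(people, 'age')      # 40.0
puts sample_mean(empty, 'age').nan?  # true: 0.0 / 0 is NaN, no error is raised
```

Because `sum` starts as a Float, the division never raises ZeroDivisionError; callers that care about empty data sets would need to check for NaN themselves.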
.min(data_set, attribute) ⇒ Object
Get the minimum value of an attribute in the data set. Returns Float::INFINITY if the data set has no items.
    # File 'lib/ai4r/data/statistics.rb', line 81
    def self.min(data_set, attribute)
      index = data_set.get_index(attribute)
      item = data_set.data_items.min_by { |item| item[index] }
      item ? item[index] : Float::INFINITY
    end
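The empty-set fallbacks in `.max` and `.min` act as neutral elements for their comparisons. A minimal sketch on plain rows, without any Ai4r objects: `column_max` and `column_min` are illustrative names, and the column index is passed in directly instead of being looked up through `get_index`.

```ruby
# Illustrative helpers mirroring the .max / .min listings on plain arrays.
# The column index is passed in directly; the listed methods obtain it from
# data_set.get_index(attribute).
def column_max(rows, index)
  item = rows.max_by { |row| row[index] }
  item ? item[index] : -Float::INFINITY
end

def column_min(rows, index)
  item = rows.min_by { |row| row[index] }
  item ? item[index] : Float::INFINITY
end

rows = [[30, 1.70], [40, 1.80], [50, 1.75]]

puts column_max(rows, 1)  # 1.8
puts column_min(rows, 0)  # 30
puts column_max([], 0)    # -Infinity
puts column_min([], 0)    # Infinity

# The sentinels let per-chunk results be combined even if a chunk is empty.
puts [column_max(rows, 0), column_max([], 0)].max  # 50
```

Returning a sentinel instead of nil keeps the result numeric, at the cost that an empty data set is not distinguishable from one whose extreme value really is infinite.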
.mode(data_set, attribute) ⇒ Object
Get the sample mode, i.e. the most frequent value of an attribute. Returns nil if the data set has no items.
    # File 'lib/ai4r/data/statistics.rb', line 57
    def self.mode(data_set, attribute)
      index = data_set.get_index(attribute)
      data_set
        .data_items
        .map { |item| item[index] }
        .tally
        .max_by { _2 }
        &.first
    end
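The method chain in `.mode` can be read step by step. The sketch below applies the same chain to a plain array of values; `most_frequent` is an illustrative name, not part of Ai4r, and the code assumes Ruby 2.7 or later for `tally` and the `_2` numbered block parameter.

```ruby
# Illustrative helper mirroring the chain in the .mode listing, applied to a
# plain array of values (requires Ruby 2.7+ for tally and the _2 parameter).
def most_frequent(values)
  values
    .tally          # Hash mapping each distinct value to its count
    .max_by { _2 }  # [value, count] pair with the largest count, or nil
    &.first         # the value of that pair; &. passes nil through
end

colors = %w[red blue red green red blue]

p most_frequent(colors)  # "red"
p most_frequent([])      # nil
```

When several values share the highest count, which one is returned depends on how `max_by` breaks ties; the Ruby documentation does not appear to guarantee an order, so it is safer not to rely on it.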
.standard_deviation(data_set, attribute, variance = nil) ⇒ Object
Get the sample standard deviation, i.e. the square root of the variance. If you have already computed the variance, pass it in to avoid recomputing it.
    # File 'lib/ai4r/data/statistics.rb', line 48
    def self.standard_deviation(data_set, attribute, variance = nil)
      variance ||= variance(data_set, attribute)
      Math.sqrt(variance)
    end
.variance(data_set, attribute, mean = nil) ⇒ Object
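Get the sample variance, computed with an n - 1 denominator, so it is meaningful only for data sets with at least two items. If you have already computed the mean, pass it in to avoid recomputing it.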
    # File 'lib/ai4r/data/statistics.rb', line 34
    def self.variance(data_set, attribute, mean = nil)
      index = data_set.get_index(attribute)
      mean ||= mean(data_set, attribute)
      sum = 0.0
      data_set.data_items.each { |item| sum += (item[index] - mean)**2 }
      sum / (data_set.data_items.length - 1)
    end
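The optional `mean` and `variance` arguments let a caller reuse intermediate results across `.variance` and `.standard_deviation`. A minimal sketch of that pattern on a plain array of numbers: `sample_variance` and `sample_standard_deviation` are illustrative names that mirror the listings but skip the data set and attribute lookup.

```ruby
# Illustrative helpers mirroring the .variance and .standard_deviation
# listings on a plain array of numbers (no data set or attribute lookup).
def sample_variance(values, mean = nil)
  mean ||= values.sum(0.0) / values.length
  sum = 0.0
  values.each { |v| sum += (v - mean)**2 }
  sum / (values.length - 1) # n - 1 denominator (sample variance)
end

def sample_standard_deviation(values, variance = nil)
  variance ||= sample_variance(values)
  Math.sqrt(variance)
end

ages = [2, 4, 6]

# Compute each statistic once and hand it to the next call.
mean     = ages.sum(0.0) / ages.length           # 4.0
variance = sample_variance(ages, mean)           # 4.0
std_dev  = sample_standard_deviation(ages, variance)

puts variance  # 4.0
puts std_dev   # 2.0

# With a single value the n - 1 denominator is zero, so the result is NaN.
puts sample_variance([5]).nan?  # true
```

Note that `||=` treats a passed-in nil the same as an omitted argument, so a caller cannot use nil to signal anything other than "compute it for me".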