Class: QuickStats

Inherits:
Object
Defined in:
lib/quickstats.rb

Overview

Computationally stable and efficient basic descriptive statistics. This class uses Kalman filter updating to maintain the sample mean and the sum of squared deviations from the mean, along with the min, max, and sample size. Sample variance, standard deviation, and standard error are calculated on demand.

Author

Paul J Sanchez ([email protected])

Copyright

Copyright © Paul J Sanchez

License

MIT

Constructor Details

#initialize ⇒ QuickStats

Initialize state vars in a new QuickStats object.

  • ssd = n = 0

  • sample_mean = min = max = NaN (so sample_variance, computed on demand, is also NaN)



# File 'lib/quickstats.rb', line 25

def initialize
  reset
end

Instance Attribute Details

#max ⇒ Object (readonly)

Returns the value of attribute max.



# File 'lib/quickstats.rb', line 13

def max
  @max
end

#min ⇒ Object (readonly)

Returns the value of attribute min.



# File 'lib/quickstats.rb', line 13

def min
  @min
end

#n ⇒ Object (readonly) Also known as: sample_size

Returns the value of attribute n.



# File 'lib/quickstats.rb', line 13

def n
  @n
end

#sample_mean ⇒ Object (readonly) Also known as: average, avg

Returns the value of attribute sample_mean.



# File 'lib/quickstats.rb', line 13

def sample_mean
  @sample_mean
end

#ssd ⇒ Object (readonly) Also known as: sum_squared_deviations

Returns the value of attribute ssd.



# File 'lib/quickstats.rb', line 13

def ssd
  @ssd
end

Instance Method Details

#add_set(enumerable_set) ⇒ Object Also known as: add_all

Update the statistics with all elements of an enumerable set.

Arguments
  • enumerable_set -> the set of new observations. All elements must be numeric.

Returns
  • a reference to the QuickStats object.



# File 'lib/quickstats.rb', line 77

def add_set(enumerable_set)
  enumerable_set.each { |x| new_obs x }
  self
end

#loss(target:) ⇒ Object

Estimates quadratic loss (à la Taguchi) relative to a specified target value.

Arguments
  • target: -> the designated target value for the loss function.

Returns
  • the quadratic loss calculated for the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 160

def loss(target:)
  fail 'Must supply target to loss function' unless target
  @n > 1 ? (@sample_mean - target)**2 + @ssd / @n : Float::NAN
end
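
The expression in #loss is the squared distance of the mean from the target plus the divisor-n variance. As a minimal sketch, assuming made-up data and a made-up target (neither comes from the library), the following checks that this decomposition matches the mean squared deviation from the target computed directly:

```ruby
# Hypothetical sample data and target, chosen only for illustration.
data = [9.8, 10.1, 10.4, 9.9, 10.3]
target = 10.0

n = data.size
mean = data.sum / n
ssd = data.sum { |x| (x - mean)**2 }  # sum of squared deviations from the mean

# Form used by #loss: squared bias plus the divisor-n variance.
decomposed = (mean - target)**2 + ssd / n

# Direct mean squared deviation from the target.
direct = data.sum { |x| (x - target)**2 } / n

puts decomposed
puts direct
```

The two values agree up to floating-point rounding, which is why the class can report the loss without storing the individual observations.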

#mle_sample_variance ⇒ Object Also known as: mle_var

Calculates the MLE sample variance on demand (divisor is n).

Returns
  • the MLE sample variance of the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 100

def mle_sample_variance
  @n > 1 ? @ssd / @n : Float::NAN
end

#mle_standard_deviation ⇒ Object Also known as: mle_s, mle_std_dev

Calculates the square root of the MLE sample variance on demand.

Returns
  • the MLE standard deviation of the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 123

def mle_standard_deviation
  Math.sqrt mle_sample_variance
end

#mle_standard_error ⇒ Object Also known as: mle_std_err

Calculates sqrt(mle_sample_variance / n) on demand.

Returns
  • the MLE standard error of the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 146

def mle_standard_error
  Math.sqrt(mle_sample_variance / @n)
end

#new_obs(datum) ⇒ Object

Update the sample size, sample mean, sum of squared deviations, min, and max given a new observation. All but the sample size are maintained as floating point.

Arguments
  • datum -> the new observation. Must be numeric.

Returns
  • a reference to the QuickStats object.

Raises
  • RuntimeError if datum is non-numeric



# File 'lib/quickstats.rb', line 52

def new_obs(datum)
  fail 'Observations must be numeric' unless datum.is_a? Numeric
  x = datum.to_f
  @max = x unless x <= @max
  @min = x unless x >= @min
  if @n > 0
    delta = x - @sample_mean
    @n += 1
    @sample_mean += delta / n
    @ssd += delta * (x - @sample_mean)
  else
    @sample_mean = x
    @n += 1
  end
  self
end
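
The recurrence in new_obs can be tried outside the class. The following is a minimal stand-alone sketch, not the library source: running_stats and two_pass_stats are hypothetical helper names and the data are made up. It applies the same one-pass update and compares the result with a conventional two-pass calculation.

```ruby
# Hypothetical stand-alone version of the one-pass update; not the library code.
def running_stats(data)
  n = 0
  mean = Float::NAN
  ssd = 0.0
  data.each do |datum|
    x = datum.to_f
    n += 1
    if n == 1
      mean = x                   # first observation defines the mean
    else
      delta = x - mean           # deviation from the old mean
      mean += delta / n          # move the mean toward x
      ssd += delta * (x - mean)  # old deviation times new deviation
    end
  end
  { n: n, mean: mean, ssd: ssd }
end

# Two-pass reference: compute the mean first, then the squared deviations.
def two_pass_stats(data)
  mean = data.sum.to_f / data.size
  ssd = data.sum { |x| (x - mean)**2 }
  { n: data.size, mean: mean, ssd: ssd }
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
streamed = running_stats(data)
batch = two_pass_stats(data)
puts streamed
puts batch
```

Both approaches should agree up to rounding. The one-pass form needs only three stored values, whatever the number of observations.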

#reset ⇒ Object

Reset all state vars to initial values.

  • ssd = n = 0

  • sample_mean = min = max = NaN (so sample_variance, computed on demand, is also NaN)

Returns
  • a reference to the QuickStats object.



# File 'lib/quickstats.rb', line 36

def reset
  @ssd = @n = 0
  @sample_mean = @max = @min = Float::NAN
  self
end
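
Starting min and max at NaN works with the `unless x <= @max` and `unless x >= @min` tests in new_obs because every ordered comparison against NaN is false in Ruby. A minimal sketch with made-up values and local variables standing in for the instance variables:

```ruby
# Local stand-ins for @max and @min in their just-reset state.
max = min = Float::NAN

[3.0, -1.5, 7.25].each do |x|
  max = x unless x <= max  # 3.0 <= NaN is false, so the first value is taken
  min = x unless x >= min  # likewise for the minimum
end

puts max
puts min
```

Written the other way round (`max = x if x > max`), the comparison against NaN would also be false and the extremes would never leave NaN, so the direction of the test matters.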

#sample_variance ⇒ Object Also known as: var

Calculates the unbiased sample variance on demand (divisor is n-1).

Returns
  • the sample variance of the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 89

def sample_variance
  @n > 1 ? @ssd / (@n - 1) : Float::NAN
end
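
The unbiased and MLE variances differ only in the divisor applied to the same sum of squared deviations. A minimal sketch with made-up data, computing both directly rather than through the class:

```ruby
# Hypothetical data, chosen so the arithmetic is easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = data.size
mean = data.sum / n
ssd = data.sum { |x| (x - mean)**2 }  # sum of squared deviations

unbiased = ssd / (n - 1)  # divisor n-1, as in sample_variance
mle = ssd / n             # divisor n, as in mle_sample_variance

puts unbiased
puts mle
```

The two are related by mle = unbiased * (n - 1) / n, so the gap shrinks as the sample grows.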

#standard_deviation ⇒ Object Also known as: s, std_dev

Calculates the square root of the unbiased sample variance on demand.

Returns
  • the sample standard deviation of the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 111

def standard_deviation
  Math.sqrt sample_variance
end

#standard_error ⇒ Object Also known as: std_err

Calculates sqrt(sample_variance / n) on demand.

Returns
  • the sample standard error of the data, or NaN if fewer than two observations have been recorded (including a new or just-reset QuickStats object).



# File 'lib/quickstats.rb', line 135

def standard_error
  Math.sqrt(sample_variance / @n)
end
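
The NaN result for small samples carries through the square root: Math.sqrt returns NaN for a NaN argument rather than raising. A minimal sketch, where std_err is a hypothetical free-standing helper that mirrors the two method bodies above and takes the sum of squared deviations and sample size as arguments:

```ruby
# Hypothetical helper combining the sample_variance and standard_error logic.
def std_err(ssd, n)
  variance = n > 1 ? ssd / (n - 1) : Float::NAN
  Math.sqrt(variance / n)
end

puts std_err(32.0, 8)      # eight observations with ssd = 32
puts std_err(0.0, 1).nan?  # a single observation yields NaN
```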