Class: MoreMath::Sequence

Inherits:
Object
Includes:
Enumerable, MovingAverage
Defined in:
lib/more_math/sequence.rb,
lib/more_math/sequence/moving_average.rb

Overview

This class holds a collection of elements and computes various statistical values for them.

Defined Under Namespace

Modules: MovingAverage, Refinement

Methods included from MovingAverage

#simple_moving_average

Constructor Details

#initialize(elements) ⇒ Sequence

Returns a new instance of Sequence.



# File 'lib/more_math/sequence.rb', line 10

def initialize(elements)
  @elements = elements.dup.freeze
end

Instance Attribute Details

#elements ⇒ Object (readonly)

Returns the array of elements.



# File 'lib/more_math/sequence.rb', line 15

def elements
  @elements
end

Instance Method Details

#autocorrelation ⇒ Object

Returns the array of autocorrelation values c_k / c_0 (of length size - 1).



# File 'lib/more_math/sequence.rb', line 284

def autocorrelation
  c = autovariance
  Array.new(c.size) { |k| c[k] / c[0] }
end

#autovariance ⇒ Object

Returns the array of autovariances (of length size - 1).



# File 'lib/more_math/sequence.rb', line 272

def autovariance
  Array.new(size - 1) do |k|
    s = 0.0
    0.upto(size - k - 1) do |i|
      s += (@elements[i] - arithmetic_mean) * (@elements[i + k] - arithmetic_mean)
    end
    s / size
  end
end
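
The two computations can be tried on a plain Array without the gem. The following is a minimal standalone sketch, not the library's code path: `autocovariances` and `autocorrelations` are hypothetical helper names, and the input data is made up for illustration.

```ruby
# Standalone sketch: lag-k autocovariance c_k and autocorrelation c_k / c_0.
# Both helper names are hypothetical and not part of MoreMath.
def autocovariances(xs)
  n = xs.size
  mean = xs.sum.to_f / n
  Array.new(n - 1) do |k|
    # Sum the products of deviations for all pairs that are k positions apart.
    s = (0...(n - k)).sum { |i| (xs[i] - mean) * (xs[i + k] - mean) }
    s / n
  end
end

def autocorrelations(xs)
  c = autocovariances(xs)
  c.map { |c_k| c_k / c[0] }
end

xs   = [1, 2, 3, 4, 5]
acov = autocovariances(xs)   # lag 0 is the (biased) variance of xs
acor = autocorrelations(xs)  # lag 0 is always 1.0
```

Both arrays have size - 1 entries, one per lag from 0 up to size - 2.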

#common_standard_deviation(other) ⇒ Object

Returns an estimate of the common standard deviation of the elements of this Sequence and other.



# File 'lib/more_math/sequence.rb', line 213

def common_standard_deviation(other)
  Math.sqrt(common_variance(other))
end

#common_variance(other) ⇒ Object

Returns an estimate of the common (pooled) variance of the elements of this Sequence and other.



# File 'lib/more_math/sequence.rb', line 219

def common_variance(other)
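  # Weight each sample variance by its degrees of freedom, then divide the
  # whole sum by the combined degrees of freedom.
  ((size - 1) * sample_variance + (other.size - 1) *
    other.sample_variance) / (size + other.size - 2)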
end
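
The textbook pooled variance weights each sample variance by its degrees of freedom and divides the sum by n1 + n2 - 2. A minimal standalone sketch on plain Arrays follows; `sample_variance` and `pooled_variance` are hypothetical helpers, and the n - 1 denominator in `sample_variance` is an assumption about what the library's method of the same name computes.

```ruby
# Hypothetical helper: unbiased sample variance (n - 1 denominator assumed).
def sample_variance(xs)
  mean = xs.sum.to_f / xs.size
  xs.sum { |x| (x - mean) ** 2 } / (xs.size - 1)
end

# Hypothetical helper: pooled variance of two samples.
def pooled_variance(a, b)
  weighted = (a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)
  weighted / (a.size + b.size - 2)
end

a = [1, 2, 3, 4, 5]   # sample variance 2.5
b = [2, 4, 6, 8]      # sample variance 20/3
pooled = pooled_variance(a, b)
```

For these inputs the weighted sum is 4 * 2.5 + 3 * 20/3 = 30, so the pooled variance is 30/7.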

#compute_student_df(other) ⇒ Object

Computes the number of degrees of freedom for Student’s t-test.



# File 'lib/more_math/sequence.rb', line 225

def compute_student_df(other)
  size + other.size - 2
end

#compute_welch_df(other) ⇒ Object

Uses the Welch-Satterthwaite approximation to compute the degrees of freedom for Welch’s t-test.



# File 'lib/more_math/sequence.rb', line 194

def compute_welch_df(other)
  (sample_variance / size + other.sample_variance / other.size) ** 2 / (
    (sample_variance ** 2 / (size ** 2 * (size - 1))) +
    (other.sample_variance ** 2 / (other.size ** 2 * (other.size - 1))))
end
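
The same quantity can be written in terms of v = s² / n for each sample, which makes the structure easier to see. This is a minimal standalone sketch on plain Arrays; `sample_variance` and `welch_df` are hypothetical helper names, and the n - 1 variance denominator is an assumption.

```ruby
# Hypothetical helper: unbiased sample variance (n - 1 denominator assumed).
def sample_variance(xs)
  mean = xs.sum.to_f / xs.size
  xs.sum { |x| (x - mean) ** 2 } / (xs.size - 1)
end

# Hypothetical helper: Welch-Satterthwaite degrees of freedom.
def welch_df(a, b)
  va = sample_variance(a) / a.size   # squared standard error of mean(a)
  vb = sample_variance(b) / b.size   # squared standard error of mean(b)
  (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
end

a  = [1, 2, 3, 4, 5]
b  = [2, 4, 6, 8]
df = welch_df(a, b)
```

The result is generally not an integer. It lies between the smaller of the two single-sample degrees of freedom and the Student value size + other.size - 2.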

#confidence_interval(alpha = 0.05) ⇒ Object

Returns the confidence interval for the arithmetic mean of this Sequence’s elements at significance level alpha, as a Range object.



# File 'lib/more_math/sequence.rb', line 264

def confidence_interval(alpha = 0.05)
  td = TDistribution.new(size - 1)
  t = td.inverse_probability(alpha / 2).abs
  delta = t * sample_standard_deviation / Math.sqrt(size)
  (arithmetic_mean - delta)..(arithmetic_mean + delta)
end

#cover?(other, alpha = 0.05) ⇒ Boolean

Returns true if this Sequence covers other, that is, if Welch’s t-test does not reject the hypothesis that their arithmetic means are equal at error level alpha.

Returns:

  • (Boolean)


# File 'lib/more_math/sequence.rb', line 256

def cover?(other, alpha = 0.05)
  t = t_welch(other)
  td = TDistribution.new(compute_welch_df(other))
  t.abs < td.inverse_probability(1 - alpha.abs / 2.0)
end

#detect_autocorrelation(lags = 20, alpha_level = 0.05) ⇒ Object

This method tries to detect autocorrelation with the Ljung-Box statistic. If enough lags can be considered, it returns a hash with the results; otherwise nil is returned. The keys are:

  • :lags, the number of lags,
  • :alpha_level, the alpha level for the test,
  • :q, the value of the ljung_box_statistic,
  • :p, the probability computed for q from the chi-square distribution with lags degrees of freedom,
  • :detected, true if autocorrelation was found, that is, if p is at least 1 - alpha_level.



# File 'lib/more_math/sequence.rb', line 317

def detect_autocorrelation(lags = 20, alpha_level = 0.05)
  if q = ljung_box_statistic(lags)
    p = ChiSquareDistribution.new(lags).probability(q)
    return {
      :lags         => lags,
      :alpha_level  => alpha_level,
      :q            => q,
      :p            => p,
      :detected     => p >= 1 - alpha_level,
    }
  end
end

#detect_outliers(factor = 3.0, epsilon = 1E-5) ⇒ Object

Returns a result hash with the counts of :very_low, :low, :high, and :very_high outliers, determined by the box plot method. The hash also contains the :median, :iqr, and :factor that were used. If no outliers were found or the iqr is less than epsilon, nil is returned.



# File 'lib/more_math/sequence.rb', line 334

def detect_outliers(factor = 3.0, epsilon = 1E-5)
  half_factor = factor / 2.0
  quartile1 = percentile(25)
  quartile3 = percentile(75)
  iqr = quartile3 - quartile1
  iqr < epsilon and return
  result = @elements.inject(Hash.new(0)) do |h, t|
    extreme =
      case t
      when -Infinity..(quartile1 - factor * iqr)
        :very_low
      when (quartile1 - factor * iqr)..(quartile1 - half_factor * iqr)
        :low
      when (quartile3 + half_factor * iqr)..(quartile3 + factor * iqr)
        :high
      when (quartile3 + factor * iqr)..Infinity
        :very_high
      end and h[extreme] += 1
    h
  end
  unless result.empty?
    result[:median] = median
    result[:iqr] = iqr
    result[:factor] = factor
    result
  end
end
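
The box plot fences can be illustrated on their own. This is a minimal standalone sketch with symmetric fences around the first and third quartile; `classify` is a hypothetical helper, and the quartile values and data points are made up for illustration.

```ruby
# Hypothetical helper: classify one value against box plot fences.
# Returns nil for values inside the inner fences.
def classify(value, q1, q3, factor = 3.0)
  iqr  = q3 - q1
  half = factor / 2.0
  if    value <= q1 - factor * iqr then :very_low
  elsif value <= q1 - half * iqr   then :low
  elsif value >= q3 + factor * iqr then :very_high
  elsif value >= q3 + half * iqr   then :high
  end
end

q1, q3 = 10.0, 20.0   # assumed quartiles, so iqr = 10
# Fences with factor 3.0: very_low <= -20, low <= -5, high >= 35, very_high >= 50
counts = [-25, -10, 15, 40, 55].each_with_object(Hash.new(0)) do |v, h|
  label = classify(v, q1, q3)
  h[label] += 1 if label
end
```

Here 15 lies inside the fences and is not counted, while each of the other four values falls into a different category.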

#durbin_watson_statistic ⇒ Object

Returns the d value of the Durbin-Watson statistic. d is well below 2 for positive autocorrelation, well above 2 for negative autocorrelation, and close to 2 when there is no autocorrelation.



# File 'lib/more_math/sequence.rb', line 291

def durbin_watson_statistic
  e = linear_regression.residues
  e.size <= 1 and return 2.0
  (1...e.size).inject(0.0) { |s, i| s + (e[i] - e[i - 1]) ** 2 } /
    e.inject(0.0) { |s, x| s + x ** 2 }
end
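
To see how d reacts to the sign pattern of the residuals, the ratio can be computed directly on an Array of residuals. This is a minimal standalone sketch; `durbin_watson` is a hypothetical helper and both residual series are made up.

```ruby
# Hypothetical helper: Durbin-Watson d for an Array of regression residuals.
def durbin_watson(residuals)
  return 2.0 if residuals.size <= 1
  # Squared differences of neighbouring residuals over the sum of squares.
  numerator   = residuals.each_cons(2).sum { |a, b| (b - a) ** 2 }
  denominator = residuals.sum { |e| e ** 2 }
  numerator.to_f / denominator
end

runs        = [1, 1, 1, -1, -1, -1]   # long runs of equal sign
alternating = [1, -1, 1, -1, 1, -1]   # sign flips at every step

d_runs        = durbin_watson(runs)         # 4 / 6, below 2
d_alternating = durbin_watson(alternating)  # 20 / 6, above 2
```

Residuals that stay on one side for long runs give a small d, while residuals that flip sign at every step give a d above 2.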

#each(&block) ⇒ Object

Calls the block for every element of this Sequence.



# File 'lib/more_math/sequence.rb', line 18

def each(&block)
  @elements.each(&block)
end

#empty? ⇒ Boolean

Returns true if this sequence is empty, otherwise false.

Returns:

  • (Boolean)


# File 'lib/more_math/sequence.rb', line 24

def empty?
  @elements.empty?
end

#histogram(bins) ⇒ Object

Returns a Histogram instance for this Sequence’s elements, with bins as the number of bins.



# File 'lib/more_math/sequence.rb', line 371

def histogram(bins)
  Histogram.new(self, bins)
end

#ljung_box_statistic(lags = 20) ⇒ Object

Returns the q value of the Ljung-Box statistic for the given number of lags. A higher value might indicate autocorrelation in the elements of this Sequence instance. Returns nil if lags is not smaller than the number of available autocorrelation values (size - 1).



# File 'lib/more_math/sequence.rb', line 302

def ljung_box_statistic(lags = 20)
  r = autocorrelation
  lags >= r.size and return
  n = size
  n * (n + 2) * (1..lags).inject(0.0) { |s, i| s + r[i] ** 2 / (n - i) }
end
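
The statistic is Q = n (n + 2) Σ r_k² / (n - k), summed over k = 1..lags. A minimal standalone sketch that takes the autocorrelation values as input follows; `ljung_box_q` is a hypothetical helper, and the values in `r` are illustrative (r[0] is the lag-0 autocorrelation and is not used in the sum).

```ruby
# Hypothetical helper: Ljung-Box Q from autocorrelations r (r[0] is lag 0),
# the sample size n, and the number of lags to include.
def ljung_box_q(r, n, lags)
  return nil if lags >= r.size   # not enough autocorrelation values
  n * (n + 2) * (1..lags).sum { |k| r[k] ** 2 / (n - k).to_f }
end

r = [1.0, 0.4, -0.1, -0.4]   # illustrative autocorrelations for n = 5
q = ljung_box_q(r, 5, 2)     # uses r[1] and r[2] only
too_many = ljung_box_q(r, 5, 4)
```

With these inputs the sum is 0.16 / 4 + 0.01 / 3, and asking for 4 lags returns nil because only 4 autocorrelation values exist.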

#percentile(p = 50) ⇒ Object Also known as: median

Returns the p-percentile of the elements. There are many methods to compute the percentile; this method uses the weighted average at x_(n + 1)p, which allows p to be in 0...100 (excluding 100).



# File 'lib/more_math/sequence.rb', line 171

def percentile(p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  p /= 100.0
  sorted_elements = sorted
  r = p * (sorted_elements.size + 1)
  r_i = r.to_i
  r_f = r - r_i
  if r_i >= 1
    result = sorted_elements[r_i - 1]
    if r_i < sorted_elements.size
      result += r_f * (sorted_elements[r_i] - sorted_elements[r_i - 1])
    end
  else
    result = sorted_elements[0]
  end
  result
end
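
A worked example makes the interpolation concrete. The following standalone sketch mirrors the logic above on a plain Array; the free function `percentile` here is a hypothetical stand-in for the instance method, and the data is made up.

```ruby
# Hypothetical stand-in: weighted-average percentile at rank p * (n + 1).
def percentile(xs, p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  sorted = xs.sort
  r = p / 100.0 * (sorted.size + 1)   # fractional 1-based rank
  i = r.to_i                          # integer part of the rank
  f = r - i                           # fractional part of the rank
  return sorted[0] if i < 1           # rank below the first element
  result = sorted[i - 1]
  # Interpolate towards the next element unless we are at the last one.
  result += f * (sorted[i] - sorted[i - 1]) if i < sorted.size
  result
end

xs = [15, 20, 35, 40, 50]
p10 = percentile(xs, 10)   # rank 0.6: clamps to the smallest element
p25 = percentile(xs, 25)   # rank 1.5: halfway between 15 and 20
p50 = percentile(xs, 50)   # rank 3.0: the middle element
p90 = percentile(xs, 90)   # rank 5.4: clamps to the largest element
```

Ranks that fall before the first or after the last element are clamped to the smallest or largest value respectively.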

#push(element) ⇒ Object Also known as: <<

Returns a new Sequence instance with element appended as its last element. This Sequence itself is not modified.



# File 'lib/more_math/sequence.rb', line 47

def push(element)
  Sequence.new(@elements.dup.push(element))
end
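
Because the constructor freezes a copy of the elements, push behaves like a non-destructive append. The sketch below reproduces only that behaviour in a tiny stand-in class; `TinySequence` is hypothetical and is not the MoreMath class.

```ruby
# Hypothetical stand-in that copies only the constructor and push logic.
class TinySequence
  attr_reader :elements

  def initialize(elements)
    @elements = elements.dup.freeze   # private frozen copy
  end

  def push(element)
    # dup of a frozen Array is unfrozen, so it can be extended safely.
    self.class.new(@elements.dup.push(element))
  end
  alias << push
end

a = TinySequence.new([1, 2])
b = a << 3   # a keeps [1, 2]; b holds [1, 2, 3]
```

Chained appends such as `a << 3 << 4` therefore create one new instance per step.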

#reset ⇒ Object

Reset all memoized values of this sequence.



# File 'lib/more_math/sequence.rb', line 34

def reset
  self.class.mize_cache_clear
  self
end

#size ⇒ Object

Returns the number of elements on which the analysis is based.



# File 'lib/more_math/sequence.rb', line 29

def size
  @elements.size
end

#suggested_sample_size(other, alpha = 0.05, beta = 0.05) ⇒ Object

Computes a sample size that is more likely to reveal a difference between the mean of this instance’s elements and that of other. alpha and beta are the levels for type I and type II errors, respectively.



# File 'lib/more_math/sequence.rb', line 243

def suggested_sample_size(other, alpha = 0.05, beta = 0.05)
  alpha, beta = alpha.abs, beta.abs
  signal = arithmetic_mean - other.arithmetic_mean
  df = size + other.size - 2
  pooled_variance_estimate = (sum_of_squares + other.sum_of_squares) / df
  td = TDistribution.new df
  (((td.inverse_probability(alpha) + td.inverse_probability(beta)) *
    Math.sqrt(pooled_variance_estimate)) / signal) ** 2
end

#t_student(other) ⇒ Object

Returns the t value of Student’s t-test between this Sequence instance and other.



# File 'lib/more_math/sequence.rb', line 231

def t_student(other)
  signal = arithmetic_mean - other.arithmetic_mean
  noise = common_standard_deviation(other) *
    Math.sqrt(size ** -1 + other.size ** -1) # 1/n1 + 1/n2 uses both sample sizes
  signal / noise
rescue Errno::EDOM
  0.0
end

#t_welch(other) ⇒ Object

Returns the t value of Welch’s t-test between this Sequence instance and other.



# File 'lib/more_math/sequence.rb', line 202

def t_welch(other)
  signal = arithmetic_mean - other.arithmetic_mean
  noise = Math.sqrt(sample_variance / size +
    other.sample_variance / other.size)
  signal / noise
rescue Errno::EDOM
  0.0
end
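
The two t statistics differ only in how the noise term is estimated. This is a minimal standalone sketch on plain Arrays using the textbook formulas; all helper names are hypothetical stand-ins rather than MoreMath methods, and the n - 1 variance denominator is an assumption.

```ruby
# Hypothetical helpers on plain Arrays; none of these names come from MoreMath.
def mean(xs)
  xs.sum.to_f / xs.size
end

def sample_variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m) ** 2 } / (xs.size - 1)
end

# Welch: each sample contributes its own variance estimate.
def t_welch(a, b)
  noise = Math.sqrt(sample_variance(a) / a.size + sample_variance(b) / b.size)
  (mean(a) - mean(b)) / noise
end

# Student: both samples share one pooled variance estimate.
def t_student(a, b)
  pooled = ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
    (a.size + b.size - 2)
  noise = Math.sqrt(pooled * (1.0 / a.size + 1.0 / b.size))
  (mean(a) - mean(b)) / noise
end

a  = [1, 2, 3, 4, 5]
b  = [2, 4, 6, 8]
tw = t_welch(a, b)     # -2 / sqrt(13/6)
ts = t_student(a, b)   # -2 / sqrt(30/7 * 0.45)
```

Both values are negative here because the mean of a is smaller than the mean of b. The two statistics differ because the samples have unequal sizes and variances.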

#to_ary ⇒ Object Also known as: to_a

Returns a copy of the elements as an Array.



# File 'lib/more_math/sequence.rb', line 39

def to_ary
  @elements.dup
end