# Class: Bullshit::Analysis

Inherits:
Object
Defined in:
lib/bullshit.rb

## Overview

This class is used to analyse the time measurements and compute their statistics.

## Instance Attribute Summary

• #measurements ⇒ Object

Returns the array of measurements.

## Instance Method Summary

• #arithmetic_mean ⇒ Object (also: #mean)

Returns the arithmetic mean of the measurements.

• #autocorrelation ⇒ Object

Returns the array of autocorrelation values c_k / c_0 (of length size - 1).

• #autovariance ⇒ Object

Returns the array of autovariances (of length size - 1).

• #common_standard_deviation(other) ⇒ Object

Returns an estimation of the common standard deviation of the measurements of this and `other`.

• #common_variance(other) ⇒ Object

Returns an estimation of the common variance of the measurements of this and `other`.

• #compute_student_df(other) ⇒ Object

Computes the number of degrees of freedom for Student's t-test.

• #compute_welch_df(other) ⇒ Object

Uses an approximation of the Welch-Satterthwaite equation to compute the degrees of freedom for Welch's t-test.

• #confidence_interval(alpha = 0.05) ⇒ Object

Returns the confidence interval for the arithmetic mean with alpha level `alpha` of the measurements of this Analysis instance as a Range object.

• #cover?(other, alpha = 0.05) ⇒ Boolean

Returns true if the Analysis instance covers the `other`, that is, their arithmetic mean values are most likely equal at the `alpha` error level.

• #detect_autocorrelation(lags = 20, alpha_level = 0.05) ⇒ Object

This method tries to detect autocorrelation with the Ljung-Box statistic.

• #detect_outliers(factor = 3.0, epsilon = 1E-5) ⇒ Object

Returns a result hash with the number of :very_low, :low, :high, and :very_high outliers, determined by the box plotting algorithm run with :median and :iqr parameters.

• #durbin_watson_statistic ⇒ Object

Returns the d-value for the Durbin-Watson statistic.

• #geometric_mean ⇒ Object

Returns the geometric mean of the measurements.

• #harmonic_mean ⇒ Object

Returns the harmonic mean of the measurements.

• #histogram(bins) ⇒ Object

Returns a Histogram instance with `bins` as the number of bins for this analysis' measurements.

• #initialize(measurements) ⇒ Analysis (constructor)

A new instance of Analysis.

• #linear_regression ⇒ Object

Returns the LinearRegression object for the equation a * x + b which represents the line computed by the linear regression algorithm.

• #ljung_box_statistic(lags = 20) ⇒ Object

Returns the q value of the Ljung-Box statistic for the number of lags `lags`.

• #max ⇒ Object

Returns the maximum of the measurements.

• #min ⇒ Object

Returns the minimum of the measurements.

• #percentile(p = 50) ⇒ Object (also: #median)

Returns the `p`-percentile of the measurements.

• #sample_standard_deviation ⇒ Object

Returns the sample standard deviation of the measurements.

• #sample_standard_deviation_percentage ⇒ Object

Returns the sample standard deviation of the measurements in percentage of the arithmetic mean.

• #sample_variance ⇒ Object

Returns the sample variance of the measurements.

• #size ⇒ Object

Returns the number of measurements on which the analysis is based.

• #standard_deviation ⇒ Object

Returns the standard deviation of the measurements.

• #standard_deviation_percentage ⇒ Object

Returns the standard deviation of the measurements in percentage of the arithmetic mean.

• #suggested_sample_size(other, alpha = 0.05, beta = 0.05) ⇒ Object

Computes a sample size that is more likely to reveal a difference between the mean of this instance's measurements and that of `other`.

• #sum ⇒ Object

Returns the sum of all measurements.

• #sum_of_squares ⇒ Object

Returns the sum of squares (the sum of the squared deviations) of the measurements.

• #t_student(other) ⇒ Object

Returns the t value of the Student's t-test between this Analysis instance and the `other`.

• #t_welch(other) ⇒ Object

Returns the t value of the Welch's t-test between this Analysis instance and the `other`.

• #variance ⇒ Object

Returns the variance of the measurements.

## Constructor Details

### #initialize(measurements) ⇒ Analysis

Returns a new instance of Analysis.

```ruby
# File 'lib/bullshit.rb', line 1047

def initialize(measurements)
  @measurements = measurements
  @measurements.freeze
end
```

## Instance Attribute Details

### #measurements ⇒ Object

Returns the array of measurements.

```ruby
# File 'lib/bullshit.rb', line 1053

def measurements
  @measurements
end
```

## Instance Method Details

### #arithmetic_mean ⇒ Object

Also known as: mean

Returns the arithmetic mean of the measurements.

```ruby
# File 'lib/bullshit.rb', line 1104

def arithmetic_mean
  @arithmetic_mean ||= sum / size
end
```

### #autocorrelation ⇒ Object

Returns the array of autocorrelation values c_k / c_0 (of length size - 1).

```ruby
# File 'lib/bullshit.rb', line 1276

def autocorrelation
  c = autovariance
  Array.new(c.size) { |k| c[k] / c[0] }
end
```

### #autovariance ⇒ Object

Returns the array of autovariances (of length size - 1).

```ruby
# File 'lib/bullshit.rb', line 1264

def autovariance
  Array.new(size - 1) do |k|
    s = 0.0
    0.upto(size - k - 1) do |i|
      s += (@measurements[i] - arithmetic_mean) * (@measurements[i + k] - arithmetic_mean)
    end
    s / size
  end
end
```
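As a rough illustration of how the autovariances c_k and the ratios c_k / c_0 relate, here is a standalone sketch that works on a plain array. The helper names `autocovariances` and `autocorrelations` are hypothetical and not part of `Bullshit::Analysis`.

```ruby
# Standalone sketch; both method names are illustrative assumptions.
def autocovariances(xs)
  n = xs.size
  mean = xs.sum(0.0) / n
  Array.new(n - 1) do |k|
    # c_k: sum of lag-k products of deviations from the mean, divided by n
    (0...(n - k)).sum(0.0) { |i| (xs[i] - mean) * (xs[i + k] - mean) } / n
  end
end

def autocorrelations(xs)
  c = autocovariances(xs)
  # Normalise by c_0 so that the first entry is always 1.0
  c.map { |c_k| c_k / c[0] }
end

r = autocorrelations([1.0, 2.0, 3.0, 4.0])
puts r.inspect
```

For the four values above, the lag-0 entry is 1.0 by construction, and the later entries shrink and change sign as fewer overlapping pairs contribute.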

### #common_standard_deviation(other) ⇒ Object

Returns an estimation of the common standard deviation of the measurements of this and `other`.

```ruby
# File 'lib/bullshit.rb', line 1206

def common_standard_deviation(other)
  Math.sqrt(common_variance(other))
end
```

### #common_variance(other) ⇒ Object

Returns an estimation of the common variance of the measurements of this and `other`.

```ruby
# File 'lib/bullshit.rb', line 1212

def common_variance(other)
  # Pooled variance: both weighted sample variances are summed before
  # dividing by the joint degrees of freedom.
  ((size - 1) * sample_variance + (other.size - 1) * other.sample_variance) /
    (size + other.size - 2)
end
```
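To make the pooled (common) variance concrete, the following standalone sketch computes it for two small arrays. `sample_variance` and `pooled_variance` are hypothetical top-level helpers written for this example, not the library's methods.

```ruby
# Illustrative helpers operating on plain arrays (names are assumptions).
def sample_variance(xs)
  mean = xs.sum(0.0) / xs.size
  xs.sum(0.0) { |x| (x - mean) ** 2 } / (xs.size - 1)
end

def pooled_variance(a, b)
  # Weight each sample variance by its degrees of freedom, then divide
  # by the combined degrees of freedom.
  ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
    (a.size + b.size - 2)
end

a = [1.0, 2.0, 3.0]        # sample variance 1.0
b = [2.0, 4.0, 6.0, 8.0]   # sample variance 20/3
puts pooled_variance(a, b) # (2 * 1.0 + 3 * 20/3) / 5, roughly 4.4
```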

### #compute_student_df(other) ⇒ Object

Computes the number of degrees of freedom for Student's t-test.

```ruby
# File 'lib/bullshit.rb', line 1218

def compute_student_df(other)
  size + other.size - 2
end
```

### #compute_welch_df(other) ⇒ Object

Uses an approximation of the Welch-Satterthwaite equation to compute the degrees of freedom for Welch's t-test.

```ruby
# File 'lib/bullshit.rb', line 1187

def compute_welch_df(other)
  (sample_variance / size + other.sample_variance / other.size) ** 2 / (
    (sample_variance ** 2 / (size ** 2 * (size - 1))) +
    (other.sample_variance ** 2 / (other.size ** 2 * (other.size - 1))))
end
```
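The same degrees-of-freedom formula can be tried on plain arrays. This is a minimal sketch with hypothetical helper names (`sample_variance`, `welch_df`); it rewrites each term as (s² / n)² / (n - 1), which is algebraically the same as the expression above.

```ruby
# Illustrative helpers (names are assumptions, not library API).
def sample_variance(xs)
  mean = xs.sum(0.0) / xs.size
  xs.sum(0.0) { |x| (x - mean) ** 2 } / (xs.size - 1)
end

def welch_df(a, b)
  va = sample_variance(a) / a.size # s_a^2 / n_a
  vb = sample_variance(b) / b.size # s_b^2 / n_b
  (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
end

df = welch_df([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0])
puts df # 216/53, roughly 4.08
```

The result is generally not an integer, and it lies between the smaller of the two per-sample degrees of freedom and their sum.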

### #confidence_interval(alpha = 0.05) ⇒ Object

Returns the confidence interval for the arithmetic mean with alpha level `alpha` of the measurements of this Analysis instance as a Range object.

```ruby
# File 'lib/bullshit.rb', line 1256

def confidence_interval(alpha = 0.05)
  td = TDistribution.new(size - 1)
  t = td.inverse_probability(alpha / 2).abs
  delta = t * sample_standard_deviation / Math.sqrt(size)
  (arithmetic_mean - delta)..(arithmetic_mean + delta)
end
```
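A standalone sketch of the interval computation follows. Because the library's `TDistribution` is not available outside of it, the critical t value is passed in as a hypothetical `t_critical` parameter; the value 2.776 used below is the two-sided 95% critical value for 4 degrees of freedom from a standard t table.

```ruby
# Sketch only: `mean_confidence_interval` and `t_critical` are assumptions
# standing in for what TDistribution#inverse_probability supplies.
def mean_confidence_interval(xs, t_critical)
  n = xs.size
  mean = xs.sum(0.0) / n
  sample_variance = xs.sum(0.0) { |x| (x - mean) ** 2 } / (n - 1)
  # Half-width: t * s / sqrt(n)
  delta = t_critical * Math.sqrt(sample_variance) / Math.sqrt(n)
  (mean - delta)..(mean + delta)
end

interval = mean_confidence_interval([1.0, 2.0, 3.0, 4.0, 5.0], 2.776)
puts interval
```

Returning a Range keeps the result easy to query, for example with `interval.cover?(value)`.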

### #cover?(other, alpha = 0.05) ⇒ Boolean

Returns true if this Analysis instance covers `other`, that is, if Welch's t-test does not reject the hypothesis that both arithmetic means are equal at the `alpha` error level.

Returns:

• (Boolean)

```ruby
# File 'lib/bullshit.rb', line 1248

def cover?(other, alpha = 0.05)
  t = t_welch(other)
  td = TDistribution.new(compute_welch_df(other))
  t.abs < td.inverse_probability(1 - alpha.abs / 2.0)
end
```

### #detect_autocorrelation(lags = 20, alpha_level = 0.05) ⇒ Object

This method tries to detect autocorrelation with the Ljung-Box statistic. If enough lags can be considered, it returns a hash with the results; otherwise nil is returned. The keys are

```
:lags: the number of lags,
:alpha_level: the alpha level for the test,
:q: the value of the ljung_box_statistic,
:p: the probability computed for q from the chi-square distribution; if p is below 1 - alpha_level, no correlation was detected,
:detected: true if a correlation was found.
```

```ruby
# File 'lib/bullshit.rb', line 1309

def detect_autocorrelation(lags = 20, alpha_level = 0.05)
  if q = ljung_box_statistic(lags)
    p = ChiSquareDistribution.new(lags).probability(q)
    return {
      :lags         => lags,
      :alpha_level  => alpha_level,
      :q            => q,
      :p            => p,
      :detected     => p >= 1 - alpha_level,
    }
  end
end
```

### #detect_outliers(factor = 3.0, epsilon = 1E-5) ⇒ Object

Returns a result hash with the number of :very_low, :low, :high, and :very_high outliers, determined by the box plotting algorithm run with :median and :iqr parameters. If no outliers were found or the iqr is less than epsilon, nil is returned.

```ruby
# File 'lib/bullshit.rb', line 1326

def detect_outliers(factor = 3.0, epsilon = 1E-5)
  half_factor = factor / 2.0
  quartile1 = percentile(25)
  quartile3 = percentile(75)
  iqr = quartile3 - quartile1
  iqr < epsilon and return
  result = @measurements.inject(Hash.new(0)) do |h, t|
    extreme =
      case t
      when -Infinity..(quartile1 - factor * iqr)
        :very_low
      when (quartile1 - factor * iqr)..(quartile1 - half_factor * iqr)
        :low
      # Mild high outliers are measured from the third quartile,
      # mirroring the low side.
      when (quartile3 + half_factor * iqr)..(quartile3 + factor * iqr)
        :high
      when (quartile3 + factor * iqr)..Infinity
        :very_high
      end and h[extreme] += 1
    h
  end
  unless result.empty?
    result[:median] = median
    result[:iqr] = iqr
    result[:factor] = factor
    result
  end
end
```
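The box plot classification can be sketched independently of the class. In the example below the quartiles are passed in directly, and `classify_outliers` is a hypothetical helper; it assumes symmetric fences at `factor / 2` and `factor` times the IQR below the first and above the third quartile.

```ruby
# Sketch under stated assumptions: quartiles are supplied by the caller and
# the fences are symmetric around the box.
def classify_outliers(xs, q1, q3, factor = 3.0)
  iqr = q3 - q1
  half = factor / 2.0
  xs.each_with_object(Hash.new(0)) do |x, counts|
    if x <= q1 - factor * iqr
      counts[:very_low] += 1
    elsif x <= q1 - half * iqr
      counts[:low] += 1
    elsif x >= q3 + factor * iqr
      counts[:very_high] += 1
    elsif x >= q3 + half * iqr
      counts[:high] += 1
    end
  end
end

# With q1 = 2 and q3 = 5 the IQR is 3, so the upper fences are 9.5 and 14.
counts = classify_outliers([1.0, 2.0, 3.0, 4.0, 5.0, 12.0, 100.0], 2.0, 5.0)
puts counts.inspect
```

Here 12.0 falls between the two upper fences and 100.0 beyond the outer one, so one value is counted as :high and one as :very_high.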

### #durbin_watson_statistic ⇒ Object

Returns the d-value for the Durbin-Watson statistic. The value is much smaller than 2 for positive autocorrelation, much larger than 2 for negative autocorrelation, and around 2 for no autocorrelation.

```ruby
# File 'lib/bullshit.rb', line 1283

def durbin_watson_statistic
  e = linear_regression.residues
  e.size <= 1 and return 2.0
  (1...e.size).inject(0.0) { |s, i| s + (e[i] - e[i - 1]) ** 2 } /
    e.inject(0.0) { |s, x| s + x ** 2 }
end
```
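To see how the d-value behaves, the sketch below applies the same ratio to an array of residuals that is assumed to be already computed (the library obtains them from `linear_regression.residues`). The method name `durbin_watson` is hypothetical.

```ruby
# Standalone sketch; takes precomputed residuals instead of running a regression.
def durbin_watson(residuals)
  return 2.0 if residuals.size <= 1
  # Sum of squared successive differences over the sum of squared residuals
  numerator = residuals.each_cons(2).sum(0.0) { |a, b| (b - a) ** 2 }
  denominator = residuals.sum(0.0) { |e| e ** 2 }
  numerator / denominator
end

puts durbin_watson([1.0, -1.0, 1.0, -1.0]) # alternating signs: d above 2
puts durbin_watson([1.0, 2.0, 3.0, 4.0])   # steadily drifting: d near 0
```

Alternating residuals push d towards 4 (negative autocorrelation), while slowly drifting residuals push it towards 0 (positive autocorrelation).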

### #geometric_mean ⇒ Object

Returns the geometric mean of the measurements. If a measurement equal to 0.0 is encountered, the result is 0.0; if a measurement less than 0.0 is encountered first, this method returns NaN.

```ruby
# File 'lib/bullshit.rb', line 1127

def geometric_mean
  @geometric_mean ||= (
    sum = @measurements.inject(0.0) { |s, t|
      case
      when t > 0
        s + Math.log(t)
      when t == 0
        break :null
      else
        break nil
      end
    }
    case sum
    when :null
      0.0
    when Float
      Math.exp(sum / size)
    else
      0 / 0.0
    end
  )
end
```

### #harmonic_mean ⇒ Object

Returns the harmonic mean of the measurements. If any of the measurements is less than or equal to 0.0, this method returns NaN.

```ruby
# File 'lib/bullshit.rb', line 1112

def harmonic_mean
  @harmonic_mean ||= (
    sum = @measurements.inject(0.0) { |s, t|
      if t > 0
        s + 1.0 / t
      else
        break nil
      end
    }
    sum ? size / sum : 0 / 0.0
  )
end
```
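The geometric and harmonic means can be compared on a small array with the standalone sketch below. The top-level helpers are hypothetical and simplified: unlike the library's scan, which stops at the first offending value, they check the whole array up front.

```ruby
# Simplified sketch; validity is checked for the whole array before computing.
def geometric_mean(xs)
  return Float::NAN if xs.any? { |x| x < 0 }
  return 0.0 if xs.any? { |x| x == 0 }
  # exp of the mean logarithm avoids overflow from multiplying many values
  Math.exp(xs.sum(0.0) { |x| Math.log(x) } / xs.size)
end

def harmonic_mean(xs)
  return Float::NAN if xs.any? { |x| x <= 0 }
  xs.size / xs.sum(0.0) { |x| 1.0 / x }
end

xs = [1.0, 2.0, 4.0]
puts geometric_mean(xs) # close to 2.0, the cube root of 8
puts harmonic_mean(xs)  # 3 / 1.75, roughly 1.714
```

For positive data the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean (here about 2.333).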

### #histogram(bins) ⇒ Object

Returns a Histogram instance with `bins` as the number of bins for this analysis' measurements.

```ruby
# File 'lib/bullshit.rb', line 1362

def histogram(bins)
  Histogram.new(self, bins)
end
```

### #linear_regression ⇒ Object

Returns the LinearRegression object for the equation a * x + b which represents the line computed by the linear regression algorithm.

```ruby
# File 'lib/bullshit.rb', line 1356

def linear_regression
  @linear_regression ||= LinearRegression.new @measurements
end
```

### #ljung_box_statistic(lags = 20) ⇒ Object

Returns the q value of the Ljung-Box statistic for the number of lags `lags`. A higher value might indicate autocorrelation in the measurements of this Analysis instance. This method returns nil if not enough lags are available, that is, if `lags` is not smaller than the number of autocorrelation values.

```ruby
# File 'lib/bullshit.rb', line 1294

def ljung_box_statistic(lags = 20)
  r = autocorrelation
  lags >= r.size and return
  n = size
  n * (n + 2) * (1..lags).inject(0.0) { |s, i| s + r[i] ** 2 / (n - i) }
end
```
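The Q value can be reproduced on a plain array with the following minimal sketch. `ljung_box_q` is a hypothetical helper that recomputes the autocorrelations itself and returns nil under the same condition as above.

```ruby
# Standalone sketch of Q = n (n + 2) * sum_{k=1..lags} r_k^2 / (n - k).
def ljung_box_q(xs, lags)
  n = xs.size
  # n - 1 autocorrelation values exist; require lags to be smaller than that
  return nil if lags >= n - 1
  mean = xs.sum(0.0) / n
  c = Array.new(lags + 1) do |k|
    (0...(n - k)).sum(0.0) { |i| (xs[i] - mean) * (xs[i + k] - mean) } / n
  end
  n * (n + 2) * (1..lags).sum(0.0) { |k| (c[k] / c[0]) ** 2 / (n - k) }
end

puts ljung_box_q([1.0, 2.0, 3.0, 4.0], 2)         # roughly 1.58
puts ljung_box_q([1.0, 2.0, 3.0, 4.0], 3).inspect # nil, too few lags
```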

### #max ⇒ Object

Returns the maximum of the measurements.

```ruby
# File 'lib/bullshit.rb', line 1156

def max
  @max ||= @measurements.max
end
```

### #min ⇒ Object

Returns the minimum of the measurements.

```ruby
# File 'lib/bullshit.rb', line 1151

def min
  @min ||= @measurements.min
end
```

### #percentile(p = 50) ⇒ Object

Also known as: median

Returns the `p`-percentile of the measurements. There are many methods to compute the percentile; this method uses the weighted average at x_(n + 1)p, which allows p to be in 0...100 (excluding 100).

```ruby
# File 'lib/bullshit.rb', line 1164

def percentile(p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  p /= 100.0
  @sorted ||= @measurements.sort
  r = p * (@sorted.size + 1)
  r_i = r.to_i
  r_f = r - r_i
  if r_i >= 1
    result = @sorted[r_i - 1]
    if r_i < @sorted.size
      result += r_f * (@sorted[r_i] - @sorted[r_i - 1])
    end
  else
    result = @sorted[0]
  end
  result
end
```
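The weighted-average rule is easiest to follow on a tiny data set. The sketch below mirrors the method above as a hypothetical standalone function, `percentile_of`, that takes the array as an argument.

```ruby
# Standalone sketch of the x_(n + 1)p rule; the function name is an assumption.
def percentile_of(xs, p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  sorted = xs.sort
  r = p / 100.0 * (sorted.size + 1) # fractional 1-based rank
  r_i = r.to_i
  r_f = r - r_i
  return sorted[0] if r_i < 1
  result = sorted[r_i - 1]
  # Interpolate towards the next value when there is one
  result += r_f * (sorted[r_i] - sorted[r_i - 1]) if r_i < sorted.size
  result
end

xs = [5.0, 1.0, 4.0, 2.0, 3.0]
puts percentile_of(xs, 25) # rank 1.5: halfway between 1.0 and 2.0
puts percentile_of(xs, 50) # rank 3.0: the middle value
puts percentile_of(xs, 75) # rank 4.5: halfway between 4.0 and 5.0
```

With five values, the ranks for the quartiles land halfway between neighbours, so the 25th and 75th percentiles are interpolated while the median is an actual data point.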

### #sample_standard_deviation ⇒ Object

Returns the sample standard deviation of the measurements.

```ruby
# File 'lib/bullshit.rb', line 1088

def sample_standard_deviation
  @sample_standard_deviation ||= Math.sqrt(sample_variance)
end
```

### #sample_standard_deviation_percentage ⇒ Object

Returns the sample standard deviation of the measurements in percentage of the arithmetic mean.

```ruby
# File 'lib/bullshit.rb', line 1094

def sample_standard_deviation_percentage
  @sample_standard_deviation_percentage ||= 100.0 * sample_standard_deviation / arithmetic_mean
end
```

### #sample_variance ⇒ Object

Returns the sample variance of the measurements, that is, the sum of squares divided by size - 1. For fewer than two measurements, 0.0 is returned.

```ruby
# File 'lib/bullshit.rb', line 1066

def sample_variance
  @sample_variance ||= size > 1 ? sum_of_squares / (size - 1.0) : 0.0
end
```

### #size ⇒ Object

Returns the number of measurements on which the analysis is based.

```ruby
# File 'lib/bullshit.rb', line 1056

def size
  @measurements.size
end
```

### #standard_deviation ⇒ Object

Returns the standard deviation of the measurements.

```ruby
# File 'lib/bullshit.rb', line 1077

def standard_deviation
  @sample_deviation ||= Math.sqrt(variance)
end
```

### #standard_deviation_percentage ⇒ Object

Returns the standard deviation of the measurements in percentage of the arithmetic mean.

```ruby
# File 'lib/bullshit.rb', line 1083

def standard_deviation_percentage
  @standard_deviation_percentage ||= 100.0 * standard_deviation / arithmetic_mean
end
```

### #suggested_sample_size(other, alpha = 0.05, beta = 0.05) ⇒ Object

Computes a sample size that is more likely to reveal a difference between the mean of this instance's measurements and that of `other`. `alpha` and `beta` are used as the levels for the type I and type II errors.

```ruby
# File 'lib/bullshit.rb', line 1235

def suggested_sample_size(other, alpha = 0.05, beta = 0.05)
  alpha, beta = alpha.abs, beta.abs
  signal = arithmetic_mean - other.arithmetic_mean
  df = size + other.size - 2
  pooled_variance_estimate = (sum_of_squares + other.sum_of_squares) / df
  td = TDistribution.new df
  (((td.inverse_probability(alpha) + td.inverse_probability(beta)) *
    Math.sqrt(pooled_variance_estimate)) / signal) ** 2
end
```

### #sum ⇒ Object

Returns the sum of all measurements.

```ruby
# File 'lib/bullshit.rb', line 1099

def sum
  @sum ||= @measurements.inject(0.0) { |s, t| s + t }
end
```

### #sum_of_squares ⇒ Object

Returns the sum of squares (the sum of the squared deviations) of the measurements.

```ruby
# File 'lib/bullshit.rb', line 1072

def sum_of_squares
  @sum_of_squares ||= @measurements.inject(0.0) { |s, t| s + (t - arithmetic_mean) ** 2 }
end
```

### #t_student(other) ⇒ Object

Returns the t value of the Student's t-test between this Analysis instance and the `other`.

```ruby
# File 'lib/bullshit.rb', line 1224

def t_student(other)
  signal = arithmetic_mean - other.arithmetic_mean
  # The noise term uses the sizes of both samples.
  noise = common_standard_deviation(other) *
    Math.sqrt(size ** -1 + other.size ** -1)
  # Return the ratio, as t_welch does.
  signal / noise
rescue Errno::EDOM
  0.0
end
```
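As a worked illustration of the Student's t value, the sketch below computes the ratio of the mean difference to the pooled noise for two small arrays. All helper names (`mean`, `sample_variance`, `student_t`) are hypothetical and independent of the library.

```ruby
# Standalone sketch: t = (mean_a - mean_b) / (s_pooled * sqrt(1/n_a + 1/n_b)).
def mean(xs)
  xs.sum(0.0) / xs.size
end

def sample_variance(xs)
  m = mean(xs)
  xs.sum(0.0) { |x| (x - m) ** 2 } / (xs.size - 1)
end

def student_t(a, b)
  pooled = ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
    (a.size + b.size - 2)
  noise = Math.sqrt(pooled) * Math.sqrt(1.0 / a.size + 1.0 / b.size)
  (mean(a) - mean(b)) / noise
end

puts student_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]) # roughly -1.87
```

The sign only reflects which sample has the larger mean; the magnitude is what gets compared against the t distribution with n_a + n_b - 2 degrees of freedom.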

### #t_welch(other) ⇒ Object

Returns the t value of the Welch's t-test between this Analysis instance and the `other`.

```ruby
# File 'lib/bullshit.rb', line 1195

def t_welch(other)
  signal = arithmetic_mean - other.arithmetic_mean
  noise = Math.sqrt(sample_variance / size +
    other.sample_variance / other.size)
  signal / noise
rescue Errno::EDOM
  0.0
end
```
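Welch's t value differs from Student's only in the noise term, which does not pool the variances. A minimal standalone sketch, using the hypothetical helpers `sample_variance` and `welch_t`:

```ruby
# Standalone sketch: t = (mean_a - mean_b) / sqrt(s_a^2 / n_a + s_b^2 / n_b).
def sample_variance(xs)
  m = xs.sum(0.0) / xs.size
  xs.sum(0.0) { |x| (x - m) ** 2 } / (xs.size - 1)
end

def welch_t(a, b)
  signal = a.sum(0.0) / a.size - b.sum(0.0) / b.size
  noise = Math.sqrt(sample_variance(a) / a.size + sample_variance(b) / b.size)
  signal / noise
end

# Means 2 and 5; variance terms 1/3 and 5/3 add up to 2.
puts welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]) # -3 / sqrt(2), roughly -2.12
```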

### #variance ⇒ Object

Returns the variance of the measurements, that is, the sum of squares divided by size.

```ruby
# File 'lib/bullshit.rb', line 1061

def variance
  @variance ||= sum_of_squares / size
end
```
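The difference between the variance (divisor size) and the sample variance (divisor size - 1) is shown in the sketch below, which works on a plain array. `spread` is a hypothetical helper that returns both values together with the corresponding standard deviation.

```ruby
# Standalone sketch comparing the two divisors; the method name is an assumption.
def spread(xs)
  n = xs.size
  mean = xs.sum(0.0) / n
  sum_of_squares = xs.sum(0.0) { |x| (x - mean) ** 2 }
  variance = sum_of_squares / n
  {
    :variance           => variance,
    :sample_variance    => sum_of_squares / (n - 1.0),
    :standard_deviation => Math.sqrt(variance),
  }
end

# Mean 5, sum of squared deviations 32
puts spread([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]).inspect
```

For these eight values the variance is 32 / 8 and the sample variance is 32 / 7, so the sample variance is always the slightly larger of the two.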