Module: Enumerable
- Included in:
- FixedRange
- Defined in:
- lib/just_enumerable_stats.rb
Instance Attribute Summary
-
#range_class_args ⇒ Object
readonly
Returns the value of attribute range_class_args.
Instance Method Summary
-
#average(&block) ⇒ Object
(also: #mean, #avg)
The arithmetic mean, uses a block or default block.
-
#cartesian_product(other, &block) ⇒ Object
(also: #cp, #permutations)
Finds the cartesian product, excluding duplicate items and self-referential pairs.
-
#categories ⇒ Object
Takes the range_class and returns its map.
-
#compliment(other) ⇒ Object
Everything on the left hand side except what’s shared on the right hand side.
-
#correlation(other) ⇒ Object
(also: #cor)
Finds the correlation between two enumerables.
-
#cum_max(&block) ⇒ Object
(also: #cumulative_max)
Example: [1,2,3,0,5].cum_max # => [1,2,3,3,5].
-
#cum_min(&block) ⇒ Object
(also: #cumulative_min)
Example: [1,2,3,0,5].cum_min # => [1,1,1,0,0].
-
#cum_prod(sorted = false, &block) ⇒ Object
(also: #cumulative_product)
The cumulative product.
-
#cum_sum(sorted = false, &block) ⇒ Object
(also: #cumulative_sum)
The cumulative sum.
-
#default_block ⇒ Object
The block called to filter the values in the object.
-
#default_block=(block) ⇒ Object
Allows me to set up a block for a series of operations.
-
#euclidian_distance(other) ⇒ Object
Returns the Euclidean distance between all points of a set of enumerables.
-
#exclusive_not(other) ⇒ Object
Everything but what’s shared.
-
#intersect(other) ⇒ Object
What’s shared on the left and right hand sides “The intersection of x and y”.
- #is_numeric? ⇒ Boolean
-
#max(&block) ⇒ Object
Returns the max, using an optional block.
-
#max_index(&block) ⇒ Object
Returns the first index of the max value.
-
#max_of_lists(*enums) ⇒ Object
Returns the max of two or more enumerables.
-
#median(ratio = 0.5, &block) ⇒ Object
The slow way is to iterate up to the middle point.
-
#min(&block) ⇒ Object
Min of any number of items.
-
#min_index(&block) ⇒ Object
Returns the first index of the min value.
-
#min_of_lists(*enums) ⇒ Object
Returns the min of two or more enumerables.
-
#new_sort(&block) ⇒ Object
I don’t pass the block to the sort, because a sort block needs to look something like: {|x,y| x <=> y}.
-
#order(&block) ⇒ Object
Given values like [10,5,5,1], rank should produce something like [4,2,2,1] and order should produce something like [4,2,3,1]. The trick is that rank skips as many as were duplicated, so there could not be a 3 in the rank from the example above.
- #original_max ⇒ Object
- #original_min ⇒ Object
-
#product ⇒ Object
Multiplies the values: >> product(1,2,3) => 6.0.
-
#quantile(&block) ⇒ Object
First quartile: nth_split_by_m(1, 4). Third quartile: nth_split_by_m(3, 4). Median: nth_split_by_m(1, 2). Doesn’t match R, and it’s silly to try to.
-
#rand_in_range(*args) ⇒ Object
Returns a random integer in the range for any number of lists.
-
#range(&block) ⇒ Object
Just an array of [min, max] to comply with R's use of the word.
-
#range_as_range(&block) ⇒ Object
(also: #range_instance)
Actually instantiates the range, instead of producing a min and max array.
-
#range_class ⇒ Object
When creating a range, what class will it be? Defaults to Range, but other classes are sometimes useful.
-
#rank(&block) ⇒ Object
Doesn’t overwrite things like Matrix#rank.
-
#set_range_class(klass, *args) ⇒ Object
Useful for setting a real range class (FixedRange).
-
#sigma_pairs(other, z = zero, &block) ⇒ Object
Sigma of pairs.
-
#standard_deviation(&block) ⇒ Object
(also: #std)
The standard deviation.
-
#sum ⇒ Object
Adds up the list.
-
#tanimoto_pairs(other) ⇒ Object
(also: #tanimoto_correlation)
Finds the tanimoto coefficient: the intersection set size / union set size.
-
#to_pairs(other, &block) ⇒ Object
There are going to be a lot more of these kinds of things, so pay attention.
-
#union(other) ⇒ Object
All of the left and right hand sides, excluding duplicates.
-
#variance(&block) ⇒ Object
(also: #var)
The variance, uses a block or default block.
-
#yield_transpose(*enums, &block) ⇒ Object
Transposes arrays of arrays and yields a block on the value.
Instance Attribute Details
#range_class_args ⇒ Object (readonly)
Returns the value of attribute range_class_args.
# File 'lib/just_enumerable_stats.rb', line 242
def range_class_args
  @range_class_args
end
Instance Method Details
#average(&block) ⇒ Object Also known as: mean, avg
The arithmetic mean, uses a block or default block.
# File 'lib/just_enumerable_stats.rb', line 107
def average(&block)
  sum(&block)/size
end
#cartesian_product(other, &block) ⇒ Object Also known as: cp, permutations
Finds the cartesian product, excluding duplicate items and self-referential pairs. Yields the block value if given.
# File 'lib/just_enumerable_stats.rb', line 450
def cartesian_product(other, &block)
  x,y = self.uniq.dup, other.uniq.dup
  pairs = x.inject([]) do |cp, i|
    cp | y.map{|b| i == b ? nil : [i,b]}.compact
  end
  return pairs unless block_given?
  pairs.map{|p| yield p.first, p.last}
end
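The pairing rule in the listing above can be tried without the library. The following is a minimal sketch with plain arrays; `distinct_pairs` is a hypothetical helper name, not part of just_enumerable_stats.

```ruby
# Hypothetical helper (not the library's method): builds every [a, b] pair
# from the unique members of each list, skipping pairs where a == b.
def distinct_pairs(left, right)
  pairs = left.uniq.flat_map do |a|
    right.uniq.reject { |b| a == b }.map { |b| [a, b] }
  end
  return pairs unless block_given?
  # With a block, return the block's value for each pair instead.
  pairs.map { |a, b| yield a, b }
end

pairs = distinct_pairs([1, 2, 2], [2, 3])              # => [[1, 2], [1, 3], [2, 3]]
sums  = distinct_pairs([1, 2], [2, 3]) { |a, b| a + b } # => [3, 4, 5]
```

The pair [2, 2] never appears because both members are equal, and the repeated 2 on the left contributes only once.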
#categories ⇒ Object
Takes the range_class and returns its map. Example:
require 'mathn'
a = [1,2,3]
a.set_range_class FixedRange, a.min, a.max, 1/4
a.categories
=> [1, 5/4, 3/2, 7/4, 2, 9/4, 5/2, 11/4, 3]
For non-numeric values, returns a unique set, ordered if possible.
# File 'lib/just_enumerable_stats.rb', line 217
def categories
  if self.is_numeric?
    self.range_instance.map
  else
    self.uniq.sort rescue self.uniq
  end
end
#compliment(other) ⇒ Object
Everything on the left hand side except what’s shared on the right hand side. “The relative complement of y in x”
# File 'lib/just_enumerable_stats.rb', line 439
def compliment(other)
  self - other
end
#correlation(other) ⇒ Object Also known as: cor
Finds the correlation between two enumerables. Example: [1,2,3].cor [2,3,5] returns 0.981980506061966
# File 'lib/just_enumerable_stats.rb', line 495
def correlation(other)
  n = [self.size, other.size].min
  sum_of_products_of_pairs = self.sigma_pairs(other) {|a, b| a * b}
  self_sum = self.sum
  other_sum = other.sum
  sum_of_squared_self_scores = self.sum { |e| e * e }
  sum_of_squared_other_scores = other.sum { |e| e * e }

  numerator = (n * sum_of_products_of_pairs) - (self_sum * other_sum)
  self_denominator = ((n * sum_of_squared_self_scores) - (self_sum ** 2))
  other_denominator = ((n * sum_of_squared_other_scores) - (other_sum ** 2))
  denominator = Math.sqrt(self_denominator * other_denominator)
  return numerator / denominator
end
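The arithmetic in the listing above is the computational form of the Pearson correlation coefficient. A standalone sketch follows, assuming Ruby 2.4+ for `Array#sum`; `pearson` is a hypothetical name, and unlike the listing it truncates both lists to the shorter length before summing.

```ruby
# Hypothetical standalone version of the same formula:
# r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
def pearson(xs, ys)
  n = [xs.size, ys.size].min
  xs, ys = xs.first(n), ys.first(n)  # assumption: ignore unpaired tail values
  sum_xy = xs.zip(ys).sum { |x, y| x * y }
  sum_x  = xs.sum
  sum_y  = ys.sum
  sum_xx = xs.sum { |x| x * x }
  sum_yy = ys.sum { |y| y * y }
  numerator   = n * sum_xy - sum_x * sum_y
  denominator = Math.sqrt((n * sum_xx - sum_x**2) * (n * sum_yy - sum_y**2))
  numerator / denominator
end

r = pearson([1, 2, 3], [2, 3, 5])
```

For these inputs the numerator is 9 and the denominator is the square root of 84, which agrees with the 0.981980506061966 quoted in the description.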
#cum_max(&block) ⇒ Object Also known as: cumulative_max
Example: [1,2,3,0,5].cum_max # => [1,2,3,3,5]
# File 'lib/just_enumerable_stats.rb', line 377
def cum_max(&block)
  morph_list(&block).inject([]) do |list, e|
    found = (list | [e]).max
    list << (found ? found : e)
  end
end
#cum_min(&block) ⇒ Object Also known as: cumulative_min
Example: [1,2,3,0,5].cum_min # => [1,1,1,0,0]
# File 'lib/just_enumerable_stats.rb', line 387
def cum_min(&block)
  morph_list(&block).inject([]) do |list, e|
    found = (list | [e]).min
    list << (found ? found : e)
  end
end
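A running maximum or minimum can also be computed by comparing each value only with the previous running result, rather than with the whole list built so far as the listings above do. This is a sketch of that alternative with hypothetical helper names, not the library's code.

```ruby
# Hypothetical helpers: each output element is the extreme value seen so far.
def running_max(values)
  values.each_with_object([]) do |v, out|
    out << (out.empty? ? v : [out.last, v].max)
  end
end

def running_min(values)
  values.each_with_object([]) do |v, out|
    out << (out.empty? ? v : [out.last, v].min)
  end
end

maxes = running_max([1, 2, 3, 0, 5])  # => [1, 2, 3, 3, 5]
mins  = running_min([1, 2, 3, 0, 5])  # => [1, 1, 1, 0, 0]
```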
#cum_prod(sorted = false, &block) ⇒ Object Also known as: cumulative_product
The cumulative product. Example: [1,2,3].cum_prod # => [1.0, 2.0, 6.0]
# File 'lib/just_enumerable_stats.rb', line 350
def cum_prod(sorted=false, &block)
  prod = one
  obj = sorted ? self.new_sort : self
  if block_given?
    obj.map { |i| prod *= yield(i) }
  elsif default_block
    obj.map { |i| prod *= default_block[*i] }
  else
    obj.map { |i| prod *= i }
  end
end
#cum_sum(sorted = false, &block) ⇒ Object Also known as: cumulative_sum
The cumulative sum. Example: [1,2,3].cum_sum # => [1, 3, 6]
# File 'lib/just_enumerable_stats.rb', line 335
def cum_sum(sorted=false, &block)
  sum = zero
  obj = sorted ? self.new_sort : self
  if block_given?
    obj.map { |i| sum += yield(i) }
  elsif default_block
    obj.map { |i| sum += default_block[*i] }
  else
    obj.map { |i| sum += i }
  end
end
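Both cumulative methods share one pattern: keep an accumulator outside the `map` block and return its value after each update. A minimal sketch with hypothetical helper names follows; the starting values 0 and 1.0 are assumptions standing in for the library's `zero` and `one` helpers, which are not shown on this page.

```ruby
# Hypothetical helper: running total, optionally transforming each value first.
def running_sum(values)
  total = 0
  values.map { |v| total += (block_given? ? yield(v) : v) }
end

# Hypothetical helper: running product, starting from a Float so results are Floats.
def running_product(values)
  total = 1.0
  values.map { |v| total *= v }
end

sums     = running_sum([1, 2, 3])                  # => [1, 3, 6]
squares  = running_sum([1, 2, 3]) { |v| v * v }    # => [1, 5, 14]
products = running_product([1, 2, 3])              # => [1.0, 2.0, 6.0]
```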
#default_block ⇒ Object
The block called to filter the values in the object.
# File 'lib/just_enumerable_stats.rb', line 68
def default_block
  @default_stat_block
end
#default_block=(block) ⇒ Object
Allows me to set up a block for a series of operations. Example:
a = [1,2,3]
a.sum # => 6.0
a.default_block = lambda{|e| 1 / e}
a.sum # => 1.0
# File 'lib/just_enumerable_stats.rb', line 77
def default_block=(block)
  @default_stat_block = block
end
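The fallback order used throughout this module (explicit block first, then the default block, then the raw value) can be sketched in isolation. `Series` below is a hypothetical stand-in class written for illustration; it is not how the library stores its state.

```ruby
# Hypothetical stand-in for the default-block mechanism.
class Series
  attr_accessor :default_block

  def initialize(values)
    @values = values
  end

  # An explicit block wins; otherwise use the default block, if one is set.
  def sum(&block)
    transform = block || default_block || ->(v) { v }
    @values.inject(0.0) { |total, v| total + transform.call(v) }
  end
end

s = Series.new([1, 2, 3])
plain = s.sum                        # => 6.0
s.default_block = ->(v) { v * 2 }
doubled  = s.sum                     # => 12.0
explicit = s.sum { |v| v * v }       # => 14.0
```

Setting the default block once affects every later call that does not pass its own block, which is the convenience the description refers to.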
#euclidian_distance(other) ⇒ Object
Returns the Euclidean distance between all points of a set of enumerables.
# File 'lib/just_enumerable_stats.rb', line 469
def euclidian_distance(other)
  Math.sqrt(self.sigma_pairs(other) {|a, b| (a - b) ** 2})
end
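As a worked check of the formula in the listing above (the square root of the summed squared differences), here is a small standalone sketch. `euclidean` is a hypothetical name, and Ruby 2.4+ is assumed for `Array#sum`.

```ruby
# Hypothetical standalone version: pairs values by index up to the shorter list.
def euclidean(xs, ys)
  n = [xs.size, ys.size].min
  Math.sqrt(xs.first(n).zip(ys.first(n)).sum { |x, y| (x - y)**2 })
end

d = euclidean([0, 0], [3, 4])  # => 5.0 (the 3-4-5 right triangle)
```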
#exclusive_not(other) ⇒ Object
Everything but what’s shared
# File 'lib/just_enumerable_stats.rb', line 444
def exclusive_not(other)
  (self | other) - (self & other)
end
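The four set helpers in this module (#compliment, #exclusive_not, #intersect and #union) are thin wrappers over Ruby's built-in Array operators, as their listings show. The sketch below uses those operators directly on two sample arrays; the variable names are only illustrative.

```ruby
x = [1, 2, 3, 4]
y = [3, 4, 5]

union         = x | y                # => [1, 2, 3, 4, 5]
intersection  = x & y                # => [3, 4]
complement    = x - y                # => [1, 2]
exclusive_not = (x | y) - (x & y)    # => [1, 2, 5]
```

The last expression is the symmetric difference: everything that appears in exactly one of the two arrays.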
#intersect(other) ⇒ Object
What’s shared on the left and right hand sides “The intersection of x and y”
# File 'lib/just_enumerable_stats.rb', line 432
def intersect(other)
  self & other
end
#is_numeric? ⇒ Boolean
# File 'lib/just_enumerable_stats.rb', line 225
def is_numeric?
  self.all? {|e| e.is_a?(Numeric)}
end
#max(&block) ⇒ Object
Returns the max, using an optional block.
# File 'lib/just_enumerable_stats.rb', line 42
def max(&block)
  self.inject do |best, e|
    val = block_sorter(best, e, &block)
    best = val > 0 ? best : e
  end
end
#max_index(&block) ⇒ Object
Returns the first index of the max value
# File 'lib/just_enumerable_stats.rb', line 50
def max_index(&block)
  self.index(max(&block))
end
#max_of_lists(*enums) ⇒ Object
Returns the max of two or more enumerables.
>> [1,2,3].max_of_lists(, [0,2,9])
=> [1, 5, 9]
# File 'lib/just_enumerable_stats.rb', line 523
def max_of_lists(*enums)
  yield_transpose(*enums) {|e| e.max}
end
#median(ratio = 0.5, &block) ⇒ Object
The slow way is to iterate up to the middle point. A faster way is to use the index, when available. If a block is supplied, always iterate to the middle point.
# File 'lib/just_enumerable_stats.rb', line 136
def median(ratio=0.5, &block)
  return iterate_midway(ratio, &block) if block_given?
  begin
    mid1, mid2 = middle_two
    sorted = new_sort
    med1, med2 = sorted[mid1], sorted[mid2]
    return med1 if med1 == med2
    return med1 + ((med2 - med1) * ratio)
  rescue
    iterate_midway(ratio, &block)
  end
end
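The index-based branch of the listing above interpolates between the two central values of the sorted list. The sketch below illustrates that idea under one assumption: that the `middle_two` helper (not shown on this page) returns the two central indices, which coincide when the size is odd. `interpolated_median` is a hypothetical name.

```ruby
# Hypothetical sketch of the index-based path, assuming the two middle
# indices are (size - 1) / 2 and size / 2.
def interpolated_median(values, ratio = 0.5)
  sorted = values.sort
  lo = (sorted.size - 1) / 2
  hi = sorted.size / 2
  a, b = sorted[lo], sorted[hi]
  return a if a == b
  # Move `ratio` of the way from the lower middle value to the upper one.
  a + (b - a) * ratio
end

odd     = interpolated_median([3, 1, 2])            # => 2
even    = interpolated_median([4, 1, 3, 2])         # => 2.5
quarter = interpolated_median([4, 1, 3, 2], 0.25)   # => 2.25
```

With the default ratio of 0.5 this is the usual median; other ratios shift the result toward one of the two middle values.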
#min(&block) ⇒ Object
Min of any number of items
# File 'lib/just_enumerable_stats.rb', line 55
def min(&block)
  self.inject do |best, e|
    val = block_sorter(best, e, &block)
    best = val < 0 ? best : e
  end
end
#min_index(&block) ⇒ Object
Returns the first index of the min value
# File 'lib/just_enumerable_stats.rb', line 63
def min_index(&block)
  self.index(min(&block))
end
#min_of_lists(*enums) ⇒ Object
Returns the min of two or more enumerables.
>> [1,2,3].min_of_lists(, [0,2,9])
=> [0, 2, 3]
# File 'lib/just_enumerable_stats.rb', line 530
def min_of_lists(*enums)
  yield_transpose(*enums) {|e| e.min}
end
#new_sort(&block) ⇒ Object
I don’t pass the block to the sort, because a sort block needs to look something like: {|x,y| x <=> y}. To get around this, set the default block on the object.
# File 'lib/just_enumerable_stats.rb', line 263
def new_sort(&block)
  if block_given?
    map { |i| yield(i) }.sort.dup
  elsif default_block
    map { |i| default_block[*i] }.sort.dup
  else
    sort().dup
  end
end
#order(&block) ⇒ Object
Given values like [10,5,5,1], rank should produce something like [4,2,2,1] and order should produce something like [4,2,3,1]. The trick is that rank skips as many as were duplicated, so there could not be a 3 in the rank from the example above.
# File 'lib/just_enumerable_stats.rb', line 293
def order(&block)
  hold = []
  rank(&block).each do |x|
    while hold.include?(x) do
      x += 1
    end
    hold << x
  end
  hold
end
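The difference between rank and order is easiest to see side by side. This sketch reproduces the [10,5,5,1] example with two hypothetical standalone helpers that follow the same steps as the #rank and #order listings, without block support.

```ruby
# Hypothetical helper: 1-based position of each value in the sorted list.
# Tied values share the position of their first occurrence.
def rank_of(values)
  sorted = values.sort
  values.map { |v| sorted.index(v) + 1 }
end

# Hypothetical helper: like rank_of, but bumps a repeated rank upward
# until it lands on a number that has not been used yet.
def order_of(values)
  seen = []
  rank_of(values).each do |r|
    r += 1 while seen.include?(r)
    seen << r
  end
  seen
end

ranks  = rank_of([10, 5, 5, 1])   # => [4, 2, 2, 1]
orders = order_of([10, 5, 5, 1])  # => [4, 2, 3, 1]
```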
#original_max ⇒ Object
# File 'lib/just_enumerable_stats.rb', line 26
alias :original_max :max
#original_min ⇒ Object
# File 'lib/just_enumerable_stats.rb', line 27
alias :original_min :min
#product ⇒ Object
Multiplies the values: >> product(1,2,3) => 6.0
# File 'lib/just_enumerable_stats.rb', line 398
def product
  self.inject(one) {|sum, a| sum *= a}
end
#quantile(&block) ⇒ Object
First quartile: nth_split_by_m(1, 4)
Third quartile: nth_split_by_m(3, 4)
Median: nth_split_by_m(1, 2)
Doesn’t match R, and it’s silly to try to.
def nth_split_by_m(n, m)
  sorted = new_sort
  dividers = m - 1
  if size % m == dividers # Divides evenly
    # Because we have a 0-based list, we get the floor
    i = ((size / m.to_f) * n).floor
    j = i
  else
    # This reflects R's approach, which I don't think I agree with.
    i = (((size / m.to_f) * n) - 1)
    i = i > (size / m.to_f) ? i.floor : i.ceil
    j = i + 1
  end
  sorted[i] + ((n / m.to_f) * (sorted[j] - sorted[i]))
end
# File 'lib/just_enumerable_stats.rb', line 323
def quantile(&block)
  [
    min(&block),
    first_half(&block).median(0.25, &block),
    median(&block),
    second_half(&block).median(0.75, &block),
    max(&block)
  ]
end
#rand_in_range(*args) ⇒ Object
Returns a random integer in the range for any number of lists. This is a way to get a random vector that is tenable based on the sample data. For example, given two sets of numbers:
a = [1,2,3]; b = [8,8,8]
a.rand_in_range(b) will return a value >= 1 and <= 8 in the first place, >= 2 and <= 8 in the second place, and >= 3 and <= 8 in the last place. Works for integers. Rethink this for floats. May consider setting up FixedRange for floats. O(n*5)
# File 'lib/just_enumerable_stats.rb', line 484
def rand_in_range(*args)
  min = self.min_of_lists(*args)
  max = self.max_of_lists(*args)
  (0...size).inject([]) do |ary, i|
    ary << rand_between(min[i], max[i])
  end
end
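The per-position bounds described above can be sketched with Ruby's built-in `Kernel#rand`, which accepts an inclusive integer range. `random_vector_within` is a hypothetical name, and it stands in for the library's `rand_between` helper, whose exact behavior is not shown on this page.

```ruby
# Hypothetical helper: for each index, pick a random integer between the
# smallest and largest value found at that index across all lists.
def random_vector_within(*lists)
  n = lists.map(&:size).min
  (0...n).map do |i|
    column = lists.map { |list| list[i] }
    rand(column.min..column.max)  # inclusive on both ends
  end
end

a = [1, 2, 3]
b = [8, 8, 8]
vector = random_vector_within(a, b)
```

Each run gives a different vector, but the first element always falls in 1..8, the second in 2..8 and the third in 3..8.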
#range(&block) ⇒ Object
Just an array of [min, max] to comply with R's use of the word. Use range_as_range if you want a real Range.
# File 'lib/just_enumerable_stats.rb', line 231
def range(&block)
  [min(&block), max(&block)]
end
#range_as_range(&block) ⇒ Object Also known as: range_instance
Actually instantiates the range, instead of producing a min and max array.
# File 'lib/just_enumerable_stats.rb', line 251
def range_as_range(&block)
  if @range_class_args and not @range_class_args.empty?
    self.range_class.new(*@range_class_args)
  else
    self.range_class.new(min(&block), max(&block))
  end
end
#range_class ⇒ Object
When creating a range, what class will it be? Defaults to Range, but other classes are sometimes useful.
# File 'lib/just_enumerable_stats.rb', line 246
def range_class
  @range_class ||= Range
end
#rank(&block) ⇒ Object
Doesn’t overwrite things like Matrix#rank
# File 'lib/just_enumerable_stats.rb', line 274
def rank(&block)
  sorted = new_sort(&block)

  if block_given?
    map { |i| sorted.index(yield(i)) + 1 }
  elsif default_block
    map { |i| sorted.index(default_block[*i]) + 1 }
  else
    map { |i| sorted.index(i) + 1 }
  end
end
#set_range_class(klass, *args) ⇒ Object
Useful for setting a real range class (FixedRange).
# File 'lib/just_enumerable_stats.rb', line 236
def set_range_class(klass, *args)
  @range_class = klass
  @range_class_args = args
  self.range_class
end
#sigma_pairs(other, z = zero, &block) ⇒ Object
Sigma of pairs. Returns a single float, or whatever object is sent in. Example: [1,2,3].sigma_pairs(, 0) {|x, y| x + y} returns 21 instead of 21.0.
# File 'lib/just_enumerable_stats.rb', line 464
def sigma_pairs(other, z=zero, &block)
  self.to_pairs(other,&block).inject(z) {|sum, i| sum += i}
end
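The remark that the result is "a single float, or whatever object is sent in" comes down to the starting value of the fold. The sketch below shows that effect with two hypothetical helpers; the 0.0 default is an assumption about what the library's `zero` helper returns.

```ruby
# Hypothetical helper: apply the block to values that share an index.
def pairwise(xs, ys)
  n = [xs.size, ys.size].min
  (0...n).map { |i| yield xs[i], ys[i] }
end

# Hypothetical helper: sum the pairwise results, starting from `start`.
def sigma(xs, ys, start = 0.0, &block)
  pairwise(xs, ys, &block).inject(start) { |sum, v| sum + v }
end

as_float   = sigma([1, 2], [3, 4]) { |x, y| x * y }      # => 11.0
as_integer = sigma([1, 2], [3, 4], 0) { |x, y| x * y }   # => 11
```

Passing an Integer start keeps the whole sum in Integer arithmetic, while the Float default promotes it.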
#standard_deviation(&block) ⇒ Object Also known as: std
The standard deviation. Uses a block or default block.
# File 'lib/just_enumerable_stats.rb', line 128
def standard_deviation(&block)
  Math::sqrt(variance(&block))
end
#sum ⇒ Object
Adds up the list. Uses a block or default block if present.
# File 'lib/just_enumerable_stats.rb', line 94
def sum
  sum = zero
  if block_given?
    each{|i| sum += yield(i)}
  elsif default_block
    each{|i| sum += default_block[*i]}
  else
    each{|i| sum += i}
  end
  sum
end
#tanimoto_pairs(other) ⇒ Object Also known as: tanimoto_correlation
Finds the tanimoto coefficient: the intersection set size / union set size. This is used to find the distance between two vectors.
>> [1,2,3].cor([2,3,5])
=> 0.981980506061966
>> [1,2,3].tanimoto_pairs([2,3,5])
=> 0.5
# File 'lib/just_enumerable_stats.rb', line 415
def tanimoto_pairs(other)
  intersect(other).size / union(other).size.to_f
end
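The ratio in the listing above can be checked by hand with Ruby's Array set operators. `tanimoto` below is a hypothetical standalone helper used only for this worked example.

```ruby
# Hypothetical helper: shared members divided by all distinct members.
def tanimoto(xs, ys)
  (xs & ys).size / (xs | ys).size.to_f
end

t = tanimoto([1, 2, 3], [2, 3, 5])  # => 0.5
```

Here the intersection is [2, 3] (2 items) and the union is [1, 2, 3, 5] (4 items), so the coefficient is 2 / 4.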
#to_pairs(other, &block) ⇒ Object
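Calls the block with each pair of elements that share an index, up to the length of the shorter enumerable, and returns the results. There are going to be a lot more of these kinds of things, so pay attention.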
# File 'lib/just_enumerable_stats.rb', line 404
def to_pairs(other, &block)
  n = [self.size, other.size].min
  (0...n).map {|i| block.call(self[i], other[i]) }
end
#union(other) ⇒ Object
All of the left and right hand sides, excluding duplicates. “The union of x and y”
# File 'lib/just_enumerable_stats.rb', line 426
def union(other)
  self | other
end
#variance(&block) ⇒ Object Also known as: var
The variance, uses a block or default block.
# File 'lib/just_enumerable_stats.rb', line 114
def variance(&block)
  m = mean(&block)
  sum_of_differences = if block_given?
    sum{ |i| j=yield(i); (m - j) ** 2 }
  elsif default_block
    sum{ |i| j=default_block[*i]; (m - j) ** 2 }
  else
    sum{ |i| (m - i) ** 2 }
  end
  sum_of_differences / (size - 1)
end
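Because the listing divides by size - 1, this is the sample variance rather than the population variance. A standalone sketch follows, assuming Ruby 2.4+ for `Array#sum`; `sample_variance` and `sample_std` are hypothetical names, and block support is left out.

```ruby
# Hypothetical helper: sum of squared deviations from the mean,
# divided by (n - 1).
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# Hypothetical helper: the standard deviation is the square root of the variance.
def sample_std(values)
  Math.sqrt(sample_variance(values))
end

var = sample_variance([1, 2, 3])  # => 1.0
std = sample_std([1, 2, 3])       # => 1.0
```

For [1, 2, 3] the mean is 2.0, the squared deviations sum to 2.0, and dividing by 2 gives 1.0. A single-element list would divide by zero, so callers may want to guard against that case.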
#yield_transpose(*enums, &block) ⇒ Object
Transposes arrays of arrays and yields a block on the value. The regular Array#transpose ignores blocks.
# File 'lib/just_enumerable_stats.rb', line 513
def yield_transpose(*enums, &block)
  enums.unshift(self)
  n = enums.map{ |x| x.size}.min
  block ||= lambda{|e| e}
  (0...n).map { |i| block.call enums.map{ |x| x[i] } }
end
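The column-by-column idea behind #yield_transpose, and therefore #max_of_lists and #min_of_lists, can be sketched without the library. `columnwise` is a hypothetical helper that takes all lists as arguments instead of using `self` as the first one.

```ruby
# Hypothetical helper: gather the values found at each index (a "column")
# and return either the column itself or the block's result for it.
def columnwise(*lists)
  n = lists.map(&:size).min
  (0...n).map do |i|
    column = lists.map { |list| list[i] }
    block_given? ? yield(column) : column
  end
end

columns = columnwise([1, 2], [3, 4])                        # => [[1, 3], [2, 4]]
maxes   = columnwise([1, 2, 3], [0, 2, 9]) { |c| c.max }    # => [1, 2, 9]
mins    = columnwise([1, 2, 3], [0, 2, 9]) { |c| c.min }    # => [0, 2, 3]
```

Lists of unequal length are cut to the shortest one, which matches the `min` of sizes taken in the listing.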