Module: Enumerable
- Included in:
- FixedRange
- Defined in:
- lib/just_enumerable_stats.rb
Instance Attribute Summary
-
#range_class_args ⇒ Object
readonly
The arguments needed to instantiate the custom-defined range class.
-
#range_hash ⇒ Object
readonly
The hash of lambdas that are used to categorize the enumerable.
Instance Method Summary
-
#average(&block) ⇒ Object
(also: #mean, #avg)
The arithmetic mean, uses a block or default block.
-
#cartesian_product(other, &block) ⇒ Object
(also: #cp, #permutations)
Finds the cartesian product, excluding duplicate items and self-referential pairs.
-
#categories ⇒ Object
Takes the range_class and returns its map.
-
#category_values(reset = false) ⇒ Object
Returns a Hash (or a Dictionary, if available) that maps each category to an array of the matching values.
-
#compliment(other) ⇒ Object
Everything on the left hand side except what’s shared on the right hand side.
-
#correlation(other) ⇒ Object
(also: #cor)
Finds the correlation between two enumerables.
-
#count_if(&block) ⇒ Object
Counts each element where the block evaluates to true. Example: a = [1,2,3]; a.count_if {|e| e % 2 == 0}
-
#cum_max(&block) ⇒ Object
(also: #cumulative_max)
Example: [1,2,3,0,5].cum_max # => [1,2,3,3,5].
-
#cum_min(&block) ⇒ Object
(also: #cumulative_min)
Example: [1,2,3,0,5].cum_min # => [1,1,1,0,0].
-
#cum_prod(sorted = false, &block) ⇒ Object
(also: #cumulative_product)
The cumulative product.
-
#cum_sum(sorted = false, &block) ⇒ Object
(also: #cumulative_sum)
The cumulative sum.
-
#default_block ⇒ Object
The block called to filter the values in the object.
-
#default_block=(block) ⇒ Object
Allows me to set up a block for a series of operations.
-
#euclidian_distance(other) ⇒ Object
Returns the Euclidean distance between all points of a set of enumerables.
-
#exclusive_not(other) ⇒ Object
Everything but what’s shared.
-
#intersect(other) ⇒ Object
What’s shared on the left and right hand sides: “The intersection of x and y”.
- #is_numeric? ⇒ Boolean
-
#max(&block) ⇒ Object
Returns the max, using an optional block.
-
#max_index(&block) ⇒ Object
Returns the first index of the max value.
-
#max_of_lists(*enums) ⇒ Object
Returns the max of two or more enumerables.
-
#median(ratio = 0.5, &block) ⇒ Object
The slow way is to iterate up to the middle point.
-
#min(&block) ⇒ Object
Min of any number of items.
-
#min_index(&block) ⇒ Object
Returns the first index of the min value.
-
#min_of_lists(*enums) ⇒ Object
Returns the min of two or more enumerables.
-
#new_sort(&block) ⇒ Object
I don’t pass the block to the sort, because a sort block needs to look something like: {|x,y| x <=> y}.
-
#order(&block) ⇒ Object
Given values like [10,5,5,1], rank should produce something like [4,2,2,1] and order should produce something like [4,2,3,1]. The trick is that rank skips as many as were duplicated, so there could not be a 3 in the rank from the example above.
- #original_max ⇒ Object
- #original_min ⇒ Object
-
#product ⇒ Object
Multiplies the values: >> product(1,2,3) => 6.0.
-
#quantile(&block) ⇒ Object
First quartile: nth_split_by_m(1, 4). Third quartile: nth_split_by_m(3, 4). Median: nth_split_by_m(1, 2). Doesn’t match R, and it’s silly to try to.
-
#rand_in_range(*args) ⇒ Object
Returns a random integer in the range for any number of lists.
-
#range(&block) ⇒ Object
Just an array of [min, max] to comply with R uses of the word.
-
#range_as_range(&block) ⇒ Object
(also: #range_instance)
Actually instantiates the range, instead of producing a min and max array.
-
#range_class ⇒ Object
When creating a range, what class will it be? Defaults to Range, but other classes are sometimes useful.
-
#rank(&block) ⇒ Object
Doesn’t overwrite things like Matrix#rank.
-
#set_range(hash) ⇒ Object
Takes a hash of arrays for categories. If Facets happens to be loaded on the computer, this keeps the order of the categories straight.
-
#set_range_class(klass, *args) ⇒ Object
Useful for setting a real range class (FixedRange).
-
#sigma_pairs(other, z = zero, &block) ⇒ Object
Sigma of pairs.
-
#standard_deviation(&block) ⇒ Object
(also: #std)
The standard deviation.
-
#sum ⇒ Object
Adds up the list.
-
#tanimoto_pairs(other) ⇒ Object
(also: #tanimoto_correlation)
Finds the tanimoto coefficient: the intersection set size / union set size.
-
#to_pairs(other, &block) ⇒ Object
There are going to be a lot more of these kinds of things, so pay attention.
-
#union(other) ⇒ Object
All of the left and right hand sides, excluding duplicates.
-
#variance(&block) ⇒ Object
(also: #var)
The variance, uses a block or default block.
-
#yield_transpose(*enums, &block) ⇒ Object
Transposes arrays of arrays and yields a block on the value.
Instance Attribute Details
#range_class_args ⇒ Object (readonly)
The arguments needed to instantiate the custom-defined range class.
# File 'lib/just_enumerable_stats.rb', line 269
def range_class_args
  @range_class_args
end
#range_hash ⇒ Object (readonly)
The hash of lambdas that are used to categorize the enumerable.
# File 'lib/just_enumerable_stats.rb', line 266
def range_hash
  @range_hash
end
Instance Method Details
#average(&block) ⇒ Object Also known as: mean, avg
The arithmetic mean, uses a block or default block.
# File 'lib/just_enumerable_stats.rb', line 113
def average(&block)
  sum(&block)/size
end
#cartesian_product(other, &block) ⇒ Object Also known as: cp, permutations
Finds the cartesian product, excluding duplicate items and self-referential pairs. Yields the block value if given.
# File 'lib/just_enumerable_stats.rb', line 511
def cartesian_product(other, &block)
  x,y = self.uniq.dup, other.uniq.dup
  pairs = x.inject([]) do |cp, i|
    cp | y.map{|b| i == b ? nil : [i,b]}.compact
  end
  return pairs unless block_given?
  pairs.map{|p| yield p.first, p.last}
end
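As a rough illustration of the pairing rule described above, here is a standalone sketch that does not load the library. The name `unique_pairs` is hypothetical and exists only for this example.

```ruby
# Hypothetical helper, not part of just_enumerable_stats.
# Pairs every unique left value with every unique right value,
# skipping pairs whose two members are equal.
def unique_pairs(left, right)
  xs = left.uniq
  ys = right.uniq
  pairs = xs.flat_map { |a| ys.reject { |b| a == b }.map { |b| [a, b] } }.uniq
  return pairs unless block_given?
  pairs.map { |a, b| yield a, b }
end

p unique_pairs([1, 2, 2], [2, 3])
p unique_pairs([1, 2], [2, 3]) { |a, b| a + b }
```

The duplicate 2 on the left and the self-referential pair [2, 2] are both dropped before any block is applied.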
#categories ⇒ Object
Takes the range_class and returns its map. Example:
require 'mathn'
a = [1,2,3]
a range_class = FixedRange, a.min, a.max, 1/4
a.categories
=> [1, 5/4, 3/2, 7/4, 2, 9/4, 5/2, 11/4, 3]
For non-numeric values, returns a unique set, ordered if possible.
# File 'lib/just_enumerable_stats.rb', line 223
def categories
  if @categories
    @categories
  elsif self.is_numeric?
    self.range_instance.map
  else
    self.uniq.sort rescue self.uniq
  end
end
#category_values(reset = false) ⇒ Object
Returns a Hash (or a Dictionary, if available) that maps each category to an array of the matching values. This is supposed to be lean (just enumerables), but this is an expensive call, so I’m going to cache it and offer a parameter to reset the cache. So, call category_values(true) if you need to reset the cache.
# File 'lib/just_enumerable_stats.rb', line 288
def category_values(reset=false)
  @category_values = nil if reset
  return @category_values if @category_values
  container = defined?(Dictionary) ? Dictionary.new : Hash.new
  if self.range_hash
    @category_values = self.categories.inject(container) do |cont, cat|
      cont[cat] = self.find_all &self.range_hash[cat]
      cont
    end
  else
    @category_values = self.categories.inject(container) do |cont, cat|
      cont[cat] = self.find_all {|e| e == cat}
      cont
    end
  end
end
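The grouping step can be sketched in plain Ruby, assuming the range hash maps category names to lambdas as the range_hash attribute describes. The category names and the `values_by_category` helper below are made up for illustration, and the caching is left out.

```ruby
# Hypothetical helper, not part of just_enumerable_stats.
# Collects, for each category, the values accepted by that category's lambda.
def values_by_category(values, tests)
  tests.each_with_object({}) do |(name, test), out|
    out[name] = values.find_all(&test)
  end
end

tests = {
  "low"  => lambda { |e| e < 3 },
  "high" => lambda { |e| e >= 3 }
}
p values_by_category([1, 2, 3, 4, 5], tests)
```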
#compliment(other) ⇒ Object
Everything on the left hand side except what’s shared on the right hand side. “The relative complement of y in x”
# File 'lib/just_enumerable_stats.rb', line 500
def compliment(other)
  self - other
end
#correlation(other) ⇒ Object Also known as: cor
Finds the correlation between two enumerables. Example: [1,2,3].cor [2,3,5] returns 0.981980506061966
# File 'lib/just_enumerable_stats.rb', line 556
def correlation(other)
  n = [self.size, other.size].min
  sum_of_products_of_pairs = self.sigma_pairs(other) {|a, b| a * b}
  self_sum = self.sum
  other_sum = other.sum
  sum_of_squared_self_scores = self.sum { |e| e * e }
  sum_of_squared_other_scores = other.sum { |e| e * e }
  numerator = (n * sum_of_products_of_pairs) - (self_sum * other_sum)
  self_denominator = ((n * sum_of_squared_self_scores) - (self_sum ** 2))
  other_denominator = ((n * sum_of_squared_other_scores) - (other_sum ** 2))
  denominator = Math.sqrt(self_denominator * other_denominator)
  return numerator / denominator
end
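The same computational formula can be tried without the library. The sketch below uses a hypothetical `pearson` function and, unlike the listing above, truncates both inputs to the shorter length before summing; that truncation is an assumption of this example.

```ruby
# Hypothetical standalone version of the formula used by #correlation.
def pearson(xs, ys)
  n = [xs.size, ys.size].min
  xs = xs.first(n)
  ys = ys.first(n)
  sum_xy = xs.zip(ys).sum { |a, b| a * b }
  sum_x  = xs.sum
  sum_y  = ys.sum
  sum_x2 = xs.sum { |a| a * a }
  sum_y2 = ys.sum { |b| b * b }
  numerator   = (n * sum_xy) - (sum_x * sum_y)
  denominator = Math.sqrt(((n * sum_x2) - sum_x**2) * ((n * sum_y2) - sum_y**2))
  numerator / denominator
end

puts pearson([1, 2, 3], [2, 3, 5])
```

For the documented inputs the numerator is 9 and the denominator is the square root of 84, which agrees with the documented 0.981980506061966.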
#count_if(&block) ⇒ Object
Counts each element where the block evaluates to true. Example: a = [1,2,3]; a.count_if {|e| e % 2 == 0}
# File 'lib/just_enumerable_stats.rb', line 275
def count_if(&block)
  self.inject(0) do |s, e|
    s += 1 if block.call(e)
    s
  end
end
#cum_max(&block) ⇒ Object Also known as: cumulative_max
Example: [1,2,3,0,5].cum_max # => [1,2,3,3,5]
# File 'lib/just_enumerable_stats.rb', line 438
def cum_max(&block)
  morph_list(&block).inject([]) do |list, e|
    found = (list | [e]).max
    list << (found ? found : e)
  end
end
#cum_min(&block) ⇒ Object Also known as: cumulative_min
Example: [1,2,3,0,5].cum_min # => [1,1,1,0,0]
# File 'lib/just_enumerable_stats.rb', line 448
def cum_min(&block)
  morph_list(&block).inject([]) do |list, e|
    found = (list | [e]).min
    list << (found ? found : e)
  end
end
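A running maximum or minimum only needs the previous result and the current element. The sketch below is a simplified standalone variant (the `running_extreme` name is hypothetical) that compares against the last emitted value rather than the whole list.

```ruby
# Hypothetical helper, not part of just_enumerable_stats.
# pick is :max or :min.
def running_extreme(values, pick)
  values.each_with_object([]) do |e, out|
    out << (out.empty? ? e : [out.last, e].public_send(pick))
  end
end

p running_extreme([1, 2, 3, 0, 5], :max)
p running_extreme([1, 2, 3, 0, 5], :min)
```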
#cum_prod(sorted = false, &block) ⇒ Object Also known as: cumulative_product
The cumulative product. Example: [1,2,3].cum_prod # => [1.0, 2.0, 6.0]
# File 'lib/just_enumerable_stats.rb', line 411
def cum_prod(sorted=false, &block)
  prod = one
  obj = sorted ? self.new_sort : self
  if block_given?
    obj.map { |i| prod *= yield(i) }
  elsif default_block
    obj.map { |i| prod *= default_block[*i] }
  else
    obj.map { |i| prod *= i }
  end
end
#cum_sum(sorted = false, &block) ⇒ Object Also known as: cumulative_sum
The cumulative sum. Example: [1,2,3].cum_sum # => [1, 3, 6]
# File 'lib/just_enumerable_stats.rb', line 396
def cum_sum(sorted=false, &block)
  sum = zero
  obj = sorted ? self.new_sort : self
  if block_given?
    obj.map { |i| sum += yield(i) }
  elsif default_block
    obj.map { |i| sum += default_block[*i] }
  else
    obj.map { |i| sum += i }
  end
end
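Both cumulative methods follow the same pattern: keep an accumulator and map each element to the updated accumulator. A minimal standalone sketch, with a hypothetical `running_fold` helper and explicit start values standing in for the library's `zero` and `one` helpers (which are not shown on this page):

```ruby
# Hypothetical helper, not part of just_enumerable_stats.
# Yields (accumulator, value) and records every intermediate result.
def running_fold(values, start)
  acc = start
  values.map { |v| acc = yield(acc, v) }
end

p running_fold([1, 2, 3], 0)   { |acc, v| acc + v }
p running_fold([1, 2, 3], 1.0) { |acc, v| acc * v }
```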
#default_block ⇒ Object
The block called to filter the values in the object.
# File 'lib/just_enumerable_stats.rb', line 74
def default_block
  @default_stat_block
end
#default_block=(block) ⇒ Object
Allows me to set up a block for a series of operations. Example:
a = [1,2,3]
a.sum # => 6.0
a.default_block = lambda{|e| 1 / e}
a.sum # => 1.0
# File 'lib/just_enumerable_stats.rb', line 83
def default_block=(block)
  @default_stat_block = block
end
#euclidian_distance(other) ⇒ Object
Returns the Euclidean distance between all points of a set of enumerables.
# File 'lib/just_enumerable_stats.rb', line 530
def euclidian_distance(other)
  Math.sqrt(self.sigma_pairs(other) {|a, b| (a - b) ** 2})
end
#exclusive_not(other) ⇒ Object
Everything but what’s shared.
# File 'lib/just_enumerable_stats.rb', line 505
def exclusive_not(other)
  (self | other) - (self & other)
end
#intersect(other) ⇒ Object
What’s shared on the left and right hand sides: “The intersection of x and y”.
# File 'lib/just_enumerable_stats.rb', line 493
def intersect(other)
  self & other
end
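The set-style methods are thin wrappers over Ruby's built-in Array operators, so their behaviour can be checked directly. The arrays below are arbitrary sample data.

```ruby
x = [1, 2, 3]
y = [2, 3, 4]

p x - y               # what #compliment computes
p (x | y) - (x & y)   # what #exclusive_not computes
p x & y               # what #intersect computes
p x | y               # what #union computes
```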
#is_numeric? ⇒ Boolean
# File 'lib/just_enumerable_stats.rb', line 233
def is_numeric?
  self.all? {|e| e.is_a?(Numeric)}
end
#max(&block) ⇒ Object
Returns the max, using an optional block.
# File 'lib/just_enumerable_stats.rb', line 48
def max(&block)
  self.inject do |best, e|
    val = block_sorter(best, e, &block)
    best = val > 0 ? best : e
  end
end
#max_index(&block) ⇒ Object
Returns the first index of the max value
# File 'lib/just_enumerable_stats.rb', line 56
def max_index(&block)
  self.index(max(&block))
end
#max_of_lists(*enums) ⇒ Object
Returns the max of two or more enumerables.
>> [1,2,3].max_of_lists(, [0,2,9])
=> [1, 5, 9]
# File 'lib/just_enumerable_stats.rb', line 584
def max_of_lists(*enums)
  yield_transpose(*enums) {|e| e.max}
end
#median(ratio = 0.5, &block) ⇒ Object
The slow way is to iterate up to the middle point. A faster way is to use the index, when available. If a block is supplied, always iterate to the middle point.
# File 'lib/just_enumerable_stats.rb', line 142
def median(ratio=0.5, &block)
  return iterate_midway(ratio, &block) if block_given?
  begin
    mid1, mid2 = middle_two
    sorted = new_sort
    med1, med2 = sorted[mid1], sorted[mid2]
    return med1 if med1 == med2
    return med1 + ((med2 - med1) * ratio)
  rescue
    iterate_midway(ratio, &block)
  end
end
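The indexed path interpolates between the two middle values of the sorted list. The helper `middle_two` is not shown on this page, so the sketch below assumes it returns the indices `(size - 1) / 2` and `size / 2`; the function name `interpolated_median` is likewise hypothetical.

```ruby
# Hypothetical standalone sketch of the indexed median path.
def interpolated_median(values, ratio = 0.5)
  sorted = values.sort
  lo = (sorted.size - 1) / 2   # assumed lower middle index
  hi = sorted.size / 2         # assumed upper middle index
  a = sorted[lo]
  b = sorted[hi]
  return a if a == b
  a + ((b - a) * ratio)
end

p interpolated_median([3, 1, 2])
p interpolated_median([1, 2, 3, 4])
p interpolated_median([1, 2, 3, 4], 0.25)
```

With an odd number of values both indices coincide and the middle value is returned unchanged; with an even number the ratio decides how far to move from the lower middle value toward the upper one.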
#min(&block) ⇒ Object
Min of any number of items
# File 'lib/just_enumerable_stats.rb', line 61
def min(&block)
  self.inject do |best, e|
    val = block_sorter(best, e, &block)
    best = val < 0 ? best : e
  end
end
#min_index(&block) ⇒ Object
Returns the first index of the min value
# File 'lib/just_enumerable_stats.rb', line 69
def min_index(&block)
  self.index(min(&block))
end
#min_of_lists(*enums) ⇒ Object
Returns the min of two or more enumerables.
>> [1,2,3].min_of_lists(, [0,2,9])
=> [0, 2, 3]
# File 'lib/just_enumerable_stats.rb', line 591
def min_of_lists(*enums)
  yield_transpose(*enums) {|e| e.min}
end
#new_sort(&block) ⇒ Object
I don’t pass the block to the sort, because a sort block needs to look something like: {|x,y| x <=> y}. To get around this, set the default block on the object.
# File 'lib/just_enumerable_stats.rb', line 324
def new_sort(&block)
  if block_given?
    map { |i| yield(i) }.sort.dup
  elsif default_block
    map { |i| default_block[*i] }.sort.dup
  else
    sort().dup
  end
end
#order(&block) ⇒ Object
Given values like [10,5,5,1], rank should produce something like [4,2,2,1] and order should produce something like [4,2,3,1]. The trick is that rank skips as many as were duplicated, so there could not be a 3 in the rank from the example above.
# File 'lib/just_enumerable_stats.rb', line 354
def order(&block)
  hold = []
  rank(&block).each do |x|
    while hold.include?(x) do
      x += 1
    end
    hold << x
  end
  hold
end
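The difference between rank and order is easiest to see side by side. This standalone sketch mirrors the logic of the listings with hypothetical function names and no block support.

```ruby
# Hypothetical helpers, not part of just_enumerable_stats.
# Rank: 1-based position of the first occurrence in the sorted list, so ties share a rank.
def tied_rank(values)
  sorted = values.sort
  values.map { |v| sorted.index(v) + 1 }
end

# Order: like rank, but a value whose rank is already taken is bumped to the next free slot.
def unique_order(values)
  tied_rank(values).each_with_object([]) do |r, taken|
    r += 1 while taken.include?(r)
    taken << r
  end
end

p tied_rank([10, 5, 5, 1])
p unique_order([10, 5, 5, 1])
```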
#original_max ⇒ Object
# File 'lib/just_enumerable_stats.rb', line 32
alias :original_max :max
#original_min ⇒ Object
# File 'lib/just_enumerable_stats.rb', line 33
alias :original_min :min
#product ⇒ Object
Multiplies the values:
>> product(1,2,3)
=> 6.0
# File 'lib/just_enumerable_stats.rb', line 459
def product
  self.inject(one) {|sum, a| sum *= a}
end
#quantile(&block) ⇒ Object
First quartile: nth_split_by_m(1, 4). Third quartile: nth_split_by_m(3, 4). Median: nth_split_by_m(1, 2). Doesn’t match R, and it’s silly to try to.
def nth_split_by_m(n, m)
  sorted = new_sort
  dividers = m - 1
  if size % m == dividers # Divides evenly
    # Because we have a 0-based list, we get the floor
    i = ((size / m.to_f) * n).floor
    j = i
  else
    # This reflects R's approach, which I don't think I agree with.
    i = (((size / m.to_f) * n) - 1)
    i = i > (size / m.to_f) ? i.floor : i.ceil
    j = i + 1
  end
  sorted[i] + ((n / m.to_f) * (sorted[j] - sorted[i]))
end
# File 'lib/just_enumerable_stats.rb', line 384
def quantile(&block)
  [
    min(&block),
    first_half(&block).median(0.25, &block),
    median(&block),
    second_half(&block).median(0.75, &block),
    max(&block)
  ]
end
#rand_in_range(*args) ⇒ Object
Returns a random integer in the range for any number of lists. This is a way to get a random vector that is tenable based on the sample data. For example, given two sets of numbers:
a = [1,2,3]; b = [8,8,8]
rand_in_pair_range will return a value >= 1 and <= 8 in the first place, >= 2 and <= 8 in the second place, and >= 3 and <= 8 in the last place. Works for integers. Rethink this for floats. May consider setting up FixedRange for floats. O(n*5)
# File 'lib/just_enumerable_stats.rb', line 545
def rand_in_range(*args)
  min = self.min_of_lists(*args)
  max = self.max_of_lists(*args)
  (0...size).inject([]) do |ary, i|
    ary << rand_between(min[i], max[i])
  end
end
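The idea of a per-position random vector can be sketched with Ruby's built-in `rand` over an inclusive integer range. The `rand_between` helper used in the listing is not shown on this page, so treating it as inclusive on both ends is an assumption, and `random_vector_within` is a hypothetical name.

```ruby
# Hypothetical helper, not part of just_enumerable_stats.
# For each position, picks a random integer between the smallest and
# largest value found at that position across all lists (inclusive).
def random_vector_within(*lists)
  n = lists.map(&:size).min
  (0...n).map do |i|
    column = lists.map { |list| list[i] }
    rand(column.min..column.max)
  end
end

p random_vector_within([1, 2, 3], [8, 8, 8])
```

The output varies between runs, but the first element always falls in 1..8, the second in 2..8, and the third in 3..8.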
#range(&block) ⇒ Object
Just an array of [min, max] to comply with R uses of the word. Use range_as_range if you want a real Range.
# File 'lib/just_enumerable_stats.rb', line 239
def range(&block)
  [min(&block), max(&block)]
end
#range_as_range(&block) ⇒ Object Also known as: range_instance
Actually instantiates the range, instead of producing a min and max array.
# File 'lib/just_enumerable_stats.rb', line 312
def range_as_range(&block)
  if @range_class_args and not @range_class_args.empty?
    self.range_class.new(*@range_class_args)
  else
    self.range_class.new(min(&block), max(&block))
  end
end
#range_class ⇒ Object
When creating a range, what class will it be? Defaults to Range, but other classes are sometimes useful.
# File 'lib/just_enumerable_stats.rb', line 307
def range_class
  @range_class ||= Range
end
#rank(&block) ⇒ Object
Doesn’t overwrite things like Matrix#rank
# File 'lib/just_enumerable_stats.rb', line 335
def rank(&block)
  sorted = new_sort(&block)
  if block_given?
    map { |i| sorted.index(yield(i)) + 1 }
  elsif default_block
    map { |i| sorted.index(default_block[*i]) + 1 }
  else
    map { |i| sorted.index(i) + 1 }
  end
end
#set_range(hash) ⇒ Object
Takes a hash of arrays for categories. If Facets happens to be loaded on the computer, this keeps the order of the categories straight.
# File 'lib/just_enumerable_stats.rb', line 253
def set_range(hash)
  if defined?(Dictionary)
    @range_hash = Dictionary.new
    @range_hash.merge!(hash)
    @categories = @range_hash.keys
  else
    @categories = hash.keys
    @range_hash = hash
  end
  @categories
end
#set_range_class(klass, *args) ⇒ Object
Useful for setting a real range class (FixedRange).
# File 'lib/just_enumerable_stats.rb', line 244
def set_range_class(klass, *args)
  @range_class = klass
  @range_class_args = args
  self.range_class
end
#sigma_pairs(other, z = zero, &block) ⇒ Object
Sigma of pairs. Returns a single float, or whatever object is sent in. Example: [1,2,3].sigma_pairs(, 0) {|x, y| x + y} returns 21 instead of 21.0.
# File 'lib/just_enumerable_stats.rb', line 525
def sigma_pairs(other, z=zero, &block)
  self.to_pairs(other,&block).inject(z) {|sum, i| sum += i}
end
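A standalone sketch of the pairwise sum, and of the Euclidean distance built on top of it, may help here. The names `sum_of_pairs` and `euclidean` are hypothetical, the second list `[4, 5, 6]` is sample data chosen for this example, and the 0.0 default stands in for the library's `zero` helper, which is not shown on this page.

```ruby
# Hypothetical helpers, not part of just_enumerable_stats.
# Applies the block to each aligned pair and adds the results to start.
def sum_of_pairs(xs, ys, start = 0.0)
  n = [xs.size, ys.size].min
  (0...n).inject(start) { |sum, i| sum + yield(xs[i], ys[i]) }
end

# Euclidean distance as the square root of the summed squared differences.
def euclidean(xs, ys)
  Math.sqrt(sum_of_pairs(xs, ys) { |a, b| (a - b)**2 })
end

p sum_of_pairs([1, 2, 3], [4, 5, 6], 0) { |x, y| x + y }
p sum_of_pairs([1, 2, 3], [4, 5, 6]) { |x, y| x + y }
p euclidean([0, 0], [3, 4])
```

Passing an Integer start keeps the result an Integer, while the Float default yields a Float, which is the distinction the description draws between 21 and 21.0.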
#standard_deviation(&block) ⇒ Object Also known as: std
The standard deviation. Uses a block or default block.
# File 'lib/just_enumerable_stats.rb', line 134
def standard_deviation(&block)
  Math::sqrt(variance(&block))
end
#sum ⇒ Object
Adds up the list. Uses a block or default block if present.
# File 'lib/just_enumerable_stats.rb', line 100
def sum
  sum = zero
  if block_given?
    each{|i| sum += yield(i)}
  elsif default_block
    each{|i| sum += default_block[*i]}
  else
    each{|i| sum += i}
  end
  sum
end
#tanimoto_pairs(other) ⇒ Object Also known as: tanimoto_correlation
Finds the tanimoto coefficient: the intersection set size / union set size. This is used to find the distance between two vectors.
>> [1,2,3].cor()
=> 0.981980506061966
=> 0.5
# File 'lib/just_enumerable_stats.rb', line 476
def tanimoto_pairs(other)
  intersect(other).size / union(other).size.to_f
end
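The coefficient itself is a one-liner over Ruby's array operators. The sketch below uses a hypothetical `tanimoto` function and sample arrays chosen for this example; they are not the inputs behind the 0.5 shown above.

```ruby
# Hypothetical standalone version: shared elements divided by all distinct elements.
def tanimoto(xs, ys)
  (xs & ys).size / (xs | ys).size.to_f
end

p tanimoto([1, 2, 3], [2, 3, 4])
p tanimoto([1, 2, 3], [1, 2, 3])
```

Two of the four distinct values are shared in the first call, and identical lists score 1.0.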
#to_pairs(other, &block) ⇒ Object
There are going to be a lot more of these kinds of things, so pay attention.
# File 'lib/just_enumerable_stats.rb', line 465
def to_pairs(other, &block)
  n = [self.size, other.size].min
  (0...n).map {|i| block.call(self[i], other[i]) }
end
#union(other) ⇒ Object
All of the left and right hand sides, excluding duplicates. “The union of x and y”
# File 'lib/just_enumerable_stats.rb', line 487
def union(other)
  self | other
end
#variance(&block) ⇒ Object Also known as: var
The variance, uses a block or default block.
# File 'lib/just_enumerable_stats.rb', line 120
def variance(&block)
  m = mean(&block)
  sum_of_differences = if block_given?
    sum{ |i| j=yield(i); (m - j) ** 2 }
  elsif default_block
    sum{ |i| j=default_block[*i]; (m - j) ** 2 }
  else
    sum{ |i| (m - i) ** 2 }
  end
  sum_of_differences / (size - 1)
end
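Note that the listing divides by size - 1, which makes this the sample variance rather than the population variance. A standalone sketch with hypothetical function names and no block support:

```ruby
# Hypothetical helpers, not part of just_enumerable_stats.
# Sample variance: summed squared deviations from the mean, divided by n - 1.
def sample_variance(values)
  mean = values.sum / values.size.to_f
  values.sum { |v| (mean - v)**2 } / (values.size - 1)
end

def sample_std(values)
  Math.sqrt(sample_variance(values))
end

p sample_variance([1, 2, 3, 4])
p sample_std([1, 2, 3, 4])
```

For [1, 2, 3, 4] the mean is 2.5 and the squared deviations add up to 5.0, so the variance is 5.0 / 3.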
#yield_transpose(*enums, &block) ⇒ Object
Transposes arrays of arrays and yields a block on the value. The regular Array#transpose ignores blocks
# File 'lib/just_enumerable_stats.rb', line 574
def yield_transpose(*enums, &block)
  enums.unshift(self)
  n = enums.map{ |x| x.size}.min
  block ||= lambda{|e| e}
  (0...n).map { |i| block.call enums.map{ |x| x[i] } }
end
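A standalone sketch of the column-wise transpose shows how max_of_lists and min_of_lists fall out of it. The name `transpose_with` is hypothetical, and the middle list `[0, 5, 6]` is sample data chosen for this example.

```ruby
# Hypothetical helper, not part of just_enumerable_stats.
# Walks the lists column by column, up to the shortest list,
# and yields each column to the block (or returns it as-is).
def transpose_with(*lists)
  n = lists.map(&:size).min
  (0...n).map do |i|
    column = lists.map { |list| list[i] }
    block_given? ? yield(column) : column
  end
end

lists = [[1, 2, 3], [0, 5, 6], [0, 2, 9]]
p transpose_with(*lists)
p transpose_with(*lists) { |col| col.max }
p transpose_with(*lists) { |col| col.min }
```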