Module: Enumerable

Included in:
FixedRange
Defined in:
lib/just_enumerable_stats.rb

Instance Attribute Details

#range_class_args ⇒ Object (readonly)

Returns the value of attribute range_class_args.



# File 'lib/just_enumerable_stats.rb', line 242

def range_class_args
  @range_class_args
end

Instance Method Details

#average(&block) ⇒ Object Also known as: mean, avg

The arithmetic mean, uses a block or default block.



# File 'lib/just_enumerable_stats.rb', line 107

def average(&block)
  sum(&block)/size
end
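
A minimal standalone sketch of the same idea on a plain Array, not the module's own code. The helper name `mean_of` is hypothetical, and the explicit `to_f` is an assumption added here so integer inputs do not truncate; the module itself relies on its own `sum`.

```ruby
# Hypothetical helper: arithmetic mean of a plain Array.
# An optional block transforms each value first, mirroring average(&block).
def mean_of(values, &block)
  mapped = block ? values.map(&block) : values
  # to_f guards against integer division (an assumption of this sketch).
  mapped.sum.to_f / mapped.size
end

mean_of([1, 2, 3, 4])             # => 2.5
mean_of([1, 2, 3]) { |e| e * 2 }  # => 4.0
```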

#cartesian_product(other, &block) ⇒ Object Also known as: cp, permutations

Finds the Cartesian product, excluding duplicate items and self-referential pairs. If a block is given, yields each pair to it and returns the block's values.



# File 'lib/just_enumerable_stats.rb', line 450

def cartesian_product(other, &block)
  x,y = self.uniq.dup, other.uniq.dup
  pairs = x.inject([]) do |cp, i|
    cp | y.map{|b| i == b ? nil : [i,b]}.compact
  end
  return pairs unless block_given?
  pairs.map{|p| yield p.first, p.last}
end
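
The pairing logic can be tried on plain Arrays. This is a sketch that follows the method body above; the name `cartesian_pairs` is hypothetical and is not part of the module.

```ruby
# Hypothetical helper: unique pairs [i, b] with i from xs and b from ys,
# skipping pairs where both members are equal.
def cartesian_pairs(xs, ys)
  xs.uniq.inject([]) do |pairs, i|
    pairs | ys.uniq.map { |b| i == b ? nil : [i, b] }.compact
  end
end

cartesian_pairs([1, 2], [2, 3])                         # => [[1, 2], [1, 3], [2, 3]]
cartesian_pairs([1, 2], [2, 3]).map { |a, b| a * b }    # => [2, 3, 6]
```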

#categories ⇒ Object

Takes the range_class and returns its map. Example:

require 'mathn'
a = [1,2,3]
a.set_range_class FixedRange, a.min, a.max, 1/4
a.categories # => [1, 5/4, 3/2, 7/4, 2, 9/4, 5/2, 11/4, 3]

For non-numeric values, returns a unique set, ordered if possible.



# File 'lib/just_enumerable_stats.rb', line 217

def categories
  if self.is_numeric?
    self.range_instance.map
  else
    self.uniq.sort rescue self.uniq
  end
end

#compliment(other) ⇒ Object

Everything on the left hand side except what’s shared on the right hand side. “The relative complement of y in x”



# File 'lib/just_enumerable_stats.rb', line 439

def compliment(other)
  self - other
end

#correlation(other) ⇒ Object Also known as: cor

Finds the Pearson correlation coefficient between two enumerables. Example:

[1,2,3].cor [2,3,5] # => 0.981980506061966



# File 'lib/just_enumerable_stats.rb', line 495

def correlation(other)
  n = [self.size, other.size].min
  sum_of_products_of_pairs = self.sigma_pairs(other) {|a, b| a * b}
  self_sum = self.sum
  other_sum = other.sum
  sum_of_squared_self_scores = self.sum { |e| e * e }
  sum_of_squared_other_scores = other.sum { |e| e * e }
  
  numerator = (n * sum_of_products_of_pairs) - (self_sum * other_sum)
  self_denominator = ((n * sum_of_squared_self_scores) - (self_sum ** 2))
  other_denominator = ((n * sum_of_squared_other_scores) - (other_sum ** 2))
  denominator = Math.sqrt(self_denominator * other_denominator)
  return numerator / denominator
end
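
The same sums-of-products form can be checked on plain Arrays. This is a sketch, not the module's code: `pearson` is a hypothetical name, and truncating both inputs to the shorter length is an assumption of this sketch (the method above sums each list in full).

```ruby
# Hypothetical helper: Pearson correlation of two Arrays,
# compared position by position up to the shorter length.
def pearson(xs, ys)
  n = [xs.size, ys.size].min
  xs, ys = xs.first(n), ys.first(n)

  sum_xy = xs.zip(ys).sum { |a, b| a * b }
  sum_x  = xs.sum
  sum_y  = ys.sum
  sum_x2 = xs.sum { |e| e * e }
  sum_y2 = ys.sum { |e| e * e }

  numerator   = n * sum_xy - sum_x * sum_y
  denominator = Math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
  numerator / denominator
end

pearson([1, 2, 3], [2, 3, 5])  # close to 0.98198, matching the documented example
```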

#cum_max(&block) ⇒ Object Also known as: cumulative_max

Example:

[1,2,3,0,5].cum_max # => [1,2,3,3,5]


# File 'lib/just_enumerable_stats.rb', line 377

def cum_max(&block)
  morph_list(&block).inject([]) do |list, e|
    found = (list | [e]).max
    list << (found ? found : e)
  end
end

#cum_min(&block) ⇒ Object Also known as: cumulative_min

Example:

[1,2,3,0,5].cum_min # => [1,1,1,0,0]


# File 'lib/just_enumerable_stats.rb', line 387

def cum_min(&block)
  morph_list(&block).inject([]) do |list, e|
    found = (list | [e]).min
    list << (found ? found : e)
  end
end
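
A standalone sketch of the running maximum and minimum on plain Arrays. The helper names are hypothetical; unlike cum_max and cum_min above, which rescan the accumulated list, this version only compares against the last running value.

```ruby
# Hypothetical helpers: running maximum and minimum of a plain Array.
# list.last is nil on the first pass, so compact drops it.
def running_max(values)
  values.inject([]) { |list, e| list << [list.last, e].compact.max }
end

def running_min(values)
  values.inject([]) { |list, e| list << [list.last, e].compact.min }
end

running_max([1, 2, 3, 0, 5])  # => [1, 2, 3, 3, 5]
running_min([1, 2, 3, 0, 5])  # => [1, 1, 1, 0, 0]
```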

#cum_prod(sorted = false, &block) ⇒ Object Also known as: cumulative_product

The cumulative product. Example:

[1,2,3].cum_prod # => [1.0, 2.0, 6.0]


# File 'lib/just_enumerable_stats.rb', line 350

def cum_prod(sorted=false, &block)
  prod = one
  obj = sorted ? self.new_sort : self
  if block_given?
    obj.map { |i| prod *= yield(i) }
  elsif default_block
    obj.map { |i| prod *= default_block[*i] }
  else
    obj.map { |i| prod *= i }
  end
end

#cum_sum(sorted = false, &block) ⇒ Object Also known as: cumulative_sum

The cumulative sum. Example:

[1,2,3].cum_sum # => [1, 3, 6]


# File 'lib/just_enumerable_stats.rb', line 335

def cum_sum(sorted=false, &block)
  sum = zero
  obj = sorted ? self.new_sort : self
  if block_given?
    obj.map { |i| sum += yield(i) }
  elsif default_block
    obj.map { |i| sum += default_block[*i] }
  else
    obj.map { |i| sum += i }
  end
end
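
A sketch of the running-total pattern behind cum_sum and cum_prod, on plain Arrays. The helper names are hypothetical. The seeds here are the Integers 0 and 1, which is an assumption: the module uses its own `zero` and `one` helpers, which are not shown on this page and may be floats.

```ruby
# Hypothetical helpers: each output element is the total so far.
def running_sum(values)
  total = 0
  values.map { |e| total += e }
end

def running_product(values)
  total = 1
  values.map { |e| total *= e }
end

running_sum([1, 2, 3])      # => [1, 3, 6]
running_product([1, 2, 3])  # => [1, 2, 6]
```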

#default_block ⇒ Object

The block applied to each value in the object before a statistic is computed.



# File 'lib/just_enumerable_stats.rb', line 68

def default_block
  @default_stat_block 
end

#default_block=(block) ⇒ Object

Allows me to set up a block for a series of operations. Example:

a = [1,2,3]
a.sum # => 6.0
a.default_block = lambda{|e| 1 / e}
a.sum # => 1.0



# File 'lib/just_enumerable_stats.rb', line 77

def default_block=(block)
  @default_stat_block = block
end
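
The default-block pattern can be sketched in isolation. Everything here is hypothetical (the `StatList` class, its float seed, the precedence order written out explicitly); it only illustrates the idea that an explicit block wins, then the stored block, then the raw value.

```ruby
# Hypothetical wrapper class illustrating a stored default block.
class StatList
  attr_accessor :default_block

  def initialize(values)
    @values = values
  end

  # Explicit block first, then the stored default block, then identity.
  def sum(&block)
    transform = block || default_block || ->(e) { e }
    @values.inject(0.0) { |total, e| total + transform.call(e) }
  end
end

a = StatList.new([1, 2, 4])
a.sum                                 # => 7.0
a.default_block = ->(e) { 1.0 / e }
a.sum                                 # => 1.75
a.sum { |e| e * 2 }                   # => 14.0
```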

#euclidian_distance(other) ⇒ Object

Returns the Euclidean distance between this enumerable and another, pairing elements by position.



# File 'lib/just_enumerable_stats.rb', line 469

def euclidian_distance(other)
  Math.sqrt(self.sigma_pairs(other) {|a, b| (a - b) ** 2})
end
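
A standalone sketch of the same distance on plain Arrays. The name `euclidean_distance` is hypothetical, and this version assumes both Arrays have the same length (the method above stops at the shorter one).

```ruby
# Hypothetical helper: square root of the summed squared differences.
# Assumes xs and ys are the same length.
def euclidean_distance(xs, ys)
  Math.sqrt(xs.zip(ys).sum { |a, b| (a - b)**2 })
end

euclidean_distance([0, 0], [3, 4])  # => 5.0
```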

#exclusive_not(other) ⇒ Object

Everything but what’s shared. “The symmetric difference of x and y”



# File 'lib/just_enumerable_stats.rb', line 444

def exclusive_not(other)
  (self | other) - (self & other)
end
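
The set-style methods in this module (compliment, exclusive_not, intersect, union) are thin wrappers over core Array operators, so their results can be previewed directly. The sample values are chosen for this sketch.

```ruby
x = [1, 2, 3]
y = [2, 3, 4]

complement = x - y              # => [1]           like x.compliment(y)
shared     = x & y              # => [2, 3]        like x.intersect(y)
combined   = x | y              # => [1, 2, 3, 4]  like x.union(y)
exclusive  = (x | y) - (x & y)  # => [1, 4]        like x.exclusive_not(y)
```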

#intersect(other) ⇒ Object

What’s shared on the left and right hand sides. “The intersection of x and y”



# File 'lib/just_enumerable_stats.rb', line 432

def intersect(other)
  self & other
end

#is_numeric? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/just_enumerable_stats.rb', line 225

def is_numeric?
  self.all? {|e| e.is_a?(Numeric)}
end

#max(&block) ⇒ Object

Returns the max, using an optional block.



# File 'lib/just_enumerable_stats.rb', line 42

def max(&block)
  self.inject do |best, e|
    val = block_sorter(best, e, &block)
    best = val > 0 ? best : e
  end
end

#max_index(&block) ⇒ Object

Returns the first index of the max value



# File 'lib/just_enumerable_stats.rb', line 50

def max_index(&block)
  self.index(max(&block))
end

#max_of_lists(*enums) ⇒ Object

Returns the element-wise max of two or more enumerables.

>> [1,2,3].max_of_lists(, [0,2,9])
=> [1, 5, 9]



# File 'lib/just_enumerable_stats.rb', line 523

def max_of_lists(*enums)
  yield_transpose(*enums) {|e| e.max}
end

#median(ratio = 0.5, &block) ⇒ Object

The slow way is to iterate up to the middle point. A faster way is to use the index, when available. If a block is supplied, always iterate to the middle point.



# File 'lib/just_enumerable_stats.rb', line 136

def median(ratio=0.5, &block)
  return iterate_midway(ratio, &block) if block_given?
  begin
    mid1, mid2 = middle_two
    sorted = new_sort
    med1, med2 = sorted[mid1], sorted[mid2]
    return med1 if med1 == med2
    return med1 + ((med2 - med1) * ratio)
  rescue
    iterate_midway(ratio, &block)
  end
end
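
The index-based branch can be sketched on a plain Array. The helpers `middle_two` and `iterate_midway` are not shown on this page, so the index choice below is an assumption: the two middle positions of the sorted list, which coincide when the size is odd. The name `interpolated_median` is hypothetical.

```ruby
# Hypothetical helper: median by index, interpolating between the two
# middle values by `ratio` when they differ.
def interpolated_median(values, ratio = 0.5)
  sorted = values.sort
  lower  = (sorted.size - 1) / 2   # same as upper when size is odd
  upper  = sorted.size / 2
  low, high = sorted[lower], sorted[upper]
  return low if low == high
  low + (high - low) * ratio
end

interpolated_median([3, 1, 2])           # => 2
interpolated_median([1, 2, 3, 4])        # => 2.5
interpolated_median([1, 2, 3, 4], 0.25)  # => 2.25
```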

#min(&block) ⇒ Object

Returns the min, using an optional block.



# File 'lib/just_enumerable_stats.rb', line 55

def min(&block)
  self.inject do |best, e|
    val = block_sorter(best, e, &block)
    best = val < 0 ? best : e
  end
end

#min_index(&block) ⇒ Object

Returns the first index of the min value



# File 'lib/just_enumerable_stats.rb', line 63

def min_index(&block)
  self.index(min(&block))
end

#min_of_lists(*enums) ⇒ Object

Returns the element-wise min of two or more enumerables.

>> [1,2,3].min_of_lists(, [0,2,9])
=> [0, 2, 3]



# File 'lib/just_enumerable_stats.rb', line 530

def min_of_lists(*enums)
  yield_transpose(*enums) {|e| e.min}
end

#new_sort(&block) ⇒ Object

I don’t pass the block to the sort, because a sort block needs to look something like: {|x,y| x <=> y}. To get around this, set the default block on the object.



# File 'lib/just_enumerable_stats.rb', line 263

def new_sort(&block)
  if block_given?
    map { |i| yield(i) }.sort.dup
  elsif default_block
    map { |i| default_block[*i] }.sort.dup
  else
    sort().dup
  end
end

#order(&block) ⇒ Object

Given values like [10,5,5,1], rank should produce something like [4,2,2,1], and order should produce something like [4,2,3,1]. The trick is that rank skips as many as were duplicated, so there could not be a 3 in the rank from the example above.



# File 'lib/just_enumerable_stats.rb', line 293

def order(&block)
  hold = []
  rank(&block).each do |x|
    while hold.include?(x) do
      x += 1
    end
    hold << x
  end
  hold
end
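
The difference between rank and order can be reproduced on a plain Array. This sketch follows the no-block branches of both methods; the helper names are hypothetical.

```ruby
# Hypothetical helper: 1-based position of each value in the sorted list.
# Ties share the position of their first occurrence.
def rank_of(values)
  sorted = values.sort
  values.map { |v| sorted.index(v) + 1 }
end

# Hypothetical helper: like rank_of, but ties are bumped to the next
# unused position so every position appears once.
def order_of(values)
  taken = []
  rank_of(values).each do |r|
    r += 1 while taken.include?(r)
    taken << r
  end
  taken
end

rank_of([10, 5, 5, 1])   # => [4, 2, 2, 1]
order_of([10, 5, 5, 1])  # => [4, 2, 3, 1]
```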

#original_max ⇒ Object



# File 'lib/just_enumerable_stats.rb', line 26

alias :original_max :max

#original_min ⇒ Object



# File 'lib/just_enumerable_stats.rb', line 27

alias :original_min :min

#product ⇒ Object

Multiplies the values:

>> [1,2,3].product
=> 6.0



# File 'lib/just_enumerable_stats.rb', line 398

def product
  self.inject(one) {|sum, a| sum *= a}
end

#quantile(&block) ⇒ Object

First quartile: nth_split_by_m(1, 4)
Third quartile: nth_split_by_m(3, 4)
Median: nth_split_by_m(1, 2)
Doesn’t match R, and it’s silly to try to.

def nth_split_by_m(n, m)
  sorted  = new_sort
  dividers = m - 1
  if size % m == dividers # Divides evenly
    # Because we have a 0-based list, we get the floor
    i = ((size / m.to_f) * n).floor
    j = i
  else
    # This reflects R's approach, which I don't think I agree with.
    i = (((size / m.to_f) * n) - 1)
    i = i > (size / m.to_f) ? i.floor : i.ceil
    j = i + 1
  end
  sorted[i] + ((n / m.to_f) * (sorted[j] - sorted[i]))
end



# File 'lib/just_enumerable_stats.rb', line 323

def quantile(&block)
  [
    min(&block), 
    first_half(&block).median(0.25, &block), 
    median(&block), 
    second_half(&block).median(0.75, &block), 
    max(&block)
  ]
end

#rand_in_range(*args) ⇒ Object

Returns a random integer in the range for any number of lists. This is a way to get a random vector that is tenable based on the sample data. For example, given two sets of numbers:

a = [1,2,3]; b = [8,8,8]

rand_in_range will return a value >= 1 and <= 8 in the first place, >= 2 and <= 8 in the second place, and >= 3 and <= 8 in the last place. Works for integers. Rethink this for floats. May consider setting up FixedRange for floats. O(n*5)



# File 'lib/just_enumerable_stats.rb', line 484

def rand_in_range(*args)
  min = self.min_of_lists(*args)
  max = self.max_of_lists(*args)
  (0...size).inject([]) do |ary, i|
    ary << rand_between(min[i], max[i])
  end
end
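
A sketch of the per-position idea on plain Arrays of integers. The `rand_between` helper used above is not shown on this page, so this version substitutes Ruby's `Kernel#rand` with an inclusive Range; the name `random_vector_within` is hypothetical.

```ruby
# Hypothetical helper: for each position, pick a random integer between
# the smallest and largest value found at that position across all lists.
def random_vector_within(*lists)
  n = lists.map(&:size).min
  (0...n).map do |i|
    column = lists.map { |list| list[i] }
    rand(column.min..column.max)  # inclusive on both ends
  end
end

a = [1, 2, 3]
b = [8, 8, 8]
random_vector_within(a, b)  # three integers, e.g. within 1..8, 2..8, and 3..8
```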

#range(&block) ⇒ Object

Just an array of [min, max] to comply with R's use of the word. Use range_as_range if you want a real Range.



# File 'lib/just_enumerable_stats.rb', line 231

def range(&block)
  [min(&block), max(&block)]
end

#range_as_range(&block) ⇒ Object Also known as: range_instance

Actually instantiates the range, instead of producing a min and max array.



# File 'lib/just_enumerable_stats.rb', line 251

def range_as_range(&block)
  if @range_class_args and not @range_class_args.empty?
    self.range_class.new(*@range_class_args)
  else
    self.range_class.new(min(&block), max(&block))
  end
end
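
The two shapes can be compared with core Ruby alone. This sketch assumes the default range class (Range) and no stored range_class_args; the variable names are chosen for illustration.

```ruby
values = [3, 1, 2]

min_max  = [values.min, values.max]           # => [1, 3], the shape range returns
as_range = Range.new(values.min, values.max)  # => 1..3, the shape range_as_range returns
members  = as_range.to_a                      # => [1, 2, 3]
```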

#range_class ⇒ Object

When creating a range, what class will it be? Defaults to Range, but other classes are sometimes useful.



# File 'lib/just_enumerable_stats.rb', line 246

def range_class
  @range_class ||= Range
end

#rank(&block) ⇒ Object

Doesn’t overwrite things like Matrix#rank



# File 'lib/just_enumerable_stats.rb', line 274

def rank(&block)

  sorted = new_sort(&block)

  if block_given?
    map { |i| sorted.index(yield(i)) + 1 }
  elsif default_block
    map { |i| sorted.index(default_block[*i]) + 1 }
  else
    map { |i| sorted.index(i) + 1 }
  end

end

#set_range_class(klass, *args) ⇒ Object

Useful for setting a real range class (FixedRange).



# File 'lib/just_enumerable_stats.rb', line 236

def set_range_class(klass, *args)
  @range_class = klass
  @range_class_args = args
  self.range_class
end

#sigma_pairs(other, z = zero, &block) ⇒ Object

Sigma of pairs. Returns a single float, or whatever object is sent in. Example:

[1,2,3].sigma_pairs(, 0) {|x, y| x + y} # => 21 instead of 21.0



# File 'lib/just_enumerable_stats.rb', line 464

def sigma_pairs(other, z=zero, &block)
  self.to_pairs(other,&block).inject(z) {|sum, i| sum += i}
end
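
A standalone sketch of pairing by position and then summing, on plain Arrays. The helper names and the sample lists are hypothetical, and the 0.0 default seed is an assumption based on the description above (the module's `zero` helper is not shown on this page).

```ruby
# Hypothetical helper: yields xs[i], ys[i] for each shared index.
def pair_values(xs, ys)
  n = [xs.size, ys.size].min
  (0...n).map { |i| yield xs[i], ys[i] }
end

# Hypothetical helper: sums the paired values, starting from `start`.
# The type of `start` decides the type of the result.
def sigma_of_pairs(xs, ys, start = 0.0, &block)
  pair_values(xs, ys, &block).inject(start) { |sum, v| sum + v }
end

sigma_of_pairs([1, 2, 3], [4, 5, 6]) { |x, y| x + y }     # => 21.0
sigma_of_pairs([1, 2, 3], [4, 5, 6], 0) { |x, y| x + y }  # => 21
```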

#standard_deviation(&block) ⇒ Object Also known as: std

The standard deviation. Uses a block or default block.



# File 'lib/just_enumerable_stats.rb', line 128

def standard_deviation(&block)
  Math::sqrt(variance(&block))
end

#sum ⇒ Object

Adds up the list. Uses a block or default block if present.



# File 'lib/just_enumerable_stats.rb', line 94

def sum
  sum = zero
  if block_given?
    each{|i| sum += yield(i)}
  elsif default_block
    each{|i| sum += default_block[*i]}
  else
    each{|i| sum += i}
  end
  sum
end

#tanimoto_pairs(other) ⇒ Object Also known as: tanimoto_correlation

Finds the Tanimoto coefficient: the intersection set size / union set size. This is used to find the distance between two vectors.

>> [1,2,3].cor()
=> 0.981980506061966
>> [1,2,3].tanimoto_pairs()
=> 0.5



# File 'lib/just_enumerable_stats.rb', line 415

def tanimoto_pairs(other)
  intersect(other).size / union(other).size.to_f
end
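
The coefficient can be computed with core Array operators. A minimal sketch; the name `tanimoto` and the sample lists are hypothetical.

```ruby
# Hypothetical helper: shared items divided by all distinct items.
def tanimoto(xs, ys)
  (xs & ys).size / (xs | ys).size.to_f
end

tanimoto([1, 2, 3], [2, 3, 4])  # => 0.5  (2 shared of 4 distinct)
tanimoto([1, 2], [1, 2])        # => 1.0
tanimoto([1, 2], [3, 4])        # => 0.0
```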

#to_pairs(other, &block) ⇒ Object

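Pairs up the elements of self and other by index, up to the shorter length, and returns the block's value for each pair. There are going to be a lot more of these kinds of things, so pay attention.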



# File 'lib/just_enumerable_stats.rb', line 404

def to_pairs(other, &block)
  n = [self.size, other.size].min
  (0...n).map {|i| block.call(self[i], other[i]) }
end

#union(other) ⇒ Object

All of the left and right hand sides, excluding duplicates. “The union of x and y”



# File 'lib/just_enumerable_stats.rb', line 426

def union(other)
  self | other
end

#variance(&block) ⇒ Object Also known as: var

The sample variance (divides by size - 1). Uses a block or default block.



# File 'lib/just_enumerable_stats.rb', line 114

def variance(&block)
  m = mean(&block)
  sum_of_differences = if block_given?
    sum{ |i| j=yield(i); (m - j) ** 2 }
  elsif default_block
    sum{ |i| j=default_block[*i]; (m - j) ** 2 }
  else
    sum{ |i| (m - i) ** 2 }
  end
  sum_of_differences / (size - 1)
end
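
A standalone sketch of the same sample statistics on a plain Array, without block support. The helper names are hypothetical; the explicit `to_f` is an assumption added so integer inputs produce a float mean.

```ruby
# Hypothetical helper: sum of squared differences from the mean,
# divided by (size - 1).
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (mean - v)**2 } / (values.size - 1)
end

# Hypothetical helper: square root of the sample variance.
def sample_std(values)
  Math.sqrt(sample_variance(values))
end

sample_variance([1, 2, 3, 4])  # 5.0 / 3, about 1.667
sample_std([1, 2, 3, 4])       # about 1.291
```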

#yield_transpose(*enums, &block) ⇒ Object

Transposes arrays of arrays and yields each transposed row to the block. The regular Array#transpose ignores blocks.



# File 'lib/just_enumerable_stats.rb', line 513

def yield_transpose(*enums, &block)
  enums.unshift(self)
  n = enums.map{ |x| x.size}.min
  block ||= lambda{|e| e}
  (0...n).map { |i| block.call enums.map{ |x| x[i] } }
end
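
A standalone sketch of transposing with a block, which is also the mechanism behind max_of_lists and min_of_lists. The helper name `transpose_with` and the sample lists are hypothetical.

```ruby
# Hypothetical helper: collects the i-th item of every list into a column,
# then yields the column (or returns it unchanged without a block).
def transpose_with(*lists)
  n = lists.map(&:size).min
  (0...n).map do |i|
    column = lists.map { |list| list[i] }
    block_given? ? yield(column) : column
  end
end

lists = [[1, 2, 3], [0, 5, 1], [0, 2, 9]]  # sample data chosen for this sketch
transpose_with(*lists)                 # => [[1, 0, 0], [2, 5, 2], [3, 1, 9]]
transpose_with(*lists) { |c| c.max }   # => [1, 5, 9]
transpose_with(*lists) { |c| c.min }   # => [0, 2, 1]
```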