Class: AliasTable

Inherits:
Object
  • Object
show all
Includes:
Enumerable
Defined in:
lib/aliastable.rb

Overview

Generate values from a categorical distribution in constant time, regardless of the number of categories. This clever algorithm uses conditional probability to construct a table comprised of columns which have a primary value and an alias. Generating a value consists of picking any column (with equal probabilities), and then picking between the primary and the alias based on appropriate conditional probabilities.

Instance Method Summary collapse

Constructor Details

#initialize(x_set, p_value) ⇒ AliasTable

Construct an alias table from a set of values and their associated probabilities. Values and their probabilities must be synchronized, i.e., they must be arrays of the same length. Values can be anything, but the probabilities must be positive Rational numbers that sum to one.

Arguments
  • x_set -> the set of values from which to generate.

  • p_value -> the synchronized set of probabilities associated with the value set. These values should be Rationals to avoid rounding errors.

Raises
  • RuntimeError if x_set and p_values are different lengths.

  • RuntimeError if any p_value is negative.

  • RuntimeError if p_values don’t sum to one. Rationals will avoid this.



29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
# File 'lib/aliastable.rb', line 29

def initialize(x_set, p_value)
  fail 'x_set & p_value must have same length.' if x_set.size != p_value.size
  fail 'p_values must be positive' unless p_value.all? { |value| value > 0 }
  p_primary = p_value.map(&:rationalize)
  fail 'p_values must sum to 1' unless p_primary.reduce(:+) == Rational(1)
  x = x_set.clone.freeze
  len = x.length
  col_alias = Array.new(len)
  parity = Rational(1, len)
  group = p_primary.each_index.group_by { |i| p_primary[i] <=> parity }
  deficit_set = group[-1]
  surplus_set = group[1]
  if deficit_set.nil?
    @enum = Enumerator.new { |y| loop { y << x[rand(len)] } }.lazy
  else
    until deficit_set.empty?
      deficit = deficit_set.pop
      surplus = surplus_set.pop
      p_primary[surplus] -= parity - p_primary[deficit]
      p_primary[deficit] /= parity
      col_alias[deficit] = x[surplus]
      if p_primary[surplus] == parity
        p_primary[surplus] = Rational(1)
      else
        (p_primary[surplus] < parity ? deficit_set : surplus_set) << surplus
      end
    end
    @enum = Enumerator.new do |y|
      while true
        column = rand(len)
        y << ((rand <= p_primary[column]) ? x[column] : col_alias[column])
      end
    end.lazy
  end
end

Instance Method Details

#each(&block) ⇒ Object



65
66
67
# File 'lib/aliastable.rb', line 65

def each(&block)
  @enum.each(&block)
end

#nextObject Also known as: generate

Return a random outcome from this object’s distribution. The next (aka generate) method is O(1) time, but is not an inversion since two uniforms are used for each value that gets generated. The exception is that when all probabilities are equal, it is a true inversion.



75
76
77
# File 'lib/aliastable.rb', line 75

def next
  @enum.next
end