Class: Quantile::Estimator

Inherits:
Object
  • Object
show all
Defined in:
lib/quantile/estimator.rb

Overview

Note:

Estimator is not concurrency safe.

Estimate quantile values efficiently where both the rank and the inaccuracy allowance are known a priori. This is accomplished via Graham Cormode and S. Muthukrishnan’s Effective Computation of Biased Quantiles over Data Streams in ICDE’05.

Instance Method Summary collapse

Constructor Details

#initialize(*invariants) ⇒ Estimator

Create a streaming quantile estimator.

Parameters:

  • invariants (Quantile)

    The quantile estimation targets that are provided a priori.



34
35
36
37
38
39
40
41
42
43
44
45
# File 'lib/quantile/estimator.rb', line 34

def initialize(*invariants)
  if invariants.empty?
    invariants = [Quantile.new(0.5, 0.05), Quantile.new(0.90, 0.01), Quantile.new(0.99, 0.001)]
  end

  @invariants = invariants
  @buffer = []
  @head = nil

  @observations = 0
  @items = 0
end

Instance Method Details

#observe(value) ⇒ Object

Observe a sample value with this Quantile::Estimator.

Parameters:

  • value (Numeric)

    The value to catalog for later analysis.



52
53
54
55
56
57
# File 'lib/quantile/estimator.rb', line 52

def observe(value)
  @buffer << value
  if @buffer.size == BUFFER_SIZE
    flush
  end
end

#query(rank) ⇒ Numeric

Get a quantile value for a given rank.

Parameters:

  • rank (Float)

    The target quantile to retrieve. It must be one of the invariants provided in the constructor.

Returns:

  • (Numeric)

    The quantile value for the rank.



67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
# File 'lib/quantile/estimator.rb', line 67

def query(rank)
  flush

  current = @head
  if current.nil?
    return 0
  end

  mid_rank = (rank * @observations).floor
  max_rank = mid_rank + (invariant(mid_rank, @observations) / 2).floor

  rank = 0.0
  while !current.successor.nil?
    rank += current.rank
    if rank + current.successor.rank + current.successor.delta > max_rank
      return current.value
    end

    current = current.successor
  end

  return current.value
end