Class: Hetchy::Reservoir
- Inherits:
- Object
- Defined in:
- lib/hetchy/reservoir.rb
Instance Attribute Summary
-
#count ⇒ Object
readonly
number of samples processed.
-
#pool ⇒ Object
readonly
the samples currently retained in the reservoir.
-
#size ⇒ Object
readonly
maximum number of samples retained (defaults to 1000).
Instance Method Summary
-
#<<(values) ⇒ Object
Add one or more values to the reservoir.
-
#clear ⇒ Object
Empty/reset the reservoir.
-
#initialize(opts = {}) ⇒ Reservoir
constructor
Create a reservoir.
-
#percentile(perc) ⇒ Object
Calculate a percentile based on the current state of the reservoir.
-
#snapshot ⇒ Object
Capture a moment in time for the reservoir for analysis.
Constructor Details
#initialize(opts = {}) ⇒ Reservoir
Create a reservoir.
    # File 'lib/hetchy/reservoir.rb', line 11
    def initialize(opts={})
      @size = opts.fetch(:size, 1000)
      @lock = Mutex.new
      initialize_pool
    end
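The constructor reads its size with Hash#fetch, so the default of 1000 applies only when the :size key is absent. A minimal sketch of that lookup in plain Ruby, independent of Hetchy (the option hashes below are made up for illustration):

```ruby
# Hash#fetch falls back to the default only when the key is missing.
default_size = {}.fetch(:size, 1000)
custom_size  = { size: 50 }.fetch(:size, 1000)
nil_size     = { size: nil }.fetch(:size, 1000) # key present, so nil is returned

p default_size # 1000
p custom_size  # 50
p nil_size     # nil
```

Going by the signature above, Hetchy::Reservoir.new(size: 50) would keep at most 50 samples, while passing size: nil explicitly would bypass the default.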
Instance Attribute Details
#count ⇒ Object (readonly)
number of samples processed.
    # File 'lib/hetchy/reservoir.rb', line 4
    def count
      @count
    end
#pool ⇒ Object (readonly)
the samples currently retained in the reservoir.
    # File 'lib/hetchy/reservoir.rb', line 4
    def pool
      @pool
    end
#size ⇒ Object (readonly)
maximum number of samples retained (defaults to 1000).
    # File 'lib/hetchy/reservoir.rb', line 4
    def size
      @size
    end
Instance Method Details
#<<(values) ⇒ Object
Add one or more values to the reservoir.
    # File 'lib/hetchy/reservoir.rb', line 22
    def <<(values)
      Array(values).each do |value|
        @lock.synchronize do
          # sampling strategy is Vitter's algo R
          if count < size
            # pool not yet full: keep every value
            @pool[count] = value
          else
            # pool full: keep the new value with probability size / (count + 1)
            index = rand(count+1)
            if index < @size
              @pool[index] = value
            end
          end
          @count += 1
        end
      end
    end
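The sampling strategy named in the comment, Vitter's Algorithm R, can be sketched on its own, without the lock or the class around it. This is an illustrative stand-alone version, not the Hetchy source: the method name reservoir_sample, its parameters, and the seeded rng are assumptions made for the example.

```ruby
# Hypothetical stand-alone sketch of Algorithm R (not the Hetchy source).
# Returns up to `size` values drawn uniformly from `stream`.
def reservoir_sample(stream, size, rng: Random.new)
  pool = []
  count = 0
  stream.each do |value|
    if count < size
      # Fill phase: the first `size` values are always kept.
      pool[count] = value
    else
      # Replacement phase: draw a slot in 0..count. Only draws that land
      # inside the pool replace a value, so the new value is kept with
      # probability size / (count + 1).
      index = rng.rand(count + 1)
      pool[index] = value if index < size
    end
    count += 1
  end
  pool
end

sample = reservoir_sample(1..10_000, 100, rng: Random.new(42))
puts sample.size      # 100
puts sample.uniq.size # 100, because every input value is distinct
```

Memory stays bounded by size no matter how many values are seen, which is why the reservoir tracks count separately from the pool.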
#clear ⇒ Object
Empty/reset the reservoir.
    # File 'lib/hetchy/reservoir.rb', line 40
    def clear
      initialize_pool
    end
#percentile(perc) ⇒ Object
Calculate a percentile based on the current state of the reservoir.
If you need several percentiles, it is faster to call #snapshot once and compute them all from the returned Dataset, because each call to #percentile takes its own snapshot.
    # File 'lib/hetchy/reservoir.rb', line 50
    def percentile(perc)
      snapshot.percentile(perc)
    end
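The advice about computing several percentiles from one snapshot can be illustrated with a small stand-alone sketch. How Dataset computes a percentile is not shown on this page, so the example below assumes the nearest-rank definition and a fraction in (0, 1] for perc; both are assumptions, and nearest_rank_percentile is a hypothetical helper, not part of Hetchy.

```ruby
# Hypothetical helper: nearest-rank percentile over an already sorted array.
# Hetchy's Dataset may use a different definition (for example interpolation).
def nearest_rank_percentile(sorted, perc)
  raise ArgumentError, "perc must be in (0, 1]" unless perc > 0 && perc <= 1
  return nil if sorted.empty?

  rank = (perc * sorted.size).ceil # 1-based rank
  sorted[rank - 1]
end

samples = [40, 15, 50, 20, 35] # stand-in for the data held by one snapshot
sorted  = samples.sort         # sort once ...
p50, p90 = [0.5, 0.9].map { |perc| nearest_rank_percentile(sorted, perc) } # ... reuse

puts p50 # 35
puts p90 # 50
```

Copying and sorting the data once and reusing it is what makes the single-snapshot approach cheaper than repeated #percentile calls.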
#snapshot ⇒ Object
Capture the state of the reservoir at a moment in time for analysis. Because sampling may be ongoing, the pool is copied under the lock, so the returned Dataset reflects only the samples collected up to that point.
    # File 'lib/hetchy/reservoir.rb', line 58
    def snapshot
      data = nil
      @lock.synchronize { data = @pool.dup }
      Dataset.new(data.compact)
    end
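The copy-under-lock pattern used by #snapshot can be sketched in isolation. TinyPool below is a hypothetical stand-in, not the Hetchy class; it assumes the pool is preallocated with nil slots, which is one plausible reading of the compact call above but is not confirmed by this page.

```ruby
# Hypothetical stand-in (not the Hetchy class) showing copy-under-lock.
class TinyPool
  def initialize(size)
    @pool = Array.new(size) # assumption: unfilled slots start as nil
    @lock = Mutex.new
  end

  def store(index, value)
    @lock.synchronize { @pool[index] = value }
  end

  def snapshot
    data = @lock.synchronize { @pool.dup } # hold the lock only for the copy
    data.compact                           # drop unfilled (nil) slots
  end
end

pool = TinyPool.new(4)
pool.store(0, 10)
pool.store(1, 20)
snap = pool.snapshot
pool.store(0, 99) # later writes do not affect the earlier snapshot

p snap          # [10, 20]
p pool.snapshot # [99, 20]
```

Doing the slower analysis on the copy keeps the lock free for writers, at the cost of one array duplication per snapshot.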