Class: Wukong::Processor::Sample

Inherits:
Filter
Defined in:
lib/wukong/widget/filters.rb

Overview

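A widget which passes a random sample of its input records. Each record is selected independently with probability `fraction`, so the output varies from run to run.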

Examples:

Sampling records on the command line


$ cat input
1
2
3
4
$ cat input | wu-local sample --fraction=0.5
1
3

Sampling records in a dataflow


Wukong.dataflow(:uses_sample) do
  ... | sample(fraction: 0.5) ...
end

Constant Summary

Constants inherited from Wukong::Processor

SerializerError

Instance Method Summary

Methods inherited from Filter

#process, #reject?

Methods inherited from Wukong::Processor

configure, consumes, description, #expected_record_type, #expected_serialization, #finalize, #perform_action, #process, produces, #receive_action, #setup, #stop, valid_serializer?, validate_and_set_serialization

Methods included from Logging

included

Methods included from Hanuman::StageClassMethods

#builder, #label, #register, #set_builder

Instance Method Details

#select?(record) ⇒ true, false

Selects a record randomly, with a probability given by the fraction for this widget.

Parameters:

  • record (Object)

Returns:

  • (true, false)


# File 'lib/wukong/widget/filters.rb', line 313

def select?(record)
  rand() < fraction
end
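
The selection rule above can be tried outside of Wukong with a minimal sketch. The class name `SampleSketch` and its injectable random number generator are assumptions made for illustration and are not part of the Wukong API; only the comparison of a random draw against `fraction` is taken from the source shown above.

```ruby
# Hypothetical stand-in for the widget, NOT part of Wukong.
# It reproduces only the selection rule shown in #select?.
class SampleSketch
  attr_reader :fraction

  # rng is injectable (an assumption of this sketch) so runs can be seeded.
  def initialize(fraction, rng = Random.new)
    @fraction = fraction
    @rng = rng
  end

  # True with probability `fraction`: Random#rand returns a Float
  # in [0, 1), so the comparison succeeds for that share of draws.
  def select?(_record)
    @rng.rand < fraction
  end
end

sampler = SampleSketch.new(0.5, Random.new(42))
kept = (1..10_000).select { |record| sampler.select?(record) }

# The count is close to, but rarely exactly, 0.5 * 10_000.
puts kept.size
```

Because each record is decided independently, `fraction` sets the expected share of records kept rather than an exact count. A fraction of 0.0 rejects every record and a fraction of 1.0 keeps every record.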