Module: DaruLite::DataFrame::Pivotable
- Included in:
- DaruLite::DataFrame
- Defined in:
- lib/daru_lite/data_frame/pivotable.rb
Instance Method Summary
- #pivot_table(opts = {}) ⇒ Object
Pivots a data frame on specified vectors and applies an aggregate function to quickly generate a summary.
Instance Method Details
#pivot_table(opts = {}) ⇒ Object
Pivots a data frame on specified vectors and applies an aggregate function to quickly generate a summary.
Options
:index - Keys to group by on the pivot table row index. Required; an ArgumentError is raised if it is missing or empty. Pass vector names contained in an Array.
:vectors - Keys to group by on the pivot table column index. Pass vector names contained in an Array.
:agg - Function used to aggregate the grouped values. Defaults to :mean. Any of the statistics functions applicable to Vectors, found in the DaruLite::Statistics::Vector module, can be used.
:values - Columns to aggregate. Optional. Defaults to all numeric columns not specified in :index or :vectors. An IndexError is raised if no numeric vectors remain to aggregate.
Usage
df = DaruLite::DataFrame.new({
  a: ['foo', 'foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
  b: ['one', 'one', 'one', 'two', 'two', 'one', 'one', 'two', 'two'],
  c: ['small', 'large', 'large', 'small', 'small', 'large', 'small', 'large', 'small'],
  d: [1, 2, 2, 3, 3, 4, 5, 6, 7],
  e: [2, 4, 4, 6, 6, 8, 10, 12, 14]
})
df.pivot_table(index: [:a], vectors: [:b], agg: :sum, values: :e)
#=>
# #<DaruLite::DataFrame:88342020 @name = 08cdaf4e-b154-4186-9084-e76dd191b2c9 @size = 2>
#          [:e, :one] [:e, :two]
#   [:bar]         18         26
#   [:foo]         10         12
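To make the numbers in the table above easier to verify, here is a minimal plain-Ruby sketch of the same grouping and summing, written without DaruLite. It is not how the library computes the result; it only reproduces the arithmetic for columns a, b and e from the usage example, and the names `pivot`, `row_key` and `col_key` are illustrative.

```ruby
# Columns a, b and e copied from the usage example.
a = %w[foo foo foo foo foo bar bar bar bar]
b = %w[one one one two two one one two two]
e = [2, 4, 4, 6, 6, 8, 10, 12, 14]

# Accumulate sums keyed first by the row label (a), then by the column label (b).
pivot = Hash.new { |hash, key| hash[key] = Hash.new(0) }
a.zip(b, e).each do |row_key, col_key, value|
  pivot[row_key][col_key] += value
end

# Sort by row label so the order matches the table above.
pivot = pivot.sort.to_h

pivot.each do |row_key, sums|
  puts "#{row_key} one=#{sums['one']} two=#{sums['two']}"
end
# prints:
# bar one=18 two=26
# foo one=10 two=12
```

Each cell is the sum of e over the rows sharing that (a, b) pair, for example 2 + 4 + 4 = 10 for foo/one.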
# File 'lib/daru_lite/data_frame/pivotable.rb', line 38

def pivot_table(opts = {})
  raise ArgumentError, 'Specify grouping index' if Array(opts[:index]).empty?

  index = opts[:index]
  vectors = opts[:vectors] || []
  aggregate_function = opts[:agg] || :mean
  values = prepare_pivot_values index, vectors, opts
  raise IndexError, 'No numeric vectors to aggregate' if values.empty?

  grouped = group_by(index)
  return grouped.send(aggregate_function) if vectors.empty?

  super_hash = make_pivot_hash grouped, vectors, values, aggregate_function

  pivot_dataframe super_hash
end
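The guard and defaults in the method source can be tried in isolation with a small plain-Ruby sketch. `resolve_pivot_options` is a hypothetical helper written for illustration, not part of DaruLite; it copies only the :index check and the :vectors and :agg defaults from the source, and leaves out the :values handling done by `prepare_pivot_values`.

```ruby
# Hypothetical helper (not part of DaruLite): mirrors the :index guard and the
# :vectors / :agg defaults visible in the pivot_table source.
def resolve_pivot_options(opts = {})
  raise ArgumentError, 'Specify grouping index' if Array(opts[:index]).empty?

  {
    index: opts[:index],
    vectors: opts[:vectors] || [],
    agg: opts[:agg] || :mean
  }
end

# Only :index given: :vectors falls back to [] and :agg to :mean.
resolved = resolve_pivot_options(index: [:a])

# Omitting :index trips the guard.
begin
  resolve_pivot_options(vectors: [:b])
rescue ArgumentError => error
  missing_index_message = error.message
end
```

In the source, an empty :vectors means the grouped data is aggregated directly with the chosen function, without building column groups.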