# Module: Daru::Maths::Statistics::DataFrame

Included in:
DataFrame
Defined in:
lib/daru/maths/statistics/dataframe.rb

## Instance Method Summary

• #acf(max_lags) ⇒ Object

Calculate the autocorrelation coefficient.

• #correlation ⇒ Object (also: #corr)

Calculate the correlation between the numeric vectors.

• #count ⇒ Object

Count the number of non-nil values in each vector.

• #covariance ⇒ Object (also: #cov)

Calculate sample variance-covariance between the numeric vectors.

• #cumsum ⇒ Object

Calculate cumulative sum of each numeric Vector.

• #describe(methods = nil) ⇒ Object

Create a summary of mean, standard deviation, count, max and min of each numeric vector in the dataframe in one shot.

• #ema(n, wilder) ⇒ Object

Calculate exponential moving average.

• #max(opts = {}) ⇒ Object

Calculate the maximum value of each numeric vector.

• #mean ⇒ Object

Calculate mean of numeric vectors.

• #median ⇒ Object

Calculate median of numeric vectors.

• #min ⇒ Object

Calculate the minimum value of each numeric vector.

• #mode ⇒ Object

Calculate mode of numeric vectors.

• #percent_change(periods = 1) ⇒ Object

The percent_change method computes the percent change over the given number of periods for numeric vectors.

• #product ⇒ Object

Compute the product of each numeric vector.

• #range ⇒ Object

Calculate range of numeric vectors.

• #rolling_count(n) ⇒ Object

Calculate moving non-missing count.

• #rolling_max(n) ⇒ Object

Calculate moving max.

• #rolling_mean(n) ⇒ Object

Calculate moving averages.

• #rolling_median(n) ⇒ Object

Calculate moving median.

• #rolling_min(n) ⇒ Object

Calculate moving min.

• #rolling_std(n) ⇒ Object

Calculate moving standard deviation.

• #rolling_variance(n) ⇒ Object

Calculate moving variance.

• #standardize ⇒ Object

Standardize each Vector.

• #std ⇒ Object (also: #sds)

Calculate sample standard deviation of numeric vectors.

• #sum ⇒ Object

Calculate sum of numeric vectors.

• #variance_sample ⇒ Object (also: #variance)

Calculate sample variance of numeric vectors.

## Instance Method Details

### #acf(max_lags) ⇒ Object

Calculate the autocorrelation coefficient.

Parameters:

• max_lags (Integer)

Number of initial lags. Defaults to nil.

```ruby
# File 'lib/daru/maths/statistics/dataframe.rb', line 73

%i[
  cumsum standardize acf ema rolling_mean rolling_median rolling_max
  rolling_min rolling_count rolling_std rolling_variance rolling_sum
].each do |meth|
  define_method(meth) do |*args|
    apply_method_to_numerics meth, *args
  end
end
```
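The listing above generates one method per symbol with `define_method` and forwards each call, arguments included, to `apply_method_to_numerics`. The delegation pattern can be sketched in plain Ruby without daru. `ToyVector`, `ToyFrame`, and the body given to `apply_method_to_numerics` here are hypothetical stand-ins for illustration, not the library's implementation.

```ruby
# Hypothetical stand-in for a vector; only what the demo needs.
class ToyVector
  attr_reader :values

  def initialize(values)
    @values = values
  end

  def numeric?
    values.all? { |v| v.is_a?(Numeric) }
  end

  # Running total of the values.
  def cumsum
    total = 0
    ToyVector.new(values.map { |v| total += v })
  end

  # Sum over a trailing window of n values; nil until the window is full.
  def rolling_sum(n = 10)
    sums = values.each_index.map do |i|
      i + 1 < n ? nil : values[(i - n + 1)..i].sum
    end
    ToyVector.new(sums)
  end
end

# Hypothetical stand-in for a data frame: a hash of name => ToyVector.
class ToyFrame
  def initialize(columns)
    @columns = columns.transform_values { |v| ToyVector.new(v) }
  end

  # One frame-level method per symbol, each forwarding to the vectors.
  %i[cumsum rolling_sum].each do |meth|
    define_method(meth) do |*args|
      apply_method_to_numerics(meth, *args)
    end
  end

  private

  # Assumed behavior: call the method on numeric columns, skip the rest.
  def apply_method_to_numerics(meth, *args)
    numeric = @columns.select { |_, vec| vec.numeric? }
    numeric.transform_values { |vec| vec.public_send(meth, *args).values }
  end
end

frame = ToyFrame.new({ 'a' => [1, 2, 3], 'b' => %w[x y z] })
cumulative = frame.cumsum          # 'a' becomes [1, 3, 6]; 'b' is skipped
windowed   = frame.rolling_sum(2)  # 'a' becomes [nil, 3, 5]
```

Because the wrapper accepts `*args`, the same loop serves both zero-argument methods such as `cumsum` and parameterized ones such as the rolling family.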

### #correlation ⇒ Object (also: #corr)

Calculate the correlation between the numeric vectors.

```ruby
# File 'lib/daru/maths/statistics/dataframe.rb', line 154

def correlation
  standard_deviation = std.to_matrix
  corr_arry = cov
              .to_matrix
              .elementwise_division(standard_deviation.transpose *
              standard_deviation).to_a

  Daru::DataFrame.rows(corr_arry, index: numeric_vectors, order: numeric_vectors)
end
```
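The source above divides the covariance matrix element by element by the outer product of the standard deviations, so entry (i, j) is cov(i, j) / (std(i) * std(j)). A minimal plain-Ruby sketch of that arithmetic follows. The helper names `sample_cov` and `correlation_matrix` are hypothetical, and columns are assumed to be equal-length arrays without nil values.

```ruby
# Sample covariance of two equal-length arrays (n - 1 denominator).
def sample_cov(xs, ys)
  mean_x = xs.sum.to_f / xs.size
  mean_y = ys.sum.to_f / ys.size
  xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) } / (xs.size - 1)
end

# Correlation matrix as nested arrays: cov[i][j] / (std[i] * std[j]).
def correlation_matrix(columns)
  names = columns.keys
  stds = names.map { |n| Math.sqrt(sample_cov(columns[n], columns[n])) }
  names.each_index.map do |i|
    names.each_index.map do |j|
      sample_cov(columns[names[i]], columns[names[j]]) / (stds[i] * stds[j])
    end
  end
end

data = { 'a' => [1, 2, 3, 4], 'b' => [2, 4, 6, 8], 'c' => [4, 3, 2, 1] }
corr = correlation_matrix(data)
# 'b' is a positive multiple of 'a', so corr[0][1] is approximately 1;
# 'c' decreases as 'a' increases, so corr[0][2] is approximately -1.
```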

### #count ⇒ Object

Count the number of non-nil values in each vector.

### #covariance ⇒ Object (also: #cov)

Calculate sample variance-covariance between the numeric vectors.

```ruby
# File 'lib/daru/maths/statistics/dataframe.rb', line 134

def covariance
  cache = Hash.new do |h, (col, row)|
    value = vector_cov(self[row],self[col])
    h[[col, row]] = value
    h[[row, col]] = value
  end
  vectors = numeric_vectors

  mat_rows = vectors.collect do |row|
    vectors.collect do |col|
      row == col ? self[row].variance : cache[[col,row]]
    end
  end

  Daru::DataFrame.rows(mat_rows, index: numeric_vectors, order: numeric_vectors)
end
```
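The source above relies on a `Hash.new` default block that stores each computed covariance under both key orders, so every off-diagonal pair is computed only once. The sketch below reproduces that caching idea in plain Ruby. `sample_cov` and `covariance_matrix` are hypothetical names, the call counter exists only to make the saving visible, and columns are assumed to be equal-length arrays without nil values.

```ruby
# Sample covariance of two equal-length arrays (n - 1 denominator).
def sample_cov(xs, ys)
  mean_x = xs.sum.to_f / xs.size
  mean_y = ys.sum.to_f / ys.size
  xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) } / (xs.size - 1)
end

# Returns [matrix_rows, number_of_off_diagonal_computations].
def covariance_matrix(columns)
  computations = 0
  # On a cache miss, compute once and store under both key orders.
  cache = Hash.new do |h, (a, b)|
    computations += 1
    value = sample_cov(columns[a], columns[b])
    h[[a, b]] = value
    h[[b, a]] = value
  end

  names = columns.keys
  rows = names.map do |row|
    names.map do |col|
      # The diagonal is the plain variance and bypasses the cache.
      row == col ? sample_cov(columns[row], columns[row]) : cache[[col, row]]
    end
  end
  [rows, computations]
end

data = { 'a' => [1, 2, 3, 4], 'b' => [2, 4, 6, 8], 'c' => [4, 3, 2, 1] }
cov_rows, computations = covariance_matrix(data)
# Three columns have three distinct off-diagonal pairs, so the block runs
# three times even though six off-diagonal cells are filled.
```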

### #cumsum ⇒ Object

Calculate the cumulative sum of each numeric vector.

### #describe(methods = nil) ⇒ Object

Create a summary of mean, standard deviation, count, max and min of each numeric vector in the dataframe in one shot.

Parameters:

• methods (Array) (defaults to: nil)

An array of aggregation methods, given as symbols, to apply to the numeric vectors. If nil, [:count, :mean, :std, :min, :max] is used. Methods are applied in the order given.

```ruby
# File 'lib/daru/maths/statistics/dataframe.rb', line 90

def describe methods=nil
  methods ||= %i[count mean std min max]

  description_hash = {}
  numeric_vectors.each do |vec|
    description_hash[vec] = methods.map { |m| self[vec].send(m) }
  end
  Daru::DataFrame.new(description_hash, index: methods)
end
```
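The source above sends each requested method name to every numeric vector and collects the results in the order the names were given. A plain-Ruby sketch of that dispatch follows. `ToyColumn` and `toy_describe` are hypothetical names, and dropping nil values before aggregating is an assumption of this sketch.

```ruby
# Hypothetical stand-in for a numeric vector with a few aggregations.
class ToyColumn
  def initialize(values)
    @values = values.compact # assumption: nil entries are ignored
  end

  def count
    @values.size
  end

  def mean
    @values.sum.to_f / count
  end

  # Sample standard deviation (n - 1 denominator).
  def std
    m = mean
    Math.sqrt(@values.sum { |v| (v - m)**2 } / (count - 1))
  end

  def min
    @values.min
  end

  def max
    @values.max
  end
end

# Apply each method name, in order, to every column.
def toy_describe(columns, methods = nil)
  methods ||= %i[count mean std min max]
  columns.transform_values do |values|
    column = ToyColumn.new(values)
    methods.map { |m| column.public_send(m) }
  end
end

default_summary = toy_describe({ 'a' => [1, 2, 3, nil] })
custom_summary  = toy_describe({ 'a' => [1, 2, 3, nil] }, %i[max min])
```

With the default list, column 'a' yields count 3, mean 2.0, standard deviation 1.0, minimum 1 and maximum 3, in that order. The custom call returns only the maximum followed by the minimum.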

### #ema(n, wilder) ⇒ Object

Calculate exponential moving average.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

• wilder (TrueClass, FalseClass, NilClass)

If true, the smoothing factor is 1/n; if false, it is 2/(n+1). Defaults to false.
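The two smoothing factors described above can be compared with a short plain-Ruby sketch. `ema_sketch` is a hypothetical helper. Seeding the average with the simple mean of the first n values, and leaving the first n - 1 entries nil, are assumptions of this sketch that the text above does not state.

```ruby
# Exponential moving average over an array of numbers.
# wilder = true uses 1/n as the smoothing factor, otherwise 2/(n + 1).
def ema_sketch(values, n = 10, wilder = false)
  alpha = wilder ? 1.0 / n : 2.0 / (n + 1)
  return [nil] * values.size if values.size < n

  # Assumption: the first defined entry is the simple mean of n values.
  seed = values.first(n).sum.to_f / n
  result = [nil] * (n - 1) + [seed]

  # Each later entry blends the new value with the previous average.
  values.drop(n).each do |value|
    result << alpha * value + (1 - alpha) * result.last
  end
  result
end

standard = ema_sketch([1, 2, 3, 4, 5], 3)        # alpha = 0.5
wilders  = ema_sketch([1, 2, 3, 4, 5], 3, true)  # alpha = 1/3
```

With n = 3 the standard factor is 0.5, so the sketch returns `[nil, nil, 2.0, 3.0, 4.0]`. The Wilder variant weights new values less and therefore trails the input more slowly.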

### #max(opts = {}) ⇒ Object

Calculate the maximum value of each numeric vector. As the source below shows, when opts[:vector] is given the method instead returns the row at which that vector reaches its maximum.

```ruby
# File 'lib/daru/maths/statistics/dataframe.rb', line 32

def max opts={}
  if opts[:vector]
    row[*self[opts[:vector]].max_index.index.to_a]
  else
    compute_stats :max
  end
end
```

### #mean ⇒ Object

Calculate the mean of the numeric vectors.

### #median ⇒ Object

Calculate the median of the numeric vectors.

### #min ⇒ Object

Calculate the minimum value of each numeric vector.

### #mode ⇒ Object

Calculate the mode of the numeric vectors.

### #percent_change(periods = 1) ⇒ Object

The percent_change method computes the percent change over the given number of periods for numeric vectors.

Examples:

```ruby
df = Daru::DataFrame.new({
  'col0' => [1,2,3,4,5,6],
  'col2' => ['a','b','c','d','e','f'],
  'col1' => [11,22,33,44,55,66]
  },
  index: ['one', 'two', 'three', 'four', 'five', 'six'],
  order: ['col0', 'col1', 'col2'])
df.percent_change
#=>
#   <Daru::DataFrame:23513280 @rows: 6 @cols: 2>
#              col0                col1
#   one
#   two        1.0                 1.0
#   three      0.5                 0.5
#   four       0.3333333333333333  0.3333333333333333
#   five       0.25                0.25
#   six        0.2                 0.2
```

Parameters:

• periods (Integer) (defaults to: 1)

Number of nils to insert at the beginning.

```ruby
# File 'lib/daru/maths/statistics/dataframe.rb', line 124

def percent_change periods=1
  df_numeric = only_numerics.vectors.to_a
  df = Daru::DataFrame.new({}, order: @order, index: @index, name: @name)
  df_numeric.each do |vec|
    df[vec] = self[vec].percent_change periods
  end
  df
end
```
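The per-vector arithmetic behind the example output can be sketched in plain Ruby for the default case of one period. `percent_change_sketch` is a hypothetical helper, not daru's `Daru::Vector#percent_change`. Returning nil whenever the current or previous value is missing is an assumption of this sketch.

```ruby
# Fractional change from each element to the next: (cur - prev) / prev.
# The first entry has no predecessor and is therefore nil.
def percent_change_sketch(values)
  values.each_with_index.map do |cur, i|
    prev = i.zero? ? nil : values[i - 1]
    (prev.nil? || cur.nil?) ? nil : (cur - prev) / prev.to_f
  end
end

col0 = percent_change_sketch([1, 2, 3, 4, 5, 6])
col1 = percent_change_sketch([11, 22, 33, 44, 55, 66])
```

Both calls return `[nil, 1.0, 0.5, 0.3333333333333333, 0.25, 0.2]`, matching the two columns of the example above. 'col1' is 'col0' scaled by 11, and scaling does not alter relative change.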

### #product ⇒ Object

Compute the product of each numeric vector.

### #range ⇒ Object

Calculate the range of the numeric vectors.

### #rolling_count(n) ⇒ Object

Calculate the moving count of non-missing values.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

### #rolling_max(n) ⇒ Object

Calculate the moving maximum.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

### #rolling_mean(n) ⇒ Object

Calculate the moving average.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

### #rolling_median(n) ⇒ Object

Calculate the moving median.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

### #rolling_min(n) ⇒ Object

Calculate the moving minimum.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

### #rolling_std(n) ⇒ Object

Calculate the moving standard deviation.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.

### #rolling_variance(n) ⇒ Object

Calculate the moving variance.

Parameters:

• n (Integer)

Loopback length. Defaults to 10.
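The rolling_* methods above all apply one statistic to a trailing window of n values. A generic plain-Ruby sketch of that pattern follows. `rolling_sketch` is a hypothetical helper that takes the statistic as a block. Returning nil until a full window is available is an assumption of this sketch.

```ruby
# Apply the given block to each trailing window of n values.
# Positions without a full window yield nil.
def rolling_sketch(values, n = 10)
  values.each_index.map do |i|
    i + 1 < n ? nil : yield(values[(i - n + 1)..i])
  end
end

series = [1, 2, 3, 4, 5]
means  = rolling_sketch(series, 3) { |window| window.sum.to_f / window.size }
maxes  = rolling_sketch(series, 3) { |window| window.max }
counts = rolling_sketch([1, nil, 3, 4], 2) { |window| window.compact.size }
```

Here `means` is `[nil, nil, 2.0, 3.0, 4.0]`, `maxes` is `[nil, nil, 3, 4, 5]`, and `counts` is `[nil, 1, 1, 2]`. Only the block changes between statistics.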

### #standardize ⇒ Object

Standardize each numeric vector.
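Standardizing is usually understood as converting each value to a z-score: subtract the mean, then divide by the standard deviation. The plain-Ruby sketch below illustrates that reading. `standardize_sketch` is a hypothetical helper, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text above does not say which estimator is used.

```ruby
# Convert each value to a z-score: (value - mean) / sample_sd.
def standardize_sketch(values)
  mean = values.sum.to_f / values.size
  # Assumption: sample standard deviation with an n - 1 denominator.
  sd = Math.sqrt(values.sum { |v| (v - mean)**2 } / (values.size - 1))
  values.map { |v| (v - mean) / sd }
end

z_scores = standardize_sketch([2, 4, 6])
```

For `[2, 4, 6]` the mean is 4.0 and the sample standard deviation is 2.0, so the result is `[-1.0, 0.0, 1.0]`.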

### #std ⇒ Object (also: #sds)

Calculate the sample standard deviation of the numeric vectors.

### #sum ⇒ Object

Calculate the sum of the numeric vectors.

### #variance_sample ⇒ Object (also: #variance)

Calculate the sample variance of the numeric vectors.

