Class: Statsample::Regression::Multiple::MatrixEngine

Inherits:
BaseEngine
Defined in:
lib/statsample/regression/multiple/matrixengine.rb

Overview

Pure Ruby Class for Multiple Regression Analysis, based on a covariance or correlation matrix.

If you have a Dataset, use Statsample::Regression::Multiple::RubyEngine instead, so you do not have to set every detail by hand.

Remember: NEVER use a covariance matrix if you have missing data. Use only a correlation matrix in that case.

Example:

matrix=Matrix[[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]

lr=Statsample::Regression::Multiple::MatrixEngine.new(matrix,2)
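As a rough illustration of what the engine derives from a matrix like this one, the sketch below solves the normal equations for the two-predictor case in plain Ruby. It assumes a correlation matrix (unit diagonal) stored as nested arrays, with index 2 as the criterion. `standardized_betas_2x2` is a hypothetical helper, not part of Statsample.

```ruby
# Hypothetical helper (not part of Statsample): standardized coefficients
# for exactly two predictors, solving R_xx * beta = r_xy by Cramer's rule.
# Assumes `r` is a 3x3 correlation matrix with ones on the diagonal.
def standardized_betas_2x2(r, y)
  x1, x2 = [0, 1, 2] - [y]            # predictor indices
  r12 = r[x1][x2]                     # correlation between the predictors
  det = 1.0 - r12**2                  # determinant of R_xx
  raise ArgumentError, "predictors are linearly dependent" if det.abs < 1e-15
  b1 = (r[x1][y] - r12 * r[x2][y]) / det
  b2 = (r[x2][y] - r12 * r[x1][y]) / det
  { x1 => b1, x2 => b2 }
end

matrix = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]
betas = standardized_betas_2x2(matrix, 2)
puts betas.inspect
```

For this matrix the two standardized coefficients come out at roughly -0.2 and 0.8.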

Direct Known Subclasses

RubyEngine

Instance Attribute Summary

Attributes inherited from BaseEngine

#digits, #name, #total_cases, #valid_cases

Instance Method Summary

Methods inherited from BaseEngine

#anova, #assign_names, #coeffs_t, #coeffs_tolerances, #estimated_variance_covariance_matrix, #f, #mse, #msr, #predicted, #probability, #process, #r2_adjusted, #report_building, #residuals, #se_estimate, #se_r2, #sse, #sse_direct, #ssr, #ssr_direct, #standarized_predicted, #univariate?

Methods included from Summarizable

#summary

Constructor Details

#initialize(matrix, y_var, opts = Hash.new) ⇒ MatrixEngine

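Creates a regression object from a covariance or correlation matrix. y_var is the name of the dependent variable and must be one of the matrix fields. Options in opts are passed to the matching setters; :digits defaults to 3.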

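Raises:

  • (LinearDependency)

    if the regressors are linearly dependent

  • (RuntimeError)

    if y_var is not one of the matrix fields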



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 36

def initialize(matrix,y_var, opts=Hash.new)
  matrix.extend Statsample::CovariateMatrix
  raise "#{y_var} variable should be on data" unless matrix.fields.include? y_var
  if matrix._type==:covariance
    @matrix_cov=matrix
    @matrix_cor=matrix.correlation
    @no_covariance=false
  else
    @matrix_cor=matrix
    @matrix_cov=matrix
    @no_covariance=true
  end
  
  @y_var=y_var
  @fields=matrix.fields-[y_var]
  
  @n_predictors=@fields.size
  @predictors_n=@n_predictors
  @matrix_x= @matrix_cor.submatrix(@fields)
  @matrix_x_cov= @matrix_cov.submatrix(@fields)
  raise LinearDependency, "Regressors are linearly dependent" if @matrix_x.determinant<1e-15

  
  @matrix_y = @matrix_cor.submatrix(@fields, [y_var])
  @matrix_y_cov = @matrix_cov.submatrix(@fields, [y_var])
  

  
  @y_sd=Math::sqrt(@matrix_cov.submatrix([y_var])[0,0])
  
  @x_sd=@n_predictors.times.inject({}) {|ac,i|
    ac[@matrix_x_cov.fields[i]]=Math::sqrt(@matrix_x_cov[i,i])
    ac;
  }
  
  @cases=nil
  @x_mean=@fields.inject({}) {|ac,f|
    ac[f]=0.0
    ac;
  }
  
  @y_mean=0.0
  @name=_("Multiple reggresion of %s on %s") % [@fields.join(","), @y_var]
  
  opts_default={:digits=>3}
  opts=opts_default.merge opts
  opts.each{|k,v|
      self.send("#{k}=",v) if self.respond_to? k
  }
  result_matrix=@matrix_x_cov.inverse * @matrix_y_cov

  if matrix._type==:covariance
    @coeffs=result_matrix.column(0).to_a
    @coeffs_stan=coeffs.collect {|k,v|
      coeffs[k]*@x_sd[k].quo(@y_sd)
    }
  else
    @coeffs_stan=result_matrix.column(0).to_a
    @coeffs=standarized_coeffs.collect {|k,v|
      standarized_coeffs[k]*@y_sd.quo(@x_sd[k])
    } 
  end
  @total_cases=@valid_cases=@cases
end
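The last branch of the constructor converts between raw (b) and standardized (beta) coefficients using the standard deviations. A minimal sketch of that conversion, assuming coefficients and standard deviations are kept in hashes keyed by predictor name; `standardize` and `unstandardize` are hypothetical helpers, not Statsample methods:

```ruby
# Hypothetical helpers (not part of Statsample) mirroring the two branches
# at the end of the constructor.

# beta_k = b_k * sd_x_k / sd_y
def standardize(b, x_sd, y_sd)
  b.each_with_object({}) { |(k, v), out| out[k] = v * x_sd[k] / y_sd }
end

# b_k = beta_k * sd_y / sd_x_k
def unstandardize(beta, x_sd, y_sd)
  beta.each_with_object({}) { |(k, v), out| out[k] = v * y_sd / x_sd[k] }
end

x_sd = { "a" => 2.0, "b" => 4.0 }   # assumed predictor standard deviations
y_sd = 10.0                         # assumed criterion standard deviation
beta = { "a" => 0.5, "b" => -0.2 }  # assumed standardized coefficients

b = unstandardize(beta, x_sd, y_sd)
puts b.inspect
puts standardize(b, x_sd, y_sd).inspect
```

With a correlation matrix and the default standard deviations of 1, both conversions leave the values unchanged, so b and beta coincide.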

Instance Attribute Details

#cases ⇒ Object



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 100

def cases
  raise "You should define the number of valid cases first" if @cases.nil?
  @cases
end

#digits=(value) ⇒ Object (writeonly)

Sets the attribute digits

Parameters:

  • value

    the value to set the attribute digits to.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 33

def digits=(value)
  @digits = value
end

#x_mean ⇒ Object

Hash of means for the predictors. By default, each mean is set to 0.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 26

def x_mean
  @x_mean
end

#x_sd ⇒ Object

Hash of standard deviations of the predictors. Only useful to set with a correlation matrix, where each value defaults to 1.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 21

def x_sd
  @x_sd
end

#y_mean ⇒ Object

Mean of the criterion. By default, set to 0.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 29

def y_mean
  @y_mean
end

#y_sd ⇒ Object

Standard deviation of the criterion. Only useful to set with a correlation matrix, where it defaults to 1.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 24

def y_sd
  @y_sd
end

Instance Method Details

#coeffs ⇒ Object

Hash of b (raw) coefficients.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 123

def coeffs
  assign_names(@coeffs)    
end

#coeffs_se ⇒ Object

Standard error of the coefficients. The standard error of a coefficient depends on:

  • Tolerance of the coefficient: lower tolerance implies higher error

  • R^2 of the model: higher r2 implies lower error

Reference:

  • Cohen et al. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 161

def coeffs_se
  out={}
  #mse=sse.quo(df_e)
  coeffs.each {|k,v|
    out[k]=@y_sd.quo(@x_sd[k])*Math::sqrt( 1.quo(tolerance(k)))*Math::sqrt((1-r2).quo(df_e))
  }
  out
end
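The expression inside the loop can be read as SE_b = (sd_y / sd_x) * sqrt(1 / tolerance) * sqrt((1 - R^2) / df_e). The sketch below isolates that formula so the effect of each factor is easy to check. `coefficient_se` and all input values are illustrative assumptions, not part of Statsample.

```ruby
# Hypothetical helper (not part of Statsample):
#   SE_b = (sd_y / sd_x) * sqrt(1 / tolerance) * sqrt((1 - R^2) / df_e)
def coefficient_se(y_sd:, x_sd:, tolerance:, r2:, df_e:)
  (y_sd / x_sd.to_f) * Math.sqrt(1.0 / tolerance) * Math.sqrt((1.0 - r2) / df_e)
end

# Same model fit, two different tolerances (values assumed for illustration).
se_high_tol = coefficient_se(y_sd: 1.0, x_sd: 1.0, tolerance: 0.75, r2: 0.52, df_e: 97)
se_low_tol  = coefficient_se(y_sd: 1.0, x_sd: 1.0, tolerance: 0.25, r2: 0.52, df_e: 97)

# Lower tolerance (more collinearity) gives the larger standard error.
puts(se_high_tol < se_low_tol)
```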

#constant ⇒ Object

Value of the constant (intercept).



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 118

def constant
  c=coeffs
  @y_mean - @fields.inject(0){|a,k| a + (c[k] * @x_mean[k])}
end
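The constant is the criterion mean minus the sum of each raw coefficient times its predictor mean. A small sketch with assumed values follows; `intercept` is a hypothetical stand-in for the method above:

```ruby
# Hypothetical stand-in for #constant: y_mean - sum(b_k * x_mean_k)
def intercept(y_mean, coeffs, x_mean)
  y_mean - coeffs.inject(0.0) { |acc, (k, b)| acc + b * x_mean[k] }
end

coeffs = { "a" => 2.5, "b" => -0.5 }     # assumed raw coefficients
x_mean = { "a" => 10.0, "b" => 20.0 }    # assumed predictor means

puts intercept(50.0, coeffs, x_mean)     # => 35.0
```

With the default means of 0 for every predictor and for the criterion, the constant is 0, so set x_mean and y_mean before relying on it.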

#constant_se ⇒ Object

Standard error of the constant. This method recreates the estimated variance-covariance matrix from the means, the standard deviations and the covariance matrix, so it needs a covariance matrix. Returns nil if the engine was built from a correlation matrix.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 178

def constant_se
  return nil if @no_covariance
  means=@x_mean
  #means[@y_var]=@y_mean
  means[:constant]=1
  sd=@x_sd
  #sd[@y_var]=@y_sd
  sd[:constant]=0
  fields=[:constant]+@matrix_cov.fields-[@y_var]
  # Recreate X'X using the variance-covariance matrix
  xt_x=Matrix.rows(fields.collect {|i|
    fields.collect {|j|
      if i==:constant or j==:constant
        cov=0
      elsif i==j
        cov=sd[i]**2
      else
        cov=@matrix_cov.submatrix(i..i,j..j)[0,0]
      end
      cov*(@cases-1)+@cases*means[i]*means[j]
    }
  })
  matrix=xt_x.inverse * mse
  matrix.collect {|i| Math::sqrt(i) if i>0 }[0,0]
end
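The reconstruction of X'X rests on the identity sum(x_i * x_j) = cov(x_i, x_j) * (n - 1) + n * mean_i * mean_j. The sketch below checks that identity on a tiny made-up data set. All helper names are hypothetical and the data are invented for illustration.

```ruby
# Sum of cross-products rebuilt from summary statistics:
#   sum(x_i * x_j) = cov(x_i, x_j) * (n - 1) + n * mean_i * mean_j
def cross_product_from_summary(cov, mean_i, mean_j, n)
  cov * (n - 1) + n * mean_i * mean_j
end

def mean(v)
  v.sum / v.size.to_f
end

# Unbiased sample covariance (divides by n - 1).
def sample_cov(a, b)
  ma = mean(a)
  mb = mean(b)
  a.zip(b).sum { |x, y| (x - ma) * (y - mb) } / (a.size - 1)
end

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
n = x1.size

direct  = x1.zip(x2).sum { |a, b| a * b }
rebuilt = cross_product_from_summary(sample_cov(x1, x2), mean(x1), mean(x2), n)
puts((direct - rebuilt).abs < 1e-9)
```

The constant column fits the same formula with a covariance of 0 and a mean of 1, which yields n on the diagonal and n * mean_j elsewhere in that row.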

#constant_t ⇒ Object

t value for the constant. Returns nil when #constant_se is nil.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 170

def constant_t
  return nil if constant_se.nil?
  constant.to_f / constant_se
end

#df_e ⇒ Object

Degrees of freedom for error



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 141

def df_e
  cases-@n_predictors-1
end

#df_r ⇒ Object

Degrees of freedom for regression



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 137

def df_r
  @n_predictors
end

#r ⇒ Object

Multiple correlation (R), for random models.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 114

def r
  Math::sqrt(r2)
end

#r2 ⇒ Object

Get R^2 for the regression. For fixed models, it is the coefficient of determination. For random models, it is the 'squared multiple correlation'. Equal to

  • 1-(|R| / |R_x|) or

  • Sum(beta_i*r_yi), where beta_i are the standardized coefficients <- used



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 110

def r2
  @n_predictors.times.inject(0) {|ac,i| ac+@coeffs_stan[i]* @matrix_y[i,0]} 
end
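The two expressions for R^2 can be compared directly on a small correlation matrix. This sketch assumes two predictors (indices 0 and 1) and a criterion (index 2), and uses plain nested arrays rather than Statsample objects. Every helper name is hypothetical.

```ruby
# 3x3 determinant by cofactor expansion along the first row.
def det3(m)
  m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1]) -
    m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0]) +
    m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
end

# R^2 = 1 - |R| / |R_x|, with R_x the predictor block (rows/cols 0 and 1).
def r2_from_determinants(r)
  det_x = r[0][0] * r[1][1] - r[0][1] * r[1][0]
  1.0 - det3(r) / det_x
end

# R^2 = sum(beta_i * r_yi), with standardized coefficients beta_i.
def r2_from_betas(betas, r_xy)
  betas.zip(r_xy).sum { |beta, r| beta * r }
end

r = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]
r_xy = [r[0][2], r[1][2]]

# Standardized coefficients for two predictors via Cramer's rule.
det_x = 1.0 - r[0][1]**2
betas = [(r_xy[0] - r[0][1] * r_xy[1]) / det_x,
         (r_xy[1] - r[0][1] * r_xy[0]) / det_x]

puts r2_from_betas(betas, r_xy).round(6)
puts r2_from_determinants(r).round(6)
```

Both forms agree at about 0.52 for this matrix; the engine uses the sum form because the standardized coefficients are already available.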

#sst ⇒ Object

Total sum of squares



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 132

def sst
  @y_sd**2*(cases-1.0)
end
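Because only summary statistics are available, the total sum of squares is rebuilt as sd_y^2 * (n - 1). A brief check against the direct definition, using invented data and a hypothetical `sst_from_sd` helper:

```ruby
# Hypothetical stand-in for #sst: variance of y times (n - 1).
def sst_from_sd(y_sd, cases)
  y_sd**2 * (cases - 1.0)
end

y = [2.0, 4.0, 6.0, 8.0]                      # made-up criterion values
y_bar = y.sum / y.size
direct = y.sum { |v| (v - y_bar)**2 }         # sum of squared deviations
y_sd = Math.sqrt(direct / (y.size - 1))       # sample standard deviation

puts((sst_from_sd(y_sd, y.size) - direct).abs < 1e-9)
```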

#standarized_coeffs ⇒ Object

Hash of beta (standardized) coefficients.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 128

def standarized_coeffs
  assign_names(@coeffs_stan)
end

#tolerance(var) ⇒ Object

Tolerance for a given variable, defined as 1-R^2 of the regression of the selected variable on the other independent variables. Returns 1 when there is only one predictor.

Reference:



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 149

def tolerance(var)
  return 1 if @matrix_x.column_size==1
  lr=Statsample::Regression::Multiple::MatrixEngine.new(@matrix_x, var)
  1-lr.r2
end
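To make the definition concrete, the sketch below computes the tolerance of one predictor out of three by regressing it on the other two and returning 1 - R^2. It assumes a 3x3 correlation matrix of predictors held as nested arrays. `tolerance_3` is a hypothetical helper, not the library method.

```ruby
# Hypothetical helper (not part of Statsample): tolerance of predictor `var`
# given a 3x3 correlation matrix of predictors with a unit diagonal.
def tolerance_3(r, var)
  a, b = [0, 1, 2] - [var]                    # the other two predictors
  det = 1.0 - r[a][b]**2
  # Standardized coefficients of `var` regressed on a and b (Cramer's rule).
  beta_a = (r[a][var] - r[a][b] * r[b][var]) / det
  beta_b = (r[b][var] - r[a][b] * r[a][var]) / det
  1.0 - (beta_a * r[a][var] + beta_b * r[b][var])
end

predictors = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]
puts tolerance_3(predictors, 2).round(6)

# Uncorrelated predictors have a tolerance of 1.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
puts tolerance_3(identity, 0)
```

A tolerance near 0 means the predictor is almost a linear combination of the others, which inflates its standard error.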