Class: Statsample::Regression::Multiple::MatrixEngine

Inherits:
BaseEngine
Defined in:
lib/statsample/regression/multiple/matrixengine.rb

Overview

Pure Ruby Class for Multiple Regression Analysis, based on a covariance or correlation matrix.

If you have a Dataset, use Statsample::Regression::Multiple::RubyEngine instead, which avoids setting these details by hand.

Remember: NEVER use a covariance matrix if you have missing data. Use only a correlation matrix in that case.

Example:

matrix=Matrix[[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]

lr=Statsample::Regression::Multiple::MatrixEngine.new(matrix,2)
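
As a rough sketch of what such an engine derives from this matrix, the standardized coefficients solve R_xx * beta = r_xy, with index 2 treated as the criterion. The block uses only Ruby's standard `matrix` library, not Statsample itself, and the helper name `standardized_betas` is hypothetical.

```ruby
require 'matrix'

# Hypothetical helper (not part of Statsample): standardized coefficients
# from a correlation matrix, treating row/column y_index as the criterion.
def standardized_betas(cor, y_index)
  x_idx = (0...cor.row_count).to_a - [y_index]
  # Correlations among predictors (R_xx) and between predictors and y (r_xy)
  r_xx = Matrix.build(x_idx.size, x_idx.size) { |i, j| cor[x_idx[i], x_idx[j]] }
  r_xy = Matrix.column_vector(x_idx.map { |i| cor[i, y_index] })
  (r_xx.inverse * r_xy).column(0).to_a
end

cor   = Matrix[[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]
betas = standardized_betas(cor, 2)
puts betas.map { |b| b.round(3) }.inspect
```

For this matrix the two coefficients come out at approximately -0.2 and 0.8.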

Direct Known Subclasses

RubyEngine

Instance Attribute Summary

Attributes inherited from BaseEngine

#digits, #name, #total_cases, #valid_cases

Instance Method Summary

Methods inherited from BaseEngine

#anova, #assign_names, #coeffs_t, #coeffs_tolerances, #estimated_variance_covariance_matrix, #f, #mse, #msr, #predicted, #probability, #process, #r2_adjusted, #report_building, #residuals, #se_estimate, #se_r2, #sse, #sse_direct, #ssr, #ssr_direct, #standarized_predicted, univariate?

Methods included from Summarizable

#summary

Constructor Details

#initialize(matrix, y_var, opts = Hash.new) ⇒ MatrixEngine

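Creates the regression engine from a covariance or correlation matrix; y_var names the criterion field.

Raises:

  • (RuntimeError) if y_var is not among the fields of the matrix

  • (LinearDependency) if the regressors are linearly dependent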



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 36

def initialize(matrix,y_var, opts=Hash.new)
  matrix.extend Statsample::CovariateMatrix
  raise "#{y_var} variable should be on data" unless matrix.fields.include? y_var
  if matrix._type==:covariance
    @matrix_cov=matrix
    @matrix_cor=matrix.correlation
    @no_covariance=false
  else
    @matrix_cor=matrix
    @matrix_cov=matrix
    @no_covariance=true
  end
  
  @y_var=y_var
  @fields=matrix.fields-[y_var]
  
  @n_predictors=@fields.size
  @predictors_n=@n_predictors
  @matrix_x= @matrix_cor.submatrix(@fields)
  @matrix_x_cov= @matrix_cov.submatrix(@fields)
  raise LinearDependency, "Regressors are linearly dependent" if @matrix_x.determinant<1e-15

  
  @matrix_y = @matrix_cor.submatrix(@fields, [y_var])
  @matrix_y_cov = @matrix_cov.submatrix(@fields, [y_var])
  
  @y_sd=Math::sqrt(@matrix_cov.submatrix([y_var])[0,0])
  
  @x_sd=@n_predictors.times.inject({}) {|ac,i|
    ac[@matrix_x_cov.fields[i]]=Math::sqrt(@matrix_x_cov[i,i])
    ac;
  }
  
  @cases=nil
  @x_mean=@fields.inject({}) {|ac,f|
    ac[f]=0.0
    ac;
  }
  
  @y_mean=0.0
  @name=_("Multiple reggresion of %s on %s") % [@fields.join(","), @y_var]
  
  opts_default = {:digits=>3}
  opts         = opts_default.merge opts
  opts.each{|k,v|
    self.send("#{k}=",v) if self.respond_to? k
  }
  result_matrix=@matrix_x_cov.inverse * @matrix_y_cov

  if matrix._type == :covariance
    @coeffs=result_matrix.column(0).to_a
    @coeffs_stan=coeffs.collect {|k,v|
      coeffs[k]*@x_sd[k].quo(@y_sd)
    }
  else
    @coeffs_stan=result_matrix.column(0).to_a
    @coeffs=standarized_coeffs.collect {|k,v|
      standarized_coeffs[k]*@y_sd.quo(@x_sd[k])
    } 
  end
  @total_cases=@valid_cases=@cases
end
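
The last branch of the constructor converts between raw (b) and standardized (beta) coefficients using the standard deviations. A minimal sketch of that relationship in plain Ruby follows; the helper names and all numeric values are made up for illustration and are not Statsample API.

```ruby
# Hypothetical helpers (names are illustrative, not Statsample API).
#   beta_k = b_k * sd_x[k] / sd_y     b_k = beta_k * sd_y / sd_x[k]
def standardize(raw, x_sd, y_sd)
  raw.each_with_object({}) { |(k, b), out| out[k] = b * x_sd[k] / y_sd }
end

def unstandardize(betas, x_sd, y_sd)
  betas.each_with_object({}) { |(k, beta), out| out[k] = beta * y_sd / x_sd[k] }
end

x_sd  = { a: 2.0, b: 4.0 }    # made-up predictor standard deviations
y_sd  = 10.0                  # made-up criterion standard deviation
betas = { a: -0.2, b: 0.8 }

raw  = unstandardize(betas, x_sd, y_sd)   # raw coefficients
back = standardize(raw, x_sd, y_sd)       # round trip back to betas
puts raw.inspect
puts back.inspect
```

With a correlation matrix and the default standard deviations of 1, both conversions leave the coefficients unchanged.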

Instance Attribute Details

#cases ⇒ Object



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 98

def cases
  raise "You should define the number of valid cases first" if @cases.nil?
  @cases
end

#digits=(value) ⇒ Object (writeonly)

Sets the attribute digits

Parameters:

  • value

    the value to set the attribute digits to.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 33

def digits=(value)
  @digits = value
end

#x_mean ⇒ Object

Hash of means of the predictors. By default, each is set to 0.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 26

def x_mean
  @x_mean
end

#x_sd ⇒ Object

Hash of standard deviations of the predictors. Only useful for a correlation matrix, where each defaults to 1.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 21

def x_sd
  @x_sd
end

#y_mean ⇒ Object

Mean of the criterion. By default, set to 0.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 29

def y_mean
  @y_mean
end

#y_sd ⇒ Object

Standard deviation of the criterion. Only useful for a correlation matrix, where it defaults to 1.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 24

def y_sd
  @y_sd
end

Instance Method Details

#coeffs ⇒ Object

Hash of b (raw) coefficients



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 121

def coeffs
  assign_names(@coeffs)
end

#coeffs_se ⇒ Object

Standard error for the coefficients. The standard error of a coefficient depends on

  • Tolerance of the coefficient: higher tolerance implies lower error

  • Higher r2 implies lower error

Reference:

  • Cohen et al. (2003). Applied Multiple Regression / Correlation Analysis for the Behavioral Sciences



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 159

def coeffs_se
  out={}
  #mse=sse.quo(df_e)
  coeffs.each {|k,v|
    out[k]=@y_sd.quo(@x_sd[k])*Math::sqrt( 1.quo(tolerance(k)))*Math::sqrt((1-r2).quo(df_e))
  }
  out
end
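
The formula in the listing above can be tried in isolation. The following sketch restates it as a standalone function, assuming the caller already knows the tolerance, R^2 and error degrees of freedom; the function name and the numeric inputs are hypothetical.

```ruby
# Hypothetical standalone version of the formula used in #coeffs_se:
#   se(b_k) = (sd_y / sd_k) * sqrt(1 / tolerance_k) * sqrt((1 - R^2) / df_e)
def coefficient_se(y_sd:, x_sd:, tolerance:, r2:, df_e:)
  (y_sd / x_sd) * Math.sqrt(1.0 / tolerance) * Math.sqrt((1.0 - r2) / df_e)
end

base = { y_sd: 10.0, x_sd: 2.0, r2: 0.52, df_e: 47 }   # made-up values

se_high_tol = coefficient_se(tolerance: 0.75, **base)
se_low_tol  = coefficient_se(tolerance: 0.25, **base)

# A predictor that overlaps more with the others (lower tolerance)
# gets the larger standard error.
puts se_high_tol < se_low_tol
```

Because tolerance enters as sqrt(1 / tolerance), the standard error shrinks as tolerance grows, all else being equal.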

#constant ⇒ Object

Value of the constant (intercept)



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 116

def constant
  c = coeffs
  @y_mean - @fields.inject(0) { |a,k| a + (c[k] * @x_mean[k])}
end
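
A small sketch of the same computation outside the class, assuming raw coefficients and means are held in hashes keyed by field name. The function name and the values are made up for illustration.

```ruby
# Hypothetical standalone version of the intercept computation:
#   constant = y_mean - sum_k(b_k * x_mean_k)
def intercept(y_mean, coeffs, x_mean)
  y_mean - coeffs.sum { |k, b| b * x_mean[k] }
end

coeffs = { a: -1.0, b: 2.0 }   # made-up raw coefficients
x_mean = { a: 3.0, b: 5.0 }    # made-up predictor means

puts intercept(20.0, coeffs, x_mean)
```

With the default means (all 0), this expression reduces to 0, so a meaningful constant requires setting the means first.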

#constant_se ⇒ Object

Standard error for the constant. This method recreates the estimated variance-covariance matrix using the means, the standard deviations and the covariance matrix, so it needs the covariance matrix. Returns nil when only a correlation matrix was supplied.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 176

def constant_se
  return nil if @no_covariance
  means=@x_mean
  #means[@y_var]=@y_mean
  means[:constant]=1
  sd=@x_sd
  #sd[@y_var]=@y_sd
  sd[:constant]=0
  fields=[:constant]+@matrix_cov.fields-[@y_var]
  # Recreate X'X using the variance-covariance matrix
  xt_x=::Matrix.rows(fields.collect {|i|
    fields.collect {|j|
      if i==:constant or j==:constant
        cov=0
      elsif i==j
        cov=sd[i]**2
      else
        cov=@matrix_cov.submatrix(i..i,j..j)[0,0]
      end
      cov*(@cases-1)+@cases*means[i]*means[j]
    }
  })
  matrix=xt_x.inverse * mse
  matrix.collect {|i| Math::sqrt(i) if i>0 }[0,0]
end
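
The reconstruction step relies on each element of X'X being cov(i,j)*(n-1) + n*mean_i*mean_j, with the constant column having mean 1 and zero covariance with everything. The following plain-Ruby check illustrates this with one made-up predictor; the variable names are hypothetical and the data is invented for the example.

```ruby
# Illustrative check: rebuild X'X from the mean, the sample variance and n,
# then compare it with X'X computed directly from the raw data.
x    = [1.0, 2.0, 3.0, 4.0]                     # made-up predictor values
n    = x.size
mean = x.sum / n
var  = x.sum { |v| (v - mean)**2 } / (n - 1)    # sample variance

means  = { constant: 1.0, x: mean }
cov    = { [:x, :x] => var }   # covariance is 0 wherever :constant is involved
fields = [:constant, :x]

rebuilt = fields.map do |i|
  fields.map do |j|
    c = cov.fetch([i, j], 0.0)
    c * (n - 1) + n * means[i] * means[j]
  end
end

# Direct X'X, using a design matrix with a leading column of ones
rows   = x.map { |v| [1.0, v] }
direct = (0..1).map { |i| (0..1).map { |j| rows.sum { |r| r[i] * r[j] } } }

puts rebuilt.inspect
puts direct.inspect
```

Both matrices should agree up to floating-point error, which is why the method can work without access to the raw data.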

#constant_t ⇒ Object

t value for the constant



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 168

def constant_t
  return nil if constant_se.nil?
  constant.to_f / constant_se
end

#df_e ⇒ Object

Degrees of freedom for error



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 139

def df_e
  cases-@n_predictors-1
end

#df_r ⇒ Object

Degrees of freedom for regression



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 135

def df_r
  @n_predictors
end

#r ⇒ Object

Multiple correlation coefficient, for random models.



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 112

def r
  Math::sqrt(r2)
end

#r2 ⇒ Object

Get R^2 for the regression. For fixed models, it is the coefficient of determination; for random models, it is the ‘squared multiple correlation’. Equal to

  • 1-(|R| / |R_x|) or

  • Sum(beta_i*r_yi), with beta_i the standardized coefficients <- used



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 108

def r2
  @n_predictors.times.inject(0) {|ac,i| ac+@coeffs_stan[i]* @matrix_y[i,0]} 
end
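
The two expressions for R^2 can be compared numerically. This sketch uses Ruby's standard `matrix` library on the correlation matrix from the class overview, with index 2 as the criterion; it is an illustration and does not call Statsample.

```ruby
require 'matrix'

# Illustrative check that both expressions for R^2 agree.
# Index 2 is the criterion; indices 0 and 1 are the predictors.
r   = Matrix[[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]
r_x = r.minor(0, 2, 0, 2)    # correlations among the predictors
r_y = r.minor(0, 2, 2, 1)    # correlations of the predictors with y

# 1 - |R| / |R_x|
r2_det = 1 - r.determinant / r_x.determinant

# Sum(beta_i * r_yi), with beta = R_x^-1 * r_y
beta   = r_x.inverse * r_y
r2_sum = (0...beta.row_count).sum { |i| beta[i, 0] * r_y[i, 0] }

puts r2_det.round(6)
puts r2_sum.round(6)
```

For this matrix both values are approximately 0.52.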

#sst ⇒ Object

Total sum of squares



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 130

def sst
  @y_sd**2*(cases-1.0)
end

#standarized_coeffs ⇒ Object

Hash of beta (standardized) coefficients



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 126

def standarized_coeffs
  assign_names(@coeffs_stan)
end

#tolerance(var) ⇒ Object

Tolerance for a given variable, defined as 1-R^2 of the regression of the selected variable on the other independent variables.

Reference:



# File 'lib/statsample/regression/multiple/matrixengine.rb', line 147

def tolerance(var)
  return 1 if @matrix_x.column_size==1
  lr=Statsample::Regression::Multiple::MatrixEngine.new(@matrix_x, var)
  1-lr.r2
end
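
The method above fits a nested regression for each variable. As an alternative illustration (a different technique, not what the listing does), the same quantity can be read from the inverse of the predictor correlation matrix, since the tolerance of predictor i equals 1 / (R_x^-1)[i, i]. The helper name `tolerances` is hypothetical and the block uses only Ruby's standard `matrix` library.

```ruby
require 'matrix'

# Illustrative shortcut: tolerance of predictor i is 1 / (R_x^-1)[i, i],
# i.e. 1 - R^2 from regressing predictor i on the remaining predictors.
def tolerances(r_x)
  return [1.0] if r_x.column_count == 1   # a single predictor has tolerance 1
  inv = r_x.inverse
  (0...r_x.row_count).map { |i| 1.0 / inv[i, i] }
end

r_x = Matrix[[1.0, 0.5], [0.5, 1.0]]   # two predictors correlated 0.5
tol = tolerances(r_x)
puts tol.map { |t| t.round(4) }.inspect
```

With two predictors correlated 0.5, each tolerance is 1 - 0.5^2, approximately 0.75.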