Class: Statsample::Bivariate::Tetrachoric

Inherits:
Object
Includes:
GetText
Defined in:
lib/statsample/bivariate/tetrachoric.rb

Overview

Compute tetrachoric correlation.

The tetrachoric correlation is a measure of bivariate association that arises when both observed variates are categorical variables obtained by dichotomizing two underlying continuous variables (Drasgow, 2006). It is well suited to measuring rater agreement (Uebersax, 2006).

This class uses the algorithm of Brown (1977), published as Algorithm AS 116. The original FORTRAN code is available at lib.stat.cmu.edu/apstat/116

Usage

With two variables x and y on a crosstab like this:

      -------------
      | y=0 | y=1 |
      -------------
x = 0 |  a  |  b  |
      -------------
x = 1 |  c  |  d  |
      -------------

The corresponding code is:

tc=Statsample::Bivariate::Tetrachoric.new(a,b,c,d)
tc.r # correlation
tc.se # standard error
tc.threshold_y # threshold for y variable
tc.threshold_x # threshold for x variable
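
As a rough sanity check on the value returned by tc.r, the sketch below computes the classical "cosine-pi" approximation to the tetrachoric correlation from the same four cells. This is not the Brown (1977) algorithm used by the class, and the helper name cos_pi_tetrachoric is hypothetical (it is not part of Statsample). The approximation is only expected to land near the iterative estimate, not to reproduce it.

```ruby
# Hypothetical helper, NOT part of Statsample: the "cosine-pi" approximation
# to the tetrachoric correlation for the 2x2 table [[a, b], [c, d]].
def cos_pi_tetrachoric(a, b, c, d)
  if [a, b, c, d].any?(&:negative?)
    raise ArgumentError, "frequencies must be non-negative"
  end
  # The odds ratio (a*d)/(b*c) is undefined when b or c is zero.
  raise ArgumentError, "b and c must both be positive" if b.zero? || c.zero?

  odds_ratio = (a * d).to_f / (b * c)
  Math.cos(Math::PI / (1.0 + Math.sqrt(odds_ratio)))
end

approx = cos_pi_tetrachoric(40, 10, 10, 40)
puts format("cos-pi approximation: %.3f", approx) # prints "cos-pi approximation: 0.809"
```

A table with equal cells gives an odds ratio of 1 and therefore an approximation of (numerically) zero, which matches the intuition that such a table shows no association.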

References:

  • Brown, MB. (1977) Algorithm AS 116: the tetrachoric correlation and its standard error. Applied Statistics, 26, 343-351.

  • Drasgow F. (2006). Polychoric and polyserial correlations. In Kotz L, Johnson NL (Eds.), Encyclopedia of statistical sciences. Vol. 7 (pp. 69-74). New York: Wiley.

  • Uebersax, J.S. (2006). The tetrachoric and polychoric correlation coefficients. Statistical Methods for Rater Agreement web site. Available at: john-uebersax.com/stat/tetra.htm. Accessed February 11, 2010.

Constant Summary

TWOPI =
Math::PI*2
SQT2PI =
2.50662827
RLIMIT =
0.9999
RCUT =
0.95
UPLIM =
5.0
CONST =
1E-36
CHALF =
1E-18
CONV =
1E-8
CITER =
1E-6
NITER =
25
X =
[0,0.9972638618,  0.9856115115,  0.9647622556, 0.9349060759,  0.8963211558, 0.8493676137, 0.7944837960, 0.7321821187, 0.6630442669, 0.5877157572, 0.5068999089, 0.4213512761, 0.3318686023, 0.2392873623, 0.1444719616, 0.0483076657]
W =
[0, 0.0070186100,  0.0162743947,  0.0253920653, 0.0342738629,  0.0428358980,  0.0509980593, 0.0586840935,  0.0658222228,  0.0723457941, 0.0781938958, 0.0833119242, 0.0876520930, 0.0911738787, 0.0938443991, 0.0956387201, 0.0965400885]
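
The X and W arrays match, to the printed precision, the 16 positive nodes and weights of the 32-point Gauss-Legendre quadrature rule on [-1, 1]. The leading 0 looks like a placeholder that preserves the 1-based indexing of the FORTRAN original; that reading is an assumption, as is the helper name gauss_legendre below. The sketch checks the interpretation numerically by integrating two simple functions.

```ruby
# Values copied from the X and W constants above; index 0 is assumed
# to be a placeholder for 1-based indexing.
X = [0, 0.9972638618, 0.9856115115, 0.9647622556, 0.9349060759,
     0.8963211558, 0.8493676137, 0.7944837960, 0.7321821187,
     0.6630442669, 0.5877157572, 0.5068999089, 0.4213512761,
     0.3318686023, 0.2392873623, 0.1444719616, 0.0483076657]
W = [0, 0.0070186100, 0.0162743947, 0.0253920653, 0.0342738629,
     0.0428358980, 0.0509980593, 0.0586840935, 0.0658222228,
     0.0723457941, 0.0781938958, 0.0833119242, 0.0876520930,
     0.0911738787, 0.0938443991, 0.0956387201, 0.0965400885]

# Hypothetical helper: symmetric quadrature on [-1, 1]. Each node is
# evaluated at +x and -x and both evaluations share the same weight.
def gauss_legendre(nodes, weights)
  nodes.zip(weights).sum { |x, w| w * (yield(x) + yield(-x)) }
end

nodes   = X.drop(1)
weights = W.drop(1)

puts weights.sum                                        # close to 1.0 (half of the total weight 2)
puts gauss_legendre(nodes, weights) { |x| x**2 }        # close to 2/3
puts gauss_legendre(nodes, weights) { |x| Math.cos(x) } # close to 2 * sin(1)
```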


Constructor Details

#initialize(a, b, c, d) ⇒ Tetrachoric

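Creates a new Tetrachoric object from the four cell frequencies of the 2x2 table and runs the computation. Raises an error if any frequency is negative.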



# File 'lib/statsample/bivariate/tetrachoric.rb', line 137

def initialize(a,b,c,d)
  @a,@b,@c,@d=a,b,c,d
  @name=_("Tetrachoric correlation")
  #
  #       CHECK IF ANY CELL FREQUENCY IS NEGATIVE
  #
  raise "All frequencies should be positive" if  (@a < 0 or @b < 0 or @c < 0  or @d < 0)
  compute
end

Instance Attribute Details

#name ⇒ Object

Returns the value of attribute name.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 63

def name
  @name
end

#r ⇒ Object (readonly)

Returns the value of attribute r.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 62

def r
  @r
end

Class Method Details

.new_with_matrix(m) ⇒ Object

Creates a Tetrachoric object based on a 2x2 Matrix.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 78

def self.new_with_matrix(m)
  Tetrachoric.new(m[0,0], m[0,1], m[1,0],m[1,1])
end
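
The indexing in new_with_matrix maps row 0 to x = 0 and column 0 to y = 0, so the matrix is read in the same a, b, c, d order as the crosstab in the usage section. The sketch below illustrates that mapping with Ruby's standard Matrix class only; cells_from_matrix is a hypothetical helper (not part of Statsample), and the size check is an addition that the library method does not perform.

```ruby
require 'matrix'

# Hypothetical helper, NOT part of Statsample: unpack a 2x2 Matrix into
# the a, b, c, d cell counts in the order new_with_matrix reads them.
def cells_from_matrix(m)
  unless m.row_count == 2 && m.column_count == 2
    raise ArgumentError, "expected a 2x2 matrix"
  end
  [m[0, 0], m[0, 1], m[1, 0], m[1, 1]]
end

table = Matrix[[40, 10], [10, 40]]
a, b, c, d = cells_from_matrix(table)
puts "a=#{a} b=#{b} c=#{c} d=#{d}" # prints "a=40 b=10 c=10 d=40"

# With Statsample loaded, the equivalent call would be:
#   tc = Statsample::Bivariate::Tetrachoric.new_with_matrix(table)
```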

.new_with_vectors(v1, v2) ⇒ Object

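Creates a Tetrachoric object based on two vectors. Cases that are not valid on both vectors are dropped, each vector is dichotomized, and the resulting 0/1 pairs are counted into the 2x2 table.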



# File 'lib/statsample/bivariate/tetrachoric.rb', line 83

def self.new_with_vectors(v1,v2)
  v1a, v2a=Statsample.only_valid(v1,v2)
  v1a=v1a.dichotomize
  v2a=v2a.dichotomize
  raise "v1 have only 0" if v1a.factors==[0]
  raise "v2 have only 0" if v2a.factors==[0]
  a,b,c,d = 0,0,0,0
  v1a.each_index{|i|
    x,y=v1a[i],v2a[i]
    a+=1 if x==0 and y==0
    b+=1 if x==0 and y==1
    c+=1 if x==1 and y==0
    d+=1 if x==1 and y==1
  }
  Tetrachoric.new(a,b,c,d)
end
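
The counting loop in new_with_vectors can be sketched with plain Ruby arrays. This version assumes the inputs are already coded as 0/1 and contain no missing values, so it skips the Statsample.only_valid and dichotomize steps; crosstab_2x2 is a hypothetical helper, not a library method.

```ruby
# Hypothetical helper, NOT part of Statsample: count the four cells of
# the 2x2 table for two equal-length arrays already coded as 0/1.
def crosstab_2x2(xs, ys)
  raise ArgumentError, "arrays must have the same length" unless xs.size == ys.size

  counts = Hash.new(0)
  xs.zip(ys) { |pair| counts[pair] += 1 }
  # Return the counts in a, b, c, d order: (0,0), (0,1), (1,0), (1,1).
  [[0, 0], [0, 1], [1, 0], [1, 1]].map { |cell| counts[cell] }
end

xs = [0, 0, 1, 1, 1, 0]
ys = [0, 1, 0, 1, 1, 0]
a, b, c, d = crosstab_2x2(xs, ys)
puts "a=#{a} b=#{b} c=#{c} d=#{d}" # prints "a=2 b=1 c=1 d=2"
```

The four counts can then be passed to Tetrachoric.new(a, b, c, d), which is what the library method does on its last line.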

Instance Method Details

#report_building(generator) ⇒ Object

:nodoc:



# File 'lib/statsample/bivariate/tetrachoric.rb', line 120

def report_building(generator) # :nodoc:
  section=ReportBuilder::Section.new(:name=>@name)
  t=ReportBuilder::Table.new(:name=>_("Contingence Table"),:header=>["","Y=0","Y=1", "T"])
  t.row(["X=0", @a,@b,@a+@b])
  t.row(["X=1", @c,@d,@c+@d])
  t.hr
  t.row(["T", @a+@c,@b+@d,@a+@b+@c+@d])
  section.add(t)
  #generator.parse_element(t)
  section.add(sprintf("r: %0.3f",r))
  section.add(_("SE: %0.3f") % se)
  section.add(_("Threshold X: %0.3f ") % [threshold_x] )
  section.add(_("Threshold Y: %0.3f ") % [threshold_y] )
  generator.parse_element(section)
end

#se ⇒ Object

Standard error of the tetrachoric correlation.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 100

def se
  @sdr
end

#summary ⇒ Object

Text summary of the analysis, built with ReportBuilder.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 116

def summary
  ReportBuilder.new(:name=>@name).add(self).to_text
end

#threshold_x ⇒ Object

Threshold for variable x (rows): the point on the Gaussian curve under which the X rater selects cases.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 105

def threshold_x
  @zab
end

#threshold_y ⇒ Object

Threshold for variable y (columns): the point on the Gaussian curve under which the Y rater selects cases.



# File 'lib/statsample/bivariate/tetrachoric.rb', line 112

def threshold_y
  @zac
end
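
The instance variable names @zab and @zac suggest that the thresholds are standard normal deviates of the marginal proportions (a+b)/n and (a+c)/n. The sketch below works under that assumption, which the listing above does not confirm; the library may also reorder cells or flip signs during the computation. The helpers normal_cdf and normal_quantile are hypothetical and use only Ruby's Math module, with bisection standing in for whatever inverse normal routine the library relies on.

```ruby
# Standard normal CDF, expressed through the error function.
def normal_cdf(z)
  0.5 * (1.0 + Math.erf(z / Math.sqrt(2.0)))
end

# Hypothetical helper: invert the CDF by bisection on [-10, 10].
def normal_quantile(p)
  raise ArgumentError, "p must be strictly between 0 and 1" unless p > 0 && p < 1

  lo = -10.0
  hi = 10.0
  100.times do
    mid = (lo + hi) / 2.0
    if normal_cdf(mid) < p
      lo = mid
    else
      hi = mid
    end
  end
  (lo + hi) / 2.0
end

a, b, c, d = 50, 20, 10, 20
n = (a + b + c + d).to_f

# ASSUMPTION: thresholds are the normal quantiles of the x = 0 row
# proportion and the y = 0 column proportion.
threshold_x = normal_quantile((a + b) / n)
threshold_y = normal_quantile((a + c) / n)
puts format("threshold_x ~ %.3f, threshold_y ~ %.3f", threshold_x, threshold_y)
```

Under this reading, a marginal split of exactly 50/50 puts the threshold at zero, and the threshold moves further from zero as the split becomes more uneven.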