Class: Statsample::StratifiedSample
- Inherits: Object
  - Object
    - Statsample::StratifiedSample
- Defined in: lib/statsample/multiset.rb
Class Method Summary
- .calculate_n_total(es) ⇒ Object
- .mean(*vectors) ⇒ Object
  Mean for an array of vectors.
- .proportion_sd_esd_wor(es) ⇒ Object
- .proportion_sd_ksd_wor(es) ⇒ Object
- .proportion_sd_ksd_wr(es) ⇒ Object
- .proportion_variance_esd_wor(es) ⇒ Object
- .proportion_variance_ksd_wor(es) ⇒ Object
- .proportion_variance_ksd_wr(es) ⇒ Object
- .standard_error_esd_wor(es) ⇒ Object
- .standard_error_esd_wr(es) ⇒ Object
- .standard_error_ksd_wor(es) ⇒ Object
- .standard_error_ksd_wr(es) ⇒ Object
- .variance_esd_wor(es) ⇒ Object
- .variance_esd_wr(es) ⇒ Object
  Based on stattrek.com/Lesson6/STRAnalysis.aspx.
- .variance_ksd_wor(es) ⇒ Object
  Source: Cochran (1972).
- .variance_ksd_wr(es) ⇒ Object
Instance Method Summary
- #initialize(ms, strata_sizes) ⇒ StratifiedSample (constructor)
  A new instance of StratifiedSample.
- #mean(field) ⇒ Object
  Population mean based on strata.
- #population_size ⇒ Object
  Population size.
- #proportion(field, v = 1) ⇒ Object
  Population proportion based on strata.
- #proportion_sd_esd_wor(field, v = 1) ⇒ Object
- #proportion_standard_error(field, v = 1) ⇒ Object
- #sample_size ⇒ Object
  Sample size.
- #standard_error_wor(field) ⇒ Object
  Standard error with estimated population variance and without replacement.
- #standard_error_wor_2(field) ⇒ Object
  Standard error with estimated population variance and without replacement.
- #standard_error_wr(field) ⇒ Object
- #strata_number ⇒ Object
  Number of strata.
- #stratum_ponderation(h) ⇒ Object (also: #wh)
  Stratum weight (ponderation).
- #stratum_size(h) ⇒ Object
  Size of stratum h.
- #variance_pst(field, v = 1) ⇒ Object
  Cochran (1971), p. 150.
- #vectors_by_field(field) ⇒ Object
Constructor Details
#initialize(ms, strata_sizes) ⇒ StratifiedSample
Returns a new instance of StratifiedSample.
    # File 'lib/statsample/multiset.rb', line 157
    def initialize(ms,strata_sizes)
      raise TypeError,"ms should be a Multiset" unless ms.is_a? Statsample::Multiset
      @ms=ms
      raise ArgumentError,"You should put a strata size for each dataset" if strata_sizes.keys.sort != ms.datasets_names
      @strata_sizes=strata_sizes
      @population_size=@strata_sizes.inject(0) {|a,x| a+x[1]}
      @strata_number=@ms.n_datasets
      @sample_size=@ms.datasets.inject(0) {|a,x| a+x[1].cases}
    end
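The bookkeeping done by the constructor can be sketched in plain Ruby, without Statsample. This is only an illustration: the stratum names and counts are invented, and the `samples` hash of arrays is a hypothetical stand-in for the datasets held by a `Statsample::Multiset`.

```ruby
# Hypothetical strata: population size per stratum (N_h), keyed by stratum name.
strata_sizes = { 'north' => 500, 'south' => 300 }

# Stand-in for the Multiset's datasets: sampled values per stratum.
samples = { 'north' => [1, 0, 1, 1], 'south' => [0, 0, 1] }

# Mirror the constructor's check: one stratum size per dataset name.
if strata_sizes.keys.sort != samples.keys.sort
  raise ArgumentError, 'You should put a strata size for each dataset'
end

# Population size N: sum of the stratum sizes.
population_size = strata_sizes.inject(0) { |a, (_name, size)| a + size }
# Number of strata: one per dataset.
strata_number = samples.size
# Sample size n: sum of the cases sampled in each stratum.
sample_size = samples.inject(0) { |a, (_name, values)| a + values.size }
```

With these made-up inputs the population size is 800, there are 2 strata, and the sample size is 7.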
Class Method Details
.calculate_n_total(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 75
    def calculate_n_total(es)
      es.inject(0) {|a,h| a+h['N'] }
    end
.mean(*vectors) ⇒ Object
Mean for an array of vectors.

    # File 'lib/statsample/multiset.rb', line 53
    def mean(*vectors)
      n_total=0
      means=vectors.inject(0){|a,v|
        n_total+=v.size
        a+v.sum
      }
      means.to_f/n_total
    end
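The same pooled mean (total sum divided by total count) can be sketched with plain arrays. The helper name `pooled_mean` is hypothetical and not part of Statsample; it only assumes that each argument responds to `size` and `sum`, as the method above does.

```ruby
# Pooled mean of several samples: total sum divided by total number of cases.
def pooled_mean(*vectors)
  n_total = vectors.sum(&:size)          # total number of cases
  vectors.sum { |v| v.sum }.to_f / n_total
end

a = [1, 2, 3]
b = [4, 5]
pooled_mean(a, b)  # => 3.0
```

Note that this is not the mean of the per-vector means: larger vectors contribute proportionally more.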
.proportion_sd_esd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 152
    def proportion_sd_esd_wor(es)
      Math::sqrt(proportion_variance_ksd_wor(es))
    end
.proportion_sd_ksd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 126
    def proportion_sd_ksd_wor(es)
      Math::sqrt(proportion_variance_ksd_wor(es))
    end
.proportion_sd_ksd_wr(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 131
    def proportion_sd_ksd_wr(es)
      n_total=calculate_n_total(es)
      sum=es.inject(0){|a,h|
        val= (h['N']**2 * h['p']*(1-h['p'])) / h['n'].to_f
        a+val
      }
      Math::sqrt(sum) * (1.0/n_total)
    end
.proportion_variance_esd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 143
    def proportion_variance_esd_wor(es)
      n_total=calculate_n_total(es)
      sum=es.inject(0){|a,h|
        val=(h['N']**2 * (h['N']-h['n']) * h['p']*(1.0-h['p'])) / ((h['n']-1)*(h['N']-1))
        a+val
      }
      # (1/N^2) * sum, the same estimator as #variance_pst
      sum * (1.0/n_total**2)
    end
.proportion_variance_ksd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 119
    def proportion_variance_ksd_wor(es)
      n_total=calculate_n_total(es)
      es.inject(0){|a,h|
        val= (((h['N'].to_f / n_total)**2 * h['p']*(1-h['p'])) / (h['n'])) * (1- (h['n'].to_f / h['N']))
        a+val
      }
    end
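As a worked illustration of the formula above, the following sketch evaluates the same expression on invented stratum summaries. The hash keys `'N'`, `'n'` and `'p'` follow the listing; the numbers are made up, and the local names `weight` and `fpc` are only labels added here for readability.

```ruby
# Hypothetical per-stratum summaries: N = stratum size, n = sample size,
# p = proportion observed in the stratum sample.
es = [
  { 'N' => 200, 'n' => 20, 'p' => 0.5 },
  { 'N' => 800, 'n' => 80, 'p' => 0.25 }
]

n_total = es.inject(0) { |a, h| a + h['N'] }

variance = es.inject(0) do |a, h|
  weight = h['N'].to_f / n_total        # stratum weight W_h
  fpc    = 1 - (h['n'].to_f / h['N'])   # finite population correction
  a + (weight**2 * h['p'] * (1 - h['p']) / h['n']) * fpc
end

standard_deviation = Math.sqrt(variance)
```

For these inputs the two strata contribute 0.00045 and 0.00135, so the variance is about 0.0018.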
.proportion_variance_ksd_wr(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 139
    def proportion_variance_ksd_wr(es)
      # Variance with replacement is the square of the with-replacement SD
      proportion_sd_ksd_wr(es)**2
    end
.standard_error_esd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 103
    def standard_error_esd_wor(es)
      Math::sqrt(variance_ksd_wor(es))
    end
.standard_error_esd_wr(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 115
    def standard_error_esd_wr(es)
      Math::sqrt(variance_esd_wr(es))
    end
.standard_error_ksd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 87
    def standard_error_ksd_wor(es)
      Math::sqrt(variance_ksd_wor(es))
    end
.standard_error_ksd_wr(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 62
    def standard_error_ksd_wr(es)
      n_total=0
      sum=es.inject(0){|a,h|
        n_total+=h['N']
        a+((h['N']**2 * h['s']**2) / h['n'].to_f)
      }
      (1.to_f / n_total)*Math::sqrt(sum)
    end
.variance_esd_wor(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 93
    def variance_esd_wor(es)
      n_total=calculate_n_total(es)
      sum=es.inject(0){|a,h|
        val=h['N']*(h['N']-h['n'])*(h['s']**2 / h['n'].to_f)
        a+val
      }
      (1.0/(n_total**2))*sum
    end
.variance_esd_wr(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 107
    def variance_esd_wr(es)
      n_total=calculate_n_total(es)
      sum=es.inject(0){|a,h|
        val= ((h['s']**2 * h['N']**2) / h['n'].to_f)
        a+val
      }
      (1.0/(n_total**2))*sum
    end
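A minimal sketch of the with-replacement variance, (1/N²) Σ N_h² s_h² / n_h, evaluated on invented data. The `es` layout (`'N'`, `'n'`, `'s'`) follows the listing above; the values are hypothetical, and the code is plain Ruby rather than a call into Statsample.

```ruby
# Hypothetical per-stratum summaries: N = stratum size, n = sample size,
# s = standard deviation within the stratum.
es = [
  { 'N' => 100, 'n' => 10, 's' => 2.0 },
  { 'N' => 300, 'n' => 30, 's' => 4.0 }
]

n_total = es.sum { |h| h['N'] }

# Sum of N_h^2 * s_h^2 / n_h over the strata.
sum = es.sum { |h| (h['s']**2 * h['N']**2) / h['n'].to_f }

variance_wr       = sum / (n_total**2).to_f
standard_error_wr = Math.sqrt(variance_wr)
```

Here the strata contribute 4000 and 48000, so the variance is 52000 / 160000 = 0.325.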
.variance_ksd_wor(es) ⇒ Object
Source: Cochran (1972)

    # File 'lib/statsample/multiset.rb', line 80
    def variance_ksd_wor(es)
      n_total=calculate_n_total(es)
      es.inject(0){|a,h|
        val=((h['N'].to_f / n_total)**2) * (h['s']**2 / h['n'].to_f) * (1 - (h['n'].to_f / h['N']))
        a+val
      }
    end
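The without-replacement variance above is Σ W_h² (s_h² / n_h)(1 − n_h / N_h), where W_h = N_h / N. The sketch below evaluates it on invented stratum summaries; the numbers are hypothetical and the local names `weight` and `fpc` are labels added here, not Statsample identifiers.

```ruby
# Hypothetical per-stratum summaries: N = stratum size, n = sample size,
# s = standard deviation within the stratum.
es = [
  { 'N' => 100, 'n' => 10, 's' => 2.0 },
  { 'N' => 300, 'n' => 30, 's' => 4.0 }
]

n_total = es.inject(0) { |a, h| a + h['N'] }

variance = es.inject(0) do |a, h|
  weight = h['N'].to_f / n_total        # stratum weight W_h
  fpc    = 1 - (h['n'].to_f / h['N'])   # finite population correction
  a + weight**2 * (h['s']**2 / h['n'].to_f) * fpc
end

standard_error = Math.sqrt(variance)
```

With a 10% sampling fraction in both strata, the contributions are 0.0225 and 0.27, giving a variance of about 0.2925.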
.variance_ksd_wr(es) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 72
    def variance_ksd_wr(es)
      standard_error_ksd_wr(es)**2
    end
Instance Method Details
#mean(field) ⇒ Object
Population mean based on strata
    # File 'lib/statsample/multiset.rb', line 202
    def mean(field)
      @ms.sum_field(field) {|s_name,vector|
        stratum_ponderation(s_name)*vector.mean
      }
    end
#population_size ⇒ Object
Population size. Equal to the sum of the strata sizes. Symbol: N_h

    # File 'lib/statsample/multiset.rb', line 172
    def population_size
      @population_size
    end
#proportion(field, v = 1) ⇒ Object
Population proportion based on strata
    # File 'lib/statsample/multiset.rb', line 189
    def proportion(field, v=1)
      @ms.sum_field(field) {|s_name,vector|
        stratum_ponderation(s_name)*vector.proportion(v)
      }
    end
#proportion_sd_esd_wor(field, v = 1) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 235
    def proportion_sd_esd_wor(field,v=1)
      es=@ms.collect_vector(field) {|s_n, vector|
        {'N'=>@strata_sizes[s_n],'n'=>vector.size, 'p'=>vector.proportion(v)}
      }
      StratifiedSample.proportion_sd_esd_wor(es)
    end
#proportion_standard_error(field, v = 1) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 243
    def proportion_standard_error(field,v=1)
      prop=proportion(field,v)
      sum=@ms.sum_field(field) {|s_name,vector|
        nh=vector.size
        s_size=@strata_sizes[s_name]
        # to_f avoids integer division, which would truncate nh/s_size to 0
        (s_size**2 * (1-(nh.to_f/s_size)) * prop * (1-prop) / (nh -1 ))
      }
      (1.quo(@population_size)) * Math::sqrt(sum)
    end
#sample_size ⇒ Object
Sample size. Equal to the sum of the sample sizes of all strata.

    # File 'lib/statsample/multiset.rb', line 176
    def sample_size
      @sample_size
    end
#standard_error_wor(field) ⇒ Object
Standard error with estimated population variance and without replacement. Source: Cochran (1972)
    # File 'lib/statsample/multiset.rb', line 209
    def standard_error_wor(field)
      es=@ms.collect_vector(field) {|s_n, vector|
        {'N'=>@strata_sizes[s_n],'n'=>vector.size, 's'=>vector.sds}
      }
      StratifiedSample.standard_error_esd_wor(es)
    end
#standard_error_wor_2(field) ⇒ Object
Standard error with estimated population variance and without replacement. Source: stattrek.com/Lesson6/STRAnalysis.aspx
    # File 'lib/statsample/multiset.rb', line 220
    def standard_error_wor_2(field)
      sum=@ms.sum_field(field) {|s_name,vector|
        s_size=@strata_sizes[s_name]
        (s_size**2 * (1-(vector.size.to_f / s_size)) * vector.variance_sample / vector.size.to_f)
      }
      (1/@population_size.to_f)*Math::sqrt(sum)
    end
#standard_error_wr(field) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 228
    def standard_error_wr(field)
      es=@ms.collect_vector(field) {|s_n, vector|
        {'N'=>@strata_sizes[s_n],'n'=>vector.size, 's'=>vector.sds}
      }
      StratifiedSample.standard_error_esd_wr(es)
    end
#strata_number ⇒ Object
Number of strata
    # File 'lib/statsample/multiset.rb', line 167
    def strata_number
      @strata_number
    end
#stratum_ponderation(h) ⇒ Object Also known as: wh
Stratum weight (ponderation): the stratum size divided by the population size. Symbol: W_h

    # File 'lib/statsample/multiset.rb', line 196
    def stratum_ponderation(h)
      @strata_sizes[h].to_f / @population_size
    end
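To show how the weights are used, here is a small plain-Ruby sketch that computes W_h for each stratum and then a weighted mean, in the same way #mean combines `stratum_ponderation` with each stratum's sample mean. The stratum names, sizes and sample means are invented for the example.

```ruby
# Hypothetical stratum sizes N_h and per-stratum sample means.
strata_sizes = { 'a' => 600, 'b' => 400 }
sample_means = { 'a' => 10.0, 'b' => 20.0 }

population_size = strata_sizes.values.sum

# W_h = N_h / N for every stratum; the weights add up to 1.
weights = strata_sizes.transform_values { |size| size.to_f / population_size }

# Stratified estimate of the population mean: sum of W_h * mean_h.
stratified_mean = sample_means.sum { |name, mean| weights[name] * mean }
```

The weights here are 0.6 and 0.4, so the stratified mean is 0.6 × 10 + 0.4 × 20 = 14.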
#stratum_size(h) ⇒ Object
Size of stratum h

    # File 'lib/statsample/multiset.rb', line 180
    def stratum_size(h)
      @strata_sizes[h]
    end
#variance_pst(field, v = 1) ⇒ Object
Cochran (1971), p. 150

    # File 'lib/statsample/multiset.rb', line 253
    def variance_pst(field,v=1)
      sum=@ms.datasets.inject(0) {|a,da|
        stratum_name=da[0]
        ds=da[1]
        nh=ds.cases.to_f
        s_size=@strata_sizes[stratum_name]
        prop=ds[field].proportion(v)
        a + (((s_size**2 * (s_size-nh)) / (s_size-1))*(prop*(1-prop) / (nh-1)))
      }
      (1/@population_size.to_f ** 2)*sum
    end
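The expression in the listing is (1/N²) Σ [N_h² (N_h − n_h) / (N_h − 1)] · [p_h (1 − p_h) / (n_h − 1)]. The sketch below evaluates it in plain Ruby on invented strata; the array of hashes is a hypothetical stand-in for the datasets and stratum sizes that the real method reads from its instance variables.

```ruby
# Hypothetical per-stratum inputs: N = stratum size, n = cases sampled,
# p = sample proportion of the value of interest.
strata = [
  { 'N' => 101, 'n' => 11, 'p' => 0.5 },
  { 'N' => 201, 'n' => 21, 'p' => 0.2 }
]

population_size = strata.sum { |h| h['N'] }

sum = strata.inject(0) do |a, h|
  nh = h['n'].to_f
  size_term = (h['N']**2 * (h['N'] - nh)) / (h['N'] - 1)  # N_h^2 (N_h - n_h) / (N_h - 1)
  prop_term = h['p'] * (1 - h['p']) / (nh - 1)            # p_h (1 - p_h) / (n_h - 1)
  a + size_term * prop_term
end

variance_pst = sum / population_size.to_f**2
```

Converting `nh` to a float first, as the listing does, keeps every division in floating point.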
#vectors_by_field(field) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 183
    def vectors_by_field(field)
      @ms.datasets.collect{|k,ds|
        ds[field]
      }
    end