Class: SpatialStats::Local::Stat
- Inherits:
- Object
- SpatialStats::Local::Stat
- Defined in:
- lib/spatial_stats/local/stat.rb
Overview
Stat is the abstract base class for local stats. It defines the methods that are common to all subclasses and raises a NotImplementedError for those that are specific to each type of statistic.
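The pattern can be sketched in a few lines. This is a minimal illustration, not the library's code: `BaseStat`, `SquaredStat`, and `max_stat` are hypothetical names made up for the example.

```ruby
# Hypothetical stand-ins for illustration; these names are not part of SpatialStats.
class BaseStat
  def initialize(values)
    @values = values
  end

  # Specific to each statistic: subclasses must override this.
  def stat
    raise NotImplementedError, 'method stat not defined'
  end

  # Shared by all subclasses: works with whatever #stat returns.
  def max_stat
    stat.max
  end
end

class SquaredStat < BaseStat
  def stat
    @values.map { |v| v * v }
  end
end

SquaredStat.new([1, -3, 2]).max_stat # the shared method runs through the override

begin
  BaseStat.new([1, 2]).stat
rescue NotImplementedError => e
  e.message # "method stat not defined"
end
```

Note that NotImplementedError descends from ScriptError rather than StandardError, so a bare `rescue` will not catch it; it has to be named explicitly as above.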
Direct Known Subclasses
Instance Attribute Summary
-
#field ⇒ Object
Returns the value of attribute field.
-
#scope ⇒ Object
Returns the value of attribute scope.
-
#weights ⇒ Object
Returns the value of attribute weights.
Class Method Summary
-
.from_observations(x, weights) ⇒ Stat
A new instance of Stat, from vector and weights.
Instance Method Summary
-
#crand(permutations, rng) ⇒ Numo::Int32
Conditional randomization algorithm used in permutation testing.
- #expectation ⇒ Object
-
#initialize(scope, field, weights) ⇒ Stat
constructor
Base class for local stats.
-
#mc(permutations = 99, seed = nil) ⇒ Array
Permutation test to determine pseudo p-values of the #stat method.
-
#mc_bv(permutations, seed) ⇒ Array
Permutation test to determine pseudo p-values of the #stat method.
-
#quads ⇒ Array
Determines what quadrant an observation is in.
- #stat ⇒ Object
-
#summary(permutations = 99, seed = nil) ⇒ Array
Summary of the statistic.
- #variance ⇒ Object
- #x=(values) ⇒ Object (also: #z=)
- #y=(values) ⇒ Object
-
#z_score ⇒ Array
Z-score for each observation of the statistic.
Constructor Details
#initialize(scope, field, weights) ⇒ Stat
Base class for local stats
# File 'lib/spatial_stats/local/stat.rb', line 12

def initialize(scope, field, weights)
  @scope = scope
  @field = field
  @weights = weights.standardize
end
Instance Attribute Details
#field ⇒ Object
Returns the value of attribute field.
# File 'lib/spatial_stats/local/stat.rb', line 17

def field
  @field
end
#scope ⇒ Object
Returns the value of attribute scope.
# File 'lib/spatial_stats/local/stat.rb', line 17

def scope
  @scope
end
#weights ⇒ Object
Returns the value of attribute weights.
# File 'lib/spatial_stats/local/stat.rb', line 17

def weights
  @weights
end
Class Method Details
.from_observations(x, weights) ⇒ Stat
A new instance of Stat, from vector and weights.
# File 'lib/spatial_stats/local/stat.rb', line 26

def self.from_observations(x, weights)
  raise ArgumentError, 'Data size != weights.n' if x.size != weights.n

  instance = new(nil, nil, weights.standardize)
  instance.x = x
  instance
end
Instance Method Details
#crand(permutations, rng) ⇒ Numo::Int32
Conditional randomization algorithm used in permutation testing. Returns a matrix of permuted index values that is used to select values from the original data set.
The width of the matrix is the maximum number of neighbors + 1, which is far smaller than it would be if the original vector were shuffled in full.
This matters because most weight matrices are very sparse, so the amount of shuffling and multiplication is reduced drastically.
# File 'lib/spatial_stats/local/stat.rb', line 84

def crand(permutations, rng)
  # basing this off the ESDA method
  # need to get k for max_neighbors
  # and wc for cardinalities of each item
  # this returns an array of length n with
  # (permutations x neighbors) Numo Arrays.
  # This helps reduce computation time because
  # we are only dealing with neighbors for each
  # entry not the entire list of permutations for each entry.
  n_1 = weights.n - 1

  # weight counts
  wc = weights.wc
  k = wc.max + 1
  prange = (0..permutations - 1).to_a

  ids_perm = (0..n_1 - 1).to_a
  Numo::Int32.cast(prange.map { ids_perm.sample(k, random: rng) })
end
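The shape of the index matrix is easier to see without Numo. The following is a pure-Ruby sketch under stated assumptions: `crand_sketch` is a hypothetical name, and `cardinalities` is a plain array standing in for `weights.wc`.

```ruby
# Pure-Ruby sketch of the index matrix built by #crand.
# `cardinalities` is a hypothetical stand-in for weights.wc
# (the number of neighbors of each observation).
def crand_sketch(n, cardinalities, permutations, rng)
  k = cardinalities.max + 1      # matrix width: max neighbors + 1
  ids_perm = (0...(n - 1)).to_a  # positions into the n - 1 "other" observations
  # One row per permutation, each holding k distinct positions.
  Array.new(permutations) { ids_perm.sample(k, random: rng) }
end

rng = Random.new(42)
rids = crand_sketch(6, [2, 3, 1, 2, 3, 1], 4, rng)
rids.size       # 4 rows, one per permutation
rids.first.size # 4 columns, since the largest cardinality is 3
```

Each row only needs as many entries as the best-connected observation has neighbors, so the matrix stays narrow even when `n` is large.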
#expectation ⇒ Object
# File 'lib/spatial_stats/local/stat.rb', line 38

def expectation
  raise NotImplementedError, 'method expectation not implemented'
end
#mc(permutations = 99, seed = nil) ⇒ Array
Permutation test to determine pseudo p-values of the #stat method. Shuffles x values, recomputes #stat for each variation, then compares the results to the computed one. The ratio of more extreme values to permutations, computed as (r + 1) / (permutations + 1), is returned for each observation.
# File 'lib/spatial_stats/local/stat.rb', line 116

def mc(permutations = 99, seed = nil)
  # For local tests, we need to shuffle the values
  # but for each item, hold its value in place and shuffle
  # its neighbors. Then we will only test for that item instead
  # of the entire set. This will be done for each item.
  rng = gen_rng(seed)
  rids = crand(permutations, rng)

  n_1 = weights.n - 1
  sparse = weights.sparse
  row_index = sparse.row_index
  ws = sparse.values
  wc = weights.wc
  stat_orig = stat

  arr = Numo::DFloat.cast(x)
  ids = (0..n_1).to_a

  observations = Array.new(weights.n)
  (0..n_1).each do |idx|
    idsi = ids.dup
    idsi.delete_at(idx)
    idsi.shuffle!(random: rng)
    idsi = Numo::Int32.cast(idsi)
    sample = arr[idsi[rids[true, 0..wc[idx] - 1]]]

    # account for case where there are no neighbors
    row_range = row_index[idx]..(row_index[idx + 1] - 1)
    if row_range.size.zero?
      observations[idx] = permutations
      next
    end

    wi = Numo::DFloat.cast(ws[row_range])
    stat_i_new = mc_i(wi, sample, idx)
    stat_i_orig = stat_orig[idx]
    observations[idx] = mc_observation_calc(stat_i_orig, stat_i_new, permutations)
  end

  observations.map do |ri|
    (ri + 1.0) / (permutations + 1.0)
  end
end
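To make the idea concrete, here is a pure-Ruby sketch of a conditional permutation test for one simple local statistic, z_i * sum_j(w_ij * z_j). It is an illustration under assumptions, not the library's implementation: `local_stat` and `pseudo_p_values` are hypothetical names, the neighbor lists and weights are plain arrays, and the "at least as extreme in absolute value" counting rule is assumed, since `mc_i` and `mc_observation_calc` are defined by subclasses and not shown here.

```ruby
# Local statistic for observation i: z_i times the weighted sum of neighbor
# values. `values` lets the caller substitute permuted neighbor values.
def local_stat(z, neighbors, weights, i, values = nil)
  values ||= neighbors[i].map { |j| z[j] }
  z[i] * weights[i].zip(values).sum { |w, v| w * v }
end

def pseudo_p_values(z, neighbors, weights, permutations: 99, seed: nil)
  rng = seed ? Random.new(seed) : Random.new
  z.each_index.map do |i|
    # Observations without neighbors get p = 1.0, mirroring the source above.
    next 1.0 if neighbors[i].empty?

    observed = local_stat(z, neighbors, weights, i)
    others = z.each_index.reject { |j| j == i } # hold observation i in place
    extreme = permutations.times.count do
      # Draw random "neighbors" from all other observations.
      sample = others.sample(neighbors[i].size, random: rng).map { |j| z[j] }
      local_stat(z, neighbors, weights, i, sample).abs >= observed.abs
    end
    (extreme + 1.0) / (permutations + 1.0)
  end
end

z = [1.2, 0.8, -0.5, -1.5]
neighbors = [[1], [0, 2], [1, 3], [2]]
weights = [[1.0], [0.5, 0.5], [0.5, 0.5], [1.0]]
p_vals = pseudo_p_values(z, neighbors, weights, permutations: 99, seed: 7)
```

With 99 permutations the smallest attainable pseudo p-value is 1 / 100 = 0.01, which is why permutation counts of the form 99, 999, and so on are common.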
#mc_bv(permutations, seed) ⇒ Array
Permutation test to determine pseudo p-values of the #stat method. Shuffles y values while holding x values fixed, recomputes #stat for each variation, then compares the results to the computed one. The ratio of more extreme values to permutations, computed as (r + 1) / (permutations + 1), is returned for each observation.
# File 'lib/spatial_stats/local/stat.rb', line 172

def mc_bv(permutations, seed)
  rng = gen_rng(seed)
  rids = crand(permutations, rng)

  n_1 = weights.n - 1
  sparse = weights.sparse
  row_index = sparse.row_index
  ws = sparse.values
  wc = weights.wc
  stat_orig = stat

  arr = Numo::DFloat.cast(y)
  ids = (0..n_1).to_a

  observations = Array.new(weights.n)
  (0..n_1).each do |idx|
    idsi = ids.dup
    idsi.delete_at(idx)
    idsi.shuffle!(random: rng)
    idsi = Numo::Int32.cast(idsi)
    sample = arr[idsi[rids[true, 0..wc[idx] - 1]]]

    # account for case where there are no neighbors
    row_range = row_index[idx]..(row_index[idx + 1] - 1)
    if row_range.size.zero?
      observations[idx] = permutations
      next
    end

    wi = Numo::DFloat.cast(ws[row_range])
    stat_i_new = mc_i(wi, sample, idx)
    stat_i_orig = stat_orig[idx]
    observations[idx] = mc_observation_calc(stat_i_orig, stat_i_new, permutations)
  end

  observations.map do |ri|
    (ri + 1.0) / (permutations + 1.0)
  end
end
#quads ⇒ Array
Determines what quadrant an observation is in, based on its value compared to its neighbors. This does not work for all stats, since it requires that values can be negative.
In a standardized array z, high values are those greater than 0. An observation's neighbors are characterized by its spatial lag: if the lag is positive, the neighbors are high; otherwise they are low.
Quadrants are:
- HH: a high value surrounded by other high values
- LH: a low value surrounded by high values
- LL: a low value surrounded by low values
- HL: a high value surrounded by low values
# File 'lib/spatial_stats/local/stat.rb', line 228

def quads
  # https://github.com/pysal/esda/blob/master/esda/moran.py#L925
  z_lag = SpatialStats::Utils::Lag.neighbor_average(weights, z)
  zp = z.map(&:positive?)
  lp = z_lag.map(&:positive?)

  # hh = zp & lp
  # lh = zp ^ true & lp
  # ll = zp ^ true & lp ^ true
  # hl = zp next to lp ^ true
  hh = zp.each_with_index.map { |v, idx| v & lp[idx] }
  lh = zp.each_with_index.map { |v, idx| (v ^ true) & lp[idx] }
  ll = zp.each_with_index.map { |v, idx| (v ^ true) & (lp[idx] ^ true) }
  hl = zp.each_with_index.map { |v, idx| v & (lp[idx] ^ true) }

  # now zip lists and map them to proper terms
  quad_terms = %w[HH LH LL HL]
  hh.zip(lh, ll, hl).map do |feature|
    quad_terms[feature.index(true)]
  end
end
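The same classification can be sketched without the boolean masks. This is an illustrative rewrite under assumptions: `quads_sketch` is a hypothetical name, and `neighbors` is a plain adjacency list standing in for the weights object, with the lag taken as the simple average of neighbor values.

```ruby
# Sketch: classify each observation by the sign of its own value and the
# sign of its spatial lag (the average of its neighbors' values).
# `neighbors` is a hypothetical adjacency list, not a SpatialStats weights object.
def quads_sketch(z, neighbors)
  z.each_with_index.map do |zi, i|
    lag = neighbors[i].sum(0.0) { |j| z[j] } / [neighbors[i].size, 1].max
    value = zi.positive? ? 'H' : 'L'         # first letter: the observation itself
    surroundings = lag.positive? ? 'H' : 'L' # second letter: its neighbors
    value + surroundings
  end
end

z = [1.2, 0.8, -0.5, -1.5, 0.9]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
quads_sketch(z, neighbors) # => ["HH", "HH", "LL", "LH", "HL"]
```

As in the source above, a value of exactly 0 counts as low because `positive?` is false for zero.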
#stat ⇒ Object
# File 'lib/spatial_stats/local/stat.rb', line 34

def stat
  raise NotImplementedError, 'method stat not defined'
end
#summary(permutations = 99, seed = nil) ⇒ Array
Summary of the statistic. Computes stat, mc, and groups, then returns the values in an array of hashes.
# File 'lib/spatial_stats/local/stat.rb', line 258

def summary(permutations = 99, seed = nil)
  p_vals = mc(permutations, seed)
  data = weights.keys.zip(stat, p_vals, groups)
  data.map do |row|
    { key: row[0], stat: row[1], p: row[2], group: row[3] }
  end
end
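The shape of the return value can be seen with made-up inputs. In this sketch the four arrays are hypothetical stand-ins for `weights.keys`, `stat`, `mc`, and `groups`; the numbers are invented for illustration only.

```ruby
# Hypothetical inputs standing in for weights.keys, stat, mc, and groups.
keys   = %w[a b c]
stats  = [0.42, -0.10, 0.31]
p_vals = [0.03, 0.48, 0.07]
groups = %w[HH LH LL]

# Zip the parallel arrays and turn each row into a hash, one per observation.
summary = keys.zip(stats, p_vals, groups).map do |key, stat, p_val, group|
  { key: key, stat: stat, p: p_val, group: group }
end

summary.first # => { key: "a", stat: 0.42, p: 0.03, group: "HH" }
```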
#variance ⇒ Object
# File 'lib/spatial_stats/local/stat.rb', line 42

def variance
  raise NotImplementedError, 'method variance not implemented'
end
#x=(values) ⇒ Object Also known as: z=
# File 'lib/spatial_stats/local/stat.rb', line 46

def x=(values)
  @x = values.standardize
end
#y=(values) ⇒ Object
# File 'lib/spatial_stats/local/stat.rb', line 51

def y=(values)
  @y = values.standardize
end
#z_score ⇒ Array
Z-score for each observation of the statistic.
# File 'lib/spatial_stats/local/stat.rb', line 59

def z_score
  numerators = stat.map { |v| v - expectation }
  denominators = variance.map { |v| Math.sqrt(v) }
  numerators.each_with_index.map do |numerator, idx|
    numerator / denominators[idx]
  end
end
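The calculation reduces to z_i = (stat_i - E) / sqrt(Var_i). A standalone sketch follows, assuming a scalar expectation and one variance per observation, which is the shape the source above implies; `z_scores` is a hypothetical helper, and the input numbers are invented.

```ruby
# Sketch: z_i = (stat_i - expectation) / sqrt(variance_i).
# Assumes a scalar expectation and a per-observation variance array.
def z_scores(stats, expectation, variances)
  stats.zip(variances).map { |s, v| (s - expectation) / Math.sqrt(v) }
end

z_scores([0.5, -0.25, 1.0], -0.25, [0.25, 0.25, 1.0]) # => [1.5, 0.0, 1.25]
```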