Class: Stamina::Abbadingo::RandomSample

Inherits:
Object
Defined in:
lib/stamina-induction/stamina/abbadingo/random_sample.rb

Overview

Generates a random Sample using the Abbadingo protocol.

Defined Under Namespace

Classes: StringEnumerator

Class Method Summary

Class Method Details

.execute(classifier, max_length = classifier.depth + 3) ⇒ Object

Generates a Sample instance with nb strings randomly sampled with a uniform distribution over all strings up
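The sampling idea can be sketched in isolation. The following is a minimal sketch, not the library's StringEnumerator: it assumes a binary alphabet and assumes the description above refers to all strings of length 0 up to max_length. The helpers string_count, nth_string and sample_strings are hypothetical names introduced only for illustration. Drawing an index uniformly and decoding it gives each string the same probability, and a hash of seen strings guarantees distinct results.

```ruby
# Hypothetical helper: number of strings of length 0..max_length over an
# alphabet of the given size (a binary alphabet is an assumption here).
def string_count(max_length, alphabet_size = 2)
  (0..max_length).sum { |len| alphabet_size**len }
end

# Hypothetical helper: decode an index in 0...string_count into a string.
# Strings are ordered by length first, then lexicographically.
def nth_string(index, alphabet = %w[0 1])
  len = 0
  # Skip whole length classes until the index falls inside one.
  while index >= alphabet.size**len
    index -= alphabet.size**len
    len += 1
  end
  # Write the remaining index in base alphabet.size, padded to len symbols.
  digits = Array.new(len) do
    index, digit = index.divmod(alphabet.size)
    alphabet[digit]
  end
  digits.reverse.join
end

# Hypothetical helper: draw nb distinct strings uniformly at random,
# capped at the number of strings that actually exist.
def sample_strings(nb, max_length, rng = Random.new)
  total = string_count(max_length)
  nb = [nb, total].min
  seen = {}
  seen[nth_string(rng.rand(total))] = true while seen.size < nb
  seen.keys
end
```

For example, sample_strings(5, 3) returns five distinct strings of at most three symbols, and asking for more strings than exist simply returns all of them.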



# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 111

def self.execute(classifier, max_length = classifier.depth + 3)
  enum = StringEnumerator.new(max_length)

  # We generate 1800 strings for the test set plus n^2 strings for the
  # training set, where n is the classifier's state count. If there are
  # not enough strings available, we generate as many as we can.
  seen = {}
  nb = [1800 + (classifier.state_count**2), enum.max].min

  # Let's go now
  enum.each do |s|
    seen[s] = true
    seen.size < nb
  end

  # Make them
  strings = seen.keys.collect{|s| InputString.new(s, classifier.accepts?(s))}
  pos, neg = strings.partition{|s| s.positive?}

  # Split them, 1800 in test and the rest in training set
  if (pos.size > 900) && (neg.size > 900)
    pos_test, pos_training = pos[0...900], pos[900..-1]
    neg_test, neg_training = neg[0...900], neg[900..-1]
  else
    pos_test, pos_training = pos.partition{|s| Kernel.rand < 0.5}
    neg_test, neg_training = neg.partition{|s| Kernel.rand < 0.5}
  end
  # Shuffle both sets so that positive and negative strings are interleaved
  training = (pos_training + neg_training).shuffle
  test = (pos_test + neg_test).shuffle
  [Sample.new(training), Sample.new(test)]
end
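
The split step at the end of the method can be tried without a classifier. The following is a minimal sketch under stated assumptions: split_sample is a hypothetical helper, plain arrays stand in for InputString and Sample, and the per_class and rng parameters are added only to make the behavior easy to test. It uses Array#shuffle for the final reordering.

```ruby
# Hypothetical helper mirroring the split logic above on plain arrays.
# Returns [training, test], in the same order as the method's result.
def split_sample(pos, neg, per_class = 900, rng = Random.new)
  if pos.size > per_class && neg.size > per_class
    # Enough strings of both labels: a balanced test set, rest for training.
    pos_test, pos_training = pos[0...per_class], pos[per_class..-1]
    neg_test, neg_training = neg[0...per_class], neg[per_class..-1]
  else
    # Otherwise each string goes to the test set with probability 0.5.
    pos_test, pos_training = pos.partition { rng.rand < 0.5 }
    neg_test, neg_training = neg.partition { rng.rand < 0.5 }
  end
  training = (pos_training + neg_training).shuffle(random: rng)
  test = (pos_test + neg_test).shuffle(random: rng)
  [training, test]
end
```

With 1000 positive and 1000 negative items, the test set holds exactly 900 of each and the remaining 200 items form the training set. With fewer than 901 items of either label, the sizes of the two sets vary from run to run.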