Class: Ai4r::Experiment::ClassifierEvaluator

Inherits:
Object
Defined in:
lib/ai4r/experiment/classifier_evaluator.rb

Overview

The ClassifierEvaluator is useful for comparing different classifier algorithms. The evaluator builds the classifiers from the same data examples and provides methods to evaluate their performance side by side. Use it to compare different algorithms, the same algorithm with different parameters, or your own new algorithm against the classic classifiers.

Constructor Details

#initialize ⇒ Object



# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 19

def initialize
  @classifiers = []
end

Instance Attribute Details

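#build_times ⇒ Object (readonly)

Returns the build times recorded by the most recent call to #build: one Benchmark measurement per classifier, in the order the classifiers were added. The value is nil until #build has been called.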



# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 16

def build_times
  @build_times
end

#classifiers ⇒ Object (readonly)

Returns the array of classifiers added to the evaluator so far, in insertion order.



# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 16

def classifiers
  @classifiers
end

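#eval_times ⇒ Object (readonly)

Returns the evaluation times recorded by the most recent call to #eval: one Benchmark measurement per classifier, in the order the classifiers were added. The value is nil until #eval has been called.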



# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 16

def eval_times
  @eval_times
end

Instance Method Details

#add_classifier(classifier) ⇒ Object Also known as: <<

Adds a classifier instance to the test batch and returns the evaluator itself, so calls can be chained.

Parameters:

  • classifier (Object)

Returns:

  • (Object)


# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 26

def add_classifier(classifier)
  @classifiers << classifier
  self
end
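
Because add_classifier returns self, several classifiers can be registered in one expression. The sketch below illustrates this with a hypothetical stand-in class (MiniEvaluator) that copies only the constructor and add_classifier shown above, so it runs without the ai4r gem; the symbols are placeholders for real classifier instances.

```ruby
# Hypothetical stand-in that copies only the methods documented above,
# so the snippet runs without the ai4r gem installed.
class MiniEvaluator
  attr_reader :classifiers

  def initialize
    @classifiers = []
  end

  def add_classifier(classifier)
    @classifiers << classifier
    self # returning self is what makes chaining possible
  end

  alias_method :<<, :add_classifier
end

# Symbols stand in for real classifier instances.
evaluator = MiniEvaluator.new
evaluator << :id3 << :naive_bayes
evaluator.add_classifier(:zero_r).add_classifier(:one_r)

p evaluator.classifiers
```

Classifiers keep their insertion order, which is the order used later by the timing arrays and result rows.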

#build(data_set) ⇒ Object

Builds all classifiers using the data examples found in data_set. The last attribute of each item is treated as the item's class. Build times are measured separately for each classifier and can be read through the build_times attribute reader.

Parameters:

  • data_set (Object)

Returns:

  • (Object)


# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 40

def build(data_set)
  @build_times = []
  @classifiers.each do |classifier|
    @build_times << Benchmark.measure { classifier.build data_set }
  end
  self
end
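
Each entry stored in build_times is the Benchmark::Tms object returned by Benchmark.measure, so wall-clock time is available through its real reader. The following is a minimal sketch of that pattern using a hypothetical stub classifier (SlowStub) and a hypothetical helper (measure_builds) that mirrors the loop above; neither name is part of ai4r.

```ruby
require 'benchmark'

# Hypothetical stub: any object that responds to #build would do.
class SlowStub
  def initialize(delay)
    @delay = delay
  end

  def build(_data_set)
    sleep(@delay) # simulate training work
    self
  end
end

# Mirrors the loop in #build: one Benchmark::Tms per classifier, in order.
def measure_builds(classifiers, data_set)
  classifiers.map do |classifier|
    Benchmark.measure { classifier.build(data_set) }
  end
end

build_times = measure_builds([SlowStub.new(0.01), SlowStub.new(0.03)], [])
build_times.each_with_index do |tms, idx|
  # Tms#real is elapsed wall-clock seconds; Tms#total is CPU time.
  puts format('classifier %d: %.3f s', idx, tms.real)
end
```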

#cross_validate(data_set, k:) ⇒ Ai4r::Data::DataSet

Perform k-fold cross validation on all classifiers. The dataset is split into k folds using the Split utility. For each fold, classifiers are trained on the remaining folds and then tested on the held-out fold. The method returns a DataSet with the average time (build and test) and accuracy for each classifier.

Parameters:

  • data_set (Ai4r::Data::DataSet)
  • k (Integer)

Returns:

  • (Ai4r::Data::DataSet)


# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 92

def cross_validate(data_set, k:)
  folds = Split.split(data_set, k: k)
  times = Array.new(@classifiers.length, 0.0)
  accuracies = Array.new(@classifiers.length, 0.0)

  folds.each_with_index do |test_set, i|
    train_items = []
    folds.each_with_index do |fold, j|
      next if i == j

      train_items.concat(fold.data_items)
    end
    train_set = Ai4r::Data::DataSet.new(
      data_items: train_items,
      data_labels: data_set.data_labels
    )

    @classifiers.each_with_index do |classifier, idx|
      build_time = Benchmark.measure { classifier.build(train_set) }.real
      result = test_classifier(classifier, test_set)
      times[idx] += build_time + result[1]
      accuracies[idx] += result[3]
    end
  end

  result_items = @classifiers.each_index.map do |idx|
    [@classifiers[idx], times[idx] / k, accuracies[idx] / k]
  end
  Ai4r::Data::DataSet.new(
    data_items: result_items,
    data_labels: ['Classifier', 'Avg. Time', 'Avg. Success rate']
  )
end
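
The fold handling can be sketched with plain arrays. The Split utility is not shown on this page, so k_folds below is a hypothetical round-robin substitute, and each_train_test is an illustrative helper; only the idea of training on every fold except the held-out one comes from the method above.

```ruby
# Hypothetical helper: deals items round-robin into k folds. The real Split
# utility may shuffle or partition differently.
def k_folds(items, k)
  folds = Array.new(k) { [] }
  items.each_with_index { |item, i| folds[i % k] << item }
  folds
end

# Yields [train_items, test_items] once per fold: the held-out fold is the
# test set and all remaining folds are concatenated into the training set.
def each_train_test(folds)
  return enum_for(:each_train_test, folds) unless block_given?

  folds.each_with_index do |test_items, i|
    train_items = folds.each_with_index.reject { |_, j| j == i }.flat_map(&:first)
    yield train_items, test_items
  end
end

items = (1..6).to_a
each_train_test(k_folds(items, 3)) do |train, test|
  puts "train=#{train.inspect} test=#{test.inspect}"
end
```

Every item lands in exactly one test fold, so each classifier is scored once on every example, and the per-fold times and success rates are then divided by k to get the averages.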

#eval(data) ⇒ Object

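Evaluates new data with every classifier, predicting its class. The result holds one prediction per classifier, in the order the classifiers were added. For example: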

evaluator.eval(['New York',  '<30', 'F'])
=> ['Y', 'Y', 'Y', 'N', 'Y', 'Y', 'N']

Evaluation times are measured separately for each classifier and can be read through the eval_times attribute reader.

Parameters:

  • data (Object)

Returns:

  • (Object)


# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 56

def eval(data)
  @eval_times = []
  results = []
  @classifiers.each do |classifier|
    @eval_times << Benchmark.measure { results << classifier.eval(data) }
  end
  results
end
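
A minimal sketch of how the predictions and timings line up, assuming two hypothetical stub classifiers (ConstantStub) that always return a fixed label. The eval_all helper is illustrative and mirrors the loop above; it is not part of ai4r.

```ruby
require 'benchmark'

# Hypothetical stub that always predicts the same label.
ConstantStub = Struct.new(:label) do
  def eval(_data)
    label
  end
end

# Mirrors #eval: collect one prediction and one timing per classifier.
def eval_all(classifiers, data)
  results = []
  eval_times = classifiers.map do |classifier|
    Benchmark.measure { results << classifier.eval(data) }
  end
  [results, eval_times]
end

stubs = [ConstantStub.new('Y'), ConstantStub.new('N')]
results, eval_times = eval_all(stubs, ['New York', '<30', 'F'])
p results
puts eval_times.map(&:real).inspect
```

The prediction at index i and the timing at index i both belong to the i-th classifier.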

#test(data_set) ⇒ Object

Tests the classifiers using a data set. The last attribute of each item is treated as the expected class. Every data item is evaluated by all classifiers, and the evaluation time, number of classification errors, and success rate are returned in a data set. The returned data set has one row per classifier tested, with the following attributes:

["Classifier", "Testing Time", "Errors", "Success rate"]

Parameters:

  • data_set (Object)

Returns:

  • (Object)


# File 'lib/ai4r/experiment/classifier_evaluator.rb', line 74

def test(data_set)
  result_data_items = @classifiers.map do |classifier|
    test_classifier(classifier, data_set)
  end

  Ai4r::Data::DataSet.new(data_items: result_data_items,
                          data_labels: ['Classifier',
                                        'Testing Time', 'Errors', 'Success rate'])
end
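
The private helper test_classifier is not shown on this page. The sketch below is one plausible way a result row could be computed, assuming the row layout listed above (classifier, testing time, errors, success rate). The names score_classifier and CityStub are hypothetical, and the actual helper may differ.

```ruby
require 'benchmark'

# Hypothetical stub: predicts 'Y' when the first attribute is 'New York'.
class CityStub
  def eval(attributes)
    attributes.first == 'New York' ? 'Y' : 'N'
  end
end

# Assumed shape of one result row; the real private helper is not shown here.
def score_classifier(classifier, data_items)
  errors = 0
  tms = Benchmark.measure do
    data_items.each do |item|
      *attributes, expected = item # last attribute is the expected class
      errors += 1 unless classifier.eval(attributes) == expected
    end
  end
  total = data_items.length
  success_rate = total.zero? ? 0.0 : (total - errors).to_f / total
  [classifier, tms.real, errors, success_rate]
end

items = [
  ['New York', '<30', 'Y'],
  ['Chicago', '<30', 'N'],
  ['Chicago', '>30', 'Y']
]
_classifier, _time, errors, rate = score_classifier(CityStub.new, items)
puts "errors=#{errors} success_rate=#{rate.round(2)}"
```

With this layout, index 1 holds the testing time and index 3 the success rate, which matches how cross_validate reads result[1] and result[3].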