Class: Classifier
- Inherits:
- Object
- Defined in:
- lib/rbbt/bow/classifier.rb
Overview
This class uses R to build and use classification models. It needs the ‘e1071’ R package.
Instance Attribute Summary
- #terms ⇒ Object (readonly)
  Returns the value of attribute terms.
Class Method Summary
- .create_model(featuresfile, modelfile, dictfile = nil) ⇒ Object
  Given the path to a features file, which lists instances along with their classes and features in a tab-separated format, uses R to build an SVM model and saves it to the path given as modelfile.
Instance Method Summary
- #classify(input) ⇒ Object
  This is a polymorphic method.
- #classify_feature_array(input) ⇒ Object
  :nodoc:
- #classify_feature_hash(input) ⇒ Object
  :nodoc:
- #classify_text_array(input) ⇒ Object
  :nodoc:
- #classify_text_hash(input) ⇒ Object
  :nodoc:
- #initialize(modelfile) ⇒ Classifier (constructor)
  Loads an R interpreter, which in turn loads the SVM model stored in modelfile.
Constructor Details
#initialize(modelfile) ⇒ Classifier
Loads an R interpreter, which in turn loads the SVM model stored in modelfile.
    # File 'lib/rbbt/bow/classifier.rb', line 25

    def initialize(modelfile)
      @r = RSRuby.instance
      @r.library('e1071')
      @r.source(File.join(Rbbt.datadir, 'classifier/R/classify.R'))
      @r.load(modelfile)

      @model = @r.svm_model
      @terms = @r.eval_R("terms = unlist(attr(attr(svm.model$terms,'factors'),'dimnames')[2])")
    end
Instance Attribute Details
#terms ⇒ Object (readonly)
Returns the value of attribute terms.
    # File 'lib/rbbt/bow/classifier.rb', line 22

    def terms
      @terms
    end
Class Method Details
.create_model(featuresfile, modelfile, dictfile = nil) ⇒ Object
Given the path to a features file, which lists instances along with their classes and features in a tab-separated format, uses R to build an SVM model and saves it to the path given as modelfile. The dictfile argument is accepted but is not used in the method body.
    # File 'lib/rbbt/bow/classifier.rb', line 13

    def self.create_model(featuresfile, modelfile, dictfile = nil)
      r = RSRuby.instance
      r.source(File.join(Rbbt.datadir, 'classifier/R/classify.R'))
      r.BOW_classification_model(featuresfile, modelfile)

      nil
    end
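As a rough illustration of what a tab-separated features file might look like, the sketch below writes one with plain Ruby. The column layout (a header row, then a name, a class label and one count per term on each row) is an assumption: the layout that classify.R actually expects is not shown on this page. The names features_tsv, TERMS and INSTANCES, and the sample data, are hypothetical.

```ruby
require 'tempfile'

# Hypothetical vocabulary and instances; the values are made up.
TERMS = %w[gene protein cell]
INSTANCES = [
  ['doc1', '+', [2, 0, 1]],
  ['doc2', '-', [0, 1, 0]]
]

# Builds the file contents. ASSUMED layout: a header row, then one row
# per instance as name <TAB> class <TAB> one count per term.
def features_tsv(terms, instances)
  header = (['Name', 'Class'] + terms).join("\t")
  rows = instances.map do |name, klass, counts|
    ([name, klass] + counts).join("\t")
  end
  ([header] + rows).join("\n") + "\n"
end

FEATURES_TSV = features_tsv(TERMS, INSTANCES)

# Write the contents to a temporary file and read them back.
WRITTEN = Tempfile.create('features') do |f|
  f.write(FEATURES_TSV)
  f.flush
  File.read(f.path)
end
```

With rbbt, RSRuby and the e1071 package installed, the path of such a file would be passed as featuresfile to Classifier.create_model, together with the path where the model should be saved.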
Instance Method Details
#classify(input) ⇒ Object
This is a polymorphic method. The input may be a single input, in which case the result is just its class; a hash of inputs, in which case the result is a hash with the result for each input; or an array of inputs, in which case the result is an array of results in the same order. Each input may be a string, in which case it is first transformed into a feature vector, or an array, in which case it is treated as a feature vector itself. An empty hash or array is returned as an empty hash or array.
    # File 'lib/rbbt/bow/classifier.rb', line 90

    def classify(input)
      if input.is_a? String
        return classify_text_array([input]).first
      end

      if input.is_a? Hash
        return {} if input.empty?
        if input.values.first.is_a? String
          return classify_text_hash(input)
        elsif input.values.first.is_a? Array
          return classify_feature_hash(input)
        end
      end

      if input.is_a? Array
        return [] if input.empty?
        if input.first.is_a? String
          return classify_text_array(input)
        elsif input.first.is_a? Array
          return classify_feature_array(input)
        end
      end
    end
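The dispatch in the listing above can be followed without R by a stand-in that only reports which branch a given input would take. This is a minimal sketch, not part of the library: the method name input_shape and the returned symbols are hypothetical.

```ruby
# Mirrors the type checks in Classifier#classify, but returns a symbol
# naming the branch instead of calling into R.
def input_shape(input)
  case input
  when String
    :single_text
  when Hash
    return :empty_hash if input.empty?
    case input.values.first
    when String then :text_hash
    when Array  then :feature_hash
    end
  when Array
    return :empty_array if input.empty?
    case input.first
    when String then :text_array
    when Array  then :feature_array
    end
  end
end

input_shape('some abstract text')        # => :single_text
input_shape({ 'doc1' => 'some text' })   # => :text_hash
input_shape([[1, 0, 2], [0, 1, 0]])      # => :feature_array
input_shape([1, 0, 2])                   # => nil
```

The last call shows a consequence of the checks as written: a bare array of numbers matches neither inner branch, so the method falls through and returns nil. A single feature vector therefore seems to need wrapping in an outer array.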
#classify_feature_array(input) ⇒ Object
:nodoc:
    # File 'lib/rbbt/bow/classifier.rb', line 36

    def classify_feature_array(input) #:nodoc:
      @r.assign('input', input)
      @r.eval_R('input = t(as.data.frame(input))')
      @r.eval_R('rownames(input) <- NULL')
      @r.eval_R('colnames(input) <- terms')

      results = @r.eval_R('BOW.classification.classify(svm.model, input, svm.weights)')
      results.sort.collect{|p| p[1]}
    end
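The last line of the listing above turns the value returned from R into an ordered array. The sketch below shows that step in isolation, assuming the value is a Ruby hash that maps row labels to predicted classes. That shape, the string labels and the class names are assumptions, and ordered_predictions is a hypothetical name.

```ruby
# Sorts the [label, prediction] pairs by label and keeps the predictions.
def ordered_predictions(results)
  results.sort.collect { |pair| pair[1] }
end

ordered_predictions({ '2' => '-', '1' => '+', '3' => '+' })
# => ["+", "-", "+"]
```

If the labels do arrive as strings, Hash#sort compares them lexicographically, so '10' sorts before '2'. With ten or more rows the output order could then differ from the input order, which may be worth checking against real output.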
#classify_feature_hash(input) ⇒ Object
:nodoc:
    # File 'lib/rbbt/bow/classifier.rb', line 47

    def classify_feature_hash(input) #:nodoc:
      names = []
      features = []
      input.each{|name, feats|
        names << name.to_s
        features << feats
      }

      @r.assign('input', features)
      @r.assign('input.names', names)

      @r.eval_R('input = t(as.data.frame(input))')
      @r.eval_R('rownames(input) <- input.names')
      @r.eval_R('colnames(input) <- terms')

      @r.eval_R('BOW.classification.classify(svm.model, input, svm.weights)')
    end
#classify_text_array(input) ⇒ Object
:nodoc:
    # File 'lib/rbbt/bow/classifier.rb', line 65

    def classify_text_array(input) #:nodoc:
      features = input.collect{|text|
        BagOfWords.features(text, @terms)
      }

      classify_feature_array(features)
    end
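BagOfWords.features is not documented on this page. As a rough idea of what turning a text into a feature vector over a fixed term list can look like, the sketch below counts how often each term occurs. The method term_counts is a hypothetical stand-in, and the real implementation may tokenize, normalize or weight terms differently.

```ruby
# Hypothetical stand-in for BagOfWords.features: returns one count per
# term, in the same order as the terms array.
def term_counts(text, terms)
  # Lowercase the text and split it into alphanumeric tokens.
  words = text.downcase.scan(/[a-z0-9]+/)
  counts = Hash.new(0)
  words.each { |word| counts[word] += 1 }
  terms.map { |term| counts[term] }
end

term_counts('The gene encodes a protein; the protein binds DNA.',
            %w[gene protein cell])
# => [1, 2, 0]
```

The vector has one entry per term and follows the order of the term list, which is consistent with the feature listings assigning terms as the column names of the input matrix.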
#classify_text_hash(input) ⇒ Object
:nodoc:
    # File 'lib/rbbt/bow/classifier.rb', line 73

    def classify_text_hash(input) #:nodoc:
      features = {}
      input.each{|key,text|
        features[key] = BagOfWords.features(text, @terms)
      }

      classify_feature_hash(features)
    end