Class: Basset::Document

Inherits:
Object
Defined in:
lib/basset/document.rb

Overview

Represents a document as a vector of features. The constructor takes the document's text and, optionally, its classification. The feature vector is a basic bag-of-words representation.
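
As a rough illustration of the bag-of-words idea mentioned above, the sketch below counts how often each word occurs in a text and discards word order. It is not Basset's implementation: the helper name `bag_of_words` and the tokenization rule (lowercase, letters only) are assumptions made for this example.

```ruby
# Minimal bag-of-words sketch; NOT Basset's actual implementation.
# `bag_of_words` is a hypothetical helper name used only here.
def bag_of_words(text)
  # Assumption: lowercase the text and treat runs of letters as words.
  words = text.downcase.scan(/[a-z]+/)
  # Count occurrences of each distinct word; order is discarded.
  words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
end

counts = bag_of_words("The cat chased the other cat")
```

Here `counts` maps each distinct word to its frequency, which is the kind of term-count information a feature vector can be built from.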

Direct Known Subclasses

DocumentOverrideExample

Constructor Details

#initialize(text, classification = nil) ⇒ Document

Returns a new instance of Document.



# File 'lib/basset/document.rb', line 9

def initialize(text, classification = nil)
  @text           = text
  @classification = classification
end

Instance Attribute Details

#classification ⇒ Object (readonly)

Returns the value of attribute classification.



# File 'lib/basset/document.rb', line 7

def classification
  @classification
end

#text ⇒ Object (readonly)

Returns the value of attribute text.



# File 'lib/basset/document.rb', line 7

def text
  @text
end

Instance Method Details

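#vector_of_features ⇒ Object

Returns the document's feature vector. The vector is computed on the first call and cached in @feature_vector for later calls.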



# File 'lib/basset/document.rb', line 14

def vector_of_features
  @feature_vector ||= vector_of_features_from_terms_hash(terms_hash_from_words_array(stemmed_words))
end
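
The helper methods called by vector_of_features are not shown on this page. The sketch below uses a stand-in class, `SketchDocument` (a hypothetical name), to show how the memoized pipeline could fit together. The helper bodies are assumptions: plain lowercase tokenization stands in for real stemming, and each feature is represented as a simple `[term, count]` pair, which may differ from what Basset actually returns.

```ruby
# Stand-in class for illustration only; the helper bodies below are
# assumptions and are NOT taken from Basset's source.
class SketchDocument
  attr_reader :text, :classification

  def initialize(text, classification = nil)
    @text           = text
    @classification = classification
  end

  # Computed on the first call, then cached in @feature_vector by ||=.
  def vector_of_features
    @feature_vector ||= vector_of_features_from_terms_hash(terms_hash_from_words_array(stemmed_words))
  end

  private

  # Assumption: lowercase tokenization stands in for real stemming.
  def stemmed_words
    text.downcase.scan(/[a-z]+/)
  end

  # Counts how many times each word occurs.
  def terms_hash_from_words_array(words)
    words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
  end

  # Assumption: each feature is a [term, count] pair, sorted by term.
  def vector_of_features_from_terms_hash(terms)
    terms.sort
  end
end

doc    = SketchDocument.new("Spam spam eggs", :breakfast)
first  = doc.vector_of_features
second = doc.vector_of_features # served from the cache, same object as `first`
```

Because `||=` only assigns when the instance variable is nil or false, the pipeline runs once per document in this sketch; repeated calls return the same cached array.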