Class: Hoatzin::FeatureVector::Builder
- Inherits: Object
  - Object
  - Hoatzin::FeatureVector::Builder
- Defined in: lib/feature_vector/builder.rb
Overview
An algebraic model for representing text documents as vectors of identifiers. Each document is represented as a vector in which every dimension corresponds to a separate term. If a term occurs in the document, its value in the vector is non-zero.
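The representation described above can be sketched in a few lines of plain Ruby. This is a minimal illustration, not the Builder's actual implementation: the helpers `tokenize`, `keyword_index`, and `term_vector` are hypothetical names, the tokenizer is a simple stand-in for whatever parser is configured, and raw term counts are assumed as the non-zero values.

```ruby
# Illustrative only: these helpers are hypothetical and are not part of Hoatzin.

# Split text into lowercase word tokens (a stand-in for a real parser).
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Assign each distinct term across all documents to one vector dimension.
def keyword_index(documents)
  terms = documents.flat_map { |doc| tokenize(doc) }.uniq.sort
  terms.each_with_index.to_h
end

# Build a term-count vector: non-zero wherever the term occurs in the text.
# Terms missing from the index are ignored (an assumption of this sketch).
def term_vector(text, index)
  vector = Array.new(index.size, 0)
  tokenize(text).each do |term|
    position = index[term]
    vector[position] += 1 if position
  end
  vector
end

documents = ["the cat sat", "the dog sat down"]
index = keyword_index(documents)
matrix = documents.map { |doc| term_vector(doc, index) }
# index  => {"cat"=>0, "dog"=>1, "down"=>2, "sat"=>3, "the"=>4}
# matrix => [[1, 0, 0, 1, 1], [0, 1, 1, 1, 1]]
```

Each row of `matrix` is one document and each column is one term, so a query can be turned into a vector over the same index and compared against the rows.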
Instance Attribute Summary
- #vector_keyword_index ⇒ Object
  Returns the value of attribute vector_keyword_index.
Instance Method Summary
- #build_document_matrix(documents) ⇒ Object
- #build_query_vector(text) ⇒ Object
- #initialize(options = {}) ⇒ Builder (constructor)
  A new instance of Builder.
- #marshal_dump ⇒ Object
- #marshal_load(ary) ⇒ Object
Constructor Details
#initialize(options = {}) ⇒ Builder
Returns a new instance of Builder.
# File 'lib/feature_vector/builder.rb', line 12
def initialize(options={})
  @parser = options.delete(:parser)
  @options = options
  @parsed_document_cache = []
end
Instance Attribute Details
#vector_keyword_index ⇒ Object
Returns the value of attribute vector_keyword_index.
# File 'lib/feature_vector/builder.rb', line 10
def vector_keyword_index
  @vector_keyword_index
end
Instance Method Details
#build_document_matrix(documents) ⇒ Object
# File 'lib/feature_vector/builder.rb', line 18
def build_document_matrix(documents)
  @vector_keyword_index = build_vector_keyword_index(documents)
  document_matrix = []
  document_matrix += documents.enum_for(:each_with_index).map{|document,document_id|
    build_vector(document, document_id)}
  Model.new(document_matrix, @vector_keyword_index)
end
#build_query_vector(text) ⇒ Object
# File 'lib/feature_vector/builder.rb', line 27
def build_query_vector(text)
  build_vector(text)
end
#marshal_dump ⇒ Object
# File 'lib/feature_vector/builder.rb', line 31
def marshal_dump
  [@parser, @options, @parsed_document_cache, @vector_keyword_index]
end
#marshal_load(ary) ⇒ Object
# File 'lib/feature_vector/builder.rb', line 35
def marshal_load(ary)
  @parser, @options, @parsed_document_cache, @vector_keyword_index = ary
end
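Ruby's Marshal module calls #marshal_dump when serializing an object and #marshal_load when restoring it, so the pair above controls exactly which instance variables survive a round trip. The following sketch shows the same pattern on a small stand-in class; `CachedIndex` and its fields are hypothetical and exist only for illustration, not as part of Hoatzin.

```ruby
# Hypothetical class, for illustration only; it is not part of Hoatzin.
class CachedIndex
  attr_reader :index, :cache

  def initialize
    @index = {}
    @cache = []
    @scratch = :temporary # transient state, deliberately left out of the dump
  end

  # Record a term, assigning it the next free dimension if it is new.
  def add(term)
    @index[term] ||= @index.size
    @cache << term
    self
  end

  # Called by Marshal.dump: return only the state worth persisting.
  def marshal_dump
    [@index, @cache]
  end

  # Called by Marshal.load on a freshly allocated object (initialize is
  # not run), so every persisted instance variable must be restored here.
  def marshal_load(ary)
    @index, @cache = ary
  end
end

original = CachedIndex.new.add("cat").add("dog")
restored = Marshal.load(Marshal.dump(original))
```

Because `Marshal.load` does not call `initialize`, any state omitted from the dumped array (here `@scratch`) is simply absent on the restored object.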