Class: RSemantic::Corpus
- Inherits: Object
  - Object
  - RSemantic::Corpus
- Defined in: lib/rsemantic/corpus.rb
Instance Attribute Summary
- #documents ⇒ Array<Document> readonly
Instance Method Summary
- #add_document(document) ⇒ void (also: #<<)
  Adds a new document to the index.
- #build_index ⇒ void
  Build the index.
- #find_keywords(document, num = 5) ⇒ Object
- #find_related_document(document) ⇒ Object
- #initialize(documents = [], options = {}) ⇒ Corpus (constructor)
  TODO document options.
- #search(*words) ⇒ Object
- #to_s ⇒ Object
Constructor Details
#initialize(documents = [], options = {}) ⇒ Corpus
TODO document options
    # File 'lib/rsemantic/corpus.rb', line 10
    def initialize(documents = [], options = {})
      @documents = documents
      @options = options
      @search = nil
    end
Instance Attribute Details
#documents ⇒ Array<Document> (readonly)
    # File 'lib/rsemantic/corpus.rb', line 4
    def documents
      @documents
    end
Instance Method Details
#add_document(document) ⇒ void Also known as: <<
This method returns an undefined value.
Adds a new document to the index.
    # File 'lib/rsemantic/corpus.rb', line 20
    def add_document(document)
      @documents << document
      document.corpora << self
    end
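As a minimal sketch of the two-way link that #add_document sets up, the following uses hypothetical stand-in classes. ToyDocument and ToyCorpus are not part of RSemantic; only the body of add_document mirrors the source shown above.

```ruby
# Hypothetical stand-ins; only the add_document body mirrors the source above.
ToyDocument = Struct.new(:text, :corpora)

class ToyCorpus
  attr_reader :documents

  def initialize(documents = [])
    @documents = documents
  end

  # Register the document and record this corpus on the document.
  def add_document(document)
    @documents << document
    document.corpora << self
  end
  alias_method :<<, :add_document
end

corpus = ToyCorpus.new
doc = ToyDocument.new("the cat sat on the mat", [])
corpus << doc

corpus.documents.include?(doc) # => true
doc.corpora.include?(corpus)   # => true
```

After the call, each side can reach the other: the corpus lists the document, and the document lists the corpus it belongs to.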
#build_index ⇒ void
This method returns an undefined value.
Build the index. This is required to be able to search for words or compute related documents.
If you add new documents, you have to rebuild the index.
    # File 'lib/rsemantic/corpus.rb', line 32
    def build_index
      @search = RSemantic::Search.new(@documents.map(&:text), @options)
    end
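The rebuild requirement follows from build_index taking a snapshot of the documents present when it is called. The sketch below illustrates that behaviour with a hypothetical ToyIndexedCorpus; its word set is only a placeholder for whatever RSemantic::Search actually builds, and the add and indexed? methods are invented for illustration.

```ruby
# Hypothetical stand-in: ToyIndexedCorpus is not RSemantic::Corpus, and the
# word set below is a placeholder for whatever RSemantic::Search builds.
require "set"

class ToyIndexedCorpus
  def initialize(texts = [])
    @texts = texts
    @index = nil
  end

  def add(text)
    @texts << text
  end

  # Snapshot the current texts; later additions are not included.
  def build_index
    @index = @texts.flat_map { |t| t.downcase.split }.to_set
  end

  def indexed?(word)
    raise "index not built" if @index.nil?
    @index.include?(word.downcase)
  end
end

corpus = ToyIndexedCorpus.new(["the cat sat"])
corpus.build_index
corpus.add("a dog barked")
stale = corpus.indexed?("dog") # false: the index predates the new text
corpus.build_index
fresh = corpus.indexed?("dog") # true after rebuilding
```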
#find_keywords(document, num = 5) ⇒ Object
    # File 'lib/rsemantic/corpus.rb', line 52
    def find_keywords(document, num = 5)
      # TODO allow limiting keywords to words that occur in this document
    end
#find_related_document(document) ⇒ Object
    # File 'lib/rsemantic/corpus.rb', line 45
    def find_related_document(document)
      @search.(@documents.index(document)).map.with_index { |result, index|
        document = @documents[index]
        RSemantic::SearchResult.new(document, result)
      }.sort
    end
#search(*words) ⇒ Object
    # File 'lib/rsemantic/corpus.rb', line 36
    def search(*words)
      # TODO raise if no index built yet
      results = @search.search(words)
      results.map.with_index { |result, index|
        document = @documents[index]
        RSemantic::SearchResult.new(document, result)
      }.sort
    end
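The body of #search pairs each raw result with the document at the same position and then sorts the pairs. The sketch below shows that pairing step in isolation. ToyResult is a hypothetical stand-in for RSemantic::SearchResult, and ordering by ascending score is an assumption, since the real class's comparison is not shown on this page; the scores are made up.

```ruby
# Hypothetical stand-in for RSemantic::SearchResult; ordering by score is an
# assumption, since the real class's <=> is not shown here.
ToyResult = Struct.new(:document, :score) do
  include Comparable

  def <=>(other)
    score <=> other.score
  end
end

documents = ["cats and dogs", "stock markets", "dogs and wolves"]
scores    = [0.9, 0.1, 0.6] # one made-up score per document, in corpus order

# Same shape as the body of #search: position i in scores maps to documents[i].
ranked = scores.map.with_index { |score, index|
  ToyResult.new(documents[index], score)
}.sort

ranked.map(&:document)
# => ["stock markets", "dogs and wolves", "cats and dogs"]
```

Because the pairing relies on position, the result list has to be in the same order as @documents, which is one reason the index must be rebuilt after documents are added.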
#to_s ⇒ Object
    # File 'lib/rsemantic/corpus.rb', line 57
    def to_s
      "#<%s %d documents, @options=%s>" % [self.class.name, @documents.size, @options.inspect]
    end
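To show what the format string in #to_s produces, here is a small sketch that applies it to hypothetical values (a corpus of two documents with an empty options hash) in place of self.class.name, @documents.size and @options.

```ruby
# Hypothetical values standing in for self.class.name, @documents.size and @options.
class_name = "RSemantic::Corpus"
document_count = 2
options = {}

summary = "#<%s %d documents, @options=%s>" % [class_name, document_count, options.inspect]
puts summary
# prints: #<RSemantic::Corpus 2 documents, @options={}>
```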