Class: Company::Mapping::InverseDocumentFrequency
- Inherits: Object
  - Object
  - Company::Mapping::InverseDocumentFrequency
- Defined in: lib/company/mapping/tfidf/idf/inverse_document_frequency.rb
Overview
InverseDocumentFrequency is the basic implementation of inverse document frequency (IDF). The IDF of a token is the logarithmically scaled inverse fraction of the documents that contain it: the total number of documents is divided by the number of documents containing the token, and the logarithm of that quotient is taken.
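The definition above can be sketched as a small standalone function. This is only an illustration under assumptions: `idf_sketch` is a hypothetical helper, not part of this library, and the corpus is assumed to be an array of token arrays, which may differ from the structure the class actually expects.

```ruby
# Hypothetical helper (not part of the library): computes log(N / df(t))
# for every token t, where N is the number of documents and df(t) is the
# number of documents that contain t.
def idf_sketch(corpus)
  # Count each token once per document, however often it repeats there.
  doc_freq = Hash.new(0)
  corpus.each do |tokens|
    tokens.uniq.each { |token| doc_freq[token] += 1 }
  end

  total = corpus.size.to_f # Float, so the division below does not truncate
  doc_freq.each_with_object({}) do |(token, freq), idf|
    idf[token] = Math.log(total / freq) # Math.log is the natural logarithm
  end
end

# Assumed corpus shape: one array of tokens per document.
corpus = [%w[acme corp], %w[acme ltd], %w[globex corp ltd]]
scores = idf_sketch(corpus)
```

Here `acme` occurs in two of three documents, so its score is `Math.log(1.5)`, while `globex` occurs in only one and scores `Math.log(3.0)`: rarer tokens get higher values.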
Instance Method Summary
- #calculate ⇒ Object
  Calculates the basic Inverse Document Frequency of each token contained in a corpus of documents.
- #initialize(corpus) ⇒ InverseDocumentFrequency (constructor)
  A new instance of InverseDocumentFrequency.
- #maxIDF ⇒ Object
Constructor Details
#initialize(corpus) ⇒ InverseDocumentFrequency
Returns a new instance of InverseDocumentFrequency.
# File 'lib/company/mapping/tfidf/idf/inverse_document_frequency.rb', line 9
def initialize(corpus)
  @corpus = corpus
end
Instance Method Details
#calculate ⇒ Object
Calculates the basic Inverse Document Frequency of each token contained in a corpus of documents.
# File 'lib/company/mapping/tfidf/idf/inverse_document_frequency.rb', line 14
def calculate
  document_frequency.each_with_object({}) do |(word, freq), idf|
    # to_f guards against Integer division truncating the quotient
    # when freq is an Integer
    idf[word] = Math.log(@corpus.size.to_f / freq)
  end
end
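One detail of the quotient is worth illustrating: in Ruby, dividing one Integer by another truncates the result, so the value passed to `Math.log` needs to be a Float for the score to be meaningful. The numbers below are made up for illustration, and whether the frequencies in this class are Integers is an assumption, since `document_frequency` is not shown on this page.

```ruby
# Illustrative values only: a corpus of 3 documents, a token found in 2.
corpus_size = 3
freq = 2

# Integer / Integer truncates: 3 / 2 == 1, and Math.log(1) == 0.0
truncated = Math.log(corpus_size / freq)

# Converting one operand to Float keeps the fraction: 3.0 / 2 == 1.5
exact = Math.log(corpus_size.to_f / freq)
```

With truncation, every token that appears in more than half of the documents would end up with a score of exactly 0.0, the same as a token that appears in all of them.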
#maxIDF ⇒ Object
# File 'lib/company/mapping/tfidf/idf/inverse_document_frequency.rb', line 20
def maxIDF
  Math.log(@corpus.size * 1.0)
end
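Judging from its body, `maxIDF` returns the logarithm of the corpus size, which is the IDF of a token that occurs in exactly one document and therefore the largest value the basic formula can produce. A minimal sketch of that bound, using a made-up corpus size rather than a real corpus object:

```ruby
# Illustrative corpus size, not taken from the library.
corpus_size = 4

# Same expression as in maxIDF, with a plain Integer in place of @corpus.size.
max_idf = Math.log(corpus_size * 1.0)

# IDF for every possible document frequency, from 1 up to corpus_size.
idfs = (1..corpus_size).map { |freq| Math.log(corpus_size.to_f / freq) }
```

The largest entry of `idfs` is the one for `freq == 1` and equals `max_idf`; the smallest is the one for a token present in every document, which is `0.0`.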