Class: Company::Mapping::InverseDocumentFrequency

Inherits:
Object
Defined in:
lib/company/mapping/tfidf/idf/inverse_document_frequency.rb

Overview

InverseDocumentFrequency provides the basic implementation of inverse document frequency (IDF). IDF is the logarithmically scaled inverse fraction of the documents that contain a token: the total number of documents is divided by the number of documents containing the token, and the logarithm of that quotient is taken.
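As a minimal sketch of the definition above, the following computes IDF by hand for a tiny corpus. The corpus layout (an array of token arrays) and the local variable names are assumptions made for illustration; the class itself obtains document frequencies from a `document_frequency` method that is not shown on this page.

```ruby
# Hypothetical toy corpus: each document is an array of tokens.
corpus = [
  %w[apple banana],
  %w[apple cherry],
  %w[apple banana date]
]

# Document frequency: the number of documents that contain each token.
document_frequency = Hash.new(0)
corpus.each do |tokens|
  tokens.uniq.each { |token| document_frequency[token] += 1 }
end

# idf(token) = ln(N / df(token)), computed with a Float quotient.
idf = document_frequency.transform_values do |df|
  Math.log(corpus.size.to_f / df)
end

idf.each { |token, value| puts format('%-6s %.4f', token, value) }
```

A token that occurs in every document ("apple" here) gets an IDF of 0, while tokens confined to a single document get the largest value.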


Constructor Details

#initialize(corpus) ⇒ InverseDocumentFrequency

Returns a new instance of InverseDocumentFrequency.



# File 'lib/company/mapping/tfidf/idf/inverse_document_frequency.rb', line 9

def initialize(corpus)
  @corpus = corpus
end

Instance Method Details

#calculate ⇒ Object

Calculates the basic Inverse Document Frequency of each token contained in a corpus of documents.



# File 'lib/company/mapping/tfidf/idf/inverse_document_frequency.rb', line 14

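def calculate
  document_frequency.each_with_object({}) do |(word, freq), idf|
    # Use Float division: Integer / Integer would truncate the quotient
    # before the logarithm is taken.
    idf[word] = Math.log(@corpus.size.to_f / freq)
  end
end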
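In Ruby, dividing one Integer by another truncates the result, so the quotient passed to `Math.log` needs at least one Float operand for the IDF to be meaningful. The short sketch below shows the difference; `corpus_size` and `freq` are hypothetical stand-ins for `@corpus.size` and a document frequency, assuming both are Integers.

```ruby
corpus_size = 3  # hypothetical number of documents
freq = 2         # hypothetical document frequency of one token

# Integer division: 3 / 2 evaluates to 1, and the log of 1 is 0.0.
truncated = Math.log(corpus_size / freq)

# Float division: 3.0 / 2 evaluates to 1.5, giving a positive IDF.
exact = Math.log(corpus_size.to_f / freq)

puts truncated
puts exact
```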

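#maxIDF ⇒ Object

Returns the largest IDF value possible for this corpus: the natural logarithm of the corpus size, which is the IDF of a token that appears in exactly one document.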



# File 'lib/company/mapping/tfidf/idf/inverse_document_frequency.rb', line 20

def maxIDF
  Math.log(@corpus.size * 1.0)
end
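The value returned by `maxIDF` is an upper bound on every IDF in the corpus, reached when a token occurs in a single document. One possible use, which is an assumption here and not something this page states, is scaling IDF values into the range 0..1. A minimal sketch with a hypothetical corpus size:

```ruby
corpus_size = 4  # hypothetical number of documents

# Same expression as maxIDF, with a local variable in place of @corpus.size.
max_idf = Math.log(corpus_size * 1.0)

# IDF for every possible document frequency, from 1 up to corpus_size.
idfs = (1..corpus_size).map { |df| Math.log(corpus_size.to_f / df) }

# Dividing by the maximum maps the rarest token to 1.0 and a token
# present in every document to 0.0.
normalized = idfs.map { |value| value / max_idf }

puts normalized.inspect
```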