Class: Company::Mapping::CompanyMapper
- Inherits:
- Object (Object → Company::Mapping::CompanyMapper)
- Defined in:
- lib/company/mapping/company_mapper.rb
Overview
Given a corpus of documents that contain company names, CompanyMapper maps a new document to an existing document in the corpus, if a sufficiently similar one exists.
Instance Method Summary
- #initialize(corpus) ⇒ CompanyMapper (constructor)
Returns a new instance of CompanyMapper.
- #map(company, threshold) ⇒ Object
Maps a given company to a company that exists in the given corpus.
Constructor Details
#initialize(corpus) ⇒ CompanyMapper
Returns a new instance of CompanyMapper.
# File 'lib/company/mapping/company_mapper.rb', line 8
def initialize(corpus)
  @corpus = corpus
  @tfidf = TFIDF.new(@corpus)
  @tfidf.calculate
end
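The constructor precomputes TF-IDF weights for every document in the corpus. The listing does not show how `TFIDF#calculate` works internally, so the following is only a minimal sketch of one common TF-IDF formulation. `Doc` and `tfidf_weights` are hypothetical names introduced for illustration and are not part of the library. The only assumption taken from the source is that corpus documents expose `id` and `contents`.

```ruby
# Hypothetical stand-in for a corpus document; the library's TextDocument
# is only known (from #map) to expose `id` and `contents`.
Doc = Struct.new(:id, :contents)

# Computes a TF-IDF weight per term for every document in the corpus.
# Returns { doc_id => { term => weight } }.
# This is an illustrative formulation, not the library's implementation.
def tfidf_weights(corpus)
  tokenized = corpus.to_h { |d| [d.id, d.contents.downcase.split] }
  doc_count = tokenized.size.to_f

  # Document frequency: in how many documents each term occurs.
  df = Hash.new(0)
  tokenized.each_value { |tokens| tokens.uniq.each { |t| df[t] += 1 } }

  tokenized.transform_values do |tokens|
    tokens.tally.to_h do |term, count|
      tf = count / tokens.size.to_f
      idf = Math.log(doc_count / df[term])
      [term, tf * idf]
    end
  end
end

corpus = [
  Doc.new("1_acme", "Acme Holdings Ltd"),
  Doc.new("2_globex", "Globex Ltd")
]
weights = tfidf_weights(corpus)
```

In this sketch a term that occurs in every document (here "ltd") gets weight 0, while a term unique to one document gets a positive weight, which is why distinctive name tokens dominate the similarity score.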
Instance Method Details
#map(company, threshold) ⇒ Object
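Maps a given company to a company that exists in the given corpus. The company may be passed as a String, in which case it is wrapped in a TextDocument with the id "new_comp", or as a document object. If the maximum name similarity found exceeds the given threshold, the matched company’s id is returned as an integer; otherwise nil is returned.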
# File 'lib/company/mapping/company_mapper.rb', line 16
def map(company, threshold)
  if (company.is_a? String)
    content = company
    company = TextDocument.new
    company.contents = content
    company.id = "new_comp"
  end

  @tfidf.calculate_tfidf_weights_of_new_document(company)

  maxSim = 0.0
  mapped_company = ""

  @corpus.each do |d|
    similarity = @tfidf.similarity(d.id, company.id)
    next unless maxSim < similarity
    maxSim = similarity
    mapped_company = d.id
    break if maxSim == 1
  end

  return unless maxSim > threshold
  mapped_company.to_s.sub(/\_.*/, "").to_i
end
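The matching logic above (pick the most similar corpus document, accept it only if its similarity strictly exceeds the threshold, then reduce its id to the leading integer) can be sketched in isolation. This is a hedged illustration, not the library's code: `Doc`, `token_similarity`, and `best_match` are hypothetical names, and a simple token-overlap score stands in for the TF-IDF similarity that CompanyMapper actually uses.

```ruby
# Hypothetical stand-in for a corpus document (id + contents).
Doc = Struct.new(:id, :contents)

# Stand-in similarity: Jaccard overlap of lowercase tokens, in 0.0..1.0.
# The library uses a TF-IDF based similarity instead.
def token_similarity(a, b)
  x = a.downcase.split.uniq
  y = b.downcase.split.uniq
  union = (x | y).size
  union.zero? ? 0.0 : (x & y).size.to_f / union
end

# Returns the integer id of the best match, or nil when the best
# similarity does not strictly exceed the threshold.
def best_match(corpus, name, threshold)
  best = corpus.max_by { |d| token_similarity(d.contents, name) }
  return nil if best.nil?
  return nil unless token_similarity(best.contents, name) > threshold

  # Ids such as "42_acme" are reduced to the leading integer (42),
  # mirroring the sub(/_.*/, "").to_i step in #map.
  best.id.to_s.sub(/_.*/, "").to_i
end

corpus = [
  Doc.new("42_acme", "Acme Holdings Ltd"),
  Doc.new("7_globex", "Globex Corporation")
]

best_match(corpus, "acme holdings", 0.5) # => 42
best_match(corpus, "initech", 0.5)       # => nil
```

Because the comparison is strict, a best similarity exactly equal to the threshold yields nil, so callers should check for nil before using the result as an id.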