Class: Treat::Workers::Extractors::Similarity::JaroWinkler

Inherits:
Object
Defined in:
lib/treat/workers/extractors/similarity/jaro_winkler.rb

Overview

Similarity measure for short strings such as person names. The C extension does not work with Unicode strings; in that case, set the :implementation option to :pure (known issue, to be fixed).
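To make the measure concrete, here is a minimal pure-Ruby sketch of the standard Jaro-Winkler similarity. It is an illustration only, not the code that the FuzzyStringMatch library runs; the method names `jaro` and `jaro_winkler`, the prefix scale of 0.1, and the 4-character prefix cap are assumptions taken from the textbook definition. Because it iterates over `String#chars`, it compares characters rather than bytes.

```ruby
# Illustrative pure-Ruby Jaro-Winkler similarity (textbook definition).
# Not the library's implementation; names and defaults are assumptions.

def jaro(a, b)
  return 1.0 if a == b
  return 0.0 if a.empty? || b.empty?

  a_chars = a.chars
  b_chars = b.chars
  # Characters match only if they are within this distance of each other.
  window = [[a_chars.size, b_chars.size].max / 2 - 1, 0].max

  a_matched = Array.new(a_chars.size, false)
  b_matched = Array.new(b_chars.size, false)
  matches = 0

  a_chars.each_with_index do |ch, i|
    lo = [i - window, 0].max
    hi = [i + window, b_chars.size - 1].min
    (lo..hi).each do |j|
      next if b_matched[j] || b_chars[j] != ch
      a_matched[i] = b_matched[j] = true
      matches += 1
      break
    end
  end
  return 0.0 if matches.zero?

  # Matched characters in order; half the mismatched pairs are transpositions.
  a_seq = a_chars.select.with_index { |_, i| a_matched[i] }
  b_seq = b_chars.select.with_index { |_, j| b_matched[j] }
  transpositions = a_seq.zip(b_seq).count { |x, y| x != y } / 2

  m = matches.to_f
  (m / a_chars.size + m / b_chars.size + (m - transpositions) / m) / 3.0
end

def jaro_winkler(a, b, prefix_scale = 0.1)
  j = jaro(a, b)
  # Reward a shared prefix of up to four characters.
  prefix = 0
  a.chars.zip(b.chars).first(4).each do |x, y|
    break unless x == y
    prefix += 1
  end
  j + prefix * prefix_scale * (1.0 - j)
end

score = jaro_winkler("MARTHA", "MARHTA")
```

For the classic pair "MARTHA" and "MARHTA", the sketch yields a score of roughly 0.961: six matching characters, one transposition, and a shared three-character prefix.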

Constant Summary

DefaultOptions = {
  threshold: 0.7,
  implementation: nil
}

@@matcher = nil
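The defaults are combined with caller-supplied options through `Hash#merge`, so any key the caller passes overrides the default and any key the caller omits keeps it. The sketch below uses a stand-in constant, `DEFAULT_OPTIONS`, and a hypothetical `caller_options` hash purely for illustration. Note that the `.similarity` method documented on this page does not read `:threshold`.

```ruby
# Stand-in copy of the defaults listed above (name is illustrative).
DEFAULT_OPTIONS = { threshold: 0.7, implementation: nil }

# Hypothetical options a caller might pass; :to is required by .similarity.
caller_options = { to: "Jon Smith", implementation: :pure }

# Caller values win over defaults; untouched defaults are kept.
merged = DEFAULT_OPTIONS.merge(caller_options)
```

Here `merged[:implementation]` is `:pure`, `merged[:threshold]` stays at `0.7`, and `merged[:to]` carries the comparison string.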

Class Method Summary

.similarity(entity, options = {}) ⇒ Object

Class Method Details

.similarity(entity, options = {}) ⇒ Object



# File 'lib/treat/workers/extractors/similarity/jaro_winkler.rb', line 15

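def self.similarity(entity, options = {})
  options = DefaultOptions.merge(options)

  # The string or entity to compare against is mandatory.
  unless options[:to]
    raise Treat::Exception, "Must supply " +
    "a string/entity to compare to using " +
    "the option :to for this worker."
  end

  # Build the matcher once and cache it in a class variable, so
  # :implementation only takes effect on the first call.
  unless @@matcher
    impl = options[:implementation]
    # Default to the pure-Ruby implementation on JRuby, native elsewhere.
    impl ||= defined?(JRUBY_VERSION) ? :pure : :native
    klass = FuzzyStringMatch::JaroWinkler
    @@matcher = klass.create(impl)
  end

  # Compare the string forms of both arguments.
  a, b = entity.to_s, options[:to].to_s

  @@matcher.getDistance(a, b)
end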
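One consequence of the `unless @@matcher` guard is worth spelling out: the matcher is created on the first call and reused afterwards, so an `:implementation` value passed on a later call is ignored. The sketch below reproduces only that caching pattern. `StubMatcherFactory` and `CachedSimilarity` are hypothetical stand-ins (the stub simply returns the implementation symbol), not part of Treat or FuzzyStringMatch.

```ruby
# Hypothetical stand-in for the matcher factory; returns the symbol it was
# given so the chosen implementation is easy to inspect.
class StubMatcherFactory
  def self.create(impl)
    impl
  end
end

# Hypothetical class that mirrors the caching logic of .similarity.
class CachedSimilarity
  @@matcher = nil

  def self.matcher_for(options = {})
    unless @@matcher
      impl = options[:implementation]
      impl ||= defined?(JRUBY_VERSION) ? :pure : :native
      @@matcher = StubMatcherFactory.create(impl)
    end
    @@matcher
  end
end

first  = CachedSimilarity.matcher_for(implementation: :pure)
second = CachedSimilarity.matcher_for(implementation: :native)
```

Both `first` and `second` end up as `:pure`, because the second call finds the cached matcher. If Unicode strings are expected, this suggests passing `implementation: :pure` on the very first call in the process.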