loose_tight_dictionary

Match records based on string similarity (using the Pair Distance algorithm) and regular expressions.

Quickstart

>> right_records = [ 'seamus', 'andy', 'ben' ]
=> [...]
>> left_record = 'Shamus Heaney'
=> [...]
>> d = LooseTightDictionary.new right_records
=> [...]
>> d.left_to_right left_record
=> "seamus"

Try running the included example file:

$ ruby examples/first_name_matching.rb 
Left side (input)
====================
Mr. Seamus
Sr. Andy
Master BenT

Right side (output)
====================
seamus
andy
ben

Results
====================
Left record (input)           Right record (output)         Prefix used (if any)          Score                         
Mr. Seamus                    seamus                        NULL                          0.666666666666667             
Sr. Andy                      andy                          NULL                          0.5                           
Master BenT                   ben                           NULL                          0.2
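
To see where scores like these can come from, here is a minimal sketch of a pair-based similarity (a Dice coefficient over adjacent character pairs, computed per whitespace-separated word and case-sensitively). The helper names letter_pairs and pair_similarity are hypothetical and are not part of this library's API; the library's own scoring may differ in details.

```ruby
# Hypothetical helpers for illustration only; not this library's API.

# Split a string into words, then collect every adjacent character pair.
def letter_pairs(string)
  string.split(/\s+/).flat_map do |token|
    (0...token.length - 1).map { |i| token[i, 2] }
  end
end

# Dice coefficient: 2 * shared pairs / total pairs, from 0.0 to 1.0.
def pair_similarity(a, b)
  pairs_a = letter_pairs(a)
  pairs_b = letter_pairs(b)
  total = pairs_a.length + pairs_b.length
  return 0.0 if total.zero?

  # Each pair in b can be matched at most once.
  remaining = pairs_b.dup
  shared = pairs_a.count do |pair|
    index = remaining.index(pair)
    remaining.delete_at(index) if index
  end

  2.0 * shared / total
end

[['Mr. Seamus', 'seamus'], ['Sr. Andy', 'andy'], ['Master BenT', 'ben']].each do |left, right|
  puts format('%-12s %-8s %.3f', left, right, pair_similarity(left, right))
end
```

Under these assumptions the three pairs score about 0.667, 0.5, and 0.2. The comparison is case-sensitive, so "Se" and "se" count as different pairs, which is part of why a clean match like "Mr. Seamus" still scores well below 1.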

Improving dictionaries

Similarity matching will only get you so far.

TODO: regex usage
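
Until the regex options are documented here, the general idea can be illustrated without the library: use a regular expression to strip noise from the left-hand records before comparing them. This is only a sketch. The TITLE pattern and the normalize helper are hypothetical and say nothing about how LooseTightDictionary itself accepts regular expressions.

```ruby
# Hypothetical pattern: honorifics seen in the example input above.
TITLE = /\A(?:mr|sr|master)\.?\s+/i

# Hypothetical helper: drop a leading title and ignore case.
def normalize(record)
  record.sub(TITLE, '').downcase
end

puts normalize('Mr. Seamus')   # prints "seamus"
puts normalize('Sr. Andy')     # prints "andy"
puts normalize('Master BenT')  # prints "bent"
```

After normalization the first two records match their right-hand counterparts exactly, and only "bent" versus "ben" still needs similarity matching.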

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, and do not touch the Rakefile, version, or history. (If you want your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 Seamus Abshere. See LICENSE for details.