human-name-rb
Ruby bindings for the Rust crate human_name
, a library for parsing and comparing human names.
See the human_name
docs for details.
Examples
require 'humanname'
doe_jane = HumanName.parse("Doe, Jane")
doe_jane.surname
=> "Doe"
doe_jane.given_name
=> "Jane"
doe_jane.initials
=> "J"
j_doe = HumanName.parse("J. Doe")
j_doe.surname
=> "Doe"
j_doe.given_name
=> nil
j_doe.initials
=> "J"
j_doe == doe_jane
=> true
j_doe == HumanName.parse("John Doe")
=> true
doe_jane == HumanName.parse("John Doe")
=> false
Supported environments
With just bundle
/gem install
, OS X 10.12+.
If you're willing to do a little more work, anywhere supported by Helix and the nightly Rust compiler:
curl -s https://static.rust-lang.org/rustup.sh | sh -s -- --channel=nightly
git clone [email protected]:djudd/human-name-rb.git
cd human-name-rb
bundle exec rake
That will give you a .gem file in pkg/ which should work in environments similar to the one in which it was built.
Benchmark results
Comparing to people
, namae
, and human_name_parser
,
on 16k real examples taken mostly from PubMed author fields:
$ bundle exec rake benchmark
people gem:
3.010000 0.030000 3.040000 ( 3.032075)
namae gem:
3.550000 0.080000 3.630000 ( 3.630643)
human_name_parser gem:
1.960000 0.030000 1.990000 ( 1.991358)
this gem:
0.100000 0.000000 0.100000 ( 0.107794)
Our implementation uses a similar strategy to people
and human_name_parser
but covers significantly more edge cases, and also supports comparison.
(human_name_parser
also covers fewer edge cases than people
, as of December
2015, which probably explains its speed advantage.)
namae
uses a formal grammar, unlike this gem, and so probably captures some
cases this does not, although it certainly also misses some which this captures.