What is distributed-trie

  • distributed-trie is a trie library on key-value store.
  • It is scalable.
  • It supports Tokyo Cabinet / memcached / gdbm / pure hash / Redis / DynamoDB / SimpleDB

The reason why i developed

  • I need a trie library for Sekka ( japanese input method ).
  • I need a trie library which written in pure Ruby.
  • I need a trie library which can scale out.

Installing

gem install distributed-trie

Features

  • Add keyword to trie.
  • Delete keyword to trie. ( not implemented... )
  • commonPrefixSearch by keyword.
  • fuzzySearch by jaro winker edit distance.
  • search with user-defined-function.

Architecture

  • distributed-trie gem only manage trie data structure.
  • You should manage your application data which corresponds to trie key.

Figure

Sample code

require 'distributedtrie'
require 'distributedtrie/kvs/tokyocabinet'
kvsTc = DistributedTrie::KvsTc.new( '/tmp/distributed-trie.tch' )
trie = DistributedTrie::Trie.new( kvsTc, "Sample::" )
trie.addKey!( "apple"       )
trie.addKey!( "application" )
trie.addKey!( "orange"      )
trie.commit!
result = trie.commonPrefixSearch( "app" )
print result
# =>  [ "apple", "application" ]
result = trie.fuzzySearch( "app", 0.80 )
print result
# =>  [[0.9066666666666667, "apple"], [0.8236914600550963, "application"]]

Requires

  • Ruby 1.9.1 or higher
  • JRuby 1.6.6 or higher
  • fuzzy-string-match gem

Author

See also

License

  • BSD License