Ruby/Odeum

Ruby/Odeum is a simple full-text inverted indexer that lets you index a set of files and then search through them very quickly. It is similar to Java’s Lucene but works at a lower level, since it does not provide a query language or a lexical parser for documents. The extension is based on Mikio Hirabayashi’s QDBM library and bundles a full QDBM distribution, so you can use the extension right out of the box.

The library is very simple to use and is fully documented in the generated Ruby docs. Take a look at test/test_odeum.rb and bin/odeum_mgr for simple usage examples.

Features

Pretty much the same features that you’d get from Odeum and QDBM, but available in an idiomatic Ruby package. The big list is:

  1. Fast as hell. QDBM is one of the fastest libraries out there for this kind of thing.

  2. Simple interface involving two classes and maybe one small set of module functions.

  3. Indexes documents of any type, with arbitrary names and unlimited (well, sort of) meta-data.

  4. Searching by normalized words with returned “scores” for weighting.

  5. Locking at the thread level for the OS.

Mikio states that it is probably not suitable for document stores larger than about 1 million documents.
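
To make the idea behind features 3 and 4 concrete, here is a rough pure-Ruby sketch of an inverted index that stores documents under arbitrary names with meta-data and returns scored hits for a normalized word. It is NOT the Ruby/Odeum API: TinyIndex and all of its methods are made up for illustration, the score is just an occurrence count, and everything lives in memory instead of a QDBM database. See test/test_odeum.rb for the real calls.

```ruby
# Hypothetical illustration only. None of these names come from
# Ruby/Odeum or QDBM; the real extension stores its index on disk.
class TinyIndex
  def initialize
    # word => { document name => occurrence count }
    @postings = Hash.new { |hash, word| hash[word] = Hash.new(0) }
    # document name => arbitrary meta-data hash
    @meta = {}
  end

  # Reduce a raw token to the form used as an index key
  # (assumed rule: lowercase, ASCII letters and digits only).
  def self.normalize(word)
    word.downcase.gsub(/[^a-z0-9]/, "")
  end

  # Index a document under an arbitrary name with optional meta-data.
  def add(name, text, meta = {})
    @meta[name] = meta
    text.split.each do |raw|
      word = self.class.normalize(raw)
      @postings[word][name] += 1 unless word.empty?
    end
  end

  # Return [name, score] pairs for a word, highest score first.
  def search(word)
    key = self.class.normalize(word)
    return [] unless @postings.key?(key)
    @postings[key].sort_by { |name, score| [-score, name] }
  end

  # Look up the meta-data stored with a document.
  def meta(name)
    @meta[name]
  end
end

index = TinyIndex.new
index.add("notes.txt", "QDBM is fast. Fast, fast, fast!", "author" => "zed")
index.add("todo.txt", "Make the indexer fast")

results = index.search("FAST")
p results  # => [["notes.txt", 4], ["todo.txt", 1]]
```

The query "FAST" matches because both the indexed text and the query go through the same normalize step, and the caller can use the returned scores to weight or rank the hits however it likes.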

Building

Developers who want to work on the project should put Ruby files in lib and modifications to the extension in ext/odeum_index. Then use rake with the included Rakefile to build the project, run the tests, build the docs, etc.

Installing

Regular users can use the included setup.rb to compile and install the extension and the odeum_mgr script.

Anyone feel like making a gem out of this? :-)

Contact

The Ruby extension is entirely my fault. Please do not contact Mikio about it unless it’s to say thanks for doing such a cool job on QDBM and Odeum. If you have problems with the extension then let me know. I’ll work with Mikio or fix them myself depending on what needs fixing.

You can contact me at zedshaw at zedshaw dot com.