Tf-Idf

A Ruby gem for computing tf-idf (term frequency-inverse document frequency) weights.

http://en.wikipedia.org/wiki/Tf–idf

Install

gem sources -a http://gems.github.com
sudo gem install tf_idf

How To Use

require 'rubygems'
require 'tf_idf'

data = ['a a a a a a a a b b', 'a a']

# The second argument sets the n-gram size (1 here) => http://en.wikipedia.org/wiki/N-gram
a = TfIdf.new(data, 1)

# To find the term frequencies
a.tf
  #=> [{'b' => 0.2, 'a' => etc...}, {'a' => 1}]

# To find the inverse document frequency
a.idf
  #=> {'b' => 0.301... etc...}

# And to find the tf-idf
a.tf_idf
  #=> [{'b' => 0.0602, 'a' => etc...}, {etc...}]
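
How The Numbers Are Derived

The values above can be reproduced by hand. The following is a minimal sketch in plain Ruby, not the gem's actual implementation. It assumes whitespace tokenization, unigrams, tf as count divided by document length, and idf as a base-10 logarithm (which matches the 0.301 shown for 'b'). The variable names are illustrative only.

```ruby
# Illustrative sketch only; the gem's internals may differ.
docs = ['a a a a a a a a b b', 'a a']

# Assumption: terms are separated by whitespace (unigrams).
tokenized = docs.map { |doc| doc.split }

# Term frequency: occurrences of a term / total terms in that document.
tf = tokenized.map do |tokens|
  counts = Hash.new(0)
  tokens.each { |term| counts[term] += 1 }
  counts.each_with_object({}) do |(term, n), result|
    result[term] = n.to_f / tokens.size
  end
end

# Document frequency: number of documents that contain each term.
doc_freq = Hash.new(0)
tokenized.each { |tokens| tokens.uniq.each { |term| doc_freq[term] += 1 } }

# Inverse document frequency: log10(total documents / documents with the term).
idf = doc_freq.each_with_object({}) do |(term, n), result|
  result[term] = Math.log10(docs.size.to_f / n)
end

# tf-idf: product of the two, per document and per term.
tf_idf = tf.map do |doc_tf|
  doc_tf.each_with_object({}) do |(term, value), result|
    result[term] = value * idf[term]
  end
end

puts tf[0]['b']      # 2 of the 10 terms in the first document are 'b'
puts idf['b']        # 'b' appears in 1 of 2 documents, so log10(2)
puts tf_idf[0]['b']  # product of the two values above
```

Under these assumptions, 'b' gets a tf of 0.2, an idf of about 0.301 and a tf-idf of about 0.0602 in the first document, which agrees with the output listed above.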

Copyright © 2009 Red Davis. See LICENSE for details.