Carrot2

Ruby client for Carrot2 - the open-source document clustering server

Installation

First, download and run the Carrot2 server. Use the download whose name begins with carrot2-dcs.

With Homebrew, use:

brew install carrot2
brew services start carrot2

Then add this line to your application’s Gemfile:

gem 'carrot2'

How to Use

To cluster documents, use:

documents = [
  "Sign up for an exclusive coupon.",
  "Exclusive members get a free coupon.",
  "Coupons are going fast.",
  "This is completely unrelated to the other documents."
]

carrot2 = Carrot2.new
carrot2.cluster(documents)

This returns:

{
  "processing-time-total"=>1,
  "clusters"=> [
    {
      "id"=>0,
      "size"=>3,
      "phrases"=>["Coupon"],
      "score"=>0.06462323710740674,
      "documents"=>[0, 1, 2],
      "attributes"=>{"score"=>0.06462323710740674}
    },
    {
      "id"=>1,
      "size"=>2,
      "phrases"=>["Exclusive"],
      "score"=>0.05873148311034013,
      "documents"=>[0, 1],
      "attributes"=>{"score"=>0.05873148311034013}
    },
    {
      "id"=>2,
      "size"=>1,
      "phrases"=>["Other Topics"],
      "score"=>0.0,
      "documents"=>[3],
      "attributes"=>{"other-topics"=>true, "score"=>0.0}
    }
  ],
  "processing-time-algorithm"=>1,
  "query"=>nil
}

Documents are numbered in the order provided, starting with 0.
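
Because clusters refer to documents by index, a common next step is to map those indices back to the original text. The following is a minimal sketch: documents_by_cluster is a hypothetical helper, not part of the gem, and the result hash is a trimmed copy of the output shown above so the example runs without a server.

```ruby
documents = [
  "Sign up for an exclusive coupon.",
  "Exclusive members get a free coupon.",
  "Coupons are going fast.",
  "This is completely unrelated to the other documents."
]

# Trimmed copy of the response shown above; in practice this would be
# the hash returned by carrot2.cluster(documents).
result = {
  "clusters" => [
    {"phrases" => ["Coupon"], "documents" => [0, 1, 2]},
    {"phrases" => ["Exclusive"], "documents" => [0, 1]},
    {"phrases" => ["Other Topics"], "documents" => [3]}
  ]
}

# Hypothetical helper (not part of the gem): builds a hash that maps
# each cluster label to the text of the documents in that cluster.
def documents_by_cluster(result, documents)
  result["clusters"].each_with_object({}) do |cluster, grouped|
    label = cluster["phrases"].join(", ")
    grouped[label] = cluster["documents"].map { |index| documents[index] }
  end
end

grouped = documents_by_cluster(result, documents)

grouped.each do |label, texts|
  puts "#{label}:"
  texts.each { |text| puts "  #{text}" }
end
```

A document can appear under more than one label, as the first two documents do here.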

Specify a language with:

carrot2.cluster(documents, language: "FRENCH")

All of these languages are supported.

For other requests, use:

carrot2.request(
  "dcs.c2stream" => xml_str
)
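
One way to produce xml_str is to build it from plain strings. This is only a sketch: build_c2stream is a hypothetical helper, and the layout it emits (a searchresult root with one document element per text, each holding a snippet) is an assumption about the Carrot2 XML input format that should be checked against the documentation for your server version.

```ruby
# Hypothetical helper (not part of the gem). The element names used here
# are an assumption about the Carrot2 XML input format.
def build_c2stream(documents)
  escapes = {"&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;"}

  body = documents.map do |text|
    # Escape characters that would otherwise break the XML.
    escaped = text.gsub(/[&<>"]/, escapes)
    "<document><snippet>#{escaped}</snippet></document>"
  end

  %(<?xml version="1.0" encoding="UTF-8"?><searchresult>#{body.join}</searchresult>)
end

xml_str = build_c2stream(["Coupons & deals", "Free <coupon>"])
puts xml_str
```

The resulting string can then be passed as the "dcs.c2stream" value in the request above.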

Configuration

To specify the Carrot2 server, set ENV["CARROT2_URL"] or use:

Carrot2.new(url: "http://localhost:8080")
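
If an application needs to choose between the two, the lookup can be made explicit. The sketch below uses a hypothetical helper, carrot2_url, that is not part of the gem. The fallback to http://localhost:8080 is an assumption for local development, not a documented default.

```ruby
require "uri"

# Hypothetical helper (not part of the gem): prefers an explicit URL,
# then CARROT2_URL from the environment, then a local fallback.
# The localhost fallback is an assumption for local development.
def carrot2_url(explicit = nil, env = ENV)
  url = explicit || env["CARROT2_URL"] || "http://localhost:8080"

  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  unless URI.parse(url).is_a?(URI::HTTP)
    raise ArgumentError, "Expected an http(s) URL, got #{url.inspect}"
  end

  url
end

puts carrot2_url
```

The result can be passed straight through, as in Carrot2.new(url: carrot2_url).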

Set timeouts (available on master):

Carrot2.new(open_timeout: 3, read_timeout: 5)

Heroku

Carrot2 can be easily deployed to Heroku thanks to support for WAR deployment.

You can find the .war file in the war directory in the dcs download. Then run:

heroku plugins:install heroku-cli-deploy
heroku create <app_name>
heroku war:deploy carrot2-dcs.war --app <app_name>

Then set ENV["CARROT2_URL"] in your application to the URL of the deployed app.

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help: