Carrot2
Ruby client for Carrot2 - the open-source document clustering server
Installation
First, download and run the Carrot2 server. It’s the one on this page that begins with carrot2-dcs.
With Homebrew, use:
brew install carrot2
brew services start carrot2
Then add this line to your application’s Gemfile:
gem 'carrot2'
How to Use
To cluster documents, use:
documents = [
"Sign up for an exclusive coupon.",
"Exclusive members get a free coupon.",
"Coupons are going fast.",
"This is completely unrelated to the other documents."
]
carrot2 = Carrot2.new
carrot2.cluster(documents)
This returns:
{
"processing-time-total"=>1,
"clusters"=> [
{
"id"=>0,
"size"=>3,
"phrases"=>["Coupon"],
"score"=>0.06462323710740674,
"documents"=>[0, 1, 2],
"attributes"=>{"score"=>0.06462323710740674}
},
{
"id"=>1,
"size"=>2,
"phrases"=>["Exclusive"],
"score"=>0.05873148311034013,
"documents"=>[0, 1],
"attributes"=>{"score"=>0.05873148311034013}
},
{
"id"=>2,
"size"=>1,
"phrases"=>["Other Topics"],
"score"=>0.0,
"documents"=>[3],
"attributes"=>{"other-topics"=>true, "score"=>0.0}
}
],
"processing-time-algorithm"=>1,
"query"=>nil
}
Documents are numbered in the order provided, starting with 0.
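Because clusters refer to documents by index, it can help to map each cluster back to the original text. The following is a minimal sketch that works on a trimmed copy of the response shown above rather than a live server. `documents_by_cluster` is a hypothetical helper name and is not part of the gem.

```ruby
documents = [
  "Sign up for an exclusive coupon.",
  "Exclusive members get a free coupon.",
  "Coupons are going fast.",
  "This is completely unrelated to the other documents."
]

# Trimmed copy of the response shown above (only the keys used here)
response = {
  "clusters" => [
    {"phrases" => ["Coupon"], "documents" => [0, 1, 2]},
    {"phrases" => ["Exclusive"], "documents" => [0, 1]},
    {"phrases" => ["Other Topics"], "documents" => [3]}
  ]
}

# Hypothetical helper (not part of the gem): maps each cluster label
# to the text of the documents it contains
def documents_by_cluster(response, documents)
  response["clusters"].map do |cluster|
    label = cluster["phrases"].join(", ")
    [label, cluster["documents"].map { |index| documents[index] }]
  end.to_h
end

grouped = documents_by_cluster(response, documents)
grouped.each do |label, docs|
  puts "#{label}: #{docs.size} document(s)"
end
```

A document can appear in more than one cluster, as documents 0 and 1 do here.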
Specify a language with:
carrot2.cluster(documents, language: "FRENCH")
All of these languages are supported.
For other requests, use:
carrot2.request(
"dcs.c2stream" => xml_str
)
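As a rough illustration of what `xml_str` might contain, the sketch below builds a payload by hand. It assumes the Carrot2 XML input format (a `searchresult` root with an optional `query` and one `document` element per item, each holding a `snippet`); check the DCS documentation for your server version before relying on it. `build_c2stream` and `xml_escape` are hypothetical helpers, not part of the gem.

```ruby
# Hypothetical helper: escapes characters that are special in XML text nodes
def xml_escape(text)
  escapes = {"&" => "&amp;", "<" => "&lt;", ">" => "&gt;"}
  text.gsub(/[&<>]/, escapes)
end

# Hypothetical helper: builds a payload in the (assumed) Carrot2 XML input
# format, numbering documents from 0 in the order provided
def build_c2stream(documents, query: nil)
  lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<searchresult>"]
  lines << "  <query>#{xml_escape(query)}</query>" if query
  documents.each_with_index do |text, index|
    lines << %(  <document id="#{index}">)
    lines << "    <snippet>#{xml_escape(text)}</snippet>"
    lines << "  </document>"
  end
  lines << "</searchresult>"
  lines.join("\n")
end

xml_str = build_c2stream(["Coupons & deals are going fast."], query: "coupons")
puts xml_str
```

The resulting string could then be passed as the `"dcs.c2stream"` value in the request above.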
Configuration
To specify the Carrot2 server, set ENV["CARROT2_URL"] or use:
Carrot2.new(url: "http://localhost:8080")
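If you want to check the server URL yourself before constructing the client, something like the following sketch could work. `carrot2_url` is a hypothetical helper, and the `http://localhost:8080` fallback is taken from the example above as an assumption rather than a documented default of the gem.

```ruby
require "uri"

# Hypothetical helper (not part of the gem): reads CARROT2_URL from the
# given environment, falls back to a local server, and rejects values
# that are not http(s) URLs
def carrot2_url(env = ENV)
  url = env.fetch("CARROT2_URL", "http://localhost:8080")
  uri = URI.parse(url)
  raise ArgumentError, "CARROT2_URL must be an http(s) URL: #{url}" unless uri.is_a?(URI::HTTP)
  url
end

puts carrot2_url("CARROT2_URL" => "https://carrot2.example.com")
```

The returned value can then be passed as the `url:` option shown above.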
Set timeouts (available on master):
Carrot2.new(open_timeout: 3, read_timeout: 5)
Heroku
Carrot2 can be easily deployed to Heroku thanks to support for WAR deployment.
You can find the .war file in the war directory in the dcs download. Then run:
heroku plugins:install heroku-cli-deploy
heroku create <app_name>
heroku war:deploy carrot2-dcs.war --app <app_name>
And set ENV["CARROT2_URL"] in your application.
History
View the changelog
Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features