NekonekoGen

An easy-to-use text classifier generator for Ruby.

Installation

Add this line to your application's Gemfile:

gem 'nekoneko_gen'

And then execute:

$ bundle

Or install it yourself as:

$ gem install nekoneko_gen

Usage

% mkdir data
% cd data
% wget -i http://www.udp.jp/misc/2ch_data/index1.txt
...
% cd ..
% nekoneko_gen -n game_thread_classifier data/dragon_quest.txt data/loveplus.txt
loading data/dragon_quest.txt... 35.5426s
loading data/loveplus.txt... 36.0522s
step   0... 0.879858, 3.7805s
step   1... 0.919624, 2.2018s
step   2... 0.932147, 2.1174s
step   3... 0.940959, 2.0569s
step   4... 0.946985, 1.8876s
step   5... 0.950891, 1.8564s
step   6... 0.953541, 1.8398s
step   7... 0.955464, 1.8204s
step   8... 0.957427, 1.8008s
step   9... 0.959056, 1.7912s
step  10... 0.961098, 1.8027s
step  11... 0.961745, 1.7716s
step  12... 0.962943, 1.7633s
step  13... 0.963610, 1.7477s
step  14... 0.964611, 1.6216s
step  15... 0.965259, 1.7291s
step  16... 0.965730, 1.7271s
step  17... 0.966613, 1.7225s
step  18... 0.967241, 1.5861s
step  19... 0.967712, 1.7113s
DRAGON_QUEST, LOVEPLUS : 71573 features
done nyan!

% ls -la
...
-rw-r--r--  1 ore users 2555555 2012-05-28 08:10 game_thread_classifier.rb
...

% cat > console.rb
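# coding: utf-8
# Ruby 1.8 needs $KCODE set to handle UTF-8 strings.
if (RUBY_VERSION < '1.9.0')
  $KCODE = 'u'
end
# Load the classifier generated by nekoneko_gen.
require './game_thread_classifier'

$stdout.sync = true
loop do
  print "> "
  line = $stdin.gets
  break if line.nil?  # exit cleanly on EOF (Ctrl-D) instead of raising EOFError
  # predict returns an index into GameThreadClassifier::LABELS.
  label = GameThreadClassifier.predict(line)
  # Prints "<LABEL>の話題です!!!" ("This is about <LABEL>!!!").
  puts "#{GameThreadClassifier::LABELS[label]}の話題です!!!"
end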
^D

% ruby console.rb
> 彼女からメールが来た
LOVEPLUSの話題です!!!
> 日曜日はデートしてました
LOVEPLUSの話題です!!!
> 金欲しい
DRAGON_QUESTの話題です!!!
> 王様になりたい
DRAGON_QUESTの話題です!!!
> スライム
DRAGON_QUESTの話題です!!!
> スライムを彼女にプレゼント
LOVEPLUSの話題です!!!

The inputs above translate, in order, to: "I got an email from my girlfriend", "I was on a date on Sunday", "I want money", "I want to become king", "Slime", and "Give a slime to my girlfriend as a present".

% cat > test.rb
# Ruby 1.8 needs $KCODE set to handle UTF-8 strings.
if (RUBY_VERSION < '1.9.0')
  $KCODE = 'u'
end
require './game_thread_classifier'

# labels[i] counts the lines classified as GameThreadClassifier::LABELS[i].
labels = Array.new(GameThreadClassifier.k, 0)
file = ARGV.shift
File.open(file) do |f|
  until f.eof?
    l = f.readline.chomp
    label = GameThreadClassifier.predict(l)
    labels[label] += 1
  end
end
# Print the fraction of lines assigned to each label.
count = labels.reduce(:+)
labels.each_with_index do |c, i|
  printf "%16s: %f\n", GameThreadClassifier::LABELS[i], c.to_f / count.to_f
end
^D

% ruby test.rb data/dragon_quest_test.txt
    DRAGON_QUEST: 0.932000
        LOVEPLUS: 0.068000
% ruby test.rb data/loveplus_test.txt
    DRAGON_QUEST: 0.124000
        LOVEPLUS: 0.876000
% ruby test.rb data/dragon_quest_test2.txt
    DRAGON_QUEST: 0.988000
        LOVEPLUS: 0.012000
% ruby test.rb data/loveplus_test2.txt
    DRAGON_QUEST: 0.012048
        LOVEPLUS: 0.987952
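
Each run above prints the share of lines assigned to every label, so the accuracy on a test file is the share assigned to the label that file belongs to. The following is a minimal sketch of that calculation. The `accuracy` helper and `StubClassifier` are hypothetical and not part of nekoneko_gen; the sketch assumes only what the scripts above already use (`predict` and `LABELS`) and takes the classifier as an argument so it can run against a stub.

```ruby
# Hypothetical helper (not part of nekoneko_gen): returns the fraction of
# `lines` that `classifier` assigns to `expected_label`.
# `classifier` is any module that responds to predict(text) and defines a
# LABELS constant, as GameThreadClassifier does in the scripts above.
def accuracy(classifier, lines, expected_label)
  expected = classifier::LABELS.index(expected_label)
  raise ArgumentError, "unknown label: #{expected_label}" if expected.nil?
  return 0.0 if lines.empty?

  hits = lines.count { |line| classifier.predict(line.chomp) == expected }
  hits.to_f / lines.size
end

# Stand-in for a generated classifier so the sketch runs on its own.
# Assumption: this toy keyword rule has nothing to do with the real model.
module StubClassifier
  LABELS = ["DRAGON_QUEST", "LOVEPLUS"]

  def self.predict(text)
    text.include?("slime") ? 0 : 1
  end
end

lines = ["slime appeared\n", "got a mail\n", "slime again\n", "blue slime\n"]
printf "%16s: %f\n", "DRAGON_QUEST", accuracy(StubClassifier, lines, "DRAGON_QUEST")
```

With the real classifier loaded, `accuracy(GameThreadClassifier, File.readlines("data/dragon_quest_test.txt"), "DRAGON_QUEST")` should correspond to the DRAGON_QUEST row of the first run above.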


More than two classes work the same way; pass one training file per class:

% nekoneko_gen -n game_thread_classifier data/dragon_quest.txt data/loveplus.txt data/skyrim.txt data/mhf.txt
...
...
...
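
With more than two classes, a confusion matrix shows which labels get mixed up with each other. The following is a rough sketch only. `confusion_matrix` and `StubMultiClassifier` are hypothetical names, and the label names `SKYRIM` and `MHF` are an assumption based on how `DRAGON_QUEST` and `LOVEPLUS` follow the training file names above. As before, only `predict` and `LABELS` are assumed to exist on the generated classifier.

```ruby
# Hypothetical helper (not part of nekoneko_gen): builds a k x k matrix where
# matrix[t][p] counts lines whose true label is t and predicted label is p.
# `samples` maps a true label name to an Array of text lines.
def confusion_matrix(classifier, samples)
  k = classifier::LABELS.size
  matrix = Array.new(k) { Array.new(k, 0) }
  samples.each do |label_name, lines|
    truth = classifier::LABELS.index(label_name)
    raise ArgumentError, "unknown label: #{label_name}" if truth.nil?
    lines.each { |line| matrix[truth][classifier.predict(line.chomp)] += 1 }
  end
  matrix
end

# Stand-in for a generated four-class classifier so the sketch runs on its own.
# Assumption: label names mirror the training file names; the keyword rule is
# a toy and unrelated to the real model.
module StubMultiClassifier
  LABELS = ["DRAGON_QUEST", "LOVEPLUS", "SKYRIM", "MHF"]
  KEYWORDS = ["slime", "date", "dragonborn", "hunter"]

  def self.predict(text)
    KEYWORDS.index { |word| text.include?(word) } || 0
  end
end

samples = {
  "DRAGON_QUEST" => ["a slime appeared", "the king spoke"],
  "SKYRIM"       => ["dragonborn shout", "slime? no"],
}

# One row per true label, one column per predicted label.
confusion_matrix(StubMultiClassifier, samples).each_with_index do |row, i|
  printf "%16s: %s\n", StubMultiClassifier::LABELS[i], row.join(" ")
end
```

Passing the real `GameThreadClassifier` and the lines of each `*_test.txt` file in place of the stub data would give the same kind of table for the generated classifier.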