word2vec-rb

A Ruby gem that exposes the word2vec functionality from https://code.google.com/archive/p/word2vec/

This gem is based on the .c files of Google's word2vec, most of which were adapted by copy-and-paste.

Installation

Add this line to your application's Gemfile:

gem 'word2vec-rb'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install word2vec-rb

Usage

Distance arithmetic: to find the nearest words, try:

require 'word2vec'

model = Word2vec::Model.load("./data/minimal.bin")
words = model.distance("from")
words.each do |w| 
  puts "#{w.first} #{w.last}"
end
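
To illustrate what a nearest-word query computes, here is a minimal pure-Ruby sketch of cosine-similarity ranking, the measure reported by the original word2vec distance tool. The toy vectors and the helper names `cosine` and `nearest` are hypothetical and are not part of this gem's API; the gem itself does this work in its C extension.

```ruby
# Toy embeddings (assumption); real models hold hundreds of dimensions per word.
vectors = {
  "from" => [0.9, 0.1, 0.0],
  "to"   => [0.8, 0.2, 0.1],
  "cat"  => [0.0, 0.1, 0.9]
}

# Cosine similarity: dot product divided by the product of the vector norms.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.call(a) * norm.call(b))
end

# Rank every other word by similarity to `word`, highest score first.
def nearest(word, vectors)
  target = vectors.fetch(word)
  vectors
    .reject { |other, _| other == word }
    .map { |other, vec| [other, cosine(target, vec)] }
    .sort_by { |_, score| -score }
end

nearest("from", vectors).each { |w| puts "#{w.first} #{w.last}" }
```

Each result is a pair of word and score, which is why the loop reads the pair with `first` and `last`, as in the usage example above.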

Analogy arithmetic: to find the analogy with three words, try:

require 'word2vec'

model = Word2vec::Model.load("./data/minimal.bin")
words = model.analogy("spain", "madrid", "france")
# With a well-prepared (high-quality) vectors file, the first word should be "Paris"
words.each do |w|
  puts "#{w.first} #{w.last}"
end
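
The analogy query can be understood as vector arithmetic: the original word2vec analogy tool ranks words by cosine similarity to `b - a + c`, skipping the three input words. The following is a pure-Ruby sketch of that idea under stated assumptions: the 2-D toy vectors are invented so that every capital equals its country plus `[0, 1]`, and the `cosine` and `analogy` helpers are hypothetical, not this gem's implementation.

```ruby
# Toy 2-D embeddings (assumption): capital = country + [0, 1].
vectors = {
  "spain"  => [1.0, 0.0],
  "madrid" => [1.0, 1.0],
  "france" => [2.0, 0.0],
  "paris"  => [2.0, 1.0],
  "banana" => [-1.0, 0.5]
}

# Cosine similarity: dot product divided by the product of the vector norms.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.call(a) * norm.call(b))
end

# Answer "a is to b as c is to ?" by ranking words near b - a + c.
def analogy(a, b, c, vectors)
  va, vb, vc = vectors.values_at(a, b, c)
  target = vb.zip(va, vc).map { |x, y, z| x - y + z }
  vectors
    .reject { |word, _| [a, b, c].include?(word) } # skip the input words
    .map { |word, vec| [word, cosine(target, vec)] }
    .sort_by { |_, score| -score }
end

analogy("spain", "madrid", "france", vectors).each do |w|
  puts "#{w.first} #{w.last}"
end
```

With these toy vectors the target is `[2.0, 1.0]`, so "paris" ranks first; real models only approximate this relationship.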

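Accuracy: to test the accuracy of the vectors:

Define a file with the analogies to test. Each section starts with a heading line of the form ": section heading", followed by one analogy per line in the form "Word1 Word2 Word3 Word4".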

Sample:

: capital-common-countries
Athens Greece Baghdad Iraq
Athens Greece Bangkok Thailand

Then run:

require 'word2vec'

# file_name is the path to a vectors file, e.g. "./data/minimal.bin"
model = Word2vec::Model.load(file_name)
model.accuracy("./data/questions-words.txt")

# Prints the results to the terminal
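
The question file format described above can be checked before running an accuracy test. Here is a small pure-Ruby sketch that groups the analogies by section; `parse_questions` is a hypothetical helper, not part of this gem, and it assumes that lines without exactly four words should be ignored.

```ruby
# Parse analogy questions: a line starting with ":" opens a new section,
# every other non-empty line holds four whitespace-separated words.
def parse_questions(text)
  sections = Hash.new { |hash, key| hash[key] = [] }
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?

    if line.start_with?(":")
      current = line.delete_prefix(":").strip
    else
      words = line.split
      # Assumption: silently skip lines that are not four-word analogies.
      sections[current] << words if words.size == 4
    end
  end
  sections
end

sample = <<~TEXT
  : capital-common-countries
  Athens Greece Baghdad Iraq
  Athens Greece Bangkok Thailand
TEXT

parse_questions(sample).each do |section, questions|
  puts "#{section}: #{questions.size} questions"
end
```

To parse a real file, pass its contents instead, for example `parse_questions(File.read("./data/questions-words.txt"))`.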

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Build gem

$ rake build

Launch tests

$ rake spec

Build extension

$ rake compile

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/madcato/word2vec-rb.