K Nearest Neighbours

Simple KNN Ruby implementation

Install

gem sources -a -http://gemcutter.org
sudo gem install naive_bayes

How To Use

require 'rubygems'
require 'knn'

data = Array.new(100000) { Array.new(4) { rand } }

knn = KNN.new(data)

knn.nearest_neighbours([1,2,3,4], 4)  # ([data], k's)
  #=> [[4837, 7.43033158269445, [0.966558570073977, 0.903158898673566, 0.954567901514261, 0.988114355901207]], ...

# Data is returned in the format
# [data index, distance to the input, [data points]]

# So if we called queried the data array for 4837...
data[4837]
  #=> [0.966558570073977, 0.903158898673566, 0.954567901514261, 0.988114355901207]

Distance Measurements

KNN uses the Distance Measures Gem (github.com/reddavis/Distance-Measures) so we get quite a range of distance measurements.

The measurements currently available are:

euclidean_distance

cosine_similarity

jaccard_index

jaccard_distance

binary_jaccard_index

binary_jaccard_distance

tanimoto_coefficient

To specify a particular one to use in the KNN algorithm, just provide it as an option:

KNN.new(@data, :distance_measure => :jaccard_index)
KNN.new(@data, :distance_measure => :cosine_similarity)
KNN.new(@data, :distance_measure => :tanimoto_coefficient)

Copyright © 2009 Red Davis. See LICENSE for details.