Class: OpenTox::Algorithm::Similarity
Class Method Summary
- .cosine(scaled_properties) ⇒ Float
  Get cosine similarity.
- .euclid(scaled_properties) ⇒ Float
  Get Euclidean distance.
- .remove_nils(scaled_properties) ⇒ Array<Array<Float>>
  Remove nil and NaN values.
- .tanimoto(fingerprints) ⇒ Float
  Get Tanimoto similarity.
- .weighted_cosine(scaled_properties) ⇒ Float
  Get weighted cosine similarity. See http://stackoverflow.com/questions/1838806/euclidean-distance-vs-pearson-correlation-vs-cosine-similarity
Class Method Details
.cosine(scaled_properties) ⇒ Float
Get cosine similarity
http://stackoverflow.com/questions/1838806/euclidean-distance-vs-pearson-correlation-vs-cosine-similarity
# File 'lib/similarity.rb', line 45
def self.cosine scaled_properties
  scaled_properties = remove_nils scaled_properties
  Algorithm::Vector.dot_product(scaled_properties[0], scaled_properties[1]) / (Algorithm::Vector.magnitude(scaled_properties[0]) * Algorithm::Vector.magnitude(scaled_properties[1]))
end
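As a rough illustration of the formula used by .cosine, the standalone sketch below computes the cosine similarity of two plain Float arrays. The helpers `dot_product` and `magnitude` are hypothetical stand-ins for `Algorithm::Vector.dot_product` and `Algorithm::Vector.magnitude`, whose source is not shown on this page, and `cosine_sketch` is an illustrative name, not part of the library. The nil filtering done by remove_nils is left out.

```ruby
# Hypothetical stand-ins for the Algorithm::Vector helpers (assumed behavior).
def dot_product(a, b)
  a.zip(b).sum { |x, y| x * y }
end

def magnitude(a)
  Math.sqrt(a.sum { |x| x * x })
end

# Cosine similarity: dot(a, b) / (|a| * |b|).
def cosine_sketch(a, b)
  dot_product(a, b) / (magnitude(a) * magnitude(b))
end

orthogonal = cosine_sketch([1.0, 0.0], [0.0, 1.0])           # perpendicular vectors
parallel   = cosine_sketch([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]) # same direction
puts orthogonal
puts parallel
```

Perpendicular vectors give 0 and vectors pointing the same way give a value close to 1 (up to floating-point rounding). In this sketch an all-zero Float vector yields NaN, because the denominator becomes zero.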
.euclid(scaled_properties) ⇒ Float
Get Euclidean distance
# File 'lib/similarity.rb', line 36
def self.euclid scaled_properties
  sq = scaled_properties[0].zip(scaled_properties[1]).map{|a,b| (a - b) ** 2}
  Math.sqrt(sq.inject(0) {|s,c| s + c})
end
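The distance computation can be tried outside the library with a small standalone copy. The method name `euclid_sketch` and its two-argument form are illustrative assumptions; the documented method takes a single array holding both vectors.

```ruby
# Standalone copy of the distance computation (illustrative name and signature).
def euclid_sketch(a, b)
  squared_diffs = a.zip(b).map { |x, y| (x - y)**2 }
  Math.sqrt(squared_diffs.sum)
end

distance = euclid_sketch([0.0, 0.0], [3.0, 4.0])
puts distance # 3-4-5 triangle, so this prints 5.0
```

Unlike .cosine, the source shown for .euclid does not call remove_nils first, so a nil entry in either vector would raise an error rather than be skipped.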
.remove_nils(scaled_properties) ⇒ Array<Array<Float>>
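Remove nil and NaN values: keeps only the positions where both property vectors hold a number, together with the weights at those positions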
# File 'lib/similarity.rb', line 71
def self.remove_nils scaled_properties
  a =[]; b = []; w = []
  (0..scaled_properties.first.size-1).each do |i|
    if scaled_properties[0][i] and scaled_properties[1][i] and !scaled_properties[0][i].nan? and !scaled_properties[1][i].nan?
      a << scaled_properties[0][i]
      b << scaled_properties[1][i]
      w << scaled_properties[2][i]
    end
  end
  [a,b,w]
end
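To make the filtering concrete, here is a minimal sketch that expresses the same idea with `zip` and `select`. It is a re-expression for illustration, not the library's code; `remove_nils_sketch` is a hypothetical name, and the input is assumed to be three equal-length arrays (two property vectors and a weight vector) whose non-nil entries are Floats, since `nan?` is called on them.

```ruby
# Illustrative re-expression of the filtering step; not the library's code.
def remove_nils_sketch(scaled_properties)
  a, b, w = scaled_properties
  kept = a.zip(b, w).select { |x, y, _| x && y && !x.nan? && !y.nan? }
  # transpose on an empty array returns [], so restore the three-array shape
  kept.empty? ? [[], [], []] : kept.transpose
end

input = [
  [1.0, nil, 3.0, Float::NAN, 0.2], # first property vector
  [2.0, 5.0, nil, 1.0, 0.4],        # second property vector
  [0.5, 0.6, 0.7, 0.8, 0.9]         # weights
]
cleaned = remove_nils_sketch(input)
p cleaned
```

Only positions 0 and 4 survive here, because every other position has a nil or NaN in one of the two property vectors. The weights are never inspected; they are simply carried along for the positions that are kept.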
.tanimoto(fingerprints) ⇒ Float
Get Tanimoto similarity
# File 'lib/similarity.rb', line 25
def self.tanimoto fingerprints
  ( fingerprints[0] & fingerprints[1]).size/(fingerprints[0]|fingerprints[1]).size.to_f
end
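The Tanimoto computation is the size of the intersection divided by the size of the union. The sketch below assumes fingerprints are arrays of feature strings; the sample features and the name `tanimoto_sketch` are made up for illustration.

```ruby
# Tanimoto similarity: |A ∩ B| / |A ∪ B| (illustrative name and signature).
def tanimoto_sketch(fp_a, fp_b)
  (fp_a & fp_b).size / (fp_a | fp_b).size.to_f
end

similarity = tanimoto_sketch(%w[C=O C-N c:c], %w[C-N c:c C-O])
puts similarity # 2 shared of 4 distinct features, prints 0.5
```

The `to_f` on the union size is what forces floating-point division. Without it, integer division would truncate every result below 1 to 0.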
.weighted_cosine(scaled_properties) ⇒ Float
Get weighted cosine similarity
http://stackoverflow.com/questions/1838806/euclidean-distance-vs-pearson-correlation-vs-cosine-similarity
# File 'lib/similarity.rb', line 54
def self.weighted_cosine scaled_properties
  a,b,w = remove_nils scaled_properties
  return cosine(scaled_properties) if w.uniq.size == 1
  dot_product = 0
  magnitude_a = 0
  magnitude_b = 0
  (0..a.size-1).each do |i|
    dot_product += w[i].abs*a[i]*b[i]
    magnitude_a += w[i].abs*a[i]**2
    magnitude_b += w[i].abs*b[i]**2
  end
  dot_product/(Math.sqrt(magnitude_a)*Math.sqrt(magnitude_b))
end
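The effect of the weights is easiest to see on a small input. The sketch below reproduces only the weighted loop; it leaves out the remove_nils step and the shortcut for identical weights, and `weighted_cosine_sketch` with its three-argument form is an illustrative assumption.

```ruby
# Weighted cosine: each dimension contributes in proportion to |w[i]|.
def weighted_cosine_sketch(a, b, w)
  dot = mag_a = mag_b = 0.0
  a.each_index do |i|
    weight = w[i].abs
    dot   += weight * a[i] * b[i]
    mag_a += weight * a[i]**2
    mag_b += weight * b[i]**2
  end
  dot / (Math.sqrt(mag_a) * Math.sqrt(mag_b))
end

a = [1.0, 0.0]
b = [1.0, 1.0]
focused = weighted_cosine_sketch(a, b, [1.0, 0.0]) # second dimension ignored
uniform = weighted_cosine_sketch(a, b, [1.0, 1.0]) # same as plain cosine
puts focused
puts uniform
```

With the second dimension weighted to zero, the two vectors agree on everything that counts and the similarity is 1. With uniform weights the result is about 0.707, the plain cosine of the two vectors. A constant weight cancels between numerator and denominator, which is consistent with the documented method delegating to .cosine when all weights are equal.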