Module: Measurable::Jaccard
Instance Method Summary

#jaccard(u, v) ⇒ Object
call-seq: jaccard(u, v) -> Float.

#jaccard_index(u, v) ⇒ Object
call-seq: jaccard_index(u, v) -> Float.
Instance Method Details
#jaccard(u, v) ⇒ Object
call-seq:
jaccard(u, v) -> Float
The Jaccard distance is a measure of dissimilarity between two sets. It is calculated as:
jaccard_distance = 1 - jaccard_index
This is a proper metric, i.e. the following conditions hold:
Symmetry: jaccard(u, v) == jaccard(v, u)
Non-negativity: jaccard(u, v) >= 0
Coincidence axiom: jaccard(u, v) == 0 if u == v
Triangle inequality: jaccard(u, v) <= jaccard(u, w) + jaccard(w, v)
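The four conditions above can be spot-checked on a few sample arrays. This is a minimal sketch, not part of the library: the two methods are re-defined locally as stand-ins mirroring the listings on this page so the snippet runs without the gem, and the arrays u, v and w are made-up inputs. Passing checks on one triple illustrate the properties; they do not prove them.

```ruby
# Local stand-ins mirroring the listings on this page, so the sketch
# runs without the measurable gem installed.
def jaccard_index(u, v)
  (u & v).length.to_f / (u | v).length
end

def jaccard(u, v)
  1 - jaccard_index(u, v)
end

# Hypothetical sample inputs, chosen only for illustration.
u = [1, 2, 3]
v = [2, 3, 4]
w = [3, 4, 5]

symmetric    = jaccard(u, v) == jaccard(v, u)                  # same value in either order
non_negative = jaccard(u, v) >= 0                              # never below zero
coincident   = jaccard(u, u) == 0                              # identical inputs give distance 0
triangle     = jaccard(u, w) <= jaccard(u, v) + jaccard(v, w)  # going through v is never shorter

puts symmetric, non_negative, coincident, triangle
```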
Arguments:

u -> Array.
v -> Array.
Returns:

Float value representing the dissimilarity between u and v.
Raises:

ArgumentError -> The size of the input arrays doesn't match.
# File 'lib/measurable/jaccard.rb', line 52
def jaccard(u, v)
  1 - jaccard_index(u, v)
end
#jaccard_index(u, v) ⇒ Object
call-seq:
jaccard_index(u, v) -> Float
Gives the similarity between two binary vectors u and v. Calculated as:
jaccard_index = |intersection| / |union|
Here intersection and union refer to the intersection and union of u and v, and |x| is the cardinality of set x.
For example:
jaccard_index([1, 0], [1]) == 0.5
Because |intersection| = |(1)| = 1 and |union| = |(0, 1)| = 2.
See: en.wikipedia.org/wiki/Jaccard_coefficient
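The worked example above can be reproduced with a short sketch. The method here is a hypothetical local copy of the listing on this page, so the snippet runs without the gem. The inputs other than [1, 0] and [1] are made up for illustration. Note that Ruby's Array#& and Array#| drop duplicates, so the arrays are effectively treated as sets.

```ruby
# Hypothetical local copy of the listing on this page, so the sketch
# runs without the measurable gem installed.
def jaccard_index(u, v)
  intersection = u & v   # elements present in both arrays, de-duplicated
  union = u | v          # elements present in either array, de-duplicated
  intersection.length.to_f / union.length
end

example      = jaccard_index([1, 0], [1])      # 1 shared element out of 2 distinct ones
with_repeats = jaccard_index([1, 1, 0], [1])   # repeats collapse, same result as above
disjoint     = jaccard_index([1, 2], [3, 4])   # nothing shared
empty        = jaccard_index([], [])           # 0.0 / 0, which Ruby evaluates to NaN

puts example, with_repeats, disjoint, empty.nan?
```

With this definition, two empty arrays yield NaN rather than a number, so callers that may see empty input would likely want to handle that case themselves.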
Arguments:

u -> Array.
v -> Array.
Returns:

Float value representing the Jaccard similarity coefficient between u and v.
# File 'lib/measurable/jaccard.rb', line 26
def jaccard_index(u, v)
  intersection = u & v
  union = u | v
  intersection.length.to_f / union.length
end