Module: Aspeller

Defined in:
lib/aspell_edit_dist_stub.rb,
ext/aspell_edit_dist.cpp

Overview

module, through which the functionality of edit distance calculation is possible

Defined Under Namespace

Classes: EditDistanceWeights

Class Method Summary collapse

Class Method Details

.limit_edit_distance(strA, strB, limit, weights) ⇒ Object

limit_edit_distance finds the shortest edit distance but will stop and return a number at least as large as LARGE_NUM if it has to do more edits than a set limit. Note that this does NOT mean that the score returned is <= limit*w.max as “sub” vs “submarine” will return 6*(cost of insertion) no matter what the limit is. The edit distance is (cost of swap)(# of swaps) + (cost of deletion)(# of deletions)

+ (cost of insertion)(# of insertions)
+ (cost of substitutions)(# of substitutions)

Preconditions:

max(strlen(a), strlen(b))*max(of the edit weights) <= 2^15
  if violated than an incorrect result may be returned (which may be negative)
  due to overflow of a short integer
(limit+1)*w.min < limit*w.max
limit <= 5 (use edit_distance if limit > 5)

where w.min and w.max is the minimum and maximum cost of an edit respectfully.

The running time is asymptotically bounded above by (3^l)*n where l is the limit and n is the maxium of strlen(a),strlen(b) Based on my informal tests, however, the n does not really matter and the running time is more like (3^l).

limit_edit_distance, based on my informal tests, turns out to be faster than edit_dist for l < 5. For l == 5 it is about the smaller for short strings (<= 5) and less than for longer strings



60
# File 'lib/aspell_edit_dist_stub.rb', line 60

def self.limit_edit_distance(strA, strB, limit, weights); end