Module: Aspeller
- Defined in:
- lib/aspell_edit_dist_stub.rb,
ext/aspell_edit_dist.cpp
Overview
module, through which the functionality of edit distance calculation is possible
Defined Under Namespace
Classes: EditDistanceWeights
Class Method Summary collapse
-
.limit_edit_distance(strA, strB, limit, weights) ⇒ Object
limit_edit_distance finds the shortest edit distance but will stop and return a number at least as large as LARGE_NUM if it has to do more edits than a set limit.
Class Method Details
.limit_edit_distance(strA, strB, limit, weights) ⇒ Object
limit_edit_distance finds the shortest edit distance but will stop and return a number at least as large as LARGE_NUM if it has to do more edits than a set limit. Note that this does NOT mean that the score returned is <= limit*w.max as “sub” vs “submarine” will return 6*(cost of insertion) no matter what the limit is. The edit distance is (cost of swap)(# of swaps) + (cost of deletion)(# of deletions)
+ (cost of insertion)(# of insertions)
+ (cost of substitutions)(# of substitutions)
Preconditions:
max(strlen(a), strlen(b))*max(of the edit weights) <= 2^15
if violated than an incorrect result may be returned (which may be negative)
due to overflow of a short integer
(limit+1)*w.min < limit*w.max
limit <= 5 (use edit_distance if limit > 5)
where w.min and w.max is the minimum and maximum cost of an edit respectfully.
The running time is asymptotically bounded above by (3^l)*n where l is the limit and n is the maxium of strlen(a),strlen(b) Based on my informal tests, however, the n does not really matter and the running time is more like (3^l).
limit_edit_distance, based on my informal tests, turns out to be faster than edit_dist for l < 5. For l == 5 it is about the smaller for short strings (<= 5) and less than for longer strings
60 |
# File 'lib/aspell_edit_dist_stub.rb', line 60 def self.limit_edit_distance(strA, strB, limit, weights); end |