Edit Distance
The naive algorithm to compute the edit-distance of two strings (sequences) runs in time that is exponential in terms of the string length:
Distance = lambda A. lambda B.
if null A then length B
else if null B then length A
else
let As = tl A, Bs = tl B
in if hd A = hd B then Distance As Bs
else 1 + min (Distance As Bs)
(min (Distance As B)
(Distance A Bs))
The well-known dynamic programming algorithm (DPA) avoids doing repeated work by storing partial results in an array or other data-structure. This reduces the time complexity to O(|A|*|B|) where the two strings are A and B, e.g.:
Distance = lambda A. lambda B.
let rec
Rows = (0 :: count hd Rows B) {the first row }
:: EachRow A hd Rows {the other rows},
EachRow = lambda A. lambda lastrow.
if null A then nil
else
let rec
Ach = hd A,
DoRow = lambda B. lambda NW. lambda W. {NW N}
if null B then nil {W .}
else
let N = tl NW
in let me = if Ach = hd B then hd NW
else 1 + min W (min hd N hd NW)
in me :: DoRow tl B tl NW me,
thisrow = (1 + hd lastrow)
:: DoRow B lastrow hd thisrow
in thisrow :: EachRow tl A thisrow
in last (last Rows)
If the strings, A and B, have a small edit-distance and are of length ~n then we can do better [All92] with an algorithm that takes O(n*D(A,B)) time:
There are more λ-calculus examples here.
References
- [All92] Lloyd Allison,
'Lazy dynamic programming can be eager',
Information Processing Letters, 43, pp.207-212,
doi:10.1016/0020-0190(92)90202-7.
September 1992.
- And other publications.
And for the record, the last algorithm in Haskell:
module Edit_IPL_V43_N4 (d) where
-- compute the edit distance of sequences a and b [All92].
d a b =
let
-- diagonal from the top-left element
mainDiag = oneDiag a b (head uppers) ( -1 : (head lowers))
-- diagonals above the mainDiag
uppers = eachDiag a b (mainDiag : uppers)
-- diagonals below the mainDiag
lowers = eachDiag b a (mainDiag : lowers) -- !
oneDiag a b diagAbove diagBelow = -- \ \ \
let -- \ \ \
doDiag [] b nw n w = [] -- \ nw n
doDiag a [] nw n w = [] -- \ \
doDiag (a:as) (b:bs) nw n w = -- w me
let me = if a==b then nw -- dynamic programming DPA
else 1+min3 (head w) nw (head n)
in me : (doDiag as bs me (tail n) (tail w))
firstelt = 1+(head diagBelow)
thisdiag =
firstelt:(doDiag a b firstelt diagAbove (tail diagBelow))
in thisdiag
min3 x y z =
-- see L. Allison, Lazy Dynamic-Programming can be Eager
-- Inf. Proc. Letters 43(4) pp207-212, Sept' 1992
if x < y then x else min y z -- makes it O(|a|*D(a,b))
-- min x (min y z) -- makes it O(|a|*|b|)
-- the fast one does not always evaluate all three values.
eachDiag a [] diags = []
eachDiag a (b:bs) (lastDiag:diags) =
let nextDiag = head(tail diags)
in (oneDiag a bs nextDiag lastDiag):(eachDiag a bs diags)
-- which is the diagonal containing the bottom R.H. elt?
lab = (length a) - (length b)
in last( if lab == 0 then mainDiag
else if lab > 0 then lowers !! ( lab-1)
else uppers !! (-lab-1) )
-- module under Gnu 'copyleft' GPL General Public Licence.