Calculate the edit distance between two foldables.
A package to determine the edit distance between two Foldable
s. These are converted to lists, and the Levenshtein distance determine how many additions, removals and changes are necessary to change the first list into the second list.
levenshtein
Usage
The module Data.Foldable.Levenshtein
exports a data type Edit
that represent the possible ways to edit a list by Add
ing an element, Rem
oving an element, Copy
ing (do nothing with the element), and Swap
with a new value.
One can apply such edits to a list with the applyEdits
function, for example:
Prelude> applyEdits [Copy 1,Swap 3 4,Swap 0 2,Swap 2 5] [1,3,0,2]
Just [1,4,2,5]
We can also calculate the minimal list of edits necessary to turn one list into another one, for example:
Prelude> levenshtein [1,3,0,2] [1,4,2,5]
(3,[Copy 1,Swap 3 4,Swap 0 2,Swap 2 5])
here it means that the smallest edit distance is three, and that in order to transform [1,3,0,2]
to [1,4,2,5]
we copy 1
change 3
for 4
, change 0
for 2
, and change 2
for 5
.
Package structure
The package contains one module: Data.Foldable.Levenshtein
. This module provides functions to determine the edit distance and a list of edits to turn one Foldable
of items to another Foldable
of items. The foldables are first converted to a list, so the edits always eventually produce a list of edits, even if (one of) the Foldable
s is for example a Tree
, Maybe
, etc.
Besides the Edit
object, the module exports three types of functions:
- functions that return the edit distance together with a list of reversed edits;
- functions that return the edit distance with a list of edits (not reversed); and
- functions that only calculate the edit distance, not the edits itself.
The third type is more an optimized version of the first two types since it will take less memory and finish slightly faster.
Some functions allow to specify the penalty for an insertion, deletion, and replacement, and this even per item.
In the table below, we show the different implementations to determine the Levenshtein distance:
Edits | Eq | Equality function | Eq | |
---|---|---|---|---|
penalty functions | default | |||
Normal | genericLevenshtein | genericLevenshtein' | levenshtein' | levenshtein |
Reversed | genericReversedLevenshtein | genericReversedLevenshtein' | reversedLevenshtein' | reversedLevenshtein |
Without | genericLevenshteinDistance | genericLevenshteinDistance' | levenshteinDistance' | levenshteinDistance |
levenshtein
is safe Haskell
The levenshtein
package does not work with arrays, vectors, etc. but only vanilla lists, making this a safe package.
Contribute
You can contribute by making a pull request on the GitHub repository.
You can contact the package maintainer by sending a mail to [email protected]
.