Description

Determine string similarities and distance.

Haskell implementation of PostgreSQL fuzzystrmatch extension

fuzzystrmatch-pg

Haskell implementation of PostgreSQL extension/module fuzzystrmatch.

Roadmap

  • [x] Levenshtein - Implement Levenshtein distance functions

Quick Start

{-# LANGUAGE OverloadedStrings #-}

import Data.FuzzyStrMatch (levenshtein, levenshteinLessEqual)
import Data.Text (Text)

kitten, sitting :: Text
kitten  = "kitten"
sitting = "sitting"

ghci> levenshtein kitten sitting
3

ghci> levenshteinLessEqual kitten sitting 3
Just 3

-- Bounded version: gives Nothing when the distance exceeds the limit.
-- It exits early, hence much faster.
ghci> levenshteinLessEqual kitten sitting 2
Nothing
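
To illustrate why a bounded variant can exit early, here is a minimal sketch of both functions over plain String, using only base. It is not the package's implementation: the names levenshteinSketch, levenshteinLessEqualSketch and nextRow are hypothetical, and the early-exit rule (stop once every cell of the current row exceeds the limit) is one common approach, assumed here for illustration.

```haskell
import Data.List (foldl')

-- Hypothetical helper: compute the next row of the edit-distance table
-- from the previous row and the next character of the source string.
nextRow :: String -> [Int] -> Char -> [Int]
nextRow _ [] _ = []
nextRow target prev@(p : ps) c = scanl step (p + 1) (zip3 target prev ps)
  where
    -- left: cell to the left (insertion), up: cell above (deletion),
    -- diag: cell above-left (substitution, free when characters match).
    step left (t, diag, up) =
      minimum [left + 1, up + 1, diag + (if t == c then 0 else 1)]

-- Unbounded distance: fold every source character through the table.
levenshteinSketch :: String -> String -> Int
levenshteinSketch source target =
  last (foldl' (nextRow target) [0 .. length target] source)

-- Bounded distance: Nothing as soon as the limit cannot be met.
levenshteinLessEqualSketch :: String -> String -> Int -> Maybe Int
levenshteinLessEqualSketch source target maxD
  -- The length difference is a lower bound on the distance.
  | abs (length source - length target) > maxD = Nothing
  | otherwise = go [0 .. length target] source
  where
    go row [] =
      let d = last row in if d <= maxD then Just d else Nothing
    go row (c : cs)
      -- Row minima never decrease, so once the whole row is above
      -- the limit the final distance must be above it too.
      | minimum row > maxD = Nothing
      | otherwise = go (nextRow target row c) cs
```

Loaded in ghci, levenshteinSketch "kitten" "sitting" evaluates to 3, and levenshteinLessEqualSketch "kitten" "sitting" 2 evaluates to Nothing, matching the results shown for the library functions above.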

Benchmark

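Benchmarking with the criterion library:

{-# LANGUAGE OverloadedStrings #-}

import Criterion.Main (bench, defaultMain, nf)
import Data.FuzzyStrMatch (levenshtein, levenshteinLessEqual)

main :: IO ()
main = do
  let source = "aaaaaaaaaaaaaaaaaaaaaaaaaaaa"
      target = "abababababababababababababababababababababababababababab"
      maxD   = 3

  defaultMain
    [ bench "levenshtein" $ nf (levenshtein source) target
    , bench "levenshteinLessEqual" $ nf (levenshteinLessEqual source target) maxD
    ]

benchmarking levenshtein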
time                 217.0 μs   (214.5 μs .. 219.0 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 212.9 μs   (211.4 μs .. 214.6 μs)
std dev              5.201 μs   (4.374 μs .. 6.315 μs)
variance introduced by outliers: 19% (moderately inflated)

benchmarking levenshteinLessEqual
time                 45.01 μs   (44.79 μs .. 45.29 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 45.09 μs   (44.86 μs .. 45.49 μs)
std dev              936.6 ns   (617.6 ns .. 1.526 μs)
variance introduced by outliers: 17% (moderately inflated)

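In this run the bounded version is roughly five times faster (45 μs vs. 217 μs). We expect the gap to widen on longer strings, because the bounded version can exit early once the distance is known to exceed the limit.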

Metadata

Version

0.1.0.0

License

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows