Description

Stream data to or from LMDB databases using the streamly library.

Please see the README on GitHub at https://github.com/shlok/streamly-lmdb#readme

streamly-lmdb

Stream data to or from LMDB databases using the Haskell streamly library.

Requirements

Install LMDB on your system:

  • Debian Linux: sudo apt-get install liblmdb-dev
  • macOS: brew install lmdb

Quick start

{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.Function
import qualified Streamly.Data.Fold as F
import qualified Streamly.Data.Stream.Prelude as S
import Streamly.External.LMDB

main :: IO ()
main = do
  -- Open an environment. A file or directory must already exist at
  -- the given path. (It can be empty for a new environment.)
  env <-
    openEnvironment
      "/path/to/lmdb-database"
      defaultLimits {mapSize = tebibyte}

  -- Get the main database.
  -- Note: It is common practice with LMDB to create the database
  -- once and reuse it for the remainder of the program’s execution.
  db <- getDatabase env Nothing

  -- Stream key-value pairs into the database.
  withReadWriteTransaction env $ \txn ->
    [("baz", "a"), ("foo", "b"), ("bar", "c")]
      & S.fromList
      & S.fold (writeLMDB defaultWriteOptions db txn)

  -- Stream key-value pairs out of the
  -- database, printing them along the way.
  -- Output:
  --     ("bar","c")
  --     ("baz","a")
  --     ("foo","b")
  S.unfold readLMDB (defaultReadOptions, db, LeftTxn Nothing)
    & S.mapM print
    & S.fold F.drain
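
As a small variation on the quick start, the sketch below counts the stored pairs whose key starts with a given prefix instead of printing every pair. It reuses the calls shown above and adds only S.filter and F.length from streamly and isPrefixOf from the bytestring package, so it assumes bytestring is among your build dependencies. The path is the same placeholder as above, the "ba" prefix is an arbitrary choice for illustration, and the write step is repeated only to keep the program self-contained.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main where

import qualified Data.ByteString as B
import Data.Function ((&))
import qualified Streamly.Data.Fold as F
import qualified Streamly.Data.Stream.Prelude as S
import Streamly.External.LMDB

main :: IO ()
main = do
  -- Same placeholder path and limits as in the quick start.
  env <-
    openEnvironment
      "/path/to/lmdb-database"
      defaultLimits {mapSize = tebibyte}
  db <- getDatabase env Nothing

  -- Write the same three pairs as in the quick start.
  withReadWriteTransaction env $ \txn ->
    [("baz", "a"), ("foo", "b"), ("bar", "c")]
      & S.fromList
      & S.fold (writeLMDB defaultWriteOptions db txn)

  -- Stream all pairs out, keep those whose key starts with
  -- the (arbitrarily chosen) prefix "ba", and count them.
  n <-
    S.unfold readLMDB (defaultReadOptions, db, LeftTxn Nothing)
      & S.filter (\(k, _) -> "ba" `B.isPrefixOf` k)
      & S.fold F.length
  print n
```

Because the filter and the fold are ordinary streamly combinators, any other stream transformation or fold can be substituted at the same two positions.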

Benchmarks

See bench/README.md. Summary (with rough figures from our machine):

  • Reading (iterating through a fully cached LMDB database):
    • When using the ordinary readLMDB (which creates intermediate key/value ByteStrings managed by the RTS), the overhead compared to C depends on the key/value sizes; for 480-byte keys and 2400-byte values, the overhead is roughly 815 ns/pair.
    • By using unsafeReadLMDB instead of readLMDB (to avoid the intermediate ByteStrings), we can get the overhead compared to C down to roughly 90 ns/pair. (Plain Haskell IO code has roughly a 50 ns/pair overhead compared to C. That these two figures are similar bears out the promise of streamly and stream fusion.)
  • Writing:
    • The overhead of this library compared to C depends on the size of the key/value pairs (ByteStrings managed by the RTS). For 480-byte keys and 2400-byte values, the overhead is around 4.3 μs/pair.
    • For now, we don’t provide “unsafe” write functionality (to avoid the key/value ByteStrings) because this write performance is currently good enough for our purposes.
  • For reference, we note that opening a file on disk and reading 1 byte from it with C takes us around 2.8 μs; reading 16 KiB instead takes around 20 μs.

September 2024; NixOS 24.11; Intel i7-12700K (3.6 GHz, 12 cores); Corsair VENGEANCE LPX DDR4 RAM 64GB (2 x 32GB) 3200MHz; Samsung 970 EVO Plus SSD 2TB (M.2 NVMe).
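
To put the per-pair figures above in perspective, the following back-of-the-envelope sketch scales them to a whole workload. It uses only the rough overheads quoted in the summary (for 480-byte keys and 2400-byte values); the workload of one million pairs is a hypothetical number chosen for illustration, not something taken from the benchmarks, and the function name totalOverheadSeconds is likewise ours.

```haskell
module Main where

import Text.Printf (printf)

-- Rough per-pair overheads compared to C, in nanoseconds,
-- as quoted in the summary above.
readOverheadNs, unsafeReadOverheadNs, writeOverheadNs :: Double
readOverheadNs = 815
unsafeReadOverheadNs = 90
writeOverheadNs = 4300 -- 4.3 μs

-- Total overhead in seconds for a given number of pairs.
totalOverheadSeconds :: Double -> Int -> Double
totalOverheadSeconds nsPerPair pairs =
  nsPerPair * fromIntegral pairs / 1e9

main :: IO ()
main = do
  -- Hypothetical workload size, chosen only for illustration.
  let pairs = 1000000 :: Int
  printf "readLMDB:       %.3f s\n" (totalOverheadSeconds readOverheadNs pairs)
  printf "unsafeReadLMDB: %.3f s\n" (totalOverheadSeconds unsafeReadOverheadNs pairs)
  printf "writeLMDB:      %.3f s\n" (totalOverheadSeconds writeOverheadNs pairs)
```

For one million such pairs this works out to roughly 0.8 s of added time for readLMDB, 0.09 s for unsafeReadLMDB, and 4.3 s for writing. These are overheads on top of what the equivalent C code would take, not total running times.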

Metadata

Version

0.8.0

Maintainers (1)

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows