MyNixOS website logo
Description

Generic encoding of records.

Generic encoding of records. It currently provides a single, polymorphic function to encode sum types (i.e. categorical variables) as one-hot vectors.

record-encode

Encoding categorical variables

Build Status Hackage

This library provides generic machinery to encode values of some algebraic type as points in a vector space.

Values of a sum type (e.g. enumerations) are also called "categorical" variables in statistics, because they encode a choice between a number of discrete categories.

On the other hand, many data science / machine learning algorithms rely on a purely numerical representation of data; the conversion code from values of a static type is often "boilerplate", i.e. largely repeated and not informative.

The encodeOneHot function provided here is a generic utility function (i.e. defined once and for all) to compute the one-hot representation of any sum type.

Usage example

    {-# language DeriveGeneric -#}

    import qualified GHC.Generics as G
    import qualified Generics.SOP as SOP
    
    import Data.Record.Encode

    data X = A | B | C deriving (G.Generic)
    instance SOP.Generic X
    > encodeOneHot B
    OH {oDim = 3, oIx = 1}

Please refer to the documentation of Data.Record.Encode for more examples and details.

Acknowledgements

Gagandeep Bhatia (@gagandeepb) for his Google Summer of Code 2018 work on Frames-beam, Mark Karpov (@mrkkrp) for his Template Haskell tutorial, Anthony Cowley (@acowley) for Frames, @mniip on Freenode #haskell for helping me better understand what can be done with generic programming.

Metadata

Version

0.2.3

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
Show all
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows