Description

Make writing in unicode easy.

Unicoder is a command-line tool that transforms text documents, replacing simple patterns with unicode equivalents. The patterns can be easily configured by the user. This package is especially meant to open the vast and expressive array of unicode identifiers to programmers and language designers, but there's nothing wrong with a technically savvy user putting unicoder to work on documents for human consumption. Any system of special characters can be made easy to type on any keyboard and in any context, as long as unicode supports it.

Cabal wants to fight me over typesetting some examples, so check out the real docs for a decent look at the features.

In the interests of giving readers some idea of what's going on, with the default settings,

\E.x. \A.y. x \-> y
\l.x,y. x \of x \of y

becomes

∃x ∀y x → y
λx,y. x ∘ x ∘ y

except that the newline isn't removed (thanks, cabal!). Also, there are a couple important features that I can't seem to get cabal to even parse (thanks again!).

Unicoder


Unicoder reads in a source file and replaces configured patterns with their unicode equivalents. The goal is to let ascii-only interfaces insert unicode without your hands leaving the keyboard, whether you are editing source code or any other text document.

Entering unicode is as easy as typing a special string (default backslash) followed by an identifier. There's also syntax for wrapping content inside a pair of unicode strings. For example, with the default configuration, unicoder turns \floor{x} \def \lambda x. (floor x) into ⌊x⌋ ≡ λ x. (floor x). Admittedly, this is not a great syntax for some kinds of documents (esp. XeLaTeX), but that's why we've allowed for configuration of each of the special marks as well as the identifier character set, so Unicoder can be relevant to any type of text data.

By default, unicoder takes input on stdin and puts the unicodized version on stdout: unicoder < file.in > file.out. Unicoder can also operate in-place on multiple files (unicoder -i src/*.c) and in file-watch mode (unicoder -w 'src/**/*.c' &; note that the glob pattern is quoted so that it is passed into unicoder instead of being expanded immediately).

There's more documentation on our Viewdocs. If you're learning to use Unicoder, I would especially recommend our examples.

Examples

Assuming a config file that looks like this:

\ . { } a-z

lambda λ
pi π
bag ⟅ ⟆

we can write this with a normal keyboard:

\lambda.x. x + \pi

and after unicodizing, we will get:

λx. x + π

and celebrate the nice, clean lambda-calculus.

Have no fear, however: code such as this

id = \x -> x
newline_period = "\n."

will remain unchanged, as x and n are not in the config file.
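
The matching rule in this example can be approximated in a few lines. What follows is a minimal Python sketch, not Unicoder's actual Haskell implementation: the REPLACEMENTS dictionary and the unicodize function are hypothetical names, and the regular expression assumes the marks from the config above (begin mark \, end mark ., identifiers drawn from a-z).

```python
import re

# Hypothetical in-memory form of the single-part entries in the example
# config; the real tool reads them from a config file.
REPLACEMENTS = {"lambda": "λ", "pi": "π"}

# Begin mark "\", an identifier over a-z, then an optional end mark ".".
PATTERN = re.compile(r"\\([a-z]+)\.?")


def unicodize(text: str) -> str:
    """Replace every known \\name with its unicode equivalent."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Identifiers missing from the table are left exactly as written,
        # including any end mark that was matched along with them.
        return REPLACEMENTS.get(name, match.group(0))

    return PATTERN.sub(substitute, text)


print(unicodize(r"\lambda.x. x + \pi"))
print(unicodize(r"id = \x -> x"))
```

In this sketch the end mark is consumed only when it directly follows a known identifier, which is why \lambda.x. keeps the second period while \x and "\n." pass through untouched.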

There are also two-part replacements. These take a single (non-nested) argument, transforming

\bag{black}

into

⟅black⟆

You can also use each half of a two-part replacement individually. This is especially useful for nesting, but also when you simply have argument-close marks in the argument:

\{bag {} \}bag

becomes

⟅ {} ⟆
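
Both forms of the two-part replacement can be illustrated with a pair of regular expressions. This is again a rough Python sketch under stated assumptions rather than the tool's real logic: PAIRS and unicodize_pairs are hypothetical names, the marks are the ones from the example config, and whether a half-replacement may also take a trailing end mark is not covered here.

```python
import re

# Hypothetical table of two-part entries: name -> (opening, closing).
PAIRS = {"bag": ("⟅", "⟆")}

# \name{argument}, where the argument contains no braces (non-nested).
WRAPPED = re.compile(r"\\([a-z]+)\{([^{}]*)\}")
# \{name selects the opening half, \}name the closing half.
HALF = re.compile(r"\\([{}])([a-z]+)")


def unicodize_pairs(text: str) -> str:
    """Apply the wrapping form first, then the individual halves."""

    def wrap(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        if name not in PAIRS:
            return match.group(0)  # unknown names stay unchanged
        opening, closing = PAIRS[name]
        return opening + argument + closing

    def half(match: re.Match) -> str:
        brace, name = match.group(1), match.group(2)
        if name not in PAIRS:
            return match.group(0)
        opening, closing = PAIRS[name]
        return opening if brace == "{" else closing

    return HALF.sub(half, WRAPPED.sub(wrap, text))


print(unicodize_pairs(r"\bag{black}"))
print(unicodize_pairs(r"\{bag {} \}bag"))
```

Because the wrapped argument may not contain braces, an argument such as {} has to be written with the two halves, which is exactly the case the second call exercises.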

Pitfalls

Even in something as simple as this, you may want to be aware of a few facts:

  • Beware of adding names like n or t in your config file. \n and \t are escape sequences in most languages, so unless you are using an esoteric language, you will probably change the meaning of your code.
  • It is still possible to mess up strings. For example, "\neq" becomes "≠" instead of being equivalent to "\n" ++ "eq". I conjecture that there is no way to solve this problem without sacrificing idempotence.
  • I've made little attempt to ensure safety, other than using Haskell. Make backups if you are wary (and your editor doesn't already).

Thankfully, the pitfalls are realistically enumerable.
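
The first two pitfalls are easy to reproduce with a toy model. The Python sketch below is only an approximation of the matching behaviour described above: the unicodize function, the longest-identifier regular expression, and the n → ℕ entry are assumptions made for illustration, not part of the default configuration.

```python
import re

# Begin mark "\", longest possible a-z identifier, optional end mark ".".
PATTERN = re.compile(r"\\([a-z]+)\.?")


def unicodize(text: str, table: dict) -> str:
    # Known identifiers are replaced; everything else is left as written.
    return PATTERN.sub(lambda m: table.get(m.group(1), m.group(0)), text)


# Pitfall 1: a (hypothetical) entry named "n" collides with the newline
# escape inside an ordinary string literal.
with_n = unicodize(r'puts("done\n")', {"n": "ℕ"})

# Pitfall 2: "\neq" in a string literal is read as the identifier neq,
# not as a newline escape followed by the letters "eq".
with_neq = unicodize(r'"\neq"', {"neq": "≠"})

# Running the tool again on its own output changes nothing further.
again = unicodize(with_neq, {"neq": "≠"})

print(with_n)
print(with_neq)
print(again == with_neq)
```

A practical reading of this: keep config names longer than any escape sequence your language uses, and review string literals after the first run.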

Contribute

Unicoder is in the beta stage. I'm sure there's a bug or two, some cleanup to be done, and definitely some missing features. Please add any issues or pull requests to our GitHub. You can email me, but that's usually higher latency than GitHub.

Metadata

Version

0.5.0

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows