Description
Simple implementation of Earley parsing.
Description
A little Earley parser.
Also some utilities for visualizing parse trees.
See README.
README.md
A little Earley parser
Parser for context-free grammars.
Example
The following module defines a grammar for arithmetic expressions (like 1+2*3
). It defines:
- A type of non-terminal symbols
Ns
. - A type of terminal symbols
Ts
. - The production rules of the grammar
arithRules
. - A matching function, which interprets terminal symbols
Ts
as sets of lexemes, of typeChar
here. - A grammar
arithG
packaging up all of the above.
import Data.Char (isDigit)
import Little.Earley
data Ns = SUM | PRODUCT | FACTOR | NUMBER deriving (Eq, Ord, Enum, Bounded, Show)
data Ts = Digit | OneOf [Char] deriving (Eq, Ord, Show)
arithRules :: Ns -> [Rule Ns Ts]
arithRules n = case n of
SUM ->
[ [ N PRODUCT ]
, [ N SUM, T (OneOf ['+', '-']), N PRODUCT ] ]
PRODUCT ->
[ [ N FACTOR ]
, [ N PRODUCT, T (OneOf ['*', '/']), N FACTOR ] ]
FACTOR ->
[ [ N NUMBER ]
, [ T (OneOf ['(']), N SUM, T (OneOf [')']) ] ]
NUMBER ->
[ [ T Digit ]
, [ T Digit, N NUMBER ] ]
matchTs :: Ts -> Char -> Bool
matchTs Digit = isDigit
matchTs (OneOf s) = (`elem` s)
arithG :: Grammar Ns Ts Char
arithG = mkGrammar arithRules matchTs
Load that file in GHCi:
ghci -package little-earley example.hs
Parse a string using that grammar:
> pparse arithG SUM "1+2*3"
Output:
+-----+--SUM #1---+
| | |
SUM #0 | +PRODUCT #1-+
| | | | |
PRODUCT #0 | PRODUCT #0 | |
| | | | |
FACTOR #0 | FACTOR #0 | FACTOR #0
| | | | |
NUMBER #0 | NUMBER #0 | NUMBER #0
| | | | |
-----------------------------------
1 + 2 * 3
The module Little.Earley.Examples
contains more examples of grammars using this library.
Links
- This Earley parser implementation is based on Loup Vaillant's tutorial.
- See also the Earley library also in Haskell, with much fancier types to encode arbitrary semantic actions instead of clumsy trees.