Fast JSON parsing powered by Chad Austin's sajson library.
A fast JSON parsing library that is faster than aeson.
sajson
This is a Haskell package that is a thin wrapper over the wonderful sajson library written by Chad Austin. It provides a high performance JSON parser written in C++11. After parsing, a Haskell Data.Aeson.Value
is constructed.
Who This Is For
This library is specifically designed for Haskell users who find the performance of the pure-Haskell Data.Aeson
parser inadequate. This library focuses on performance, not portability or absolute correctness. You should only use this library if all of the following is true:
You are using a system that is 64-bit. This library assumes
sizeof(size_t) == 8
. If this is not the case, the behavior is undefined.You are using a system where the alignment of
double
is no stricter than that ofsize_t
. Internally, the library casts a pointer tosize_t
into a pointer todouble
and expects this to be meaningful.You do not expect to parse floating point numbers beyond the range of
double
.You do not expect to parse integers that are longer than 53 bits.
You do not expect parsing a JSON number will produce a Haskell
Scientific
value that exactly represents the mathematical value of the JSON number, or even the closest approximation in IEEE-754double
.For example if you parse numbers that would be denormal numbers in IEEE-754 double-precision numbers, you will very likely get back zero instead.
As another example, if you parse
0.3
, which cannot be represented exactly in IEEE-754double
, you will get back 0.30000000000000004, instead of the closest number in IEEE-754double
which is 0.29999999999999999.You wish to sacrifice memory consumption for speed. The intermediate memory needed is 9 times the number of bytes of the input
ByteString
plus constant. This excludes the memory needed by the parsedValue
or the memory of the inputByteString
.You are parsing JSON that is known to be valid. As a library written in C++, it is inherently unsafe. Despite the use of address sanitizer and AFL which have caught memory usage bugs, it is difficult to guarantee that creatively crafted input will not crash or cause arbitrary code execution. It is recommend that you use this library only for parsing known good JSON.
If you satisfy all of the above, then this library could work for you. Again, please benchmark and show that JSON parsing is indeed a bottleneck before using this library.
Ideas and Future Plan
Skip creating
Value
s when the you eventually want another Haskell type. Currently theFromJSON
instance needs aValue
, but perhaps we can devise another method to avoid the generation of these intermediateValue
s.Provide a benchmark so that we can quantitatively argue this library parses JSON faster than
Data.Aeson
.Provide an alternative for parsing floating point numbers that better preserves the value.
More complete tests, including QuickCheck-based tests.