A JSON Parser for D - marler8997/json GitHub Wiki
Why?
The first question you should be asking is "Why would you write another JSON parser?" Here's why:
-
It's Fast
2 to 5 times faster than std.json and stdx.data.json
-
It's small
json.obj is under 150 KB when compiled with dmd on Windows:
| Size of json.obj | Compiler Options |
| --- | --- |
| 132 KB | Windows (dmd json.d -c) |
| 147 KB | Windows (dmd json.d -c -O -release -inline -boundscheck=off) |
-
Supports @nogc
Supports both a GC and a nogc version. Compile with -version=ParseJsonNoGC (the dmd spelling of the flag) to enable the nogc version. How does the parser support this? The following is a list of what needs memory and how it is handled:
| Memory Type | How to make it nogc |
| --- | --- |
| Json Objects and Arrays | Allows the user to pass an interface to the parser to allocate these structures. If no interface is passed in, the default malloc interface is used, which requires the user to free the memory. |
| Parser Object and Array Tree | Allocated on the function stack. |
| Other Parser Data Structures | The parser data structures are allocated on the function stack. |
| Exceptions | The parser passes the Exception up the stack to the caller where it can either throw it or pass it back to the caller. |
-
It supports Lenient JSON
The user can set a flag at runtime to make the parser accept "lenient json":
- Allows comments ('//' or '#' or '/* ... */')
- Allows unquoted strings
- Allows trailing comma after the last item in an array/object
This option is implemented at runtime rather than through a template parameter. This allows a program to support both strict and lenient json without needing to create two instances of the parser. It was done this way because the extra instructions needed to support lenient json are so few that it makes sense to support both options at runtime.
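As a rough illustration of why a runtime flag costs so little, here is a minimal sketch of a whitespace and comment skipper that takes leniency as an ordinary bool argument. The name skipTrivia and its signature are hypothetical and are not taken from this parser's source; the sketch only covers the comment rule, not unquoted strings or trailing commas.

```d
// Hypothetical helper, not taken from the parser's source: skips whitespace
// and, when `lenient` is true, also '//', '#' and '/* ... */' comments.
// Returns the index of the first significant character at or after `pos`.
size_t skipTrivia(const(char)[] text, size_t pos, bool lenient) pure nothrow @nogc @safe
{
    while (pos < text.length)
    {
        immutable c = text[pos];
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n')
        {
            pos++;
        }
        else if (lenient && (c == '#' ||
                 (c == '/' && pos + 1 < text.length && text[pos + 1] == '/')))
        {
            // Line comment: skip up to (not past) the end of the line.
            while (pos < text.length && text[pos] != '\n')
                pos++;
        }
        else if (lenient && c == '/' && pos + 1 < text.length && text[pos + 1] == '*')
        {
            // Block comment: skip past the closing "*/", or to the end of
            // the input if the comment is never closed.
            pos += 2;
            while (pos + 1 < text.length && !(text[pos] == '*' && text[pos + 1] == '/'))
                pos++;
            pos = (pos + 1 < text.length) ? pos + 2 : text.length;
        }
        else
        {
            break; // significant character (or a '/' that strict mode rejects later)
        }
    }
    return pos;
}

unittest
{
    // The same compiled function serves strict and lenient callers.
    assert(skipTrivia("  // c\n 1", 0, false) == 2); // strict: stops at '/'
    assert(skipTrivia("  // c\n 1", 0, true) == 8);  // lenient: stops at '1'
}
```

The only extra work in strict mode is a few tests of the `lenient` flag, which is the trade-off described above: a handful of instructions instead of a second template instantiation.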
-
Json File/Line/Column information is stored separately from the values.
This allows the compiler to pack the JSON values tightly together and allows the file/line/column information to be optional without wasting memory at runtime and without having to create multiple instances of the parser to support a template option.
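The layout idea can be sketched with a parallel array: values live in one array, and locations, when requested, live in a second array indexed the same way. All type and function names below (Value, Location, ParseResult, scanDigits) are hypothetical and chosen for illustration; the parser's real definitions may differ. The toy scanner only recognizes single decimal digits and uses GC appends for brevity.

```d
// Hypothetical types for illustration only.
struct Value
{
    double number; // payload only: no file/line/column fields here
}

struct Location
{
    uint line;
    uint column;
}

struct ParseResult
{
    Value[] values;       // tightly packed values
    Location[] locations; // stays empty unless locations were requested
}

// Because Value carries no location fields, it is exactly as big as its payload.
static assert(Value.sizeof == double.sizeof);

// Toy scanner: every decimal digit in `text` becomes one Value. When
// `trackLocations` is true, locations[i] describes where values[i] was found.
ParseResult scanDigits(const(char)[] text, bool trackLocations) pure nothrow @safe
{
    ParseResult result;
    uint line = 1, column = 1;
    foreach (c; text)
    {
        if (c >= '0' && c <= '9')
        {
            result.values ~= Value(c - '0');
            if (trackLocations)
                result.locations ~= Location(line, column);
        }
        if (c == '\n')
        {
            line++;
            column = 1;
        }
        else
        {
            column++;
        }
    }
    return result;
}

unittest
{
    auto withLocations = scanDigits("1 2\n3", true);
    assert(withLocations.locations[2] == Location(2, 1)); // the '3' on line 2

    auto withoutLocations = scanDigits("1 2\n3", false);
    assert(withoutLocations.values.length == 3);
    assert(withoutLocations.locations.length == 0); // nothing allocated for locations
}
```

Callers that never ask for locations pay nothing per value, and the choice is made with a runtime argument rather than a template parameter.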
-
No TLS data
-
Supports JSON with multiple roots.
The "parseJsonValues" function returns an array of Json values, whereas the "parseJson" function returns just a single Json value.
-
Templates have been placed strategically to minimize template bloat.
-
Unlike std.json, this parser will accept numbers that have any number of decimal digits or exponent digits, making it fully JSON compliant.
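To make the last point concrete: the JSON number grammar puts no upper bound on the number of fraction or exponent digits, so a scanner can recognize a number purely by its shape, without converting it to a machine type first. The following is a minimal sketch under that assumption; scanNumber and skipDigits are hypothetical names and this is not the parser's actual code.

```d
// Advances `i` past a run of ASCII digits and returns how many were consumed.
private size_t skipDigits(const(char)[] text, ref size_t i) pure nothrow @nogc @safe
{
    immutable start = i;
    while (i < text.length && text[i] >= '0' && text[i] <= '9')
        i++;
    return i - start;
}

// Returns the length of the JSON number token at the start of `text`, or 0
// if the text there is not a well-formed JSON number. No digit count is
// ever limited. A token such as "0" followed by more digits ("01") yields 1,
// leaving the stray digit for the caller to reject.
size_t scanNumber(const(char)[] text) pure nothrow @nogc @safe
{
    size_t i = 0;
    if (i < text.length && text[i] == '-')
        i++;

    // Integer part: a single '0', or one or more digits.
    if (i < text.length && text[i] == '0')
        i++;
    else if (skipDigits(text, i) == 0)
        return 0;

    // Optional fraction: '.' followed by one or more digits.
    if (i < text.length && text[i] == '.')
    {
        i++;
        if (skipDigits(text, i) == 0)
            return 0;
    }

    // Optional exponent: 'e' or 'E', an optional sign, one or more digits.
    if (i < text.length && (text[i] == 'e' || text[i] == 'E'))
    {
        i++;
        if (i < text.length && (text[i] == '+' || text[i] == '-'))
            i++;
        if (skipDigits(text, i) == 0)
            return 0;
    }
    return i;
}

unittest
{
    assert(scanNumber("-12.5e+10") == 9);
    // More digits than a double can represent are still accepted as a token.
    assert(scanNumber("3.14159265358979323846264338327950288")
        == "3.14159265358979323846264338327950288".length);
    assert(scanNumber("1.") == 0); // a fraction needs at least one digit
}
```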
Design
This parser is implemented as a state machine. The input characters act as transitions, and the states include "RootState", "ObjectKeyState", "ObjectValueState", "ObjectCommaState", and so on.
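The following toy validator sketches the character-driven state machine idea. It borrows the four state names mentioned above, but what each state does here is an assumption, and ObjectColonState, DoneState, skipString and acceptsFlatObject are invented for the example. It only handles a flat object whose keys and values are quoted strings without escape sequences, so it is far simpler than the real parser.

```d
// State names partly borrowed from the text; their behavior here is assumed.
enum State
{
    RootState,        // expecting '{'
    ObjectKeyState,   // expecting a key (or '}' for an empty object)
    ObjectColonState, // hypothetical: expecting ':'
    ObjectValueState, // expecting a value
    ObjectCommaState, // expecting ',' or '}'
    DoneState         // hypothetical: object closed
}

// `i` must index an opening quote. Returns the index just past the closing
// quote, or 0 if the string is unterminated. Escapes are not handled.
private size_t skipString(const(char)[] text, size_t i) pure nothrow @nogc @safe
{
    size_t j = i + 1;
    while (j < text.length && text[j] != '"')
        j++;
    return j < text.length ? j + 1 : 0;
}

bool acceptsFlatObject(const(char)[] text) pure nothrow @nogc @safe
{
    State state = State.RootState;
    uint members = 0;
    size_t i = 0;
    while (i < text.length)
    {
        immutable c = text[i];
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n')
        {
            i++;
            continue;
        }
        // Each non-whitespace character selects a transition for the current state.
        final switch (state)
        {
        case State.RootState:
            if (c != '{') return false;
            state = State.ObjectKeyState;
            i++;
            break;
        case State.ObjectKeyState:
            if (c == '}' && members == 0) { state = State.DoneState; i++; break; }
            if (c != '"') return false;
            i = skipString(text, i);
            if (i == 0) return false;
            state = State.ObjectColonState;
            break;
        case State.ObjectColonState:
            if (c != ':') return false;
            state = State.ObjectValueState;
            i++;
            break;
        case State.ObjectValueState:
            if (c != '"') return false;
            i = skipString(text, i);
            if (i == 0) return false;
            members++;
            state = State.ObjectCommaState;
            break;
        case State.ObjectCommaState:
            if (c == ',') state = State.ObjectKeyState;
            else if (c == '}') state = State.DoneState;
            else return false;
            i++;
            break;
        case State.DoneState:
            return false; // non-whitespace after the closing brace
        }
    }
    return state == State.DoneState;
}

unittest
{
    assert(acceptsFlatObject(`{ "a" : "b" , "c" : "d" }`));
    assert(!acceptsFlatObject(`{"a":"b",}`)); // strict: no trailing comma
}
```

All working state is a few local variables, which fits the earlier point that the parser's own data structures can live on the function stack.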
TODO
- Add support for UTF(16/32)(LE/BE). I may support these extra encodings through templates. Most applications will only need to support UTF-8, so having to create more instances to support more encodings may be an acceptable trade-off.
- Store JSON value locations in a separate data structure to prevent wasteful use of memory.