Guide: Using the Streaming Parser - louisphilipmarcoux/rill-json GitHub Wiki

The primary feature of rill-json is its high-performance, low-memory streaming parser. It works by turning the JSON input into an Iterator that you loop over, reacting to "events" as they are found. This allows you to parse multi-gigabyte files without ever loading the whole file into memory.

The main function is parse_streaming().

How "Events" Work

For example, the input {"id": 1} produces the following events, in order:

  1. ParserEvent::StartObject
  2. ParserEvent::Key("id".into())
  3. ParserEvent::Number(JsonNumber::I64(1))
  4. ParserEvent::EndObject
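
To make "reacting to events" concrete without depending on the crate, the sketch below replays the four events listed above through a small consumer. The `Event` enum is a hypothetical stand-in that mirrors only the variants shown here; the real `ParserEvent` and `JsonNumber` may use different payload types and have more variants. The function names are likewise made up for illustration.

```rust
// Hypothetical stand-in for the event type, limited to the variants
// listed above. This is NOT rill-json's actual definition.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartObject,
    Key(String),
    Number(i64),
    EndObject,
}

/// Hand-written event sequence for the input `{"id": 1}`.
fn events_for_id_object() -> Vec<Event> {
    vec![
        Event::StartObject,
        Event::Key("id".to_string()),
        Event::Number(1),
        Event::EndObject,
    ]
}

/// Returns the first number that directly follows the key `wanted`.
fn number_for_key(events: &[Event], wanted: &str) -> Option<i64> {
    let mut key_matched = false;
    for event in events {
        match event {
            // Remember whether the most recent key is the one we want.
            Event::Key(k) => key_matched = k.as_str() == wanted,
            // A number right after the matching key is its value.
            Event::Number(n) if key_matched => return Some(*n),
            // Any other event means the value was not a number.
            _ => key_matched = false,
        }
    }
    None
}
```

Calling `number_for_key(&events_for_id_object(), "id")` yields `Some(1)`. The consumer never sees the document as a whole, only one event at a time plus whatever state it chooses to keep.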

Basic Example: Finding a Single Value

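This example shows how to find the first string value associated with a given key. It remembers when the key was just seen and treats the string event that immediately follows as the value.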

use rill_json::{parse_streaming, ParserEvent, JsonNumber};

let json_data = r#"{ "name": "Babbage", "id": 1815 }"#;

let mut parser = parse_streaming(json_data).unwrap();
let mut found_name_key = false;

while let Some(event) = parser.next() {
    match event.unwrap() {
        // 1. We found a key...
        ParserEvent::Key(key) if key == "name" => {
            found_name_key = true;
        }
        // 2. ...so the *next* string event is our value.
        ParserEvent::String(value) if found_name_key => {
            println!("Found name: {}", value); // Prints "Babbage"
            break; // We're done, stop parsing.
        }
        // 3. Reset on any other event (another key, a non-string value, ...)
        _ => {
            found_name_key = false;
        }
    }
}
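
The example above calls unwrap() on every event, which panics on malformed input. One possible alternative, sketched here under assumptions, is to wrap the same loop in a function and propagate failures with `?`. The `Event` and `ParseError` types below are hypothetical stand-ins, not rill-json's real types, and `find_string` is an illustrative name; the only thing taken from the example is that the parser yields items that need unwrapping.

```rust
// Hypothetical stand-ins so the sketch compiles without the crate.
// rill-json's real event and error types may look different.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartObject,
    Key(String),
    String(String),
    Number(i64),
    EndObject,
}

#[derive(Debug, PartialEq)]
struct ParseError(String);

/// Finds the first string value that directly follows the key `wanted`.
/// Stops at the first parse error and returns it to the caller.
fn find_string<I>(events: I, wanted: &str) -> Result<Option<String>, ParseError>
where
    I: IntoIterator<Item = Result<Event, ParseError>>,
{
    let mut key_matched = false;
    for event in events {
        // `?` returns early with the error instead of panicking.
        match event? {
            Event::Key(key) => key_matched = key == wanted,
            Event::String(value) if key_matched => return Ok(Some(value)),
            _ => key_matched = false,
        }
    }
    // The stream ended without a matching key/string pair.
    Ok(None)
}
```

Because the function accepts any iterator of results, it can be exercised with a plain Vec in tests and, assuming the real parser's item type is adapted to match, fed by the parser in production code.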

Advanced Example: Parsing an Array of Objects

A more complex task is to parse a list of items. For example, let's find the username of every user in an array.

This requires a small state machine to track:

  • Are we inside the main users array? (in_users_array)
  • Did we just see a "username" key? (in_username_key)

use rill_json::{parse_streaming, ParserEvent};

let big_json = r#"
{
  "total": 2,
  "users": [
    { "id": 1, "username": "ada_l" },
    { "id": 2, "username": "babbage_c" }
  ]
}
"#;

let mut parser = parse_streaming(big_json).unwrap();

// Our simple state
let mut in_users_array = false;
let mut in_username_key = false;

let mut found_usernames = Vec::new();

while let Some(event) = parser.next() {
    match event.unwrap() {
        // We found the "users" key
        ParserEvent::Key(key) if key == "users" => {
            // We assume the value of "users" is an array, so the next
            // event should be StartArray (this sketch does not check it)
            in_users_array = true;
        }
        
        // We are in the "users" array and found a key
        ParserEvent::Key(key) if in_users_array && key == "username" => {
            in_username_key = true;
        }

        // We are in the right array *and* just saw the right key
        ParserEvent::String(value) if in_users_array && in_username_key => {
            found_usernames.push(value);
            in_username_key = false; // Reset for the next object
        }
        
        // We hit the end of the "users" array (this assumes user objects
        // contain no nested arrays; otherwise track nesting depth)
        ParserEvent::EndArray if in_users_array => {
            in_users_array = false; // We're done with this array
            break; // No need to parse the rest of the file
        }

        // Reset the username key flag if we see any other event
        _ => {
            in_username_key = false;
        }
    }
}

assert_eq!(found_usernames, vec!["ada_l", "babbage_c"]);
println!("Found users: {:?}", found_usernames);
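
The two boolean flags work for the flat input above, but an array or object nested inside a user would confuse them. One way to handle that, sketched here as an assumption rather than as the library's recommended approach, is to replace in_users_array with a nesting depth counter. As before, `Event` is a hypothetical stand-in for the real event type and `collect_usernames` is an illustrative name.

```rust
// Hypothetical stand-in for the event type (not rill-json's definition).
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartObject,
    EndObject,
    StartArray,
    EndArray,
    Key(String),
    String(String),
    Number(i64),
}

/// Collects the "username" of each object that sits directly inside
/// the "users" array, ignoring deeper nested arrays and objects.
fn collect_usernames(events: &[Event]) -> Vec<String> {
    let mut found = Vec::new();
    // 0 = outside "users", 1 = inside the array, 2 = inside a user object.
    let mut depth: usize = 0;
    let mut pending_users = false; // previous event was the "users" key
    let mut pending_username = false; // previous event was a "username" key

    for event in events {
        // Read and clear both flags: they only apply to the next event.
        let after_users_key = std::mem::take(&mut pending_users);
        let after_username_key = std::mem::take(&mut pending_username);
        match event {
            Event::StartArray if depth == 0 && after_users_key => depth = 1,
            Event::StartArray | Event::StartObject if depth > 0 => depth += 1,
            Event::EndArray | Event::EndObject if depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    break; // the "users" array just closed
                }
            }
            Event::Key(k) if depth == 0 => pending_users = k.as_str() == "users",
            Event::Key(k) if depth == 2 => pending_username = k.as_str() == "username",
            Event::String(v) if depth == 2 && after_username_key => found.push(v.clone()),
            _ => {}
        }
    }
    found
}
```

With this version, a user such as { "username": "ada_l", "tags": ["math"], "manager": { "username": "boss" } } contributes only "ada_l": the inner array no longer ends the scan early, and the nested "username" is skipped because it sits at depth 3.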