Internal data format - HeisenbugLtd/spat GitHub Wiki

Basics

All data types denoting an entity extracted from the JSON files are derived from a single abstract tagged type in SPAT.Entity.

Inheritance Tree

Inheritance Tree These types map more or less directly to the JSON objects in the .spark files.

Putting it together

Parsed data is stored on a per entity basis in the SPARK_Info.T type in a hash map indexed by the entity name (see SPARK_Info.Analyzed_Entities). For the reason that the Timing_Item is associated with a single file, not an entity, these are stored outside of the entities structure in a hash map indexed by filenames.

Structure of an Analyzed_Entity

Each instance of an Analyzed_Entity contains a single tree (Multiway_Tree) which stores all entities. The nodes where certain entities start are marked with what I call Sentinel elements which collect information about their children. To simplify direct access, the Cursor to each of these Sentinel nodes is stored.

The tricky part was that each Proof_Item.T has children, too, these are instances of Proof_Attempts.T, corresponding to a path taken by the prover to prove it. An early version stored these within Proof_Item.T which is a logical solution, but this deemed inflexible, and relatively complex. So, instead, these items are stored within the same tree (which, honestly, was a bit of a headache to implement due to the way objects are being created, see SPAT.Proof_Item.Add_To_Tree). As with the other nodes, the hierarchy of Proof_Attempts is enclosed in another Sentinel parent object (which is how I map the JSON check-tree array).

The whole thing can be imagined roughly like that: Structure of Analyzed_Entity

Information Storage

For the initially intended purpose, the tree to store this information is a good solution, albeit rather redundant at times.

Entities are referenced through a hash table, which gives us access to the tree associated with this entity. Each Flow_Item or Proof_Item contains the source location and collects information about it's children, so information about maximum or total proof times is easily accessible without needing to work through the tree.