Investigating issue #50 - HeisenbugLtd/spat GitHub Wiki

Clarified by Johannes Kanig:

The number of steps reported per prover in the check_tree array indeed denotes a different value than the ones reported in the subsequent stats objects and the one from the stats object is the proper one to use.

Example (shortened to relevant parts):

"check_tree": [
   {
      "proof_attempts": {
         "Z3": {
            "result": "Valid",
            "steps": 8989120,
            "time": 2.67000000000000E+00
         }
      },
   }
],
"stats": {
   "Z3": {
      "count": 1,
      "max_steps": 10674,
      "max_time": 2.67000007629395E+00
   }
}

As can be seen here, while the time mostly matches (so far I've seen very minor differences which can probably be attributed to different set up times or something), the steps reported in the check_tree array is close to 9 million, while the max_steps from the stats object claims only about 10 thousand steps. The relationship between these two values is linear, but there is some scaling factor involved (to have a comparable number of steps independent of the actual prover's reported steps). See this spreadsheet, and this source which confirms the exact scaling factors I found.

Hence, the stats/<prover_name>/max_steps is the value that should be reported and used for comparisons etc., not the check_tree/proof_attempts/<prover_name>/steps.