Practical Considerations - HeisenbugLtd/spat GitHub Wiki

Optimizing proof times, a practical example

Just for fun, I finally tried to optimize the proof time for Saatana with my own tool.

First run with Z3 first and a rather large steps value:

Proof_Switches := ("--mode=all", "--proof=progressive", "--steps=131072", "-j0");
CVC4_First     := ("--prover=cvc4,z3,altergo") & Proof_Switches;
Z3_First       := ("--prover=z3,cvc4,altergo") & Proof_Switches;

--  Z3 seems significantly faster in proving certain VCs (see list below), so run it first.
for Proof_Switches ("ada") use Z3_First;

Time:

jellix@pikkumyy:~/projects/Saatana$ time gnatprove --assumptions --output-header -U -P saatana.gpr
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
Summary logged in /home/jellix/projects/Saatana/_build/gnatprove/gnatprove.out

real    4m11.275s
user    10m18.243s
sys     0m8.260s

Hmm, ok. We have a baseline result.

Suggested configuration

-- Proof_Switches as suggested by SPAT, timeout changed from 1 to 3 after proof failures.
for Proof_Switches ("saatana-crypto-lemmas.ads") use ("--prover=Z3", "--steps=1", "--timeout=3");
for Proof_Switches ("saatana-crypto-phelix.adb") use ("--prover=CVC4,Z3", "--steps=14009", "--timeout=3");
for Proof_Switches ("saatana-crypto.adb") use ("--prover=Z3", "--steps=1", "--timeout=3");
for Proof_Switches ("test_phelix_api.adb") use ("--prover=Z3", "--steps=1", "--timeout=3");

Applying that configuration

Trying to use exactly the suggested result (all timeout were 1) resulted in some proof failures, presumably because a timeout of only one second is a bit on the optimistic side once you take into account that the load of the machine may fluctuate rather heavily. So, after increasing that time to 3 seconds (as above), I got greeted with:

Time

jellix@pikkumyy:~/projects/Saatana$ time gnatprove --assumptions --output-header -U -P saatana.gpr 
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
Summary logged in /home/jellix/projects/Saatana/_build/gnatprove/gnatprove.out

real    4m6.088s
user    5m32.899s
sys     0m10.707s

Not too shabby... In CPU time that's almost a 50% gain.

Another suggestion?

Now, let's run spat again on that result (I mean, why not?):

for Proof_Switches ("saatana-crypto-lemmas.ads") use ("--prover=Z3", "--steps=1", "--timeout=1");
for Proof_Switches ("saatana-crypto-phelix.adb") use ("--prover=Z3,CVC4", "--steps=3986", "--timeout=2");
for Proof_Switches ("saatana-crypto.adb") use ("--prover=Z3", "--steps=1", "--timeout=1");
for Proof_Switches ("test_phelix_api.adb") use ("--prover=Z3", "--steps=1", "--timeout=1");

The (almost) only difference is that SPAT now suggests to let Z3 run first again. But of course, also with an even lower steps value. Which maybe even makes sense, when you consider the fact, even though we know that Z3 will definitely fail on some proofs, we may actually gain something overall, because even when Z3 inevitably will fail on certain verification conditions just like in the first run, it will now fail faster (literally), so CVC4 can take over earlier than before. The net result might be a gain in computing time...

Looking at the data, we should just try not to hit the timeouts too hard:

saatana-crypto-phelix.spark              => (Flow  => 179.6 ms,
                                             Proof => 1.9 s (3986 steps)

I mean, 1.9 is still less than 2, isn't it?

So let's see: We switch the provers again and lower the steps values even more, but to be on the safe side, let's leave the timeout at 3 seconds.

To be continued...

Alrighty, that was not even a commercial break for me, but...

real    1m39.221s
user    2m28.087s
sys     0m5.598s

Are you serious? From initially ~10:26 min down to 5:45, and now we get 2:34 or so?

That's like a 400% improvement. I must admit, I did not expect that. The first result was rather fine already...