Avoiding Common Modeling Mistakes - LS-Lab/KeYmaeraX-release GitHub Wiki

Theorem provers are great at telling us whether a formula is true, but they are awful at telling us whether a formula is the formula we actually wanted to prove. As an extreme example, if we wanted to know whether we Riemann Hypothesis is true, but instead asked a theorem prover whether 1 + 1 = 2, we have certainly learned very little about the Riemann Hypothesis when the theorem prover responds yes, 1 + 1 = 2 is true. The moral of the story is that when we verify a system in KeYmaera X, we must take special care that the model means the right thing. Otherwise, our proof could be just as meaningless as 1 + 1 = 2. The good news is that most of the common errors fall into a few classes that you can look for and avoid.

Most of the time, the theorem we want to prove looks like this:

For all safe starting states v, for all times t >= 0, the system is still safe at time t.

Excessive preconditions

The first way a model can be incorrect is if the definition of "all safe starting states" is too strict. For example, if we have a model of a car driving to some destination, then it would be a terrible precondition to say pos = destination. The hard part of making a car safe is making it safe when it's moving, so it's not meaningful if we say it's safe as long as it's already at the destination. That being said, all interesting models need some preconditions. If our precondition says "my car has enough space to brake", that's a totally reasonable precondition. To avoid this class of mistake, try to understand how often your preconditions are true. (If they are never true, then the model is definitely broken).

Insufficient time-nondeterminism

Another way that models fail is when they fail to describe "all times t". Recall that the formula [a]P means "P is true for all executions of a". A model is only accurate if "all runs of a" also means "all times t".

The first way that this goes wrong is if you add a test affecting a time variable: x= 0 -> [t := 0; {x' = 5, t' = 1 & t <= 10}; ?(t = 10)]x = 50 This formula is true because x = 50 is true at time 10, but it will clearly not be true at all times t!

This can also go wrong if the model prevents us from "seeing" interesting intermediate states. For example the following is true:

[{x' = v, v' = 1}; {x' = v, v' = -1}]v'=-1

This is because the formula [a]P only looks at all the end states of a. We need all of the "interesting" states to be end states of a, otherwise our formula doesn't mean what we thought. In the example above, we are only ever proving that we are safe while break, not safe while we accelerate, which is a potentially deadly mistake.

How do we avoid the mistake? Unless you have an extremely good reason, always structure your models as loops (ctrl;physics)* where ctrl is a discrete program implementing the control logic and physics is one or more differential equations. Why does this help? Because discrete programs always take 0 time and differential equations nondeterministically take any amount of time. So we can easily decompose the meaning of our model as "we might do any number of control loops, and for each loop we know we're safe throughout the entire loop"

Sneaky hidden assumptions

It is also important that we only make safety assumptions about the starting state. It is surprisingly easy to write down a model that says "assuming the system is safe at all times t, it is safe at all times t", which does not tell us anything useful! Specifically, any time we write down a test ?(P) or a differential equation {x' = theta & P}, we are making an assumption that P is true in the current state, not proving it.

What do we do? The solution is different for tests and ODEs, but in both cases the solution is to use good style and write our models in a way that makes it easy to check that our tests and ODEs don't add any unwanted assumptions.

Avoiding Hidden Assumptions in Tests

Tests are most often used to implement if-then-else style control flow. This is ok because we don't assume some formula P is true. An if-then-else just assumes P | !P which is always true, and does something different based on whether P is true or not. To make sure your tests are ok, always make sure the test in your last branch is the negation of the previous test(s), like so:

{?(P); a} ++ {?(!P); b}

or

{?(P); a} ++ {?(Q); b} ++ {?(!P & !Q); c}

Avoiding Hidden Assumptions in ODEs

The simplest modeling mistake with ODEs is to put formulas inside a domain constraint that we ought to prove instead of assume. For example if we have a safety theorem "safe", then

[{x' = theta & ... some physics ... & safe}]safe

is always trivially true, and we don't even need to look at the physics to prove it! Make sure the only things you ever put in the domain constraint are assumptions about the physical world (and acceptable assumptions at that). Things like "braking never makes a car go backwards" or "a falling ball never falls through the ground"

This mistake also occurs when implementing an event-triggered model by putting multiple ODEs inside a choice. Recall that the different domain constrains in an event-triggered model represent different discrete modes of operation, and the model is only accurate if the model allows the possibility of switching between modes. i.e. if we have

a = {x' = theta1 & P} ++ {y' = theta & Q}

then the modes are P and Q, and the model is only accurate if P & Q is satisfiable - otherwise it is impossible to switch modes. Why is switching modes so important? Generally our precondition is strong enough that the initial state is always in the same mode (say mode P). If the ODE does not allow us to switch to mode Q, then it's like mode Q doesnt even exist, and so it's like we made a hidden assumption "much of the physical world doesn't exist, and, by the way, that includes all the dangerous parts".

The region P & Q is the region where we switch modes, and is also the region where the controller will fire. Generally, the easiest way to implement this region is by using nonstrict inequalities >= and <= in your domain constraints (e.g. >= in P and <= in Q) in which case you switch modes when the two sides of the inequality are exactly equal.