methods - CAVE-PNP/cave-pnp GitHub Wiki

Proof methods (work in progress)

Quick Reference

Proof Meta-Tools

These are not proof methods, but will assist in solving goals.

  • try - try everything (built-in & external solvers, counterexample finders)
  • try0 - try fast built-in solvers
  • sledgehammer - try external tools (slower but more powerful)
  • solve_direct - check for lemmas that prove the goal directly
  • quickcheck, nitpick - counterexample finders

Automated Solvers

Well-integrated into Isabelle/HOL, these can be thrown at most goals without additional knowledge or setup. Note that when introducing definitions, it will be necessary to mark relevant facts for the simplifier and classical reasoner.

  • auto combines the methods of simp and blast (acts on all subgoals)
  • simp repeatedly rewrites the goal with simplification rules (the simplifier)
    • simp_all acts on all subgoals
    • code_simp applies simp with only code generation rules (see eval)
  • blast applies natural-deduction-style reasoning (the classical reasoner)
  • force and fastforce are comparable to auto in terms of capabilities

Specialized Solvers

These are either specialized towards certain problems, or require significant effort in order to be usable on general goals.

  • arith, linarith solve linear arithmetic problems (often outperformed by simp)
  • metis, meson use model elimination/resolution (often generated by sledgehammer)
    • unlike most other methods, these only have access to the facts provided to them manually
  • smt uses the Z3 SMT solver (often generated by sledgehammer)
  • auto2 is the method of the Auto2 proof framework in Isabelle (needs AFP import Auto2_HOL.Auto2_Main)

Single-step/action methods

In contrast to automated methods, these do not perform reasoning themselves, but are rather used to manipulate the goal state directly. This is often useful in order to make goals simpler for the automated methods.

  • induction applies proof-by-induction and similar case-distinctions
  • cases applies arbitrary case-distinctions
  • unfold repeatedly rewrites the goal given arbitrary equations ("definitional rewrite rules")
    • equivalent to the unfolding statement
    • fold applies them in reverse
    • rewrite grants more control (needs import HOL-Library.Rewrite)
    • subst applies a rewrite rule once (if it has any premises, these will be added as new subgoals)
  • intro, elim repeatedly apply introduction and elimination rules, respectively
  • rule applies a single rule (backward reasoning)
  • fact solves the goal if the given rule (or some fact already shown in the current proof) matches the goal exactly
  • assumption solves the goal if it already appears in its assumptions
  • contradiction solves any goal if P and ¬P appear in the assumptions
  • insert adds facts as assumptions to the current goal
    • similar to chaining facts with then or using statements
  • intro_locales, unfold_locales, intro_classes apply locale/class definitions

Improper methods

These only perform informal verification and circumvent formal verification through the use of proof oracles, which dynamically introduce axioms. Results proven using improper methods should not be given the same amount of trust as regular proofs, but they can be useful as quick estimates or when numeric evaluation is sufficient.

To check whether an oracle was involved in proving a fact (or lemma), use thm_oracles <fact>

  • eval calculates the result using the unverified code generator (oracle holds_by_evaluation)
    • can often be replaced by the slower, but proper method code_simp
  • approximation calculates bounded inequalities (oracle holds, needs import HOL-Decision_Procs.Approximation)

Combination operators

These can be used to chain methods together and solve complex goals. However, they should not be overused, as this can quickly make any proof unreadable. Instead, try using Isar to add intermediate steps or state subgoals more explicitly.

If a complex combination pattern is used frequently, consider extracting it with Eisbach, Isabelle's proof method definition framework.

  • <method1>, <method2> apply method1 then apply method2
  • <method1>; <method2> apply method1 then apply method2 to all subgoals generated by method1
  • <method1> | <method2> apply method1. if it fails, apply method2
  • <method>+ apply method to each subgoal in order until it fails for one
  • <method>? apply method and ignore failure

Related Statements/Commands

  • .. (= by standard) tries to solve the goal in a single intro/elim step
  • . (= by this) solves the goal if the assumptions (applied in order) already match it
  • sorry skips the proof entirely (oracle skip_proof)
  • oops cancels the entire proof

Less Relevant Methods

  • fast, best, slow follow a similar approach as blast
    • fastforce, bestsimp, slowsimp integrate the simplifier into these methods, respectively

Automated Proof Finding Tools

While there is value in knowing what each method and tactic is capable of and how they do it, often it may be faster, easier, and/or more complete to just throw everything at the problem and see what sticks. Isabelle provides a few such convenience commands.

Note that these commands do not appear in finished proof documents, but only find and suggest valid Isabelle proofs. Remove them after applying the solution.

try0 (almost) all the automated tools

Is there a trivial solution to my problem?

try0 applies many of the built-in solvers and reports results. This is faster than trying out several methods manually and may even demonstrate some obscure method that easily solves the goal where others fail.

Examples:

lemma "1 + 1 = 2" try0 (* Try this: by simp *)
lemma "(∃y. ∀x. F x y ⟷ F x x) ⟶ ¬ (∀x. ∃y. ∀z. F z y ⟷ ¬ F z x)"
  try0 (* Try this: by blast *)

Note: This is very helpful for deciding which method may be best suited for a particular goal.

try everything

You solve this while I go grab a coffee.

try extends this by additionally calling sledgehammer (external tools) and solve_direct (is there a lemma for this?), as well as quickcheck and nitpick (counterexample finders).

Examples:

lemma "1 + 1 = 2" try (* Try0 found a proof: by simp *)

lemma "∄x::nat. x^2 = 2"
  try0 (* No proof found *)
  try (* "some_solver": Try this: by ... *)

lemma "∄x::nat. x^2 = 4"
  try0 (* No proof found *)
  try (* Quickcheck found a counterexample: x = 2 *)

Compared to try0 this is a lot more powerful, since sledgehammer can take any available proven fact into account, and quickcheck and nitpick readily point out errors in the proposition. However, this also means it is significantly slower and requires more resources.

Sledgehammer

Take any lemma you want (and all the CPUs and RAM you need), but find me a proof!

sledgehammer collects available facts (those from the current and imported session) it deems relevant and launches a few external automated theorem provers (ATP) in parallel. If any of them report a solution, it then converts this into a valid Isabelle proof. Each of these three phases may take significant amounts of time, but make sledgehammer the most powerful tool available to Isabelle users.

Proofs generated by sledgehammer are often hard to read and inefficient, but provide two key insights: (1) the proposition is provable and (2) a set of facts that can be used to prove it. Depending on the circumstances (1) can be entirely sufficient. In other cases it may be beneficial to take a look at (2) and try to construct a step-by-step Isar proof from it.

Note: calling sledgehammer directly (instead of through try) has the benefit of displaying Proof found... as soon as any external prover reports a solution. As there are no other kinds of progress indicators, this provides quicker feedback.

Usage Tips

Specify relevant lemmas: The methods applied by try0 will only be able to use specific lemmas, so their knowledge can be quite limited at times, especially when working with new definitions. Additionally, even though sledgehammer is able to search for useful lemmas by itself, it may save time and improve the results to manually provide facts known to be relevant.

lemma "..." using relevant_facts try0

Manually apply induction or other case-distinctions (where applicable): The automated tools are notoriously bad at applying case-distinctions. This is covered in the main tutorial "Programming and Proving in Isabelle/HOL".

fun sum_upto :: "nat ⇒ nat" where
  "sum_upto 0 = 0"
| "sum_upto (Suc n) = Suc n + sum_upto n"

lemma prog_prove_ex2_5: "sum_upto n = n * (n+1) div 2"
  try (* Tried in vain *)
proof (induction n) (* Proof outline with cases: *)
  case 0
  then show ?case try0 (* Try this: *) by simp
next
  case (Suc n)
  then show ?case try0 (* Try this: *) by simp
qed

Automated Solvers

The Simplifier

The simplifier simp repeatedly rewrites terms with simplification rules ("definitional rewrite rules"), until either the goal has been proven or no more rules apply. Simplification rules are generally equivalences of the form A ≡ B; given this rule, all instances of A will unconditionally replaced by B. The simplifier can also work with conditional rules of the form P ⟹ A ≡ B, in which case A will only be replaced, if the simplifier can prove P (recursively).

Simplification will always only replace the left side of rules by the right side (for A ≡ B, replace A by B), to avoid looping. This means that for simplification rules, the left side should always be simpler than the right side (note that simpler does not necessarily mean shorter). In particular, simplification rules should never include the left side as part of the right side; rules like f x = g (f (g x)) will lead to simp looping whenever f is encountered.

By default, simp only uses predefined simplification rules and ones derived from the assumptions, but its behaviour can be modified on the spot using flags, for instance by (simp add: some_rule).

  • add: use these additional rules for simplification
  • del: do not use the following rules
  • only: use these and nothing else
  • flip: use these rules in reverse
    • flip: ‹A ≡ B› is equivalent to add: ‹B ≡ A› del: ‹A ≡ B›
  • split: specifies splitter rules (simp version of case-distinctions)
    • see for instance if_split orw option.split
    • use split!: to mark a split as safe: after each split, safe will be applied, effectively splitting the goal
    • use split del: to remove split rules
  • cong: specifies congruence rules (how to prove (f a b ...) = (f x y ...))
    • useful for adding assumptions for arguments
    • see for instance thm HOL.all_cong: when proving (∀x. Q x ⟶ P x) = (∀x. Q x ⟶ P' x), P x = P' x only needs to hold when Q x holds. This allows the simplifier to use Q x when trying to prove P x = P' x
    • use cong del: to remove congruence rules

Arbitrary rules can also be marked as simplification rules globally by adding [simp] after their name (similar for [split] and [cong]).

lemma some_rule[simp]: "1 + 1 = 2" <proof>

Only mark rules as simplification rules if you always want to apply them, as simp will blindly apply any matching rules. In most cases it is easier to remember to add a specific rule on the spot when it is needed, than to figure out why simp gets stuck when it is not relevant.

Single Step Tactics

TODO induct/induction

  • induction by function rules, as opposed to "normal" induction following the rules of the function Suc/S. See prog-prove.pdf, 4.4.3 Computation Induction
  • other induction rules like nat_induct_non_zero and nat_induct_at_least

TODO [intro]/introduction rules

TODO sorry

listed from ~weak to strong

TODO the .. ("default") and . ("immediate") tactics. something like:

  • . proves an already known result (including rewriting constant abbreviations)
  • .. proves a goal through a single application of one of the natural deduction axioms

TODO unfold (acts on all subgoals)

TODO auto (acts on all subgoals)

More Tools

Auto2

Auto2 is a proof framework for Isabelle. It introduces the powerful auto2 method and a custom proof language similar to Isar, both of which can be used alongside Isabelle/Isar proofs.

The proof language is a step further toward reasoning in steps that a human may take ("proof scripts at a level of detail comparable to human exposition"). In it, it is not necessary to reference local facts directly (everything in the local context is available (?)), or specify methods used to proof an individual step (everything is by auto2).

Example: Excerpt from Auto2_HOL/HOL/Primes_Ex.thy (re-proves results from HOL/Computational_Algebra/Primes.thy as a benchmark).

(* from Auto2_HOL/HOL/Primes_Ex.thy *)

definition prime :: "nat ⇒ bool" where [rewrite]:
  "prime p = (1 < p ∧ (∀m. m dvd p ⟶ m = 1 ∨ m = p))"

theorem exists_prime: "∃p. prime p"
@proof @have "prime 2" @qed

lemma prime_odd_nat: "prime p ⟹ p > 2 ⟹ odd p" by auto2

Trying to reproduce exists_prime in regular Isar yields:

theorem exists_prime: "∃p. prime p"
proof
  show "prime 2" unfolding prime_def
    sledgehammer (* "spass": Try this: *)
    by (meson dvd_add_triv_left_iff dvd_antisym dvd_trans even_plus_one_iff one_dvd one_less_numeral_iff semiring_norm(76))
qed

These and more examples from Primes_Ex.thy demonstrate impressively concise proofs, but these seem harder to read (this may be due to missing familiarity with the framework).

In the examples there is a significant amount of "setup" work; marking lemmas for use in auto2 and specifying the necessary conditions therefore (these are similar but apparently not equal to the existing simp, intro, ... tags in Isabelle). In addition, it is currently not possible (?) to incorporate Isar and the auto2 proof language as subproofs in each other. These two properties seem to entail that projects can be developed either mainly with or mainly without auto2 (and not just using by auto2 as addition to other tools).

Auto2 is included in the AFP. To use it, import Auto2_HOL.Auto2_Main.

Sources, Further Reading

The cookbook's section on Proof Methods is recommended reading. For more information see the isar-ref.pdf ch. 9 Generic tools and packages, especially chs. 9.2.1-9.2.2, 9.3.1, 9.4.3-9.4.4. Refer to chs. 12.1-12.2 for more tools (see also chs. 12.4-12.8 for some more uncommon tactics). See chs. 9.3.2, 9.4.2 for theorem attributes that allow the automatic tools to use them effectively.

⚠️ **GitHub.com Fallback** ⚠️