Understanding induction on evidence in Coq

I am working on the theorem ev_ev__ev in IndProp.v of Software Foundations (Vol 1: Logical Foundations).
Theorem ev_ev__ev : forall n m,
even (n+m) -> even n -> even m.
Proof.
intros n m Enm En. induction En as [| n' Hn' IHn'].
- (* En: ev_0 *) simpl in Enm. apply Enm.
- (* En: ev_SS n' Hn': even n'
with IHn': even (n' + m) -> even m *)
apply IHn'. simpl in Enm. inversion Enm as [| n'm H]. apply H.
Qed.
where even is defined as:
Inductive even : nat -> Prop :=
| ev_0 : even 0
| ev_SS (n : nat) (H : even n) : even (S (S n)).
At the point of the second bullet -, the context as well as the goal is as follows:
m, n' : nat
Enm : even (S (S n') + m)
Hn' : even n'
IHn' : even (n' + m) -> even m
______________________________________(1/1)
even m
I understand how m, n', Enm, Hn' in the context are generated. However, how is IHn' generated?

Induction hypotheses are systematically created for premises of constructors that are in the same type family. So, you can look at each constructor independently.
Assume you have an inductive definition of a type that starts with:
Inductive arbitraryName : A -> B -> Prop :=
An induction principle called arbitraryName_ind will be created, which starts with a quantification over an arbitrary predicate usually called P with the same type
forall P : A -> B -> Prop,
Now, if you have a constructor of the form
arbitrary_constructor : forall x y, arbitraryName x y -> ...
The induction principle will have a sub-clause for this constructor that starts with the same quantifications over all variables in the constructor, the same hypothesis, plus an induction hypothesis for the premise that relies on arbitraryName.
forall x y, arbitraryName x y -> P x y -> ...
Finally, each constructor of the inductive definition has to finish with an application of the defined type family (in this case arbitraryName). The end of the clause for this constructor applies the predicate P to the same arguments.
Let's go back to arbitrary_constructor and suppose it has the following full type:
arbitrary_constructor : forall x y, arbitraryName x y -> arbitraryName (g x y) (h x y)
In that case the clause in the induction principle is :
(forall x y, arbitraryName x y -> P x y -> P (g x y) (h x y))
In the case of even, there is a constructor ev_SS that has the following shape:
ev_SS : forall x, even x -> even (S (S x))
So the clause that is generated has the following shape:
(forall x, even x -> P x -> P (S (S x)))
The induction hypothesis IHn' corresponds exactly to the premise P x in this clause.
The full induction principle has the following shape.
forall P : nat -> Prop, P 0 ->
(forall x, even x -> P x -> P (S (S x))) ->
forall n, even n -> P n
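To compare the shape above with what Coq actually generates, one can ask Coq to display the principle. This is a minimal sketch that only restates the definition of even from the question; the bound variable names Coq prints may differ from the x used above.

```coq
(* The definition from the question. *)
Inductive even : nat -> Prop :=
| ev_0 : even 0
| ev_SS (n : nat) (H : even n) : even (S (S n)).

(* Display the automatically generated induction principle.
   It should have one premise per constructor (P 0 for ev_0, and a
   clause with an extra "P n" premise for ev_SS). *)
Check even_ind.
```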
When you type induction En, this theorem is applied. Its hypothesis even n, where n is universally quantified, is matched against the statement of En in the current goal. That statement is even n (with n fixed in the goal), so the universally quantified n is instantiated with the local n from the goal context. Then the tactic looks for all the hypotheses in the context in which this n appears. In this case there is Enm, so this hypothesis is used to define the P with which the induction principle will be instantiated. In a sense, Enm is put back in the goal's conclusion, as if one had executed revert Enm.
We need P n to be the same thing as even (n + m) -> even m. The most natural solution is that P is equal to the function fun x => even (x + m) -> even m
So in the second case of the proof by induction, a new n' is introduced and P is applied to n' to give the contents of the induction hypothesis:
(even (n' + m) -> even m)
and P is applied to S (S n') to give the contents of the final goal.
even (S (S n') + m) -> even m
Now, at the time of calling the induction tactic, the hypothesis Enm was in the context, so the statement even (S (S n') + m), which is morally an offspring of Enm, is put back in the context under the same name. Note that there was also a hypothesis named Enm in the other goal, but its statement was different there as well.
It is normal that you have a question on how this induction hypothesis was generated, because what happens actually involves several operations.
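The steps described above can be replayed by hand. The following is a sketch, not what the induction tactic literally executes: it reverts Enm and then supplies the predicate P explicitly to even_ind through refine. The name ev_ev__ev_manual and the variable names x, Hx, IHx, k, Hk are chosen here for illustration only.

```coq
Inductive even : nat -> Prop :=
| ev_0 : even 0
| ev_SS (n : nat) (H : even n) : even (S (S n)).

Theorem ev_ev__ev_manual : forall n m,
  even (n + m) -> even n -> even m.
Proof.
  intros n m Enm En.
  (* Put Enm back in the conclusion, as induction does implicitly. *)
  revert Enm.
  (* Instantiate P with fun x => even (x + m) -> even m. *)
  refine (even_ind (fun x => even (x + m) -> even m) _ _ n En).
  - (* Clause for ev_0: P 0 *)
    simpl. intros E. exact E.
  - (* Clause for ev_SS: forall x, even x -> P x -> P (S (S x)) *)
    intros x Hx IHx Enm.
    simpl in IHx. simpl in Enm.
    (* IHx : even (x + m) -> even m, the counterpart of IHn' *)
    apply IHx. inversion Enm as [| k Hk]. apply Hk.
Qed.
```

In the second bullet, IHx is P applied to x and the introduced Enm is the premise of P applied to S (S x), which matches the context shown in the question.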

Related

Destruct hypothesis: general case

It's pretty clear what destruct H does if H contains a conjunction or a disjunction. But I can't figure out what it does in the general case. It does something bizarre, especially if H: a -> b.
Some examples:
Lemma demo : forall (x y: nat), x=4 -> x=4.
Proof.
intros. destruct H.
The hypothesis is just destroyed:
1 subgoal
x, y : nat
______________________________________(1/1)
x = x
Another one:
Lemma demo : forall (x y: nat), (x = 4 -> x=4) -> True.
Proof.
intros. destruct H.
Now I have two branches:
1 subgoal
x, y : nat
______________________________________(1/1)
x = 4
1 subgoal
x, y : nat
______________________________________(1/1)
True
Third example. It's not provable but it still doesn't make sense to me:
Lemma demo : forall (x y: nat), (x = 4 -> x = 4) -> x = 4.
Proof.
intros. destruct H.
Now I have to prove x = x in the second branch!
2 subgoals
x, y : nat
______________________________________(1/2)
x = 4
______________________________________(2/2)
x = x
So, I clearly don't understand what destruct H does.
The cases you are referring to fall in two categories. If H : A and A is inductively or coinductively defined (e.g., conjunction and disjunction), then destruct H generates one subgoal for each constructor in that type, with additional hypotheses determined by the arguments of that constructor. On the other hand, if H : A -> B, then destruct H generates one subgoal where you have to prove A, and then continues recursively as if H : B. This is roughly equivalent to the following calls:
assert (H' : A); [ |specialize (H H'); destruct H].
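As a small illustration of the implication case, here is a sketch with two hypothetical lemmas (the names destruct_impl_demo and destruct_impl_explicit are made up for this example). The first uses destruct directly on H : A -> B /\ C; the second spells out the assert/specialize/destruct sequence given above.

```coq
Lemma destruct_impl_demo : forall A B C : Prop,
  A -> (A -> B /\ C) -> C.
Proof.
  intros A B C HA H.
  (* One subgoal asks for the premise A, the other has B /\ C split
     into HB and HC. Both are closed by a hypothesis in the context. *)
  destruct H as [HB HC]; assumption.
Qed.

Lemma destruct_impl_explicit : forall A B C : Prop,
  A -> (A -> B /\ C) -> C.
Proof.
  intros A B C HA H.
  assert (HA' : A).
  - (* Prove the premise of H. *)
    exact HA.
  - (* Feed it to H, then destruct the resulting conjunction. *)
    specialize (H HA'). destruct H as [HB HC]. exact HC.
Qed.
```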
The missing piece of the puzzle is that equality itself is defined as an inductive type:
Inductive eq (A : Type) (a : A) : A -> Prop :=
| eq_refl : eq A a a.
When you destruct something of type x = 4, Coq generates one case for each constructor of that type. But there is only one constructor in that type: eq_refl. When considering that case, Coq also automatically replaces occurrences of the RHS of destructed equality by the LHS (since both sides are equal for that constructor). In your first and third examples, this leads to replacing 4 in the goal with x.
Most of the time, you do not want to destruct an equality hypothesis, since this replacement behavior is not very useful. It is usually better to use the rewrite tactic, since it allows you to rewrite from right-to-left or left-to-right.
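For comparison, here is a minimal sketch of rewrite in both directions. The two lemmas and their names are invented for this illustration and are not part of the question.

```coq
Lemma rewrite_left_to_right : forall x y : nat,
  x = 4 -> y = x + 1 -> y = 5.
Proof.
  intros x y Hx Hy.
  (* Left-to-right: replace x with 4 in Hy. *)
  rewrite Hx in Hy.
  simpl in Hy.
  exact Hy.
Qed.

Lemma rewrite_right_to_left : forall x : nat,
  x = 4 -> 4 = x.
Proof.
  intros x Hx.
  (* Right-to-left: replace 4 with x in the goal. *)
  rewrite <- Hx.
  reflexivity.
Qed.
```

Unlike destruct, rewrite keeps the equality hypothesis in the context, so it can be reused later in the proof.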

Coq: Induction on associated variable

I can figure out how to prove my "degree_descent" Theorem below if I really need to:
Variable X : Type.
Variable degree : X -> nat.
Variable P : X -> Prop.
Axiom inductive_by_degree : forall n, (forall x, S (degree x) = n -> P x) -> (forall x, degree x = n -> P x).
Lemma hacky_rephrasing : forall n, forall x, degree x = n -> P x.
Proof. induction n; intros.
- apply (inductive_by_degree 0). discriminate. exact H.
- apply (inductive_by_degree (S n)); try exact H. intros y K. apply IHn. injection K; auto.
Qed.
Theorem degree_descent : forall x, P x.
Proof. intro. apply (hacky_rephrasing (degree x)); reflexivity.
Qed.
but this "hacky_rephrasing" Lemma is an ugly and unintuitive pattern to me. Is there a better way to prove degree_descent all by itself? For example, using set or pose to introduce n := degree x and then invoking induction n isn't working because it annihilates the hypothesis from the subsequent contexts (if someone could explain why this occurs, too, that would be helpful!). I can't figure out how to get generalize to work with me here, either.
PS: This is just weak induction for simplicity, but ideally I would like the solution to work with custom induction schemes via induction ... using ....
It looks like you would like to use the remember tactic:
Variable X : Type.
Variable degree : X -> nat.
Variable P : X -> Prop.
Axiom inductive_by_degree : forall n, (forall x, S (degree x) = n -> P x) -> (forall x, degree x = n -> P x).
Theorem degree_descent : forall x, P x.
Proof.
intro x. remember (degree x) as n eqn:E.
symmetry in E. revert x E.
(* Goal: forall x : X, degree x = n -> P x *)
Restart. From Coq Require Import ssreflect.
(* Or ssreflect style *)
move=> x; move: {2}(degree x) (eq_refl : degree x = _)=> n.
(* ... *)
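One way the remember-based proof could be completed is sketched below. It reuses the two induction cases from the question's hacky_rephrasing lemma. Two choices here are assumptions of this sketch rather than part of the original: the declarations are wrapped in a Section named DegreeDescent, and the axiom is stated as a section Hypothesis.

```coq
Section DegreeDescent.
  Variable X : Type.
  Variable degree : X -> nat.
  Variable P : X -> Prop.
  Hypothesis inductive_by_degree :
    forall n, (forall x, S (degree x) = n -> P x) ->
              (forall x, degree x = n -> P x).

  Theorem degree_descent : forall x, P x.
  Proof.
    intro x. remember (degree x) as n eqn:E.
    symmetry in E. revert x E.
    (* Goal: forall x : X, degree x = n -> P x *)
    induction n as [| k IHk]; intros x E.
    - (* n = 0: the premise S (degree y) = 0 is absurd. *)
      apply (inductive_by_degree 0); [ | exact E ].
      intros y K. discriminate K.
    - (* n = S k: reduce to the induction hypothesis. *)
      apply (inductive_by_degree (S k)); [ | exact E ].
      intros y K. apply IHk. injection K; auto.
  Qed.
End DegreeDescent.
```

Reverting both x and E before induction is what keeps the induction hypothesis general enough to apply to the fresh y.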

IndProp: ev_plus_plus

(** **** Exercise: 3 stars, standard, optional (ev_plus_plus)
This exercise just requires applying existing lemmas. No
induction or even case analysis is needed, though some of the
rewriting may be tedious. *)
Theorem ev_plus_plus : forall n m p,
even (n+m) -> even (n+p) -> even (m+p).
Proof.
intros n m p H1 H2.
Here is what I got:
1 subgoal (ID 89)
n, m, p : nat
H1 : even (n + m)
H2 : even (n + p)
============================
even (m + p)
I have proven the previous theorem:
Theorem ev_ev__ev : forall n m,
even (n+m) -> even n -> even m.
And wanted to apply it to H1, but
apply ev_ev__ev in H1.
gives an error:
Error: Unable to find an instance for the variable m.
Why can't it find "m" in the expression even (n + m)? How to fix?
Update
apply ev_ev__ev with (m:=m) in H1.
gives a very strange result:
2 subgoals (ID 90)
n, m, p : nat
H1 : even m
H2 : even (n + p)
============================
even (m + p)
subgoal 2 (ID 92) is:
even (n + m + m)
I thought that it will transform H1 to 2 hypothesis:
H11 : even n
H12 : even m
But instead it gave 2 subgoals, the second that we need to prove is more complicated than the initial one:
even (n + m + m)
What's happening here?
The statement forall n m, even (n+m) -> even n -> even m. does not mean "if we have that (n + m) is even then we have both that n is even and that m is even" (this is false, consider n = m = 1). Instead it means "if we have that (n+m) is even, and we have that n is even, then we have that m is even".
There is no way to get H11 : even n and H12 : even m just from H1 : even (n + m) without assuming a contradiction. I would suggest figuring out how to prove your theorem with pen and paper before trying to prove it in Coq.
Because Coq can't figure out what value it should give for m. You can apply the tactic eapply ev_ev__ev in H1. and see the goals
n, m, p : nat
H2 : even (n + p)
H1 : even ?m
============================
even (m + p)
subgoal 2 (ID 17) is:
even (n + m + ?m)
Coq has instantiated the m with a meta variable ?m, and you need to give a witness for this meta variable in the end to finish the proof.
The second approach is to apply the tactic while instantiating the value of m explicitly: apply ev_ev__ev with (m := m) in H1.
You can read more about the apply ... with tactic in Software Foundations: https://softwarefoundations.cis.upenn.edu/lf-current/Tactics.html
The thing that is happening is that Coq unifies H1 with the even n argument of ev_ev__ev instead of the even (n+m).
You can tell Coq exactly where you want H1 to go, and use _ wildcards for the places where you let Coq work out the details.
You probably wanted the term ev_ev__ev n m H1, of type even n -> even m, but your apply produced the term ev_ev__ev (n+m) m _ H1, which also left you with some more stuff to prove. To take a look at that term in the proof context, do
Check ev_ev__ev (n+m) m _ H1.
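The explicit application can also be added to the context with pose proof. The sketch below is self-contained, so it restates even and ev_ev__ev from above. The lemma explicit_instantiation and its extra hypothesis even n are made up for illustration, and it deliberately does not solve the ev_plus_plus exercise.

```coq
Inductive even : nat -> Prop :=
| ev_0 : even 0
| ev_SS (n : nat) (H : even n) : even (S (S n)).

Theorem ev_ev__ev : forall n m,
  even (n + m) -> even n -> even m.
Proof.
  intros n m Enm En. induction En as [| n' Hn' IHn'].
  - simpl in Enm. apply Enm.
  - apply IHn'. simpl in Enm. inversion Enm as [| n'm H]. apply H.
Qed.

(* Hypothetical lemma: shows H1 being sent to the first premise. *)
Lemma explicit_instantiation : forall n m p,
  even (n + m) -> even (n + p) -> even n -> even m.
Proof.
  intros n m p H1 H2 Hn.
  (* H1 fills the "even (n + m)" premise, not the "even n" one. *)
  pose proof (ev_ev__ev n m H1) as H1'.
  (* H1' : even n -> even m *)
  apply H1'. exact Hn.
Qed.
```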

Proof leaking in Coq extraction?

In order to understand how general recursive Function definitions work, and how they comply with Coq's structural recursion constraint, I tried to reimplement it on the Peano natural numbers. I want to define recursive nat -> nat functions that can use any previous values, not just the predecessor. Here is what I did:
Definition nat_strong_induction_set
(* erased at extraction, type specification *)
(P : nat -> Set)
(* The strong induction step. To build the P n it can, but does not have to,
recursively query the construction of any previous P k's. *)
(ind_step : forall n : nat, (forall k : nat, (lt k n -> P k)) -> P n)
(n : nat)
: P n.
Proof.
(* Force the hypothesis of ind_step as a standard induction hypothesis *)
assert (forall m k : nat, lt k m -> P k) as partial_build.
{ induction m.
- intros k H0. destruct k; inversion H0.
- intros k H0. apply ind_step. intros k0 H1. apply IHm. apply (lt_transitive k0 k).
assumption. apply le_lt_equiv. assumption. }
apply (partial_build (S n) n). apply succ_lt.
Defined.
I used some custom lemmas on nats that I didn't paste here. It works, I managed to define the euclidean division div a b with it, which recursively uses div (a-b) b. The extraction is almost what I expected :
let nat_strong_induction_set ind_step n =
let m = S n in
let rec f n0 k =
match n0 with
| O -> assert false (* absurd case *)
| S n1 -> ind_step k (fun k0 _ -> f n1 k0)
in f m n
Except for the n0 parameter. We see that the only effect of this parameter is to stop the recursion at the (S n)-th step. The extraction also mentions that this assert false should not happen. So why is it extracted? This seems better:
let nat_strong_induction_set ind_step n =
let rec f k = ind_step k (fun k0 _ -> f k0)
in f n
It looks like a glitch of Coq's structural recursion constraint, to ensure the termination of all recursions. The Coq definition of nat_strong_induction_set writes lt k n, so Coq knows only previous P k's will be queried. This makes a decreasing chain in the nats, which is forced to terminate in less than S n steps. This allows a structural recursive definition on an additional fuel parameter n0 starting at S n, it won't affect the result. So if it is only a part of the termination proof, why is it not erased by the extraction ?
Your match is not erased because your definition mixes two things: the termination argument, where the match is needed, and the computationally relevant recursive call, where it isn't.
To force erasure, you need to convince Coq that the match is computationally irrelevant. You can do so by making the termination argument -- that is, the induction on m -- produce the proof of a proposition instead of a function of type forall m k, lt k m -> P k. Luckily, the standard library provides an easy way of doing so, with the Fix combinator:
Require Import Coq.Arith.Wf_nat.
Definition nat_strong_induction_set
(P : nat -> Set)
(ind_step : forall n : nat, (forall k : nat, (lt k n -> P k)) -> P n)
(n : nat)
: P n :=
Fix lt_wf P ind_step n.
Here, lt_wf is a proof that lt is well-founded. When you extract this function, you get
let rec nat_strong_induction_set ind_step n =
ind_step n (fun y _ -> nat_strong_induction_set ind_step y)
which is exactly what you wanted.
(As an aside, note that you don't need well-founded recursion to define division -- check for instance how it is defined in the Mathematical Components library.)
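To reproduce the extracted code locally, something like the following should work. It is a sketch that repeats the Fix-based definition and then calls the extraction plugin; the exact text printed can vary between Coq versions, and newer releases may emit a deprecation warning for the Coq.Arith.Wf_nat path.

```coq
Require Import Coq.Arith.Wf_nat.
Require Extraction.

Definition nat_strong_induction_set
  (P : nat -> Set)
  (ind_step : forall n : nat, (forall k : nat, lt k n -> P k) -> P n)
  (n : nat)
  : P n :=
  Fix lt_wf P ind_step n.

(* Print the OCaml code for the definition and its dependencies. *)
Recursive Extraction nat_strong_induction_set.
```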

Induction on predicates with product type arguments

If I have a predicate like this:
Inductive foo : nat -> nat -> Prop :=
| Foo : forall n, foo n n.
then I can trivially use induction to prove some dummy lemmas:
Lemma foo_refl : forall n n',
foo n n' -> n = n'.
Proof.
intros.
induction H.
reflexivity.
Qed.
However, for a predicate with product type arguments:
Inductive bar : (nat * nat) -> (nat * nat) -> Prop :=
| Bar : forall n m, bar (n, m) (n, m).
a similar proof for nearly identical lemma gets stuck because all assumptions about variables disappear:
Lemma bar_refl : forall n n' m m',
bar (n, m) (n', m') -> n = n'.
Proof.
intros.
induction H.
(* :( *)
Why is this happening? If I replace induction with inversion, then it behaves as expected.
The lemma is still provable with induction but requires some workarounds:
Lemma bar_refl : forall n n' m m',
bar (n, m) (n', m') -> n = n'.
Proof.
intros.
remember (n, m) as nm.
remember (n', m') as n'm'.
induction H.
inversion Heqnm. inversion Heqn'm'. repeat subst.
reflexivity.
Qed.
Unfortunately, this way proofs get completely cluttered and are impossible to follow for more complicated predicates.
One obvious solution would be to declare bar like this:
Inductive bar' : nat -> nat -> nat -> nat -> Prop :=
| Bar' : forall n m, bar' n m n m.
This solves all the problems. Yet, for my purposes, I find the previous ("tupled") approach somewhat more elegant. Is there a way to keep the predicate as it is and still be able to do manageable inductive proofs? Where does the problem even come from?
The issue is that induction can only work with variables, not constructed terms. This is why you should first prove something like
Lemma bar_refl : forall p q, bar p q -> fst p = fst q.
which is trivially proved by now induction 1., and then use it to prove your lemma.
If you don't want the intermediate lemma to have a name, your solution is the correct one: you need to help Coq with remember to generalize your goal, and then you'll be able to prove it.
I don't remember exactly where this restriction comes from, but I recall something about making some unification problem undecidable.
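Put together, the intermediate-lemma approach might look like the sketch below. The general lemma is named bar_fst here (a name chosen for this example) so that it does not clash with the bar_refl of the question.

```coq
Inductive bar : (nat * nat) -> (nat * nat) -> Prop :=
| Bar : forall n m, bar (n, m) (n, m).

(* General statement: the indices are variables, so induction works. *)
Lemma bar_fst : forall p q, bar p q -> fst p = fst q.
Proof. now induction 1. Qed.

(* The original statement follows by specializing the general one. *)
Lemma bar_refl : forall n n' m m',
  bar (n, m) (n', m') -> n = n'.
Proof.
  intros n n' m m' H.
  apply bar_fst in H. simpl in H. exact H.
Qed.
```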
Often in these situations one can do induction on one of the sub-terms.
In your case your lemma can be proved by induction on n, with
Lemma bar_refl : forall n n' m m', bar (n, m) (n', m') -> n = n'.
Proof. induction n; intros; inversion H; auto. Qed.
... all assumptions about variables disappear... Why is this happening? If I replace induction with inversion, then it behaves as expected.
The reason that happens is described perfectly in this blog post:
Dependent Case Analysis in Coq without Axioms
by James Wilcox. Let me quote the most relevant part for this case:
When Coq performs a case analysis, it first abstracts over all indices. You may have seen this manifest as a loss of information when using destruct on predicates (try destructing even 3 for example: it just deletes the hypothesis!), or when doing induction on a predicate with concrete indices (try proving forall n, even (2*n+1) -> False by induction on the hypothesis (not the nat) -- you'll be stuck!). Coq essentially forgets the concrete values of the indices. When trying to induct on such a hypothesis, one solution is to replace each concrete index with a new variable together with a constraint that forces the variable to be equal to the correct concrete value. destruct does something similar: when given a term of some inductive type with concrete index values, it first replaces the concrete values with new variables. It doesn't add the equality constraints (but inversion does). The error here is about abstracting out the indices. You can't just go replacing concrete values with arbitrary variables and hope that things still type check. It's just a heuristic.
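The contrast between destruct with a remembered index and inversion, as described in the quote, can be sketched on a tiny case. The statement ~ even 1 and both lemma names are chosen here for illustration; even is the definition from the first question.

```coq
Inductive even : nat -> Prop :=
| ev_0 : even 0
| ev_SS (n : nat) (H : even n) : even (S (S n)).

(* inversion adds the equality constraints on the index by itself. *)
Lemma not_even_1_inversion : ~ even 1.
Proof.
  intros H. inversion H.
Qed.

(* With destruct, the constraint has to be introduced by hand:
   replace the concrete index 1 with a variable k plus an equation. *)
Lemma not_even_1_remember : ~ even 1.
Proof.
  intros H. remember 1 as k eqn:Ek.
  destruct H as [| n Hn].
  - (* ev_0: the equation relates 0 and 1 *)
    discriminate Ek.
  - (* ev_SS: the equation relates S (S n) and 1 *)
    discriminate Ek.
Qed.
```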
To give a concrete example, when using destruct H. one basically does pattern-matching like so:
Lemma bar_refl : forall n n' m m',
bar (n, m) (n', m') -> n = n'.
Proof.
intros n n' m m' H.
refine (match H with
| Bar a b => _
end).
with the following proof state:
n, n', m, m' : nat
H : bar (n, m) (n', m')
a, b : nat
============================
n = n'
To get almost the exact proof state we should've erased H from the context, using the clear H. command: refine (...); clear H.. This rather primitive pattern-matching doesn't allow us to prove our goal.
Coq abstracted away (n, m) and (n',m'), replacing them with some pairs p and p', such that p = (a, b) and p' = (a, b). Unfortunately, our goal has the form n = n' and there is neither (n,m) nor (n',m') in it -- that's why Coq didn't change the goal to a = a.
But there is a way to tell Coq to do that. I don't know how to do exactly that using tactics, so I'll show a proof term. It's going to look somewhat similar to Vinz's solution, but notice that I didn't change the statement of the lemma:
Undo. (* to undo the previous pattern-matching *)
refine (match H in (bar p p') return fst p = fst p' with
| Bar a b => _
end).
This time we added more annotations for Coq to understand the connections between the components of H's type and the goal -- we explicitly named the p and p' pairs and because we told Coq to treat our goal as fst p = fst p' it will replace p and p' in the goal with (a,b). Our proof state looks like this now:
n, n', m, m' : nat
H : bar (n, m) (n', m')
a, b : nat
============================
fst (a, b) = fst (a, b)
and simple reflexivity is able to finish the proof.
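For reference, the annotated match can be assembled into one complete script. This is a sketch that only combines the pieces shown above with the definition of bar from the question.

```coq
Inductive bar : (nat * nat) -> (nat * nat) -> Prop :=
| Bar : forall n m, bar (n, m) (n, m).

Lemma bar_refl : forall n n' m m',
  bar (n, m) (n', m') -> n = n'.
Proof.
  intros n n' m m' H.
  (* The "in ... return ..." annotation names the indices p and p'
     and states the goal in terms of them. *)
  refine (match H in (bar p p') return fst p = fst p' with
          | Bar a b => _
          end).
  (* Remaining goal: fst (a, b) = fst (a, b) *)
  reflexivity.
Qed.
```

The refine is accepted because fst (n, m) = fst (n', m') is convertible with the original goal n = n'.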
I think now it should be clear why destruct works fine in the following lemma (don't look at the answer below, try to figure it out first):
Lemma bar_refl_all : forall n n' m m',
bar (n, m) (n', m') -> (n, m) = (n', m').
Proof.
intros. destruct H. reflexivity.
Qed.
Answer: because the goal contains the same pairs that are present in the hypothesis's type, so Coq replaces them all with appropriate variables and that will prevent the information loss.
Another way ...
Lemma bar_refl n n' m m' : bar (n, m) (n', m') -> n = n'.
Proof.
change (n = n') with (fst (n,m) = fst (n',m')).
generalize (n,m) (n',m').
intros ? ? [ ]; reflexivity.
Qed.