How to apply a function once during simplification in Coq? - coq

From what I understand, function calls in Coq are opaque.
Sometimes, I need to use unfold to apply it and then fold to turn the function definition/body back to its name. This is often tedious. My question is, is there an easier way to let apply a specific instance of a function call?
As a minimal example, for a list l, to prove right-appending [] does not change l:
Theorem nil_right_app: forall {Y} (l: list Y), l ++ [] = l.
Proof.
induction l.
reflexivity.
This leaves:
1 subgoals
Y : Type
x : Y
l : list Y
IHl : l ++ [] = l
______________________________________(1/1)
(x :: l) ++ [] = x :: l
Now, I need to apply the definition of ++ (i.e. app) once (pretending there are other ++ in the goal which I don't want to apply/expand). Currently, the only way I know to implement this one time application is to first unfold ++ and then fold it:
unfold app at 1. fold (app l []).
giving:
______________________________________(1/1)
x :: l ++ [] = x :: l
But this is inconvenient as I have to figure out the form of the term to use in fold. I did the computation, not Coq. My question boils down to:
Is there a simpler way to implement this one-time function application to the same effect?

You can use simpl, compute or vm_compute if you want to ask Coq to perform some computation for you. If the definition of the function is Opaque, the above solution will fail, but you could first prove a rewriting lemma such as:
forall (A:Type) (a:A) (l1 l2: list A), (a :: l1) ++ l2 = a :: (l1 ++ l2).
using your technique, and then rewrite with it when necessary.
Here is an example using simpl:
Theorem nil_right_app: forall {Y} (l: list Y), l ++ nil = l.
Proof.
(* solve the first case directly *)
intros Y; induction l as [ | hd tl hi]; [reflexivity | ].
simpl app. (* or simply "simpl." *)
rewrite hi.
reflexivity.
Qed.
To answer your comment, I don't know how to tell cbv or compute to only compute a certain symbol. Note that in your case, they seem to compute too eagerly and simpl works better.

Related

Lemma about list and rev(list)

Trying to prove the following lemma I got stuck. Usully theorems about lists are proven using induction, but I don't know where to move next.
Lemma reverse_append : forall (T : Type) (h : T) (t : list T), h::t = rev(t) ++ [h] -> t = rev(t).
Proof.
intros. induction t.
- simpl. reflexivity.
- simpl. simpl in H.
Result:
1 subgoal (ID 522)
T : Type
h, x : T
t : list T
H : h :: x :: t = (rev t ++ [x]) ++ [h]
IHt : h :: t = rev t ++ [h] -> t = rev t
============================
x :: t = rev t ++ [x]
Main answer
Before you start proving your theorem, you should try to thoroughly understand what your theorem says. Your theorem is simply wrong.
Counterexample: 2 :: [1;2] = rev [1;2] ++ [2], but [1;2] is not a palindrome.
Full proof:
Require Import List.
Import ListNotations.
Lemma reverse_append_false :
~(forall (T : Type) (h : T) (t : list T), h::t = rev(t) ++ [h] -> t = rev(t)).
Proof. intros H. specialize (H nat 2 [1;2] eq_refl). inversion H. Qed.
Minor issues
rev(t) should be just rev t. Just an aesthetic point, but probably you should get yourself more familiar to writing in functional programming style.
Usually theorems about lists are proven using induction
This is a pretty naive statement, though technically correct. There are so many ways to do induction on a value, and choosing the induction that works best is a crucial skill. To name a few:
Induction on the list
Induction on the length of the list
arises quite frequently when dealing with rev and other functions that preserve length
Example
If simple induction doesn't work, consider a custom induction scheme
nat_ind2
The lemma isn't true as stated. Before proving anything, you should make sure it makes sense. The hypothesis is essentially saying that h::t = rev (h::t). But why would that mean that t = rev t? If you remove an element from the start of a palindromic list, it probably won't be a palindrome anymore. For example, deified is palindrome ('deified' = rev 'deified'), but eified isn't a palindrome.
For an example in this particular situation, 1::[2; 1] = (rev [2; 1]) ++ [1], since both are [1; 2; 1]. But [2; 1] is not equal to rev [2; 1] = [1; 2].

What should be done when simpl does not reduce all the necessary steps?

The following example is from chapter Poly of the Software Foundations book.
Definition fold_length {X : Type} (l : list X) : nat :=
fold (fun _ n => S n) l 0.
Theorem fold_length_correct : forall X (l : list X),
fold_length l = length l.
Proof.
intros.
induction l.
- simpl. reflexivity.
- simpl.
1 subgoal
X : Type
x : X
l : list X
IHl : fold_length l = length l
______________________________________(1/1)
fold_length (x :: l) = S (length l)
I expected it to simplify a step here on the left side. It certainly should be able to.
Theorem fold_length_correct : forall X (l : list X),
fold_length l = length l.
Proof.
intros.
induction l.
- simpl. reflexivity.
- simpl. rewrite <- IHl. simpl.
1 subgoal
X : Type
x : X
l : list X
IHl : fold_length l = length l
______________________________________(1/1)
fold_length (x :: l) = S (fold_length l)
During the running of the tests I had an issue where simpl would refuse to dive in, but reflexivity did the trick, so I tried the same thing here and the proof succeeded.
Note that one would not expect reflexivity to pass given the state of the goal, but it does. In this example it worked, but it did force me to do the rewrite in the opposite direction of what I intended originally.
Is it possible to have more control over simpl so that it does the desired reductions?
For the purposes of this answer, I'll assume the definition of fold is something along the lines of
Fixpoint fold {A B: Type} (f: A -> B -> B) (u: list A) (b: B): B :=
match u with
| [] => b
| x :: v => f x (fold f v b)
end.
(basically fold_right from the standard library). If your definition is substantially different, the tactics I recommend might not work.
The issue here is the behavior of simpl with constants that have to be unfolded before they can be simplified. From the documentation:
Notice that only transparent constants whose name can be reused in the recursive calls are possibly unfolded by simpl. For instance a constant defined by plus' := plus is possibly unfolded and reused in the recursive calls, but a constant such as succ := plus (S O) is never unfolded.
This is a bit hard to understand, so let's use an example.
Definition add_5 (n: nat) := n + 5.
Goal forall n: nat, add_5 (S n) = S (add_5 n).
Proof.
intro n.
simpl.
unfold add_5; simpl.
exact eq_refl.
Qed.
You'll see that the first call to simpl didn't do anything, even though add_5 (S n) could be simplified to S (n + 5). However, if I unfold add_5 first, it works perfectly. I think the issue is that plus_5 is not directly a Fixpoint. While plus_5 (S n) is equivalent to S (plus_5 n), that isn't actually the definition of it. So Coq doesn't recognize that its "name can be reused in the recursive calls". Nat.add (that is, "+") is defined directly as a recursive Fixpoint, so simpl does simplify it.
The behavior of simpl can be changed a little bit (see the documentation again). As Anton mentions in the comments, you can use the Arguments vernacular command to change when simpl tries to simplify. Arguments fold_length _ _ /. tells Coq that fold_length should be unfolded if at least two arguments are provided (the slash separates between the required arguments on the left and the unnecessary arguments on the right).[sup]1[\sup]
A simpler tactic to use if you don't want to deal with that is cbn which works here by default and works better in general. Quoting from the documentation:
The cbn tactic is claimed to be a more principled, faster and more predictable replacement for simpl.
Neither simpl with Arguments and a slash nor cbn reduce the goal to quite what you want in your case, since it'll unfold fold_length but not refold it. You could recognize that the call to fold is just fold_length l and refold it with fold (fold_length l).
Another possibility in your case is to use the change tactic. It seemed like you knew already that fold_length (a :: l) was supposed to simplify to S (fold_length l). If that's the case, you could use change (fold_length (a :: l)) with (S (fold_length l)). and Coq will try to convert one into the other (using only the basic conversion rules, not equalities like rewrite does).
After you've gotten the goal to S (fold_length l) = S (length l) using either of the above tactics, you can use rewrite -> IHl. like you wanted to.
I thought the slashes only made simpl unfold things less, which is why I didn't mention it before. I'm not sure what the default actually is, since putting the slash anywhere seems to make simpl unfold fold_length.

Logic: auxilliry lemma for tr_rev_correct

In Logic chapter a tail recursive version of reverse list function is introduced. We need to prove that it works correctly:
Fixpoint rev_append {X} (l1 l2 : list X) : list X :=
match l1 with
| [] => l2
| x :: l1' => rev_append l1' (x :: l2)
end.
(* Tail recursion rev *)
Definition tr_rev {X} (l : list X) : list X :=
rev_append l [].
But before proving it I wanted to prove a lemma:
Lemma rev_append_app: forall (X: Type) (x: X) (l : list X),
rev_append l [x] = rev_append l [] ++ [x].
Proof.
intros X x l. induction l as [| h t IH].
- simpl. reflexivity.
- simpl.
Here I am stuck:
X : Type
x, h : X
t : list X
IH : rev_append t [x] = rev_append t [ ] ++ [x]
============================
rev_append t [h; x] = rev_append t [h] ++ [x]
What to do next?
As you noticed during your attempted proof, when taking the induction step from rev_append l [x] to rev_append (h :: t) [x], you end up with the term rev_append t [h; x] after simplification. The induction step does not lead towards the base case of the rev_append function, but to another recursive invocation that you cannot simplify.
Notice how the induction hypothesis that you would like to apply makes a statement about rev_append t [x] for some fixed x, but in your goal, the extra h list element before it gets in the way, and the induction hypothesis is of no use.
This is what Bubbler's answer was referring to when stating that your induction hypothesis is not strong enough: it only makes a statement about the case where the second argument is a list with a single element. But even after just the induction step (one recursive application), that list already has at least two elements!
As suggested by Bubbler, the helper lemma rev_append l (l1 ++ l2) = rev_append l l1 ++ l2 is stronger and does not have this problem: when used as the induction hypothesis, it can be applied to rev_append t [h; x] as well, allowing you to prove equality with rev_append t [h] ++ [x].
When attempting to prove the helper lemma, you may get stuck (like I did) in the same way as when proving rev_append_app itself. The crucial bit of advice that helped me proceed was to be careful which of the universally quantified variables you introduce before you start the induction. If you specialize any of them too early on, you might weaken your induction hypothesis and become stuck again. You may need to change the order of these quantified variables or use the generalize dependent tactic (see the Tactics chapter of Logic Foundations).
You can see that the induction hypothesis IH is not strong enough to prove the goal. Here what you need is a more general statement to prove in the first place. You can find more exercises dedicated to this topic here. (Actually, tail-recursive reverse is one of the exercises.)
In your case, the fully generalized statement could be as follows:
Lemma rev_append_app': forall (X: Type) (l l1 l2 : list X),
rev_append l (l1 ++ l2) = rev_append l l1 ++ l2.
Proving this by induction is trivial. Then you can prove your own statement as a corollary of this one:
Corollary rev_append_app: forall (X: Type) (x: X) (l : list X),
rev_append l [x] = rev_append l [] ++ [x].
Proof. intros. apply (rev_append_app _ _ [] [x]). Qed.
use the generalize dependent tactic like this:
Lemma rev_append_app: forall (X: Type) (l l1: list X) (x : X),
rev_append l (l1 ++ [x]) = rev_append l l1 ++ [x].
intros.
generalize dependent l1.
induction l as [| h t IH].
- intros.
easy.
- intros.
apply (IH (h::l1)).
Qed.

Coq: Rewriting with 'forall' in hypothesis or goal

I have proved 'correctness' of the reverse function on polymorphic Lists in Coq. The following proof works just fine, but I have a few questions about how the rewrite tactic works.
Here's the code:
Require Export Coq.Lists.List.
Import ListNotations.
Fixpoint rev {T:Type} (l:list T) : list T :=
match l with
| nil => nil
| h :: t => rev t ++ [h]
end.
(* Prove rev_acc equal to above naive implementation. *)
Fixpoint rev_acc {T:Type} (l acc:list T) : list T :=
match l with
| nil => acc
| h :: t => rev_acc t (h::acc)
end.
Theorem app_assoc : forall (T:Type) (l1 l2 l3 : list T),
(l1 ++ l2) ++ l3 = l1 ++ (l2 ++ l3).
Proof.
Admitted.
Theorem rev_acc_correct : forall (T:Type) (l k :list T),
rev l ++ k = rev_acc l k.
Proof.
intros T l.
induction l as [ | h l' IHl' ].
- reflexivity.
- simpl.
intro k.
(* Why is "intro k" required for "rewrite -> app_assoc" *)
(* But "rewrite -> IHl'" works regardless of "intro k". *)
(* generalize (rev l'), [h], k. *)
rewrite -> app_assoc.
simpl.
rewrite -> IHl'.
reflexivity.
Qed.
In the inductive step of the proof for rev_acc_correct if I skip intro k, then rewriting with app_assoc complains that it cannot find a matching subterm.
Found no subterm matching "(?M1058 ++ ?M1059) ++ ?M1060" in the current goal.
Here, I presume that the ? before the placeholder names denote that the terms are constrained, in this case to be of type List T for some type T; and since rev l' and [h] in the goal are instances of List T, one would expect a match in the goal.
On the other hand, rewriting with inductive hypothesis(rewrite -> IHl') instead of app_assoc goes through without needing an intro k before.
I find this behaviour of rewrite a bit confusing and the Coq manual doesn't provide any details. I don't want to have to read through the implementation but I need a good operational understanding of what the rewrite tactic does, especially with regards to how term matching works. Any answers/references in this direction are highly appreciated.
The complication with this rewrite is that there's a binder (the forall k), which can complicate things. If you just want things to work, use setoid_rewrite instead of rewrite and it will rewrite under binders.
rewrite IHl' looks like it happens under a binder, but the pattern being re-written doesn't actually involve the bound variable, so the binder isn't actually important. Here's what I mean: the goal is
forall k : list T, (rev l' ++ [h]) ++ k = rev_acc l' (h :: k)
which is the same thing as (that is, equal to):
(fun l : list T => forall k : list T, l ++ k = rev_acc l' (h :: k)) (rev l' ++ [h])
which I got using pattern (rev l' ++ [h]) in Ltac. Now it's clear that you can just rewrite the part being applied to and ignore the binder. When you do rewrite IHl' Coq easily figures out that IHl should be specialized to [h] and the rewrite proceeds.
rewrite app_assoc, on the other hand, needs to be specialized to three lists, specifically rev l', [h], and k. It can't be specialized in the current context because the variable k is only bound underneath the forall. This is why the pattern (?x ++ ?y) ++ ?z doesn't appear in the goal.
So what do you actually do? You can of course introduce k so there is no binder, but there's a simpler and more general technique: Coq has generalized rewriting that can rewrite under binders, which you can use by instead calling setoid_rewrite (see Rewriting under binders in the Coq reference manual). The manual tells you you need to declare morphisms, but the relevant ones have all been implemented for you in this case for forall, so setoid_rewrite app_assoc will just work.
Note that while you can always introduce a forall to get rid of the binder, setoid_rewrite can be really handy when your goal is an exists. Rather than using eexists you can just rewrite under the binder.

Pigeonhole proof without decidable equality or excluded middle

In Software Foundations IndProp.v one is asked to prove the pigeonhole principle, and one may use excluded middle, but it is mentioned that it is not strictly necessary. I've been trying to prove it without EM, but my brain seems to be wired classically.
Q: How would one prove the theorem without using excluded middle? How should one generally approach proofs for types without decidable equality, where one can't easily reason by cases?
I'd be very happy for a complete proof to look at, but please avoid posting it "in the clear", so as to not spoil the exercise in the Software Foundations course.
The definition uses two inductive predicates, In and repeats.
Require Import Lists.List.
Import ListNotations.
Section Pigeon.
Variable (X:Type).
Implicit Type (x:X).
Fixpoint In x l : Prop := (*** In ***)
match l with
| nil => False
| (x'::l') => x' = x \/ In x l'
end.
Hypothesis in_split : forall x l, In x l -> exists l1 l2, l = l1 ++ x :: l2.
Hypothesis in_app: forall x l1 l2, In x (l1++l2) <-> In x l1 \/ In x l2.
Inductive repeats : list X -> Prop := (*** repeats ***)
repeats_hd l x : In x l -> repeats (x::l)
| repeats_tl l x : repeats l -> repeats (x::l).
Theorem pigeonhole_principle_NO_EM: (*** pigeonhole ***)
forall l1 l2,
length l2 < length l1 -> (* There are more pigeons than nests *)
(forall x, In x l1 -> In x l2) -> (* All pigeons are in some nest *)
repeats l1. (* Thus, some pigeons share nest *)
Proof.
(* ??? *)
I'll describe the thought process that led me to a solution, in case it helps. We may apply induction and it is straightforward to reduce to the case l1 = a::l1', l2 = a::l2'. Now l1' is a subset of a::l2'. My EM-trained intuition is that one of the following holds:
a is in l1'.
a is not in l1'.
In the latter case, each element of l1' is in a::l2' but differs from a, and therefore must be in l2'. Thus l1' is a subset of l2', and we can apply the inductive hypothesis.
Unfortunately if In is not decidable, the above can't be directly formalized. In particular if equality is not decidable for the given type, it's difficult to prove elements are unequal, and therefore difficult to prove a negative statement like ~(In a l1'). However, we wanted to use that negative statement to prove a positive one, namely
forall x, In x l1' -> In x l2'
By analogy, suppose we wanted to prove
P \/ Q
Q -> R
------
P \/ R
The above intuitive argument is like starting from P \/ ~P, and using ~P -> Q -> R in the second case. We can use a direct proof to avoid EM.
Quantifying over the list l1' makes this a bit more complicated, but still we can construct a direct proof using the following lemma, which can be proven by induction.
Lemma split_or {X} (l : list X) (P Q : X -> Prop) :
(forall x, In x l -> (P x \/ Q x)) ->
(exists x, In x l /\ P x) \/ (forall x, In x l -> Q x).
Finally note that the intuitive pigeonhole principle could also be formalized as the following way, which cannot be proven when the type has undecidable equality (note that it has a negative statement in the conclusion):
Definition pigeon2 {X} : Prop := forall (l1 l2 : list X),
length l2 < length l1 ->
(exists x, In x l1 /\ ~(In x l2)) \/ repeats l1.
A possible constructive proof goes like this:
We prove pigeonhole_principle_NO_EM by induction on l1. Only the non-empty case is possible because of the length constraint. So, assume l1 = x :: l1'. Now, check whether there is some element of l1' which is mapped by f : (forall x, In x l1 -> In x l2) to the same membership proof as x. If there is such an x' element, then it follows that x = x', therefore l1 repeats. If there is no such element, then we can get l2' by removing the element that x is mapped to from l2, and apply the induction hypothesis to l2' and the appropriate f' : forall x, In x l1' -> In x l2' function.
That's it, but I note that actually formalizing this proof is not easy with the definitions given, because we need to prove heterogeneous or dependent equalities, since we have to compare membership proofs for possibly different elements.
As to the question of getting the hang of constructive proofs in general, an important skill or habit is always examining what kind of data we have, not just what kind of logical facts we know. In this case, membership proofs are actually indices pointing into lists, bundled together with proofs that the pointed-to elements equal certain values. If membership proofs didn't tell where exactly elements are located then this proof would not be possible constructively.