What should be done when simpl does not reduce all the necessary steps? - coq

The following example is from chapter Poly of the Software Foundations book.
Definition fold_length {X : Type} (l : list X) : nat :=
fold (fun _ n => S n) l 0.
Theorem fold_length_correct : forall X (l : list X),
fold_length l = length l.
Proof.
intros.
induction l.
- simpl. reflexivity.
- simpl.
1 subgoal
X : Type
x : X
l : list X
IHl : fold_length l = length l
______________________________________(1/1)
fold_length (x :: l) = S (length l)
I expected it to simplify a step here on the left side. It certainly should be able to.
Theorem fold_length_correct : forall X (l : list X),
fold_length l = length l.
Proof.
intros.
induction l.
- simpl. reflexivity.
- simpl. rewrite <- IHl. simpl.
1 subgoal
X : Type
x : X
l : list X
IHl : fold_length l = length l
______________________________________(1/1)
fold_length (x :: l) = S (fold_length l)
During the running of the tests I had an issue where simpl would refuse to dive in, but reflexivity did the trick, so I tried the same thing here and the proof succeeded.
Note that one would not expect reflexivity to pass given the state of the goal, but it does. In this example it worked, but it did force me to do the rewrite in the opposite direction of what I intended originally.
Is it possible to have more control over simpl so that it does the desired reductions?

For the purposes of this answer, I'll assume the definition of fold is something along the lines of
Fixpoint fold {A B: Type} (f: A -> B -> B) (u: list A) (b: B): B :=
match u with
| [] => b
| x :: v => f x (fold f v b)
end.
(basically fold_right from the standard library). If your definition is substantially different, the tactics I recommend might not work.
The issue here is the behavior of simpl with constants that have to be unfolded before they can be simplified. From the documentation:
Notice that only transparent constants whose name can be reused in the recursive calls are possibly unfolded by simpl. For instance a constant defined by plus' := plus is possibly unfolded and reused in the recursive calls, but a constant such as succ := plus (S O) is never unfolded.
This is a bit hard to understand, so let's use an example.
Definition add_5 (n: nat) := n + 5.
Goal forall n: nat, add_5 (S n) = S (add_5 n).
Proof.
intro n.
simpl.
unfold add_5; simpl.
exact eq_refl.
Qed.
You'll see that the first call to simpl didn't do anything, even though add_5 (S n) could be simplified to S (n + 5). However, if I unfold add_5 first, it works perfectly. I think the issue is that plus_5 is not directly a Fixpoint. While plus_5 (S n) is equivalent to S (plus_5 n), that isn't actually the definition of it. So Coq doesn't recognize that its "name can be reused in the recursive calls". Nat.add (that is, "+") is defined directly as a recursive Fixpoint, so simpl does simplify it.
The behavior of simpl can be changed a little bit (see the documentation again). As Anton mentions in the comments, you can use the Arguments vernacular command to change when simpl tries to simplify. Arguments fold_length _ _ /. tells Coq that fold_length should be unfolded if at least two arguments are provided (the slash separates between the required arguments on the left and the unnecessary arguments on the right).[sup]1[\sup]
A simpler tactic to use if you don't want to deal with that is cbn which works here by default and works better in general. Quoting from the documentation:
The cbn tactic is claimed to be a more principled, faster and more predictable replacement for simpl.
Neither simpl with Arguments and a slash nor cbn reduce the goal to quite what you want in your case, since it'll unfold fold_length but not refold it. You could recognize that the call to fold is just fold_length l and refold it with fold (fold_length l).
Another possibility in your case is to use the change tactic. It seemed like you knew already that fold_length (a :: l) was supposed to simplify to S (fold_length l). If that's the case, you could use change (fold_length (a :: l)) with (S (fold_length l)). and Coq will try to convert one into the other (using only the basic conversion rules, not equalities like rewrite does).
After you've gotten the goal to S (fold_length l) = S (length l) using either of the above tactics, you can use rewrite -> IHl. like you wanted to.
I thought the slashes only made simpl unfold things less, which is why I didn't mention it before. I'm not sure what the default actually is, since putting the slash anywhere seems to make simpl unfold fold_length.

Related

Some help dealing with inject/unject and vector types

I'm reading through CPDT while doing the readings and exercises from Pierce's course here: https://www.cis.upenn.edu/~bcpierce/courses/670Fall12/
This question relates to HW10 here: https://www.cis.upenn.edu/~bcpierce/courses/670Fall12/HW10.v
Here's the code up to my question
Require Import Arith Bool List.
Require Import CpdtTactics MoreSpecif.
Set Implicit Arguments.
(* Length-Indexed Lists *)
Section ilist.
Variable A : Set.
Inductive ilist : nat -> Set :=
| Nil : ilist O
| Cons : forall n, A -> ilist n -> ilist (S n).
Definition ilength n (l : ilist n) := n.
Fixpoint app n1 (ls1 : ilist n1)
n2 (ls2 : ilist n2)
: ilist (n1 + n2) :=
match ls1
(*in (ilist n1) return (ilist (n1 + n2))*)
with
| Nil => ls2
| Cons _ x ls1' => Cons x (app ls1' ls2)
end.
(* Coq automatically adds annotations to the
definition of app. *)
Print app.
Fixpoint inject (ls : list A) : ilist (length ls) :=
match ls with
| nil => Nil
| h :: t => Cons h (inject t)
end.
Print inject.
Fixpoint unject n (ls : ilist n) : list A :=
match ls with
| Nil => nil
| Cons _ h t => h :: unject t
end.
Theorem inject_inverse : forall ls,
unject (inject ls) = ls.
induction ls; crush.
Qed.
(* Exercise (20 min) : Prove the opposite, that inject (unject ls) = ls.
You cannot state this theorem directly, since ls : ilist n
and inject (unject ls) : ilist (length (unject ls)).
One approach is to define an alternative version of equality ilist_eq
on ilists and prove that the equality holds under this definition.
If you do this, prove that ilist_eq is an equivalence relation (and try
to automate the proof).
Another more involved approach is to prove that n = length (unject ls)
and then to define a function that, given (ls : ilist n) and a
proof that m = n, produces an ilist m. In this approach you may
find proof irrelevance convenient.
*)
Because I really want to better understand dependent types and how to use proofs in programs, I decided to try to do the latter. Here is what I have so far.
Definition ilists_sizechange (n1 n2:nat) (l1:ilist n1) (P:n1=n2): ilist n2.
subst.
assumption.
Defined.
Lemma ilists_size_equal: forall n (ls:ilist n), n = length (unject ls).
Proof.
intros.
induction ls.
reflexivity.
simpl.
auto.
Qed.
Theorem unject_inject_thehardway: forall n (ls:ilist n),
inject (unject ls) = ilists_sizechange ls (ilists_size_equal ls).
Proof.
intros.
induction ls.
simpl.
?????????????????
Qed.
When I get to "?????????????????" that's where I'm stuck. I have a target like Nil = ilists_sizechange Nil (ilists_size_equal Nil) and I'm not really sure what I can do here.
I tried writing ilists_sizechange as a more direct function, but failed to do so. Not sure how to massage the type checking.
I guess I'm curious first if this approach is fruitful, or if I'm making some fundamental mistake. I'm also curious what the most concise way of expressing inject (unject ls) = ilists_sizechange ls (ilists_size_equal ls). is...here there are two custom functions (the sizechange and the proof of equality), and one imagines it should be possible with just one.
Coq is great but the syntax around dependently types stuff can be tricky. I appreciate any help!
Edit: I realize that an inductive type or something expressing equality of two lists and then building up and showing the sizes are equal is probably easier (eg the first suggestion they have), but I want to understand this case because I can imagine running into these sorts of issues in the future and I want to know how to work around them.
Edit2: I was able to make it past the Nil case using the following
dep_destruct (ilists_size_equal Nil).
compute.
reflexivity.
But then get stuck on the Cons case...I will try to prove some theorems and see if I can't get there, but I think I'm still missing something conceptual here.
Although functions may depend on proof objects, one approach (I'm going to show below) is to define the functions so that they don't use the proof objects except to construct other proof objects and to eliminate absurd cases, ensuring that opaque proofs never block computation. Another approach is to fully embrace dependently typed programming and the unification of "proofs as programs", but that's a much bigger paradigm shift to explain, so I'm not going to do that.
Starting with ilists_sizechange, we now care about the shape of the term constructed by tactics, so not all tactics are allowed. Not wanting to use the equality proof rules out the tactic subst. Instead we can recurse (induction) on the list l1 and pattern-match (destruct) on the natural number n2; there are four cases:
two absurd ones, which can be eliminated by using the equality (discriminate)
the 0 = 0 case, where you can just construct the empty list
the S m1 = S m2 case, where you can construct Cons, use the induction hypothesis (i.e., recursive call), and then you are asked for a proof of m1 = m2, which is where you can fall back to regular reasoning without caring what the proof term looks like.
Definition ilists_sizechange (n1 n2:nat) (l1:ilist n1) (P:n1=n2): ilist n2.
Proof.
revert n2 P. (* Generalize the induction hypothesis. *)
induction l1; destruct n2; discriminate + constructor; auto.
Defined.
While the rest of the proof below would technically work with that definition, it is still not ideal because any computation would unfold ilist_sizechange into an ugly function. While we've been careful to give that function the "right" computational behavior, tactic-based programming tends to be sloppy about some finer details of the syntax of those functions, which makes later proofs where they appear hard to read.
To have it look nicer in proofs, one way is to define a Fixpoint with the refine tactic. You write down the body of the function in Gallina, and put underscores for the proof terms, which become obligations that you have to prove separately. refine is not the only way to perform this technique, there's also the Program Fixpoint command and the Equations plugin. I would recommend looking into Equations. I stick with refine out of familiarity.
As you can see, intuitively all this function does is deconstruct the list l1, indexed by n1, and reconstruct it with index n2.
Fixpoint ilists_sizechange (n1 n2 :nat) (l1:ilist n1) {struct l1} : n1 = n2 -> ilist n2.
Proof.
refine (
match l1, n2 with
| Nil, 0 => fun _ => Nil
| Cons x xs, S n2' => fun EQ => Cons x (ilists_sizechange _ _ xs _)
| _, _ => fun _ => _
end
); try discriminate.
auto.
Defined.
The proof of ilists_size_equal needs no modification.
Lemma ilists_size_equal: forall n (ls:ilist n), n = length (unject ls).
Proof.
intros.
induction ls.
reflexivity.
simpl.
auto.
Qed.
For the final proof, there is one more step: first generalize the equality proof.
The idea is that ilists_sizechange doesn't actually look at it, but when it makes a recursive call it will need to construct some other proof, and this generalization allows you to use the induction hypothesis independently of that particular proof.
Theorem unject_inject_ : forall n (ls:ilist n) (EQ : n = length (unject ls)),
inject (unject ls) = ilists_sizechange ls EQ.
Proof.
intros n ls; induction ls; cbn.
- reflexivity.
- intros EQ. f_equal. apply IHls. (* Here we have ilists_sizechange applied to some big proof object, which we can ignore because the induction hypothesis generalizes over all such proof objects *)
Qed.
Then you want to specialize that theorem to use a concrete proof, ensuring that such a proof exists so the theorem is not vacuous.
Theorem unject_inject : forall n (ls:ilist n),
inject (unject ls) = ilists_sizechange ls (ilists_size_equal _).
Proof.
intros; apply unject_inject_.
Qed.
Here is one solution:
(* Length-Indexed Lists *)
Require Import Coq.Lists.List.
Import ListNotations.
Section ilist.
Variable A : Set.
Inductive ilist : nat -> Set :=
| Nil : ilist O
| Cons : forall {n}, A -> ilist n -> ilist (S n).
Fixpoint inject (ls : list A) : ilist (length ls) :=
match ls with
| nil => Nil
| h :: t => Cons h (inject t)
end.
Fixpoint unject {n} (ls : ilist n) : list A :=
match ls with
| Nil => nil
| Cons h t => h :: unject t
end.
Definition cast {A B : Set} (e : A = B) : A -> B :=
match e with eq_refl => fun x => x end.
Fixpoint length_unject n (l : ilist n) : length (unject l) = n :=
match l with
| Nil => eq_refl
| Cons _ l => f_equal S (length_unject _ l)
end.
Theorem unject_inverse n (ls : ilist n) :
cast (f_equal ilist (length_unject _ ls)) (inject (unject ls)) = ls.
Proof.
induction ls as [|n x l IH]; simpl; trivial.
revert IH.
generalize (inject (unject l)).
generalize (length_unject _ l).
generalize (length (unject l)).
intros m e.
destruct e.
simpl.
intros; congruence.
Qed.
End ilist.
The trick is to make your goal sufficiently general, and then to destruct the equality. The generalization is required to ensure that your goal is well-typed after destructing; failing to generalize will often lead to dependent-type errors.
Here, I've defined the length lemma by hand to be able to use the reduction machinery. But you could also have used proof irrelevance to get the proof to reduce to eq_refl after the fact.

Lemma about list and rev(list)

Trying to prove the following lemma I got stuck. Usully theorems about lists are proven using induction, but I don't know where to move next.
Lemma reverse_append : forall (T : Type) (h : T) (t : list T), h::t = rev(t) ++ [h] -> t = rev(t).
Proof.
intros. induction t.
- simpl. reflexivity.
- simpl. simpl in H.
Result:
1 subgoal (ID 522)
T : Type
h, x : T
t : list T
H : h :: x :: t = (rev t ++ [x]) ++ [h]
IHt : h :: t = rev t ++ [h] -> t = rev t
============================
x :: t = rev t ++ [x]
Main answer
Before you start proving your theorem, you should try to thoroughly understand what your theorem says. Your theorem is simply wrong.
Counterexample: 2 :: [1;2] = rev [1;2] ++ [2], but [1;2] is not a palindrome.
Full proof:
Require Import List.
Import ListNotations.
Lemma reverse_append_false :
~(forall (T : Type) (h : T) (t : list T), h::t = rev(t) ++ [h] -> t = rev(t)).
Proof. intros H. specialize (H nat 2 [1;2] eq_refl). inversion H. Qed.
Minor issues
rev(t) should be just rev t. Just an aesthetic point, but probably you should get yourself more familiar to writing in functional programming style.
Usually theorems about lists are proven using induction
This is a pretty naive statement, though technically correct. There are so many ways to do induction on a value, and choosing the induction that works best is a crucial skill. To name a few:
Induction on the list
Induction on the length of the list
arises quite frequently when dealing with rev and other functions that preserve length
Example
If simple induction doesn't work, consider a custom induction scheme
nat_ind2
The lemma isn't true as stated. Before proving anything, you should make sure it makes sense. The hypothesis is essentially saying that h::t = rev (h::t). But why would that mean that t = rev t? If you remove an element from the start of a palindromic list, it probably won't be a palindrome anymore. For example, deified is palindrome ('deified' = rev 'deified'), but eified isn't a palindrome.
For an example in this particular situation, 1::[2; 1] = (rev [2; 1]) ++ [1], since both are [1; 2; 1]. But [2; 1] is not equal to rev [2; 1] = [1; 2].

Coq: Rewriting with 'forall' in hypothesis or goal

I have proved 'correctness' of the reverse function on polymorphic Lists in Coq. The following proof works just fine, but I have a few questions about how the rewrite tactic works.
Here's the code:
Require Export Coq.Lists.List.
Import ListNotations.
Fixpoint rev {T:Type} (l:list T) : list T :=
match l with
| nil => nil
| h :: t => rev t ++ [h]
end.
(* Prove rev_acc equal to above naive implementation. *)
Fixpoint rev_acc {T:Type} (l acc:list T) : list T :=
match l with
| nil => acc
| h :: t => rev_acc t (h::acc)
end.
Theorem app_assoc : forall (T:Type) (l1 l2 l3 : list T),
(l1 ++ l2) ++ l3 = l1 ++ (l2 ++ l3).
Proof.
Admitted.
Theorem rev_acc_correct : forall (T:Type) (l k :list T),
rev l ++ k = rev_acc l k.
Proof.
intros T l.
induction l as [ | h l' IHl' ].
- reflexivity.
- simpl.
intro k.
(* Why is "intro k" required for "rewrite -> app_assoc" *)
(* But "rewrite -> IHl'" works regardless of "intro k". *)
(* generalize (rev l'), [h], k. *)
rewrite -> app_assoc.
simpl.
rewrite -> IHl'.
reflexivity.
Qed.
In the inductive step of the proof for rev_acc_correct if I skip intro k, then rewriting with app_assoc complains that it cannot find a matching subterm.
Found no subterm matching "(?M1058 ++ ?M1059) ++ ?M1060" in the current goal.
Here, I presume that the ? before the placeholder names denote that the terms are constrained, in this case to be of type List T for some type T; and since rev l' and [h] in the goal are instances of List T, one would expect a match in the goal.
On the other hand, rewriting with inductive hypothesis(rewrite -> IHl') instead of app_assoc goes through without needing an intro k before.
I find this behaviour of rewrite a bit confusing and the Coq manual doesn't provide any details. I don't want to have to read through the implementation but I need a good operational understanding of what the rewrite tactic does, especially with regards to how term matching works. Any answers/references in this direction are highly appreciated.
The complication with this rewrite is that there's a binder (the forall k), which can complicate things. If you just want things to work, use setoid_rewrite instead of rewrite and it will rewrite under binders.
rewrite IHl' looks like it happens under a binder, but the pattern being re-written doesn't actually involve the bound variable, so the binder isn't actually important. Here's what I mean: the goal is
forall k : list T, (rev l' ++ [h]) ++ k = rev_acc l' (h :: k)
which is the same thing as (that is, equal to):
(fun l : list T => forall k : list T, l ++ k = rev_acc l' (h :: k)) (rev l' ++ [h])
which I got using pattern (rev l' ++ [h]) in Ltac. Now it's clear that you can just rewrite the part being applied to and ignore the binder. When you do rewrite IHl' Coq easily figures out that IHl should be specialized to [h] and the rewrite proceeds.
rewrite app_assoc, on the other hand, needs to be specialized to three lists, specifically rev l', [h], and k. It can't be specialized in the current context because the variable k is only bound underneath the forall. This is why the pattern (?x ++ ?y) ++ ?z doesn't appear in the goal.
So what do you actually do? You can of course introduce k so there is no binder, but there's a simpler and more general technique: Coq has generalized rewriting that can rewrite under binders, which you can use by instead calling setoid_rewrite (see Rewriting under binders in the Coq reference manual). The manual tells you you need to declare morphisms, but the relevant ones have all been implemented for you in this case for forall, so setoid_rewrite app_assoc will just work.
Note that while you can always introduce a forall to get rid of the binder, setoid_rewrite can be really handy when your goal is an exists. Rather than using eexists you can just rewrite under the binder.

How to apply a function once during simplification in Coq?

From what I understand, function calls in Coq are opaque.
Sometimes, I need to use unfold to apply it and then fold to turn the function definition/body back to its name. This is often tedious. My question is, is there an easier way to let apply a specific instance of a function call?
As a minimal example, for a list l, to prove right-appending [] does not change l:
Theorem nil_right_app: forall {Y} (l: list Y), l ++ [] = l.
Proof.
induction l.
reflexivity.
This leaves:
1 subgoals
Y : Type
x : Y
l : list Y
IHl : l ++ [] = l
______________________________________(1/1)
(x :: l) ++ [] = x :: l
Now, I need to apply the definition of ++ (i.e. app) once (pretending there are other ++ in the goal which I don't want to apply/expand). Currently, the only way I know to implement this one time application is to first unfold ++ and then fold it:
unfold app at 1. fold (app l []).
giving:
______________________________________(1/1)
x :: l ++ [] = x :: l
But this is inconvenient as I have to figure out the form of the term to use in fold. I did the computation, not Coq. My question boils down to:
Is there a simpler way to implement this one-time function application to the same effect?
You can use simpl, compute or vm_compute if you want to ask Coq to perform some computation for you. If the definition of the function is Opaque, the above solution will fail, but you could first prove a rewriting lemma such as:
forall (A:Type) (a:A) (l1 l2: list A), (a :: l1) ++ l2 = a :: (l1 ++ l2).
using your technique, and then rewrite with it when necessary.
Here is an example using simpl:
Theorem nil_right_app: forall {Y} (l: list Y), l ++ nil = l.
Proof.
(* solve the first case directly *)
intros Y; induction l as [ | hd tl hi]; [reflexivity | ].
simpl app. (* or simply "simpl." *)
rewrite hi.
reflexivity.
Qed.
To answer your comment, I don't know how to tell cbv or compute to only compute a certain symbol. Note that in your case, they seem to compute too eagerly and simpl works better.

How to introduce a new variable in Coq?

I was wondering if there is a way to introduce an entirely new variable during the proof of a theorem in Coq?
For a complete example, consider the following property from here about the evenness of the length of a list.
Inductive ev_list {X:Type}: list X -> Prop :=
| el_nil : ev_list []
| el_cc : forall x y l, ev_list l -> ev_list (x :: y :: l).
Now I want to prove that for any list l if its length is even, then ev_list l holds:
Lemma ev_length__ev_list': forall X (l : list X), ev (length l) -> ev_list l.
Proof.
intros X l H.
which gives:
1 subgoals
X : Type
l : list X
H : ev (length l)
______________________________________(1/1)
ev_list l
Now, I'd like to "define" a new free variable n and a hypothesis n = length l. In hand-written math, I think we can do this, and then do induction about n. But is there a way to do the same in Coq?
Note. the reasons I ask are that:
I don't want to introduce this n artificially into the statement of the theorem itself, as is done in the page linked earlier, which IMHO is unnatural.
I tried to induction H., but it seems not working. Coq wasn't able to do case analysis on length l's ev-ness, and no induction hypothesis (IH) was generated.
Thanks.
This is a common issue in Coq proofs. You can use the remember tactic:
remember (length l) as n.
If you're doing induction on H as well, you might also have to generalize over l beforehand, by doing
generalize dependent l.
induction H.
If you want to add a new variable only for your induction, you can use directly
induction (length l) eqn:H0
According to the Curry-Howard Isomorphism, hypothesis in your context are just variables. You can define new variables with a function. The following refine tactic extends the goal with a fresh variable n (that is set to length l) and a proof e that n = length l (that is set to eq_refl).
Lemma ev_length__ev_list': forall X (l : list X), ev (length l) -> ev_list l.
Proof.
intros X l H.
refine ((fun n (e:n = length l) => _) (length l) eq_refl).
(* proof *)
Admitted.