Reasoning about lists in Coq

I'm trying to prove some theorems based on Pierce's "Software Foundations".
First of all, I defined a couple of useful functions:
Inductive natlist: Type :=
| nil: natlist
| cons : nat -> natlist -> natlist.
Notation "x :: l" := (cons x l) (at level 60, right associativity).
Fixpoint repeat (n count: nat): natlist :=
match count with
| O => nil
| S count' => n :: (repeat n count')
end.
Fixpoint length (l: natlist): nat :=
match l with
| nil => O
| h :: t => S (length t)
end.
Theorem count_repeat: forall n: nat, length (repeat n n) = n.
Proof.
intros n. induction n as [| n'].
simpl. reflexivity.
simpl. (* and here I can't continue... *)
I want to follow Pierce's advice:
Note that, since this problem is somewhat open-ended, it's possible
that you may come up with a theorem which is true, but whose proof
requires techniques you haven't learned yet. Feel free to ask for help
if you get stuck!
So, could you please suggest some proof techniques?

As @eponier said, you should try to prove a more general lemma, such as
Theorem count_repeat_gen: forall m n: nat, length (repeat n m) = m.
Using repeat n n ties the value of the element to the length of the list. Induction on n then changes both at once, so the induction hypothesis (about repeat n' n') no longer matches the goal (about repeat (S n') n'), which is why your statement cannot be proved directly by induction. Once you have proved count_repeat_gen, your theorem follows as the special case m = n.
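For concreteness, here is a minimal sketch of how the generalized lemma and the original theorem might be proved. The definitions are copied from the question; the proof scripts are one possible route, not necessarily the one intended by the exercise.

```coq
Inductive natlist : Type :=
| nil : natlist
| cons : nat -> natlist -> natlist.

Notation "x :: l" := (cons x l) (at level 60, right associativity).

Fixpoint repeat (n count : nat) : natlist :=
  match count with
  | O => nil
  | S count' => n :: repeat n count'
  end.

Fixpoint length (l : natlist) : nat :=
  match l with
  | nil => O
  | h :: t => S (length t)
  end.

(* Induct on the count m only; the element n stays fixed. *)
Theorem count_repeat_gen : forall m n : nat, length (repeat n m) = m.
Proof.
  intros m n. induction m as [| m' IHm'].
  - reflexivity.
  - simpl. rewrite IHm'. reflexivity.
Qed.

(* The original statement is the special case m = n. *)
Theorem count_repeat : forall n : nat, length (repeat n n) = n.
Proof.
  intros n. apply count_repeat_gen.
Qed.
```

The key point is that the induction variable m appears only as the count, so the induction hypothesis has exactly the shape needed after simpl.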

Related

Some help dealing with inject/unject and vector types

I'm reading through CPDT while doing the readings and exercises from Pierce's course here: https://www.cis.upenn.edu/~bcpierce/courses/670Fall12/
This question relates to HW10 here: https://www.cis.upenn.edu/~bcpierce/courses/670Fall12/HW10.v
Here's the code up to my question
Require Import Arith Bool List.
Require Import CpdtTactics MoreSpecif.
Set Implicit Arguments.
(* Length-Indexed Lists *)
Section ilist.
Variable A : Set.
Inductive ilist : nat -> Set :=
| Nil : ilist O
| Cons : forall n, A -> ilist n -> ilist (S n).
Definition ilength n (l : ilist n) := n.
Fixpoint app n1 (ls1 : ilist n1)
n2 (ls2 : ilist n2)
: ilist (n1 + n2) :=
match ls1
(*in (ilist n1) return (ilist (n1 + n2))*)
with
| Nil => ls2
| Cons _ x ls1' => Cons x (app ls1' ls2)
end.
(* Coq automatically adds annotations to the
definition of app. *)
Print app.
Fixpoint inject (ls : list A) : ilist (length ls) :=
match ls with
| nil => Nil
| h :: t => Cons h (inject t)
end.
Print inject.
Fixpoint unject n (ls : ilist n) : list A :=
match ls with
| Nil => nil
| Cons _ h t => h :: unject t
end.
Theorem inject_inverse : forall ls,
unject (inject ls) = ls.
induction ls; crush.
Qed.
(* Exercise (20 min) : Prove the opposite, that inject (unject ls) = ls.
You cannot state this theorem directly, since ls : ilist n
and inject (unject ls) : ilist (length (unject ls)).
One approach is to define an alternative version of equality ilist_eq
on ilists and prove that the equality holds under this definition.
If you do this, prove that ilist_eq is an equivalence relation (and try
to automate the proof).
Another more involved approach is to prove that n = length (unject ls)
and then to define a function that, given (ls : ilist n) and a
proof that m = n, produces an ilist m. In this approach you may
find proof irrelevance convenient.
*)
Because I really want to better understand dependent types and how to use proofs in programs, I decided to try to do the latter. Here is what I have so far.
Definition ilists_sizechange (n1 n2:nat) (l1:ilist n1) (P:n1=n2): ilist n2.
subst.
assumption.
Defined.
Lemma ilists_size_equal: forall n (ls:ilist n), n = length (unject ls).
Proof.
intros.
induction ls.
reflexivity.
simpl.
auto.
Qed.
Theorem unject_inject_thehardway: forall n (ls:ilist n),
inject (unject ls) = ilists_sizechange ls (ilists_size_equal ls).
Proof.
intros.
induction ls.
simpl.
?????????????????
Qed.
When I get to "?????????????????" that's where I'm stuck. I have a target like Nil = ilists_sizechange Nil (ilists_size_equal Nil) and I'm not really sure what I can do here.
I tried writing ilists_sizechange as a more direct function, but failed to do so. Not sure how to massage the type checking.
I guess I'm curious first whether this approach is fruitful, or whether I'm making some fundamental mistake. I'm also curious what the most concise way of expressing inject (unject ls) = ilists_sizechange ls (ilists_size_equal ls) is. Here there are two custom functions (the sizechange and the proof of equality), and one imagines it should be possible with just one.
Coq is great, but the syntax around dependently typed code can be tricky. I appreciate any help!
Edit: I realize that an inductive type or something expressing equality of two lists and then building up and showing the sizes are equal is probably easier (eg the first suggestion they have), but I want to understand this case because I can imagine running into these sorts of issues in the future and I want to know how to work around them.
Edit2: I was able to make it past the Nil case using the following
dep_destruct (ilists_size_equal Nil).
compute.
reflexivity.
But then get stuck on the Cons case...I will try to prove some theorems and see if I can't get there, but I think I'm still missing something conceptual here.

Although functions may depend on proof objects, one approach (I'm going to show below) is to define the functions so that they don't use the proof objects except to construct other proof objects and to eliminate absurd cases, ensuring that opaque proofs never block computation. Another approach is to fully embrace dependently typed programming and the unification of "proofs as programs", but that's a much bigger paradigm shift to explain, so I'm not going to do that.
Starting with ilists_sizechange, we now care about the shape of the term constructed by tactics, so not all tactics are allowed. Not wanting to use the equality proof rules out the tactic subst. Instead we can recurse (induction) on the list l1 and pattern-match (destruct) on the natural number n2; there are four cases:
two absurd ones, which can be eliminated by using the equality (discriminate)
the 0 = 0 case, where you can just construct the empty list
the S m1 = S m2 case, where you can construct Cons, use the induction hypothesis (i.e., recursive call), and then you are asked for a proof of m1 = m2, which is where you can fall back to regular reasoning without caring what the proof term looks like.
Definition ilists_sizechange (n1 n2:nat) (l1:ilist n1) (P:n1=n2): ilist n2.
Proof.
revert n2 P. (* Generalize the induction hypothesis. *)
induction l1; destruct n2; discriminate + constructor; auto.
Defined.
While the rest of the proof below would technically work with that definition, it is still not ideal because any computation would unfold ilists_sizechange into an ugly function. While we've been careful to give that function the "right" computational behavior, tactic-based programming tends to be sloppy about some finer details of the syntax of those functions, which makes later proofs where they appear hard to read.
To have it look nicer in proofs, one way is to define a Fixpoint with the refine tactic. You write down the body of the function in Gallina and put underscores for the proof terms, which become obligations that you have to prove separately. refine is not the only way to do this; there are also the Program Fixpoint command and the Equations plugin. I would recommend looking into Equations. I stick with refine out of familiarity.
As you can see, intuitively all this function does is deconstruct the list l1, indexed by n1, and reconstruct it with index n2.
Fixpoint ilists_sizechange (n1 n2 :nat) (l1:ilist n1) {struct l1} : n1 = n2 -> ilist n2.
Proof.
refine (
match l1, n2 with
| Nil, 0 => fun _ => Nil
| Cons x xs, S n2' => fun EQ => Cons x (ilists_sizechange _ _ xs _)
| _, _ => fun _ => _
end
); try discriminate.
auto.
Defined.
The proof of ilists_size_equal needs no modification.
Lemma ilists_size_equal: forall n (ls:ilist n), n = length (unject ls).
Proof.
intros.
induction ls.
reflexivity.
simpl.
auto.
Qed.
For the final proof, there is one more step: first generalize the equality proof.
The idea is that ilists_sizechange doesn't actually look at it, but when it makes a recursive call it will need to construct some other proof, and this generalization allows you to use the induction hypothesis independently of that particular proof.
Theorem unject_inject_ : forall n (ls:ilist n) (EQ : n = length (unject ls)),
inject (unject ls) = ilists_sizechange ls EQ.
Proof.
intros n ls; induction ls; cbn.
- reflexivity.
- intros EQ. f_equal. apply IHls. (* Here we have ilists_sizechange applied to some big proof object, which we can ignore because the induction hypothesis generalizes over all such proof objects *)
Qed.
Then you want to specialize that theorem to use a concrete proof, ensuring that such a proof exists so the theorem is not vacuous.
Theorem unject_inject : forall n (ls:ilist n),
inject (unject ls) = ilists_sizechange ls (ilists_size_equal _).
Proof.
intros; apply unject_inject_.
Qed.

Here is one solution:
(* Length-Indexed Lists *)
Require Import Coq.Lists.List.
Import ListNotations.
Section ilist.
Variable A : Set.
Inductive ilist : nat -> Set :=
| Nil : ilist O
| Cons : forall {n}, A -> ilist n -> ilist (S n).
Fixpoint inject (ls : list A) : ilist (length ls) :=
match ls with
| nil => Nil
| h :: t => Cons h (inject t)
end.
Fixpoint unject {n} (ls : ilist n) : list A :=
match ls with
| Nil => nil
| Cons h t => h :: unject t
end.
Definition cast {A B : Set} (e : A = B) : A -> B :=
match e with eq_refl => fun x => x end.
Fixpoint length_unject n (l : ilist n) : length (unject l) = n :=
match l with
| Nil => eq_refl
| Cons _ l => f_equal S (length_unject _ l)
end.
Theorem unject_inverse n (ls : ilist n) :
cast (f_equal ilist (length_unject _ ls)) (inject (unject ls)) = ls.
Proof.
induction ls as [|n x l IH]; simpl; trivial.
revert IH.
generalize (inject (unject l)).
generalize (length_unject _ l).
generalize (length (unject l)).
intros m e.
destruct e.
simpl.
intros; congruence.
Qed.
End ilist.
The trick is to make your goal sufficiently general, and then to destruct the equality. The generalization is required to ensure that your goal is well-typed after destructing; failing to generalize will often lead to dependent-type errors.
Here, I've defined the length lemma by hand to be able to use the reduction machinery. But you could also have used proof irrelevance to get the proof to reduce to eq_refl after the fact.
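To see the "destruct the equality" step in isolation, here is a small sketch using the same cast function as above. The lemma names cast_refl and cast_trans are hypothetical and chosen only for this illustration. In both lemmas the right-hand side of each equality is a plain variable, so destruct can replace the proof with eq_refl and cast then computes away.

```coq
Definition cast {A B : Set} (e : A = B) : A -> B :=
  match e with eq_refl => fun x => x end.

(* On eq_refl, cast reduces to the identity function. *)
Lemma cast_refl (A : Set) (x : A) : cast (@eq_refl Set A) x = x.
Proof. reflexivity. Qed.

(* Two casts in a row agree with one cast along the composed equality. *)
Lemma cast_trans (A B C : Set) (e1 : A = B) (e2 : B = C) (x : A) :
  cast e2 (cast e1 x) = cast (eq_trans e1 e2) x.
Proof.
  (* C and then B are variables, so each destruct is well-typed. *)
  destruct e2. destruct e1. reflexivity.
Qed.
```

In unject_inverse the equality relates length (unject l) to n, which is not a variable; the generalize steps exist precisely to turn it into one before destruct e.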

Coq: Rewriting with 'forall' in hypothesis or goal

I have proved 'correctness' of the reverse function on polymorphic Lists in Coq. The following proof works just fine, but I have a few questions about how the rewrite tactic works.
Here's the code:
Require Export Coq.Lists.List.
Import ListNotations.
Fixpoint rev {T:Type} (l:list T) : list T :=
match l with
| nil => nil
| h :: t => rev t ++ [h]
end.
(* Prove rev_acc equal to above naive implementation. *)
Fixpoint rev_acc {T:Type} (l acc:list T) : list T :=
match l with
| nil => acc
| h :: t => rev_acc t (h::acc)
end.
Theorem app_assoc : forall (T:Type) (l1 l2 l3 : list T),
(l1 ++ l2) ++ l3 = l1 ++ (l2 ++ l3).
Proof.
Admitted.
Theorem rev_acc_correct : forall (T:Type) (l k :list T),
rev l ++ k = rev_acc l k.
Proof.
intros T l.
induction l as [ | h l' IHl' ].
- reflexivity.
- simpl.
intro k.
(* Why is "intro k" required for "rewrite -> app_assoc" *)
(* But "rewrite -> IHl'" works regardless of "intro k". *)
(* generalize (rev l'), [h], k. *)
rewrite -> app_assoc.
simpl.
rewrite -> IHl'.
reflexivity.
Qed.
In the inductive step of the proof for rev_acc_correct if I skip intro k, then rewriting with app_assoc complains that it cannot find a matching subterm.
Found no subterm matching "(?M1058 ++ ?M1059) ++ ?M1060" in the current goal.
Here, I presume that the ? before the placeholder names denotes that the terms are constrained, in this case to be of type list T for some type T; and since rev l' and [h] in the goal are instances of list T, one would expect a match in the goal.
On the other hand, rewriting with the inductive hypothesis (rewrite -> IHl') instead of app_assoc goes through without needing an intro k beforehand.
I find this behaviour of rewrite a bit confusing and the Coq manual doesn't provide any details. I don't want to have to read through the implementation but I need a good operational understanding of what the rewrite tactic does, especially with regards to how term matching works. Any answers/references in this direction are highly appreciated.

The complication with this rewrite is the binder (the forall k). If you just want things to work, use setoid_rewrite instead of rewrite and it will rewrite under binders.
rewrite IHl' looks like it happens under a binder, but the pattern being re-written doesn't actually involve the bound variable, so the binder isn't actually important. Here's what I mean: the goal is
forall k : list T, (rev l' ++ [h]) ++ k = rev_acc l' (h :: k)
which is the same thing as (that is, equal to):
(fun l : list T => forall k : list T, l ++ k = rev_acc l' (h :: k)) (rev l' ++ [h])
which I got using pattern (rev l' ++ [h]) in Ltac. Now it's clear that you can just rewrite the part being applied to and ignore the binder. When you do rewrite IHl', Coq easily figures out that IHl' should be specialized to [h] and the rewrite proceeds.
rewrite app_assoc, on the other hand, needs to be specialized to three lists, specifically rev l', [h], and k. It can't be specialized in the current context because the variable k is only bound underneath the forall. This is why the pattern (?x ++ ?y) ++ ?z doesn't appear in the goal.
So what do you actually do? You can of course introduce k so there is no binder, but there's a simpler and more general technique: Coq has generalized rewriting that can rewrite under binders, which you can use by instead calling setoid_rewrite (see Rewriting under binders in the Coq reference manual). The manual tells you you need to declare morphisms, but the relevant ones have all been implemented for you in this case for forall, so setoid_rewrite app_assoc will just work.
Note that while you can always introduce a forall to get rid of the binder, setoid_rewrite can be really handy when your goal is an exists. Rather than using eexists you can just rewrite under the binder.
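As a small, self-contained illustration of the difference, here is a sketch on natural numbers rather than lists. The goal and the choice of Nat.add_0_r are assumptions made for this example only; the behavior shown is what I would expect from a recent Coq with the standard library.

```coq
From Coq Require Import PeanoNat Setoid.

Goal forall k : nat, k + 0 = k.
Proof.
  (* Plain rewrite fails: the only match for ?n + 0 mentions k,
     which is bound by the forall and not available in the context. *)
  Fail rewrite Nat.add_0_r.
  (* setoid_rewrite is able to rewrite under the binder. *)
  setoid_rewrite Nat.add_0_r.
  reflexivity.
Qed.
```

The same pattern applies to the list goal above: the subterm to rewrite mentions the bound variable k, so either introduce k first or use setoid_rewrite.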

How to prove a theorem on natural numbers using Coq list

I'm new to Coq. To practice with lists and lists of pairs, I used the Coq list library to prove a simple theorem about natural numbers. I am trying to prove the following property:
forall n, multiplier, a0....an, d1...dn:
((a0*multiplier)=d1)+((a1*multiplier)=d2)+((a2*multiplier)=d3)+...+((an*multiplier)=dn) = result
-> (a0+a1+a2+...+an) * multiplier = d1+d2+...+dn = result
((3*2)=6)+((5*2)=10)+((9*2)=18) = 34 -> (3+5+9)*2 = 6+10+18 = 34 can be an example of this property(i.e. n=3 and multiplier = 2).
I use list of pairs (storing a's in one list and d's in another list) to code this property in Coq as:
Require Import List.
Fixpoint addnumbers (L : list nat) : nat :=
match L with
| nil => 0
| H::tail => H + addnumbers tail
end.
Theorem resultAreEqual : forall (natListofpair :list (nat * nat))
(multiplier : nat) (result : nat),
Forall (fun '(a,d) => a * multiplier = d ) natListofpair ->
addnumbers (List.map (@fst nat nat) natListofpair) * multiplier = result ->
addnumbers (List.map (@snd nat nat) natListofpair) = result.
Proof.
intros.
destruct natListofpair.
subst. simpl. reflexivity.
rewrite <- H0.
inversion H.
destruct p. simpl.
But I don't know how to continue this proof. I have been stuck on it for a week. I'd be thankful for your help.

One reason you are having difficulty is that you have stated your lemma in an indirect way. When proving something in Coq, it is very important to state it as simply as possible, as this often leads to easier proofs. In this case, the statement becomes much simpler if you use higher-order functions on lists.
Require Import Coq.Arith.PeanoNat.
Require Import Coq.Lists.List.
Definition sum (l : list nat) := fold_right Nat.add 0 l.
Lemma my_lemma l m : sum (map (Nat.mul m) l) = m * sum l.
The sum function is the equivalent of your addnumbers. The lemma says "the result of multiplying all numbers in l by m and adding them is the same as the result of adding them up first and multiplying by m later".
To prove this result, we need a crucial ingredient that your proof was missing: induction. This is often needed in Coq when we want to reason about objects of unbounded size, such as lists. Here is one possible proof.
Proof.
unfold sum.
induction l as [|x l IH]; simpl.
- (* Nil case *)
now rewrite Nat.mul_0_r.
- (* Cons case *)
now rewrite IH, Nat.mul_add_distr_l.
Qed.
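To connect the lemma back to the numbers in the question, here is a self-contained sketch that repeats the definitions above and checks the example with multiplier 2 and the list 3, 5, 9. The Example names are hypothetical.

```coq
From Coq Require Import PeanoNat List.
Import ListNotations.

Definition sum (l : list nat) := fold_right Nat.add 0 l.

Lemma my_lemma l m : sum (map (Nat.mul m) l) = m * sum l.
Proof.
  unfold sum.
  induction l as [|x l IH]; simpl.
  - now rewrite Nat.mul_0_r.
  - now rewrite IH, Nat.mul_add_distr_l.
Qed.

(* 3*2 + 5*2 + 9*2 computes to 34. *)
Example sum_scaled : sum (map (Nat.mul 2) [3; 5; 9]) = 34.
Proof. reflexivity. Qed.

(* (3 + 5 + 9) * 2 computes to 34 as well. *)
Example scaled_sum : 2 * sum [3; 5; 9] = 34.
Proof. reflexivity. Qed.

(* The general lemma covers this instance without any computation. *)
Example via_lemma : sum (map (Nat.mul 2) [3; 5; 9]) = 2 * sum [3; 5; 9].
Proof. apply my_lemma. Qed.
```

Note that the lemma multiplies on the left (m * x), whereas the question multiplies on the right (a * multiplier); Nat.mul_comm bridges the two if needed.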

Teach coq to check termination

Coq, unlike many other languages, accepts an optional explicit annotation that indicates the decreasing argument of a fixpoint definition.
From the Gallina specification, section 1.3.4,
Fixpoint ident params {struct ident0 } : type0 := term0
defines the syntax. From it, we can see that the annotation must be an identifier rather than a general measure.
However, there are recursive functions whose termination is not obvious, or is obvious but hard for the termination checker to justify with a single decreasing argument. For example, the following program interleaves two lists:
Fixpoint interleave (A : Set) (l1 l2 : list A) : list A :=
match l1 with
| [] => []
| h :: t => h :: interleave l2 t
end.
This function clearly terminates, but Coq cannot figure that out. The reason is that neither l1 nor l2 decreases on every recursive call, because the two arguments swap places. But what if we consider a measure, defined as length l1 + length l2? This measure clearly decreases on every recursive call.
So my question is: in more sophisticated situations, where the code cannot easily be organized so that the termination checker accepts it, how do you convince Coq to accept the fixpoint definition?

You have multiple options and all of them boil down to structural recursion in the end.
Preamble
From Coq Require Import List.
Import ListNotations.
Set Implicit Arguments.
Structural recursion
Sometimes you can reformulate your algorithm in a structurally recursive way:
Fixpoint interleave1 {A} (l1 l2 : list A) {struct l1} : list A :=
match l1, l2 with
| [], _ => l2
| _, [] => l1
| h1 :: t1, h2 :: t2 => h1 :: h2 :: interleave1 t1 t2
end.
Incidentally, in some cases you can use a trick with nested fixes -- see this definition of Ackermann function (it wouldn't work with just Fixpoint).
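The linked Ackermann definition is not reproduced in this text, so the following is a sketch of the commonly seen form of the nested-fix trick, under that assumption. The outer Fixpoint recurses structurally on m and the inner fix recurses structurally on n, so each recursive call is justified by one of the two.

```coq
Fixpoint ack (m : nat) : nat -> nat :=
  match m with
  | O => S
  | S m' =>
      (* Inner recursion on n; m' is fixed and smaller than m. *)
      fix ack_m (n : nat) : nat :=
        match n with
        | O => ack m' 1
        | S n' => ack m' (ack_m n')
        end
  end.

(* ack 2 n = 2 * n + 3, so ack 2 3 = 9. *)
Example ack_2_3 : ack 2 3 = 9.
Proof. reflexivity. Qed.
```

A single Fixpoint taking both m and n with one struct annotation would be rejected, because the call ack m (n - 1)-style recursion and the call on m' decrease different arguments.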
Program Fixpoint
You can use the Program Fixpoint mechanism, which lets you write your program naturally and later prove that it always terminates.
From Coq Require Import Program Arith.
Program Fixpoint interleave2 {A} (l1 l2 : list A)
{measure (length l1 + length l2)} : list A :=
match l1 with
| [] => l2
| h :: t => h :: interleave2 l2 t
end.
Next Obligation. simpl; rewrite Nat.add_comm; trivial with arith. Qed.
Function
Another option is to use the Function command which can be somewhat limited compared to Program Fixpoint. You can find out more about their differences here.
From Coq Require Recdef.
Definition sum_len {A} (ls : (list A * list A)) : nat :=
length (fst ls) + length (snd ls).
Function interleave3 {A} (ls : (list A * list A))
{measure sum_len ls} : list A :=
match ls with
| ([], _) => []
| (h :: t, l2) => h :: interleave3 (l2, t)
end.
Proof.
intros A ls l1 l2 h t -> ->; unfold sum_len; simpl; rewrite Nat.add_comm; trivial with arith.
Defined.
Equations plugin
This is an external plugin which addresses many issues with defining functions in Coq, including dependent types and termination.
From Equations Require Import Equations.
Equations interleave4 {A} (l1 l2 : list A) : list A :=
interleave4 l1 l2 by rec (length l1 + length l2) lt :=
interleave4 nil l2 := l2;
interleave4 (cons h t) l2 := cons h (interleave4 l2 t).
Next Obligation. rewrite Nat.add_comm; trivial with arith. Qed.
The code above works if you apply this fix.
Fix / Fix_F_2 combinators
You can learn more about this (manual) approach if you follow the links from this question about mergeSort function. By the way, the mergeSort function can be defined without using Fix if you apply the nested fix trick I mentioned earlier. Here is a solution which uses Fix_F_2 combinator since we have two arguments and not one like mergeSort:
Definition ordering {A} (l1 l2 : list A * list A) : Prop :=
length (fst l1) + length (snd l1) < length (fst l2) + length (snd l2).
Lemma ordering_wf' {A} : forall (m : nat) (p : list A * list A),
length (fst p) + length (snd p) <= m -> Acc (@ordering A) p.
Proof.
unfold ordering; induction m; intros p H; constructor; intros p'.
- apply Nat.le_0_r, Nat.eq_add_0 in H as [-> ->].
intros contra%Nat.nlt_0_r; contradiction.
- intros H'; eapply IHm, Nat.lt_succ_r, Nat.lt_le_trans; eauto.
Defined.
Lemma ordering_wf {A} : well_founded (@ordering A).
Proof. now red; intro ; eapply ordering_wf'. Defined.
(* it's in the stdlib but unfortunately opaque -- this blocks evaluation *)
Lemma destruct_list {A} (l : list A) :
{ x:A & {tl:list A | l = x::tl} } + { l = [] }.
Proof.
induction l as [|h tl]; [right | left]; trivial.
exists h, tl; reflexivity.
Defined.
Definition interleave5 {A} (xs ys : list A) : list A.
refine (Fix_F_2 (fun _ _ => list A)
(fun (l1 l2 : list A)
(interleave : (forall l1' l2', ordering (l1', l2') (l1, l2) -> list A)) =>
match destruct_list l1 with
| inright _ => l2
| inleft pf => let '(existT _ h (exist _ tl eq)) := pf in
h :: interleave l2 tl _
end) (ordering_wf (xs,ys))).
Proof. unfold ordering; rewrite eq, Nat.add_comm; auto.
Defined.
Evaluation tests
Check eq_refl : interleave1 [1;2;3] [4;5;6] = [1;4;2;5;3;6].
Check eq_refl : interleave2 [1;2;3] [4;5;6] = [1;4;2;5;3;6].
Check eq_refl : interleave3 ([1;2;3], [4;5;6]) = [1;4;2;5;3;6].
Fail Check eq_refl : interleave4 [1;2;3] [4;5;6] = [1;4;2;5;3;6]. (* Equations plugin *)
Check eq_refl : interleave5 [1;2;3] [4;5;6] = [1;4;2;5;3;6].
Exercise: what happens with this last check if you comment out destruct_list lemma?

You can use something called a measure instead of a structural argument for termination. For this, I believe you have to use the Program Fixpoint mechanism, which is a little involved and will make your proofs look uglier (because it generates a structural recursion out of the proof that you provide, so that the function you will actually use is not quite the function you wrote).
Details here:
https://coq.inria.fr/refman/program.html
It also seems like something called Equations can deal with measures?
cf. http://mattam82.github.io/Coq-Equations/examples/RoseTree.html
https://www.irif.fr/~sozeau/research/coq/equations.en.html

How to introduce a new variable in Coq?

I was wondering if there is a way to introduce an entirely new variable during the proof of a theorem in Coq?
For a complete example, consider the following property from here about the evenness of the length of a list.
Inductive ev_list {X:Type}: list X -> Prop :=
| el_nil : ev_list []
| el_cc : forall x y l, ev_list l -> ev_list (x :: y :: l).
Now I want to prove that for any list l if its length is even, then ev_list l holds:
Lemma ev_length__ev_list': forall X (l : list X), ev (length l) -> ev_list l.
Proof.
intros X l H.
which gives:
1 subgoals
X : Type
l : list X
H : ev (length l)
______________________________________(1/1)
ev_list l
Now, I'd like to "define" a new free variable n and a hypothesis n = length l. In hand-written math, I think we can do this, and then do induction on n. But is there a way to do the same in Coq?
Note: the reasons I ask are:
I don't want to introduce this n artificially into the statement of the theorem itself, as is done in the page linked earlier, which IMHO is unnatural.
I tried induction H., but it does not seem to work: Coq wasn't able to do case analysis on the evenness of length l, and no induction hypothesis (IH) was generated.
Thanks.

This is a common issue in Coq proofs. You can use the remember tactic:
remember (length l) as n.
If you're doing induction on H as well, you might also have to generalize over l beforehand, by doing
generalize dependent l.
induction H.
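Putting these steps together, here is a sketch of a complete proof. The definition of ev is not shown in the question, so the usual Software Foundations definition is assumed here; the hypothesis names (Heqn, Hlen, IH) are also choices made for this example.

```coq
From Coq Require Import List.
Import ListNotations.

(* Assumed definition of evenness, as in Software Foundations. *)
Inductive ev : nat -> Prop :=
| ev_0 : ev 0
| ev_SS : forall n, ev n -> ev (S (S n)).

Inductive ev_list {X : Type} : list X -> Prop :=
| el_nil : ev_list []
| el_cc : forall x y l, ev_list l -> ev_list (x :: y :: l).

Lemma ev_length__ev_list' : forall X (l : list X), ev (length l) -> ev_list l.
Proof.
  intros X l H.
  (* Name the length so that H talks about a plain variable n. *)
  remember (length l) as n eqn:Heqn.
  (* Make the induction hypothesis hold for every list of that length. *)
  generalize dependent l.
  induction H as [| n' H' IH]; intros l Heqn.
  - destruct l as [| x l'].
    + apply el_nil.
    + simpl in Heqn. discriminate Heqn.
  - destruct l as [| x [| y l']].
    + simpl in Heqn. discriminate Heqn.
    + simpl in Heqn. discriminate Heqn.
    + apply el_cc. apply IH.
      simpl in Heqn. injection Heqn as Hlen. exact Hlen.
Qed.
```

Without remember, induction H would forget that the index of ev is length l; without generalize dependent l, the induction hypothesis would be stuck on the original l instead of the tail l'.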

If you want to add a new variable only for your induction, you can use directly
induction (length l) eqn:H0.

According to the Curry-Howard isomorphism, hypotheses in your context are just variables. You can define new variables with a function. The following refine tactic extends the goal with a fresh variable n (that is set to length l) and a proof e that n = length l (that is set to eq_refl).
Lemma ev_length__ev_list': forall X (l : list X), ev (length l) -> ev_list l.
Proof.
intros X l H.
refine ((fun n (e:n = length l) => _) (length l) eq_refl).
(* proof *)
Admitted.
