Proof involving unfolding two recursive functions in COQ - coq

I've began learning Coq, and am trying to prove something that seems fairly simple: if a list contains x, then the number of instances of x in that list will be > 0.
I've defined the contains and count functions as follows:
Fixpoint contains (n: nat) (l: list nat) : Prop :=
match l with
| nil => False
| h :: t => if beq_nat h n then True else contains n t
end.
Fixpoint count (n acc: nat) (l: list nat) : nat :=
match l with
| nil => acc
| h :: t => if beq_nat h n then count n (acc + 1) t else count n acc t
end.
I'm trying to prove:
Lemma contains_count_ge1 : forall (n: nat) (l: list nat), contains n l -> (count n 0 l > 0).
I understand the proof will involve unfolding the definitions of count and contains, but then I'd like to say "the list cannot be nil, as contains is true, so there must be an element x in l such that beq_nat h x is true", and I've played around a bit but can't figure out how to use tactics to do this. Any guidance would be greatly appreciated.

ejgallego already gave a great solution to your problem in his answer. I would still like to single out an important point that he left out: in Coq, you must always argue from first principles, and be very pedantic and precise about your proofs.
You argued that the proof should proceed as follows:
The list cannot be nil, as contains is true, so there must be an element x in l such that beq_nat h x is true.
Even though this makes intuitive sense for humans, it is not precise enough for Coq to understand. The problem, as ejgallego's answer shows, is that your informal reasoning conceals a use of induction. Indeed, it is useful to try to expand out your argument in more details even before translating it into tactics. We could proceed like this, for instance:
Let us prove that, for every n : nat and ns : list nat, contains n ns implies count n 0 ns > 0. We proceed by induction on the list ns. If ns = nil, the definition of contains implies that False holds; a contradiction. We are thus left with the case ns = n' :: ns', where we can use the following induction hypothesis: contains n ns' -> count n 0 ns' > 0. There are two sub-cases to consider: whether beq_nat n n' is true or not.
If beq_nat n n' is true, by the definition of count, we see that we just have to show that count n (0 + 1) ns' > 0. Note there isn't a direct way to proceed here. This is because you wrote count tail-recursively, using an accumulator. While this is perfectly reasonable in functional programming, it can making proving properties about count more difficult. In this case, we would need the following auxiliary lemma, also proved by induction: forall n acc ns, count n acc ns = acc + count n 0 ns. I'll let you figure out how to prove this one. But assuming that we have already established it, the goal would reduce to showing that 1 + count n 0 ns' > 0. This is true by simple arithmetic. (There is an even simpler way that does not require an auxiliary lemma, but it would require slightly generalizing the statement you're proving.)
If beq_nat n n' is false, by the definitions of contains and count, we would need to show that contains n ns' implies count n 0 ns' > 0. This is exactly what the induction hypothesis gives us, and we are done.
There are two lessons to be learned here. The first one is that doing formal proofs often requires translating your intuition in formal terms that the system can understand. We know intuitively what it means to have some element occur inside of a list. But if we were to explain what that means more formally, we would resort to some kind of recursive traversal of the list, which would probably turn out to be the very definition of count that you wrote in Coq. And in order to reason about recursion, we need induction. The second lesson is that the way you define things in Coq has important consequences for the proofs you write. ejgallego's solution did not require any auxiliary lemmas beyond those in the standard library, precisely because his definition of count was not tail-recursive.

Well, you pose many questions about basic Coq beyond what is IMO possible to address here. For this particular problem, I would proceed this way (in reality I would use the already provided lemmas in MathComp):
From Coq Require Import PeanoNat Bool List.
Fixpoint contains (n: nat) (l: list nat) : bool :=
match l with
| nil => false
| h :: t => if Nat.eqb h n then true else contains n t
end.
Fixpoint count (n : nat) (l: list nat) : nat :=
match l with
| nil => 0
| h :: t => if Nat.eqb h n then S (count n t) else count n t
end.
Lemma contains_count_ge1 n l : contains n l = true -> count n l > 0.
Proof.
induction l as [|x l IHl]; simpl; [now congruence|].
now destruct (Nat.eqb_spec x n); auto with arith.
Qed.
My "standard" solution:
Lemma test n (l : list nat) : n \in l -> 0 < count_mem n l.
Proof. by rewrite lt0n => /count_memPn/eqP. Qed.
and different definitions of count and contains that may prove useful:
Fixpoint contains (n: nat) (l: list nat) : bool :=
match l with
| nil => false
| h :: t => Nat.eqb h n || contains n t
end.
Fixpoint count (n : nat) (l: list nat) : nat :=
match l with
| nil => 0
| h :: t => Nat.b2n (Nat.eqb h n) + (count n t)
end.

Related

Natural number list in coq

I have a list of natural numbers, elements in the list are in descending order. I want to write lemma about the list ,that first element h is greater than all the elements of list . Let list is [h;h1;t] . 0 h1? Please guide me , how to write h is greater than all the elements in the tail of list.
You need to say
For any natural number, and any list like h::t, if the list is descending and if the number is in the tail, then it's smaller than the head.
So in Coq language you may write
Lemma head_is_max : forall n h t, desc (h::t) -> In n t -> h >= n.
if desc is a Boolean predicate you may write
Lemma head_is_max : forall n h t, desc (h::t) = true -> In n t -> h >= n.
Performing an induction on t would work for the proof.
In a more sophisticated way, you may use a predicate on list which asserts that all elements of a list have a specific property, you may define it as
Fixpoint All {T : Type} (P : T -> Prop) (l : list T) : Prop :=
match l with
| [] => True
| h :: t => P h /\ All P t
end.
So All P l means P x holds for all xs in l. Now we can write the mentioned lemma as
Lemma head_is_max : forall h t, desc (h::t) -> All (fun n => h >= n) t.
To express that a given list of natural numbers is in descending order, you can use existing functions in the List module of Coq.
Require Import List Lia.
Definition desc (l : list nat) : Prop :=
forall i j, i <= j -> nth j l 0 <= nth i l 0.
What did I do? I just expressed that the value at rank i must be larger than the value at rank j for any j larger than i. This is clever in a subtle way. The expression nth j l 0 actually represents the value at rank i in l if i is smaller than the length of the list or 0 otherwise. It happens that 0 is smaller than any other natural number, so this definition does work. If instead, you had asked for a list of numbers in strictly descending order, then I would have had to write a more precise definition, involving only ranks that are smaller than the list length (you can use the function length for this). I let you do this as an exercise.
When you write a logical predicate like desc here, it is important to test this definition, to make sure you have really captured the notion you had in mind. To test my own definition, I wrote the following code:
Definition sample1 := 1 :: 2 :: 3 :: nil.
Definition sample2 := map (fun x => 10 - x) sample1.
Lemma s2_desc : desc sample2.
Proof.
intros [ | [ | [ | [ | ]]]] [ | [ | [ | [ | ]]]];
intros ilej; simpl; lia.
Qed.
Lemma s1_n_desc : ~desc sample1.
Proof.
intros abs; generalize (abs 0 1 (le_S _ _ (le_n _))).
compute; lia.
Qed.
The proof of s2_desc is a proof by brute force, it actually tries all pairs of ranks smaller than 4, and checks
that in all these cases the comparisons between natural numbers (ranks or values in the list) do give logically provable formulas.
The proof of s1_n_desc is used to check that my definition of desc really rejects a list that obviously does not satisfy the criterion. It is a good thing I wrote this proof, because it helped me discover a bug in my code of desc that was not caught by the previous proof: I had written nth 0 l i instead of nth i l 0!
Last, but not least, my solution starts with Require Import List Lia. This means that we use two existing modules of Coq. The first one provides frequently used functions about lists, the second one provides an automatic tool to perform easy proofs about comparison between numbers (natural numbers or integers, actually).
As a next step, one could also write a boolean function that computes the value true exactly when its input is in
descending order and develop proofs of tests to verify that both functions behave accordingly.
You need to define what you mean by descending and use that in your proof.
#Yves has perhaps the neatest way of doing it. Here is another definition that
is just writing down a simple inductive definition. A list is descending if the tail is descending and the first element is larger or equal to the second element.
One nice thing with inductive definitions is that you can do induction on them, which gives you a lot of information in each proof case, with very little work.
Require Import List.
Inductive descending : list nat -> Prop :=
desc_nil : descending nil
| desc_1 n : descending (cons n nil)
| desc_hd n m l :
m <= n ->
descending (cons m l) ->
descending (cons n (cons m l)).
Lemma head_gt l d:
descending l -> forall m, In m l -> m <= hd d l.
Proof.
induction 1; intros k H'.
now exfalso; apply in_nil in H'.
now replace k with n; [ | inversion H'].
now inversion H';
[ subst; apply le_n
| eapply PeanoNat.Nat.le_trans; auto].
Qed.

Pattern Matching with Even and Odd Cases

Suppose I write a Fixpoint algorithm in Coq that sums up all the "halves" of a number:
Fixpoint sum_of_halves (a : nat) : nat :=
match a with
| 0 => 0
| 2 * k => a + (sum_of_halves k)
| S (2 * k) => a + (sum_of_halves k)
end.
Trying to evaluate the algorithm would get: Error: Invalid notation for pattern.
How can I get Coq to recognize that a is either an even or an odd number, and match it with either 2 * k or S (2 * k)?
Coq can only match on constructors. nat has two constructors, O and S, so you cannot match on 2 * k. You will have to use a non-match construct or a non-nat type or a different algorithm.
You need to prove that there are only three cases for a given natural number a. Either a is 0, either a is the double of another number k and k < a, or a is the double k + 1 and k < a, that the three cases are exclusive (this is important, otherwise making pattern matching possible would lead to an inconistency).
Fortunately, all this can be done. It is a bit advanced Coq programming, but it is somehow already done in ZArith. Here is a solution.
First note that the other number is already provided by one of the functions in the Coq library, div2.
Require Import Arith Nat.
Definition cases_div2 (a : nat) :
{k : nat | a = 2 * k /\ k < a}+{k : nat | a = S (2 * k) /\ k < a}+{a=0}.
destruct a as [ | a'].
right; reflexivity.
case_eq (odd (S a')); intros odd_a.
left; right; exists (div2 (S a')); rewrite (div2_odd (S a')) at 1.
split.
rewrite odd_a; simpl b2n; ring.
apply lt_div2; auto with arith.
left; left; exists (div2 (S a')); rewrite (div2_odd (S a')) at 1.
split.
rewrite odd_a; simpl b2n; ring.
apply lt_div2; auto with arith.
Defined.
Now, you can pattern match on your number a using cases_div2, but it is still not enough to define your function, because recursion using Fixpoint relies on recursive calls happening on the predecessor, and here k cannot be written as a predecessor pattern that will work for any input a. You need a stronger kind of recursion. I usually rely on Function or Fix for this kind of strong recursion. Here is an example with Fix
Definition sum_of_halves : nat -> nat :=
Fix Arith.Wf_nat.lt_wf (fun _ => nat)
(fun a (sum_of_halves' : forall y, y < a -> nat) =>
match cases_div2 a with
| inright h => 0
| inleft (inl (exist _ k (conj keq klt))) =>
a + sum_of_halves' k klt
| inleft (inr (exist _ k (conj keq klt))) =>
a + sum_of_halves' k klt
end).
Then to reason about sum_of_halves you will need to reason by well founded induction and use Fix_eq.
This is one possibility.

Writing well-founded programs in Coq using Fix or Program Fixpoint

Following the example given in the chapter GeneralRec of Chlipala book, I'm trying to write the mergesort algorithm.
Here is my code
Require Import Nat.
Fixpoint insert (x:nat) (l: list nat) : list nat :=
match l with
| nil => x::nil
| y::l' => if leb x y then
x::l
else
y::(insert x l')
end.
Fixpoint merge (l1 l2 : list nat) : list nat :=
match l1 with
| nil => l2
| x::l1' => insert x (merge l1' l2)
end.
Fixpoint split (l : list nat) : list nat * list nat :=
match l with
| nil => (nil,nil)
| x::nil => (x::nil,nil)
| x::y::l' =>
let (ll,lr) := split l' in
(x::ll,y::lr)
end.
Definition lengthOrder (l1 l2 : list nat) :=
length l1 < length l2.
Theorem lengthOrder_wf : well_founded lengthOrder.
Admitted.
The problem is that it is not possible to write the mergeSort function with the command Fixpoint since the function is not structurally decreasing :
Fixpoint mergeSort (l: list nat) : list nat :=
if leb (length l) 1 then l
else
let (ll,lr) := split l in
merge (mergeSort ll) (mergeSort lr).
Instead, one can use the command Program Fixpoint or Definition with the term Fix (as in Chlipala book).
However, if I'm writing this
Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun (l: list nat) => list nat)
(fun (l : list nat) => (fun mergeSort : (forall ls : list nat, lengthOrder ls l -> list nat )=>
if leb (length l) 1 then
let (ll,lr) := split l in
merge (mergeSort ll _) (mergeSort lr _)
else
l))).
I'm getting impossible goals :
2 subgoals, subgoal 1 (ID 65)
l : list nat
mergeSort : forall ls : list nat, lengthOrder ls l -> list nat
ll, lr : list nat
============================
lengthOrder ll l
subgoal 2 (ID 66) is:
lengthOrder lr l
That is why Chlipala suggests to change the definition of mergeSort this way:
Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun _ => list nat)
(fun (ls : list nat)
(mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat) =>
if Compare_dec.le_lt_dec 2 (length ls)
then let lss := split ls in
merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
else ls)).
that generates the following goals:
2 subgoals, subgoal 1 (ID 68)
ls : list nat
mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat
l : 2 <= length ls
lss := split ls : list nat * list nat
============================
lengthOrder (fst lss) ls
subgoal 2 (ID 69) is:
lengthOrder (snd lss) ls
This new definition sounds like magic to me. So I wonder:
Fom the first definition, is it still possible to proof the well-foudness of the function?
Otherwise why the first definition cannot work?
How a basic user can go from the first definition to the second easily?
It's easy to see that you need to make two changes in order to get to A. Chlipala's solution.
1) When doing split you somehow need to remember that ll and lr came from split, otherwise they would be some arbitrary lists, which cannot possibly be shorter than the original list l.
The following piece of code fails to save that kind of information:
let (ll,lr) := split l in
merge (mergeSort ll _) (mergeSort lr _)
and, thus, needs to be replaced with
let lss := split ls in
merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
which keeps what we need.
The failure happens due to Coq's inability to remember that ll and lr come from split l and that happens because let (ll,lr) is just match in disguise (see the manual, ยง2.2.3).
Recall that the aims of pattern-matching is to (loosely speaking)
unpack the components of some value of an inductive datatype and bind them to some names (we'll need this in the 2nd part of my answer) and
replace the original definition with its special cases in the corresponding pattern-match branches.
Now, observe that split l does not occur anywhere in the goal or context before we pattern-match on it. We just arbitrarily introduce it into the definition. That's why pattern-matching doesn't give us anything -- we can't replace split l with its "special case" ((ll,lr)) in the goal or context, because there is no split l anywhere.
There is an alternative way of doing this by using logical equality (=):
(let (ll, lr) as s return (s = split l -> list nat) := split l in
fun split_eq => merge (mergeSort ll _) (mergeSort lr _)) eq_refl
This is analogous to using the remember tactic. We've got rid of fst and snd, but it is a huge overkill and I wouldn't recommend it.
2) Another thing we need to prove is the fact that ll and lr are shorter than l when 2 <= length l.
Since an if-expression is a match in disguise as well (it works for any inductive datatype with exactly two constructors), we need some mechanism to remember that leb 2 (length l) = true in the then branch. Again, since we don't have leb anywhere, this information gets lost.
There are at least two possible solutions to the problem:
either we remember leb 2 (length l) as an equation (just as we did in the 1st part), or
we can use some comparison function with result type behaving like bool (so it can represent two alternatives), but it should also remember some additional information we need. Then we could pattern-match on the comparison result and extract the information, which, of course, in this case have to be a proof of 2 <= length l.
What we need is a type which is able to carry a proof of m <= n in the case when leb m n returns true and a proof of, say, m > n otherwise.
There is a type in the standard library that does exactly that! It's called sumbool:
Inductive sumbool (A B : Prop) : Set :=
left : A -> {A} + {B} | right : B -> {A} + {B}
{A} + {B} is just a notation (syntactic sugar) for sumbool A B.
Just as bool, it has two constructors, but in addition it remembers a proof of either of two propositions A and B. Its advantage over bool shows up when you do case analysis on it with if: you get a proof of A in the then branch and a proof of B in the else branch. In other words, you get to use context you saved beforehand, whereas bool doesn't carry any context (only in the mind of the programmer).
And we need exactly that! Well, not in the else branch, but we would like to get 2 <= length l in our then branch. So, let us ask Coq if it already has a comparison function with the return type like that:
Search (_ -> _ -> {_ <= _} + {_}).
(*
output:
le_lt_dec: forall n m : nat, {n <= m} + {m < n}
le_le_S_dec: forall n m : nat, {n <= m} + {S m <= n}
le_ge_dec: forall n m : nat, {n <= m} + {n >= m}
le_gt_dec: forall n m : nat, {n <= m} + {n > m}
le_dec: forall n m : nat, {n <= m} + {~ n <= m}
*)
Any of the five results would do, because we need a proof only in one case.
Hence, we can replace if leb 2 (length l) then ... with if le_lt_dec 2 (length l) ... and get 2 <= length in the proof context, which will let us finish the proof.

Reasoning about lists in Coq

I'm try to solve some theorems, based on Pierce's "Software Foundations".
First of all I create a couple of useful functions:
Inductive natlist: Type :=
| nil: natlist
| cons : nat -> natlist -> natlist.
Notation "x :: l" := (cons x l) (at level 60, right associativity).
Fixpoint repeat (n count: nat): natlist :=
match count with
| O => nil
| S count' => n :: (repeat n count')
end.
Fixpoint length (l: natlist): nat :=
match l with
| nil => O
| h :: t => S (length t)
end.
Theorem count_repeat: forall n: nat, length (repeat n n) = n.
Proof.
intros n. induction n as [| n'].
simpl. reflexivity.
simpl. (* and here I can't continue... *)
I want to follow Pierce's advice:
Note that, since this problem is somewhat open-ended, it's possible
that you may come up with a theorem which is true, but whose proof
requires techniques you haven't learned yet. Feel free to ask for help
if you get stuck!
So, could you please advice some proof techniques for me?
As #eponier said, you should try to prove a more general lemma, like
Theorem count_repeat_gen: forall m n: nat, length (repeat n m) = m.
Using repeat n n creates an implicit link between the value of the element and the size of the list which makes your statement impossible to prove directly. Once you proved count_repeat_gen, you'll be able to prove your theorem.

Coq dependent types

I am new to Coq and need some help with some of trivial examples to get me started. In particular I am interested in defining some operations of vectors (fixed size lists) using dependent types. I started with Vector package and trying to implement some additional functions. For example I am having difficulty implementing trivial 'take' and 'drop' functions, which take or drop first 'p' elements from the list.
Require Import Vector.
Fixpoint take {A} {n} (p:nat) (a: t A n) : p<=n -> t A p :=
match a return ( p<=n -> t A p) with
| cons A v (S m) => cons (hd v) (take m (tl v)) m
| nil => fun pf => a
end.
The error (in case of nil) is:
The term "a" has type "t A n" while it is expected to have type "t A p".
Could somebody help me with some starting points? Thanks!
I don't understand your approach. You're always returning a non-empty vector when the argument is a non-empty vector, but take must return nil when p=0 regardless of the vector.
Here's one approach to building take. Rather than using the hypothesis p <= n, I express the length of the argument n as a sum of the number p of elements to take and the number of trailing elements m, which is possible iff p <= n. This allows for an easier recursive definition, because (S p') + m is structurally equal to S (p' + m). Note that the discrimination is on the number of elements to take: return nil if taking 0, return cons head new_tail otherwise.
This version of the take function has the desired computational behavior, so all that's left is to define one with the desired proof content. I use the Program feature to do this conveniently: fill in the computational content (trivial, I just need to say that I want to use m = n - p), then complete the proof obligations (which are simple arithmetic).
Require Import Arith.
Require Import Vector.
Fixpoint take_plus {A} {m} (p:nat) : t A (p+m) -> t A p :=
match p return t A (p+m) -> t A p with
| 0 => fun a => nil _
| S p' => fun a => cons A (hd a) _ (take_plus p' (tl a))
end.
Program Definition take A n p (a : t A n) (H : p <= n) : t A p :=
take_plus (m := n - p) p a.
Solve Obligations using auto with arith.
For your newdrop : forall A n p, t A n -> p <= n -> t A (n-p), the following approach works. You need to help Coq by telling it what p and n become in the recursive call.
Program Fixpoint newdrop {A} {n} p : t A n -> p <= n -> t A (n-p) :=
match p return t A n -> p <= n -> t A (n-p) with
| 0 => fun a H => a
| S p' => fun a H => newdrop p' (tl a) (_ : p' <= n - 1)
end.
Next Obligation.
omega.
Qed.
Next Obligation.
omega.
Qed.
Next Obligation.
omega.
Qed.
Next Obligation.
omega.
Qed.
I don't know why Solve Obligations using omega. doesn't work but solving each obligation individually works.