Following the example given in the chapter GeneralRec of Chlipala book, I'm trying to write the mergesort algorithm.
Here is my code
Require Import Nat.
Fixpoint insert (x:nat) (l: list nat) : list nat :=
match l with
| nil => x::nil
| y::l' => if leb x y then
x::l
else
y::(insert x l')
end.
Fixpoint merge (l1 l2 : list nat) : list nat :=
match l1 with
| nil => l2
| x::l1' => insert x (merge l1' l2)
end.
Fixpoint split (l : list nat) : list nat * list nat :=
match l with
| nil => (nil,nil)
| x::nil => (x::nil,nil)
| x::y::l' =>
let (ll,lr) := split l' in
(x::ll,y::lr)
end.
Definition lengthOrder (l1 l2 : list nat) :=
length l1 < length l2.
Theorem lengthOrder_wf : well_founded lengthOrder.
Admitted.
The problem is that it is not possible to write the mergeSort function with the command Fixpoint since the function is not structurally decreasing :
Fixpoint mergeSort (l: list nat) : list nat :=
if leb (length l) 1 then l
else
let (ll,lr) := split l in
merge (mergeSort ll) (mergeSort lr).
Instead, one can use the command Program Fixpoint or Definition with the term Fix (as in Chlipala book).
However, if I'm writing this
Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun (l: list nat) => list nat)
(fun (l : list nat) => (fun mergeSort : (forall ls : list nat, lengthOrder ls l -> list nat )=>
if leb (length l) 1 then
let (ll,lr) := split l in
merge (mergeSort ll _) (mergeSort lr _)
else
l))).
I'm getting impossible goals :
2 subgoals, subgoal 1 (ID 65)
l : list nat
mergeSort : forall ls : list nat, lengthOrder ls l -> list nat
ll, lr : list nat
============================
lengthOrder ll l
subgoal 2 (ID 66) is:
lengthOrder lr l
That is why Chlipala suggests to change the definition of mergeSort this way:
Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun _ => list nat)
(fun (ls : list nat)
(mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat) =>
if Compare_dec.le_lt_dec 2 (length ls)
then let lss := split ls in
merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
else ls)).
that generates the following goals:
2 subgoals, subgoal 1 (ID 68)
ls : list nat
mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat
l : 2 <= length ls
lss := split ls : list nat * list nat
============================
lengthOrder (fst lss) ls
subgoal 2 (ID 69) is:
lengthOrder (snd lss) ls
This new definition sounds like magic to me. So I wonder:
Fom the first definition, is it still possible to proof the well-foudness of the function?
Otherwise why the first definition cannot work?
How a basic user can go from the first definition to the second easily?
It's easy to see that you need to make two changes in order to get to A. Chlipala's solution.
1) When doing split you somehow need to remember that ll and lr came from split, otherwise they would be some arbitrary lists, which cannot possibly be shorter than the original list l.
The following piece of code fails to save that kind of information:
let (ll,lr) := split l in
merge (mergeSort ll _) (mergeSort lr _)
and, thus, needs to be replaced with
let lss := split ls in
merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
which keeps what we need.
The failure happens due to Coq's inability to remember that ll and lr come from split l and that happens because let (ll,lr) is just match in disguise (see the manual, §2.2.3).
Recall that the aims of pattern-matching is to (loosely speaking)
unpack the components of some value of an inductive datatype and bind them to some names (we'll need this in the 2nd part of my answer) and
replace the original definition with its special cases in the corresponding pattern-match branches.
Now, observe that split l does not occur anywhere in the goal or context before we pattern-match on it. We just arbitrarily introduce it into the definition. That's why pattern-matching doesn't give us anything -- we can't replace split l with its "special case" ((ll,lr)) in the goal or context, because there is no split l anywhere.
There is an alternative way of doing this by using logical equality (=):
(let (ll, lr) as s return (s = split l -> list nat) := split l in
fun split_eq => merge (mergeSort ll _) (mergeSort lr _)) eq_refl
This is analogous to using the remember tactic. We've got rid of fst and snd, but it is a huge overkill and I wouldn't recommend it.
2) Another thing we need to prove is the fact that ll and lr are shorter than l when 2 <= length l.
Since an if-expression is a match in disguise as well (it works for any inductive datatype with exactly two constructors), we need some mechanism to remember that leb 2 (length l) = true in the then branch. Again, since we don't have leb anywhere, this information gets lost.
There are at least two possible solutions to the problem:
either we remember leb 2 (length l) as an equation (just as we did in the 1st part), or
we can use some comparison function with result type behaving like bool (so it can represent two alternatives), but it should also remember some additional information we need. Then we could pattern-match on the comparison result and extract the information, which, of course, in this case have to be a proof of 2 <= length l.
What we need is a type which is able to carry a proof of m <= n in the case when leb m n returns true and a proof of, say, m > n otherwise.
There is a type in the standard library that does exactly that! It's called sumbool:
Inductive sumbool (A B : Prop) : Set :=
left : A -> {A} + {B} | right : B -> {A} + {B}
{A} + {B} is just a notation (syntactic sugar) for sumbool A B.
Just as bool, it has two constructors, but in addition it remembers a proof of either of two propositions A and B. Its advantage over bool shows up when you do case analysis on it with if: you get a proof of A in the then branch and a proof of B in the else branch. In other words, you get to use context you saved beforehand, whereas bool doesn't carry any context (only in the mind of the programmer).
And we need exactly that! Well, not in the else branch, but we would like to get 2 <= length l in our then branch. So, let us ask Coq if it already has a comparison function with the return type like that:
Search (_ -> _ -> {_ <= _} + {_}).
(*
output:
le_lt_dec: forall n m : nat, {n <= m} + {m < n}
le_le_S_dec: forall n m : nat, {n <= m} + {S m <= n}
le_ge_dec: forall n m : nat, {n <= m} + {n >= m}
le_gt_dec: forall n m : nat, {n <= m} + {n > m}
le_dec: forall n m : nat, {n <= m} + {~ n <= m}
*)
Any of the five results would do, because we need a proof only in one case.
Hence, we can replace if leb 2 (length l) then ... with if le_lt_dec 2 (length l) ... and get 2 <= length in the proof context, which will let us finish the proof.
Related
Going through Separation Logic Foundations and I'm stuck on the exercise triple_mlength in Repr.v. I think my current problem is that I don't know how to handle ints and nats in Coq.
Lemma triple_mlength: forall (L: list val) (p:loc),
triple (mlength p)
(MList L p)
(fun r => \[r = val_int (length L)] \* (MList L p))
Check (fun L => val_int (length L)) doesn't throw an error, so that means length is capable of being an int. However, length is opaque and I can't unfold it.
My current context and goal:
x : val
p : loc
C : p <> null
x0 : loc
H : p <> null
xs : list val
IH : forall y : list val,
list_sub y (x :: xs) ->
forall p, triple (mlength p)
(MList y p)
(fun r:val => \[r = length y] \* MList y p)
______________________________________________________________
length xs + 1 = length (x :: xs)
Unsetting print notation the goal transforms into:
eq (Z.add (length xs) (Zpos xH)) (length (cons x xs))
which I think is trying to add (1:Z) to (length xs: nat), then compare it to (length (cons x xs) : nat)
Types:
Inductive nat : Set := O : nat
| S : nat -> nat
Inductive Z : Set := Z0 : int
| Zpos : positive -> int
| Zneg : positive -> int
list: forall A, list A -> nat
length: forall A, list A -> nat
val_int: int -> val
Coq version is 8.12.2
There is a coercion nat_to_Z : nat -> int in scope that is converting length xs : nat and length (x :: xs) : nat to ints. This is separate from the notation mechanism and thus you don't see it when you only ask Coq to show notations. However, it is there and you need to handle it in your proofs. There are a bunch of lemmas floating around that prove equivalence between nat operations and Z/int operations.
Having loaded your file and looked around a bit (Search is your friend!), it appears the reason you cannot simplify length (x :: xs) = S (length xs) is because there is a lemma length_cons which gives length (x :: xs) = (1 + length xs)%nat, instead. I suppose the authors of this book thought that would be a good idea for some reason, so they disabled the usual simplification. Do note that "normally" length is transparent and simpl would work on this goal.
After using length_cons, you can use plus_nat_eq_plus_int to push the coercion down under the +, and then Z.add_comm finishes. This line should satisfy the goal.
now rewrite length_cons, plus_nat_eq_plus_int, Z.add_comm.
I want to use definition of list_max_le. After applying "Search list_max_le" I get nothing. How I can define list_max_le in Coq?
Require Import List.
Search "list" "max".
by using quotes " you search for definitions with has the specified string in its name.
The search above produces the result
list_max: list nat -> nat
list_max_le:
forall (l : list nat) (n : nat), list_max l <= n <-> Forall (fun k : nat => k <= n) l
list_max_app:
forall l1 l2 : list nat, list_max (l1 ++ l2) = Nat.max (list_max l1) (list_max l2)
list_max_lt:
forall [l : list nat] (n : nat),
l <> nil -> list_max l < n <-> Forall (fun k : nat => k < n) l
If you want to see the definition of list_max_le, you use the Print command
Print list_max_le.
but in this case the definition is not very readable.
If you Search without ", you search with a pattern that tries to match part of the type definition.
So if you search
Search list_max.
You search for all definitions that contains the term list_max.
list_max_le:
forall (l : list nat) (n : nat), list_max l <= n <-> Forall (fun k : nat => k <= n) l
list_max_app:
forall l1 l2 : list nat, list_max (l1 ++ l2) = Nat.max (list_max l1) (list_max l2)
list_max_lt:
forall [l : list nat] (n : nat),
l <> nil -> list_max l < n <-> Forall (fun k : nat => k < n) l
You can have many strings and terms to refine your search.
As an example, if you want som lemma about induction on lists, it is probably called something with "ind" in its name, and has the list term in its type (not necessarily in its name). So you can try
Search "ind" list.
I guess you already have Require Import List. so that Check list_max_le. works. A Search list_max_le. doesn't show anything because Search searches lemmas which have the searched term in their type/statement, but there is no lemma which has list_max_le in its type. You could do Search "list_max_le". to search for a lemma with list_max_le in its name.
If the question is "how do I find out which module to require?": I either google for "Coq standard library lemma name" or grep through the theories folder of the Coq sources. Maybe we should have a file which includes the complete standard library to make searching lemmas easier.
If you do Require Import List.
Search "max" "list".
will show you all that is known about maximum and list.
TL;DR
I want to write a fixpoint definition that matches over a value inside a dependent type without proof-mode. The essential issue is that Coq won't use the match to notice that the types are equivalent in a dependent type; I can force it in proof-mode, but I wonder if it's possible to do so without it.
I'm working on a project that involves lots of matrix operations. The matrices can be arbitrarily many dimensions (each of which is rectangular), so I wrote a definition to compute the type of the matrix:
Require Import Coq.Unicode.Utf8.
Require Export Vector.
Import VectorNotations.
Require Import List.
Import ListNotations.
Fixpoint matrix (A: Type) (dims: list nat) :=
match dims with
| [] => A
| head::tail => Vector.t (matrix A tail) head
end.
For "reasons" I need to linearize the elements in order to select the nth element of a linearized matrix. My first attempt was to try to return a single-dimensional matrix, but I ran into a wall with List's fold_left (advice on proceeding would be appreciated):
Definition product (dims: list nat) := List.fold_left Nat.mul dims 1.
Definition linearize {A: Type} {dims: list nat} (m: matrix A dims): matrix A [product dims].
Proof.
generalize dependent m.
induction dims.
- intros.
assert (product [] = 1) by reflexivity. rewrite H; clear H.
exact (Vector.cons A m 0 (Vector.nil A)).
- intros.
(* why so hard? *)
assert (product (a::dims) = a * product dims).
{ unfold product.
assert (a::dims = [a] ++ dims) by reflexivity. rewrite H; clear H.
rewrite List.fold_left_app.
assert (List.fold_left Nat.mul [a] 1 = a). admit. }
Abort.
I decided it might be easier to convert to a list, so:
Fixpoint linearize' {A: Type} {dims: list nat} (m: matrix A dims): list A :=
match dims with
| [] => []
| h::t => Vector.fold_left
(#app A)
[]
(Vector.map linearize' (m: Vector.t (matrix (list A) t) h))
end.
but Coq complains:
In environment
linearize' : ∀ (A : Type) (dims : list nat), matrix A dims → list A
A : Type
dims : list nat
m : matrix A dims
h : nat
t : list nat
The term "m" has type "matrix A dims" while it is expected to have type
"Vector.t (matrix (list A) t) h".
I am able to write the definition using a "proof style," but I am flummoxed that I cannot get Coq to accept the fixpoint that is essentially the same!
Definition linearize {A: Type} {dims: list nat} (m: matrix A dims): list A.
Proof.
induction dims.
- (* unfold matrix in m. *) (* exact [m]. *) apply [m].
- simpl in m.
(* exact (Vector.fold_left (#List.app A) [] (Vector.map IHdims m)). *)
apply (Vector.map IHdims) in m.
apply (Vector.fold_left (#List.app A) [] m).
Defined.
It seems like if I could get Coq to destruct the type of m along with dims, like what happens in induction, I would be good to go… here's Print linearize.
linearize =
λ (A : Type) (dims : list nat) (m : matrix A dims),
list_rect (λ dims0 : list nat, matrix A dims0 → list A)
(λ m0 : matrix A [], [m0])
(λ (a : nat) (dims0 : list nat) (IHdims : matrix A dims0 → list A)
(m0 : matrix A (a :: dims0)),
let m1 := Vector.map IHdims m0 in Vector.fold_left (app (A:=A)) [] m1)
dims m
: ∀ (A : Type) (dims : list nat), matrix A dims → list A
Arguments linearize {A}%type_scope {dims}%list_scope _
My first reaction was "List.fold_left, he's going to have a bad time."
Here's a solution using List.fold_right instead.
Definition product (dims: list nat) := List.fold_right Nat.mul 1 dims.
Fixpoint concat {A} {n m : nat} (v : Vector.t (Vector.t A m) n) : Vector.t A (n * m) :=
match v with
| []%vector => []%vector
| (x :: xs)%vector => append x (concat xs)
end.
Fixpoint linearize {A: Type} {dims: list nat} : matrix A dims -> matrix A [product dims] :=
match dims with
| [] => fun a => (a :: [])%vector
| head :: tail => fun a => concat (Vector.map (linearize (dims := tail)) a)
end.
The problem with fold_left is that, in the non-empty case, it unfolds to an immediate recursive call, which keeps too much information hidden for dependently typed programming. One use case might be to define tail-recursive functions, but this is not applicable here.
With fold_right, whenever you pattern-match on dims, the cons case exposes one Nat.mul which permits one use of concat : Vector.t (Vector.t A m) n -> Vector.t A (n * m).
This is one of the major headaches of using dependent types in Coq. The solution is to rewrite linearize so that it returns a function after matching:
Require Import Coq.Unicode.Utf8.
Require Export Vector.
Import VectorNotations.
Require Import List.
Import ListNotations.
Fixpoint matrix (A: Type) (dims: list nat) :=
match dims with
| [] => A
| head::tail => Vector.t (matrix A tail) head
end.
Fixpoint linearize {A: Type} {dims: list nat} : matrix A dims -> list A :=
match dims with
| [] => fun _ => []
| dim :: dims => fun mat =>
let res := Vector.map (#linearize _ dims) mat in
Vector.fold_left (#app _) [] res
end.
This trick is known as the convoy pattern; you can find more about it here: http://adam.chlipala.net/cpdt/html/MoreDep.html .
I made an environment to try to proof what I want/need
I have a posfijo function that says if a list (l1) contains another list (l2) at the end.
So if I add an element to the first list and I use the result as the second list, like l2 = x :: l1, I want to proof that is not possible.
I did this...
Variable G:Set.
Inductive posfijo : list _ -> list _ -> Prop :=
| posfijoB : forall l: list _, posfijo l l
| posfijoI : forall (l1 l2: list _) (a : G), posfijo l1 l2 -> posfijo l1 (cons a l2).
Infix "<<" := (posfijo) (at level 70, right associativity).
Lemma Pref4_a : forall (X:Set)(l: list G)(x:G), ~ (cons x l << l).
Proof.
intros X l x H.
So then my goal is
You should proceed with induction l.
I defined a recursive function for all subsets of nat_list in coq as
Fixpoint subsets (a: list nat) : (list (list nat)) :=
match a with
|[] => [[]]
|h::t => subsets t ++ map (app [h]) (subsets t)
end.
I am trying to prove that
forall (a:list nat), In [] (subsets a).
I tried to induct on a. The base-case was straight forward. However in the induction case i tried to use the in-built theorem in_app_or.
Unable to unify "In ?M1396 ?M1394 \/ In ?M1396 ?M1395" with
"(fix In (a : list nat) (l : list (list nat)) {struct l} : Prop :=
match l with
| [] => False
| b :: m => b = a \/ In a m
end)
[] (subsets t ++ map (fun m : list nat => h :: m) (subsets t))".
How do I prove such a theorem or get around such an issue?
The problem with in_app_or is that is has the following type:
forall (A : Type) (l m : list A) (a : A),
In a (l ++ m) -> In a l \/ In a m
and application of lemmas to the goal works "backwards": Coq matches the consequent B of the implication A -> B with the goal, and if they can be unified, you are left with a new goal: you need to prove a (stronger) statement A. And in your case the A and B are in the wrong order (swapped), so you need to apply in_or_app instead:
in_or_app : forall (A : Type) (l m : list A) (a : A),
In a l \/ In a m -> In a (l ++ m)
This is how your goal can be proved using in_or_app:
Goal forall (a:list nat), In [] (subsets a).
intros.
induction a; simpl; auto.
apply in_or_app; auto.
Qed.