Can I extract a Coq proof as a Haskell function? - coq

Ever since I learned a little bit of Coq I wanted to learn to write a Coq proof of the so-called division algorithm that is actually a logical proposition: forall n m : nat, exists q : nat, exists r : nat, n = q * m + r
I recently accomplished that task using what I learned from Software Foundations.
Coq being a system for developing constructive proofs, my proof is in effect a method to construct suitable values q and r from values m and n.
Coq has an intriguing facility for "extracting" an algorithm in Coq's algorithm language (Gallina) to general-purpose functional programming languages including Haskell.
Separately I have managed to write the divmod operation as a Gallina Fixpoint and extract that. I want to note carefully that that task is not what I'm considering here.
Adam Chlipala has written in Certified Programming with Dependent Types that "Many fans of the Curry-Howard correspondence support the idea of extracting programs from proofs. In reality, few users of Coq and related tools do any such thing."
Is it even possible to extract the algorithm implicit in my proof to Haskell? If it is possible, how would it be done?

Thanks to Prof. Pierce's summer 2012 video 4.1 as Dan Feltey suggested, we see that the key is that the theorem to be extracted must provide a member of Type rather than the usual kind of propositions, which is Prop.
For the particular theorem the affected construct is the inductive Prop ex and its notation exists. Similarly to what Prof. Pierce has done, we can state our own alternate definitions ex_t and exists_t that replace occurrences of Prop with occurrences of Type.
Here is the usual redefinition of ex and exists similarly as they are defined in Coq's standard library.
Inductive ex (X:Type) (P : X->Prop) : Prop :=
ex_intro : forall (witness:X), P witness -> ex X P.
Notation "'exists' x : X , p" := (ex _ (fun x:X => p))
(at level 200, x ident, right associativity) : type_scope.
Here are the alternate definitions.
Inductive ex_t (X:Type) (P : X->Type) : Type :=
ex_t_intro : forall (witness:X), P witness -> ex_t X P.
Notation "'exists_t' x : X , p" := (ex_t _ (fun x:X => p))
(at level 200, x ident, right associativity) : type_scope.
Now, somewhat unfortunately, it is necessary to repeat both the statement and the proof of the theorem using these new definitions.
What in the world??
Why is it necessary to make a reiterated statement of the theorem and a reiterated proof of the theorem, that differ only by using an alternative definition of the quantifier??
I had hoped to use the existing theorem in Prop to prove the theorem over again in Type. That strategy fails when Coq rejects the proof tactic inversion for a Prop in the environment when that Prop uses exists and the goal is a Type that uses exists_t. Coq reports "Error: Inversion would require case analysis on sort Set which is not allowed
for inductive definition ex." This behavior occurred in Coq 8.3. I am not certain that it
still occurs in Coq 8.4.
I think the need to repeat the proof is actually profound although I doubt that I personally am quite managing to perceive its profundity. It involves the facts that Prop is "impredicative" and Type is not impredicative, but rather, tacitly "stratified". Predicativity is (if I understand correctly) vulnerability to Russell's paradox that the set S of sets that are not members of themselves can neither be a member of S, nor a non-member of S. Type avoids Russell's paradox by tacitly creating a sequence of higher types that contain lower types. Because Coq is drenched in the formulae-as-types interpretation of the Curry-Howard correspondence, and if I am getting this right, we can even understand stratification of types in Coq as a way to avoid Gödel incompleteness, the phenomenon that certain formulae express constraints on formulae such as themselves and thereby become unknowable as to their truth or falsehood.
Back on planet Earth, here is the repeated statement of the theorem using "exists_t".
Theorem divalg_t : forall n m : nat, exists_t q : nat,
exists_t r : nat, n = plus (mult q m) r.
As I have omitted the proof of divalg, I will also omit the proof of divalg_t. I will only mention that we do have the good fortune that proof tactics including "exists" and "inversion" work just the same with our new definitions "ex_t" and "exists_t".
Finally, the extraction itself is accomplished easily.
Extraction Language Haskell.
Extraction "divalg.hs" divalg_t.
The resulting Haskell file contains a number of definitions, the heart of which is the reasonably nice code, below. And I was only slightly hampered by my near-total ignorance of the Haskell programming language. Note that Ex_t_intro creates a result whose type is Ex_t; O and S are the zero and the successor function from Peano arithmetic; beq_nat tests Peano numbers for equality; nat_rec is a higher-order function that recurs over the function among its arguments. The definition of nat_rec is not shown here. At any rate it is generated by Coq according to the inductive type "nat" that was defined in Coq.
divalg :: Nat -> Nat -> Ex_t Nat (Ex_t Nat ())
divalg n m =
case m of {
O -> Ex_t_intro O (Ex_t_intro n __);
S m' ->
nat_rec (Ex_t_intro O (Ex_t_intro O __)) (\n' iHn' ->
case iHn' of {
Ex_t_intro q' hq' ->
case hq' of {
Ex_t_intro r' _ ->
let {k = beq_nat r' m'} in
case k of {
True -> Ex_t_intro (S q') (Ex_t_intro O __);
False -> Ex_t_intro q' (Ex_t_intro (S r') __)}}}) n}
Update 2013-04-24: I know a bit more Haskell now. To assist others in reading the extracted code above, I'm presenting the following hand-rewritten code that I claim is equivalent and more readable. I'm also presenting the extracted definitions Nat, O, S, and nat_rec that I did not eliminate.
-- Extracted: Natural numbers (non-negative integers)
-- in the manner in which Peano defined them.
data Nat =
O
| S Nat
deriving (Eq, Show)
-- Extracted: General recursion over natural numbers,
-- an interpretation of Nat in the manner of higher-order abstract syntax.
nat_rec :: a1 -> (Nat -> a1 -> a1) -> Nat -> a1
nat_rec f f0 n =
case n of {
O -> f;
S n0 -> f0 n0 (nat_rec f f0 n0)}
-- Given non-negative integers n and m, produce (q, r) with n = q * m + r.
divalg_t :: Nat -> Nat -> (Nat, Nat)
divalg_t n O = (O, n) -- n/0: Define quotient 0, remainder n.
divalg_t n (S m') = divpos n m' -- n/(S m')
where
-- Given non-negative integers n and m',
-- and defining m = m' + 1,
-- produce (q, r) with n = q * m + r
-- so that q = floor (n / m) and r = n % m.
divpos :: Nat -> Nat -> (Nat, Nat)
divpos n m' = nat_rec (O, O) (incrDivMod m') n
-- Given a non-negative integer m' and
-- a pair of non-negative integers (q', r') with r <= m',
-- and defining m = m' + 1,
-- produce (q, r) with q*m + r = q'*m + r' + 1 and r <= m'.
incrDivMod :: Nat -> Nat -> (Nat, Nat) -> (Nat, Nat)
incrDivMod m' _ (q', r')
| r' == m' = (S q', O)
| otherwise = (q', S r')

The current copy of Software Foundations dated July 25, 2012, answers this quite concisely in the late chapter "Extraction2". The answer is that it can certainly be done, much like this:
Extraction Language Haskell
Extraction "divalg.hs" divalg
One more trick is necessary. Instead of a Prop, divalg must be a Type. Otherwise it will be erased in the process of extraction.
Uh oh, #Anthill is correct, I haven't answered the question because I don't know how to explain how Prof. Pierce accomplished that in his NormInType.v variant of his Norm.v and MoreStlc.v.
OK, here's the rest of my partial answer anyway.
Where "divalg" appears above, it will be necessary to provide a space-separated list of all of the propositions (which must each be redefined as a Type rather than a Prop) on which divalg relies. For a thorough, interesting, and working example of a proof extraction, one may consult the chapter Extraction2 mentioned above. That example extracts to OCaml, but adapting it for Haskell is simply a matter of using Extraction Language Haskell as above.
In part, the reason that I spent some time not knowing the above answer is that I have been using the copy of Software Foundations dated October 14, 2010, that I downloaded in 2011.

Related

Real numbers in Coq

In https://www.cs.umd.edu/~rrand/vqc/Real.html#lab1 one can read:
Coq's standard library takes a very different approach to the real numbers: An axiomatic approach.
and one can find the following axiom:
Axiom
completeness :
∀E:R → Prop,
bound E → (∃x : R, E x) → { m:R | is_lub E m }.
The library is not mentioned but in Why are the real numbers axiomatized in Coq? one can find the same description :
I was wondering whether Coq defined the real numbers as Cauchy sequences or Dedekind cuts, so I checked Coq.Reals.Raxioms and... none of these two. The real numbers are axiomatized, along with their operations (as Parameters and Axioms). Why is it so?
Also, the real numbers tightly rely on the notion of subset, since one of their defining properties is that is every upper bounded subset has a least upper bound. The Axiom completeness encodes those subsets as Props."
Nevertheless, whenever I look at https://coq.inria.fr/library/Coq.Reals.Raxioms.html I do not see any axiomatic approach, in particular we have the following lemma:
Lemma completeness :
forall E:R -> Prop,
bound E -> (exists x : R, E x) -> { m:R | is_lub E m }.
Where can I find such an axiomatic approach of the real numbers in Coq?
The description you mention is outdated indeed, because since I asked the question you linked, I rewrote the axioms defining Coq's standard library real numbers in a more standard way. The real numbers are now divided into 2 layers
constructive real numbers, that are defined in terms of Cauchy sequences and that use no axioms at all;
classical real numbers, that are a quotient set of constructive reals, and that do use 3 axioms to prove the least upper bound theorem that you mention.
Coq easily gives you the axioms underlying any term by the Print Assumptions command:
Require Import Raxioms.
Print Assumptions completeness.
Axioms:
ClassicalDedekindReals.sig_not_dec : forall P : Prop, {~ ~ P} + {~ P}
ClassicalDedekindReals.sig_forall_dec
: forall P : nat -> Prop,
(forall n : nat, {P n} + {~ P n}) -> {n : nat | ~ P n} + {forall n : nat, P n}
FunctionalExtensionality.functional_extensionality_dep
: forall (A : Type) (B : A -> Type) (f g : forall x : A, B x),
(forall x : A, f x = g x) -> f = g
As you can see these 3 axioms are purely logical, they do not speak about real numbers at all. They just assume a fragment of classical logic.
If you want an axiomatic definition of the reals in Coq, I provided one for the constructive reals
Require Import Coq.Reals.Abstract.ConstructiveReals.
And this becomes an interface for classical reals if you assume the 3 axioms above.
These descriptions are outdated. It used to be the case that the type R of real numbers was axiomatized, along with its basic properties. But nowadays (since 2019?) it is defined in terms of more basic axioms, more or less like one would do in traditional mathematics.

Proving equality between instances of dependent types

When attempting to formalize the class which corresponds to an algebraic structure (for example the class of all monoids), a natural design is to create a type monoid (a:Type) as a product type which models all the required fields (an element e:a, an operator app : a -> a -> a, proofs that the monoid laws are satisfied etc.). In doing so, we are creating a map monoid: Type -> Type. A possible drawback of this approach is that given a monoid m:monoid a (a monoid with support type a) and m':monoid b (a monoid wih support type b), we cannot even write the equality m = m' (let alone prove it) because it is ill-typed. An alternative design would be to create a type monoid where the support type is just another field a:Type, so that given m m':monoid, it is always meaningful to ask whether m = m'. Somehow, one would like to argue that if m and m' have the same supports (a m = a m) and the operators are equals (app m = app m', which may be achieved thanks to some extensional equality axiom), and that the proof fields do not matter (because we have some proof irrelevance axiom) etc. , then m = m'. Unfortunately, we can't event express the equality app m = app m' because it is ill-typed...
To simplify the problem, suppose we have:
Inductive myType : Type :=
| make : forall (a:Type), a -> myType.
.
I would like to have results of the form:
forall (a b:Type) (x:a) (y:b), a = b -> x = y -> make a x = make b y.
This statement is ill-typed so we can't have it.
I may have axioms allowing me to prove that two types a and b are same, and I may be able to show that x and y are indeed the same too, but I want to have a tool allowing me to conclude that make a x = make b y. Any suggestion is welcome.
A low-tech way to prove this is to insert a manual type-cast, using the provided equality. That is, instead of having an assumption x = y, you have an assumption (CAST q x) = y. Below I explicitly write the cast as a match, but you could also make it look nicer by defining a function to do it.
Inductive myType : Type :=
| make : forall (a:Type), a -> myType.
Lemma ex : forall (a b:Type) (x:a) (y:b) (q: a = b), (match q in _ = T return T with eq_refl => x end) = y -> make a x = make b y.
Proof.
destruct q.
intros q.
congruence.
Qed.
There is a nicer way to hide most of this machinery by using "heterogenous equality", also known as JMeq. I recommend the Equality chapter of CPDT for a detailed introduction. Your example becomes
Require Import Coq.Logic.JMeq.
Infix "==" := JMeq (at level 70, no associativity).
Inductive myType : Type :=
| make : forall (a:Type), a -> myType.
Lemma ex : forall (a b:Type) (x:a) (y:b), a = b -> x == y -> make a x = make b y.
Proof.
intros.
rewrite H0.
reflexivity.
Qed.
In general, although this particular theorem can be proved without axioms, if you do the formalization in this style you are likely to encounter goals that can not be proven in Coq without axioms about equality. In particular, injectivity for this kind of dependent records is not provable. The JMEq library will automatically use an axiom JMeq_eq about heterogeneous equality, which makes it quite convenient.

What is difference between `destruct` and `case_eq` tactics in Coq?

I understood destruct as it breaks an inductive definition into its constructors. I recently saw case_eq and I couldn't understand what it does differently?
1 subgoals
n : nat
k : nat
m : M.t nat
H : match M.find (elt:=nat) n m with
| Some _ => true
| None => false
end = true
______________________________________(1/1)
cc n (M.add k k m) = true
In the above context, if I do destruct M.find n m it breaks H into true and false whereas case_eq (M.find n m) leaves H intact and adds separate proposition M.find (elt:=nat) n m = Some v, which I can rewrite to get same effect as destruct.
Can someone please explain me the difference between the two tactics and when which one should be used?
The first basic tactic in the family of destruct and case_eq is called case. This tactic modifies only the conclusion. When you type case A and A has a type T which is inductive, the system replaces A in the goal's conclusion by instances of all the constructors of type T, adding universal quantifications for the arguments of these constructors, if needed. This creates as many goals as there are constructors in type T. The formula A disappears from the goal and if there is any information about A in an hypothesis, the link between this information and all the new constructors that replace it in the conclusion gets lost. In spite of this, case is an important primitive tactic.
Loosing the link between information in the hypotheses and instances of A in the conclusion is a big problem in practice, so developers came up with two solutions: case_eq and destruct.
Personnally, when writing the Coq'Art book, I proposed that we write a simple tactic on top of case that keeps a link between A and the various constructor instances in the form of an equality. This is the tactic now called case_eq. It does the same thing as case but adds an extra implication in the goal, where the premise of the implication is an equality of the form A = ... and where ... is an instance of each constructor.
At about the same time, the tactic destruct was proposed. Instead of limiting the effect of replacement in the goal's conclusion, destruct replaces all instances of A appearing in the hypotheses with instances of constructors of type T. In a sense, this is cleaner because it avoids relying on the extra concept of equality, but it is still incomplete because the expression A may be a compound expression f B, and if B appears in the hypothesis but not f B the link between A and B will still be lost.
Illustration
Definition my_pred (n : nat) := match n with 0 => 0 | S p => p end.
Lemma example n : n <= 1 -> my_pred n <= 0.
Proof.
case_eq (my_pred n).
Gives the two goals
------------------
n <= 1 -> my_pred n = 0 -> 0 <= 0
and
------------------
forall p, my_pred n = S p -> n <= 1 -> S p <= 0
the extra equality is very useful here.
In this question I suggested that the developer use case_eq (a == b) when (a == b) has type bool because this type is inductive and not very informative (constructors have no argument). But when (a == b) has type {a = b}+{a <> b} (which is the case for the string_dec function) the constructors have arguments that are proofs of interesting properties and the extra universal quantification for the arguments of the constructors are enough to give the relevant information, in this case a = b in a first goal and a <> b in a second goal.

How does elim work in Coq on /\ and \/?

In Coq Tutorial, section 1.3.1 and 1.3.2, there are two elim applications:
The first one:
1 subgoal
A : Prop
B : Prop
C : Prop
H : A /\ B
============================
B /\ A
after applying elim H,
Coq < elim H.
1 subgoal
A : Prop
B : Prop
C : Prop
H : A /\ B
============================
A -> B -> B /\ A
The second one:
1 subgoal
H : A \/ B
============================
B \/ A
After applying elim H,
Coq < elim H.
2 subgoals
H : A \/ B
============================
A -> B \/ A
subgoal 2 is:
B -> B \/ A
There are three questions. First, in the second example, I don't understand what inference rule (or, logical identity) is applied to the goal to generate the two subgoals. It is clear to me for the first example, though.
The second question, according to the manual of Coq, elim is related to inductive types. Therefore, it appears that elim cannot be applied here at all, because I feel that there are no inductive types in the two examples (forgive me for not knowing the definition of inductive types). Why can elim be applied here?
Third, what does elim do in general? The two examples here don't show a common pattern for elim. The official manual seems to be designed for very advanced users, since they define a term upon several other terms that are defined by even more terms, and their language is ambiguous.
Thank you so much for answering!
Jian, first let me note that the manual is open source and available at https://github.com/coq/coq ; if you feel that the wording / definition order could be improved please open an issue there or feel free to submit a pull request.
Regarding your questions, I think you would benefit from reading some more comprehensive introduction to Coq such as "Coq'art", "Software Foundations" or "Programs and Proofs" among others.
In particular, the elim tactic tries to apply the so called "elimination principle" for a particular type. It is called elimination because in a sense, the rule allows you to "get rid" of that particular object, allowing you to continue on the proof [I recommend reading Dummett for a more throughout discussion of the origins of logical connectives]
In particular, the elimination rule for the ∨ connective is usually written by logicians as follows:
A B
⋮ ⋮
A ∨ B C C
────────────────
C
that is to say, if we can derive C independently from A and B, then we can derive it from A ∨ B. This looks obvious, doesn't it?
Going back to Coq, it turns out that this rule has a computational interpretation thanks to the "Curry-Howard-Kolmogorov" equivalence. In fact, Coq doesn't provide most of the standard logical connectives as a built in, but it allow us to define them by means of "Inductive" datatypes, similar to those in Haskell or OCaml.
In particular, the definition of ∨ is:
Inductive or (A B : Prop) : Prop :=
| or_introl : A -> A \/ B
| or_intror : B -> A \/ B
that is to say, or A B is the piece of data that either contains an A or a B, together with a "tag", that allows us to "match" to know which one do we really have.
Now, the "elimination principle for or" has type:
or_ind : forall A B P : Prop, (A -> P) -> (B -> P) -> A \/ B -> P
The great thing of Coq is that such principle is not a "built-in", just a regular program! Think, could you write the code of the or_ind function? I'll give you a hint:
Definition or_ind A B P (hA : A -> P) (hB : B -> P) (orW : A ‌\/ B) :=
match orW with
| or_introl aW => ?
| or_intror bW => ?
end.
Once this function is defined, all that elim does, is to apply it, properly instantiating the variable P.
Exercise: solve your second example using apply and ord_ind instead of elim. Good luck!

Inductive definition for family of types

I have been struggling on this for a while now. I have an inductive type:
Definition char := nat.
Definition string := list char.
Inductive Exp : Set :=
| Lit : char -> Exp
| And : Exp -> Exp -> Exp
| Or : Exp -> Exp -> Exp
| Many: Exp -> Exp
from which I define a family of types inductively:
Inductive Language : Exp -> Set :=
| LangLit : forall c:char, Language (Lit c)
| LangAnd : forall r1 r2: Exp, Language(r1) -> Language(r2) -> Language(And r1 r2)
| LangOrLeft : forall r1 r2: Exp, Language(r1) -> Language(Or r1 r2)
| LangOrRight : forall r1 r2: Exp, Language(r2) -> Language(Or r1 r2)
| LangEmpty : forall r: Exp, Language (Many r)
| LangMany : forall r: Exp, Language (Many r) -> Language r -> Language (Many r).
The rational here is that given a regular expression r:Exp I am attempting to represent the language associated with r as a type Language r, and I am doing so with a single inductive definition.
I would like to prove:
Lemma L1 : forall (c:char)(x:Language (Lit c)),
x = LangLit c.
(In other words, the type Language (Lit c) has only one element, i.e. the language of the regular expression 'c' is made of the single string "c". Of course I need to define some semantics converting elements of Language r to string)
Now the specifics of this problem are not important and simply serve to motivate my question: let us use nat instead of Exp and let us define a type List n which represents the lists of length n:
Parameter A:Set.
Inductive List : nat -> Set :=
| ListNil : List 0
| ListCons : forall (n:nat), A -> List n -> List (S n).
Here again I am using a single inductive definition to define a family of types List n.
I would like to prove:
Lemma L2: forall (x: List 0),
x = ListNil.
(in other words, the type List 0 has only one element).
I have run out of ideas on this one.
Normally when attempting to prove (negative) results with inductive types (or predicates), I would use the elim tactic (having made sure all the relevant hypothesis are inside my goal (generalize) and only variables occur in the type constructors). But elim is no good in this case.
If you are willing to accept more than just the basic logic of Coq, you can just use the dependent destruction tactic, available in the Program library (I've taken the liberty of rephrasing your last example in terms of standard-library vectors):
Require Coq.Vectors.Vector.
Require Import Program.
Lemma l0 A (v : Vector.t A 0) : v = #Vector.nil A.
Proof.
now dependent destruction v.
Qed.
If you inspect the term, you'll see that this tactic relied on the JMeq_eq axiom to get the proof to go through:
Print Assumptions l0.
Axioms:
JMeq_eq : forall (A : Type) (x y : A), x ~= y -> x = y
Fortunately, it is possible to prove l0 without having to resort to features outside of Coq's basic logic, by making a small change to the statement of the previous lemma.
Lemma l0_gen A n (v : Vector.t A n) :
match n return Vector.t A n -> Prop with
| 0 => fun v => v = #Vector.nil A
| _ => fun _ => True
end v.
Proof.
now destruct v.
Qed.
Lemma l0' A (v : Vector.t A 0) : v = #Vector.nil A.
Proof.
exact (l0_gen A 0 v).
Qed.
We can see that this new proof does not require any additional axioms:
Print Assumptions l0'.
Closed under the global context
What happened here? The problem, roughly speaking, is that in Coq we cannot perform case analysis on terms of dependent types whose indices have a specific shape (such as 0, in your case) directly. Instead, we must prove a more general statement where the problematic indices are replaced by variables. This is exactly what the l0_gen lemma is doing. Notice how we had to make the match on n return a function that abstracts on v. This is another instance of what is known as "convoy pattern". Had we written
match n with
| 0 => v = #Vector.nil A
| _ => True
end.
Coq would see the v in the 0 branch as having type Vector.t A n, making that branch ill-typed.
Coming up with such generalizations is one of the big pains of doing dependently typed programming in Coq. Other systems, such as Agda, make it possible to write this kind of code with much less effort, but it was only recently shown that this can be done without relying on the extra axioms that Coq wanted to avoid including in its basic theory. We can only hope that this will be simplified in future versions.