Forcing evaluation of terms before extraction, or other ways to test extracted functions? - coq

In many parts of the standard library, there are a bunch of OCaml-native functions with no Coq counterpart. I implemented a Coq version of some of them (with added proofs to show that the Coq versions behave as I think they ought to), but now how do I check that this actually matches the OCaml version's behavior?
The easiest / most "obvious" way I can think of would be to take a test input x, compute f x in Coq, and extract a comparison f x = [[f x]] (where [[…]] marks evaluation in Coq), then repeat for some number of test cases. Unfortunately, I can't find a way to force evaluation in Coq.
Is there a trick to make this possible? Or is there some other standard way for testing behavior across extraction? (Manually going Compute (f x). and plugging the resulting terms as literals back into the sanity checks would be an ugly fallback… Unfortunately, those won't update automatically when the function is changed, and it would also be a lot of manual work that doesn't exactly encourage exhaustive tests…)
Minimal sample:
Definition foo (b : bool) : bool :=
match b with | true => false | false => true end.
Extract Inlined Constant foo => "not".
Extract Inlined Constant bool_eq => "(=)".
Definition foo_true := foo true.
Definition foo_false := foo false.
Definition foo_test : bool :=
andb (bool_eq (foo true) foo_true) (bool_eq (foo false) foo_false).
Recursive Extraction foo_test.
results in
(** val foo_true : bool **)
let foo_true = not true
(** val foo_false : bool **)
let foo_false = not false
(** val foo_test : bool **)
let foo_test = (&&) ((=) (not true) foo_true) ((=) (not false) foo_false)
which is unhelpful. All the ways I could think of to change the definition of foo_true/foo_false to try and have them pre-evaluated didn't work. I can't find any flag in the extraction mechanism to force evaluation… Both the reference manual's attribute index and the flags/options/tables index didn't contain anything that looks useful. Is there anything that I missed?

Starting a definition with Eval compute in evaluates it before registering the definition.
Definition foo_true := Eval compute in foo true.
Definition foo_false := Eval compute in foo false.
Definition foo_test : bool :=
andb (bool_eq (foo true) foo_true) (bool_eq (foo false) foo_false).
Recursive Extraction foo_test.

Related

Why is my definition not allowed because of strict positivity?

I have the following two definitions that result in two different error messages.
The first definition is declined because of strict positivity and the second one because of a universe inconsistency.
(* non-strictly positive *)
Inductive SwitchNSP (A : Type) : Type :=
| switchNSP : SwitchNSP bool -> SwitchNSP A.
Fail Inductive UseSwitchNSP :=
| useSwitchNSP : SwitchNSP UseSwitchNSP -> UseSwitchNSP.
(* universe inconsistency *)
Inductive SwitchNSPI : Type -> Type :=
| switchNSPI : forall A, SwitchNSPI bool -> SwitchNSPI A.
Fail Inductive UseSwitchNSPI :=
| useSwitchNSPI : SwitchNSPI UseSwitchNSPI -> UseSwitchNSPI.
Chatting on gitter revealed that universe (in)consistencies are checked first, that is, the first definition adheres this check, but then fails because of a strict positivity issue.
As far as I understand the strict positivity restriction, if Coq allows non-strictly positivity data type definitions, I could construct non-terminating functions without using fix (which is pretty bad).
In order to make it even more confusing, the first definition is accepted in Agda and the second one gives a strict positivity error.
data Bool : Set where
True : Bool
False : Bool
data SwitchNSP (A : Set) : Set where
switchNSP : SwitchNSP Bool -> SwitchNSP A
data UseSwitchNSP : Set where
useSwitchNSP : SwitchNSP UseSwitchNSP -> UseSwitchNSP
data SwitchNSPI : Set -> Set where
switchNSPI : forall A -> SwitchNSPI Bool -> SwitchNSPI A
data UseSwitchNSPI : Set where
useSwitchNSP : SwitchNSPI UseSwitchNSPI -> UseSwitchNSPI
Now my question is two-folded: first, what is the "evil example" I could construct with the above definition? Second, which of the rules applies to the above definition?
Some notes:
To clarify, I think that I do understand why the second definition is not allowed type-checking-wise, but still feel that there is nothing "evil" happening here, when the definition is allowed.
I first thought that my example is an instance of this question, but enabling universe polymorphism does not help for the second definition.
Can I use some "trick" do adapt my definition such that it is accepted by Coq?
Unfortunately, there's nothing super deep about this example. As you noted Agda accepts it, and what trips Coq is the lack of uniformity in the parameters. For example, it accepts this:
Inductive SwitchNSPA (A : Type) : Type :=
| switchNSPA : SwitchNSPA A -> SwitchNSPA A.
Inductive UseSwitchNSPA :=
| useSwitchNSPA : SwitchNSPA UseSwitchNSPA -> UseSwitchNSPA.
Positivity criteria like the one used by Coq are not complete, so they will reject harmless types; the problem with supporting more types is that it often makes the positivity checker more complex, and that's already one of the most complex pieces of the kernel.
As for the concrete details of why it rejects it, well, I'm not 100% sure. Going by the rules in the manual, I think it should be accepted?
EDIT: The manual is being updated.
Specifically, using the following shorter names to simplify the following:
Inductive Inner (A : Type) : Type := inner : Inner bool -> Inner A.
Inductive Outer := outer : Inner Outer -> Outer.
Correctness rules
Positivity condition
Here,
X = Outer
T = forall x: Inner X, X
So we're in the second case with
U = Inner X
V = X
V is easy, so let's do that first:
V = (X) falls in the first case, with no t_i, hence is positive for X
For U: is U = Inner X strictly positive wrt X?
Here,
T = Inner X
Hence we're in the last case: T converts to (I a1) (no t_i) with
I = Inner
a1 = X
and X does not occur in the t_i, since there are no t_i.
Do the instantiated types of the constructors satisfy the nested positivity condition?
There is only one constructor:
inner : Inner bool -> Inner X.
Does this satisfy the nested positivity condition?
Here,
T = forall x: Inner bool, Inner X.
So we're in the second case with
U = Inner bool
V = Inner X
X does not occur in U, so X is strictly positive in U.
Does V satisfy the nested positivity condition for X?
Here,
T = Inner X
Hence we're in the first case: T converts to (I b1) (no u_i) with
I = Inner
b1 = X
There are no u_i, so V satisfies the nested positivity condition.
I have opened a bug report. The manual is being fixed.
Two more small things:
I can't resist pointing that your type is empty:
Theorem Inner_empty: forall A, Inner A -> False.
Proof. induction 1; assumption. Qed.
You wrote:
if Coq allows non-strictly positivity data type definitions, I could construct non-terminating functions without using fix (which is pretty bad).
That's almost correct, but not exactly: if Coq didn't enforce strict positivity, you could construct non-terminating functions period, which is bad. It doesn't matter whether they use fix or not: having non-termination in the logic basically makes it unsound (and hence Coq prevents you from writing fixpoints that do not terminate by lazy reduction).

Are Coercions allowed within polymorphic datatypes in coq?

I'm using a lot of pairs with types that would normally be coerced freely. However, the notation doesn't explicitly provide the types for the pairs, even though the context does. The following code provides a simple MWE of the principle:
Coercion nat_to_bool (n : nat) : bool :=
match n with
| 0 => false
| _ => true
end.
Inductive newtype : Type :=
| firstconstr : bool -> newtype
| secondconstr : (bool * bool) -> newtype.
Check (3 : bool).
Check (firstconstr 3).
Fail Check (secondconstr (3, true)).
Check (secondconstr (#pair bool bool 3 true)).
The failure message is The term "(3, true)" has type "(nat * bool)%type" while it is expected to have type "(bool * bool)%type". Obviously the type system can figure out the type that it is supposed to be, but can't decide to coerce the value properly.
Is there a way to declare the coercion in such a way that tuples (and other polymorphic data types with inferred types) are coerced properly?
Maybe you can make a notation that desugars to an annotation so the coercion goes through, assuming the pair is always constructed explicitly (that seems fair for a typing judgement G |- x : T where x : T is a pair, although I think the whole judgement is more commonly seen as a triple).
Here's a PoC with the minimal example:
Notation "'secondconstr'' ( x , y )" := (secondconstr (x : bool, y)).
Check (secondconstr' (3, true)).
And in the context of your types judgement, it might look like this:
Notation "Gamma |- v : T" := (types Gamma (v : value, T))
(at level 100).

Encoding of inferrable records

As you probably know, records are somewhat special in ocaml, as each label has to be uniquely assigned to a nominal record type, i.e. the following function cannot be typed without context:
let f r = r.x
Proper first class records (i.e. things that behave like tuples with labels) are trivially encoded using objects, e.g.
let f r = r#x
when creating the objects in the right way (i.e. no self-recursion, no mutation), they behave just like records.
I am however, somewhat unhappy with this solution for two reasons:
when making records updatetable (i.e. by adding an explicit "with_l" method for each label l), the type is somewhat too loose (it should be the same as the original record). Admitted, one can enforce this equality, but this is still inconvenient.
I have the suspicion that the OCaml compiler does not infer that these records are actually immutable: In a function
let f r = r#x + r#x
would the compiler be able to run a common subexpression elimination?
For these reasons, I wonder if there is a better encoding:
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
Can this encoding avoid the problems mentioned above?
If I understand you correctly you're looking for a very special kind of polymorphism. You want to write a function that will work for all types, such that the type is a record with certain fields. This sounds more like a syntactic polymorphism in a C++ style, not as semantic polymorphism in ML style. If we will slightly rephrase the task, by capturing the idea that a field accessing is just a syntactic sugar for a field projection function, then we can say, that you want to write a function that is polymorphic over all types that provide a certain set of operations. This kind of polymorphism can be captured by OCaml using one of the following mechanisms:
functors
first class modules
objects
I think that functors are obvious, so I will show an example with first class modules. We will write a function print_student that will work on any type that satisfies the Student signature:
module type Student = sig
type t
val name : t -> string
val age : t -> int
end
let print_student (type t)
(module S : Student with type t = t) (s : t) =
Printf.printf "%s %d" (S.name s) (S.age s)
The type of print_student function is (module Student with type t = 'a) -> 'a -> unit. So it works for any type that satisfies the Student interface, and thus it is polymorphic. This is a very powerful polymorphism that comes with a price, you need to pass the module structure explicitly when you're invoking the function, so it is a System F style polymorphism. Functors will also require you to specify concrete module structure. So both are not inferrable (i.e., not an implicit Hindley-Milner-like style polymorphism, that you are looking for). For the latter, only objects will work (there are also modular implicits, that relax the explicitness requirement, but they are still not in the trunk, but they will actually answer your requirements).
With object-style row polymorphism it is possible to write a function that is polymorphic over a set of types conforming to some signature, and to infer this signature implicitly from the function definintion. However, such power comes with a price. Since object operations are encoded with methods and methods are just function pointers that are assigned dynamically in the runtime, you shouldn't expect any compile time optimizations. It is not possible to perform any static analysis on something that is bound dynamically. So, of course, no Common Subexpression elimination, nor inlining. For functors and first class modules, the optimization is possible on a newer branch of the compiler with flamba (See 4.03.0+flambda opam switch). But on a regular compiler installation no inlining will be performed.
Different approaches
What concerning other techniques. First of all we can use camlp{4,5}, or ppx or even m4 and cpp to preprocess code, but this would be hardly idiomatic and of doubtful usefulness.
Another way, is instead of writing a function that is polymorphic, we can try to find a suitable monomorphic data type. A direct approach would be to use a list of polymorphic variants, e.g.,
type attributes = [`name of string | `age of int]
type student = attribute list
In fact we even don't need to specify all these types ahead, and our function can require only those fields that are needed, a form of a row polymorphism:
let rec name = function
| [] -> raise Not_found
| `name n -> n
| _ :: student -> name student
The only problem with this encoding, is that you cannot guarantee that the same named attribute can occur once and only once. So it is possible that a student doesn't have a name at all, or, that is worser, it can have more then one names. Depending on your problem domain it can be acceptable.
If it is not, then we can use GADT and extensible variants to encode heterogenous maps, i.e., an associative data structures that map keys to
different type (in a regular (homogenous) map or assoc list value type is unified). How to construct such containers is beyond the scope of the answer, but fortunately there're at least two available implementations. One, that I use personally is called universal map (Univ_map) and is provided by a Core library (Core_kernel in fact). It allows you to specify two kinds of heterogenous maps, with and without a default values. The former corresponds to a record with optional field, the latter has default for each field, so an accessor is a total function. For example,
open Core_kernel.Std
module Dict = Univ_map.With_default
let name = Dict.Key.create ~name:"name" ~default:"Joe" sexp_of_string
let age = Dict.Key.create ~name:"age" ~default:18 sexp_of_int
let print student =
printf "%s %d"
(Dict.get student name) (Dict.get age name)
You can hide that you're using universal map using abstract type, as there is only one Dict.t that can be used across different abstractions, that may break modularity. Another example of heterogeneous map implementation is from Daniel Bunzli. It doesn't provide With_default kind of map, but has much less dependencies.
P.S. Of course for such a redundant case, where this only one operation it is much easier to just pass this operation explicitly as function, instead of packing it into a structure, so we can write function f from your example as simple as let f x r = x r + x r. But this would be the same kind of polymoprism as with first class modules/functors, just simplified. And I assume, that your example was specifically reduced to one field, and in your real use case you have more complex set of fields.
Very roughly speaking, an OCaml object is a hash table whose keys are its method name hash. (The hash of a method name can be obtained by Btype.hash_variant of OCaml compiler implementation.)
Just like objects, you can encode polymorphic records using (int, Obj.t) Hashtbl.t. For example, a function to get a value of a field l can be written as follows:
(** [get r "x"] is poly-record version of [r.x] *)
let get r k = Hashtbl.find t (Btype.hash_variant k))
Since it is easy to access the internals unlike objects, the encoding of {r with l = e} is trivial:
(** [copy_with r [(k1,v1);..;(kn,vn)]] is poly-record version of
[{r with k1 = v1; ..; kn = vn}] *)
let copy_with r fields =
let r = Hashtbl.copy r in
List.iter (fun (k,v) -> Hashtbl.replace r (Btype.hash_variant k) v) fields
and the creation of poly-records:
(** [create [(k1,v1);..(kn,vn)]] is poly-record version of [{k1=v1;..;kn=vn}] *)
let create fields = copy_with fields (Hashtbl.create (List.length fields))
Since all the types of the fields are squashed into one Obj.t, you have to use Obj.magic to store various types into this implementation and therefore this is not type-safe by itself. However, we can make it type-safe wrapping (int, Obj.t) Hashtbl.t with phantom type whose parameter denotes the fields and their types of a poly-record. For example,
<x : int; y : float> Poly_record.t
is a poly-record whose fields are x : int and y : float.
Details of this phantom type wrapping for the type safety is too long to explain here. Please see my implementation https://bitbucket.org/camlspotter/ppx_poly_record/src . To tell short, it uses PPX preprocessor to generate code for type-safety and to provide easier syntax sugar.
Compared with the encoding by objects, this approach has the following properties:
The same type safety and the same field access efficiency as objects
It can enjoy structural subtyping like objects, what you want for poly-records.
{r with l = e} is possible
Streamable outside of a program safely, since hash tables themselves have no closure in it. Objects are always "contaminated" with closures therefore they are not safely streamable.
Unfortunately it lacks efficient pattern matching, which is available for mono-records. (And this is why I do not use my implementation :-( ) I feel for it PPX reprocessing is not enough and some compiler modification is required. It will not be really hard though since we can make use of typing of objects.
Ah and of course, this encoding is very side effective therefore no CSE optimization can be expected.
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
For immutable records, yes. There is a standard theoretical duality between polymorphic records ("inferrable" records as you describe) and polymorphic variants. In short, a record { l_1 = v_1; l_2 = v_2; ...; l_n = v_n } can be implemented by
function `l_1 k -> k v_1 | `l_2 k -> k v_2 | ... | `l_n k -> k v_n
and then the projection r.l_i becomes r (`l_i (fun v -> v)). For instance, the function fun r -> r.x is encoded as fun r -> r (`x (fun v -> v)). See also the following example session:
# let myRecord = (function `field1 k -> k 123 | `field2 k -> k "hello") ;;
(* encodes { field1 = 123; field2 = "hello" } *)
val myRecord : [< `field1 of int -> 'a | `field2 of string -> 'a ] -> 'a = <fun>
# let getField1 r = r (`field1 (fun v -> v)) ;;
(* fun r -> r.field1 *)
val getField1 : ([> `field1 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField1 myRecord ;;
- : int = 123
# let getField2 r = r (`field2 (fun v -> v)) ;;
(* fun r -> r.field2 *)
val getField2 : ([> `field2 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField2 myRecord ;;
- : string = "hello"
For mutable records, we can add setters like:
let ref1 = ref 123
let ref2 = ref "hello"
let myRecord =
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
and use them like myRecord (`set_field1(456, fun v -> v)) and myRecord (`set_field2("world", fun v -> v)) for example. However, localizing ref1 and ref2 like
let myRecord =
let ref1 = ref 123 in
let ref2 = ref "hello" in
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
causes a value restriction problem and requires a little more polymorphic typing trick (which I omit here).
Can this encoding avoid the problems mentioned above?
The "common subexpression elimination" for (the encoding of) r.x + r.x can be done only if OCaml knows the definition of r and inlines it. (Sorry my previous answer was inaccurate here.)

Equality in COQ for enumerated types

I have a finite enumerated type (say T) in COQ and want to check elements for equality. This means, I need a function
bool beq_T(x:T,y:T)
The only way I managed to define such a function, is with a case by case analysis. This results in lots of match statements and looks very cumbersome. Therefore, I wonder if there is not a simpler way to achieve this.
Even simpler: Scheme Equality for ... generates two functions returning respectively a boolean and a sumbool.
The bad news is that the program that implements beq_T must necessarily consist of a large match statement on both of its arguments. The good news is that you can automatically generate/synthesize this program using Coq's tactic language. For example, given the type:
Inductive T := t0 | t1 | t2 | t3.
You can define beq_T as follows. The first two destruct tactics synthesize the code necessary to match on both x and y. The match tactic inspects the branch of the match that it is in, and depending on whether x = y, the tactic either synthesises the program that returns true or false.
Definition beq_T (x y:T) : bool.
destruct x eqn:?;
destruct y eqn:?;
match goal with
| _:x = ?T, _:y = ?T |- _ => exact true
| _ => exact false
end.
Defined.
If you want to see the synthesized program, run:
Print beq_T.
Thankfully, Coq already comes with a tactic that does almost what you want. It is called decide equality. It automatically synthesizes a program that decides whether two elements of a type T are equal. But instead of just returning a boolean value, the synthesized program returns a proof of the (in)equality of the two elements.
Definition eqDec_T (x y:T) : {x = y} + {x <> y}.
decide equality.
Defined.
With that program synthesized, it is easy to implement beq_T.
Definition beq_T' {x y:T} : bool := if eqDec_T x y then true else false.

Transforming a inductive value into an inductive value of another type

In database theory, one assumes the existence of two disjoint sets containing variables and constants.
I want to make the distinction between variables and constants at the type level of my values, using the two following inductive types:
Inductive variable :=
| var : string -> variable.
Inductive constant :=
| con : string -> constant.
I then would like to define a function mapping a variable to a constant using the same string internally.
I can reach this objective using the following definition:
Definition variable_to_constant : variable -> constant.
intros.
destruct H.
apply (con s).
Defined.
Check test.
test
: variable -> constant
Eval compute in (variable_to_constant (var "hello")).
= con "hello"
: constant
My question is: Is there a more elegant solution to define this function ? (I mean, not using the "Defined." keyword, maybe ...)
This is my first development using Coq, so if you think that my way of achieving this is wrong, please let me know :-)
Have a nice day !
Fabian Pijcke
You can simply write the code of the function directly rather than building it interactively using tactics:
Definition variable_to_constant : variable -> constant :=
fun v => match v with var str => con str end.
Or even the lighter:
Definition variable_to_constant (v : variable) : constant :=
match v with var str => con str end.