Datatype for terms over a signature in SML - smlnj

I want to implement an arbitrary signature in SML. How can I define a datatype for terms over that signature? I need it to write functions that check whether the terms are well formed.

In my view, there are two major ways of representing an AST: either as a series of (possibly mutually recursive) datatypes or as one big datatype. There are pros and cons for both.
If we define the following BNF (extracted from the SML definition and slightly simplified)
<exp> ::= <exp> andalso <exp>
| <exp> orelse <exp>
| raise <exp>
| <appexp>
<appexp> ::= <atexp> | <appexp> <atexp>
<atexp> ::= ( <exp>, ..., <exp> )
| [ <exp>, ..., <exp> ]
| ( <exp> ; ... ; <exp> )
| ( <exp> )
| ()
As stated this is simplified and much of the atexp is left out.
1. A series of possibly mutually recursive datatypes
Here you would for example create a datatype for expressions, declarations, patterns, etc.
Basically you would create a datatype for each of the non-terminals in your BNF.
We would most likely create the following datatypes
datatype exp = Andalso of exp * exp
| Orelse of exp * exp
| Raise of exp
| App of exp * atexp
| Atexp of atexp
and atexp = Tuple of exp list
| List of exp list
| Seq of exp list
| Par of exp
| Unit
Notice that the non-terminal <appexp> has been absorbed into the exp datatype instead of getting a datatype of its own. That would just clutter up the AST for no reason. You have to remember that a BNF is often written in such a way that it also defines precedence and associativity (e.g., for arithmetic). In such cases you can often simplify the BNF by merging multiple non-terminals into one datatype.
The good thing about defining multiple datatypes is that you get some well-formedness of your AST for free. If, for example, we also had a non-terminal for declarations, we would know that the AST can never contain a declaration inside a list (as only expressions can be there). Because of this, most of your well-formedness checks are not necessary.
This is however not always a good thing. Often you need to do some checking on the AST anyway, for example type checking. In many cases the BNF is quite large, and thus the number of datatypes necessary to model the AST is also quite large. Keeping this in mind, you have to create one function for each of your datatypes, for every kind of modification you want to do on your AST. In many cases you only want to change a small part of your AST, but you will (most likely) still need to define a function for each datatype. Most of these functions will basically be the identity, and only in a few cases will you do the desired work.
If for example we want to count how many units there are in a given AST, we could define the following functions
fun countUnitexp (Andalso (e1, e2)) = countUnitexp e1 + countUnitexp e2
| countUnitexp (Orelse (e1, e2)) = countUnitexp e1 + countUnitexp e2
| countUnitexp (Raise e1) = countUnitexp e1
| countUnitexp (App (e1, atexp)) = countUnitexp e1 + countUnitatexp atexp
| countUnitexp (Atexp atexp) = countUnitatexp atexp
and countUnitatexp (Tuple exps) = sumUnit exps
| countUnitatexp (List exps) = sumUnit exps
| countUnitatexp (Seq exps) = sumUnit exps
| countUnitatexp (Par exp) = countUnitexp exp
| countUnitatexp Unit = 1
and sumUnit exps = foldl (fn (exp,b) => b + countUnitexp exp) 0 exps
As you may see we are doing a lot of work, just for this simple task. Imagine a bigger grammar and a more complicated task.
2. One (big) datatype (nodes) -- and a Tree of these nodes
Let's combine the datatypes from before, but change them so that they don't contain their children themselves. In this approach we instead build a tree structure that has a node and some children of that node. Obviously, if you have an identifier, then the identifier node needs to contain the actual string representation (e.g., the variable name).
So let's start out by defining the nodes for the tree structure.
(* The comment is the kind of children and possibly specific number of children
that the BNF defines to be valid *)
datatype node = Exp_Andalso (* [exp, exp] *)
| Exp_Orelse (* [exp, exp] *)
| Exp_Raise (* [exp] *)
| Exp_App (* [exp, atexp] *)
(* Superfluous:| Exp_Atexp (* [atexp] *) *)
| Atexp_Tuple (* exp list *)
| Atexp_List (* exp list *)
| Atexp_Seq (* exp list *)
| Atexp_Par (* [exp] *)
| Atexp_Unit (* [] *)
See how the Atexp constructor from the exp datatype now becomes superfluous, so we remove it. Personally I think it is nice to have the comment alongside, telling which children (in the tree structure) we can expect.
(* Note this is a non-empty tree. That is, you have to pack it in an option type
if you want to represent an empty tree *)
datatype 'a tree = T of 'a * 'a tree list
(* Define the ast as trees of our node datatype *)
type ast = node tree
We then define a generic tree and define the type ast to be a "tree of nodes".
If you use some library then there is a big chance that such a tree structure is already present. Also, it might be handy later on to extend this tree structure to contain more than just the node as data; however, we just keep it simple here.
fun foldTree f b (T (n, [])) = f (n, b)
| foldTree f b (T (n, ts)) = foldl (fn (t, b') => foldTree f b' t)
(f (n, b)) ts
For this example we define a fold function over the tree. Again, if you are using a library, then all these functions for folding, mapping, etc. are most likely already defined.
fun countUnit (Atexp_Unit) = 1
| countUnit _ = 0
If we then take the example from before, where we want to count the number of occurrences of unit, we can just fold the above function over the tree.
val someAST = T(Atexp_Tuple, [ T (Atexp_Unit, [])
, T (Exp_Raise, [])
, T (Atexp_Unit, [])
]
)
A simple AST could look like the above (note that this is actually not valid, as we have an Exp_Raise with no children). And we could then do the counting by
foldTree (fn (a,b) => (countUnit a) + b) 0 someAST
The downside of this approach is that you have to write a check function that verifies that your AST is well formed, as there are no restrictions when you create the AST. This includes checking that the children are of the correct "type" (e.g., only Exp_* as children in an Exp_Andalso) and that there is the correct number of children (e.g., exactly two children in Exp_Andalso).
This approach also requires a bit of groundwork to get started, unless you use a library that already defines a tree (including auxiliary functions for modifying it). However, in the long run it pays off.
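The check function mentioned above could look roughly like the following. This is only a sketch, written in OCaml rather than SML (the two are close enough to translate line by line). The names sort, sort_of, is_atexp and well_formed are invented for illustration, and the arity rules simply follow the comments on the node datatype.

```ocaml
(* Node labels, mirroring the SML node datatype. *)
type node =
  | Exp_Andalso | Exp_Orelse | Exp_Raise | Exp_App
  | Atexp_Tuple | Atexp_List | Atexp_Seq | Atexp_Par | Atexp_Unit

(* Non-empty rose tree: a label plus a list of children. *)
type 'a tree = T of 'a * 'a tree list

(* Hypothetical helper: the non-terminal a node label belongs to. *)
type sort = Exp | Atexp

let sort_of = function
  | Exp_Andalso | Exp_Orelse | Exp_Raise | Exp_App -> Exp
  | Atexp_Tuple | Atexp_List | Atexp_Seq | Atexp_Par | Atexp_Unit -> Atexp

let is_atexp (T (n, _)) = sort_of n = Atexp

(* Check the number (and, for Exp_App, the sort) of the children of every
   node. Because Exp_Atexp was dropped, an atexp is accepted wherever an
   exp is expected, so only the argument of Exp_App needs a sort check. *)
let rec well_formed (T (n, ts)) =
  let shape_ok =
    match n, ts with
    | (Exp_Andalso | Exp_Orelse), [_; _] -> true
    | Exp_Raise, [_] -> true
    | Exp_App, [_; arg] -> is_atexp arg
    | (Atexp_Tuple | Atexp_List | Atexp_Seq), _ -> true
    | Atexp_Par, [_] -> true
    | Atexp_Unit, [] -> true
    | _ -> false
  in
  shape_ok && List.for_all well_formed ts

(* The invalid example from the text: Exp_Raise has no child. *)
let some_ast =
  T (Atexp_Tuple, [T (Atexp_Unit, []); T (Exp_Raise, []); T (Atexp_Unit, [])])

let () = assert (not (well_formed some_ast))
```

A real checker would probably report which node is malformed instead of returning a bare boolean.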

How does one do an else statement in Coq's functional programming language?

I am trying to count the # of occurrences of an element v in a natlist/bag in Coq. I tried:
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => 0
| h :: tl => match h with
| v => 1 + (count v tl)
end
end.
however my proof doesn't work:
Example test_count1: count 1 [1;2;3;1;4;1] = 3.
Proof. simpl. reflexivity. Qed.
Why doesn't the first piece of code work? What is it doing when v isn't matched?
I also tried:
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => 0
| h :: tl => match h with
| v => 1 + (count v tl)
| _ => count v tl
end
end.
but that also gives an error in Coq and I can't even run it.
Functional programming is sort of new to me so I don't know how to actually express this in Coq. I really just want to say if h matches v then do a +1 and recurse else only recurse (i.e. add zero I guess).
Is there a simple way to express this in Coq's functional programming language?
The reason that I ask is because it feels to me that the match thing is very similar to an if else statement in "normal" Python programming. So either I am missing the point of functional programming or something. That is the main issue I am concerned I guess, implicitly.
(this is similar to Daniel's answer, but I had already written most of it)
Your problem is that in this code:
match h with
| v => 1 + (count v tl)
end
matching with v binds a new variable v. To test if h is equal to v, you'll have to use some decision procedure for testing equality of natural numbers.
For example, you could use Nat.eqb, which takes two natural numbers and returns a bool indicating whether they're equal.
Require Import Nat.
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => 0
| h :: tl => if (eqb h v) then (1 + count v tl) else (count v tl)
end.
Why can't we simply match on the term we want? Pattern matching always matches on constructors of the type. In this piece of code, the outer match statement matches with nil and h :: tl (which is a notation for cons h tl or something similar, depending on the precise definition of bag). In a match statement like
match n with
| 0 => (* something *)
| S n => (* something else *)
end.
we match on the constructors for nat: 0 and S _.
In your original code, you try to match on v, which isn't a constructor, so Coq simply binds a new variable and calls it v.
The match statement you tried to write actually just shadows the v variable with a new variable also called v which contains just a copy of h.
In order to test whether two natural numbers are equal, you can use Nat.eqb which returns a bool value which you can then match against:
Require Import Arith.
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => 0
| h :: tl => match Nat.eqb v h with
| true => 1 + (count v tl)
| false => count v tl
end
end.
As it happens, for matching of bool values with true or false, Coq also provides syntactic sugar in the form of a functional if/else construct (which is much like the ternary ?: operator from C or C++ if you're familiar with either of those):
Require Import Arith.
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => 0
| h :: tl => if Nat.eqb v h then
1 + (count v tl)
else
count v tl
end.
(Actually, it happens that if works with any inductive type with exactly two constructors: then the first constructor goes to the if branch and the second constructor goes to the else branch. However, the list type has nil as its first constructor and cons as its second constructor: so even though you could technically write an if statement taking in a list to test for emptiness or nonemptiness, it would end up reversed from the way you would probably expect it to work.)
In general, however, for a generic type there won't necessarily be a way to decide whether two members of that type are equal or not, as there was Nat.eqb in the case of nat. Therefore, if you wanted to write a generalization of count which could work for more general types, you would have to take in an argument specifying the equality decision procedure.
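The generalization described above, where the caller passes in the equality test, can be sketched as follows. The sketch is in OCaml rather than Coq, so it says nothing about proofs; it only shows the shape of the function. The parameter name eq is made up, and Int.equal and String.equal are standard-library functions playing the role of Nat.eqb.

```ocaml
(* [count eq v s] counts the elements of [s] that [eq] considers equal
   to [v]. The equality test is an explicit argument, so the function
   works for any element type that comes with such a test. *)
let rec count eq v = function
  | [] -> 0
  | h :: tl -> if eq h v then 1 + count eq v tl else count eq v tl

let () =
  (* Same data as test_count1 from the question. *)
  assert (count Int.equal 1 [1; 2; 3; 1; 4; 1] = 3);
  assert (count String.equal "a" ["b"; "a"] = 1)
```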

Encoding of inferrable records

As you probably know, records are somewhat special in OCaml, as each label has to be uniquely assigned to a nominal record type, i.e. the following function cannot be typed without context:
let f r = r.x
Proper first class records (i.e. things that behave like tuples with labels) are trivially encoded using objects, e.g.
let f r = r#x
When the objects are created in the right way (i.e. no self-recursion, no mutation), they behave just like records.
I am, however, somewhat unhappy with this solution for two reasons:
When making records updatable (i.e. by adding an explicit "with_l" method for each label l), the type is somewhat too loose (it should be the same as the original record). Admittedly, one can enforce this equality, but this is still inconvenient.
I have the suspicion that the OCaml compiler does not infer that these records are actually immutable: In a function
let f r = r#x + r#x
would the compiler be able to run a common subexpression elimination?
For these reasons, I wonder if there is a better encoding:
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
Can this encoding avoid the problems mentioned above?
If I understand you correctly, you're looking for a very special kind of polymorphism. You want to write a function that will work for all types, as long as the type is a record with certain fields. This sounds more like syntactic polymorphism in the C++ style than semantic polymorphism in the ML style. If we slightly rephrase the task, capturing the idea that field access is just syntactic sugar for a field projection function, then we can say that you want to write a function that is polymorphic over all types that provide a certain set of operations. This kind of polymorphism can be captured in OCaml using one of the following mechanisms:
functors
first class modules
objects
I think that functors are obvious, so I will show an example with first class modules. We will write a function print_student that will work on any type that satisfies the Student signature:
module type Student = sig
type t
val name : t -> string
val age : t -> int
end
let print_student (type t)
(module S : Student with type t = t) (s : t) =
Printf.printf "%s %d" (S.name s) (S.age s)
The type of the print_student function is (module Student with type t = 'a) -> 'a -> unit. So it works for any type that satisfies the Student interface, and thus it is polymorphic. This is a very powerful polymorphism that comes with a price: you need to pass the module structure explicitly when you invoke the function, so it is a System F style polymorphism. Functors also require you to specify a concrete module structure. So neither is inferrable (i.e., neither is the implicit Hindley-Milner style polymorphism that you are looking for). For the latter, only objects will work (there are also modular implicits, which relax the explicitness requirement and would actually meet your requirements, but they are still not in trunk).
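To make the "pass the module explicitly" point concrete, here is a small self-contained sketch of a call site. Pair_student is a hypothetical implementation invented for this example, and describe_student is a variant of print_student that returns the string instead of printing it, so that the result is easy to inspect.

```ocaml
module type Student = sig
  type t
  val name : t -> string
  val age : t -> int
end

(* Same shape as print_student, but building a string. *)
let describe_student (type t) (module S : Student with type t = t) (s : t) =
  Printf.sprintf "%s %d" (S.name s) (S.age s)

(* Hypothetical implementation: a student is a (name, age) pair. *)
module Pair_student = struct
  type t = string * int
  let name = fst
  let age = snd
end

(* The module has to be named at every call site; nothing is inferred. *)
let () =
  print_endline (describe_student (module Pair_student) ("Ann", 21))
```

Any other module matching Student (for example one backed by a nominal record) can be passed to the same function in the same way.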
With object-style row polymorphism it is possible to write a function that is polymorphic over a set of types conforming to some signature, and to infer this signature implicitly from the function definition. However, such power comes with a price. Since object operations are encoded with methods, and methods are just function pointers that are assigned dynamically at runtime, you shouldn't expect any compile-time optimizations. It is not possible to perform static analysis on something that is bound dynamically. So, of course, no common subexpression elimination and no inlining. For functors and first class modules, the optimization is possible on a newer branch of the compiler with flambda (see the 4.03.0+flambda opam switch). But on a regular compiler installation no inlining will be performed.
Different approaches
As for other techniques: first of all, we can use camlp{4,5}, ppx, or even m4 and cpp to preprocess code, but this would hardly be idiomatic and is of doubtful usefulness.
Another way: instead of writing a function that is polymorphic, we can try to find a suitable monomorphic data type. A direct approach would be to use a list of polymorphic variants, e.g.,
type attribute = [`name of string | `age of int]
type student = attribute list
In fact, we don't even need to specify all these types ahead of time, and our function can require only those fields that are needed, a form of row polymorphism:
let rec name = function
| [] -> raise Not_found
| `name n :: _ -> n
| _ :: student -> name student
The only problem with this encoding is that you cannot guarantee that a named attribute occurs once and only once. So it is possible that a student doesn't have a name at all or, worse, has more than one name. Depending on your problem domain, this can be acceptable.
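The attribute-list encoding and its weakness can be tried out with a small self-contained sketch. The sample values joe and odd and the accessor age are invented for illustration; the accessors match on the head of the list.

```ocaml
(* Accessors only mention the tag they need, so each one is inferred
   with an open variant type such as [> `name of 'a ] list -> 'a. *)
let rec name = function
  | [] -> raise Not_found
  | `name n :: _ -> n
  | _ :: student -> name student

let rec age = function
  | [] -> raise Not_found
  | `age a :: _ -> a
  | _ :: student -> age student

let joe = [`name "Joe"; `age 18]

(* Nothing prevents a duplicated attribute: the first one wins. *)
let odd = [`age 20; `name "Ann"; `name "Bob"]

let () =
  assert (name joe = "Joe");
  assert (age joe = 18);
  assert (name odd = "Ann")
```

A missing attribute only shows up at run time, as the Not_found exception.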
If it is not, then we can use GADTs and extensible variants to encode heterogeneous maps, i.e., associative data structures that map keys to values of different types (in a regular (homogeneous) map or assoc list the value type is unified). How to construct such containers is beyond the scope of this answer, but fortunately there are at least two available implementations. One, which I use personally, is called universal map (Univ_map) and is provided by the Core library (Core_kernel, in fact). It allows you to specify two kinds of heterogeneous maps, without and with default values. The former corresponds to a record with optional fields; the latter has a default for each field, so an accessor is a total function. For example,
open Core_kernel.Std
module Dict = Univ_map.With_default
let name = Dict.Key.create ~name:"name" ~default:"Joe" sexp_of_string
let age = Dict.Key.create ~name:"age" ~default:18 sexp_of_int
let print student =
printf "%s %d"
(Dict.get student name) (Dict.get student age)
You can hide the fact that you're using a universal map behind an abstract type. This matters because there is only one Dict.t, shared across different abstractions, which may break modularity. Another example of a heterogeneous map implementation is from Daniel Bunzli. It doesn't provide a With_default kind of map, but has far fewer dependencies.
P.S. Of course, for such a reduced case, where there is only one operation, it is much easier to just pass this operation explicitly as a function instead of packing it into a structure, so we can write function f from your example as simply as let f x r = x r + x r. But this would be the same kind of polymorphism as with first class modules/functors, just simplified. And I assume that your example was deliberately reduced to one field, and that your real use case has a more complex set of fields.
Very roughly speaking, an OCaml object is a hash table whose keys are the hashes of its method names. (The hash of a method name can be obtained with Btype.hash_variant from the OCaml compiler implementation.)
Just like objects, you can encode polymorphic records using (int, Obj.t) Hashtbl.t. For example, a function to get a value of a field l can be written as follows:
(** [get r "x"] is poly-record version of [r.x] *)
let get r k = Hashtbl.find r (Btype.hash_variant k)
Since it is easy to access the internals unlike objects, the encoding of {r with l = e} is trivial:
(** [copy_with r [(k1,v1);..;(kn,vn)]] is poly-record version of
[{r with k1 = v1; ..; kn = vn}] *)
let copy_with r fields =
let r = Hashtbl.copy r in
List.iter (fun (k,v) -> Hashtbl.replace r (Btype.hash_variant k) v) fields;
r
and the creation of poly-records:
(** [create [(k1,v1);..(kn,vn)]] is poly-record version of [{k1=v1;..;kn=vn}] *)
let create fields = copy_with (Hashtbl.create (List.length fields)) fields
Since all the types of the fields are squashed into one Obj.t, you have to use Obj.magic to store various types into this implementation, and therefore it is not type-safe by itself. However, we can make it type-safe by wrapping (int, Obj.t) Hashtbl.t in a phantom type whose parameter denotes the fields of a poly-record and their types. For example,
<x : int; y : float> Poly_record.t
is a poly-record whose fields are x : int and y : float.
The details of this phantom type wrapping are too long to explain here. Please see my implementation https://bitbucket.org/camlspotter/ppx_poly_record/src . In short, it uses a PPX preprocessor to generate code for type safety and to provide convenient syntactic sugar.
Compared with the encoding by objects, this approach has the following properties:
The same type safety and the same field access efficiency as objects
It can enjoy structural subtyping like objects, which is what you want for poly-records.
{r with l = e} is possible
Streamable outside of a program safely, since hash tables themselves contain no closures. Objects are always "contaminated" with closures, therefore they are not safely streamable.
Unfortunately it lacks efficient pattern matching, which is available for mono-records. (And this is why I do not use my implementation :-( ) I feel that PPX preprocessing is not enough for it and some compiler modification is required. It would not be really hard, though, since we can make use of the typing of objects.
Ah, and of course, this encoding relies heavily on side effects, so no CSE optimization can be expected.
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
For immutable records, yes. There is a standard theoretical duality between polymorphic records ("inferrable" records as you describe) and polymorphic variants. In short, a record { l_1 = v_1; l_2 = v_2; ...; l_n = v_n } can be implemented by
function `l_1 k -> k v_1 | `l_2 k -> k v_2 | ... | `l_n k -> k v_n
and then the projection r.l_i becomes r (`l_i (fun v -> v)). For instance, the function fun r -> r.x is encoded as fun r -> r (`x (fun v -> v)). See also the following example session:
# let myRecord = (function `field1 k -> k 123 | `field2 k -> k "hello") ;;
(* encodes { field1 = 123; field2 = "hello" } *)
val myRecord : [< `field1 of int -> 'a | `field2 of string -> 'a ] -> 'a = <fun>
# let getField1 r = r (`field1 (fun v -> v)) ;;
(* fun r -> r.field1 *)
val getField1 : ([> `field1 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField1 myRecord ;;
- : int = 123
# let getField2 r = r (`field2 (fun v -> v)) ;;
(* fun r -> r.field2 *)
val getField2 : ([> `field2 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField2 myRecord ;;
- : string = "hello"
For mutable records, we can add setters like:
let ref1 = ref 123
let ref2 = ref "hello"
let myRecord =
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
and use them like myRecord (`set_field1(456, fun v -> v)) and myRecord (`set_field2("world", fun v -> v)) for example. However, localizing ref1 and ref2 like
let myRecord =
let ref1 = ref 123 in
let ref2 = ref "hello" in
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
causes a value restriction problem and requires a little more polymorphic typing trick (which I omit here).
Can this encoding avoid the problems mentioned above?
The "common subexpression elimination" for (the encoding of) r.x + r.x can be done only if OCaml knows the definition of r and inlines it. (Sorry my previous answer was inaccurate here.)

Is there a way to derive Num class functions in own data type in Haskell?

Let's say I have a type declaration:
data MyType = N Double | C Char | Placeholder
I want to be able to treat MyType as a Double whenever it's possible, with all the Num, Real, Fractional functions resulting in N (normal result) for arguments wrapped in the N constructor, and Placeholder for other arguments
> (N 5.0) + (N 6.0)
N 11.0
> (N 5.0) + (C 'a')
Placeholder
Is there a way to do this other than simply defining this class as an instance of those classes in a manner similar to:
instance Num MyType where
(+) (N d1) (N d2) = N (d1+d2)
(+) _ _ = Placeholder
...
(which seems counter-productive)?
There is no generic deriving available in standard Haskell: currently, deriving is only available as defined by the compiler for specific Prelude typeclasses: Read, Show, Eq, Ord, Enum, and Bounded.
The Glasgow Haskell Compiler (GHC) apparently has extensions that support generic deriving. However, I don't know if it would actually save you any work to try and use them: how many typeclasses do you need to derive a Num instance from? And, are you sure that you can define an automatic scheme for deriving Num that will always do what you want?
As noted in the comments, you need to describe what your Num instance will do in any case. And describing and debugging a general scheme is certain to be more work than describing a particular one.
No, you can't do this automatically, but I think what leftaroundabout could have been getting at is that you can use Applicative operations to help you.
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad (ap)
data MyType n = N n | C Char | Placeholder deriving (Show, Eq, Functor)
instance Applicative MyType where
pure = N
(<*>) = ap
instance Monad MyType where
N n >>= f = f n
C c >>= _ = C c
Placeholder >>= _ = Placeholder
Now you can write
instance Num n => Num (MyType n) where
x + y = (+) <$> x <*> y
abs = fmap abs
...

Faster code for 'distinct' on lists

This question refers to code generation with the Isabelle/HOL theorem prover.
When I export code for the distinct function on lists
export_code distinct in Scala file -
I get the following code
def member[A : HOL.equal](x0: List[A], y: A): Boolean = (x0, y) match {
case (Nil, y) => false
case (x :: xs, y) => HOL.eq[A](x, y) || member[A](xs, y)
}
def distinct[A : HOL.equal](x0: List[A]): Boolean = x0 match {
case Nil => true
case x :: xs => ! (member[A](xs, x)) && distinct[A](xs)
}
This code has quadratic runtime. Is there a faster version available? I am thinking of something like importing "~~/src/HOL/Library/Code_Char" for strings: an import at the beginning of my theory that sets up efficient code generation for lists.
A better implementation for distinct would be to sort the list in O(n log n) and iterate over the list once. But I guess one can do better?
Anyway, is there a faster implementation for distinct and maybe other functions from Main available?
I do not know of any faster implementation in Isabelle2013's library, but you can easily do it yourself as follows:
Implement a function distinct_sorted that determines distinctness on sorted lists.
Prove that distinct_sorted indeed implements distinct on sorted lists
Prove a lemma that implements distinct via distinct_sorted and sorting, and declare it as the new code equation for distinct.
In summary, this looks as follows:
context linorder begin
fun distinct_sorted :: "'a list => bool" where
"distinct_sorted [] = True"
| "distinct_sorted [x] = True"
| "distinct_sorted (x#y#xs) = (x ~= y & distinct_sorted (y#xs))"
lemma distinct_sorted: "sorted xs ==> distinct_sorted xs = distinct xs"
by(induct xs rule: distinct_sorted.induct)(auto simp add: sorted_Cons)
end
lemma distinct_sort [code]: "distinct xs = distinct_sorted (sort xs)"
by(simp add: distinct_sorted)
Next, you need an efficient sorting algorithm. By default, sort uses insertion sort. If you import Multiset from HOL/Library, sort will be implemented by quicksort. If you import Efficient Mergesort from the Archive of Formal Proofs, you get merge sort.
While this can improve efficiency, there's also a snag: After the above declarations, you can execute distinct only on lists whose elements are instances of the type class linorder. As this refinement happens only inside the code generator, your definitions and theorems in Isabelle are not affected.
For example, to apply distinct to a list of lists in any code equation, you first have to define a linear order on lists: List_lexord in HOL/Library does so by picking the lexicographic order, but this requires a linear order on the elements. If you want to use string, which abbreviates char list, Char_ord defines the usual order on char. If you map characters to the character type of the target language with Code_Char, you also need the adaptation theory Code_Char_ord for the combination with Char_ord.
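Purely as an illustration of the sort-then-scan idea behind distinct_sorted, and outside of Isabelle (so nothing here is verified or produced by the code generator), the same algorithm can be sketched in OCaml. The polymorphic compare stands in for the linorder instance; that substitution is an assumption of this sketch.

```ocaml
(* Adjacent-duplicates check; only meaningful on a sorted list. *)
let rec distinct_sorted = function
  | [] | [_] -> true
  | x :: (y :: _ as rest) -> x <> y && distinct_sorted rest

(* O(n log n): sort first, then make one linear pass. *)
let distinct xs = distinct_sorted (List.sort compare xs)

let () =
  assert (distinct [3; 1; 2]);
  assert (not (distinct [1; 2; 1]))
```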

Decoupling the data to be manipulated from the proofs that the manipulations are justified

I have a type of lists whose heads and tails must be in a certain sense "compatible":
Inductive tag := A | B. (* Just an example *)
Inductive element : tag -> tag -> Set :=
| AA : element A A
| AB : element A B
| BB : element B B. (* Also just an example *)
Inductive estack : tag -> tag -> Set :=
| ENil : forall t, estack t t
| ECons : forall r s t, element r s -> estack s t -> estack r t.
However, I do not like this code very much, for the following reasons:
It is not modular: The ad-hoc list data constructors are intrinsically coupled with the proofs that the heads and tails are compatible - the tags.
It does not favor code reuse: I am forced to redefine the usual list functions (such as list concatenation) and re-prove the usual list theorems (such as the associativity of list concatenation).
I have a different approach in mind, which consists of three steps:
Defining a single type of tagged elements (as opposed to a family of tagged types of elements):
Inductive taggedElement := Tagged : forall t1 t2, element t1 t2 -> taggedElement.
Defining the type of arbitrary (that is, either valid or invalid) lists of tagged elements:
Definition taggedElementStack := list taggedElement.
Defining a valid list of tagged elements as a tuple whose elements are an arbitrary list of tagged elements and a proof that the elements are compatible with the adjacent ones.
(* I have no idea how to do this in Coq, hence the question!
*
* I am going to use pseudomathematical notation. I am not well versed in either
* mathematics or theoretical computer science, so please do not beat me with a
* stick if I say something that is completely bogus!
*
* I want to construct the type
*
* (tes : taggedElementStack, b : proof that P(tes) holds)
*
* where P(tes) is a predicate that is only true when, for every sublist of tes,
* including tes itself, the heads and tails are compatible.
*)
How would I perform the third step in Coq?
Look at your estack, what does it do? Generalize! element is just a relation (A -> A -> Set), tag is just a Set. What do you get?
Inductive RTList {I : Set} (X : Rel I) : Rel I :=
| RTNil : forall {i : I}, RTList X i i
| RTCons : forall {i j k : I}, X i j -> RTList X j k -> RTList X i k.
(Rel is just a Definition with Rel I = I -> I -> Set.)
Reflexive-transitive closure! That is common, reusable and modular. Or so you'd think.
The only implementation I found in Coq's libs is in Coq.Relations.Relation_Operators, named clos_refl_trans, differently structured and locked into Prop (all according to the docs, didn't try it).
You'll probably have to re-implement that or find a library somewhere. At least, you'll only have to do this once (or up to three times for Set, Prop and Type).
Your other idea will probably be harder to manage. Look at NoDup for something that's similar to your description; you might be able to reuse the pattern, if you really want that. NoDup uses In, which is a function that checks if an element is in a list. The last time I tried using it, I found it considerably harder to solve proofs involving In. You can't just destruct but have to apply helper lemmas, you have to carefully unfold exactly n levels, folding back is hard, etc. I'd suggest that unless it's truly necessary, you'd better stick with data types for Props.