Counting number of different elements in a list in Coq - coq

I'm trying to write a function that takes a list of natural numbers and returns as output the amount of different elements in it. For example, if I have the list [1,2,2,4,1], my function DifElem should output "3". I've tried many things, the closest I've gotten is this:
Fixpoint DifElem (l : list nat) : nat :=
match l with
| [] => 0
| m::tm =>
let n := listWidth tm in
if (~ In m tm) then S n else n
end.
My logic is this: if m is not in the tail of the list then add one to the counter. If it is, do not add to the counter, so I'll only be counting once: when it's the last time it appears. I get the error:
Error: The term "~ In m tm" has type "Prop"
which is not a (co-)inductive type.
In is part of Coq's list standard library Coq.Lists.List. It is defined there as:
Fixpoint In (a:A) (l:list A) : Prop :=
match l with
| [] => False
| b :: m => b = a \/ In a m
end.
I think I don't understand well enough how to use If then statements in definitions, Coq's documentation was not helpful enough.
I also tried this definition with nodup from the same library:
Definition Width (A : list nat ) := length (nodup ( A ) ).
In this case what I get as error is:
The term "A" has type "list nat" while it is expected to have
type "forall x y : ?A0, {x = y} + {x <> y}".
And I'm quiet confused as to what's going on here. I'd appreciate your help to solve this issue.

You seem to be confusing propositions (Prop) and booleans (bool). I'll try to explain in simple terms: a proposition is something you prove (according to Martin-Lof's interpretation it is a set of proofs), and a boolean is a datatype which can hold only 2 values (true / false). Booleans can be useful in computations, when there are only two possible outcomes and no addition information is not needed. You can find more on this topic in this answer by #Ptival or a thorough section on this in the Software Foundations book by B.C. Pierce et al. (see Propositions and Booleans section).
Actually, nodup is the way to go here, but Coq wants you to provide a way of deciding on equality of the elements of the input list. If you take a look at the definition of nodup:
Hypothesis decA: forall x y : A, {x = y} + {x <> y}.
Fixpoint nodup (l : list A) : list A :=
match l with
| [] => []
| x::xs => if in_dec decA x xs then nodup xs else x::(nodup xs)
end.
you'll notice a hypothesis decA, which becomes an additional argument to the nodup function, so you need to pass eq_nat_dec (decidable equality fot nats), for example, like this: nodup eq_nat_dec l.
So, here is a possible solution:
Require Import Coq.Arith.Arith.
Require Import Coq.Lists.List.
Import ListNotations.
Definition count_uniques (l : list nat) : nat :=
length (nodup eq_nat_dec l).
Eval compute in count_uniques [1; 2; 2; 4; 1].
(* = 3 : nat *)
Note: The nodup function works since Coq v8.5.

In addition to Anton's solution using the standard library I'd like to remark that mathcomp provides specially good support for this use case along with a quite complete theory on count and uniq. Your function becomes:
From mathcomp Require Import ssreflect ssrfun ssrbool eqtype ssrnat seq.
Definition count_uniques (T : eqType) (s : seq T) := size (undup s).
In fact, I think the count_uniques name is redundant, I'd prefer to directly use size (undup s) where needed.

Using sets:
Require Import MSets.
Require List. Import ListNotations.
Module NatSet := Make Nat_as_OT.
Definition no_dup l := List.fold_left (fun s x => NatSet.add x s) l NatSet.empty.
Definition count_uniques l := NatSet.cardinal (no_dup l).
Eval compute in count_uniques [1; 2; 2; 4; 1].

Related

How to write boolean comparsion function in Coq

I'm trying to remove all integers that are greater than 7 from a list as follows
filter (fun n => n > 7).
However I get the following error
The term "n > 7" has type "Prop" while it is expected to have type "bool".
I am new to Coq, how can I fix it?
The problem is that #List.filter nat expects a function of type nat -> bool but you supplied a function of type nat -> Prop. Here, #List.filter nat is List.filter applied to the type argument nat. One difference between bool and Prop is that bool is decidable while Prop is not: there are propositions P such that neither P nor ~P are known; one can always determine whether something is true or false.
In order to resolve this situation, you need to write a function of type nat -> bool that returns true when applied to an argument greater than 7 and false otherwise. You can take advantage of the fact that the standard library defines boolean comparison functions over the natural numbers. I would also suggest reading the first volume of Software Foundations to familiarize yourself with Coq. It is more accessible and easy-going than some other prominent introductions (it was used in a program verification course at my university that presupposed little functional programming experience).
Here is a minimal example using only the builtin list type and notations:
Require Import Coq.Lists.List.
Import ListNotations.
Fixpoint filterb {A : Type} (f : A -> bool) (xs : list A) : list A :=
match xs with
| [] => []
| x :: xs => if f x then x :: filterb f xs else filterb f xs
end.
Fixpoint ltb (n m : nat) : bool :=
match n, m with
| n , 0 => false
| 0 , S m => true
| S n, S m => ltb n m
end.
Eval compute in (filterb (fun n => ltb 7 n) [5;6;7;8;9]).
(* = [8;9] *)

Use condition in a ssreflect finset comprehension

I have a function f that takes a x of fintype A and a proof of P x to return an element of fintype B. I want to return the finset of f x for all x satisfying P, which could be written like this :
From mathcomp
Require Import ssreflect ssrbool fintype finset.
Variable A B : finType.
Variable P : pred A.
Variable f : forall x : A, P x -> B.
Definition myset := [set (#f x _) | x : A & P x].
However, this fails as Coq does not fill the placeholder with the information on the right, and I do not know how to provide it explicitly.
I did not find an indication on how to do that, both in the ssreflect code and book. I realize that I could probably do it by using a sigma-type {x : A ; P x} and modifying a bit f, but it feels more convoluted than it should. Is there a simple / readable way to do that ?
Actually the easiest way to make things work in the way you propose is to use a sigma type:
Definition myset := [set f (tagged x) | x : { x : A | P x }].
But indeed, f is a bit of a strange function, I guess we'll need to know more details about your use case to understand where are you going.

How can I match on a specific value in Coq?

I'm trying to implement a function that simply counts the number of occurrences of some nat in a bag (just a synonym for a list).
This is what I want to do, but it doesn't work:
Require Import Coq.Lists.List.
Import ListNotations.
Definition bag := list nat.
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => O
| v :: t => S (count v t)
| _ :: t => count v t
end.
Coq says that the final clause is redundant, i.e., it just treats v as a name for the head instead of the specific v that is passed to the call of count. Is there any way to pattern match on values passed as function arguments? If not, how should I instead write the function?
I got this to work:
Fixpoint count (v:nat) (s:bag) : nat :=
match s with
| nil => O
| h :: t => if (beq_nat v h) then S (count v t) else count v t
end.
But I don't like it. I'd rather pattern match if possible.
Pattern matching is a different construction from equality, meant to discriminate data encoded in form of "inductives", as standard in functional programming.
In particular, pattern matching falls short in many cases, such as when you need potentially infinite patterns.
That being said, a more sensible type for count is the one available in the math-comp library:
count : forall T : Type, pred T -> seq T -> nat
Fixpoint count s := if s is x :: s' then a x + count s' else 0.
You can then build your function as count (pred1 x) where pred1 : forall T : eqType, T -> pred T , that is to say, the unary equality predicate for a fixed element of a type with decidable (computable) equality; pred1 x y <-> x = y.
I found in another exercise that it's OK to open up a match clause on the output of a function. In that case, it was "evenb" from "Basics". In this case, try "eqb".
Well, as v doesn't work in the match, I thought that maybe I could ask whether the head of the list was equal to v. And yes, it worked. This is the code:
Fixpoint count (v : nat) (s : bag) : nat :=
match s with
| nil => 0
| x :: t =>
match x =? v with
| true => S ( count v t )
| false => count v t
end
end.

Defining a finite automata Coq

I am learning Coq and I'd like to use it to formalize Regular languages theory, specially finite automata. Let's say I have a structure for an automata as follows:
Record automata : Type := {
dfa_set_states : list state;
init_state : state;
end_state : state;
dfa_func: state -> terminal -> state;
}.
Where state is an inductive type as:
Inductive state:Type :=
S.
And the type terminal terminal is
Inductive terminal:Type :=
a | b.
I am trying to define it so later I'll be able to generalize the definition for any regular language. For now, I'd want to construct an automata which recognizes the language (a * b *), which is all words over the {a,b} alphabet. Does anyone have an idea on how to build some kind of fixpoint function that will run the word (which I see as a list of terminal) and tell me if that automata recgonizes that word or not? Any idea/help will be greatly apreciated.
Thanks in advance,
Erick.
Because you're restricting yourself to regular languages, this is quite simple: you just have to use a fold. Here is a sample:
Require Import Coq.Lists.List.
Import ListNotations.
Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.
Record dfa (S A : Type) := DFA {
initial_state : S;
is_final : S -> bool;
next : S -> A -> S
}.
Definition run_dfa S A (m : dfa S A) (l : list A) : bool :=
is_final m (fold_left (next m) l (initial_state m)).
This snippet is a little bit different from your original definition in that the state and alphabet components are now type parameters of the DFA, and in that I have replaced the end state with a predicate that answers whether we are in an accepting state or not. The run_dfa function simply iterates the transition function of the DFA starting from the initial state, and then tests whether the last state is an accepting state.
You can use this infrastructure to describe pretty much any regular language. For instance, here is an automaton for recognizing a*b*:
Inductive ab := A | B.
Inductive ab_state : Type :=
ReadA | ReadB | Fail.
Definition ab_dfa : dfa ab_state ab := {|
initial_state := ReadA;
is_final s := match s with Fail => false | _ => true end;
next s x :=
match s, x with
| ReadB, A => Fail
| ReadA, B => ReadB
| _, _ => s
end
|}.
We can prove that this automaton does what we expect. Here is a theorem that says that it accepts strings of the sought language:
Lemma ab_dfa_complete n m : run_dfa ab_dfa (repeat A n ++ repeat B m) = true.
Proof.
unfold run_dfa. rewrite fold_left_app.
assert (fold_left (next ab_dfa) (repeat A n) (initial_state ab_dfa) = ReadA) as ->.
{ now simpl; induction n as [| n IH]; simpl; trivial. }
destruct m as [|m]; simpl; trivial.
induction m as [|m IH]; simpl; trivial.
Qed.
We can also state a converse, that says that it accepts only strings of that language, and nothing else. I have left the proof out; it shouldn't be hard to figure it out.
Lemma ab_dfa_sound l :
run_dfa ab_dfa l = true ->
exists n m, l = repeat A n ++ repeat B m.
Unfortunately, there is not much we can do with this representation besides running the automaton. In particular, we cannot minimize an automaton, test whether two automata are equivalent, etc. These functions also need to take as arguments lists that enumerate all elements of the state and alphabet types, S and A.

Inductive definition for family of types

I have been struggling on this for a while now. I have an inductive type:
Definition char := nat.
Definition string := list char.
Inductive Exp : Set :=
| Lit : char -> Exp
| And : Exp -> Exp -> Exp
| Or : Exp -> Exp -> Exp
| Many: Exp -> Exp
from which I define a family of types inductively:
Inductive Language : Exp -> Set :=
| LangLit : forall c:char, Language (Lit c)
| LangAnd : forall r1 r2: Exp, Language(r1) -> Language(r2) -> Language(And r1 r2)
| LangOrLeft : forall r1 r2: Exp, Language(r1) -> Language(Or r1 r2)
| LangOrRight : forall r1 r2: Exp, Language(r2) -> Language(Or r1 r2)
| LangEmpty : forall r: Exp, Language (Many r)
| LangMany : forall r: Exp, Language (Many r) -> Language r -> Language (Many r).
The rational here is that given a regular expression r:Exp I am attempting to represent the language associated with r as a type Language r, and I am doing so with a single inductive definition.
I would like to prove:
Lemma L1 : forall (c:char)(x:Language (Lit c)),
x = LangLit c.
(In other words, the type Language (Lit c) has only one element, i.e. the language of the regular expression 'c' is made of the single string "c". Of course I need to define some semantics converting elements of Language r to string)
Now the specifics of this problem are not important and simply serve to motivate my question: let us use nat instead of Exp and let us define a type List n which represents the lists of length n:
Parameter A:Set.
Inductive List : nat -> Set :=
| ListNil : List 0
| ListCons : forall (n:nat), A -> List n -> List (S n).
Here again I am using a single inductive definition to define a family of types List n.
I would like to prove:
Lemma L2: forall (x: List 0),
x = ListNil.
(in other words, the type List 0 has only one element).
I have run out of ideas on this one.
Normally when attempting to prove (negative) results with inductive types (or predicates), I would use the elim tactic (having made sure all the relevant hypothesis are inside my goal (generalize) and only variables occur in the type constructors). But elim is no good in this case.
If you are willing to accept more than just the basic logic of Coq, you can just use the dependent destruction tactic, available in the Program library (I've taken the liberty of rephrasing your last example in terms of standard-library vectors):
Require Coq.Vectors.Vector.
Require Import Program.
Lemma l0 A (v : Vector.t A 0) : v = #Vector.nil A.
Proof.
now dependent destruction v.
Qed.
If you inspect the term, you'll see that this tactic relied on the JMeq_eq axiom to get the proof to go through:
Print Assumptions l0.
Axioms:
JMeq_eq : forall (A : Type) (x y : A), x ~= y -> x = y
Fortunately, it is possible to prove l0 without having to resort to features outside of Coq's basic logic, by making a small change to the statement of the previous lemma.
Lemma l0_gen A n (v : Vector.t A n) :
match n return Vector.t A n -> Prop with
| 0 => fun v => v = #Vector.nil A
| _ => fun _ => True
end v.
Proof.
now destruct v.
Qed.
Lemma l0' A (v : Vector.t A 0) : v = #Vector.nil A.
Proof.
exact (l0_gen A 0 v).
Qed.
We can see that this new proof does not require any additional axioms:
Print Assumptions l0'.
Closed under the global context
What happened here? The problem, roughly speaking, is that in Coq we cannot perform case analysis on terms of dependent types whose indices have a specific shape (such as 0, in your case) directly. Instead, we must prove a more general statement where the problematic indices are replaced by variables. This is exactly what the l0_gen lemma is doing. Notice how we had to make the match on n return a function that abstracts on v. This is another instance of what is known as "convoy pattern". Had we written
match n with
| 0 => v = #Vector.nil A
| _ => True
end.
Coq would see the v in the 0 branch as having type Vector.t A n, making that branch ill-typed.
Coming up with such generalizations is one of the big pains of doing dependently typed programming in Coq. Other systems, such as Agda, make it possible to write this kind of code with much less effort, but it was only recently shown that this can be done without relying on the extra axioms that Coq wanted to avoid including in its basic theory. We can only hope that this will be simplified in future versions.