Coq: Implementation of splitstring and proof that nothing gets deleted - coq

after working for a whole day on this with no success, I might get some help here.
I implemented a splitString function in Coq: It takes a String (In my case a list ascii) and a function f: ascii->bool. I want to return a list of strings (In my case a list (list ascii)) containing all the substrings. This means that the input string has been split at all asciis where f is true. Note that my output also includes the delimiter as a string (list ascii) of length 1.
My first question: exists this function in a library somewhere? Many other, non-functional, programming languages I know includes this function in the default library.
I didn't found something, so I implemented it by myself:
Fixpoint split_string (f: ascii->bool) (z s: list ascii): list (list ascii) :=
match s with
| [] => [rev z]
| h::t => match f h with
| true => ([rev z]++[[h]])++(split_string f [] t)
| false => (split_string f (h::z) t)
end
end.
The function needs to be called with an empty z, like Compute split_string isWhite [] some_string.
The clue of that List z is that the current string gets saved in it until a delimiter is found, then the whole string z gets returned. I don't see another way of solving this.
The problem with the List z is that, when it comes to proofing, it makes trouble.
I want to proove that, when the output of the splitString function gets flattened (in coq with concat) it is equal to the input, because the splitString method does not remove information. I formulated a theorem:
Theorem not_more_not_less_splitWhite: forall (s: list ascii),
s = concat (split_string isWhite [] s).
But every time when I try to solve this with induction, I get stuck because the List z is not empty anymore (since one char which is not white has been processed). Then I can never apply the induction hypothesis. This is how far I've made it:
Proof. intros s. induction s.
- simpl. reflexivity.
- simpl. destruct isWhite eqn:W.
* simpl. rewrite <- IHs. reflexivity.
*
I found myself in willing to induce in s a second time, but I think this is bullshit and does not use the power of induction. So, if the answer to my first question is no, my second question is how do I solve this, or is there a better implementation for splitString.
Thank You!

This problem is common whenever proving things by induction about a fixpoint with an accumulator, such as yours. The standard advice is to find a stronger statement that has your desired result as a corollary. This stronger statement should be about all lists, not only the empty list. The latter should hopefully be easier to prove with induction, since the stronger statement leads to a stronger induction hypothesis.
In your case, I guess (but haven't checked) the stronger statement could be something like:
Theorem not_more_not_less_splitWhite_stronger: forall (z s: list ascii),
rev z ++ s = concat (split_string isWhite z s).

Here is a proof of your theorem:
Goal forall p l buf, rev buf ++ l = concat (split_string p buf l).
Proof.
induction l as [ | a l IHl].
- intro; now cbn.
- cbn; case_eq (p a); intro Ha.
+ intro buf; repeat rewrite concat_cons; now rewrite <- IHl.
+ intro buf; rewrite <- IHl; cbn; now rewrite <- app_assoc.
Qed.
Please note that it worked thanks to the universal quantification on buf in the induction hypothesis. It was made easier thanks to the order of
quantifications in the goal statement.
Ana's statement can be proved the same way, with a small bookkeeping sequence before the induction:
Goal forall p buf l, rev buf ++ l = concat (split_string p buf l).
Proof.
intros p buf l; revert buf.
induction l as [ | a l IHl].
(* ... *)

Related

Can any one help me how to prove this therom in coq

~ (exists x:D, ~ R x)->(forall y:D, R y)
I have worked on it for quite a long time, but it seems that I cannot use the left part of the implication well.
This is the first part of my code:
Parameter D: Set.
Parameter x: D.
Parameter y: D.
Parameter R: D->Prop.
Lemma b: ~(exists x:D, ~ (R x))->(forall y:D, (R y)).
Can anyone help me figure out how to write the rest of of the code?
Your question is a bit vague, as you don't specify what D and R are, and where you are stuck in your proof. Try providing a minimal working example, with an explicit fail tactic for where you're stuck in the proof.
In classical logic (the one you're use to in maths), as you have the excluded middle rule, you can always do a case analysis on whether something is true or false. In vanilla Coq, built for intuitionistic logic, it's not the case. Your result is actually not provable if the predicate R is not decidable (if it's not either true or false on every input : forall (x:D), R x \/ ~R x), if the type D is not empty.
Try adding the decidability of R as an hypothesis and reprove it. It should follow more or less this structure (the key being the case analysis on whether (R y) is true or false) :
Parameter D: Set.
Parameter R: D -> Prop.
Lemma yourGoal :
(forall x, R x \/ ~ R x) -> (* Decidability of R *)
~ ( exists x, ~ (R x) )->
forall y, (R y).
Proof.
intros Hdec Hex y. (* naming the hypothesis for convenience *)
specialize (Hdec y).
destruct Hdec as [H_Ry_is_true | H_Ry_is_false]. (* case analysis, creates two goals *)
+ (* (R y) is true, which is our goal. *)
assumption.
+ (* (R y) is false, which contradicts Hex *)
exfalso. (* transform your goal into False *)
apply Hex.
(* should be easy from here, using the [exists] tactic *)
Qed.
Ps: this exact result (and its link with excluded middle) is mentioned in Software foundations, which is a great resource to learn Coq and logic : https://softwarefoundations.cis.upenn.edu/lf-current/Logic.html#not_exists_dist

question about intros [=] and intros [= <- H]

I am a beginner at coq.
I do not know the meaning of intros [=] and intros [= <- H] . and I could not find an easy explanation. Would someone explain these two to me please?
Regards
The documentation for this is here. I will add a little explanation note.
The first historical use of intro patterns is to decompose data that is packed in inductive objects on the fly. Here is a first easy example (tested with coq 8.13.2).
Lemma forall A B, A /\ B -> B /\ A.
Proof.
If you run the tactic intros A B H then the hypothesis H will be a proof of A /\ B. Morally, this contains knowledge that A holds, but it cannot be used as such, because it is a proof of a stronger fact. It is often the case that users want directly to decompose this hypothesis, this would normally be done by typing destruct H as [Ha Hb]. But if you know right away that you will not keep hypothesis H, why not find a shorter expression. This is what the intro pattern is used for.
So you type the following command and have the resulting goal:
Intros A B [Ha Hb].
(* resulting goal
A, B : Prop
Ha : A
Hb : B
============================
B /\ A
*)
Abort.
I will not finish this proof. But you get the idea of what intro patterns are for: decompose information on the fly when inductive types (like conjunction here) pack several pieces of information together.
Now, equality information also can pack several pieces of information together. Assume now that we are working with lists of natural numbers and we have the following equality.
Require Import List.
Lemma intro_pattern_example2 n m p q l1 l2 :
(n :: S m :: l1) = (p :: S q :: l2) -> q :: p :: l2 = m :: n :: l1.
The equality in the left-hand side of the implication is an equality between two lists, but it actually packs several more elementary pieces of information: n = p, m = q, and l1 = l2. If you just type intros H, you obtain the equality between two lists of length 3, but if you type intros [=], you ask the proof system to explore the structure of each equality member and check when constructors appear so that the smaller pieces of information can be placed in separate hypothesis instead of the big one. This is a short hand for the use of the injection tactic. Here is the example.
intros [= Hn Hm Hl1].
(*resulting goal:
n, m, p, q : nat
l1, l2 : list nat
Hn : n = p
Hm : m = q
Hl1 : l1 = l2
============================
q :: p :: l2 = m :: n :: l1
*)
So you see, this intro pattern unpacks information that would otherwise be stuck in a more complex hypothesis.
Now, when an hypothesis is an equality, there is another action you might want to perform right away. You might want to rewrite with it. In intro patterns, this is done by replacing the name you would give to that equality with an arrow. Let's test this on the previous goal.
Undo.
intros [= -> -> ->].
(* resulting goal
p, q : nat
l2 : list nat
============================
q :: p :: l2 = q :: p :: l2
*)
Now this goal can be solved quickly with reflexivity, trivial, or auto. Please note that the hypotheses were used to rewrite, but they were not kept in the goal context, so this possibility to rewrite directly from the intro pattern has to be used with caution, because you are actually losing some information.
The [= ] intro pattern is used especially for equalities and when both members are datatype constructors. It exploits the natural injectivity property of these constructors. there is another property that is respected by datatype constructors. It is the fact that two pieces of data with different head constructors can never be equal. This is exploited in Coq by the discriminate tactic. The [=] intro pattern is shorthand for both the injection and discriminate tactics.

What should be done when simpl does not reduce all the necessary steps?

The following example is from chapter Poly of the Software Foundations book.
Definition fold_length {X : Type} (l : list X) : nat :=
fold (fun _ n => S n) l 0.
Theorem fold_length_correct : forall X (l : list X),
fold_length l = length l.
Proof.
intros.
induction l.
- simpl. reflexivity.
- simpl.
1 subgoal
X : Type
x : X
l : list X
IHl : fold_length l = length l
______________________________________(1/1)
fold_length (x :: l) = S (length l)
I expected it to simplify a step here on the left side. It certainly should be able to.
Theorem fold_length_correct : forall X (l : list X),
fold_length l = length l.
Proof.
intros.
induction l.
- simpl. reflexivity.
- simpl. rewrite <- IHl. simpl.
1 subgoal
X : Type
x : X
l : list X
IHl : fold_length l = length l
______________________________________(1/1)
fold_length (x :: l) = S (fold_length l)
During the running of the tests I had an issue where simpl would refuse to dive in, but reflexivity did the trick, so I tried the same thing here and the proof succeeded.
Note that one would not expect reflexivity to pass given the state of the goal, but it does. In this example it worked, but it did force me to do the rewrite in the opposite direction of what I intended originally.
Is it possible to have more control over simpl so that it does the desired reductions?
For the purposes of this answer, I'll assume the definition of fold is something along the lines of
Fixpoint fold {A B: Type} (f: A -> B -> B) (u: list A) (b: B): B :=
match u with
| [] => b
| x :: v => f x (fold f v b)
end.
(basically fold_right from the standard library). If your definition is substantially different, the tactics I recommend might not work.
The issue here is the behavior of simpl with constants that have to be unfolded before they can be simplified. From the documentation:
Notice that only transparent constants whose name can be reused in the recursive calls are possibly unfolded by simpl. For instance a constant defined by plus' := plus is possibly unfolded and reused in the recursive calls, but a constant such as succ := plus (S O) is never unfolded.
This is a bit hard to understand, so let's use an example.
Definition add_5 (n: nat) := n + 5.
Goal forall n: nat, add_5 (S n) = S (add_5 n).
Proof.
intro n.
simpl.
unfold add_5; simpl.
exact eq_refl.
Qed.
You'll see that the first call to simpl didn't do anything, even though add_5 (S n) could be simplified to S (n + 5). However, if I unfold add_5 first, it works perfectly. I think the issue is that plus_5 is not directly a Fixpoint. While plus_5 (S n) is equivalent to S (plus_5 n), that isn't actually the definition of it. So Coq doesn't recognize that its "name can be reused in the recursive calls". Nat.add (that is, "+") is defined directly as a recursive Fixpoint, so simpl does simplify it.
The behavior of simpl can be changed a little bit (see the documentation again). As Anton mentions in the comments, you can use the Arguments vernacular command to change when simpl tries to simplify. Arguments fold_length _ _ /. tells Coq that fold_length should be unfolded if at least two arguments are provided (the slash separates between the required arguments on the left and the unnecessary arguments on the right).[sup]1[\sup]
A simpler tactic to use if you don't want to deal with that is cbn which works here by default and works better in general. Quoting from the documentation:
The cbn tactic is claimed to be a more principled, faster and more predictable replacement for simpl.
Neither simpl with Arguments and a slash nor cbn reduce the goal to quite what you want in your case, since it'll unfold fold_length but not refold it. You could recognize that the call to fold is just fold_length l and refold it with fold (fold_length l).
Another possibility in your case is to use the change tactic. It seemed like you knew already that fold_length (a :: l) was supposed to simplify to S (fold_length l). If that's the case, you could use change (fold_length (a :: l)) with (S (fold_length l)). and Coq will try to convert one into the other (using only the basic conversion rules, not equalities like rewrite does).
After you've gotten the goal to S (fold_length l) = S (length l) using either of the above tactics, you can use rewrite -> IHl. like you wanted to.
I thought the slashes only made simpl unfold things less, which is why I didn't mention it before. I'm not sure what the default actually is, since putting the slash anywhere seems to make simpl unfold fold_length.

Induction proofs on MSets

I'm trying to use MSet library in a Coq development and I need a map function, which is absent from the library, but can be implemented using fold, as usual.
In the following gist, I've put a simplification of what I'm working on, full of axioms, just to get straight to the point.
My problem is to prove a property of the following map function:
Definition map (f : Exp -> Exp) s
:= MSet.fold (fun a ac => MSet.add (f a) ac) MSet.empty s.
Which uses fold from Coq MSet library. The property that I want to show is:
Lemma map_lemma : forall s f e, In e (map f s) -> exists e', In e' s /\ e = f e'.
Proof.
induction s using set_induction ; intros ; try fsetdec.
Which is intended to show that if an element e in the set map f s, then exists another element e' in s, s.t. e = f e'. My difficulty is to prove the inductive case, since the induction hypothesis produced by set_induction does not seems useful at all.
Could someone provide me any clues on how should I proceed?
First, I think there is a problem in your definition of smap. You must swap MSet.empty and s, otherwise you can prove:
Lemma snap_trivial : forall f s, smap f s= s.
Proof.
intros. reflexivity.
Qed.
With the right definition, you can use the fold_rec lemma that is adapted to this kind of goal.

How to apply a function once during simplification in Coq?

From what I understand, function calls in Coq are opaque.
Sometimes, I need to use unfold to apply it and then fold to turn the function definition/body back to its name. This is often tedious. My question is, is there an easier way to let apply a specific instance of a function call?
As a minimal example, for a list l, to prove right-appending [] does not change l:
Theorem nil_right_app: forall {Y} (l: list Y), l ++ [] = l.
Proof.
induction l.
reflexivity.
This leaves:
1 subgoals
Y : Type
x : Y
l : list Y
IHl : l ++ [] = l
______________________________________(1/1)
(x :: l) ++ [] = x :: l
Now, I need to apply the definition of ++ (i.e. app) once (pretending there are other ++ in the goal which I don't want to apply/expand). Currently, the only way I know to implement this one time application is to first unfold ++ and then fold it:
unfold app at 1. fold (app l []).
giving:
______________________________________(1/1)
x :: l ++ [] = x :: l
But this is inconvenient as I have to figure out the form of the term to use in fold. I did the computation, not Coq. My question boils down to:
Is there a simpler way to implement this one-time function application to the same effect?
You can use simpl, compute or vm_compute if you want to ask Coq to perform some computation for you. If the definition of the function is Opaque, the above solution will fail, but you could first prove a rewriting lemma such as:
forall (A:Type) (a:A) (l1 l2: list A), (a :: l1) ++ l2 = a :: (l1 ++ l2).
using your technique, and then rewrite with it when necessary.
Here is an example using simpl:
Theorem nil_right_app: forall {Y} (l: list Y), l ++ nil = l.
Proof.
(* solve the first case directly *)
intros Y; induction l as [ | hd tl hi]; [reflexivity | ].
simpl app. (* or simply "simpl." *)
rewrite hi.
reflexivity.
Qed.
To answer your comment, I don't know how to tell cbv or compute to only compute a certain symbol. Note that in your case, they seem to compute too eagerly and simpl works better.