Basic Pumping Lemma proof doesn't make sense

Proving that a^n b^n, n >= 0, is non-regular.
Using the string a^p b^p.
Every example I've seen claims that y can either contain a's, b's, or both. But I don't see how y can contain anything other than a's, because if y contains any b's, then the length of xy must be greater than p, which makes it invalid.
On the other hand, for examples such as:
L = { www | w in {a, b}* }, the string used is a^p b a^p b a^p b. In the proofs I've seen, it is claimed that y cannot contain anything other than a's, for the reason I stated above. Why is this different?
Also throwing in another question:
Describe the error in the following "proof" that 0* 1* is not a regular language. (An
error must exist because 0* 1* is regular.) The proof is by contradiction. Assume
that 0* 1* is regular. Let p be the pumping length for 0* 1* given by the pumping
lemma. Choose s to be the string 0^p 1^p. You know that s is a member of 0* 1*, but
0^p 1^p cannot be pumped. Thus you have a contradiction. So 0* 1* is not regular.
I can't find any problem with this proof. I only know that 0*1* is a regular language because I can construct a DFA.

The pumping lemma states that for a regular language L:
for every string s in L of length at least p there exists a subdivision s = xyz such that:
for all i >= 0, xy^i z is in L;
|y| > 0; and
|xy| <= p.
Now the claim that y can contain only a's or only b's originates from the first condition: if y contained both a's and b's, then pumping with i = 2 would produce a string of the form aa...abb...baa...abb...b, which is not in L. That's what the statement wants to say.
The third condition indeed makes it obvious that y can contain only a's. In other words, what the textbooks say is a conclusion drawn from the first condition; the third simply narrows it down further.
Finally, if you combine 1., 2. and 3., you reach a contradiction: by 2. we know y must contain at least one character, and by 3. it can contain only a's. Say y contains k a's (k >= 1). If we "pump" this with i = 2, we generate the string:
s' = xy^2 z = a^(p+k) b^p
However, s' is not in L (it has more a's than b's), even though by 1. it should be; so we reach an inconsistency.
You can thus only make the proof work by combining the three conditions. It's not enough to know that y consists only of a's: that alone doesn't produce the contradiction. The contradiction is that no subdivision exists that satisfies all three constraints simultaneously.
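To make the combined argument concrete, here is a minimal Python sketch (my own illustration, with a small stand-in value for p): it brute-forces every subdivision of a^p b^p that satisfies conditions 2 and 3 and checks that pumping with i = 2 always falls out of the language.

def in_L(s):
    # Membership test for L = { a^n b^n | n >= 0 }.
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

p = 5                               # stands in for the unknown pumping length
s = "a" * p + "b" * p

for xy_end in range(1, p + 1):      # condition 3: |xy| <= p
    for y_start in range(xy_end):   # condition 2: |y| > 0
        x, y, z = s[:y_start], s[y_start:xy_end], s[xy_end:]
        assert not in_L(x + 2 * y + z)   # condition 1 fails for i = 2

print("No subdivision of a^p b^p satisfies all three conditions (p =", p, ")")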
About your second question (0*1*): in that case L looks different. You can't reuse the proof for a^n b^n, because 0*1* is perfectly happy if the string contains more 0s. In other words, you can't find a contradiction: the last step of that "proof" fails, since 0^p 1^p can in fact be pumped. As long as y contains only one type of character, regardless of its length, the subdivision satisfies all three constraints.
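To see concretely why the "proof" about 0*1* breaks at its last step, here is a minimal Python sketch (state names are my own) of the DFA the asker mentions, together with one legal subdivision of 0^p 1^p that pumps just fine:

# A 3-state DFA for 0*1*: Q0 = "only 0s so far", Q1 = "some 1s seen",
# DEAD = "a 0 appeared after a 1". Q0 and Q1 accept.
DELTA = {
    ("Q0", "0"): "Q0",   ("Q0", "1"): "Q1",
    ("Q1", "0"): "DEAD", ("Q1", "1"): "Q1",
    ("DEAD", "0"): "DEAD", ("DEAD", "1"): "DEAD",
}

def accepts(s):
    state = "Q0"
    for ch in s:
        state = DELTA[(state, ch)]
    return state in {"Q0", "Q1"}

p = 4
x, y, z = "", "0", "0" * (p - 1) + "1" * p   # |xy| <= p and |y| > 0
for i in range(5):
    assert accepts(x + y * i + z)            # every pumped string stays in 0*1*
print("0^p 1^p can be pumped, so no contradiction arises.")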

Definition of 3NF

I'm rather confused about the definition of 3NF.
Let R be a relation with attribute set X.
Suppose Y -> A is a functional dependency where A is a non-prime attribute and Y is a subset of X.
If Y is a proper subset of any candidate key for R, then the relation is not in 3NF (and not even in 2NF) because this is a partial dependency, which is not permitted in 2NF (and by extension 3NF).
If Y is a non-prime attribute, the relation is not in 3NF because this is a transitive dependency of the non-prime attribute A on any candidate key through the non-prime attribute Y.
But what if Y is a set containing both prime and non-prime attributes? What if A is a subset of Y? What if Y contains only prime attributes, but those prime attributes come from different keys of R so that Y is not a proper subset of any particular key of R? What if Y contains only, but multiple non-prime attributes? Which of these cases violates the requirements of 3NF and why?
TL;DR Get definitions straight.
To know whether a case violates 3NF you have to look at the criteria used in some definition.
Your question is rather like asking, I know an even number is one that is divisible by 2 or one whose decimal representation ends in 0, 2, 4, 6 or 8, but what if it's three times a square? Well, you have to use the definition--show that the given conditions imply that it's divisible by two or that its decimal representation ends in one of those digits. Why do you even care about other properties than the ones in the definition?
When some FDs (functional dependencies) hold, others must also hold; we say the latter are implied by the former. So when the given FDs hold, usually tons of other FDs also hold, and one or more arbitrary FDs holding doesn't necessarily tell you anything about which normal forms hold. E.g. when U is a superset of V, U → V must hold; such FDs are called trivial because they are implied by any collection of FDs. E.g. when U → V holds, every superset of U determines every subset of V. Armstrong's axioms are rules that can be applied mechanically to find all the FDs that hold. There are algorithms to find a canonical/minimal/irreducible cover of a given set of FDs: an equivalent set with no redundant FDs or attributes. There are also algorithms to determine whether a relation satisfies certain NFs (normal forms), and to decompose it into components in higher NFs when it doesn't.
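As a small illustration of the closure idea behind those facts, here is a Python sketch (the function names are my own, not a standard library) that computes which attributes a set Y determines under a given set of FDs, and hence whether a particular FD is implied:

def closure(attrs, fds):
    # fds: list of (lhs, rhs) pairs of sets of attribute names.
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    # Do the given FDs imply lhs -> rhs?
    return set(rhs) <= closure(lhs, fds)

# Example: R(A, B, C) with A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(implies(fds, {"A"}, {"C"}))        # True: A -> C is implied (transitivity)
print(implies(fds, {"A", "C"}, {"B"}))   # True: a superset of A still determines B
print(implies(fds, {"B"}, {"A"}))        # False: B -> A does not follow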
Sometimes we think there is a case that the definition doesn't handle but really we have got the definition wrong.
The definition you are trying to refer to for a relation being in 3NF actually requires that there be no transitive functional dependence of a non-prime attribute on a candidate key.
In your non-3NF example you should say there is a transitive FD, not "this is a transitive FD", because the violating FD is of the form CK → A, not Y → A. Also, U → V is transitive when there is a W such that U → W AND W → V AND NOT W → U. It doesn't matter whether the attributes of W are prime.
PS It's not very helpful to ask "why" something is or isn't so in mathematics. We describe a situation in terms of some givens, and a bunch of things follow. We can say that if certain of the givens weren't so then that thing wouldn't be so. But if certain other givens weren't so then it might also not be so. We can give a proof that something is or isn't so as "why" but it's not the only proof.

Unable to formulate a prover9 axiom

I'm trying to teach basic set theory to Prover9. The following definition of membership seems to work very well (the second axiom is just to make lists unordered):
member(x,[x:y]).
[x,y]=[y,x].
With this, I can have Prover9 prove 'complicated' things like member([A,B],[C,[A,B]]) and others.
However, I must be doing something wrong when I use it to define subsets:
subset(x,y) <-> (member(z,x) -> member(z,y)).
Prover9 clausifies this as subset(x,y) | -member(z,y) and uses it to prove false clauses, like subset([A],[B,C]).
What am I missing?
Your "second axiom ... just to make lists unordered" looks suspicious.
Note [x,y] is a two-element list. So your axiom is saying nothing about lists in general. Your 'complicated' examples are still 2-element lists, so not very complicated. I think you'll be unable to prove member(A, [C, B, A]).
Contrast that with [x:y] in your first axiom, which is a list of one or more elements: y might be nil, or might hold any number of elements. IOW in [x:y], y is a list, whereas in [x,y], y is an element of the list.
See http://www.cs.unm.edu/~mccune/prover9/manual/2009-11A/, 'Clauses & Formulas' at 'list notation'.
I'd go:
member(x, [x:y]).
member(x, z) -> member(x, [y:z]).
(But that defines a bag, not a set.)
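Just to show what those two clauses compute, here is my own Python mirror of them (not Prover9), modelling [x:y] as a (head, tail) pair with None for the empty list:

def member(x, lst):
    if lst is None:
        return False
    head, tail = lst
    return x == head or member(x, tail)   # clause 1: the head; clause 2: recurse on the tail

lst = ("C", ("B", ("A", None)))            # the list [C, B, A]
print(member("A", lst))                    # True -- exactly the case the 2-element axiom misses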
I think the quantification over variables is a red herring.
Edit: Errk. I'm wrong. So I don't know why I got this result:
The example that @Doug points to doesn't need to quantify z 'inside'
the rhs of the equivalence. You can just remove all the explicit
quantification, and the proof still works.
OK. Let's apply the rewrite rules per the Manual 'Clauses & Formulas' at "If non-clausal formulas are entered ...".
That definition of subset is an equivalence (aka bi-implication); rewrite it as two separate axioms, the 'forwards' and 'backwards' implications; then rewrite each of those using p -> q ==> -p | q.
In the 'forwards' direction we get:
-subset(x, y) | (member(z, x) -> member(z, y)).
It doesn't matter whether the z is quantified narrowly or widely.
In the 'backwards' direction:
-(member(z, x) -> member(z, y)) | subset(x, y).
Here, if we quantify z narrowly around the implication, the quantifier sits inside the scope of the negation, so the result is different from quantifying z across the whole formula: under the narrow reading subset(x, y) is only forced when every z in x is also in y, whereas under the wide reading it is forced as soon as a single z fails to be in x-but-not-in-y.
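Here is my own little Python check (not Prover9) that evaluates both readings of that 'backwards' clause over a tiny finite universe; it shows why the wide-scope z forces nonsense like subset([A],[B,C]):

universe = {"A", "B", "C"}

def member(z, s):
    return z in s                          # 'lists' modelled as Python sets

def forced_wide(x, y):
    # Wide scope: for all z, (member(z,x) & -member(z,y)) | subset(x,y).
    # subset(x,y) is forced as soon as ONE z makes the first disjunct false,
    # i.e. as soon as some z is not in x or is in y.
    return any(not (member(z, x) and not member(z, y)) for z in universe)

def forced_narrow(x, y):
    # Narrow scope: (all z. member(z,x) -> member(z,y)) -> subset(x,y).
    # subset(x,y) is forced only when x really is a subset of y.
    return all((not member(z, x)) or member(z, y) for z in universe)

x, y = {"A"}, {"B", "C"}
print(forced_wide(x, y))    # True  -- the bogus subset([A],[B,C]) is forced
print(forced_narrow(x, y))  # False -- the intended reading forces nothing here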
Conclusion: both your definition of member( ) and of subset( ) are wrong.
BTW Are you sure you need to define member? The proof @Doug points to just takes member(x,y) as primitive.

Why should the length of the string be greater than or equal to the number of states in the pumping lemma?

If L is a regular language, then there exists a constant n (which depends on L) such that every string w in the language L whose length is greater than or equal to n can be divided into three strings, w = xyz.
Here |w| is the length of the string and n is the number of states.
Why should we pick |w| greater than or equal to n?
And what is the pumping length?
If you look at the complete statement of the lemma (http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages), you can see that it is actually stating that every sufficiently long string in the language is formed by a prefix x, a part y that can be repeated any number of times, and a suffix z. The lemma applies to strings of length at least n because the accepting path of such a string visits more states than the automaton has, so some state must be visited twice, and the symbols read between the two visits form the repeatable part y. This Wikipedia image is very useful:
http://en.wikipedia.org/wiki/File:Pumping-Lemma_xyz_svg.svg
You seem to be misunderstanding the lemma (which you also have not stated completely), and mixing aspects of a proof with what you did state. The lemma says that for every regular language L, there is a constant p such that every string of at least p symbols that belongs to L has a non-empty substring, lying within its first p symbols, that can be "pumped", always yielding another element of L. The constant p is the (a) "pumping length".
This can be proved by observing that if a language is regular then there is a finite state automaton that accepts it, and taking p to be the number of states in that automaton (details omitted).
That does not imply, however, that the number of states in the smallest FSA that recognizes a given regular language is the smallest possible pumping length for that language. For instance, consider the language consisting of the union of { a^n } and { b^n } for all n. You need a four-state FSA to recognize this language, but its minimum pumping length is 1.
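To illustrate both the proof idea and the example, here is a Python sketch (state names and encoding are mine): the four-state FSA for that union language, plus the state-repetition argument that extracts a pumpable piece y from any accepted string at least as long as the number of states.

# Four states: START, AS ("only a's so far"), BS ("only b's so far"), DEAD.
DELTA = {
    ("START", "a"): "AS", ("START", "b"): "BS",
    ("AS", "a"): "AS",    ("AS", "b"): "DEAD",
    ("BS", "a"): "DEAD",  ("BS", "b"): "BS",
    ("DEAD", "a"): "DEAD", ("DEAD", "b"): "DEAD",
}
ACCEPT = {"START", "AS", "BS"}

def split_by_repeated_state(w):
    # Run the DFA; as soon as a state repeats, x leads to it and y loops on it.
    state, seen = "START", {"START": 0}
    for i, ch in enumerate(w):
        state = DELTA[(state, ch)]
        if state in seen:
            j = seen[state]
            return w[:j], w[j:i + 1], w[i + 1:]
        seen[state] = i + 1
    return None        # only possible when |w| is shorter than the number of states

x, y, z = split_by_repeated_state("aaaa")    # a string in the language with |w| >= 4
for i in range(4):
    state = "START"
    for ch in x + y * i + z:
        state = DELTA[(state, ch)]
    assert state in ACCEPT                   # every pumped string is still accepted
print("split:", x, y, z)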

What forms of goal in Coq are considered to be "true"?

When I prove some theorem, my goal evolves as I apply more and more tactics. Generally speaking, the goal tends to split into subgoals, which are simpler. At some final point Coq decides that the goal is proven. What can this "proven" goal look like? These goals seem to be fine:
a = a. (* Any object is identical to itself (?) *)
myFunc x y = myFunc x y. (* Result of the same function with the same params
is always the same (?) *)
What else can appear here, or are these examples fundamentally wrong?
In other words, when I finally apply reflexivity, Coq just says ** Got it ** without any explanation. Is there any way to get more details on what it actually did or why it decided that the goal is proven?
You're actually facing a very general notion that seems not so general because Coq has some user-friendly facility for reasoning with equality in particular.
In general, Coq accepts a goal as solved as soon as it receives a term whose type is the type of the goal: it has been convinced the proposition is true because it has been convinced the type that this proposition describes is inhabited, and what convinced it is the actual witness you helped build along your proof.
For the particular case of inductive datatypes, the two ways you are going to be able to prove the proposition P a b c are:
by constructing a term of type P a b c, using the constructors of the inductive type P, and providing all the necessary arguments.
or by reusing an existing proof or an axiom in the environment whose type you can get to match P a b c.
For the even more particular case of equality proofs (equality is just an inductive datatype in Coq), the same two ways I list above degenerate to this:
the only constructor of equality is eq_refl, and to apply it you need to show that the two sides are judgementally equal. For most purposes, this corresponds to goals that look like T a b c = T a b c, but it is actually a slightly more broad notion of equality (see below). For these, all you have to do is apply the eq_refl constructor. In a nutshell, that is what reflexivity does!
the second case consists in proving that the equality holds because you have other equalities in your context, nothing special here.
Now one part of your question was: when does Coq accept that two sides of an equality are equal by reflexivity?
If I am not mistaken, the answer is when the two sides of the equality are αβδιζ-convertible.
What this grossly means is that there is a way to make them syntactically equal by repeated applications of:
α : sane renaming of bound variables
β : computing reducible expressions
δ : unfolding definitions
ι : simplifying matches
ζ : expanding let-bound expressions
[someone please correct me if more rules apply or if I got one wrong]
For instance some of the things that are not captured by these rules are:
equality of functions that do more or less the same thing in different ways:
(fun x => 0 + x) = (fun x => x + 0)
quicksort = mergesort
equality of terms that are stuck reducing but would be equal:
forall n, 0 + n = n + 0

Prove language irregular with pumping Lemma

I am trying to prove that the following language is not regular using the pumping lemma
L= { a^i b^j | i^2 > j}
Any tips on this? I am completely stuck.
Thanks.
The pumping lemma says:
If a language A is regular => there is a number p (pumping length) where, if s is any string in L such that |s| >= p, then s may be divided into three pieces s=xyz, satisfying the following condition:
xy^i z is in L for each i >= 0
|y|>=0
p>=|xy|
The right way to show that a certain language L is not regular is to suppose L regular and try to reach a contradiction.
Let's try to demonstrate that L = { 0^n 1^n | n >= 0 } is not regular.
We start by assuming, to the contrary, that L is regular.
You can think about this kind of demonstration as a game:
Challenger: he chooses the pumping length p. You cannot make any assumption about it.
You: Now it is your turn: choose the "kind" of string that represents the irregularity of the language.
Let's say that the string is of the form 0^p 1^p.
A good tip in this step is to try to limit the adversary's next move.
Challenger: he divides the string s = 0^p 1^p into three pieces xyz, subject to the conditions of the lemma.
You: It's time to pump! If you chose the form of the string correctly in your previous move, you can make some assumptions. In our case, for example, we know that the substring y consists only of 0s (at least one 0, because |y| > 0), since |xy| <= p and the first p symbols are all 0s.
Now we show that there exists an i >= 0 such that xy^i z is not in L. For example, for i = 2 the string xyyz has more 0s than 1s and so is not a member of L. That is a contradiction => L is not regular.
Never forget to demonstrate why the pumped string cannot be a member of L.
If you have any doubt, feel free to ask :)
Cheers.
To the above answer, "The pumping lemma says: If a language A is regular => there is a number p (pumping length) where, if s is any string in L such that |s| >= p, then s may be divided into three pieces s=xyz, satisfying the following condition:"
You mean "If a language L is regular"
Also, the three conditions
1. xy^iz is in L for each i>=0
2. |y|>=0
3. p>=|xy|
The second should be just |y| > 0 not >=
Say you choose the string:
a^3 b^5
aaabbbbb, which is in the language (3^2 = 9 > 5).
Now your opponent can choose XYZ.
Their options:
1.) X(empty)Y(some a's)
2.) X(some a's)Y(some a's and some b's)
3.) X(some a's)Y(some a's)
Based on their possible choices, you pump up Y using Y^i where i is an arbitrary number of your choice.
Say they choose 1.)
X(-) Y(a) Z(aabbbbb)
If you "pump" Y^i choosing i = 0, the new string becomes aabbbbb, which is not in the language (2^2 = 4 > 5 is false).
Repeat this for each possible choice of the opponent; if, for every choice, you can pump Y in a way that produces a string that is not in the language L, then you've succeeded in proving that the language is not regular.
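Here is my own quick Python check (not part of the original answer) of that pump-down step, using the example string a^3 b^5 above:

def in_L(s):
    # Membership test for L = { a^i b^j | i^2 > j }.
    i = len(s) - len(s.lstrip("a"))          # number of leading a's
    j = len(s) - i
    return s == "a" * i + "b" * j and i * i > j

x, y, z = "", "a", "aabbbbb"                 # the split from option 1.) above
assert in_L(x + y + z)                       # a^3 b^5 is in the language: 9 > 5
assert not in_L(x + z)                       # pumping down gives a^2 b^5: 4 > 5 is false
print("Pumping down with i = 0 leaves the language.")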