Definition for relation to be in 3NF using canonical cover - database-normalization

We are using following definition of 3NF:
A schema R is in third normal form (3NF) if for all FDs α → β in F⁺, at least one of the following holds:
α → β is trivial (i.e., β ⊆ α).
α is a superkey for R.
Each attribute A in β – α is contained in a candidate key for R (i.e., is prime).
I don't understand the third condition of this definition. What does "each attribute A in β – α" mean? What set of attributes does it include?
And what does "is contained in a candidate key for R" mean? What set of attributes does it include?

β – α is the set of attributes of β minus those that are also in α, if any. So the third rule says that attributes may be determined by something that is not a superkey, but only if those attributes are prime (i.e. part of a candidate key) and, obviously, are not already in the determinant α (otherwise that part of the dependency is trivial).
So when is a relation schema not in third normal form? When all three conditions are false for some dependency: that is, when there is at least one non-trivial dependency whose determinant is not a superkey (and so not a candidate key either) and whose right-hand side contains at least one attribute that is not prime.
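The three conditions can be checked mechanically. The following is a minimal sketch, assuming single-letter attribute names and a brute-force key search (fine for toy schemas only). The function names and the two example schemas are hypothetical, not taken from the question. It tests the FDs of the given cover rather than all of F⁺, which is the usual shortcut for a whole relation.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(attrs)
    return keys

def violations_3nf(schema, fds):
    """Return the FDs of the cover for which all three conditions fail."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) == set(schema)
        rest_is_prime = (rhs - lhs) <= prime
        if not (trivial or superkey or rest_is_prime):
            bad.append(("".join(sorted(lhs)), "".join(sorted(rhs))))
    return bad

# Hypothetical R(A, B, C) with A -> B, B -> C: the only key is A,
# so B -> C fails all three conditions.
not_3nf = violations_3nf("ABC", [("A", "B"), ("B", "C")])
# Hypothetical R(A, B, C) with AB -> C, C -> A: A is prime, so nothing fails.
is_3nf = violations_3nf("ABC", [("AB", "C"), ("C", "A")])
print(not_3nf, is_3nf)
```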

Related

Could someone please give me an example of a 3NF *DECOMPOSITION* that is not in BCNF? (I have no problem determining this for non-decompositions.)

It seems to me that Bernstein's synthesis / 3NF synthesis always yields BCNF subrelations, but that's apparently not true.
When one uses 3NF synthesis, one will have subrelations as a result, and they will each consist of either:
just one functional dependency along with all attributes of the schema, so the left side of the lone functional dependency will be a superkey, and that subrelation will therefore be in BCNF.
multiple functional dependencies each of which have the same left side, so they're each superkeys, and that subrelation will therefore be in BCNF.
no functional dependency where the schema includes the attributes making up the primary key of the original / non-decomposed relation, which would satisfy BCNF vacuously because of there being no functional dependencies.
What is an example of the 3NF synthesis algorithm yielding a non-BCNF decomposition and why it is so?
Bernstein's algorithm returns (one or more) components in EKNF, which lies between 3NF & BCNF.
Your claims of "that subrelation will therefore be in BCNF" are wrong. The FDs that hold in a component are all the ones in the closure of the original relation whose attributes are all in the component. So FDs could hold in a component that are not out of its superkeys. (Which by definition of BCNF is just another way of saying a component could be not in BCNF. Obviously--since we are told that the algorithm doesn't always give BCNF.)
Since your reasoning is unsound, finding a counterexample seems moot. But just about any presentation of BCNF gives an example non-BCNF 3NF relation, which it then decomposes to BCNF. You can join the non-BCNF 3NF relation with a projection on attributes of one of its CKs extended by a fresh non-prime attribute, and Bernstein's algorithm can decompose back to the 2 tables.
Chris Date's classic An Introduction to Database Systems has a non-BCNF 3NF schema R(S, J, T) with minimal/irreducible cover
{S, J} -> T
{T} -> J
CKs are {S, J} & {S, T}. Bernstein's algorithm gives component (S, J, T)--non-BCNF 3NF input R--in which both given FDs hold--plus redundant component (T, J).
For an example with an additional non-redundant component, extend the cover by {T} -> X. CKs are the same. {S, J} -> T again gives (S, J, T)--non-BCNF--plus component (T, J, X).
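As a sanity check on R(S, J, T), the candidate keys can be enumerated by brute force. This is only a sketch with hypothetical helper names, assuming single-letter attributes. It finds {S, J} and {S, T} as the keys, and it shows that T → J has a prime right-hand side (so 3NF is not violated) and a determinant that is not a superkey (so BCNF is).

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(attrs)
    return keys

cover = [("SJ", "T"), ("T", "J")]
keys = candidate_keys("SJT", cover)
key_names = sorted("".join(sorted(k)) for k in keys)

# T -> J: the determinant T is not a superkey, but J belongs to a key.
t_is_superkey = closure("T", cover) == set("SJT")
j_is_prime = any("J" in k for k in keys)
print(key_names, t_is_superkey, j_is_prime)
```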
So, could someone please give me an example of the 3NF synthesis algorithm yielding a non-BCNF decomposition and tell why it is so?
A better "So, [...]" would be, So, what is wrong with my reasoning? You would do well to examine the assumptions you made about what FDs could hold in a component. (That article happens to point out (with reference) that "A 3NF table that does not have multiple overlapping candidate keys is guaranteed to be in BCNF.")
There is no "why" in mathematics. We assume things ("assumptions", "axioms", "premises") & other things follow. We can ask for a proof of something, but the proof does not say "why" the something is so, it's a demonstration that it is so. "Why" might be used trying to ask for a proof or for steps that you got wrong in or are missing from whatever almost-proof you have in mind.
PS Such a ubiquitous non-BCNF 3NF relation is Today's Court Bookings in the Wikipedia article on BCNF as I write. But beware that that particular example has perhaps unintuitive FDs. Indeed beware that almost every relational model Wikipedia page--including that one--has errors & misconceptions. So do many, many textbooks, especially re normalization.
The answer of philipxy is correct. Since you are asking for an example, here are a couple of them.
The relation (with a cover of the functional dependencies):
R (A B C D)
A B → C
C → D
D → B
through the synthesis algorithm is decomposed in:
R1 (A B C)
R2 (C D)
R3 (B D)
and R1 is not in BCNF because of the dependency C → B: C is not a superkey of R1, whose candidate keys are AB and AC. Note that C → B is not present in the original cover, but is implied by it (from C → D and D → B).
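The point that a component inherits implied dependencies, not just those written in the cover, can be checked by projecting the closure onto each component. This is a rough sketch with hypothetical function names. It enumerates every proper subset of a component, so it is only suitable for small examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(component, fds):
    """Non-trivial FDs X -> Y holding in `component` where X is not a superkey of it.

    The FDs that hold in the component are those of the closure of `fds`
    whose attributes all lie inside the component.
    """
    comp = set(component)
    found = []
    for size in range(1, len(comp)):
        for combo in combinations(sorted(comp), size):
            x = set(combo)
            x_closure = closure(x, fds)
            determined = (x_closure & comp) - x
            if determined and not comp <= x_closure:
                found.append(("".join(sorted(x)), "".join(sorted(determined))))
    return found

cover = [("AB", "C"), ("C", "D"), ("D", "B")]
r1 = bcnf_violations("ABC", cover)
r2 = bcnf_violations("CD", cover)
r3 = bcnf_violations("BD", cover)
print(r1, r2, r3)
```

Only R1 reports a violation, and the violating dependency is the implied C → B rather than anything listed in the cover.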
Here is another (classical) example:
Phones (AreaCode, PhoneNumber, Subscriber, Town, Street)
AreaCode, PhoneNumber → Town
AreaCode, PhoneNumber → Subscriber
AreaCode, PhoneNumber → Street
Town → AreaCode
Bernstein's synthesis algorithm produces two subschemas:
R1 (AreaCode, PhoneNumber, Subscriber, Town, Street)
AreaCode, PhoneNumber → Town
AreaCode, PhoneNumber → Subscriber
AreaCode, PhoneNumber → Street
and:
R2 (Town, AreaCode)
Town → AreaCode
Since R2 is included in R1, the algorithm eliminates the second relation. The resulting relation is in 3NF but not in BCNF: it has two candidate keys, (AreaCode, PhoneNumber) and (PhoneNumber, Town), and the functional dependency Town → AreaCode violates BCNF because Town is not a superkey.
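The two candidate keys of the Phones relation can be confirmed with a small brute-force search. This is a sketch under the assumption that the four listed dependencies are a complete cover. The helper names are illustrative only.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(attrs)
    return keys

schema = {"AreaCode", "PhoneNumber", "Subscriber", "Town", "Street"}
number = ("AreaCode", "PhoneNumber")
cover = [
    (number, ("Town",)),
    (number, ("Subscriber",)),
    (number, ("Street",)),
    (("Town",), ("AreaCode",)),
]

keys = candidate_keys(schema, cover)
# Town -> AreaCode: BCNF needs Town to be a superkey, 3NF only needs AreaCode prime.
town_is_superkey = closure({"Town"}, cover) == schema
area_code_is_prime = any("AreaCode" in key for key in keys)
print(keys, town_is_superkey, area_code_is_prime)
```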

Convert a relation into BCNF

R (A B C)
AB -> C,
C -> A
AB is the minimal super key which is a candidate key.
AB -> C is good.
But C -> A doesn't hold good since prime attribute depends on Non Prime attribute. I know how to decompose till 3 NF. I also know why relation is not in BCNF.
But I don't know how to break this relation into BCNF. How can I do that?
This relation can be decomposed into BCNF using, for instance, the analysis algorithm, which produces the following decomposition:
R1(A, C) (with non-trivial dependency C → A and candidate key C)
R2(B, C) (without non-trivial dependencies and so with candidate key (B, C))
But this decomposition does not preserve the dependencies: the dependency A B → C is lost, so the constraint it expresses cannot be enforced on the decomposed relations individually. No other BCNF decomposition of this relation preserves the dependencies either. Note also that the original relation is already in 3NF.
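Both the split and the loss of A B → C can be reproduced in a few lines. This is a sketch, not the analysis algorithm in full: `split_on` performs a single decomposition step on a violating determinant, and `preserved` uses the standard test that closes a determinant using only dependencies visible inside individual components. Both names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def split_on(schema, lhs, fds):
    """One analysis step: split `schema` on a violating FD with determinant `lhs`."""
    schema, lhs = set(schema), set(lhs)
    determined = closure(lhs, fds) & schema
    return determined, (schema - determined) | lhs

def preserved(lhs, rhs, components, fds):
    """True if lhs -> rhs follows from the FDs projected onto the components."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for comp in components:
            comp = set(comp)
            gained = closure(result & comp, fds) & comp
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

cover = [("AB", "C"), ("C", "A")]
r1, r2 = split_on("ABC", "C", cover)  # split on the violating FD C -> A
ab_c_kept = preserved("AB", "C", [r1, r2], cover)
c_a_kept = preserved("C", "A", [r1, r2], cover)
print(sorted(r1), sorted(r2), ab_c_kept, c_a_kept)
```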

Definition of 3NF

I'm rather confused about the definition of 3NF.
Let R be a relation with attribute set X.
Suppose Y -> A is a functional dependency where A is a non-prime attribute and Y is a subset of X.
If Y is a proper subset of any candidate key for R, then the relation is not in 3NF (and not even in 2NF) because this is a partial dependency, which is not permitted in 2NF (and by extension 3NF).
If Y is a non-prime attribute, the relation is not in 3NF because this is a transitive dependency of the non-prime attribute A on any candidate key through the non-prime attribute Y.
But what if Y is a set containing both prime and non-prime attributes? What if A is a subset of Y? What if Y contains only prime attributes, but those prime attributes come from different keys of R so that Y is not a proper subset of any particular key of R? What if Y contains only, but multiple non-prime attributes? Which of these cases violates the requirements of 3NF and why?
TL;DR Get definitions straight.
To know whether a case violates 3NF you have to look at the criteria used in some definition.
Your question is rather like asking, I know an even number is one that is divisible by 2 or one whose decimal representation ends in 0, 2, 4, 6 or 8, but what if it's three times a square? Well, you have to use the definition--show that the given conditions imply that it's divisible by two or that its decimal representation ends in one of those digits. Why do you even care about other properties than the ones in the definition?
When some FDs (functional dependencies) hold, others must also hold. We say the latter are implied by the former. So when given FDs hold, usually tons of others also hold. So one or more arbitrary FDs holding doesn't necessarily tell you anything about which normal forms might hold. Eg when U is a superset of V, U → V must hold; such FDs are called trivial because they are implied by any collection of FDs. Eg when U → V, every superset of U determines every subset of V. Armstrong's axioms are some rules that can be mechanically applied to find all FDs that hold. There are algorithms to find a canonical/minimal/irreducible cover for a given set: a set of FDs that implies all those in the given set, with no proper subset that does. There are also algorithms to determine whether a relation satisfies certain NFs (normal forms), and to decompose it into components in higher NFs when it doesn't.
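Whether a given FD is implied by others is usually decided with the attribute-closure algorithm rather than by applying Armstrong's axioms by hand. Here is a minimal sketch: the function names and the single given FD U → VW are made up for illustration, and attributes are single letters.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """lhs -> rhs is implied by `fds` exactly when rhs lies in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

given = [("U", "VW")]  # hypothetical: U -> VW

superset_to_subset = implies(given, "UX", "V")  # superset of U, subset of VW
trivial_fd = implies([], "UV", "V")             # trivial: holds with no givens
reversed_fd = implies(given, "V", "U")          # not implied
print(superset_to_subset, trivial_fd, reversed_fd)
```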
Sometimes we think there is a case that the definition doesn't handle but really we have got the definition wrong.
The definition you are trying to refer to for a relation being in 3NF actually requires that there be no transitive functional dependence of a non-prime attribute on a candidate key.
In your non-3NF example you should say there is a transitive FD, not "this is a transitive FD", because the violating FD is of the form CK → A not Y → A. Also, U → V is transitive when there is an X where U → X AND X → V AND NOT X → U. It doesn't matter whether X is a prime attribute.
PS It's not very helpful to ask "why" something is or isn't so in mathematics. We describe a situation in terms of some givens, and a bunch of things follow. We can say that if certain of the givens weren't so then that thing wouldn't be so. But if certain other givens weren't so then it might also not be so. We can give a proof that something is or isn't so as "why" but it's not the only proof.

Basic Pumping Lemma proof doesn't make sense

Proving that a^n b^n, n >= 0, is non-regular.
Using the string a^p b^p.
Every example I've seen claims that y can either contain a's, b's, or both. But I don't see how y can contain anything other than a's, because if y contains any b's, then the length of xy must be greater than p, which makes it invalid.
Conversely, for examples such as:
www, w is {a, b}*, the string used is a^p b a^p b a^p b. In the proofs I've seen, it claims that y cannot contain anything other than a's, for the reason I stated above. Why is this different?
Also throwing in another question:
Describe the error in the following "proof" that 0* 1* is not a regular language. (An
error must exist because 0* 1* is regular.) The proof is by contradiction. Assume
that 0* 1* is regular. Let p be the pumping length for 0* 1* given by the pumping
lemma. Choose s to be the string 0^p 1^p. You know that s is a member of 0* 1*, but
a^p b^p cannot be pumped. Thus you have a contradiction. So 0* 1* is not regular.
I can't find any problem with this proof. I only know that 0*1* is a regular language because I can construct a DFA.
The pumping lemma states that for a regular language L there is a pumping length p such that:
every string s in L with |s| ≥ p has a subdivision s = xyz such that:
1. for all i ≥ 0, x y^i z is in L;
2. |y| > 0; and
3. |xy| ≤ p.
Now the claim that y can contain only a's or only b's originates from the first item: if y contained both a's and b's, then with i=2 the result would be a string of the form aa...abb...baa...abb...b, which is not in L. That's what the statement wants to say.
The third item indeed makes it obvious that y can only contain a's. In other words, what the textbooks say is a conclusion derived from the first item alone.
Finally, if you combine 1., 2. and 3., you reach a contradiction: y must contain at least one character (2.) and can only contain a's (3.). Say y contains k a's. If we "pump" this with i=2, the result is the string:
s' = x y^2 z = a^(p+k) b^p
We know however that s' is not part of L, which it should be by 1., so we reach inconsistency.
You can thus only make the proof work by combining the three items. It's not enough to know that y consists only of a's: that alone doesn't result in a contradiction. The contradiction arises because there is no subdivision that satisfies all three constraints simultaneously.
About your second question: in that case L is a different language. You can't reuse the proof for a^n b^n, because 0* 1* is perfectly happy if the string contains more a's (here, 0's), so no contradiction arises and the last step of the proof fails. As long as y contains only one type of character, regardless of its length, it can satisfy all three constraints.
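The difference between the two languages can be seen by enumerating every split allowed by the constraints |y| > 0 and |xy| ≤ p. This is only an illustrative sketch. The function names are hypothetical, p is fixed at 3, and each split is tested for i = 0..4 only, so a surviving split is evidence rather than a proof that it pumps for all i.

```python
import re

def in_anbn(w):
    """Membership in { a^n b^n : n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def in_0s1s(w):
    """Membership in the regular language 0*1*."""
    return re.fullmatch(r"0*1*", w) is not None

def pumpable_splits(s, p, in_lang, max_i=4):
    """Splits s = xyz with |y| > 0 and |xy| <= p that stay in the language
    for every tested i in 0..max_i."""
    good = []
    for end_xy in range(1, p + 1):
        for start_y in range(end_xy):
            x, y, z = s[:start_y], s[start_y:end_xy], s[end_xy:]
            if all(in_lang(x + y * i + z) for i in range(max_i + 1)):
                good.append((x, y, z))
    return good

p = 3
anbn_splits = pumpable_splits("a" * p + "b" * p, p, in_anbn)
zeros_ones_splits = pumpable_splits("0" * p + "1" * p, p, in_0s1s)
print(anbn_splits)        # no split survives for a^p b^p
print(zeros_ones_splits)  # every allowed split survives for 0^p 1^p
```

For a^p b^p no split survives, which is the contradiction the proof needs. For 0^p 1^p every allowed split survives, so the claim in the faulty proof that the string cannot be pumped is where it goes wrong.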

Decide whether a relationship is in BCNF

I check the definition on Wikipedia for BCNF
A relational schema R is in Boyce–Codd normal form if and only if for every one of its dependencies X → Y, at least one of the following conditions holds:
X → Y is a trivial functional dependency (Y ⊆ X)
X is a superkey for schema R
Now if R={P,Q,S}
and F={PQ->S, PS->Q, QS->P}
I think it is not in BCNF, am I right?
If I am wrong, could you give me some idea why?
Otherwise, if we are asked which FD violates BCNF, what should we give, since any determinant in an FD could be a superkey?
Informally, a relation is in BCNF if every arrow for every FD is an arrow out of a candidate key. In this case, the candidate keys are PQ, PS, and QS, so every arrow is an arrow out of a candidate key. I think it's in BCNF.
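This can be confirmed by computing the closure of each determinant. The sketch below assumes the three listed FDs are the whole cover, and relies on the fact that for the full relation it is enough to test the FDs of a cover. The helper name is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

schema = set("PQS")
cover = [("PQ", "S"), ("PS", "Q"), ("QS", "P")]

# An FD violates BCNF when it is non-trivial and its determinant is not a superkey.
violations = [
    (lhs, rhs)
    for lhs, rhs in cover
    if not set(rhs) <= set(lhs) and closure(lhs, cover) != schema
]
# No single attribute determines anything else, so PQ, PS and QS are minimal.
single_attribute_keys = [a for a in sorted(schema) if closure(a, cover) == schema]
print(violations, single_attribute_keys)
```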