How is every binary relation BCNF? - database-normalization

So, as part of my assignment, I have to prove that any relation with two attributes is in BCNF.
As per my understanding, if for a relation we have 3rd normal form and one non key attribute functionally determine key attribute, it violates the BCNF.
Say my relation consists of two attributes A1,A2
Scenario1(only one functional dependency)
A1 -> A2 (so A1 is the key, and A2 does not FD A1 : so no violation)
same applies for
A2 -> A1
But what if
A1->A2 and A2->A1
Here key can be either A1, A2. And the other non key attribute functionally determines the key.

In each functional dependency X -> Y, X and Y are sets of attributes. This requires special attention when either X or Y is an empty set1. So, in the example with only two attributes A1 and A2, we have all the possible non-trivial dependencies:
1. {} -> {A1}
2. {} -> {A2}
3. {} -> {A1 A2}
4. {A1} -> {A2}
5. {A2} -> {A1}
All the other possible dependencies are trivial dependencies, i.e. the right set is a subset of the left set (for instance {A1} -> {}, {} -> {}, {A1} -> {A1}, {A1 A2} -> {A1}, etc.). We know that these dependencies always hold, so they are not considered in the definition of the normal forms.
1. When empty sets are excluded from dependencies, the theorem is true
Consider the dependencies 4 and 5. We have four possible cases:
1. Only 4 holds, so we have: {A1} -> {A2}
this means that {A1} is a candidate key (since from {A1} -> {A2} we can derive that {A1}->{A1 A2}), and the BCNF condition is satisfied since each dependency has a superkey as determinant;
2. Only 5 holds, so we have: {A2} -> {A1}
equivalent to the previous case, only the role of A1 and A2 is exchanged;
3. Neither 4 nor 5 hold (no functional dependencies),
so the BCNF is formally satisfied (since no dependency violates the BCNF); and, finally:
4. both hold, so we have {A1} -> {A2} and {A2} -> {A1}
also in this case the relation is in BCNF, since {A1} and {A2} are both candidate keys, since they determine all the attributes (simply put together 1 and 2 above).
2. When we allow the empty set in the functional dependencies, the theorem is not true
Consider a relation R(A1, A2), with a cover F of the dependencies
F = { {}-> {A1} }
The meaning of {} -> {A1}, by recalling the definition of functional dependency, is that the column A1 has a constant value. So we have a relation with two columns, one of which has always the same value. In this case the only candidate key is {A2}, since {A2}+ = {A1 A2}, with {A1 A2} a superkey, and the relation is not in BCNF since a non-trivial functional dependency ({} -> {A1}) has a determinant which is not a superkey.
1 Note that in the scientific literature on the subject (as well as in books on databases) the possibility of empty sets in functional dependences is sometimes explicitly excluded (for instance, see: Tsou, Don-Min, and Patrick C. Fischer. “Decomposition of a Relation Scheme into Boyce-Codd Normal Form.” ACM SIGACT News 14, no. 3 (July 1, 1982): 23–29. https://doi.org/10.1145/990511.990513), while sometimes it is allowed, or not discussed.

For any relation to be in BCNF, the following must holds.
X → Y is a trivial functional dependency (Y ⊆ X).
X is a superkey for schema R
Wikipedia link here
For Example, there is a relation R = {A,B} with two attributes.
The only possible (non-trivial) FD's are {A}->{B} and {B}->{A}.
So, there are four possible cases:
1. No FD's holds in R. {C.K = AB}, Since it is an all key relation it's always in BCNF.
2. Only A->B holds. In this case {C.K = A} and relation satisfies BCNF.
3. Only B->A holds. In this case {C.K = B} and relation satisfies BCNF.
4. Both A->B and B->A holds. In this case there are two keys {CK = A and B} and
relation satisfies BCNF.
Hence, every Binary Relation (A relation with two attributes) is always in BCNF!

To prove any relation with two attributes is in BCNF.
Rule For Boyce-Codd Normal Form:
A relation R is in BCNF if R is in Third Normal Form and for every FD,LHS is super key
so if, A1 and A2 are the only attributes: A1 -> A2 and A2 -> A1 as functional dependencies, then in both functional dependencies, the left-hand side is a super key. Which satisfies the condition of BCNF.

Related

Decomposing tables where some attributes aren't in any minimal, non-trivial functional dependency

While writing a library to automatically decompose a given table, I came across some special cases where the first steps of Bernstein's synthesis for 3NF (ACM 1976) give me results I don't expect.
Take this simple case:
a b
---
1 1
2 1
1 2
2 2
By my understanding, this is the full set of functional dependencies:
{} -> {}
a -> {}
a -> a
b -> {}
b -> b
ab -> {}
ab -> a
ab -> b
ab -> ab
By eye, we can see that both attributes together form a candidate key, and there's no normalisation to be done. However, suppose we take the full FD set above, and try to apply Bernstein to decompose the table. We expect to get the same table back.
Bernstein has the following first steps:
Eliminate extraneous attributes. We simplify to
{} -> {} (repeated)
a -> a (repeated)
b -> b (repeated)
ab -> ab
Find a non-redundant covering. ab -> ab is redundant by augmentation, so we have
{} -> {}
a -> a
b -> b
I'd say the latter two are redundant as well, by reflexivity. If we keep them, the remaining steps give two non-equivalent keys, which result in two separate relations after applying the rest of Bernstein synthesis. If we don't keep them, there's nothing to work with, so the remaining steps give no tables.
Where is the problem in the above?
This appears to be solved by an addendum to Bernstein's synthesis, that I came across in lecture videos from Gary Boeticcher, then at UHCL: if the decomposition does not contain a table containing one of the original table's candidate keys, then adding an additional table with one of those candidate keys will make the decomposition lossless. In this case, after applying Bernstein's synthesis, and getting no tables in return, we could add a table with both attributes a and b. This gives us back the original table, as we'd expect.

Graph database: get common parent node

I want to select the first common boss for two employees in draph.
My model is simple:
name: string
boss_of: uids
Lets assume the following data where each arrow denotes the boss_of edge:
A -> B
A -> C
B -> D
C -> E
E -> F
E -> G
So, given F And D the query should return A, for F and G the result is obviously E.
I tried using allofterms but found no solution as there may be a different number of nodes
between the co-workers and their common boss. Is it possible at all to formulate such a query?
I am trying to explore dgraph (or graph databases at all), so maybe I am just overseeing something.
You can use K-Shortest Path Queries
The middle one in the response is the closest common entity.

Database Normalization mistake

I'm preparing an exam and on my texts I found an example I don't understand.
On the Relation R(A,B,C,D,E,F) I got the following functional dependencies:
FD1 A,B -> C
FD2 C -> B
FD3 C,D -> E
FD4 D -> F
Now I think all The FD are in 3NF (none is in BCNF), but the text says FD1 and FD2 to be in 2NF and FD3 and FD4 to be in 1NF. Where am I making mistakes (or is it the text wrong).
I found alternative keys to be ABD and ACD
Terminology
It is highly improper to say that: “a Functional Dependency in is in a certain Normal Form”, since only a relation schema can be (or not) in a Normal Form. What can be said is that a Functional Dependency violates a certain Normal Form (so that the schema that contains it is not in that Normal Form).
Normal forms
It can be shown that a relation schema is in BCNF if every FD given has as determinant a superkey. Since, has you have correctly noted, the only candidate keys here are ABD and ACD, every dependency violates that Normal Form. So, the schema is not in BCNF.
To be in 3NF, a relation schema must have all the given functional dependencies such that either the determinant is a superkey, or every attribute of the determinate is a prime attribute, that is it is an attribute of some candidate key. In your example this is true for B and C, but not for E and F, so FD3 and FD4 violates the 3NF. So, the schema is neither in 3NF.
The 2NF, which is only of historical interest and not particularly useful in the normalization theory, is a normal form for which the relation schema does not have functional dependencies in which non-prime attributes depend on part of keys. This is not true again for FD3 and FD4, so that the relation is neither in 2NF.

Is an FD(functional dependency) fully fd when x->y and z->y where z is not a subset of x?

I have seen many examples about fully functional dependencies, but they use to say that:
x->y such that y shouldn't be determined by any proper subset of x, x has to be a key.
But, what if y is determined by an attribute other than the proper subset or subset of x.
Suppose that I have a students table which consists of rollno(primary key), name, phone no unique not null, email unique not null.
As rollno is a primary key, let it be x and take name as y.
now x->y, but phone or email also determine y(name) which are not subsets of x. Is this still called a fully functional dependent?
If yes, should we check the determinants of y which are only subsets of x?
If no, what is the mistake I did?
x->y such that y shouldn't be determined by any proper subset of x, x has to be a key.
You are confusing the definition of "full functional dependency" with the definition of "2NF". The definition of fully functionally dependent has nothing to do with superkeys or candidate keys or primary keys. And for a relation to be in 2NF, if X is a candidate key and Y is non-prime then Y can't be determined by any proper subset of X.
A functional dependency X -> Y is partial when Y is also functionally dependent on a proper/smaller subset of X. Otherwise it is full. It doesn't matter what else is true.
A superkey is a column or set of columns that functionally determines every column. If there is no smaller superkey inside it then it is a candidate key. A relation is in 2NF when every attribute is fully functionally dependent on every candidate key. It doesn't matter what else is true.
You can pick one candidate key to call primary key. So a primary key is a candidate key. Otherwise the notion of "primary key" is irrelevant to functional dependencies and normalization.
(In SQL primary key means the same as unique not null, namely superkey. Which is a candidate key only if there's no smaller superkey in it. So a set declared primary key might not even be a primary key. And in SQL you can't declare {} as a superkey.)
As rollno is a primary key, let it be x and take name as y.
A primary key is a candidate key, so {rollno} determines every attribute and no proper subset of {rollno} determines every attribute. So {}, the only proper subset of {rollno}, is not a superkey. ({} is a superkey when there can only ever be at most one row in the table.) But it's still possible that {} -> name. (That would be if the name column only contains at most one name at a time.) Then {rollno} -> name would be partial because its proper subset {} determines name.
now x->y, but phone or email also determine y(name) which are not subsets of x. Is this still called a fully functional dependent?
If no proper subset of {rollno} determines name then {rollno} -> name fully, otherwise partially. That's what the definition says. Nothing else matters. But we don't know whether proper subset of {rollno} determines name because you didn't say whether {} -> name.
If {rollno}, {phoneno} and {email} are candidate keys and {} doesn't determine name then name is fully functionally dependent on all three (because no proper subset of any of them determines name).
You are saying:
x->y such that y shouldn't be determined by any proper subset of x, x has to be a key.
but this mixes two different concepts, that of “full functional dependency”, and that of “key”.
A functional dependency is full if you cannot remove any element of the left part without losing the propoerty of determining the right part. So if a functional dependency has only one attribute on the left part (like rollno → name), it is always complete.
A (candidate) key on the other hand is a set of attributes that determines all the attributes of a relation, and such that you cannot remove any attribute from it without losing the property of being a key (so, it is not a superkey).
In your example there are three different keys, rollno, phone, and email, each of them composed by a single attribute.
Of course, if you know that the set of attributes X is a key, you can write that X → T, where T are all the attributes of the relation, and this functional dependency is complete.

Transitive Dependency issue in Data Normalization

I wish to find the transitive dependencies present in this table.
Here,
SID: Staff ID
PS: Pay scale
S_name: Staff Name
C_P_HR: Charge per hour
CNIC: ID card
First start off by listing all the functional dependencies. Logically since SID is the primary key, everything depends on the primary key. Also, CNIC forms a candidate key
{SID}+ -> {PS, S_NAME, C_P_HR, CNIC, DESIGNATION, SALARY}
{S_NAME}+ -> {SID, PS, CNIC, C_P_HR, DESIGNATION, SALARY}
{CNIC}+ -> {SID, PS, S_NAME, C_P_HR, DESIGNATION, SALARY}
{PS}+ -> {C_P_HR,DESIGNATION}
Now, a transitive dependency needs to satisfy these criteria:
A → B
It is not the case that B → A
B → C
Upon inspection of all the Functional Dependencies, none of them satisfy the criteria. Thus, we can conclude that there ae no transitive dependencies in the table.