ER Diagram to Database conversion - rdbms

Suppose I have two strong entities E1 and E2 connected by a one-to-many relationship R.
E1 <--------- R ---------- E2
How many tables will be created when I convert the above ER diagram into a database?
I know that when E2 is in total participation the answer is 2, since E2's primary key merges perfectly. I am not sure about the case above. I have looked in multiple places and found different answers. I am looking for a solid argument along with the answer.
The answer can be 2 or 3. I want to know which is more correct.

Chen's original method mapped every entity relation and relationship relation to a separate table. This would produce 3 tables:
E1 (e1 PK)
E2 (e2 PK)
R (e2 PK, e1)
Full participation by either E1 or E2 can be handled by an FK constraint.
As you can see, E2 and R have the same determinant / PK. This allows us to combine the two relations into one table, using a nullable e1 column if E2 participates partially in the relationship, non-nullable if it participates fully. Full participation by E1 still requires an FK constraint:
E1 (e1 PK)
E2 (e2 PK, e1)
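As a minimal sketch of the two mappings, the following uses Python's built-in sqlite3 module with an in-memory database. The name E2_merged and the sample values are assumptions made for illustration only (the merged table gets a different name so that both designs can live side by side). It shows that the three-table design, read through a LEFT JOIN, carries the same information as the two-table design with a nullable e1 column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
-- Three-table design: R is its own table, keyed by e2.
CREATE TABLE E1 (e1 INTEGER PRIMARY KEY);
CREATE TABLE E2 (e2 INTEGER PRIMARY KEY);
CREATE TABLE R (
    e2 INTEGER PRIMARY KEY REFERENCES E2 (e2),
    e1 INTEGER NOT NULL REFERENCES E1 (e1)
);

-- Two-table design: R is folded into E2 as a nullable e1 column.
-- (Hypothetical name; in a real schema this table would simply be E2.)
CREATE TABLE E2_merged (
    e2 INTEGER PRIMARY KEY,
    e1 INTEGER REFERENCES E1 (e1)  -- NULL means "does not participate in R"
);
""")

# Sample data (made up): e2 = 12 does not participate in R.
conn.executemany("INSERT INTO E1 VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO E2 VALUES (?)", [(10,), (11,), (12,)])
conn.executemany("INSERT INTO R VALUES (?, ?)", [(10, 1), (11, 1)])
conn.executemany("INSERT INTO E2_merged VALUES (?, ?)",
                 [(10, 1), (11, 1), (12, None)])

three_table = conn.execute(
    "SELECT E2.e2, R.e1 FROM E2 LEFT JOIN R ON R.e2 = E2.e2 ORDER BY E2.e2"
).fetchall()
two_table = conn.execute(
    "SELECT e2, e1 FROM E2_merged ORDER BY e2"
).fetchall()

print(three_table == two_table)
```

Declaring e1 as NOT NULL in E2_merged would correspond to full participation of E2.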
I want to know which is more correct.
Logically, the two solutions are pretty much equivalent.
Making 3 tables maintains the structure of the conceptual (ER) model, but produces more tables which increases complexity in one way. On the other hand, it avoids nulls which create their own complexity.
Making 2 tables reduces the number of tables but introduces nulls. In addition, we have to resort to different mechanisms (nullable columns vs FK constraints) to implement a single concept (full participation).
Other requirements can also affect the decision. If I have 50 optional attributes, I certainly don't want to deal with 50 distinct tables! However, if I wanted to create another relationship (R2) which only applies to values in E2 which are already participating in R, I could enforce that constraint in the first design using an FK constraint: R2 (e2) referencing R (e2). In the second design, I would need to use a trigger since I only want to allow references to e2 which have non-null e1 values.
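The R2 case in the first design can be sketched with SQLite through Python's sqlite3 module. The sample values are assumptions for illustration; the point is only that a plain FK from R2 (e2) to R (e2) rejects an e2 value that exists in E2 but does not participate in R.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
CREATE TABLE E1 (e1 INTEGER PRIMARY KEY);
CREATE TABLE E2 (e2 INTEGER PRIMARY KEY);
CREATE TABLE R  (e2 INTEGER PRIMARY KEY REFERENCES E2 (e2),
                 e1 INTEGER NOT NULL REFERENCES E1 (e1));
-- R2 may only mention e2 values that already participate in R.
CREATE TABLE R2 (e2 INTEGER PRIMARY KEY REFERENCES R (e2));

INSERT INTO E1 VALUES (1);
INSERT INTO E2 VALUES (10), (11);
INSERT INTO R  VALUES (10, 1);   -- only e2 = 10 participates in R
""")

conn.execute("INSERT INTO R2 VALUES (10)")  # accepted: 10 is in R

try:
    conn.execute("INSERT INTO R2 VALUES (11)")  # 11 is in E2 but not in R
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

In the two-table design there is no table whose key contains exactly the participating e2 values, which is why a declarative FK alone is not enough there.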
There is no ultimately correct answer. Conceptual, logical and physical modeling address different concerns, and as-yet unknown requirements will affect your model and contradict your decisions. As in programming, try to keep things simple, refactor continuously and hope for the best.

Related

Equivalence relations are to Groups, as partial order relations are to...?

I'm a beginning student of category theory so the question is a little hazy. Apologies if it is too basic.
An equivalence relation induces a "symmetric category" (bad terminology?), where you can go back along any arrow. The category induced by a group has a different symmetry. How are these two specifically related? Is an equivalence relation somehow an algebra, like a group, that specializes the category axioms? Is it more deeply analogous to a group in some way?
I know that a category can also be induced by a partial order, which encodes antisymmetry rather than symmetry. Is there a corresponding algebra encoding antisymmetry (like a group, but encoding antisymmetry instead)? I know a partial order itself has the algebra of a lattice.
A set with an equivalence relation is often called a setoid. Categorically, a setoid is a thin groupoid. A groupoid may be thought of as a "multi-object group" in the same way that a category is a "multi-object monoid": that is, the endomorphisms of every object in a groupoid form a group.
A partial order is a thin skeletal category (a preorder is simply a thin category). Therefore, the algebraic structure corresponding to a partial order (or preorder), in the same way that groups correspond to equivalence relations, is a monoid.
The relationship "an X is just a one-object Y" is called horizontal categorification, where for your examples we have:
X = group, Y = groupoid.
X = monoid, Y = category.
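To make the "thin category" reading concrete, here is a small Python sketch that treats a relation on a finite set as a category with at most one arrow a -> b per pair. Reflexivity supplies identities, transitivity supplies composition, symmetry supplies inverses, and antisymmetry means that isomorphic objects are equal (skeletal). The function names and the sample relations are assumptions chosen for illustration.

```python
from itertools import product


def is_reflexive(elems, rel):
    # Every object has an identity arrow.
    return all((x, x) in rel for x in elems)


def is_transitive(rel):
    # Composable arrows a -> b and b -> d have a composite a -> d.
    return all((a, d) in rel
               for (a, b), (c, d) in product(rel, rel) if b == c)


def is_symmetric(rel):
    # Every arrow has an inverse.
    return all((b, a) in rel for (a, b) in rel)


def is_antisymmetric(rel):
    # Arrows in both directions only between equal objects.
    return all(a == b for (a, b) in rel if (b, a) in rel)


def classify(elems, rel):
    """Classify a relation read as a thin category."""
    if not (is_reflexive(elems, rel) and is_transitive(rel)):
        return "not a category"
    if is_symmetric(rel):
        # Plain equality is both symmetric and antisymmetric; it lands here.
        return "thin groupoid (setoid)"
    if is_antisymmetric(rel):
        return "thin skeletal category (partial order)"
    return "thin category (preorder)"


nums = {0, 1, 2, 3}
same_parity = {(a, b) for a in nums for b in nums if a % 2 == b % 2}
leq = {(a, b) for a in nums for b in nums if a <= b}

signed = {-1, 1, 2}
abs_leq = {(a, b) for a in signed for b in signed if abs(a) <= abs(b)}

print(classify(nums, same_parity))
print(classify(nums, leq))
print(classify(signed, abs_leq))
```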

Third Normal Form Conditions

I know that for a relation to be in 3NF it has to be in 2NF and no transitive dependencies should exist, but I couldn't answer the following question:
For a relationship to be 3NF :
A) All Attributes should depend on the primary key.
B) The relationship should only have one Foreign Key.
C) The relationship should only have one Primary Key.
D) The Relationship's Table should only have atomic values
D applies to a 3NF relation because it is one of the conditions of 1NF, and for a relation to be in 3NF it has to be in 2NF and 1NF.
C is too general and doesn't apply only to 3NF, but my book has chosen it as the answer!
B is not related to normalization, and A may be considered a 2NF condition, but they didn't say all non-key attributes, so I don't actually know. What is the right answer here?
By definition of "superkey", all attributes depend on a superkey. By definition of "CK" (candidate key) as a superkey containing no smaller superkey, all attributes depend on a CK. By definition of "PK" (primary key) as a distinguished CK, all attributes depend on a PK. So A is an answer.
FKs (foreign keys) are irrelevant to normalization. So B is not an answer.
By definition of "PK", a relation/schema can have at most one, which we pick from among the CKs. There can always be a PK, because there is always at least one CK. Whether you must pick a PK depends on your textbook--PKs per se have no role in normalization theory. Unfortunately "should only have one" is not clear, because it might mean exactly one & it might mean at most one. So if it agrees with your textbook, C is an answer; otherwise not. Go with your textbook.
Presentations that talk about "atomic" values require them in either the definition of "relation" or the definition of "1NF" & higher NFs. So for your textbook presumably D is an answer. But actually the notion of atomic values, although ubiquitous, is confused & also "1NF" has no single meaning. Go with your textbook.
(None of the options guarantee 3NF.)
PS Your characterization of 3NF is not correct. Only certain transitive FDs (functional dependencies) matter--3NF is when/iff 2NF & no non-CK attribute is transitively dependent on a CK. (If one's "is in 1NF" is just "is a relation" then one can drop the "2NF &".) And be sure you get the correct definition of "transitive FD"--for sets X & Y, X->Y is transitive when/iff there exists set S where X->S & S->Y & not S->X & not S=Y. Get correct definitions from a good textbook.

Understanding BCNF Functional Dependency

I was following this tutorial for BCNF decomposition. The functional dependencies given are:
A->BCD
BC->AD
D->B
These are concerned with the relation R(A,B,C,D). The conditions for BCNF include:
The relation must be in 3NF and when X->Y, X must be a superkey
The given relation doesn't have a transitive FD but D->B is a partial FD--or is it that the three FDs represent 3 separate relations?
If they represent 3 separate relations, why is D not a key? And if they are all in the same relation, then isn't D->B a partial functional dependency?
If we write the given set of FDs with singleton right-hand side, we have -
A->B
A->C
A->D
BC->A
BC->D
D->B
We can see at once 2 transitive dependencies. We have A->D and D->B so we don't need A->B and also we have BC->A and A->D so we don't need BC->D. So now we have -
A->C
A->D
BC->A
D->B
or
A->CD
BC->A
D->B
The keys here are A, BC and CD. Since each attribute of the relation R appears in at least one of the keys, all the attributes in your relation R are prime attributes.
Note that if a relation has all prime attributes then it is already in 3NF.
Hence the given relation R is in 3NF. I hope you get why you are completely wrong here - "The given relation though doesn't have a transitive FD but D->B is a partial FD ". I just proved that the relation is in 3NF, which is a higher normal form than 2NF, and that in turn proves that the relation is in 2NF and hence has no partial dependency.
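The key computation can be checked mechanically. Below is a minimal Python sketch, assuming the standard attribute-closure algorithm and a brute-force search over attribute subsets (fine for four attributes, not meant for large schemas). The function names are made up for this illustration.

```python
from itertools import combinations

ATTRS = set("ABCD")
# The FDs from the question, as (left-hand side, right-hand side) pairs.
FDS = [("A", "BCD"), ("BC", "AD"), ("D", "B")]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Enumerate minimal superkeys, smallest subsets first."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


keys = candidate_keys(ATTRS, FDS)
prime = set().union(*keys)  # attributes that occur in some candidate key

print(sorted("".join(sorted(k)) for k in keys))  # ['A', 'BC', 'CD']
print(prime == ATTRS)                            # True
```

Every attribute is prime, so there is no non-prime attribute that could break 2NF or 3NF.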
To be in BCNF, for each nontrivial functional dependency X->Y, X should be a superkey. We see that the last functional dependency D->B violates this since D is not a superkey. Therefore to convert into BCNF we can break our relation R into R1 and R2 as -
R1(A,C,D)
R2(B,D)
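A short Python sketch of the same check, under the assumption that testing the given FDs is enough to detect a BCNF violation in R, and using the usual binary lossless-join test (the shared attributes must determine all of R1 or all of R2). The helper name closure is an illustrative choice.

```python
ATTRS = set("ABCD")
FDS = [("A", "BCD"), ("BC", "AD"), ("D", "B")]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# A nontrivial FD X->Y violates BCNF when X is not a superkey.
violations = [
    (lhs, rhs)
    for lhs, rhs in FDS
    if not set(rhs) <= set(lhs) and closure(lhs, FDS) != ATTRS
]
print(violations)  # [('D', 'B')]

# Lossless-join test for the decomposition R1(A,C,D), R2(B,D).
R1, R2 = set("ACD"), set("BD")
common_closure = closure(R1 & R2, FDS)  # closure of {D}
lossless = R1 <= common_closure or R2 <= common_closure
print(lossless)  # True
```

This sketch checks only that the join of R1 and R2 recovers R; it does not check whether BC->A can still be enforced without joining the two tables.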

Functional dependencies - BCNF normalization issue

I need help about a normalization issue.
Consider a relation R(ABC)
with the following functional dependencies:
AB --> C
AC --> B
How can I modify this to Boyce–Codd normal form?
If i leave it like this, it's a relation with a key attribute transitionally-dependent of a key-candidate.
I tried splitting into several relations but that way i lose information.
A relational schema R is in Boyce–Codd normal form if and only if for
every one of its dependencies X → Y, at least one of the following
conditions hold:
X → Y is a trivial functional dependency (Y ⊆ X)
X is a superkey for schema R
From Wikipedia
R has two candidate keys, AB and AC. It's clear that the second rule above applies here. So R is in BCNF.
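The superkey rule can be checked with a few lines of Python. This is a sketch that assumes the standard attribute-closure algorithm and tests only the two given FDs; the helper name closure is made up for the example.

```python
ATTRS = set("ABC")
FDS = [("AB", "C"), ("AC", "B")]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# BCNF: each FD is trivial or has a superkey on its left-hand side.
in_bcnf = all(
    set(rhs) <= set(lhs) or closure(lhs, FDS) == ATTRS
    for lhs, rhs in FDS
)
print(in_bcnf)  # True
```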
If i leave it like this, it's a relation with a key attribute
transitionally-dependent of a key-candidate. I tried splitting into
several relations but that way i lose information.
I'm not quite sure what you're getting at here, but I think the terminology in English includes
prime attribute (an attribute that's part of any candidate key)
transitively dependent (but that refers to non-prime attributes)
candidate key (not key-candidate)
This relation is in BCNF
AC and AB are superkeys, and the attributes B and C depend on those superkeys, so the relation is in BCNF
and
There is no transitive dependency in this relation.
Hope this helps.

JPQL: Inner Join without duplicate records

Below is a question which supposedly was part of the official exam from Sun:
A Reader entity has a one-to-many, bidirectional relationship with a
Book entity. Two Reader entities are persisted, each having two Book
entities associated with them. For example, reader 1 has book a and
book b, while reader 2 has book c and book d. Which query returns a
Collection of fewer than four elements?
A. SELECT b.reader FROM Book b
B. SELECT r FROM Book b INNER JOIN b.reader r
C. SELECT r FROM Reader r INNER JOIN r.books b
D. SELECT r from Book b LEFT JOIN b.reader r LEFT JOIN FETCH r.books
Given answer is C, which I believe is incorrect. From what I understand, SQL with inner join of two tables will be generated by JPA provider. Therefore, in all cases we will get 4 records. I've run a test with one-to-many relation and duplicates were included.
Who is wrong, me or Sun?
Answer from Mike Keith, EJB 3.0 co-specification lead:
There are a couple of statements related to duplicates in the spec.
The JOIN FETCH is a variation of the JOIN, but it does state that similar JOIN semantics apply (except that more data is selected). The spec (section 4.4.5.3 of JPA v2.0) gives an example of duplicate Department rows being returned despite the fact that the Employee objects are not in the select clause.
The more direct reference is in the SELECT section (section 4.8 of JPA v2.0), where it clearly states
"If DISTINCT is not specified, duplicate values are not eliminated."
Many JPA providers do in fact remove the duplicates for a few reasons:
a) Convenience of the users because some users are not knowledgeable enough in SQL and are not expecting them
b) There is not typically a use case for requiring dups
c) They may be added to a result set and if object identity is maintained the dups get eliminated automatically
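The underlying SQL behaviour can be reproduced without any JPA provider. The sketch below uses Python's sqlite3 module with hypothetical reader and book tables that mirror the exam scenario. It is a plain-SQL analogue of query C and says nothing about what a particular provider does with the rows afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reader (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title TEXT,
    reader_id INTEGER REFERENCES reader (id)
);
INSERT INTO reader VALUES (1, 'reader 1'), (2, 'reader 2');
INSERT INTO book VALUES (1, 'a', 1), (2, 'b', 1), (3, 'c', 2), (4, 'd', 2);
""")

# SQL analogue of: SELECT r FROM Reader r INNER JOIN r.books b
joined = conn.execute(
    "SELECT r.id FROM reader r "
    "INNER JOIN book b ON b.reader_id = r.id ORDER BY r.id"
).fetchall()

# Same join with DISTINCT.
distinct = conn.execute(
    "SELECT DISTINCT r.id FROM reader r "
    "INNER JOIN book b ON b.reader_id = r.id ORDER BY r.id"
).fetchall()

print(joined)    # [(1,), (1,), (2,), (2,)]
print(distinct)  # [(1,), (2,)]
```

Without DISTINCT the join yields one row per book, which matches the spec wording quoted above.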
C is correct: joins to ToMany relationships should not return duplicates. The JPA provider should automatically use a distinct to filter these out. I believe this is what the spec requires, although it may be one of those less well defined areas of the spec.
If a join fetch is used, then I believe the spec actually requires the duplicates to be returned. Which is odd; I can't see why you would ever want duplicates. If you put a distinct on a join fetch, then they will be filtered (in memory, as all rows need to be selected).
This is how EclipseLink works anyway.
All of the other queries select from Book rather than Reader, so they get the duplicates; C selects from Reader, so it should not get duplicates.