I am working with an ERD. It is supposedly a logical model and I am to make a physical model from it. I should be formatting in UML and our DBMS is PostgreSQL.
Some of my research (http://www.1keydata.com/datawarehousing/data-modeling-levels.html // http://en.wikipedia.org/wiki/Logical_data_model#Conceptual.2C_Logical_.26_Physical_Data_Model) indicates that this ERD may have too much information in it to be a logical model and that it may actually be closer to the physical.
My questions are as follows:
What do the bold labels mean?
What do the white "N"s and red "U"s at the end of some entries mean?
What is the difference between a dashed line (relationship) and a solid one?
What is the difference between the "crows foot" and the broken line on either end of the relationship?
Is this closer to the physical model or logical model? What would I have to do to convert it from one to the other?
This is the ERD:
Could bold text indicate primary key attributes?
That's not part of any standard ER modelling notation. By no means certain but my guess would be U means unique, N means nullable.
A solid line means an identifying relationship. Dashed line means a non-identifying one. It's usually not an especially important distinction but look those terms up if you want to know more.
A one-to-many relationship. The crow's foot represents the "many" side of the relationship; the short line across is the "one" side. Where the "one" symbol appears at both ends, that's a one-to-one relationship.
In the context of information modelling a logical model means a semantic model - a model that's more about the business domain than about an actual database design. Exactly what goes into the logical model and at what level of detail depends a lot on the intended audience for the model and on how you want to use it. Turning it into a "physical" model means making it into a design for a database with the technical features and any changes you would need for your chosen DBMS platform (specific data types for example).
Logical/physical models in the information modelling sense should not be confused with what are termed the logical level and physical level in DBMS architecture and database theory. In principle relational database tables (AKA relation variables) are always "logical" level constructs but in data modelling terms they are part of a so-called "physical" model. That unfortunate choice of modelling terminology is responsible for a lot of confusion and misunderstandings.
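The identifying versus non-identifying distinction mentioned above is easier to see in a physical design. What follows is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the tables (orders, order_line, customer, invoice) and the helper primary_key_columns are hypothetical names, not taken from the ERD in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE orders   (order_id    INTEGER PRIMARY KEY);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);

-- Identifying relationship (solid line): the parent's key is part of
-- the child's primary key, so a line cannot be identified without its order.
CREATE TABLE order_line (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    line_no  INTEGER NOT NULL,
    PRIMARY KEY (order_id, line_no)
);

-- Non-identifying relationship (dashed line): the child has its own key
-- and the parent's key is just an ordinary (here nullable) foreign key column.
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id)
);
""")


def primary_key_columns(table):
    """Return the primary key column names of a table, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk is the
    # 1-based position within the primary key, or 0 for non-key columns.
    key_rows = sorted((r for r in rows if r[5] > 0), key=lambda r: r[5])
    return [r[1] for r in key_rows]
```

In this sketch the foreign key order_id shows up inside the primary key of order_line but not inside the primary key of invoice, which is the whole difference between the two line styles.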
I have an ERD of an e-commerce system with the following entities: Product, Tag, ProductTag, Category, and other entities of course.
I tried to convert it into class diagram as follows:
1- removed the id
2- converted the foreign key into an object of the type I'm referring to (product_id converted into => product: Product)
My question is: is this a good approach to follow for all my entities? Does it satisfy the SOLID principles? I have a presentation in 2 days and I want to be very sure of what I have made; any comment or modification would really be enough. I also chose these tables because they represent one-to-many and many-to-many relationships. Thanks in advance.
Basically your approach is correct. It's just a couple of UML specifications you got wrong.
The label in the middle of the connectors is just the name of the connector. Unless you do some OCL wizardry this name is meaningless. There is a way to adorn it with a black triangle to show the reading direction. This sometimes helps business people to understand how classes are related to each other (see Fig. 11.27 on p. 202 of UML 2.5). But usually you would not use it.
The shared aggregation has no semantics (p. 110 of UML: Indicates that the Property has shared aggregation semantics. Precise semantics of shared aggregation varies by application area and modeler.). So leave out the open diamond. Composite (filled diamond) can be used to show responsibility (when I'm killed I will kill my composites first). Usually it adds too little to be really useful; it only heats up the futile composition discussion.
The navigation direction is incorrect. The association class (AC) in the middle sees both connected classes, so it's shown without any arrow. If you have an additional (directed) association you place it as a lone (extra) connector. In that case put role names towards any end. That makes navigation clearer than just a simple arrow. I myself use arrows only on rough sketches on the drawing board.
P.S. Just noticing that you have operations in your classes that have the same name as the class and take one parameter that is also of the class type. I would guess you intend to show a constructor here. In that case you would make it Classname():Classname and provide only the parameters that are needed for the constructor. Otherwise these operations don't seem to make much sense. Similarly, the CRUD operations seem to work on a list of 'itself', which is also probably not desired. You would have a collection class which handles the base class, where these operations make sense. So to summarize: you would only add getter/setter operations for the (private) properties matching the columns from your table.
P.P.S.: As per Christophe's comment it's a good idea to adorn the class instantiation operation with a <<create>> stereotype which highlights its purpose. See p. 196 of UML 2.5:
This stereotype is part of the standard (see p. 677) and the table on p. 678 states:
Specifies that the designated feature creates an instance of the classifier to which the feature is attached.
On the modeling part of your question, there’s already a perfect answer. For the record, I’d nevertheless like to add a complementary answer on the SOLID part:
Single responsibility: your classes have more than one reason to change, because you may want to change Product for what it is (e.g. add more product-related attributes), but you may also want to change the class to add new getByXxx() operations to find products in the database based on other criteria, independently of what a product really is. So it does not comply.
Open-closed principle: we cannot tell
Liskov substitution principle: in absence of inheritance, this is not relevant. Moreover, you couldn't tell without having precondition, postcondition and invariant constraints.
Interface segregation principle: probably not compliant, because you impose an implicit interface that all inheriting classes would have to provide, even if they don't need it (e.g. products not stored in a database). A first step in the right direction would be to use an interface for the common database operations.
Dependency inversion: we cannot tell, but it probably isn't compliant, because update(), delete(),... probably depend on some specific database, so that you can't switch to another database. With DIP, you'd inject the database into the class that uses it, so that you could at any moment inject another database that offers the same interface.
You didn't ask, but your design seems to correspond to active records. If you want to go for a cleaner, more SOLID design, you should factor out the database-related code into either repositories or table data gateways.
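To make the repository and dependency-injection suggestion concrete, here is a minimal sketch in Python. All names in it (ProductRepository, InMemoryProductRepository, Catalog and their methods) are hypothetical and only for illustration; the question's diagram does not define them, and a real implementation would put a database-backed class behind the same interface.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Product:
    """Plain domain object: it knows nothing about persistence."""
    name: str
    price: float


class ProductRepository(Protocol):
    """Narrow interface for the persistence operations a client needs."""

    def add(self, product: Product) -> None: ...

    def get_by_name(self, name: str) -> Optional[Product]: ...


class InMemoryProductRepository:
    """One possible implementation; a SQL-backed one could replace it."""

    def __init__(self) -> None:
        self._items: Dict[str, Product] = {}

    def add(self, product: Product) -> None:
        self._items[product.name] = product

    def get_by_name(self, name: str) -> Optional[Product]:
        return self._items.get(name)


class Catalog:
    """Depends on the abstraction; the concrete repository is injected."""

    def __init__(self, repository: ProductRepository) -> None:
        self._repository = repository

    def register(self, name: str, price: float) -> Product:
        product = Product(name, price)
        self._repository.add(product)
        return product

    def price_of(self, name: str) -> Optional[float]:
        product = self._repository.get_by_name(name)
        return None if product is None else product.price
```

Under these assumptions Product changes only when the notion of a product changes, and new lookup operations go into the repository, which is roughly what the single responsibility and dependency inversion remarks above are asking for.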
First time posting here as I was told to seek help from this community if I was ever stuck!!
I was recently introduced to databases this semester and I have a hard time grasping the bridge entity that is meant to erase the many-to-many relationships.
The classic example would be the relationship between STUDENT and CLASS;
where STUDENT can be in many CLASSES and a CLASS can have many STUDENTS.
The M-M relationship is fixed by introducing the ENROLL entity. Here we would read: a STUDENT can ENROLL in many CLASSES, and a CLASS may have many STUDENTS ENROLLED in it, however each STUDENT can be ENROLLED in a CLASS only once.
In my case, I tried to fix an M-M relationship issue between PRODUCT and RAW MATERIAL for a pharmaceutical company by introducing an INGREDIENT entity, which looks like this:
RAW MATERIAL 1----M INGREDIENT M----1 PRODUCT
I am not sure if the bridge works out because I have trouble interpreting it like the STUDENT-CLASS example above.
How would you interpret this?
The concept of "bridge" or "associative" entities came from network data modeling and was a way of handling many-to-many binary as well as ternary and higher relationships. Network data modeling is a simple physical data model based on representing entities as records and relationships as references/pointers.
Since the 1970s, the relational model of data has been developed which uses relations (tables) to record relationships between sets of values (which represent business entities, measurements and labels), allowing for the direct representation of many-to-many relationships and ternary and higher relationships.
The entity-relationship model was an attempt to provide more conceptual structure on top of the relational model, by distinguishing entity relations from relationship relations.
My point with the history is that in modern data modeling, we no longer resolve or erase many-to-many (or ternary or higher) relationships (unless you're using an object-relational mapper or framework based on the network data model). Tables with composite keys, consisting of two or more entity keys, directly represent relationships, and allow us to handle attributes on relationships as well, another feature missing from network data modeling.
In your case, it may be useful to add a Quantity attribute on your Ingredient relationship. The interpretation here is that Raw material refers to a type of material rather than a specific piece or selection of raw material. Students have identity, raw materials generally don't.
Note that pharmaceutical companies may well track specific batches of raw materials.
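As a rough illustration of a relationship table with a composite key and an attribute on the relationship, here is a small sketch using Python's built-in sqlite3 module. The table and column names, the quantity_mg unit, the sample rows and the two helper functions are all assumptions made up for the example, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE raw_material (material_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product      (product_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- The relationship itself: its key is the pair of entity keys, so each
-- raw material can appear at most once per product, and quantity_mg is
-- an attribute of the relationship rather than of either entity.
CREATE TABLE ingredient (
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    material_id INTEGER NOT NULL REFERENCES raw_material(material_id),
    quantity_mg REAL    NOT NULL,
    PRIMARY KEY (product_id, material_id)
);

INSERT INTO raw_material VALUES (1, 'paracetamol'), (2, 'starch');
INSERT INTO product      VALUES (10, 'tablet A'), (11, 'tablet B');
INSERT INTO ingredient   VALUES (10, 1, 500.0), (10, 2, 50.0), (11, 2, 80.0);
""")


def materials_in(product_name):
    """List (material name, quantity) pairs for one product."""
    return conn.execute(
        """
        SELECT m.name, i.quantity_mg
        FROM ingredient i
        JOIN product p      ON p.product_id  = i.product_id
        JOIN raw_material m ON m.material_id = i.material_id
        WHERE p.name = ?
        ORDER BY m.name
        """,
        (product_name,),
    ).fetchall()


def add_ingredient(product_id, material_id, quantity_mg):
    """Try to record an ingredient; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ingredient VALUES (?, ?, ?)",
                (product_id, material_id, quantity_mg),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Read this way, a row of ingredient says "this PRODUCT contains this much of this RAW MATERIAL", which mirrors "each STUDENT can be ENROLLED in a CLASS only once": the composite key rejects a second row for the same pair.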
Whenever a proper ER diagram is drawn for a database and then mapped to the relational schema, I was informed that it guarantees 3NF.
Is this claim true?
If not, can anyone provide me a counter example.
Also, please tell me whether any normal form can be claimed to be strictly followed when relational schema is mapped from a perfect ER diagram?
The short answer is no. Depending on the analysis and design approach there could be examples of ER models that appear perfectly sound in ER terms but don't necessarily translate to a relational schema in 3NF. ER modelling and notation is not really expressive enough or formal enough to guarantee that all functional dependencies are correctly enforced in database designs. Experienced database designers are conscious of this and apply other techniques to come up with the "proper" design.
Terry Halpin devised a formal method for database design that guarantees a relational schema satisfying 5th Normal Form (see orm.net). He uses the Object Role Modelling approach, not ER modelling.
The diagram just shows what entities and attributes you have and how entities relate to one-another. Your attributes can violate the normal forms. An ER diagram is just a representation, it does not enforce any rules.
There is nothing about representing a model in an ER diagram that implies satisfaction of 3NF.
The thinking behind the erroneous claim may be based on the idea that when you, for example, convert a repeating group from columns to rows in a child table, or remove partially dependent columns to another table, you are increasing the normal form of your relations. However, the diagrammatic convention doesn't enforce this in any way.
Let's see an example (in Oracle):
CREATE TABLE STUDENT (
    ID                   INTEGER PRIMARY KEY,
    NAME                 VARCHAR2(64) NOT NULL,
    RESIDENCE_STREET     VARCHAR2(64),
    RESIDENCE_CITY       VARCHAR2(64),
    RESIDENCE_PROVINCE   VARCHAR2(64),
    RESIDENCE_POSTALCODE NUMBER(8)
);
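In some countries the postal code uses a prefix to identify the region or province, so RESIDENCE_PROVINCE is functionally dependent on RESIDENCE_POSTALCODE. But RESIDENCE_POSTALCODE is a non-prime attribute, so RESIDENCE_PROVINCE depends on the key ID only transitively (ID -> RESIDENCE_POSTALCODE -> RESIDENCE_PROVINCE). This simple and common example is therefore a "legal" mapping of an ER diagram that is not in 3NF.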
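The practical cost of that transitive dependency is an update anomaly, which a few lines of Python can show. This is only a sketch: the helper fd_holds and the sample rows are made up for illustration, and the rows are simplified to three of the STUDENT columns.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def fd_holds(rows: List[Row], lhs: str, rhs: str) -> bool:
    """Return True if every value of lhs maps to exactly one value of rhs."""
    seen: Dict[Any, Any] = {}
    for row in rows:
        # setdefault stores the first rhs value seen for this lhs value
        # and returns the stored one on later encounters.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


# Illustrative sample data: two students share a postal code.
students = [
    {"id": 1, "postalcode": 28001, "province": "Madrid"},
    {"id": 2, "postalcode": 28001, "province": "Madrid"},
    {"id": 3, "postalcode": 46001, "province": "Valencia"},
]

consistent_before = fd_holds(students, "postalcode", "province")

# Update anomaly: the province is stored once per student, so changing
# it for one row leaves the other row with the same postal code behind.
students[1]["province"] = "Toledo"

consistent_after = fd_holds(students, "postalcode", "province")
key_dependency_holds = fd_holds(students, "id", "province")
```

The primary key constraint is still satisfied after the update, yet the data now contradicts the postal code rule. Moving the postal code and province pair into its own table would store each province once and remove the anomaly.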
I've done DDD for a couple of years now and it's still challenging when it comes to designing Aggregates. That's the fun part of DDD and it makes your head spin. I'm asking this question since I'm the architect on a project and we're in the middle of designing the model. It's an iteration in which the model evolves in parallel with the GUI and requirements gathering together with the customer.
Now to the problem. Our scenario is that we are facing some Aggregates that are growing into very large AR's. I think I'm good at finding Value objects and avoiding the anemic domain model trap. But I've never been in this situation.
One example is that our system should represent a mobile telecom antenna. The antenna is located on a green field. But the antenna can have a shelter with equipment. Antenna can have microwave links, it can have fiber lines in ground, it can have radio elements, it can have power supply. Face it. If Antenna is terminated... all these dependencies are removed as well. Since they are part of the installation (except for the green field :))
But you get the picture. The antenna model is complex... And large ARs are inflexible with regard to concurrency locks, performance, and memory consumption.
After reading Vaughn Vernon's very good paper on effective AR design http://dddcommunity.org/library/vernon_2011/ I realize that we need to start chopping our big ARs up into pieces.
My idea is to do as Vernon suggests and move out, for example, MicrowaveLinks to a separate AR (even if it is not one in reality).
The MicrowaveLink Entity, now an AR, references Antenna by Id. In the MicrowaveLink Entity class we have a value object property that is AntennaId.
Our use cases support this scenario. We rarely list antennas and links together. So loading MicrowaveLinks is possible through a MicrowaveLinkRepository.ListByAntenna(Guid antennaId).
1) Have you done this AR split before and how did you do it?
2) Did you manage to keep this AR --> AR relationship intact through both domain constraints and the DB (we use EF 5 as ORM)?
My optimal goal is to be able to skip an Antenna.Microwaves collection on Antenna, so that Antenna is not aware of Links. The Links are aware of which Antenna they are mounted on.
And on the MicrowaveLink Entity I only want an AntennaId property with, hopefully, a DB constraint that makes sure that the Antenna exists.
I'm aware that I can manually add FK constraints in the Seed method in EF or in the DB directly through T-SQL scripting. But can this relationship be supported in some way by EF5 Code First Fluent mapping?
By the sounds of it you have an Installation AR. When requiring an AR in another you should model the contained AR as only the ID in the container, or as a VO if required.
You need to have hard edges around your ARs.
Back to the Order / OrderLine example :)
An OrderLine seems to 'require' a Product but you shouldn't ever give a Product instance to the OrderLine. Instead only model, say, the ProductName and ProductId as a VO in the OrderLine. Now you have a distinct edge to your Order AR.
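A minimal sketch of that Order / OrderLine edge, written in Python rather than C# purely for illustration. The class ProductRef, the field names and the methods add_line and total_quantity are hypothetical; they are not from the question or from any particular framework.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ProductRef:
    """Value object: just enough of Product to describe an order line."""
    product_id: UUID
    product_name: str


@dataclass(frozen=True)
class OrderLine:
    product: ProductRef  # a reference by identity, never a Product instance
    quantity: int


@dataclass
class Order:
    """Aggregate root: the only way to change its lines is through it."""
    order_id: UUID = field(default_factory=uuid4)
    lines: List[OrderLine] = field(default_factory=list)

    def add_line(self, product: ProductRef, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.lines.append(OrderLine(product, quantity))

    def total_quantity(self) -> int:
        return sum(line.quantity for line in self.lines)
```

The same shape would apply to the antenna case under the question's own assumptions: MicrowaveLink would carry only an AntennaId value object, and loading or saving an Order (or a MicrowaveLink) never drags the other aggregate along.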
Hope that helps somewhat.
We are developing an extension (in a C# .NET environment) for a GIS application, which will have predefined types for modeling real-world objects, starting from GenericObject and going to more specific types like Pipe and Road with their detailed properties and methods like BottomOfPipe, Diameter and so on.
Surely, there will be an Object Model, Interfaces, Inheritance and lots of other essential parts in the TypeLibrary, and by now we have fixed some of them. But as you may know, designing an Object Model is very ambiguous work and, as far as I know, can be done in many different ways, with many different results and weaknesses.
Are there any distinct rules for designing an O.M.: the hierarchy, the way of defining interfaces, abstract classes, coclasses and enums?
Any suggestion, reference or practice?
A couple of good ones:
SOLID
Single responsibility principle
Open/closed principle
Liskov substitution principle
Interface segregation principle
Dependency inversion principle
More information and more principles here:
http://mmiika.wordpress.com/oo-design-principles/
Check out Domain-Driven Design: Tackling Complexity in the Heart of Software. I think it will answer your questions.
what they said, plus it looks like you are modeling real-world entities, so:
restrict your object model to exactly match the real-world entities.
You can use inheritance and components to reduce the code/model, but only in ways that make sense with the underlying domain.
For example, a Pipe class with a Diameter property would make sense, while a DiameterizedObject class (with a Diameter property) with a GeometryType property of GeometryType.Pipe would not. Both models could be made to work, but the former clearly corresponds to the problem domain, while the latter implements an artificial (non-real-world) perspective.
One additional clue: you know you've got the model right when you find yourself discovering new features in the code that you didn't plan from the start - they just 'naturally' fall out of the model. For example, your model may have Pipe and Junction classes (as connectivity adapters) sufficient to solve the immediate problem of (say) joining different-diameter pipes to each other and calculating flow rates, maximum pressures, and structural integrity. You later realize that since you modeled the structural and connectivity properties of the Pipes and Junctions accurately (within the requirements of the domain) you can also create a JungleGym object from connected pipes and correctly calculate how much structural load it will bear.
This is an extreme example, but it should get the point across: correct object models support extension and often manifest beneficial unexpected properties and features (not bugs!).
The Liskov Substitution Principle, often expressed in terms of "is-a".
Many examples of OOP would be better off making use of "has-a" (in C++, private inheritance or explicit composition) rather than public inheritance ("is-a").
Getting inheritance right is hard. Doing so with interfaces (pure virtual classes) is often easier than for base/sub classes.
Check out the "principles" of Object oriented design. These have guidelines for all the questions you ask.
References:
"Object-Oriented Software Construction" by Bertrand Meyer
http://www.objectmentor.com/resources/publishedArticles.html
Check out the "Design Principles" articles at the above site. They are the best references available.
"BottomOfPipe"? Is that another way of saying the depth of the Pipe below the Road?
Any kind of design is difficult and can be done different ways. There are no guarantees that your design will work when you create it.
The advantage that people who design ball bearings and such have is many more years of experience and data to determine what works and what does not. Software doesn't have as much time or hard data.
Here's some advice:
Inheritance means IS-A. If that doesn't hold, don't use inheritance.
A deep hierarchy is probably a sign of trouble.
From Scott Meyers: Make non-leaf classes interfaces or abstract.
Prefer composition to inheritance.
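The advice above can be sketched in a few lines. The example is in Python rather than C# only to keep it short, and the names Polyline, route, diameter_mm and describe are assumptions for illustration; only GenericObject, Pipe and Road come from the question.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Polyline:
    """Reusable geometry component, shared through composition."""
    points: Tuple[Point, ...]

    def length(self) -> float:
        # Sum of the straight-line distances between consecutive points.
        return sum(math.dist(a, b) for a, b in zip(self.points, self.points[1:]))


class GenericObject(ABC):
    """Non-leaf class kept abstract: it cannot be instantiated directly."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def describe(self) -> str: ...


class Pipe(GenericObject):
    """A Pipe IS-A GenericObject and HAS-A route; it is not a Polyline."""

    def __init__(self, name: str, route: Polyline, diameter_mm: float) -> None:
        super().__init__(name)
        self.route = route
        self.diameter_mm = diameter_mm

    def describe(self) -> str:
        return f"Pipe {self.name}, diameter {self.diameter_mm} mm"


class Road(GenericObject):
    """Another leaf class reusing the same component without sharing a base."""

    def __init__(self, name: str, centerline: Polyline, lanes: int) -> None:
        super().__init__(name)
        self.centerline = centerline
        self.lanes = lanes

    def describe(self) -> str:
        return f"Road {self.name}, {self.lanes} lanes"
```

The hierarchy stays two levels deep, and the geometry code is written once without forcing Pipe and Road under an artificial common geometric base class.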