I work in cattle production and I am learning about database design with postgreSQL. Now I am working on an entity attribute relationship model for a database that allows to register the allocation of the pastures in which cattle graze. In the logic of this business an animal can be assigned to several grazing groups during its life. Each grazing group in turn has a duration and is composed of several pastures in which the animals graze according to a rotation calendar. In this way, at a specific time, animals graze in a pasture that is part of a grazing group.
I have a situation in which many grazing groups can be assigned to many animals as well as many pastures. Trying to model this problem I find a fan trap because there are two one-to-many relationships for a single table. According to this, I would like to ask you about how one can deal with this type of relationship in which one entity relates to two others in the form of many-to-many relationships.
I put a diagram on the problem.
model diagram
Thanks
Traditionally, using a link table (the ones you call assignment) between two tables has been the right way to do many-to-many relationships. Other choices include having an ARRAY of animal ids in grazing group, using JSONB fields etc. Those might prove to be problematic later, so I'd recommend going the old way.
If you want to keep track of history, you can add an active boolean field (to the link table probably) to indicate which assignment is current or have a start date and end date for each assignment. This also makes it possible to plan future assignments. To make things easier, make VIEWs showing only current assignment and further VIEWs to show JOINed tables.
Since there's no clear question in your post, I'd just say you are going the right way.
Related
In the Parse.com API reference for Swift on iOS, it is very clear when to use the different kinds of One-to-Many relationships, based on the expected size of the Many side.
But I find it less clear on what kind of Many-to-Many relationships to use when both sides could be very large.
In my case, I have a Charity object that my Users can make small (often one-dollar) contributions to--so each User could conceivably make thousands of these contributions, and each Charity could have thousands of Users making contributions to it.
The Many-to-Many options listed for this kind of thing are Parse Relations, Join Tables, and Arrays, of which the docs explain:
Arrays should be used when the relationship will reliably include under 100 references, which is very clear and helpful guidance that I should not use Arrays.
The docs say Parse Relations could be used, for instance, to connect Books with multiple Authors and Authors with multiple Books--a situation in which a given Book is unlikely to have over 100 Authors, and only rarely will an Author have over 100 Books--so it's unclear if this is appropriate when both sides could be very large, as in my case.
The docs say Join Tables should be used when extra metadata should be attached to each relationship, so for one thing, I don't at present have an explicit need for this, and for another, the docs don't seem to even mention anything about how or if it matters how large each side of the Many-to-Many relationship is.
In the absence of any other information, it looks like I should use Join Tables, but only because the docs don't imply that I shouldn't, and not for the reason the docs say I should.
Which seems like a flimsy rationale.
I would greatly appreciate any guidance anyone can give.
Behind the scenes, when you use Relation, Parse Server automatically creates a Joint Table for you and delivers some APIs for easily managing and fetching its data. So, in terms of performance, it should be very similar.
The downside of the Relation is the impossibility to add new fields to this "Joint Table" it creates. So, if you need, for example, to store the charities that each of the users like, a relation between User and Charity would be a good fit, because you just need to store that the relation exists and do not need to store any extra information.
On the other hand, if you need to store the donations that each user did to each of the charities, I'd create a Joint Table called Donation or UserCharity with a pointer to the User class, a pointer to the Charity class, and the value of the donation. In this case, Relation is not a fit because you need to store the donation value.
Have a question about architecture: I have 2 subjects, DocumentLetter and DocumentOther, both should be approved by managers.
What would be better: to use 2 additional models DocumentLetterApprove and DocumentOtherApprove with entity relations, OR one additional table without relations but contains info about model identity (columns ModelName and ModelID)?
Or another example, attachments for different documents.
Letter, contract - 2 different tables and each should have own attachment.
I can use additional table for each model (for letter and for contract) or create one table with fields field ModelName and ModelID?
Personally, I would favor keeping the separate entities /w the relationships if there is any possibility that the related entities (approvals) could be in any way different depending on what they are applied to. I avoid ambiguously linked tables unless they represent a large 1 to many entity that might be associated to one of a number of other entities.
The problem with using something like a "ParentType" + "ParentId" is that you cannot leverage any form of FK constraint between the related tables. This also means you cannot leverage EF relationships given there will probably be times loading one of documents and wanting to know if it is approved and details from the approval.
If an Approval for the different document types is expected to be identical then I would sooner declare a common Approval table/Entity and put an ApprovalId on each of the document type tables to establish a many-to-1 relationship from the document to the approval.
If an approval is identical and can form a many to many, then a suitable many-to-many relation table DocumentLetter - DocumentLetterApproval (FKs) - Approval (Approval details) can be employed.
If a Letter approval vs. other approval could be different then: DocumentLetter - DocumentLetterApproval (Approval details)
Design decisions like this usually come from considerations around DRY (Don't Repeat Yourself). What advice I can give is that KISS (Keep It Stupidly Simple) should trump DRY, and that DRY should only apply to logic/structure that is proven to be identical. (not merely expected to be identical, or worse, expected to be similar) DRY should be a re-factoring consideration for constant improvement, not an up-front design decision. Coding for DRY too early ends up costing you time when you paint yourself into corners. By keeping code fluid these relationships can be proven, then if they are proven to be identical, re-factored into a single entity. Time is still spent re-factoring, but re-factoring to make code structure better rather than making code worse when having to work around design assumptions.
An example where i might consider an ambiguous loosely linked linked table would be something like File Attachments. I might have several entities that can hold references to 1 or more attachments. Attachments are not something I would need to link to often, but rather through an explicit action that I could fire off an additional query for anyways since I'm not about to pre-load attachment details when loading a document. In this case an attachment table might have a ParentType and ParentId indexed so that I can quickly get details for a particular document or other entity. I would never try to do something like Context.Documents.Include(x => x.Attachments) or the like, there would be no such reference available. Attachments would always be accessed by single document so I would resort to Context.Attachments.Where(x => x.ParentType == ParentTypes.DocumentLetter && x.ParentId == documenLetterId).ToList();
I have experience working on systems that were designed solely with these types of ambiguously linked tables. They are not only extremely slow as they scale to any reasonable size, but they are also extremely error prone as systems evolve and the nature of the relationships change. Records have a tendency to get out of sync with the expected rules.
I've read through a bunch of tutorials to the best of my ability, but I'm still stumped on how to handle my current application. I just can't quite grasp it.
My application is simply a read-only directory that lists employees by their company, department, or sorted in alphabetical order.
I am pulling down JSON data in the form of:
Employee
Company name
Department name
First name
Last name
Job title
Phone number
Company
Company name
Department
Company name
Department name
As you can see, the information here is pretty redundant. I do not have control over the API and it will remain structured this way. I should also add that not every employee has a department, and not every company has departments.
I need to store this data, so that it persists. I have chosen Core Data to do this (which I'm assuming was the right move), but I do not know how to structure the model in this instance. I should add that I'm very new to databases.
This leads me to some questions:
Every example I've seen online uses relationships so that the information can be updated appropriately upon deletion of an object - this will not be the case here since this is read-only. Do I even need relationships for this case then? These 3 sets of objects are obviously related, so I am just assuming that I should structure it this way. If it is still advised to create relationships, then what do I gain out of creating those relationships in a read-only application? (For instance, does it make searching my data easier and cleaner? etc.)
The tutorials I've looked at don't seem to have all of this redundant data. As you can see, "company name" appears as a property in each set of objects. If it would be advised that I create relationships amongst my entities (which are Employee, Company, Department), can someone show me how this should look so that I may get an idea of what to do? (This is of course assuming that I should use relationships in my model.)
And I would imagine that this would be the set of rules:
Each company has many or no departments
Each department has 1 or many employees
Each employee has 1 company and 1 (or no) department
Please let me know if I'm on the right track here. If you need clarification, I will try my best.
Yes, use relationships. Make them bi-directional.
The redundant information in your feed doesn't matter, ignore it. If you received partial data it could be used to build the relationships, but you don't need to use it.
You say this data comes from an API, so it isn't read-only as far as the app is concerned. Worry more about how you're going to use the data in the app than how it comes from the server when designing your data model.
I am designing an app which tracks data on Game objects. Each Game has a name, a date and other attributes. The problem I am having arises because I want the user to be able to add more names (for example) to pick from in the application. (in this case from a UITableView). So the user is presented with a list of names to choose from, and if the one they want is not in the list, they can add one to the list.
My solution is that I currently have a second entity called GameName so that I can show the user a list of those game names to pick from when they are adding a new Game. I just call an NSFetchRequest on all the GameName objects and display them in the UITableView. There doesn't have to be a Game object created yet to do this.
My dilemma is that I want to know if this is a good practice. It seems that if I do it this way, I will end up having a lot of entities with just one attribute, for the sake of allowing the user to pick from and add to a customizable list.
I hope this makes sense. I can clarify anything upon request.
Your approach is fine, and is commonly used in database design. The entity you want to add is called a "domain table" in databases. See this page, in particular this paragraph:
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value—excluding NULL. Reference tables are formally related to other tables in a database by the use of foreign keys.
Of course, you probably want to have an optional relationship between the GameName and Game entities.
This may be a dumb question, but I've always wondered what's the best way to do this.
Suppose we have a database with two tables: Users and Orders (one user can have many orders), and in any OOP language you have two classes to represent those tables User and Order. In the database it's evident that the 'order' will have the 'user' ID because it's a one to many relationship (because one user can have many orders) and the user won't have any order ID. But in code what's the best practice out of the following three?
a) Should the user have an array of Orders?
b) Should the order have the user ID?
c) Should the order have a reference to the user object?
Or are there more efficient ways to tackle this? I've always done it in different ways, they all have both pros and cons, but I've never asked an expert's opinion.
Thanks in advance!
In this instance, the User could have an array of orders if you're performing operations on the User that also involves orders that they own.
Whenever I design my classes, objects that are related contain pointers to each other, so I can access the Orders from the User and the User from an Order.
I don't believe there is a best practice as it really depends on what you're trying to accomplish. With Users and Orders, I could see you starting with an Order and needing to access the User and vice versa; therefore, in your situation it sounds like you should map the objects both ways.
One word of warning, just be careful not to create a circular reference. If you delete both objects without removing the reference, it could create a memory leak.
You are asking about what is known as "object relational mapping" (ORM). I think the best way to learn what you want to learn is to look at some well established ORM libraries [such as ActiveRecord(Ruby) or Hibernate (Java)] and see how they do it.
With that in mind:
a) If the application requires it there should be access to an array (or similar enumeration) of objects representing the users orders through the user object. However this will usually best involve lazy loading (i.e. the orders will usually not be pulled from the database when the user pulled from the database....the orders will be subsequently queried when the application needs access to them). After objects are lazy loaded they can be cached by the ORM to eliminate the need for further queries on that istantiation.
b) Unless for performance reasons you only pull specific columns you're usually going to pull all columns when pulling an order. So it would include the user id.
c) Answer a applies to this as well.