I have two tables/collections: Users and Groups. A user can be a member of any number of groups, and a user can also be an owner of any number of groups. In a relational database I'd probably have a third table called UserGroups with a UserID column, a GroupID column and an IsOwner column.
I'm using MongoDB and I'm sure there is a different approach for this kind of relationship in a document database. Should I embed the list of groups and groups-as-owner inside the Users table as two arrays of ObjectIDs? Should I also store the list of members and owners in the Groups table as two arrays, effectively mirroring the relationship causing a duplication of relationship information?
Or is a bridging UserGroups table a legitimate concept in document databases for many to many relationships?
Thanks
What I've seen done, and what I currently use, is arrays of IDs embedded in each document.
So document user1 has property groups: [id1,id2]
And document group1 has property users: [user1]. Document group2 also has property users: [user1].
This way you get a Group object and easily select all related users, and the same for the User.
This takes a bit more work when creating and updating the object. When you say 2 objects are related, you have to update both objects.
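Here is a minimal sketch of that double update. It uses plain Python dicts as stand-ins for the user and group documents, so `users`, `groups`, `add_member` and `remove_member` are hypothetical names and not part of any MongoDB driver. With a real driver, each side would be a separate update against its own collection.

```python
# In-memory stand-ins for the two collections, keyed by document id.
users = {"user1": {"_id": "user1", "groups": []}}
groups = {
    "group1": {"_id": "group1", "users": []},
    "group2": {"_id": "group2", "users": []},
}


def add_member(user_id, group_id):
    """Record the relationship on both documents (hypothetical helper)."""
    user = users[user_id]
    group = groups[group_id]
    # Guard against duplicates so repeated calls leave the arrays unchanged.
    if group_id not in user["groups"]:
        user["groups"].append(group_id)
    if user_id not in group["users"]:
        group["users"].append(user_id)


def remove_member(user_id, group_id):
    """Remove the relationship from both documents."""
    if group_id in users[user_id]["groups"]:
        users[user_id]["groups"].remove(group_id)
    if user_id in groups[group_id]["users"]:
        groups[group_id]["users"].remove(user_id)


add_member("user1", "group1")
add_member("user1", "group2")
```

If one of the two writes fails, the documents disagree with each other, which is the price of mirroring the relationship on both sides.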
There's also the concept of database references (DBRefs) in MongoDB and, depending on your driver, referenced objects may be pulled automatically when a document is retrieved.
http://www.mongodb.org/display/DOCS/Database+References#DatabaseReferences-DBRef
In case anyone is interested, I just came across a very good article on the MongoDB blog: 6 Rules of Thumb for MongoDB Schema Design. The article has 3 parts; after reading all 3 you'll have a good understanding.
Let's look at many-to-many relationships through some examples:
books to authors
students to teachers
Books to authors is a few-to-few relationship, so we can keep either an array of books or an array of authors inside the other's document. The same goes for students to teachers. We could also embed, at the risk of duplication. However, this requires that each student has a teacher in the system before insertion, and vice versa, which the application logic may not always allow. In other words, the parent object must exist for the child object to exist.
But when you have a true many-to-many relationship, use two collections and link them with references.
I work in cattle production and I am learning about database design with PostgreSQL. I am now working on an entity-relationship model for a database that records the allocation of the pastures in which cattle graze. In the logic of this business, an animal can be assigned to several grazing groups during its life. Each grazing group in turn has a duration and is composed of several pastures in which the animals graze according to a rotation calendar. In this way, at a specific time, animals graze in a pasture that is part of a grazing group.
I have a situation in which many grazing groups can be assigned to many animals as well as many pastures. Trying to model this problem I find a fan trap because there are two one-to-many relationships for a single table. According to this, I would like to ask you about how one can deal with this type of relationship in which one entity relates to two others in the form of many-to-many relationships.
I put a diagram on the problem.
[Image: model diagram]
Thanks
Traditionally, using a link table (the ones you call assignment) between two tables has been the right way to do many-to-many relationships. Other choices include having an ARRAY of animal ids in grazing group, using JSONB fields etc. Those might prove to be problematic later, so I'd recommend going the old way.
If you want to keep track of history, you can add an active boolean field (to the link table probably) to indicate which assignment is current or have a start date and end date for each assignment. This also makes it possible to plan future assignments. To make things easier, make VIEWs showing only current assignment and further VIEWs to show JOINed tables.
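As a rough sketch of the link table with start and end dates plus a "current" view, here is a self-contained example. It uses Python's built-in `sqlite3` module so it runs without a server; the question is about PostgreSQL, where the date functions and types would differ. All table, column and view names are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animal (id INTEGER PRIMARY KEY, tag TEXT NOT NULL);
CREATE TABLE grazing_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Link table: one row per period an animal spends in a grazing group.
CREATE TABLE animal_assignment (
    animal_id  INTEGER NOT NULL REFERENCES animal(id),
    group_id   INTEGER NOT NULL REFERENCES grazing_group(id),
    start_date TEXT NOT NULL,   -- ISO dates, so text comparison works
    end_date   TEXT,            -- NULL means the assignment is still open
    PRIMARY KEY (animal_id, group_id, start_date)
);

-- View showing only the assignments that are current today.
CREATE VIEW current_assignment AS
SELECT a.tag, g.name AS group_name, aa.start_date
FROM animal_assignment aa
JOIN animal a ON a.id = aa.animal_id
JOIN grazing_group g ON g.id = aa.group_id
WHERE aa.start_date <= date('now')
  AND (aa.end_date IS NULL OR aa.end_date >= date('now'));
""")

conn.execute("INSERT INTO animal VALUES (1, 'A-001')")
conn.executemany(
    "INSERT INTO grazing_group VALUES (?, ?)",
    [(1, "North"), (2, "South")],
)
conn.executemany(
    "INSERT INTO animal_assignment VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2000-01-01", "2000-06-30"),  # finished assignment (history)
        (1, 2, "2000-07-01", None),          # open-ended, still current
    ],
)

current = conn.execute(
    "SELECT tag, group_name FROM current_assignment"
).fetchall()
```

A second link table between grazing groups and pastures would follow the same pattern.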
Since there's no clear question in your post, I'd just say you are going the right way.
I've read through a bunch of tutorials to the best of my ability, but I'm still stumped on how to handle my current application. I just can't quite grasp it.
My application is simply a read-only directory that lists employees by their company, department, or sorted in alphabetical order.
I am pulling down JSON data in the form of:
Employee
  Company name
  Department name
  First name
  Last name
  Job title
  Phone number
Company
  Company name
Department
  Company name
  Department name
As you can see, the information here is pretty redundant. I do not have control over the API and it will remain structured this way. I should also add that not every employee has a department, and not every company has departments.
I need to store this data, so that it persists. I have chosen Core Data to do this (which I'm assuming was the right move), but I do not know how to structure the model in this instance. I should add that I'm very new to databases.
This leads me to some questions:
Every example I've seen online uses relationships so that the information can be updated appropriately upon deletion of an object - this will not be the case here since this is read-only. Do I even need relationships for this case then? These 3 sets of objects are obviously related, so I am just assuming that I should structure it this way. If it is still advised to create relationships, then what do I gain out of creating those relationships in a read-only application? (For instance, does it make searching my data easier and cleaner? etc.)
The tutorials I've looked at don't seem to have all of this redundant data. As you can see, "company name" appears as a property in each set of objects. If it would be advised that I create relationships amongst my entities (which are Employee, Company, Department), can someone show me how this should look so that I may get an idea of what to do? (This is of course assuming that I should use relationships in my model.)
And I would imagine that this would be the set of rules:
Each company has many or no departments
Each department has 1 or many employees
Each employee has 1 company and 1 (or no) department
Please let me know if I'm on the right track here. If you need clarification, I will try my best.
Yes, use relationships. Make them bi-directional.
The redundant information in your feed doesn't matter, ignore it. If you received partial data it could be used to build the relationships, but you don't need to use it.
You say this data comes from an API, so it isn't read-only as far as the app is concerned. Worry more about how you're going to use the data in the app than how it comes from the server when designing your data model.
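To give an idea of what the bi-directional relationships buy you, here is a small language-neutral sketch in Python rather than Core Data itself. The class names mirror the three entities in the question; the feed keys (`company`, `department`, `first_name`, `last_name`) and the `build_directory` helper are assumptions, not the real API. In Core Data the inverse side is maintained for you once the relationships are declared in the model.

```python
class Company:
    def __init__(self, name):
        self.name = name
        self.departments = []  # to-many
        self.employees = []    # to-many


class Department:
    def __init__(self, name, company):
        self.name = name
        self.company = company            # to-one
        self.employees = []               # to-many
        company.departments.append(self)  # keep the inverse side in sync


class Employee:
    def __init__(self, first_name, last_name, company, department=None):
        self.first_name = first_name
        self.last_name = last_name
        self.company = company        # to-one, required
        self.department = department  # to-one, optional
        company.employees.append(self)
        if department is not None:
            department.employees.append(self)


def build_directory(feed):
    """Turn flat, redundant employee records into linked objects."""
    companies = {}
    departments = {}
    employees = []
    for rec in feed:
        company = companies.get(rec["company"])
        if company is None:
            company = companies[rec["company"]] = Company(rec["company"])
        department = None
        if rec.get("department"):
            key = (company.name, rec["department"])
            department = departments.get(key)
            if department is None:
                department = departments[key] = Department(rec["department"], company)
        employees.append(
            Employee(rec["first_name"], rec["last_name"], company, department)
        )
    return companies, employees


feed = [
    {"company": "Acme", "department": "Sales", "first_name": "Ann", "last_name": "Lee"},
    {"company": "Acme", "department": None, "first_name": "Bob", "last_name": "Ray"},
    {"company": "Acme", "department": "Sales", "first_name": "Cy", "last_name": "Poe"},
]
companies, employees = build_directory(feed)
```

With the objects linked both ways, "all employees of a company" or "the company of this employee" is a property access, not a search over strings.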
I am new to MongoDB, coming from the traditional SQL relational approach. I am working on a simple “Category has many Products” scenario (C#/.NET), where Category has
List<Product>
My questions are.
Question 1: On Add Product screen I have a drop down for Categories. So on Submit should I
First Insert Product in Products Collection and then
Push this Product in Nested Product of Categories collection.
_categoryCollection.Update(id, Update< Category>.Push…)
Question 2:
Or
Should we not have anything called a “Product Collection” at all? Instead we would have only one Categories collection with nested Products in it, and on submit just push the new Product into the respective Category.
Question 2.1: What if we want to associate a product with a category after the product has been added?
Or
Question 3:
Considering question 1, should we have a CategoryId in the Product entity? Does this make any sense in NoSQL?
I've always found this MongoDB article a good resource for such questions.
http://docs.mongodb.org/ecosystem/use-cases/product-catalog/
The questions you need to ask are: how will the data be accessed? What are my objects and how are they formed? Start with your programming first: create your classes (domain objects) and your access patterns, then worry about Mongo. You'll see that Mongo won't really get in your way. That is what it was meant to do.
So, going back to your scenario: if you know the categories are going to be large in number and need to be tightly controlled and manipulated often, then you could have a second collection for them and reference that collection's _id field from the category field in the product documents. Importantly, the category values themselves should also be stored with each product document, so that reads stay fast by avoiding a second query or a join.
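A sketch of that shape, with plain Python dicts standing in for documents: each product keeps a reference to the category `_id` and also a copy of the category name. The field names and the `make_product` / `rename_category` helpers are hypothetical.

```python
# Stand-in for the categories collection.
categories = {
    "cat-1": {"_id": "cat-1", "name": "Books"},
}


def make_product(product_id, title, category_id):
    """Build a product document that references and copies its category."""
    category = categories[category_id]
    return {
        "_id": product_id,
        "title": title,
        # The reference (_id) plus a denormalized copy of the name.
        "category": {"_id": category["_id"], "name": category["name"]},
    }


def rename_category(category_id, new_name, products):
    """Renaming now has to touch every product holding a copy of the name."""
    categories[category_id]["name"] = new_name
    for product in products:
        if product["category"]["_id"] == category_id:
            product["category"]["name"] = new_name


products = [
    make_product("p-1", "Dune", "cat-1"),
    make_product("p-2", "Emma", "cat-1"),
]

# Reading a product needs no second lookup to show its category name.
label = f'{products[0]["title"]} ({products[0]["category"]["name"]})'
```

The trade-off is the usual one for denormalized data: cheaper reads, more work when a category is renamed.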
Scott
A few considerations can be made here:
If a category has many products but a product cannot belong to more than one category, then
  If the number of products per category is not expected to be very large, then
    Nest products inside the category document
  Else, use a different collection for products and use a field 'categoryId' in them
Else, use a different collection for products and use a field 'categoryId' in them
Nest documents only when they have one definite parent and they are not huge or too many. Otherwise, the parent document will get huge with no way to control its size.
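The two shapes might look roughly like this, using Python dicts as stand-ins for documents. The field names other than 'categoryId' and the `products_in_category` helper are assumptions for illustration.

```python
# Shape 1: products nested inside their single parent category.
# Reasonable only while the list stays small.
category_embedded = {
    "_id": "cat-1",
    "name": "Books",
    "products": [
        {"sku": "b-1", "title": "Dune"},
        {"sku": "b-2", "title": "Emma"},
    ],
}

# Shape 2: a separate products collection, each product pointing
# back at its category through 'categoryId'.
category = {"_id": "cat-1", "name": "Books"}
products = [
    {"_id": "b-1", "title": "Dune", "categoryId": "cat-1"},
    {"_id": "b-2", "title": "Emma", "categoryId": "cat-1"},
    {"_id": "t-1", "title": "Hammer", "categoryId": "cat-2"},
]


def products_in_category(products, category_id):
    """Equivalent of querying the products collection by categoryId."""
    return [p for p in products if p["categoryId"] == category_id]


embedded_titles = [p["title"] for p in category_embedded["products"]]
referenced_titles = [p["title"] for p in products_in_category(products, "cat-1")]
```

In shape 1 the category document grows with every product; in shape 2 it stays the same size no matter how many products reference it.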
Let's say I'm making a social app with a MongoDB database, and I want users to be able to befriend each other. Of course friendship is a mutual relation, and user ids are integers. What would be the best approach?
1. Every user has a list of friend ids. Every time a bond is created/severed, both users' lists have to be updated.
2. Create a join table 'friendship' containing the IDs of 2 users. Every time a bond is created I have to create two entries: 1->2 and 2->1.
3. As no. 2, but always create only 1 bond with the rule lower_usr_id -> higher_usr_id. Assuming there are a lot of people and friendships, wouldn't it save a lot of space and time?
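To make the third option concrete, here is a minimal sketch in Python with an in-memory set standing in for the 'friendship' collection. The helper names are hypothetical. Each bond is stored once as a (lower id, higher id) pair.

```python
# Each friendship is stored once, as (lower_user_id, higher_user_id).
friendships = set()


def edge(a, b):
    """Normalize a pair of user ids so the lower id always comes first."""
    if a == b:
        raise ValueError("a user cannot befriend themselves")
    return (a, b) if a < b else (b, a)


def befriend(a, b):
    friendships.add(edge(a, b))


def unfriend(a, b):
    friendships.discard(edge(a, b))


def are_friends(a, b):
    return edge(a, b) in friendships


def friends_of(user_id):
    """The user can sit on either side of the pair, so check both."""
    return sorted(
        hi if lo == user_id else lo
        for lo, hi in friendships
        if user_id in (lo, hi)
    )


befriend(2, 1)
befriend(1, 3)
befriend(1, 2)  # same bond as (2, 1); nothing new is stored
```

This halves the number of stored entries compared with option 2, but listing a user's friends has to look at both positions of the pair, so in a real database both fields would need to be searchable.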
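It sounds like you're rather unclear about how MongoDB works. Joins aren't central to MongoDB the way they are in a relational database (the aggregation pipeline does have a $lookup stage, but it is not a replacement for relational joins), and if you're trying to use MongoDB like a relational database you're doing it wrong.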
I'm no expert on MongoDB, but I believe there are two common methods of modelling a one-to-many relationship:
Embedding one document inside another
Using references
Embedding a document inside another makes sense where the parent document in some sense "owns" the child document. For instance, in the context of a blogging application, a comment is owned by a post, so it might make sense to embed the comment inside the post.
For your use case, I don't believe that would be appropriate since the relationship is between objects of the same type. It would therefore make sense to record friendships as a reference to another object in the same collection.
Check out this link for further details.
This may be a dumb question, but I've always wondered what's the best way to do this.
Suppose we have a database with two tables: Users and Orders (one user can have many orders), and in any OOP language you have two classes to represent those tables User and Order. In the database it's evident that the 'order' will have the 'user' ID because it's a one to many relationship (because one user can have many orders) and the user won't have any order ID. But in code what's the best practice out of the following three?
a) Should the user have an array of Orders?
b) Should the order have the user ID?
c) Should the order have a reference to the user object?
Or are there more efficient ways to tackle this? I've always done it in different ways, they all have both pros and cons, but I've never asked an expert's opinion.
Thanks in advance!
In this instance, the User could have an array of orders if you're performing operations on the User that also involve the orders they own.
Whenever I design my classes, objects that are related contain pointers to each other, so I can access the Orders from the User and the User from an Order.
I don't believe there is a best practice as it really depends on what you're trying to accomplish. With Users and Orders, I could see you starting with an Order and needing to access the User and vice versa; therefore, in your situation it sounds like you should map the objects both ways.
One word of warning: be careful when creating a circular reference. If you drop both objects without removing the references between them, it could create a memory leak.
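One common way to map both directions without a strong reference cycle is to make the back-reference weak. The sketch below is only an illustration under that assumption; the class layout is made up. Note that Python's garbage collector can reclaim cycles on its own, so the leak is mainly a concern in languages that rely purely on reference counting, but the weak back-reference still makes the ownership explicit.

```python
import weakref


class User:
    def __init__(self, name):
        self.name = name
        self.orders = []  # strong references: the user owns its orders

    def add_order(self, order):
        self.orders.append(order)
        # The order only holds a weak reference back to its user.
        order._user_ref = weakref.ref(self)


class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self._user_ref = None

    @property
    def user(self):
        """Return the owning user, or None if it no longer exists."""
        return self._user_ref() if self._user_ref is not None else None


alice = User("alice")
order = Order(1)
alice.add_order(order)
```

From the caller's side `order.user` and `alice.orders` behave like ordinary two-way navigation.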
You are asking about what is known as "object relational mapping" (ORM). I think the best way to learn what you want to learn is to look at some well established ORM libraries [such as ActiveRecord(Ruby) or Hibernate (Java)] and see how they do it.
With that in mind:
a) If the application requires it, there should be access to an array (or similar enumeration) of objects representing the user's orders through the user object. However, this is usually best done with lazy loading (i.e. the orders are not pulled from the database when the user is pulled from the database; they are queried later, when the application needs access to them). After objects are lazy loaded they can be cached by the ORM to eliminate the need for further queries on that instantiation.
b) Unless for performance reasons you only pull specific columns you're usually going to pull all columns when pulling an order. So it would include the user id.
c) Answer (a) applies to this as well.
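The lazy loading and caching described in (a) can be sketched by hand as follows. This is not how any particular ORM implements it; `FakeDatabase` and `fetch_orders` are hypothetical stand-ins for a real query layer, and the query counter exists only to show when the database is hit.

```python
class FakeDatabase:
    """Stand-in for a real database; counts how many queries are made."""

    def __init__(self, orders_by_user):
        self._orders_by_user = orders_by_user
        self.query_count = 0

    def fetch_orders(self, user_id):
        self.query_count += 1
        return list(self._orders_by_user.get(user_id, []))


class User:
    def __init__(self, user_id, db):
        self.user_id = user_id
        self._db = db
        self._orders = None  # not loaded yet

    @property
    def orders(self):
        # Query on first access only, then reuse the cached list.
        if self._orders is None:
            self._orders = self._db.fetch_orders(self.user_id)
        return self._orders


db = FakeDatabase({7: ["order-1", "order-2"]})
user = User(7, db)           # loading the user does not load the orders
queries_before = db.query_count
first = user.orders          # triggers the query
second = user.orders         # served from the cache
queries_after = db.query_count
```

The cache lives on the `User` instance, so a fresh instance for the same user would query again, which matches the "on that instantiation" caveat above.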