Let's say I'm making a social app with a MongoDB database, and I want users to be able to befriend each other. Of course, friendship is a mutual relation, and user IDs are integers. What would be the best approach?
1. Every user has a list of friend IDs. Every time a bond is created or severed, both users' lists have to be updated.
2. Create a join table 'friendship' containing the IDs of two users. Every time a bond is created, I have to create two entries: 1->2 and 2->1.
3. Same as no. 2, but always create only one entry per bond, using the rule lower_usr_id -> higher_usr_id. Assuming there are a lot of people and friendships, wouldn't that save a lot of space and time?
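The third option (one entry per bond, always stored as lower id -> higher id) can be sketched in a few lines. This is only an illustration in plain Python, with an in-memory set standing in for the friendship collection; the names friendship_key, add_friend, remove_friend, are_friends and friends_of are made up for the example and are not part of any MongoDB API.

```python
# In-memory stand-in for a 'friendship' collection (assumption: a real app
# would store each pair as a document, ideally with a unique index on it).
friendships = set()


def friendship_key(a: int, b: int) -> tuple[int, int]:
    """Return the pair ordered as (lower_id, higher_id)."""
    if a == b:
        raise ValueError("a user cannot befriend themselves")
    return (a, b) if a < b else (b, a)


def add_friend(a: int, b: int) -> None:
    # The same key is produced for (a, b) and (b, a), so no duplicates appear.
    friendships.add(friendship_key(a, b))


def remove_friend(a: int, b: int) -> None:
    friendships.discard(friendship_key(a, b))


def are_friends(a: int, b: int) -> bool:
    return friendship_key(a, b) in friendships


def friends_of(user_id: int) -> list[int]:
    # With one entry per bond, the user may sit in either position of the pair.
    return sorted(
        hi if lo == user_id else lo
        for lo, hi in friendships
        if user_id in (lo, hi)
    )
```

The trade-off shows up in friends_of: storing half as many entries means that listing one user's friends has to check both positions of every pair.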
It sounds like you're rather unclear about how MongoDB works. Joins aren't something that appears in MongoDB, and if you're trying to use MongoDB like a relational database you're doing it wrong.
I'm no expert on MongoDB, but I believe there are two common methods of modelling a one-to-many relationship:
Embedding one document inside another
Using references
Embedding a document inside another makes sense where the parent document in some sense "owns" the child document. For instance, in the context of a blogging application, a comment is owned by a post, so it might make sense to embed the comment inside the post.
For your use case, I don't believe that would be appropriate since the relationship is between objects of the same type. It would therefore make sense to record friendships as a reference to another object in the same collection.
Check out this link for further details.
In the Parse.com API reference for Swift on iOS, it is very clear when to use the different kinds of One-to-Many relationships, based on the expected size of the Many side.
But I find it less clear on what kind of Many-to-Many relationships to use when both sides could be very large.
In my case, I have a Charity object that my Users can make small (often one-dollar) contributions to--so each User could conceivably make thousands of these contributions, and each Charity could have thousands of Users making contributions to it.
The Many-to-Many options listed for this kind of thing are Parse Relations, Join Tables, and Arrays, of which the docs explain:
Arrays should be used when the relationship will reliably include under 100 references, which is very clear and helpful guidance that I should not use Arrays.
The docs say Parse Relations could be used, for instance, to connect Books with multiple Authors and Authors with multiple Books--a situation in which a given Book is unlikely to have over 100 Authors, and only rarely will an Author have over 100 Books--so it's unclear if this is appropriate when both sides could be very large, as in my case.
The docs say Join Tables should be used when extra metadata should be attached to each relationship, so for one thing, I don't at present have an explicit need for this, and for another, the docs don't seem to even mention anything about how or if it matters how large each side of the Many-to-Many relationship is.
In the absence of any other information, it looks like I should use Join Tables, but only because the docs don't imply that I shouldn't, and not for the reason the docs say I should.
Which seems like a flimsy rationale.
I would greatly appreciate any guidance anyone can give.
Behind the scenes, when you use Relation, Parse Server automatically creates a Join Table for you and delivers some APIs for easily managing and fetching its data. So, in terms of performance, it should be very similar.
The downside of the Relation is that it is impossible to add new fields to the "Join Table" it creates. So, if you need, for example, to store the charities that each of the users likes, a relation between User and Charity would be a good fit, because you just need to store that the relation exists and do not need to store any extra information.
On the other hand, if you need to store the donations that each user made to each of the charities, I'd create a Join Table called Donation or UserCharity with a pointer to the User class, a pointer to the Charity class, and the value of the donation. In this case, Relation is not a fit because you need to store the donation value.
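As a rough sketch of what such a Donation join table holds, here is a plain Python model. It is not Parse SDK code: the string ids stand in for pointers to the User and Charity classes, and the sample rows and helper names (total_for_charity, charities_of) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Donation:
    user_id: str     # stands in for a pointer to the User class
    charity_id: str  # stands in for a pointer to the Charity class
    amount: float    # the extra field that a plain Relation could not hold


# Made-up sample rows; each row is one contribution.
donations = [
    Donation("u1", "c1", 1.0),
    Donation("u1", "c2", 1.0),
    Donation("u2", "c1", 5.0),
    Donation("u1", "c1", 2.0),
]


def total_for_charity(charity_id: str) -> float:
    """Sum every donation made to one charity."""
    return sum(d.amount for d in donations if d.charity_id == charity_id)


def charities_of(user_id: str) -> list[str]:
    """Distinct charities a user has donated to."""
    return sorted({d.charity_id for d in donations if d.user_id == user_id})
```

Because each row is its own record, neither side of the relationship has to hold a growing list of the other side.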
Consider an application with users and projects, where users need to be assigned to projects. One user can be assigned to multiple projects, so this is a many-to-many relationship. What is the best way to model such a requirement?
I would like to discuss a few approaches to modelling it:
- Embedded data model
In this approach I embed the user documents inside the project documents.
Advantages : you get all the required data in one API call OR by fetching one single document.
Disadvantages: data duplication, which is OK.
The real problem is that if you update user information (e.g. the user's mobile number or name) from the users screen, the change must also be reflected in all embedded copies of that user. That requires some kind of bulk update query.
But is this the right way ???
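To make the cost of the embedded model concrete, here is a minimal sketch using plain Python dicts in place of project documents. It is not driver code; the sample data and the helper name update_embedded_user are assumptions made for the example.

```python
# Two project documents, each carrying its own embedded copy of user u1.
projects = [
    {"_id": "p1", "name": "Alpha",
     "users": [{"_id": "u1", "name": "Ann", "mobile": "111"}]},
    {"_id": "p2", "name": "Beta",
     "users": [{"_id": "u1", "name": "Ann", "mobile": "111"},
               {"_id": "u2", "name": "Bob", "mobile": "222"}]},
]


def update_embedded_user(projects: list, user_id: str, changes: dict) -> int:
    """Apply changes to every embedded copy of a user.

    Returns the number of copies that were touched.
    """
    touched = 0
    for project in projects:
        for user in project["users"]:
            if user["_id"] == user_id:
                user.update(changes)
                touched += 1
    return touched
```

A single change to one user touches as many copies as the user has projects, which is the duplication cost described above.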
- Embedding object references instead of objects (the normalised approach)
In this case we embed user IDs instead of user objects, so the problem mentioned above goes away, but then we have to make multiple network calls to get the required data, or create a separate relation document as we would in SQL.
Is this the best way ??
We have the same scenario, so I embed the ObjectId and, to fill in the data for clients, populate the user data in the find call:
contract.find({}).populate('user').then(function(){});
There are few hard and fast rules, but usually with many-to-many relationships you would prefer references over embedding. This doesn't mean your data is totally flat/normalized.
For example, you could have a user document with an array of project ids. You could have the reverse for projects.
Think about your queries and how you will structure them. That can give you other hints about how to structure your documents.
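The reference-based layout, with id arrays on both sides, might look roughly like this. The sketch uses plain Python dicts rather than a real driver, and the populate helper is a hypothetical in-memory imitation of what a populate-style call does; the field names project_ids and user_ids are assumptions.

```python
# Each side stores only the ids of the other side.
users = {
    "u1": {"_id": "u1", "name": "Ann", "project_ids": ["p1", "p2"]},
    "u2": {"_id": "u2", "name": "Bob", "project_ids": ["p1"]},
}
projects = {
    "p1": {"_id": "p1", "name": "Alpha", "user_ids": ["u1", "u2"]},
    "p2": {"_id": "p2", "name": "Beta", "user_ids": ["u1"]},
}


def populate(doc: dict, field: str, collection: dict) -> dict:
    """Return a copy of doc with the ids in `field` replaced by documents.

    Ids that no longer exist in the collection are skipped.
    """
    resolved = dict(doc)
    resolved[field] = [collection[i] for i in doc[field] if i in collection]
    return resolved
```

Resolving the ids costs a second lookup, but user data lives in exactly one place, so updating a user never requires touching project documents.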
I am new to MongoDB moving in from traditional SQL relational approach. I am working on a simple “Category has many Products” scenario (c#.Net). Where Category has
List<Product>
My questions are.
Question 1: On Add Product screen I have a drop down for Categories. So on Submit should I
First Insert Product in Products Collection and then
Push this Product in Nested Product of Categories collection.
_categoryCollection.Update(id, Update< Category>.Push…)
Question 2:
Or
We shouldn’t just have anything called “Product Collection”. Instead we should have only one Categories collection with Nested Products in it. And on submit just Push this new Product in respective Category.
Question 2.1: What if we want to associate the product with a category after the product has been added?
Or
Question 3:
Considering question 1: should we have a CategoryId in the Product entity? Does this make any sense in NoSQL?
I've always found this MongoDB article a good resource for such questions.
http://docs.mongodb.org/ecosystem/use-cases/product-catalog/
The questions you need to ask are: how will the data be accessed? What are my objects and how are they formed? Start with your programming first, create your classes (domain objects) and your access patterns, then worry about Mongo. You'll see, Mongo won't really get in your way. That is what it was meant to do.
So, going back to your scenario. If you know the categories are going to be big in number, need to be tightly controlled and manipulated often, then you could have a second collection for them and reference back to that collection's _id field in your category field in the product documents. The important thing is that the category values themselves should also be stored with each product document, so that reads stay fast by avoiding an extra query or the need to join the data.
Scott
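One way to read the advice above is that each product keeps both a reference to the category and a copy of its name. A minimal sketch with plain Python dicts follows; the document shapes and the helper names new_product and rename_category are assumptions, not taken from the linked article.

```python
# Separate 'categories' collection, keyed by _id.
categories = {"c1": {"_id": "c1", "name": "Books"}}


def new_product(name: str, category_id: str) -> dict:
    """Build a product that stores the category reference and its name."""
    category = categories[category_id]
    return {
        "name": name,
        # The reference (_id) plus a denormalised copy of the value, so that
        # reading a product needs no second query.
        "category": {"_id": category["_id"], "name": category["name"]},
    }


def rename_category(category_id: str, new_name: str, products: list) -> None:
    """The price of the copy: a rename must be pushed to every product."""
    categories[category_id]["name"] = new_name
    for product in products:
        if product["category"]["_id"] == category_id:
            product["category"]["name"] = new_name
```

Reads become a single lookup, while the rarer category rename becomes a multi-document update.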
A few considerations can be made here:
If a category has many products but a product cannot belong to more than one category, then
    If the number of products per category is not expected to be very large, then
        Nest products inside the category document
    Else, use a different collection for products and use a field 'categoryId' in them
Else, use a different collection for products and use a field 'categoryId' in them
Nest documents only when they have one definite parent and they are not huge or too many. Otherwise, the parent document will get huge with no way to control its size.
This may be a dumb question, but I've always wondered what's the best way to do this.
Suppose we have a database with two tables: Users and Orders (one user can have many orders), and in any OOP language you have two classes to represent those tables User and Order. In the database it's evident that the 'order' will have the 'user' ID because it's a one to many relationship (because one user can have many orders) and the user won't have any order ID. But in code what's the best practice out of the following three?
a) Should the user have an array of Orders?
b) Should the order have the user ID?
c) Should the order have a reference to the user object?
Or are there more efficient ways to tackle this? I've always done it in different ways, they all have both pros and cons, but I've never asked an expert's opinion.
Thanks in advance!
In this instance, the User could have an array of orders if you're performing operations on the User that also involves orders that they own.
Whenever I design my classes, objects that are related contain pointers to each other, so I can access the Orders from the User and the User from an Order.
I don't believe there is a best practice as it really depends on what you're trying to accomplish. With Users and Orders, I could see you starting with an Order and needing to access the User and vice versa; therefore, in your situation it sounds like you should map the objects both ways.
One word of warning: be careful not to create a circular reference. In languages that manage memory by reference counting, two objects that still point at each other when you drop them may never be freed, which creates a memory leak.
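A common way to map the objects both ways without a strong reference cycle is to make the back-reference weak. The sketch below uses Python's weakref module purely as an illustration: Python itself also has a cycle collector, so the pattern matters most in reference-counted languages. The class layout is an assumption, not something taken from the question.

```python
import weakref


class Order:
    def __init__(self, order_id: int):
        self.order_id = order_id
        self._user_ref = None  # set by User.add_order

    @property
    def user(self):
        # None if the order is unassigned or the user no longer exists.
        return self._user_ref() if self._user_ref is not None else None


class User:
    def __init__(self, name: str):
        self.name = name
        self.orders = []  # strong references: the user owns its orders

    def add_order(self, order: Order) -> None:
        self.orders.append(order)
        # Weak back-reference: the order does not keep the user alive.
        order._user_ref = weakref.ref(self)
```

Navigation works in both directions (user.orders and order.user), but only one direction keeps the other object alive.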
You are asking about what is known as "object relational mapping" (ORM). I think the best way to learn what you want to learn is to look at some well established ORM libraries [such as ActiveRecord(Ruby) or Hibernate (Java)] and see how they do it.
With that in mind:
a) If the application requires it, there should be access to an array (or similar enumeration) of objects representing the user's orders through the user object. However, this will usually best involve lazy loading (i.e. the orders will usually not be pulled from the database when the user is pulled from the database; they will be queried later, when the application needs access to them). After objects are lazy loaded they can be cached by the ORM to eliminate the need for further queries on that instantiation.
b) Unless for performance reasons you only pull specific columns you're usually going to pull all columns when pulling an order. So it would include the user id.
c) Answer a applies to this as well.
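The lazy loading and caching described in (a) can be sketched without any ORM. In this hypothetical Python example, fake_query stands in for a database query and records each call so the caching is visible; real ORMs such as ActiveRecord or Hibernate have their own mechanisms, which this does not reproduce.

```python
class User:
    def __init__(self, user_id: int, load_orders):
        self.user_id = user_id
        self._load_orders = load_orders  # callable standing in for a DB query
        self._orders = None              # nothing loaded yet

    @property
    def orders(self):
        # First access runs the query; later accesses reuse the cached list.
        if self._orders is None:
            self._orders = self._load_orders(self.user_id)
        return self._orders


queries = []  # log of user ids that were actually "queried"


def fake_query(user_id: int) -> list:
    queries.append(user_id)
    return [{"id": 1, "user_id": user_id}, {"id": 2, "user_id": user_id}]
```

Creating a User runs no query at all; the orders are fetched only when user.orders is first read, and each order row still carries its user_id as in (b).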
I have two tables/collections; Users and Groups. A user can be a member of any number of groups and a user can also be an owner of any number of groups. In a relational database I'd probably have a third table called UserGroups with a UserID column, a GroupID column and an IsOwner column.
I'm using MongoDB and I'm sure there is a different approach for this kind of relationship in a document database. Should I embed the list of groups and groups-as-owner inside the Users table as two arrays of ObjectIDs? Should I also store the list of members and owners in the Groups table as two arrays, effectively mirroring the relationship causing a duplication of relationship information?
Or is a bridging UserGroups table a legitimate concept in document databases for many to many relationships?
Thanks
What I've seen done, and what I currently use are embedded arrays with node id's in each document.
So document user1 has property groups: [id1,id2]
And document group1 has property users: [user1]. Document group2 also has property users: [user1].
This way you get a Group object and easily select all related users, and the same for the User.
This takes a bit more work when creating and updating the object. When you say 2 objects are related, you have to update both objects.
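The extra work amounts to two writes per change, one on each document. Here is a minimal sketch with plain Python dicts standing in for the two collections; the helper names join_group and leave_group are invented for the example.

```python
users = {"user1": {"_id": "user1", "groups": []}}
groups = {
    "group1": {"_id": "group1", "users": []},
    "group2": {"_id": "group2", "users": []},
}


def join_group(user_id: str, group_id: str) -> None:
    """Record the membership on both sides, without duplicates."""
    user, group = users[user_id], groups[group_id]
    if group_id not in user["groups"]:
        user["groups"].append(group_id)
    if user_id not in group["users"]:
        group["users"].append(user_id)


def leave_group(user_id: str, group_id: str) -> None:
    """Remove the membership from both sides if it is present."""
    user, group = users[user_id], groups[group_id]
    if group_id in user["groups"]:
        user["groups"].remove(group_id)
    if user_id in group["users"]:
        group["users"].remove(user_id)
```

In a real database these are two separate document updates, so the application has to make sure both succeed or the two arrays will drift apart.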
There's also the concept of DBRefs (database references) in MongoDB; depending on your driver, it will pull referenced objects automatically when retrieving a document.
http://www.mongodb.org/display/DOCS/Database+References#DatabaseReferences-DBRef
In case anyone is interested, I just bumped into a very good article posted on the MongoDB blog: 6 Rules of Thumb for MongoDB Schema Design. The article has 3 parts; after reading all 3 you'll have a good understanding.
Let's understand many-to-many relations with some examples:
books to authors
students to teachers
The books to authors relationship is few-to-few, so we can keep either an array of books or an array of authors inside the other's document. The same goes for students to teachers. We could also embed, at the risk of duplication. However, this requires that each student has a teacher in the system before insertion, and vice versa, and the application logic may not always allow that. In other words, the parent object must exist for the child object to exist.
But when you have a true many-to-many relationship, use two collections and create real links between them.