In a MongoDB environment it is often beneficial to embed data in documents.
So, for example, an Employees collection:
[
  {
    userid: 'someid',
    username: 'user1',
    isManager: true,
    subordinates: [
      {
        userid: 'anotherid',
        username: 'user2',
        isManager: false
      }
    ],
    officeLocation: {
      officeId: 'someofficeid',
      officeName: 'Some Office'
    }
  },
  {
    userid: 'anotherid',
    username: 'user2',
    isManager: false,
    officeLocation: {
      officeId: 'someotherofficeid',
      officeName: 'Some Other Office'
    }
  }
]
And the office document:
[
  {
    officeid: 'someofficeid',
    officeName: 'Some Office'
  },
  {
    officeid: 'someotherofficeid',
    officeName: 'Some Other Office'
  }
]
So let's assume that someone in the company decides that they don't like the name Some Other Office and they want to change it to Some Cool Office.
When they make the change in the office document, how do we know to update all the embedded copies of Some Other Office in the employee documents as well?
It seems that every time you take a piece of data from one document and embed it in another document, the link between the two is broken, and you then have to write separate queries to update the data in every place where you embedded it.
I like the idea of embedded documents rather than storing references, but without some kind of two-way data binding it seems impractical when it comes to updating information.
Is there any way that I would be able to bind the data two ways or is there an easier way to go about modeling my data?
Thanks
It reminds me of traditional RDBMS systems, where you decide whether to normalize or denormalize information. I'm not sure about the binding, but if you need a single source of truth for a piece of information, the best way is never to store it in two different places. So, in your case, it may be better to store the Office information in a separate document and just link to it by id.
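Here is a minimal sketch of that linking approach. It uses plain in-memory arrays in place of real collections, and the names `offices`, `employees` and `withOffice` are illustrative assumptions, not part of any library. Each employee stores only the office id, and the name is resolved at read time, so a rename touches one document.

```javascript
// In-memory stand-ins for the two collections (illustrative data only).
const offices = [
  { officeId: 'someofficeid', officeName: 'Some Office' },
  { officeId: 'someotherofficeid', officeName: 'Some Other Office' }
];

// Employees store only the office id, never a copy of the name.
const employees = [
  { userid: 'someid', username: 'user1', officeId: 'someofficeid' },
  { userid: 'anotherid', username: 'user2', officeId: 'someotherofficeid' }
];

// Hypothetical helper: resolve the reference at read time.
function withOffice(employee) {
  const office = offices.find(o => o.officeId === employee.officeId);
  return { ...employee, officeName: office ? office.officeName : null };
}

// Renaming the office is a single write in a single place.
offices.find(o => o.officeId === 'someotherofficeid').officeName = 'Some Cool Office';

const user2 = withOffice(employees[1]);
console.log(user2.officeName);
```

The cost is an extra lookup per read. Against a real database that would be a second query (or a `$lookup` stage in an aggregation) rather than `Array.prototype.find`.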
I've just started using Mongoose and am a bit confused about how to sort and paginate.
Let's say I'm making a project like Twitter with three schemas: user, post, and post_detail. The user schema contains the user's data; a post is like a Facebook status or a tweet that can be replied to; post_detail holds the replies to a post.
user
var userSchema = mongoose.Schema({
  username: {
    type: String
  },
  full_name: {
    type: String
  },
  age: {
    type: Number
  }
});
post
var postSchema = mongoose.Schema({
  message: {
    type: String
  },
  created_by: {
    type: String
  },
  total_reply: {
    type: Number
  }
});
post_detail
var postDetailSchema = mongoose.Schema({
  post_id: {
    type: String
  },
  message: {
    type: String
  },
  created_by: {
    type: String
  }
});
The relations are user._id = post.created_by, user._id = post_detail.created_by, and post_detail.post_id = post._id.
Say user A makes one post and 1000 other users comment on it. How can we sort the comments by the commenter's username? Users can change their data (full_name and age in this case), so I can't just copy it into post_detail, because it can change at any time. Or should I just put it in post_detail and, if the user changes their data, change post_detail too? If I do that I need to change many rows, because if the same user commented on 100 posts, all of those copies need to be changed too.
The problem is how to sort it; I think if I can sort it, I can paginate it too. Or in this case should I just use an RDBMS instead of NoSQL?
Thanks anyway, really appreciate the help and guidance :))
Welcome to MongoDB.
If you want to do it in the way you describe, just don't go for Mongo.
You are designing the schema around relations rather than documents.
Your design requires joins, and these do not work well in MongoDB because there is no easy or fast way of doing them.
First, I would not create a separate entity for the post details; I would embed them in the Post document as a list.
Regarding your question:
or I just put it on the post_detail and if user change data I just
change the post_detail too?
Yes, that is what you should do. If you want to be able to sort the replies by username, you should denormalize it and include it in the post details.
If I had to design the schema, it would be something like this:
{
"message": "blabl",
"authorId" : "userId12",
"total_reply" : 100,
"replies" : [
{
"message" : "okk",
"authorId" : "66234",
"authorName" : "Alberto Rodriguez"
},
{
"message" : "test",
"authorId" : "1231",
"authorName" : "Fina Lopez"
}
]
}
With this schema and using the aggregation framework, you can sort the comments by username.
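As a rough sketch of what that sorting and pagination could look like: `repliesPage` below is a hypothetical helper that does the work in memory on one post document, and `repliesPipeline` builds an equivalent aggregation pipeline as plain data (it is not executed here, and `postId` is a placeholder). The third reply is made-up sample data.

```javascript
// A post document shaped like the schema above (sample data).
const post = {
  message: 'blabl',
  authorId: 'userId12',
  total_reply: 3,
  replies: [
    { message: 'test', authorId: '1231', authorName: 'Fina Lopez' },
    { message: 'okk', authorId: '66234', authorName: 'Alberto Rodriguez' },
    { message: 'hi', authorId: '777', authorName: 'Carla Ruiz' }
  ]
};

// Hypothetical helper: in-memory equivalent of unwind + sort + skip + limit.
// `page` is zero-based.
function repliesPage(doc, page, pageSize) {
  return [...doc.replies]
    .sort((a, b) => a.authorName.localeCompare(b.authorName))
    .slice(page * pageSize, (page + 1) * pageSize);
}

// The same idea expressed as aggregation stages (plain data only).
function repliesPipeline(postId, page, pageSize) {
  return [
    { $match: { _id: postId } },
    { $unwind: '$replies' },
    { $sort: { 'replies.authorName': 1 } },
    { $skip: page * pageSize },
    { $limit: pageSize },
    { $replaceRoot: { newRoot: '$replies' } }
  ];
}

const firstPage = repliesPage(post, 0, 2).map(r => r.authorName);
console.log(firstPage);
```

The pipeline array would be handed to the collection's aggregate call; the stage order matters, since sorting has to happen before skipping and limiting.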
If you don't like this approach, I would rather go for an RDBMS, as you mentioned.
I would like to create an eCommerce type of database, where I have products and categories for the products, using MongoDB and Mongoose. I am thinking of having two collections, one for products and one for categories. After digging online, I think the category schema should look like this:
var categorySchema = {
_id: { type: String },
parent: {
type: String,
ref: 'Category'
},
ancestors: [{
type: String,
ref: 'Category'
}]
};
I would like to be able to find all the products by category. For example "find all phones." However, the categories may be renamed, updated, etc. What is the best way to implement the product collection? In SQL, a product would contain a foreign key to a category.
A code sample of inserting and finding a document would be much appreciated!
Why not keep it simple and do something like the following?
var product_Schema = {
phones:[{
price:Number,
Name:String,
}],
TV:[{
price:Number,
Name:String
}]
};
Then using projections you could easily return the products for a given key. For example:
db.collection.find({},{TV:1,_id:0},function(err,data){
if (!err) {console.log(data)}
})
Of course, the correct schema design will depend on how you plan to query, insert, and update data, but with MongoDB keeping things simple usually pays off.
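For comparison, here is a sketch that keeps the two collections from the question and uses its ancestors array. It runs on in-memory arrays only; the sample categories and products and the `productsInCategory` helper are assumptions made for illustration.

```javascript
// Categories use the parent/ancestors shape from the question (sample data).
const categories = [
  { _id: 'electronics', parent: null, ancestors: [] },
  { _id: 'phones', parent: 'electronics', ancestors: ['electronics'] },
  { _id: 'smartphones', parent: 'phones', ancestors: ['electronics', 'phones'] }
];

// Each product references one category by _id, much like a foreign key.
const products = [
  { name: 'Basic phone', price: 40, category: 'phones' },
  { name: 'Smart phone', price: 400, category: 'smartphones' },
  { name: 'Television', price: 500, category: 'electronics' }
];

// Hypothetical helper: collect the category and all of its descendants,
// then return every product that points at one of them.
function productsInCategory(categoryId) {
  const ids = categories
    .filter(c => c._id === categoryId || c.ancestors.includes(categoryId))
    .map(c => c._id);
  return products.filter(p => ids.includes(p.category));
}

const phones = productsInCategory('phones').map(p => p.name);
console.log(phones);
```

Because the question's schema uses the category name as `_id`, renaming a category would still mean touching every product and descendant that refers to it. An opaque id plus a separate display-name field (an assumption, not in the question's schema) would avoid that.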
Say that I have a business that represents users who spend a certain amount of time to produce certain quantities of stuff. I want each user to be free to create their own algorithm, or formula, for determining the price that they charge for their work:
Users Collection, with possibly thousands of different users.
{
userId: 'sdf23d23dwew',
price: function(time, qty){
// some algorithm
}
},
{
userId: '23f5gf34f',
price: function(time, qty){
// another algorithm
}
},
{
userId: '7u76565',
price: function(time, qty){
// yet another algorithm
}
},
{
userId: 'w45y65yh4',
price: function(time, qty){
// something else
}
}
//and on and on and on...
Now, JSON doesn't support functions and neither does MongoDB. BUT this use-case of possibly thousands of users, each with the freedom to create their own unique method of determining their own prices, seems to me like being able to store functions inside of their user document would be ideal.
I certainly don't feel like it's a good idea to just store all these thousands of functions in a JS file on the server that somehow gets referenced by a userId when it's needed...
Is there a solution for this case?
I'm using MongoDB. I know that MongoDB isn't relational but information sometimes is. So what's the most efficient way to reference these kinds of relationships to lessen database load and maximize query speed?
Example:
* Tinder-style "matches" *
There are many users in a Users collection. They get matched to each other.
So I'm thinking:
Document 1:
{
_id: "d3fg45wr4f343",
firstName: "Bob",
lastName: "Lee",
matches: [
"ferh823u9WURF",
"8Y283DUFH3FI2",
"KJSDH298U2F8",
"shdfy2988U2Ywf"
]
}
Document 2:
{
_id: "d3fg45wr4f343",
firstName: "Cindy",
lastName: "Doe",
matches: [
"d3fg45wr4f343"
]
}
Would this work OK if there were, say, 10,000 users and you were on Bob's profile page and you wanted to display the firstName of all of his matches?
Any alternative structures that would work better?
* Online Forum *
I suppose you could have the following collections:
Users
Topics
Users Collection:
{
_id: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
}
{
_id: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
}
Topics Collection Version 1
One example document in the Topics Collection:
{
title: "A spider just popped out of the AC",
dateTimeSubmitted: 201408201200,
category: 5,
posts: [
{
message: "I'm going to use a gun.",
dateTimeSubmitted: 201408201200,
author: "d3fg45wr4f343"
},
{
message: "I don't think this would work.",
dateTimeSubmitted: 201408201201,
author: "23qdf3a3fq3fq3"
},
{
message: "It will totally work.",
dateTimeSubmitted: 201408201202,
author: "d3fg45wr4f343"
},
{
message: "ur dumb",
dateTimeSubmitted: 201408201203,
author: "23qdf3a3fq3fq3"
}
]
}
Topics Collection Version 2
One example document in the Topics Collection. The author's avatar and userName are now embedded in the document. I know that:
This is not DRY.
If the author changes their avatar or userName, these changes would need to be applied in the Topics Collection to every post by that author.
BUT it saves the system from querying for all the avatars and userNames via the authors ID every single time this thread is viewed on the client.
{
title: "A spider just popped out of the AC",
dateTimeSubmitted: 201408201200,
category: 5,
posts: [
{
message: "I'm going to use a gun.",
dateTimeSubmitted: 201408201200,
author: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
},
{
message: "I don't think this would work.",
dateTimeSubmitted: 201408201201,
author: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
},
{
message: "It will totally work.",
dateTimeSubmitted: 201408201202,
author: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
},
{
message: "ur dumb",
dateTimeSubmitted: 201408201203,
author: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
}
]
}
So yeah, I'm not sure which is best...
If the data is really many-to-many, i.e. one user can have many matches and can be matched by many others, as in your first example, it is usually best to go with relations.
The main arguments against relations stem from MongoDB not being a relational database, so there are no such things as foreign key constraints or join statements.
The trade-off you have to consider in those many-to-many cases (many being much more than two) is between enforcing the key constraints yourself and managing the possible data inconsistencies across multiple documents (your last example). In most of those cases the relational approach is much more practical than the embedding approach.
Exceptions could be read-often, write-seldom cases. For a (very constructed) example, suppose the matches in your first example were recalculated once a day or so by wiping all previous matches and calculating a list of new ones. In that case the data inconsistencies you would introduce could be acceptable, and the read time you save by embedding the first names of the matches could be an advantage.
But usually, for many-to-many relations, it is best to use a relational approach and make use of array query operators such as {_id: {$in: matches}}, where matches is the user's array of match ids.
In the end it all comes down to how much inconsistency you can live with and how fast you really need to access the data (is it OK for some topics to show the old avatar for a few days if I save half a second of page load time?).
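To make the `$in` lookup concrete, here is a small sketch for the matches example. It builds the filter object and evaluates it against an in-memory array rather than a real collection; the user ids, the extra users, and the helpers `matchesFilter` and `findByIdIn` are all made up for illustration.

```javascript
// In-memory stand-in for the Users collection (illustrative data).
const users = [
  { _id: 'u1', firstName: 'Bob', matches: ['u2', 'u3'] },
  { _id: 'u2', firstName: 'Cindy', matches: ['u1'] },
  { _id: 'u3', firstName: 'Dana', matches: ['u1'] },
  { _id: 'u4', firstName: 'Eve', matches: [] }
];

// Build the filter. Note that $in takes the array of ids itself,
// not an array wrapped in another array.
function matchesFilter(user) {
  return { _id: { $in: user.matches } };
}

// Minimal in-memory evaluation of an { _id: { $in: [...] } } filter.
function findByIdIn(collection, filter) {
  const wanted = new Set(filter._id.$in);
  return collection.filter(doc => wanted.has(doc._id));
}

const bob = users[0];
const names = findByIdIn(users, matchesFilter(bob)).map(u => u.firstName);
console.log(names);
```

So showing Bob's profile takes two reads: one for Bob, and one `$in` query on `_id` for all of his matches at once, which the default `_id` index can serve.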
Edit
The schema design series on the mongodb blog might be a good read for you: part1, part2 and part3
I am creating a blog system in Node.js with mongodb as the db.
I have contents like this: (blog articles):
// COMMENTS SCHEMA:
// ---------------------------------------
var Comments = new Schema({
author: {
type: String
},
content: {
type: String
},
date_entered: {
type: Date,
default: Date.now
}
});
exports.Comments = mongoose.model('Comments',Comments);
var Tags = new Schema({
name: {
type: String
}
});
exports.Tags = mongoose.model('Tags',Tags);
// CONTENT SCHEMA:
// ---------------------------------------
exports.Contents = mongoose.model('Contents', new Schema({
title: {
type: String
},
author: {
type: String
},
permalink: {
type: String,
unique: true,
sparse: true
},
catagory: {
type: String,
default: ''
},
content: {
type: String
},
date_entered: {
type: Date,
default: Date.now
},
status: {
type: Number
},
comments: [Comments],
tags: [Tags]
}));
I am a little new to this type of database; I'm used to MySQL on a LAMP stack.
Basically my question is as follows:
What's the best way to associate the Contents author to a User in the DB?
Also, what's the best way to do the tags and categories?
In MySQL we would have a tags table and a categories table and relate them by keys; I am not sure of the best way of doing it in Mongo.
THANK YOU FOR YOUR TIME!!
Couple of ideas for Mongo:
The best way to associate a user is e-mail address - as an attribute of the content/comment document - e-mail is usually a reliable unique key. MongoDB doesn't have foreign keys or associated constraints. But that is fine.
If you have a registration policy, add user name, e-mail address and other details to the users collection. Then de-normalize the content document with the user name and e-mail. If, for any reason, the user changes the name, you will have to update all the associated contents/comments. But so long as the e-mail address is there in the documents, this should be easy.
Tags and categories are best modelled as two lists in the content document, IMHO.
You can also create two indices on these attributes, if required. It depends on the access patterns and the UI features you want to provide.
You can also add a document which keeps a tag list and a categories list in the contents collection and use $addToSet to add new tags and categories to this document. Then, you can show a combo box with the current tags as a starting point.
As a final point, think through the ways you plan to access the data and then design documents, collections & indices accordingly.
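To illustrate the denormalize-and-update idea above, here is a sketch that uses the e-mail address as the stable key and rewrites every embedded copy of a user's name. It works on in-memory objects only, and the field names `authorEmail` and `authorName` and the `renameUser` helper are assumptions, not part of the schema in the question.

```javascript
// Illustrative contents with the author's name and e-mail denormalized in.
const contents = [
  {
    title: 'First post', authorEmail: 'ann@example.com', authorName: 'Ann',
    comments: [
      { authorEmail: 'bob@example.com', authorName: 'Bob', content: 'Nice' }
    ]
  },
  {
    title: 'Second post', authorEmail: 'bob@example.com', authorName: 'Bob',
    comments: [
      { authorEmail: 'ann@example.com', authorName: 'Ann', content: 'Thanks' },
      { authorEmail: 'bob@example.com', authorName: 'Bob', content: 'Welcome' }
    ]
  }
];

// Hypothetical helper: update every denormalized copy of a user's name,
// matching on e-mail. Returns how many copies were changed.
function renameUser(docs, email, newName) {
  let changed = 0;
  for (const doc of docs) {
    if (doc.authorEmail === email) {
      doc.authorName = newName;
      changed++;
    }
    for (const comment of doc.comments) {
      if (comment.authorEmail === email) {
        comment.authorName = newName;
        changed++;
      }
    }
  }
  return changed;
}

const changed = renameUser(contents, 'bob@example.com', 'Robert');
console.log(changed);
```

The number of copies touched grows with how active the user has been, which is the write cost you accept in exchange for not looking up the user on every read.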
[Update 12/9/11] Was at MongoSV, and Eliot (CTO of 10gen) presented a pattern relevant to this question: instead of one comment document per user (which could grow large), have one comment document per day for a user, with _id = -YYYYMMDD, or even one per month depending on the frequency of comments. This balances index creation and document growth against document proliferation (in the case of the design where there is one comment per user).
The best way to associate the Content authors with Users in MongoDB is to keep an array of authors in the content document, each entry holding a reference to a User. It should be an array because one Content/Book may have multiple authors, i.e. you need to associate one Content with many Users.
The best way to handle categories is to create a separate collection in your DB and, as above, keep an array of references in Contents.
I hope it helps at least a little.