I am creating a blog system in Node.js with MongoDB as the database.
I have content (blog articles) modeled like this:
// COMMENTS SCHEMA:
// ---------------------------------------
var Comments = new Schema({
author: {
type: String
},
content: {
type: String
},
date_entered: {
type: Date,
default: Date.now
}
});
exports.Comments = mongoose.model('Comments',Comments);
var Tags = new Schema({
name: {
type: String
}
});
exports.Tags = mongoose.model('Tags',Tags);
// CONTENT SCHEMA:
// ---------------------------------------
exports.Contents = mongoose.model('Contents', new Schema({
title: {
type: String
},
author: {
type: String
},
permalink: {
type: String,
unique: true,
sparse: true
},
category: {
type: String,
default: ''
},
content: {
type: String
},
date_entered: {
type: Date,
default: Date.now
},
status: {
type: Number
},
comments: [Comments],
tags: [Tags]
}));
I am a little new to this type of database; I'm used to MySQL on a LAMP stack.
Basically my question is as follows:
What's the best way to associate the Contents author with a User in the DB?
Also, what's the best way to handle the tags and categories?
In MySQL we would have a tags table and a categories table and relate them by keys. I am not sure what the most efficient way of doing this is in Mongo.
Thank you for your time!
A couple of ideas for Mongo:
The best way to associate a user is by e-mail address, stored as an attribute of the content/comment document; an e-mail address is usually a reliable unique key. MongoDB doesn't have foreign keys or the associated constraints, but that is fine.
If you have a registration policy, add the user name, e-mail address and other details to the users collection. Then denormalize the content document with the user name and e-mail. If, for any reason, the user changes their name, you will have to update all the associated contents/comments. But so long as the e-mail address is there in the documents, this should be easy.
Tags and categories are best modelled as two lists in the content document, IMHO.
You can also create two indices on these attributes, if required. It depends on the access patterns and the UI features you want to provide.
You can also add a document which keeps a tag list and a categories list in the contents collection and use $addToSet to add new tags and categories to this document. Then, you can show a combo box with the current tags as a starting point.
As a final point, think through the ways you plan to access the data and then design documents, collections and indices accordingly.
[Update 12/9/11] I was at MongoSV, and Eliot (CTO, 10gen) presented a pattern relevant to this question: instead of one comment document per user (which could grow large), have a comment document per day for a user with _id = -YYYYMMDD, or even one per month, depending on the frequency of comments. This balances index creation and document growth against document proliferation (in case of the design where there is one comment per user).
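The denormalized author fields and the tag list described in this answer can be sketched without a database. This is a minimal illustration in plain JavaScript, not a definitive implementation: the `users` and `contents` arrays stand in for collections, and `addToSet` and `renameUser` are hypothetical helpers that only mimic what `$addToSet` and a multi-document update would do in MongoDB.

```javascript
// In-memory stand-ins for the users and contents collections (assumption:
// no real database is involved in this sketch).
const users = [{ name: 'Ann', email: 'ann@example.com' }];
const contents = [
  {
    title: 'First post',
    author: { name: 'Ann', email: 'ann@example.com' }, // denormalized copy
    tags: ['mongodb'],
    categories: ['databases']
  },
  {
    title: 'Second post',
    author: { name: 'Ann', email: 'ann@example.com' },
    tags: [],
    categories: []
  }
];

// Hypothetical helper mimicking the effect of $addToSet:
// append the value only if it is not already in the list.
function addToSet(list, value) {
  if (!list.includes(value)) list.push(value);
  return list;
}

// When a user changes their name, rewrite the denormalized copy in every
// document that carries the same e-mail address.
function renameUser(email, newName) {
  const user = users.find(u => u.email === email);
  if (user) user.name = newName;
  let updated = 0;
  for (const doc of contents) {
    if (doc.author.email === email) {
      doc.author.name = newName;
      updated += 1;
    }
  }
  return updated; // number of content documents touched
}

addToSet(contents[0].tags, 'nodejs');
addToSet(contents[0].tags, 'mongodb'); // already present, so nothing changes
const touched = renameUser('ann@example.com', 'Ann B.');
console.log(contents[0].tags, touched);
```

Against a real collection, the rename would be a single multi-document update filtered on the author's e-mail, which is why keeping the e-mail in every document makes the change easy.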
The best way to associate Content authors with Users in MongoDB is to make the author field an array that keeps references to Users. Use an array because one Content/Book may have multiple authors, i.e. you need to associate one Content with many Users.
The best way to handle categories is to create a separate collection in your DB and, as above, keep an array of references in Contents.
I hope it helps at least a little.
Database noob here using MongoDB. In my program I have users, and the core of my program is the roadmaps that I display. Each user can create roadmaps, save other users' roadmaps, and so on. Each user has fields named savedRoadmaps and createdRoadmaps, which should store the roadmaps. My question is: should I store just the roadmap _ids in the savedRoadmaps and createdRoadmaps fields, or the entire roadmap?
I am asking this because it feels like saving just the _id of the roadmaps can save storage, but it might not come in handy when I have to fetch the data of the user first, then fetch the roadmap using the roadmap ID in the user's savedRoadmap/createdRoadmap field, versus just fetching the user and the savedRoadmap field will already have the roadmap in there.
And btw, is there any sweet and brief database design read out there, please direct me to some if you know any!
For a user, I want a name, email, password and description, of course, and also savedRoadmaps and createdRoadmaps. A user can create unlimited roadmaps and also save as many as he or she wants. For a roadmap, I want a name, category, time_completion, author, date, and a roadmap object which will contain the actual JSON string that I will display with d3. Here's my User and Roadmap Schema right now:
const RoadmapSchema = new Schema({
  author: {
    type: String,
    required: false // Mongoose's validator option is "required", not "require"
  },
  name: {
    type: String,
    required: true
  },
  category: {
    type: String,
    required: true
  },
  time_completion: {
    type: Number,
    required: true
  },
  date: {
    type: Date,
    default: Date.now
  },
  roadmap: {
    type: "object",
    required: true
  }
});
and User Schema:
const UserSchema = new Schema({
name: {
type: String,
required: true
},
email: {
type: String,
required: true
},
password: {
type: String,
required: true
},
date: {
type: Date,
default: Date.now
},
savedRoadmap: {
type: "object",
default: []
},
createdRoadmap: {
type: "object",
default: []
}
});
My question is, inside of the savedRoadmap and createdRoadmap fields of the User schema, should I include just the _id of a roadmap, or should I include the entire json string which represents the roadmap?
There are 3 different data-modeling techniques you can use to design your roadmaps system based on the cardinality of the relationship between users and roadmaps.
In general you need to de-normalize your data model based on the queries that are expected from your application:
One-to-Few: Embed the N side if the cardinality is one-to-few and there is no need to access the embedded object outside the context of the parent object.
One-to-Many: Use an array of references to the N-side objects if the cardinality is one-to-many or if the N-side objects should stand alone for any reason.
One-to-Squillions: Use a reference to the One side in the N-side objects if the cardinality is one-to-squillions.
And btw, is there any sweet and brief database design read out there,
please direct me to some if you know any!
Rules of Thumb for MongoDB Schema Design: Part 1
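The three patterns listed in this answer can be compared side by side as plain document shapes. This is only a sketch under stated assumptions: the arrays stand in for collections, and `resolveSaved`, `eventsFor` and the `events` log are hypothetical names introduced for illustration (the question has no event log).

```javascript
// 1. One-to-few: embed the N side directly in the parent document.
const userWithEmbedded = {
  _id: 'u1',
  name: 'Ann',
  createdRoadmaps: [{ name: 'Learn D3', category: 'web' }]
};

// 2. One-to-many: the parent keeps an array of references (ids only),
//    and the roadmaps live in their own collection.
const roadmaps = [
  { _id: 'r1', name: 'Learn D3', category: 'web' },
  { _id: 'r2', name: 'Learn Go', category: 'backend' }
];
const userWithRefs = { _id: 'u2', name: 'Bob', savedRoadmaps: ['r1', 'r2'] };

// Resolving the references takes a second lookup after the user is loaded.
function resolveSaved(user, allRoadmaps) {
  return allRoadmaps.filter(r => user.savedRoadmaps.includes(r._id));
}

// 3. One-to-squillions: each N-side document points back at its parent.
//    (Hypothetical event log; the parent never stores a growing array.)
const events = [
  { _id: 'e1', roadmap_id: 'r1', action: 'viewed' },
  { _id: 'e2', roadmap_id: 'r1', action: 'saved' },
  { _id: 'e3', roadmap_id: 'r2', action: 'viewed' }
];
function eventsFor(roadmapId, allEvents) {
  return allEvents.filter(e => e.roadmap_id === roadmapId);
}

console.log(resolveSaved(userWithRefs, roadmaps).map(r => r.name));
console.log(eventsFor('r1', events).length);
```

For the roadmap question, the second shape matches "store just the _id": the user document stays small, at the cost of one extra lookup to load the roadmaps themselves.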
I want to create a collection for users' ratings, and I am undecided between two schema structures.
First schema:
var Rating = new mongoose.Schema({
userID: {
type: String,
minlength: 1,
required: true,
trim: true
},
ratings: [{
rate: {
type: Number
}
}]
});
Second schema:
var Rating = new mongoose.Schema({
userID: {
type: String,
required: true,
},
rating: {
type: Number,
required: true
},
});
With the first schema, every rating is pushed into the ratings array; with the second, multiple documents with the same userID are inserted, each containing one rating.
I would like to know which approach is recommended: growing the array, or adding a document each time the user gets a rating.
It depends on the details of your project (there is no single universally good schema).
The first structure is closer to the MongoDB ideology, but do not forget the document size limit (16 MB; GridFS is only a workaround for storing large files). This structure is better if you do not have a large amount of information (items in the ratings field). Because all ratings are in one document, your indexes will be as small as possible (one user, one document).
The second schema is better when you have a large number of ratings (because of the document size limit).
You can also use two collections: one for aggregated data (final results after calculations, something like a cache) and another for detailed information. As mentioned before, the best solution depends on the details of the project.
I recommend reading the article 6 Rules of Thumb for MongoDB Schema Design.
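The two-collection idea from this answer (detailed ratings plus an aggregated, cache-like summary) might look roughly like this. It is a sketch in plain JavaScript under stated assumptions: `ratings` and `summaries` stand in for the two collections, and `addRating` and the `count`, `total` and `average` fields are hypothetical names, not part of either schema in the question.

```javascript
// Detailed collection: one document per rating (the second schema).
const ratings = [];
// Aggregated collection: one summary document per user, used like a cache.
const summaries = new Map();

// Hypothetical helper: store the detailed rating and keep the per-user
// summary up to date in the same step.
function addRating(userID, rating) {
  ratings.push({ userID, rating });
  const summary = summaries.get(userID) || { userID, count: 0, total: 0, average: 0 };
  summary.count += 1;
  summary.total += rating;
  summary.average = summary.total / summary.count;
  summaries.set(userID, summary);
  return summary;
}

addRating('u1', 4);
addRating('u1', 5);
addRating('u2', 3);
console.log(summaries.get('u1').average); // 4.5
```

Reads that only need the average touch one small summary document per user, while the detailed collection can grow without running into the per-document size limit.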
I would like to create an eCommerce type of database where I have products and categories for the products, using MongoDB and Mongoose. I am thinking of having two collections, one for products and one for categories. After digging online, I think the category should look like this:
var categorySchema = {
_id: { type: String },
parent: {
type: String,
ref: 'Category'
},
ancestors: [{
type: String,
ref: 'Category'
}]
};
I would like to be able to find all the products by category. For example "find all phones." However, the categories may be renamed, updated, etc. What is the best way to implement the product collection? In SQL, a product would contain a foreign key to a category.
A code sample of inserting and finding a document would be much appreciated!
Why not keep it simple and do something like the following?
var product_Schema = {
phones:[{
price:Number,
Name:String,
}],
TV:[{
price:Number,
Name:String
}]
};
Then using projections you could easily return the products for a given key. For example:
db.collection.find({},{TV:1,_id:0},function(err,data){
if (!err) {console.log(data)}
})
Of course the correct schema design will be dependent on how you plan on querying/inserting/updating data, but with mongo keeping things simple usually pays off.
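As an alternative that stays closer to the category schema in the question, a product can store just the _id of its category, and "find all phones" becomes a match against that category plus everything that lists it as an ancestor. The following is a plain JavaScript sketch, not the answerer's method: the arrays stand in for collections, and the product fields and `findProductsInCategory` are hypothetical.

```javascript
// Categories follow the shape from the question: a string _id, a parent,
// and the full list of ancestors.
const categories = [
  { _id: 'electronics', parent: null, ancestors: [] },
  { _id: 'phones', parent: 'electronics', ancestors: ['electronics'] },
  { _id: 'smartphones', parent: 'phones', ancestors: ['electronics', 'phones'] }
];

// Hypothetical products: each one stores only the _id of its category,
// much like a foreign key in SQL.
const products = [
  { name: 'Basic phone', price: 30, category: 'phones' },
  { name: 'Smartphone X', price: 500, category: 'smartphones' },
  { name: 'TV 40 inch', price: 300, category: 'electronics' }
];

// "Find all phones": collect the category itself plus every category that
// lists it as an ancestor, then match products against that set of ids.
function findProductsInCategory(categoryId) {
  const ids = new Set(
    categories
      .filter(c => c._id === categoryId || c.ancestors.includes(categoryId))
      .map(c => c._id)
  );
  return products.filter(p => ids.has(p.category));
}

console.log(findProductsInCategory('phones').map(p => p.name));
```

If categories may be renamed, keeping the display name in a separate field from the _id means a rename touches one category document instead of every product that points at it.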
TL;DR: Should you use subdocuments or relational IDs?
This is my PostSchema:
const Post = new mongoose.Schema({
title: {
type: String,
required: true
},
body: {
type: String,
required: true
},
comments: [Comment.schema]
})
And this is my Comment Schema:
const Comment = new mongoose.Schema({
body: {
type: String,
required: true
}
})
In Postgres, I would have a post_id field in Comment, instead of having an array of comments inside Post. I am sure you can do the same in MongoDB but I don't know which one is more conventional. If people use subdocuments over references (and joining tables) in MongoDB, why is that? In other words, why should I ever use subdocuments? If it's advantageous, should I do the same in Postgres as well?
I am answering based on what I understood from your question.
If you keep subdocuments, you don't have to query two collections to get the comments for a specific post.
Let's say we have the following DB structure for posts:
[{
_id:1,
title:'some title',
comments:[
{
...//some fields that belongs to comments
} ,
{
...//some fields that belongs to comments
} ,
...
]
},
{
_id:2,
title:'some title',
comments:[
{
...//some fields that belongs to comments
} ,
{
...//some fields that belongs to comments
} ,
...
]
}]
Now you can query by the _id of the post (1) and get the comments array that belongs to that specific post.
If you keep only the comment ids inside the post, you have to query both collections, which I don't think is a good idea.
EDIT:
If you keep the post id inside each comment record, it helps you track which comment belongs to which post, i.e. when you want to query the comments collection by post id and you only need fields from the comment records.
I think the usual use case is finding which comments a post contains, so keeping comments inside the post gives you the comment fields as well as the fields from the post record.
So how you design your data structure depends entirely on your requirements.
var FamilySchema = new Schema({
members: [String],
indexedOn: {
type: Date,
default: Date.now
},
updatedOn: {
type: Date,
default: Date.now
}
});
As a crude example, I have a Family that has many members, so I use a schema like the one shown above. But there can be THOUSANDS of members in one family and a member can be in ONLY one family. So every time I come across a new member, I have to search to see if he belongs to any Families and if he does, add him. If he doesn't, I have to create a new family and add him.
This seems like an extremely inefficient way to do things. Is there a better design for this sort of use case?
You could use an array and index the field of members.
Or, here's a very common MongoDB modeling technique that avoids using an array (and means that you can have richer structures for a given family member). Create a Family and a FamilyMember. As you said that each family member may only be in one family, you would add a field to the FamilyMemberSchema as a reference to the Family (using ref as shown below).
var FamilySchema = new Schema({
name: String,
indexedOn: {
type: Date,
default: Date.now
},
updatedOn: {
type: Date,
default: Date.now
}
});
var FamilyMemberSchema = new Schema({
name: String,
family_id: { type: Schema.Types.ObjectId, ref: 'Family' }
});
// you might want an index on these fields
FamilyMemberSchema.index({ family_id: 1, name: 1});
// mongoose.model (lowercase "m") registers a schema and returns the model
var Family = mongoose.model('Family', FamilySchema);
var FamilyMember = mongoose.model('FamilyMember', FamilyMemberSchema);
You could then use a query to fetch all Family Members for a particular family:
FamilyMember.find().where('family_id', 'AFAMILYID').exec(/* callback */);
You wouldn't need to use the ref much as using the populate functionality wouldn't be particularly useful in your situation (http://mongoosejs.com/docs/populate.html), but it documents the schema definition better, so I'd use it.
You can use two collections, one for families and the other for members. A field in the members collection can link each member to one family in the other collection (by "_id", for instance).
When you have to add a new element, you can search the "members" collection to check whether the element already exists. An index could help speed up the query.
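The "search members first, then add" flow can be sketched in plain JavaScript. This is an illustration under stated assumptions, not a definitive implementation: the two Maps stand in for the collections, `membersByName` plays the role of an index on the member name, member names are assumed to be unique, and `findOrAddMember` is a hypothetical helper.

```javascript
// In-memory stand-ins for the two collections.
const families = new Map();      // family _id -> family document
const membersByName = new Map(); // plays the role of an index on members.name

let nextFamilyId = 1;

// Returns the existing member if the name is already known. Otherwise it
// creates the member, and also a new family when no existing family id
// is supplied.
function findOrAddMember(name, familyId) {
  const existing = membersByName.get(name);
  if (existing) return existing;

  let id = familyId;
  if (id === undefined || !families.has(id)) {
    id = nextFamilyId++;
    families.set(id, { _id: id });
  }
  const member = { name, family_id: id }; // reference back to the family
  membersByName.set(name, member);
  return member;
}

const alice = findOrAddMember('Alice');               // new member, new family
const bob = findOrAddMember('Bob', alice.family_id);  // joins Alice's family
const again = findOrAddMember('Alice');               // found via the "index"
console.log(alice.family_id, bob.family_id, again === alice, families.size);
```

Because each member carries its own family reference, adding a member never rewrites a family document, however many thousands of members the family has.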