Which approach is better? - mongodb

I want to create a collection for user's rating, I have doubts between 2 structures schemas.
First schema:
var Rating = new mongoose.Schema({
userID: {
type: String,
minlength: 1,
required: true,
trim: true
},
ratings: [{
rate: {
type: Number
}
}]
});
Second schema:
var Rating = new mongoose.Schema({
userID: {
type: String,
required: true,
},
rating: {
type: Number,
required: true
},
});
The first schema will cause that every rating the be pushed into the array of ratings and the second will cause inserting multiple documents of the same userID and each document contains its rating.
I would like to know which approach is recommended between the two, increasing the array or increasing documents each time the user get rating.

It depends on the details of your project (there is no the one super good and universal schema).
The first structure is closer to the MongoDB ideology. But do not forget about the document size limitation (16MB, except if you are using GridFS). This structure is better if you do not have a big amount of information (items in the ratings field). Because all ratings will be in one document it means that your indexes will be optimal small (one user - one document).
The second schema is better for situation when ou have a big amount of ratings (related to the document size limit).
Also you can use two collections. One for aggregated data (final results after calculations, something like as cache) and another for detailed information. As mentioned before - the best solution depends on the details of the project
I recoment you to read this article 6 Rules of Thumb for MongoDB Schema Design

Related

Mongo DB Time Series Performance on just secondary Index

I am building a mongoose model to store survey response data. There are, however, different types of surveys with different response rates. One type of survey has frequent answers (perhaps every few seconds) and data is normally queried in chunks of time, eg from startDate to endDate of the response. However, some surveys only get responses maybe a 20 times a month, and sometimes I would want to get all the data for that survey just based on the survey_id, and, not using any time field constraints.
So my question is, do secondary indexes on time series collections work as well as they would on a non-time series collection?
My model looks like this:
const responseSchema = mongoose.Schema(
{
metaData: {
type: new mongoose.Schema({
survey_id: { type: mongoose.Schema.Types.ObjectId, ref: "survey", required: true },
}),
required: true,
},
createdAt: Date,
answers: { type: Map, of: mongoose.Mixed },
},
{
timeseries: {
timeField: "createdAt",
metaField: "metaData",
granularity: "seconds",
},
}
);
responseSchema.plugin(ts);
responseSchema.index({ "metaData.survey_id": 1, "createdAt": 1 });
I would expect normal querys using the createdAt field as filters to work well, but what if I only query by survey_id and don't use the time field. Will that still work well? or do I get performance degradation by not using the time field with a time series collection.
querys of this collection will always be based on the survey_id

mongoose indexing? grouping?

I'm kinda new to mongoose, and I'm not sure if it's a right term.
what I'm building is a community site (like redit), and I have a schema like below
const postSchema = new mongoose.Schema({
content: {
type: String,
required: true,
},
title: {
type: String,
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
required: true,
ref: 'User',
},
board: {
type: String,
required: true,
enum: ['board1','board2'],
},
created_at: {
type: Date,
default: Date.now,
},
updated_at: {
type: Date,
},
})
there are many kinds of 'board'
and I'm not sure if it can be 'indexed'.
purpose of it is for getting posts faster
for example in sql (assume that board column is indexed)
--> select * from post where board = 'board1' ;
I'm confusing about the terms, need some direction..
Short answer:
You need to create an index on the field board by doing:
db.post.createIndex(
{ board: 1 } ,
{ name: "borad index" }
)
Long answer:
Indexing in mongodb uses memory in order to save running time.
Let's take an example: say you have all words in English in your DB. And you are reading a book and from time to time you need to search for a word to check its meaning.
How would you do that? A dictionary. You'll sort the words alphabetically and then you could easily search for every word you wanted.
Indexing apply the same concept. When you create an index on the field board it takes all its values, sort them and save it in a table (and reference for each entry the full document from your collection).
Now when you search for select * from post where board = 'board1' it first use the memorized table of sorted boards, finds the ones that equal to board1 and then by the reference gives you the full documents that belongs to it. You can continue reading here.

Mongoose product category design?

I would like to create an eCommerce type of database where I have products and categories for the products using Mongodb and Mongoose. I am thinking of having two collections, one for products and one for categories. After digging online, I think the category should be as such:
var categorySchema = {
_id: { type: String },
parent: {
type: String,
ref: 'Category'
},
ancestors: [{
type: String,
ref: 'Category'
}]
};
I would like to be able to find all the products by category. For example "find all phones." However, the categories may be renamed, updated, etc. What is the best way to implement the product collection? In SQL, a product would contain a foreign key to a category.
A code sample of inserting and finding a document would be much appreciated!
Why not keep it simple and do something like the following?
var product_Schema = {
phones:[{
price:Number,
Name:String,
}],
TV:[{
price:Number,
Name:String
}]
};
Then using projections you could easily return the products for a given key. For example:
db.collection.find({},{TV:1,_id:0},function(err,data){
if (!err) {console.log(data)}
})
Of course the correct schema design will be dependent on how you plan on querying/inserting/updating data, but with mongo keeping things simple usually pays off.

Is there a MongoDB maximum bson size work around?

The document I am working on is extremely large. It collects user input from an extremely long survey (like survey monkey) and stores the answers in a mongodb database.
I am unsurprisingly getting the following error
Error: Document exceeds maximal allowed bson size of 16777216 bytes
If I cannot change the fields in my document is there anything I can do? Is there some way to compress down the document, by removing white space or something like that?
Edit
Here is the structure of the document
Schema({
id : { type: Number, required: true },
created: { type: Date, default: Date.now },
last_modified: { type: Date, default: Date.now },
data : { type: Schema.Types.Mixed, required: true }
});
An example of the data field:
{
id: 65,
question: {
test: "some questions",
answers: [2,5,6]
}
// there could be thousands of these question objects
}
One thing you can do is to build your own mongoDB :-). Mongodb is an open source and the limitation about the size of a document is rather arbitrary to enforce a better schema design. You can just modify this line and build it for yourself. Be careful with this.
The most straight forward idea is to have each small question in a different document with a field which reference to its parent.
Another idea is to limit number of documents in the parent. Lets say you limit is N elements then the parent looks like this:
{
_id : ObjectId(),
id : { type: Number, required: true },
created: { type: Date, default: Date.now }, // you can store it only for the first element
last_modified: { type: Date, default: Date.now }, // the same here
data : [{
id: 65,
question: {
test: "some questions",
answers: [2,5,6]
}
}, ... up to N of such things {}
]
}
This way modifying number N you can make sure that you will be in 16 MB of BSON. And in order to read the whole survey you can select
db.coll.find({id: the Id you need}) and then combine the whole survey on the application level. Also do not forget to ensureIndex on id.
Try different things, do a benchmark on your data and see what works for you.
You should be using gridfs. It allows you to store documents in chunks. Here's the link: http://docs.mongodb.org/manual/reference/gridfs/

Node.js - Mongoose/MongoDB - Model Schema

I am creating a blog system in Node.js with mongodb as the db.
I have contents like this: (blog articles):
// COMMENTS SCHEMA:
// ---------------------------------------
var Comments = new Schema({
author: {
type: String
},
content: {
type: String
},
date_entered: {
type: Date,
default: Date.now
}
});
exports.Comments = mongoose.model('Comments',Comments);
var Tags = new Schema({
name: {
type: String
}
});
exports.Tags = mongoose.model('Tags',Tags);
// CONTENT SCHEMA:
// ---------------------------------------
exports.Contents = mongoose.model('Contents', new Schema({
title: {
type: String
},
author: {
type: String
},
permalink: {
type: String,
unique: true,
sparse: true
},
catagory: {
type: String,
default: ''
},
content: {
type: String
},
date_entered: {
type: Date,
default: Date.now
},
status: {
type: Number
},
comments: [Comments],
tags: [Tags]
}));
I am a little new to this type of database, im used to MySQL on a LAMP stack.
Basically my question is as follows:
whats the best way to associate the Contents author to a User in the
DB?
Also, whats the best way to do the tags and categories?
In MYSQL we would have a tags table and a categories table and relate by keys, I am not sure the best and most optimal way of doing it in Mongo.
THANK YOU FOR YOUR TIME!!
Couple of ideas for Mongo:
The best way to associate a user is e-mail address - as an attribute of the content/comment document - e-mail is usually a reliable unique key. MongoDB doesn't have foreign keys or associated constraints. But that is fine.
If you have a registration policy, add user name, e-mail address and other details to the users collection. Then de-normalize the content document with the user name and e-mail. If, for any reason, the user changes the name, you will have to update all the associated contents/comments. But so long as the e-mail address is there in the documents, this should be easy.
Tags and categories are best modelled as two lists in the content document, IMHO.
You can also create two indices on these attributes, if required. Depends on the access patterns and the UI features you want to provide
You can also add a document which keeps a tag list and a categories list in the contents collection and use $addToSet to add new tags and categories to this document. Then, you can show a combo box with the current tags as a starting point.
As a final point, think through the ways you plan to access the data and then design documents, collections & indices accordingly
[Update 12/9/11] Was at MongoSv and Eliot (CTO 10gen) presented a pattern relevant to this question: Instead of one comment document per user (which could grow large) have a comment document per day for a use with _id = -YYYYMMDD or even one per month depending on the frequency of comments. This optimizes index creation/document growth vs document proliferation (in case of the design where there is one comment per user).
The best way to associate the Content Authors to a User in the MongoDB, is to take an array in Author Collection which keeps an reference to User. Basically Array because One Content/Book may have multiple Authors i.e. you need to associate one Content to many Users.
The best way for category is to create a different collection in your DB and similarly as above keep a array in Contents.
I hope it helps at-least a little.