Database design - saving the entire object to a user or just the id of an object?

I'm a database beginner using MongoDB. In my program I have users, and the core of the program is the roadmaps that I display. Each user can create roadmaps and save other users' roadmaps. Each user has two fields, savedRoadmap and createdRoadmap, which should store those roadmaps. My question is: should I store just the roadmap _ids in the savedRoadmap and createdRoadmap fields, or the entire roadmap?
I am asking because storing just the _ids feels like it would save storage, but it costs an extra step: I have to fetch the user first, then fetch each roadmap using the ids in the user's savedRoadmap/createdRoadmap fields. If I stored the entire roadmap, fetching the user would already give me the roadmaps.
And btw, is there any sweet and brief database design read out there, please direct me to some if you know any!
For a user, I want a name, email, password and description, of course, and also savedRoadmap and createdRoadmap. A user can create an unlimited number of roadmaps and save as many as he or she wants. For a roadmap, I want a name, category, time_completion, author, date, and a roadmap object that contains the actual JSON string I will display with d3. Here are my Roadmap and User schemas right now:
const RoadmapSchema = new Schema({
  author: {
    type: String,
    required: false
  },
  name: {
    type: String,
    required: true
  },
  category: {
    type: String,
    required: true
  },
  time_completion: {
    type: Number,
    required: true
  },
  date: {
    type: Date,
    default: Date.now
  },
  // the JSON structure that d3 renders
  roadmap: {
    type: "object",
    required: true
  }
});
and User Schema:
const UserSchema = new Schema({
  name: {
    type: String,
    required: true
  },
  email: {
    type: String,
    required: true
  },
  password: {
    type: String,
    required: true
  },
  date: {
    type: Date,
    default: Date.now
  },
  savedRoadmap: {
    type: "object",
    default: []
  },
  createdRoadmap: {
    type: "object",
    default: []
  }
});
My question is, inside of the savedRoadmap and createdRoadmap fields of the User schema, should I include just the _id of a roadmap, or should I include the entire json string which represents the roadmap?

Which option fits depends on the cardinality of the relationship between users and roadmaps. There are three data-modeling techniques to choose from, and in general you de-normalize your data model based on the queries your application is expected to run:
One-to-Few: embed the N-side objects in the parent if the cardinality is one-to-few and there is no need to access the embedded objects outside the context of the parent object.
One-to-Many: use an array of references to the N-side objects if the cardinality is one-to-many, or if the N-side objects should stand alone for any reason.
One-to-Squillions: store a reference to the One-side in each N-side object if the cardinality is one-to-squillions.
In your case a user can create or save an unlimited number of roadmaps, and a saved roadmap has to exist independently of the users who saved it, so storing the roadmap _ids (the second option) is the better fit.
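To make the "array of references" option concrete, here is a minimal sketch in plain JavaScript. It uses in-memory Maps as stand-ins for the two collections, so the sample data and the helper names (resolveRoadmaps, loadUserWithSaved) are assumptions for illustration, not Mongoose API. With a real database the resolve step would be the second query.

```javascript
// In-memory stand-ins for the roadmaps and users collections (hypothetical sample data).
const roadmaps = new Map([
  ["r1", { _id: "r1", name: "Learn D3", author: "u1", roadmap: { nodes: [] } }],
  ["r2", { _id: "r2", name: "Learn MongoDB", author: "u2", roadmap: { nodes: [] } }],
]);

const users = new Map([
  ["u1", { _id: "u1", name: "Ann", createdRoadmap: ["r1"], savedRoadmap: ["r2"] }],
  ["u2", { _id: "u2", name: "Bob", createdRoadmap: ["r2"], savedRoadmap: [] }],
]);

// Turn a list of roadmap ids into full roadmap documents.
// Ids that no longer exist are skipped.
function resolveRoadmaps(ids) {
  return ids.map((id) => roadmaps.get(id)).filter((doc) => doc !== undefined);
}

// Load a user and replace the saved ids with the referenced documents.
function loadUserWithSaved(userId) {
  const user = users.get(userId);
  if (!user) return null;
  return { ...user, savedRoadmap: resolveRoadmaps(user.savedRoadmap) };
}

// Each roadmap is stored once, so an edit is visible to everyone who saved it.
roadmaps.get("r2").name = "Learn MongoDB (2nd edition)";
const ann = loadUserWithSaved("u1");
console.log(ann.savedRoadmap.map((r) => r.name)); // [ 'Learn MongoDB (2nd edition)' ]
```

The trade-off is visible here: the user document stays small and roadmaps never go stale, at the price of one extra lookup per read.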
As for your other question ("is there any sweet and brief database design read out there"), have a look at "Rules of Thumb for MongoDB Schema Design: Part 1" on the MongoDB blog; the three techniques above come from it.

Related

mongoose indexing? grouping?

I'm kinda new to mongoose, and I'm not sure if "indexing" is the right term.
What I'm building is a community site (like Reddit), and I have a schema like the one below:
const postSchema = new mongoose.Schema({
  content: {
    type: String,
    required: true,
  },
  title: {
    type: String,
    required: true,
  },
  userId: {
    type: mongoose.Schema.Types.ObjectId,
    required: true,
    ref: 'User',
  },
  board: {
    type: String,
    required: true,
    enum: ['board1', 'board2'],
  },
  created_at: {
    type: Date,
    default: Date.now,
  },
  updated_at: {
    type: Date,
  },
})
There are many kinds of 'board', and I'm not sure whether that field can be 'indexed'.
The purpose is to fetch posts faster.
For example, in SQL (assuming the board column is indexed):
--> select * from post where board = 'board1' ;
I'm confused about the terms and need some direction.

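Short answer:
Yes, it can be indexed. Create an index on the board field:
db.post.createIndex(
  { board: 1 },
  { name: "board index" }
)
Replace post with the actual name of your collection. In Mongoose you can declare the same index on the schema with postSchema.index({ board: 1 }).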
Long answer:
An index in MongoDB trades extra memory (and a little extra work on every write) for faster reads.
Take an example: say your database contains every English word. You are reading a book, and from time to time you need to look up a word to check its meaning.
How would you do that? With a dictionary: the words are sorted alphabetically, so you can find any word quickly.
Indexing applies the same idea. When you create an index on the board field, MongoDB takes all the values of that field, sorts them and stores them in a separate structure, where each entry holds a reference to the full document in your collection.
When you then run the equivalent of select * from post where board = 'board1', MongoDB first searches the sorted index for the entries equal to board1 and then follows their references to return the full documents.
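The idea can be sketched in a few lines of plain JavaScript. This is only a simplified model: it uses a Map keyed by the board value, whereas MongoDB keeps its indexes in a sorted B-tree, and the sample posts and function names are made up for illustration.

```javascript
// Hypothetical sample posts; only the board field matters here.
const posts = [
  { _id: 1, title: "a", board: "board1" },
  { _id: 2, title: "b", board: "board2" },
  { _id: 3, title: "c", board: "board1" },
];

// Without an index: inspect every document (a full collection scan).
function findByBoardScan(docs, board) {
  return docs.filter((doc) => doc.board === board);
}

// Build the index once: board value -> positions of the matching documents.
function buildBoardIndex(docs) {
  const index = new Map();
  docs.forEach((doc, position) => {
    if (!index.has(doc.board)) index.set(doc.board, []);
    index.get(doc.board).push(position);
  });
  return index;
}

// With the index: look up the positions, then follow them to the documents.
function findByBoardIndexed(docs, index, board) {
  return (index.get(board) || []).map((position) => docs[position]);
}

const boardIndex = buildBoardIndex(posts);
const viaScan = findByBoardScan(posts, "board1");
const viaIndex = findByBoardIndexed(posts, boardIndex, "board1");
console.log(viaIndex.map((doc) => doc._id)); // [ 1, 3 ]
```

Both functions return the same documents; the indexed version simply avoids touching the posts that belong to other boards, which is where the speed-up on large collections comes from.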

MongoDB / Mongoose schema for users

I start by saying I'm a beginner with mongodb, but I'm developing a webapp just to study this db in more depth.
I can't "think" of a valid solution (at the schema level), which allows me to manage users and the data associated with them in this case:
I have three types of users: "basic user", "user manager" and "supervisor".
The basic user must be able to see only his data, the “user manager“ must be able to see his data and the data of the users under him. In cascade, the supervisor user must be able to see his data, those of the manager and users under him.
Practical example:
User "X" has been assigned tasks. If user "X" has a manager "R", the manager must see his own tasks and the tasks of "X". In cascade, if the manager has a supervisor, the supervisor must be able to see his own tasks, those of "R" and those of "X".
Any advice or example?
Many thanks

You can create one collection to store the user details.
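Example - UserSchema
const { Schema } = require('mongoose');
// Role names; assumed here to be plain strings.
const ROLES = ['EMPLOYEE', 'MANAGER', 'SUPERVISOR'];
const UserSchema = new Schema({
  email: {
    type: String,
    unique: true,
    required: true
  },
  userId: {
    type: String,
    unique: true,
    required: true
  },
  // userId of this user's manager, if any
  managerId: {
    type: String,
    required: false
  },
  // userId of this user's supervisor, if any
  superVisorId: {
    type: String,
    required: false
  },
  name: {
    type: String
  },
  // an array of strings, each of which must be one of ROLES
  roleId: {
    type: [String],
    required: true,
    enum: ROLES,
    default: ['EMPLOYEE']
  }
});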
Here userId is any value that you assign uniquely to each user.
You can then write middleware that checks the requester's roleId on every access. For example, when an EMPLOYEE tries to read the data of a MANAGER, the middleware can reject the request based on the roleId.
You can have another collection for tasks, with a taskSchema:
const TaskSchema = new Schema({
  taskId: {
    type: String,
    required: true,
    unique: true
  },
  taskDescription: {
    type: String
  },
  // Stores the userId (from UserSchema) of the user this task is assigned to.
  // Indexed but not unique, so that one user can have many tasks.
  taskAssignedTo: {
    type: String,
    required: true,
    index: true
  }
});
Now, if a SUPERVISOR wants to see all the tasks they are allowed to access, first fetch all the MANAGERs under that supervisor, then fetch all the EMPLOYEEs under each of those managers, and collect their userIds.
Then fetch all the tasks assigned to any of these userIds, plus the supervisor's own.
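The lookup described above can be sketched in plain JavaScript. The arrays below are in-memory stand-ins for the two collections, and the sample users, tasks and function names are assumptions for illustration. The sketch also assumes that every user stores both managerId and superVisorId, as in the schema above, so a single pass over the users is enough.

```javascript
// Hypothetical sample data using the userId / managerId / superVisorId fields.
const users = [
  { userId: "S", roleId: ["SUPERVISOR"] },
  { userId: "R", roleId: ["MANAGER"], superVisorId: "S" },
  { userId: "X", roleId: ["EMPLOYEE"], managerId: "R", superVisorId: "S" },
  { userId: "Y", roleId: ["EMPLOYEE"] },
];

const tasks = [
  { taskId: "t1", taskAssignedTo: "X" },
  { taskId: "t2", taskAssignedTo: "R" },
  { taskId: "t3", taskAssignedTo: "S" },
  { taskId: "t4", taskAssignedTo: "Y" },
];

// Ids of every user whose data viewerId may see: the viewer,
// plus anyone who names the viewer as manager or supervisor.
function visibleUserIds(viewerId) {
  const ids = new Set([viewerId]);
  for (const user of users) {
    if (user.managerId === viewerId || user.superVisorId === viewerId) {
      ids.add(user.userId);
    }
  }
  return ids;
}

// Task ids assigned to any user the viewer may see.
function visibleTasks(viewerId) {
  const ids = visibleUserIds(viewerId);
  return tasks
    .filter((task) => ids.has(task.taskAssignedTo))
    .map((task) => task.taskId);
}

console.log(visibleTasks("X")); // [ 't1' ]
console.log(visibleTasks("R")); // [ 't1', 't2' ]
console.log(visibleTasks("S")); // [ 't1', 't2', 't3' ]
```

Against a real database, visibleUserIds would become one query on the users collection and visibleTasks one query on the tasks collection that matches taskAssignedTo against the collected ids.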

Can't update or query embedded sub-documents using MongoDB? Now what?

I took the NoSQL plunge against all my RDBMS prejudices from my past. But I trusted. Now I find myself 3 months into a project, and the exact reasons we adhered to RDBMS principles seem to be biting me in the butt. I think I just discovered here on Stack Overflow that I can't work with twice-embedded arrays. I followed the NoSQL embedded-document approach like a good kool-aid drinker and feel like I've been betrayed. Before I swear off NoSQL and go back and refactor my entire code base to adhere to a new 'normalized' model, I'd like to hear from some NoSQL champions.
Here is my model, using one big document with embedded docs and the works:
var mongoose = require('mongoose'),
    Schema = mongoose.Schema,
    User = mongoose.model('User');
var Entry = new Schema({
  text: String,
  ups: Number,
  downs: Number,
  rankScore: Number,
  posted: {
    type: Date,
    default: Date.now
  },
  postedBy: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});
var boardSchema = new Schema({
  theme: String,
  created: {
    type: Date,
    default: Date.now
  },
  owner: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  },
  entered: {
    type: Boolean,
    default: false
  },
  entries: [Entry],
  participants: [{
    user: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
    date: { type: Date, default: Date.now },
    topTen: [ { type: mongoose.Schema.Types.ObjectId, ref: 'Entry' } ]
  }]
});
mongoose.model('Board', boardSchema);
Basically, I want to query the document by Board._id and then, where participants.user == req.user.id, add an entry to the topTen[] array. Note that participants[] is an array within the document and topTen is an array within each element of participants[]. I've found other similar questions, but I was pointed to a Jira item which suggests that the $ positional operator will not be supported across multiple embedded arrays. Is there no way to do this now? Or, if anyone has a suggestion for how to model my document so that I don't have to do a full rewrite with a new normalized reference model, please help!
Here are some of my query attempts from what I could find online. Nothing worked for me.
Board.update(
  { _id: ObjectId('56910eed15c4d50e0998a2c9'), 'participants.user._id': ObjectId('56437f6a142974240273d862') },
  { $set: { 'participants.0.topTen.$.entry': ObjectId('5692eafc64601ceb0b64269b') } }
)
I have read that you should avoid such 'nested' designs, but with the embedded model it's hard not to. Basically, that advice says to me: don't embed, use refs.
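For reference, the intended operation can be written down as a plain JavaScript sketch on an in-memory document. The sample board and the addToTopTen helper are hypothetical, not Mongoose API. One observation that may help: the new value is appended to topTen rather than matched inside it, so only the participants array needs a positional match. Under that assumption a single $ should be enough, for example a filter on 'participants.user' (the schema stores the ObjectId directly, not a sub-document with its own _id) combined with $addToSet on 'participants.$.topTen'. Treat that as a suggestion to verify against your MongoDB version, not a confirmed fix.

```javascript
// Hypothetical board document shaped like boardSchema (ids shortened to strings).
const board = {
  _id: "b1",
  participants: [
    { user: "u1", topTen: ["e1"] },
    { user: "u2", topTen: [] },
  ],
};

// Add entryId to the topTen of the participant whose user matches userId.
// Returns true when the document was modified.
function addToTopTen(boardDoc, userId, entryId) {
  const participant = boardDoc.participants.find((p) => p.user === userId);
  if (!participant) return false;                         // not a participant
  if (participant.topTen.includes(entryId)) return false; // already listed
  participant.topTen.push(entryId);
  return true;
}

console.log(addToTopTen(board, "u2", "e5")); // true
console.log(board.participants[1].topTen);   // [ 'e5' ]
```

The duplicate check mirrors what $addToSet does on the server; with $push the same entry could be added twice.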

How to properly design a Mongo Schema to keep elements that belong together - together?

var FamilySchema = new Schema({
  members: [String],
  indexedOn: {
    type: Date,
    default: Date.now
  },
  updatedOn: {
    type: Date,
    default: Date.now
  }
});
As a crude example, I have a Family that has many members, so I use a schema like the one shown above. But there can be THOUSANDS of members in one family and a member can be in ONLY one family. So every time I come across a new member, I have to search to see if he belongs to any Families and if he does, add him. If he doesn't, I have to create a new family and add him.
This seems like an extremely inefficient way to do things. Is there a better design for this sort of use case?

You could use an array and index the field of members.
Or, here's a very common MongoDB modeling technique that avoids using an array (and means that you can have richer structures for a given family member). Create a Family and a FamilyMember. As you said that each family member may only be in one family, you would add a field to the FamilyMemberSchema as a reference to the Family (using ref as shown below).
var FamilySchema = new Schema({
  name: String,
  indexedOn: {
    type: Date,
    default: Date.now
  },
  updatedOn: {
    type: Date,
    default: Date.now
  }
});
var FamilyMemberSchema = new Schema({
  name: String,
  family_id: { type: Schema.Types.ObjectId, ref: 'Family' }
});
// you might want an index on these fields
FamilyMemberSchema.index({ family_id: 1, name: 1 });
// register the models (mongoose.model, with a lowercase "m")
var Family = mongoose.model('Family', FamilySchema);
var FamilyMember = mongoose.model('FamilyMember', FamilyMemberSchema);
You could then use a query to fetch all Family Members for a particular family:
FamilyMember.find().where('family_id', 'AFAMILYID').exec(/* callback */);
You wouldn't need the ref very often, since populate (http://mongoosejs.com/docs/populate.html) isn't particularly useful in your situation, but it documents the schema definition better, so I'd use it.

You can use two collections, one for families and another for members. Each document in the members collection has a field that links it to one family in the other collection (by its "_id", for instance).
When you have to add a new element, you can search the members collection to check whether it already exists. An index can help to speed up that query.
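The "look the member up first, create what is missing" flow can be sketched in plain JavaScript. The two Maps are in-memory stand-ins for the families and members collections, with memberByName playing the role of an index on the member name. All names and sample values are assumptions for illustration.

```javascript
// In-memory stand-ins for the two collections.
const families = new Map();     // family _id -> family document
const memberByName = new Map(); // member name -> member document (acts as the index)
let nextFamilyId = 1;

// Return the family id for name. If the member is new, add it to familyId,
// or to a freshly created family when no existing family is given.
function upsertMember(name, familyId) {
  const existing = memberByName.get(name);
  if (existing) return existing.family_id; // already belongs to a family

  let id = familyId;
  if (id === undefined || !families.has(id)) {
    id = nextFamilyId++;
    families.set(id, { _id: id });
  }
  memberByName.set(name, { name, family_id: id });
  return id;
}

// All members of one family: a lookup on the reference field, not a huge array.
function membersOf(familyId) {
  return [...memberByName.values()]
    .filter((member) => member.family_id === familyId)
    .map((member) => member.name);
}

const smiths = upsertMember("John Smith");  // creates a new family
upsertMember("Jane Smith", smiths);         // joins the same family
const again = upsertMember("John Smith");   // found, nothing is created
console.log(smiths === again, families.size, membersOf(smiths));
```

Because each member document carries its own family reference, a family with thousands of members never turns into one very large document.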

Node.js - Mongoose/MongoDB - Model Schema

I am creating a blog system in Node.js with mongodb as the db.
I have contents like this: (blog articles):
// COMMENTS SCHEMA:
// ---------------------------------------
var Comments = new Schema({
  author: {
    type: String
  },
  content: {
    type: String
  },
  date_entered: {
    type: Date,
    default: Date.now
  }
});
exports.Comments = mongoose.model('Comments', Comments);
var Tags = new Schema({
  name: {
    type: String
  }
});
exports.Tags = mongoose.model('Tags', Tags);
// CONTENT SCHEMA:
// ---------------------------------------
exports.Contents = mongoose.model('Contents', new Schema({
  title: {
    type: String
  },
  author: {
    type: String
  },
  permalink: {
    type: String,
    unique: true,
    sparse: true
  },
  catagory: {
    type: String,
    default: ''
  },
  content: {
    type: String
  },
  date_entered: {
    type: Date,
    default: Date.now
  },
  status: {
    type: Number
  },
  comments: [Comments],
  tags: [Tags]
}));
I am a little new to this type of database; I'm used to MySQL on a LAMP stack.
Basically, my questions are as follows:
What's the best way to associate the Contents author with a User in the DB?
Also, what's the best way to do the tags and categories?
In MySQL we would have a tags table and a categories table and relate them by keys. I am not sure what the best and most optimal way of doing it in Mongo is.
THANK YOU FOR YOUR TIME!!

Couple of ideas for Mongo:
The best way to associate a user is e-mail address - as an attribute of the content/comment document - e-mail is usually a reliable unique key. MongoDB doesn't have foreign keys or associated constraints. But that is fine.
If you have a registration policy, add user name, e-mail address and other details to the users collection. Then de-normalize the content document with the user name and e-mail. If, for any reason, the user changes the name, you will have to update all the associated contents/comments. But so long as the e-mail address is there in the documents, this should be easy.
Tags and categories are best modelled as two lists in the content document, IMHO.
You can also create two indices on these attributes, if required. Depends on the access patterns and the UI features you want to provide
You can also add a document which keeps a tag list and a categories list in the contents collection and use $addToSet to add new tags and categories to this document. Then, you can show a combo box with the current tags as a starting point.
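As a rough illustration of the two ideas above (tags stored as a list on the content document, plus one registry document that collects every tag in use), here is a plain JavaScript sketch. The registry document, the sample article and the helper names are assumptions. The local addToSet function only imitates the "add if not already present" behavior that the $addToSet update operator provides on the server.

```javascript
// Hypothetical registry document holding every tag and category seen so far.
const registry = { _id: "registry", tags: ["nodejs"], categories: ["tutorial"] };

// Add a value to a list only if it is not in the list yet.
function addToSet(list, value) {
  if (!list.includes(value)) list.push(value);
  return list;
}

// Store the tags on the article itself and record them in the registry.
function tagArticle(article, tags) {
  article.tags = [];
  for (const tag of tags) {
    addToSet(article.tags, tag);
    addToSet(registry.tags, tag);
  }
  return article;
}

const article = tagArticle({ title: "Schemas" }, ["mongodb", "nodejs", "mongodb"]);
console.log(article.tags);  // [ 'mongodb', 'nodejs' ]
console.log(registry.tags); // [ 'nodejs', 'mongodb' ]
```

The registry list is what you would use to fill the combo box of existing tags.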
As a final point, think through the ways you plan to access the data and then design documents, collections & indices accordingly
[Update 12/9/11] I was at MongoSV, where Eliot (CTO of 10gen) presented a pattern relevant to this question: instead of one comment document per user (which could grow large), keep one comment document per user per day, with an _id that combines the user and the date (-YYYYMMDD), or even one per month, depending on the frequency of comments. This balances index size and document growth against document proliferation (as in a design with one document per comment).
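A minimal sketch of this bucketing pattern in plain JavaScript follows. The exact key format is an assumption: the code uses the user id followed by -YYYYMMDD in UTC. The Map stands in for the comments collection, and the function names are made up for illustration.

```javascript
// Build the bucket key for one user and one day (assumed format: "<userId>-YYYYMMDD", UTC).
function bucketId(userId, date) {
  const y = date.getUTCFullYear();
  const m = String(date.getUTCMonth() + 1).padStart(2, "0");
  const d = String(date.getUTCDate()).padStart(2, "0");
  return `${userId}-${y}${m}${d}`;
}

// In-memory stand-in for the comments collection: bucket _id -> bucket document.
const buckets = new Map();

// Append a comment to the bucket for (user, day), creating the bucket on first use.
function addComment(userId, content, date) {
  const id = bucketId(userId, date);
  if (!buckets.has(id)) buckets.set(id, { _id: id, comments: [] });
  buckets.get(id).comments.push({ content, date_entered: date });
  return id;
}

addComment("alice", "first", new Date(Date.UTC(2011, 11, 9, 10, 0)));
addComment("alice", "second", new Date(Date.UTC(2011, 11, 9, 18, 30)));
addComment("alice", "third", new Date(Date.UTC(2011, 11, 10, 9, 0)));
console.log([...buckets.keys()]); // [ 'alice-20111209', 'alice-20111210' ]
```

Switching to monthly buckets would only mean dropping the day part of the key.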

The best way to associate Content authors with Users in MongoDB is to keep an array of references to User documents in the Contents document. It is an array because one Content/Book may have multiple authors, i.e. you need to associate one Content with many Users.
For categories, the best way is to create a separate collection in your DB and, as above, keep an array of references in Contents.
I hope it helps at least a little.
