Node, MongoDB, Mongoose Design Choice - Creating two collections or one collection

I'm struggling with a large design choice for my application's mongo collections and mongoose schemas.
My application calls for two account types: Students and Teachers.
The only similarity between the two account types is that they both require the fields: firstName, lastName, email, and password. Other than that, they are different (teachers have "assignments", "tests", students have "homework", etc.)
I have pondered my options extensively, and considered the following design choices:
1. Use mongoose-schema-extend and create an "abstract" schema for all accounts. Then extend this schema to create the Teacher and Student schemas. This implies two collections, and therefore some redundant fields. There are also issues with logging in and account creation (checking whether the email used to log in is a student email or a teacher email, etc.).
2. Create one collection, "accounts", and add a type field to indicate whether the account is a "student" or a "teacher". This implies that entries in the "accounts" collection will be dissimilar. It also requires two mongoose schemas for a single collection.
3. Create an "accounts" collection with a "type" field and an "accountId" field, in addition to a "student" collection and a "teacher" collection. The "type" field indicates which collection holds the student-specific or teacher-specific fields, and the "accountId" field indicates exactly which entry the account is matched with.
I appreciate all input, criticism or suggestions.

I've been down a similar road and I eventually landed on a mix of option 1 and 2.
mongoose-schema-extend simply modifies the prototype of Schema with an #extend() method which, when invoked, performs a deep copy of the passed schema. Most helpful. However, you can control which collection mongoose saves to in MongoDB by passing a collection option to the Schema:
var Schema = require('mongoose').Schema;
var schema = new Schema({
    foo: String,
    bar: Boolean
}, { collection: "FooBarBaz" }); // documents are stored in the "FooBarBaz" collection
Remember: Mongoose understands the concept of a Schema but MongoDB does not. This means you can store dissimilar data and use your custom business logic to control the mess. With that said, you can create a base model called User, force mongoose to use the same collection by using the collection option and then extend off this base model to make your Teachers and Students models.
Make sure you add a type flag in the base model as you suggested in option 2. Not only is this convenient for quick lookups, but it will be critical when working commando with raw MongoDB data.
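As a rough illustration of why the type flag helps, here is a minimal in-memory sketch in plain JavaScript, with an array standing in for the shared collection. The names accounts, createAccount and findByEmail are hypothetical and are not Mongoose APIs; the sketch only shows that login can query one place and then branch on the account kind.

```javascript
// In-memory stand-in for one shared "accounts" collection (not a real Mongoose model).
const accounts = [];

// Fields every account shares, per the question.
function baseFields({ firstName, lastName, email, password }) {
  return { firstName, lastName, email, password };
}

// Hypothetical helper: store a document with a type flag plus type-specific fields.
function createAccount(type, base, extra) {
  if (accounts.some((a) => a.email === base.email)) {
    throw new Error("email already registered: " + base.email);
  }
  const doc = { ...baseFields(base), type, ...extra };
  accounts.push(doc);
  return doc;
}

// Login needs a single lookup, regardless of account type.
function findByEmail(email) {
  return accounts.find((a) => a.email === email) || null;
}

createAccount("teacher",
  { firstName: "Ada", lastName: "L", email: "ada@example.com", password: "hash1" },
  { assignments: [], tests: [] });
createAccount("student",
  { firstName: "Bob", lastName: "K", email: "bob@example.com", password: "hash2" },
  { homework: [] });

console.log(findByEmail("bob@example.com").type); // student
```

Mongoose also provides discriminators (Model.discriminator()) for keeping several models in one collection behind a type key, which may be worth comparing against the schema-extend approach.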

@jibsales has an excellent solution.
One more solution to consider is using Population with references http://mongoosejs.com/docs/populate.html from the Users collection to the Student and Teacher collections. Some benefits are:
Entries in each of the three collections (Users, Teachers, Students) are similar in storage.
Allows you to obtain the fields for the "User" independently of obtaining the fields for the referenced collection.
This would require that the schema is modified before an instance is created (and a model is created from the schema), where refType is the desired collection:
// refType must already hold the name of the model to reference.
var userSchema = new Schema({
    _id : Number,
    name : String,
    age : Number,
    stories : [{ type: Schema.Types.ObjectId, ref: refType }]
});
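The reference idea can be sketched without a database. In the plain JavaScript below, each user record stores which collection its profile lives in and the profile's id, and a hypothetical populateProfile helper resolves it. The field names profileKind and profileId and the sample data are assumptions made for illustration, not part of Mongoose.

```javascript
// In-memory stand-ins for the Students and Teachers collections; all names are illustrative.
const collections = {
  students: new Map([[1, { homework: ["essay"] }]]),
  teachers: new Map([[7, { assignments: ["quiz"], tests: [] }]]),
};

// The Users collection keeps only shared fields plus a reference.
const users = [
  { _id: 100, name: "Bob", profileKind: "students", profileId: 1 },
  { _id: 101, name: "Ada", profileKind: "teachers", profileId: 7 },
];

// Hypothetical helper mimicking what population does: return a copy of the
// user with the referenced document attached, leaving the stored user untouched.
function populateProfile(user) {
  const target = collections[user.profileKind];
  if (!target) throw new Error("unknown collection: " + user.profileKind);
  const profile = target.get(user.profileId) || null;
  return { ...user, profile };
}

const populated = populateProfile(users[1]);
console.log(populated.name, Object.keys(populated.profile));
```

The user fields stay readable on their own, and the type-specific fields are fetched only when populateProfile is called.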

Related

MongoDB lookup ObjectID when the collection it belongs to is unknown?

The software I am currently working with can only run aggregate queries or simple find_one's. I am new to MongoDB, so I am having difficulty figuring out whether I can do what I would like to do.
The Question:
Is it possible to run a lookup query on an object id when that object id may be in one of many collections?
The setup:
I have a main collection whose documents are essentially arrays of other ObjectIDs that apply to that object. This collection (call it Main_Config) consists of three ObjectIDs:
Client
General_Config
Role_Config
The Client, General_Config, and Main_Config can all have an enforced schema. I would like the Role_Config to also have an enforced schema. This is where the issue comes into play: the Role_Config may take 3 or more possible schemas. My idea was to create a collection for every possible schema; however, if I do this I will not know which collection the Role_Config ObjectID belongs to. Is there a way to look up an ObjectID that may exist in one of many collections?
There is no findInAnyCollection() type of function. In your model you will have to manually code a loop and look it up.
One approach: In your main config collection, we have docs with this field:
otherIds = [ {coll: "ROLE", key: "5fb8057f08c09fb8dfe8d310"}, {coll: "GENERAL", key: "GENERAL_72f2b2922ed98800bd0e"}, ...]
Putting it all together:
// Reset the three demo collections.
db.AA.drop();
db.BB.drop();
db.CC.drop();
// Each entry in otherIds names the target collection and the key within it.
db.AA.insert({_id:0, otherIds: [ {coll:"BB", key:0}, {coll:"BB", key:1}, {coll:"CC", key:2}]});
db.BB.insert({_id:0, foo:"bar", baz:"bin"});
db.BB.insert({_id:1, foo:"ion", baz:"kjlkj"});
db.BB.insert({_id:2, foo:"POPPO", baz:"UHUH"});
db.CC.insert({_id:0, data: "wfwefw"});
db.CC.insert({_id:1, data: "jj"});
db.CC.insert({_id:2, data: "mm"});
// Resolve each reference with one findOne per entry.
doc = db.AA.findOne();
doc['otherIds'].forEach(function(item) {
    var other = db[item['coll']].findOne({_id:item['key']});
    printjson(other);
});

Moving from relational db to mongodb

I have a question on best practises or ideal way how I should store the data in the database. As an example I have a Site that has a Country assigned.
Table Countries: id|name|alpha2
Table Sites: id|countryId|name
Each Site has a reference to the country ID.
I would like to create a new website using Meteor and its mongodb and was wondering how I should store the objects. Do I create the collections "countries" and "sites" and use the country _id as a reference? Then resolve the references using transform?
Looking at SimpleSchema I came up with the following:
Schemas.Country = new SimpleSchema({
    name: {
        type: String
    },
    alpha2: {
        type: String,
        max: 2
    }
});
Schemas.Site = new SimpleSchema({
    name: {
        type: String,
        label: "Site Name"
    },
    country: {
        type: Schemas.Country
    }
});
Countries = new Meteor.Collection("countries");
Countries.attachSchema(Schemas.Country);
Sites = new Meteor.Collection("sites");
Sites.attachSchema(Schemas.Site);
I was just wondering how this is then stored in the db. As I have 2 collections but inside the sites collection I do have defined country objects as well. What if a country changes its alpha2 code (very unlikely)?
Also this would continue where I have a collection called "conditions". Each condition will have a Site defined. I could now define the whole Site object into the condition object. What if the Sitename changes? Would I need to manually change it in all condition objects?
This confuses me a bit. I am very thankful for all your thoughts.
The challenge with Meteor is that it's tightly bound to Mongo, which is not well suited to building OLTP apps that require a normalized DB design. Mongo is good for OLAP kinds of apps, which fall in the WORM (Write Once Read Many) category. I would like to see Meteor support OrientDB as it does Mongo.
There can be two approaches:
1. Normalize the DB as we do in an RDBMS and then retrieve the data by hitting the database multiple times. Here is a good article explaining this approach - reactive joins in meteor. Joins in Meteor are suggested in future. You can also try the Meteor packages publish composite or publish with relations.
2. Keep data de-normalized at least partially (for a 1-N relation you can embed things in the document; for an N-N relation you may have a separate collection). For instance, 'Student' can be embedded in 'Class' as a student will never be in more than 1 class, but to relate 'Student' and 'Subject', they can be in different collections (N-N relation - a student will have more than one subject and each subject will be taken by more than one student). For fetching an N-N relation you can again use the approach mentioned in the point above.
I am not able to give you exact code example, but I hope it helps.
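For the Sites and Countries case from the question, the normalized approach could look roughly like the plain JavaScript sketch below, with arrays standing in for the two collections. The helper joinSites and the sample documents are hypothetical; in Meteor the same join would typically happen in a transform or helper after both collections are published.

```javascript
// Normalized data: each site stores only the country's _id.
const countries = [
  { _id: "c1", name: "Germany", alpha2: "DE" },
  { _id: "c2", name: "France", alpha2: "FR" },
];
const sites = [
  { _id: "s1", name: "Berlin Plant", countryId: "c1" },
  { _id: "s2", name: "Lyon Depot", countryId: "c2" },
];

// Hypothetical read-time join: index countries once, then attach one to each site.
function joinSites(siteDocs, countryDocs) {
  const byId = new Map(countryDocs.map((c) => [c._id, c]));
  return siteDocs.map((s) => ({ ...s, country: byId.get(s.countryId) || null }));
}

// A change to a country is made in one place and shows up on the next join.
countries[0].name = "Federal Republic of Germany";
const joined = joinSites(sites, countries);
console.log(joined[0].name + " -> " + joined[0].country.name);
```

With this layout a renamed country or site never has to be patched inside other documents, at the cost of one extra lookup per read.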

MongoDB schema embedding and nested unique keys

I have a relational SQL DB that's being changed to MongoDB. In SQL there are 3 tables that are relevant: Farm, Division, Wombat (names and purpose changed for this question). There's also a Farmer table which is the equivalent of a users table.
Using Mongoose I've come up with this new schema:
var mongoose = require('mongoose');
var farmSchema = new mongoose.Schema({
    // reference to the farmer collection's _id key
    farmerId: mongoose.Schema.ObjectId,
    name: String, // name of farm
    division: [{
        divisionId: mongoose.Schema.ObjectId,
        name: String,
        wombats: [{
            wombatId: mongoose.Schema.ObjectId,
            name: String,
            weight: Number
        }]
    }]
});
Each of the (now) nested collections has a unique field in it. This will allow me to use Ajax to send just the uniqueId and the weight (for example) to adjust that value instead of updating the entire document when only the weight changes.
This feels like an incorrect SQL adaptation for MongoDB. Is there a better way to do this?
In general, I believe that people tend to embed way too much when using MongoDB.
The most important argument is that having different writers to the same objects makes things a lot more complicated. Working with arrays and embedded objects can be tricky and some modifications are impossible, for instance because there's no positional operator matching in nested arrays.
For your particular scenario, take note that unique array keys might not behave as expected, and that behavior might change in future releases.
It's often desirable to opt for a simple SQL-like schema such as
Farm {
    _id : ObjectId("...")
}
Division {
    _id : ObjectId("..."),
    FarmId : ObjectId("..."),
    ...
}
Wombat {
    _id : ObjectId("..."),
    DivisionId : ObjectId("..."),
    ...
}
Whether embedding is the right approach or not very much depends on usage patterns, data size, concurrent writes, etc. - a key difference to SQL is that there is no one right way to model 1:n or n:n relationships, so you'll have to carefully weigh the pros and cons for each scenario. In my experience, having a unique ID is a pretty strong indicator that the document should be a 'first-class citizen' and have its own collection.
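To make the trade-off concrete, here is a small plain JavaScript sketch comparing the "update one wombat's weight" operation in both layouts. Arrays and objects stand in for collections and documents, and the helper names and sample data are hypothetical.

```javascript
// Flat, SQL-like layout: every wombat is its own document with a parent id.
const wombats = [
  { _id: "w1", divisionId: "d1", name: "Wally", weight: 20 },
  { _id: "w2", divisionId: "d1", name: "Wanda", weight: 25 },
];

// With a top-level _id, the update is a direct lookup on one document.
function setWombatWeight(collection, wombatId, weight) {
  const doc = collection.find((w) => w._id === wombatId);
  if (!doc) return false;
  doc.weight = weight;
  return true;
}

// Embedded layout from the question: the same change walks two nested arrays.
const farm = {
  name: "Farm A",
  division: [{
    divisionId: "d1",
    name: "North",
    wombats: [{ wombatId: "w1", name: "Wally", weight: 20 }],
  }],
};

function setEmbeddedWombatWeight(farmDoc, wombatId, weight) {
  for (const division of farmDoc.division) {
    for (const wombat of division.wombats) {
      if (wombat.wombatId === wombatId) {
        wombat.weight = weight;
        return true;
      }
    }
  }
  return false;
}

console.log(setWombatWeight(wombats, "w2", 27), setEmbeddedWombatWeight(farm, "w1", 22));
```

In the flat layout the Ajax request only needs the wombat's id and the new weight; in the embedded layout the whole farm document is the unit being modified, which is where concurrent writers start to interfere with each other.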

Meteor: Embed documents inside a document or separate them into each collection object and link them?

I have run into a scenario where I am asking myself whether to put each entity (a Classroom has many Students) into a separate Meteor.collection object, or to embed an array of students inside the classroom object and have a single Classroom Meteor.collection object.
My instinct tells me to put Classroom and Students in their own Meteor.collections, but I am not sure how to establish a one-to-many relationship between the two Meteor collection objects.
How do more traditional one-to-many and many-to-many relationships translate into the Meteor way of doing things?
My question arises from the fact that .aggregate() is not supported, and from realizing that it's impossible to grab nested, embedded documents inside a parent document (e.g. Classroom) without a recursive loop.
Most of the time it is useful to put separate object types into separate collections.
Let's say we have a one to many relationship:
Classrooms.insert({
    _id: "sdf8ad8asdj2jef",
    name: "test classroom"
});
Students.insert({
    _id: "lof8gzanasd9a7j2n",
    name: "John",
    classroomId: "sdf8ad8asdj2jef"
});
Get all Students in classroom sdf8ad8asdj2jef:
Students.find({classroomId: "sdf8ad8asdj2jef"});
Get the classroom with student lof8gzanasd9a7j2n:
var student = Students.findOne("lof8gzanasd9a7j2n");
var classroom = Classrooms.findOne(student.classroomId);
Putting the objects into separate collections is especially useful when you are going to use Meteor.publish() and Meteor.subscribe().
Meteor.publish() is pretty handy when you want to publish only data to the client that is really relevant to the user.
The following publishes only students who are in the room with the given classroomId.
(So the client doesn't have to download all student objects from the server database. Only those who are relevant.)
Meteor.publish("students", function (classroomId) {
    return Students.find({classroomId: classroomId});
});
Many to many relationships are also not that complicated:
Classrooms.insert({
    _id: "sdf8ad8asdj2jef",
    name: "test classroom",
    studentIds: ["lof8gzanasd9a7j2n"]
});
Students.insert({
    _id: "lof8gzanasd9a7j2n",
    name: "John",
    classroomIds: ["sdf8ad8asdj2jef"]
});
Get all students in classroom sdf8ad8asdj2jef:
Students.find({classroomIds: "sdf8ad8asdj2jef"});
Get all classrooms with student lof8gzanasd9a7j2n:
Classrooms.find({studentIds: "lof8gzanasd9a7j2n"});
More information on MongoDB's read operations.
Separate collections for students and classrooms seems more straightforward.
I think just keeping a 'classroom' or 'classroomId' field in each student document will allow you to join the two collections when necessary.

Mongoid: retrieving documents whose _id exists in another collection

I am trying to fetch the documents from a collection based on the existence of a reference to these documents in another collection.
Let's say I have two collections Users and Courses and the models look like this:
User: {_id, name}
Course: {_id, name, user_id}
Note: this is just a hypothetical example and not the actual use case. So let's assume that duplicates are fine in the name field of Course. Let's think of Course as CourseRegistrations.
Here, I am maintaining a reference to User in the Course, with user_id holding the _id of the User. Note that it's stored as a string.
Now I want to retrieve all users who are registered to a particular set of courses.
I know that it can be done with two queries. That is, first run a query and get the user_id field from the Course collection for the set of courses. Then query the User collection using $in and the user ids retrieved in the previous query. But this may not be good if the number of documents is in the tens of thousands or more.
Is there a better way to do this in just one query?
What you are describing is a typical SQL join, and an ordinary query in MongoDB cannot join two collections. As you already suggested, you can do it in 2 separate queries.
There is one more way to handle it. It's not exactly a solution, but it is a valid workaround in NoSQL databases: store the most frequently accessed fields inside the same collection.
You can store some of the user collection's fields inside the course collection as an embedded field.
Course : {
    _id : 'xx',
    name: 'yy',
    user: {
        fname : 'r',
        lname : 'v',
        pic: 's'
    }
}
This is a good approach if the subset of fields you intend to retrieve from the user collection is small. You might be wondering about the redundant user data stored in the course collection, but that's exactly what makes mongodb powerful. It's a one-time insert, but your queries will be a lot faster.
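For comparison, the two-query approach from the question can be sketched in plain JavaScript, with arrays standing in for the two collections. The helper names and sample data are hypothetical; against a real database, step 2 would be a single query on _id using $in.

```javascript
// In-memory stand-ins for the two collections in the question.
const userDocs = [
  { _id: "u1", name: "Ann" },
  { _id: "u2", name: "Ben" },
  { _id: "u3", name: "Cy" },
];
const courseDocs = [
  { _id: "k1", name: "math", user_id: "u1" },
  { _id: "k2", name: "math", user_id: "u2" },
  { _id: "k3", name: "art", user_id: "u1" },
];

// Step 1: collect the distinct user ids for the requested course names.
function userIdsForCourses(courses, names) {
  const wanted = new Set(names);
  const ids = courses.filter((c) => wanted.has(c.name)).map((c) => c.user_id);
  return [...new Set(ids)];
}

// Step 2: the in-memory equivalent of an $in query against the user collection.
function usersByIds(users, ids) {
  const idSet = new Set(ids);
  return users.filter((u) => idSet.has(u._id));
}

const registered = usersByIds(userDocs, userIdsForCourses(courseDocs, ["math", "art"]));
console.log(registered.map((u) => u.name)); // Ann and Ben
```

De-duplicating the ids in step 1 keeps the $in list no longer than the number of distinct users, which matters once a user is registered for many courses.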