Mongoose joining data - mongodb

I have an object in my MongoDB that needs to be used EVERYWHERE in my system, so it is in its own collection. However, I can't quite figure out how to get the data to show up automatically on the other objects it is joined to.
Here is an example:
Schema1 = { name: String }
Schema2 = { something: String, other_thing: [{schema1_id: String}] }
Now what I want is to be able to say var name = mySchema2.name; and get the name of the linked Schema1 object.
I am using Mongoose, Express and Node.js and I have tried using a Mongoose 'virtual' for this, but when I say res.send(myobject); I don't see the virtual property anywhere on the object.
What is the best way to do this?

I know this comes long after you posted the question, but it might help others.
If you use this reference all over, you may want to consider using an embedded document. The benefit of an embedded document is that you get it when you query the parent document, which saves you an additional query. The drawback is that the parent document may become large (or even very large), so use embedded documents, but use them carefully.
Here is an example of a simple embedded document. Instead of referencing 'comments' in the post document, which requires an additional query, we embed it (the code is a bit pseudo):
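var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var postSchema = new Schema({
  author : {type : String},
  title : {type : String, required : true},
  content : {type : String, required : true},
  // a single embedded comment (not an array) to keep the example simple
  comment : {
    owner : {type : String},
    subject : {type : String, required : true},
    content : {type : String, required : true}
  }
});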
MongoDB gives you a simple and convenient way to query the comment's fields using dot notation. For example, to find only the posts whose comment subject starts with 'car', we do the following:
myPostModel.find({ 'comment.subject' : /^car/ }).exec(function(err, result){
  // Do some stuff with the result...
});
Note that, to keep the example simple, the comment field in the post is not an array (only one comment per post is allowed here). However, even if it were an array, MongoDB lets you refer to the array's elements in the same way.
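To make the dot-notation and "starts with" idea concrete without a database, here is a minimal plain-JavaScript sketch. The helpers getPath and findByPath are hypothetical (they are not part of Mongoose or the MongoDB driver), and the posts array only stands in for a collection. It also shows why a prefix match is usually written with an anchored pattern such as /^car/: an unanchored /car*/ matches 'ca' anywhere in the string.

```javascript
// Hypothetical helper: resolve a dotted path such as 'comment.subject'.
function getPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Hypothetical helper: keep documents whose field at `path` matches `pattern`.
function findByPath(docs, path, pattern) {
  return docs.filter((doc) => {
    const value = getPath(doc, path);
    return typeof value === 'string' && pattern.test(value);
  });
}

// In-memory stand-in for the posts collection (sample data is made up).
const posts = [
  { title: 'A', comment: { subject: 'cars are fast' } },
  { title: 'B', comment: { subject: 'my scary cat' } },
  { title: 'C', comment: { subject: 'boats' } },
];

const anchored = findByPath(posts, 'comment.subject', /^car/).map((p) => p.title);
const unanchored = findByPath(posts, 'comment.subject', /car*/).map((p) => p.title);

console.log(anchored);   // only subjects that start with 'car'
console.log(unanchored); // also picks up 'ca' in the middle of a subject
```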

There are a couple of plugins to help with DBRefs in Mongoose.
mongoose-dbref uses the DBRef standards and is probably a good place to start.
mongoose-plugins is one I wrote a while ago but it works in a slightly different way.

MongoDB has no such thing as a JOIN, because joins kill scalability. But most drivers support DBRefs; they simply make an additional request to load the referenced data.
So you can just make an additional request yourself to load the object that you are using everywhere.
If you are using some object everywhere in your app, it sounds like an object that needs to be in a cache. But MongoDB itself works as a kind of cache if there is enough memory to hold the object. So, to keep it simple, just make an additional request to load the object.
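As a rough illustration of the "additional request" approach, here is a plain-JavaScript sketch that stitches the referenced names onto the parent object. The arrays stand in for the two collections from the question, the sample data is made up, and joinNames is a hypothetical helper; with a real database, loading schema1Docs would be the extra query.

```javascript
// In-memory stand-ins for the two collections (sample data is made up).
const schema1Docs = [
  { _id: 'a1', name: 'Alpha' },
  { _id: 'b2', name: 'Beta' },
];
const schema2Doc = {
  something: 'demo',
  other_thing: [{ schema1_id: 'a1' }, { schema1_id: 'b2' }, { schema1_id: 'zz' }],
};

// Hypothetical helper: attach the referenced name to each link.
// Links that point at a missing document get name: null.
function joinNames(doc, referencedDocs) {
  const byId = new Map(referencedDocs.map((d) => [d._id, d]));
  return {
    ...doc,
    other_thing: doc.other_thing.map((link) => {
      const target = byId.get(link.schema1_id);
      return { ...link, name: target ? target.name : null };
    }),
  };
}

const joined = joinNames(schema2Doc, schema1Docs);
console.log(joined.other_thing.map((link) => link.name));
```

The original document is left untouched; the joined copy is what you would send to the client.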

Related

Mongoose: Populate on existing DBRef

I am migrating a Spring project to Nextjs&co for personal enrichment.
I have an existing mongodb database with school related collections such as:
// school (as json)
"_id" : ObjectId("5f457f041291df2910dea1ed"),
"name" : "San Lucas Primary School",
...
"campus" : DBRef("campus", ObjectId("5f457dd9126d210893e14e11"))
I've loaded up mongoose, and have tried to wrangle it for the last few days to get it to populate campus.
If I define the schema like so:
import mongoose from 'mongoose'
const SchoolSchema = new mongoose.Schema({
name: String,
campus: {type: mongoose.Schema.Types.ObjectId, ref: 'campus'},
});
module.exports = mongoose.model("School", SchoolSchema, 'school') // i define the existing collection name 'school' to avoid the built in pluralization
When I do school.find() in debugger, I get the mongoose model object. The campus field is missing, and there is an error: ValidatorError: Cannot read properties of undefined (reading 'options')\n at _init
When I alter the Schema to not include campus:
import mongoose from 'mongoose'
const SchoolSchema = new mongoose.Schema({
name: String,
// campus: {type: mongoose.Schema.Types.ObjectId, ref: 'campus'},
});
module.exports = mongoose.model("School", SchoolSchema, 'school')
The debugger now spits out the whole object, including campus but it looks like this:
campus = DBRef {collection: "campus", oid: ObjectId, db: undefined, fields: Object}
There was another configuration where it was spitting it out as if it were creating the object at runtime, new DBref("campus", new ObjectId("...")) or something like that.
When I json it out, it always ends up {$ref: 'campus', $id: ...}. But if I do not include it in the schema, I can't do all that handy populate and things.
I'm this far from extracting the id as a string and doing findById().
Folks, I am STUMPED.
DBRef is a rather controversial data format convention that dates from early versions (doc for v2.2), long before $lookup was added in v3.2. Its main purpose was to allow cross-database references between documents, yet even then it was not recommended except for very niche use cases:
In most cases you should use the manual reference method for connecting two or more related documents. However, if you need to reference documents from multiple collections, consider using DBRefs.
It is not supported by all drivers, and it caused many problems with export/import because regular field names could not start with $ as recently as v4.4: https://www.mongodb.com/docs/v4.4/reference/limits/#mongodb-limit-Restrictions-on-Field-Names
Mongoose, on the other hand, is a quite opinionated ODM which comes with its own conventions. Automattic opted out of DBRefs in favour of their own Population logic, which is so incompatible with DBRefs that they even stopped calling it "DBRef-like" starting from v3.0.
There was an attempt to add support for native DBRefs to Mongoose, but the project looks abandoned. You may find it useful to read this explanation: DbRef with Mongoose - mongoose-dbref or populate?
Anyway, apart from DBRefs you will likely face other issues related to Mongoose vs. Spring conventions for document structure. Off the top of my head, there is Mongoose's optimistic locking, which relies on the value of the __v field that is otherwise not exposed at the application level, and so on.
If you intend to use the same database in a heterogeneous setup, you can't really use anything but native drivers, as all ODMs come with their own conventions and it is very likely they won't match.
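If you do end up extracting the id and looking the document up yourself, a minimal plain-JavaScript sketch of that step could look like the following. It assumes the DBRef arrives in its JSON form, { $ref, $id }, and that $id is either a plain string or an object like { $oid: '...' }. The helpers parseDbRef and resolveDbRef are hypothetical, and the collections object only stands in for a findById-style query.

```javascript
// Hypothetical helper: pull the collection name and id string out of a
// DBRef in its JSON form, { $ref: <collection>, $id: <id> }.
// Assumption: $id is either a plain string or an object like { $oid: '...' }.
function parseDbRef(ref) {
  if (!ref || typeof ref.$ref !== 'string' || ref.$id == null) {
    return null;
  }
  const id = typeof ref.$id === 'object' && '$oid' in ref.$id
    ? ref.$id.$oid
    : String(ref.$id);
  return { collection: ref.$ref, id };
}

// Hypothetical helper: resolve the reference with one extra lookup.
// `collections` maps a collection name to a Map of id -> document.
function resolveDbRef(ref, collections) {
  const parsed = parseDbRef(ref);
  if (!parsed) return null;
  const docs = collections[parsed.collection];
  return docs ? docs.get(parsed.id) || null : null;
}

const collections = {
  campus: new Map([
    ['5f457dd9126d210893e14e11', { name: 'Example Campus' }], // made-up document
  ]),
};

const school = {
  name: 'San Lucas Primary School',
  campus: { $ref: 'campus', $id: { $oid: '5f457dd9126d210893e14e11' } },
};

const campus = resolveDbRef(school.campus, collections);
console.log(campus && campus.name);
```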

Node, MongoDB, Mongoose Design Choice - Creating two collections or one collection

I'm struggling with a large design choice for my applications' mongo collections and mongoose schemas.
My application calls for two account types: Students and Teachers.
The only similarity between the two account types is that they both require the fields: firstName, lastName, email, and password. Other than that, they are different (teachers have "assignments", "tests", students have "homework", etc.)
I have pondered my options extensively, and considered the following design choices:
1. Use mongoose-schema-extend, and create an "abstract" schema for all accounts. Then, extend this schema to create the Teacher and Student schemas. This implies two collections, and therefore some redundant fields. There are also issues with logging in and account creation (checking to see if the email used to log in is a student email or teacher email, etc.)
2. Create one collection "accounts", and add a type field to indicate if the account is a "student" or a "teacher". This implies that entries in the "accounts" collection will be dissimilar. This also requires that I have two mongoose schemas for a single collection.
3. Create an "accounts" collection, have a "type" field and an "accountId" field. In addition to a "student" collection and a "teacher" collection -- the "type" field will indicate which collection the student-specific or teacher-specific fields reside within, and the "accountId" field will indicate exactly which entry the account is matched with.
I appreciate all input, criticism or suggestions.
I've been down a similar road and I eventually landed on a mix of option 1 and 2.
mongoose-schema-extend simply modifies the prototype of Schema with an #extend() method which, when invoked, performs a deep copy of the passed schema. Most helpful. However, you can control which collection Mongoose saves to in MongoDB by passing a collection option to the Schema:
var schema = new Schema({
foo: String,
bar: Boolean
}, { collection: "FooBarBaz" });
Remember: Mongoose understands the concept of a Schema but MongoDB does not. This means you can store dissimilar data and use your custom business logic to control the mess. With that said, you can create a base model called User, force mongoose to use the same collection by using the collection option and then extend off this base model to make your Teachers and Students models.
Make sure you add a type flag in the base model as you suggested in option 2. Not only is this convenient for quick lookups, but it will be critical when working commando with raw MongoDB data.
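To show what the type flag buys you when reading raw documents from the shared collection, here is a small plain-JavaScript sketch. It does not use Mongoose at all; fieldsByType and shapeAccount are hypothetical names, and the field lists are only taken from the question's description.

```javascript
// Fields every account shares, per the question.
const baseFields = ['firstName', 'lastName', 'email', 'password'];

// Hypothetical per-type field lists; the extra fields come from the question.
const fieldsByType = {
  teacher: [...baseFields, 'assignments', 'tests'],
  student: [...baseFields, 'homework'],
};

// Hypothetical helper: shape a raw document from the shared collection
// according to its `type` flag, dropping fields the type does not define.
function shapeAccount(raw) {
  const fields = fieldsByType[raw.type];
  if (!fields) {
    throw new Error('Unknown account type: ' + raw.type);
  }
  const shaped = { type: raw.type };
  for (const field of fields) {
    if (field in raw) shaped[field] = raw[field];
  }
  return shaped;
}

// Raw documents as they might sit side by side in one collection (made up).
const rawAccounts = [
  { type: 'teacher', firstName: 'Ann', email: 'ann@example.com', tests: ['t1'], homework: ['x'] },
  { type: 'student', firstName: 'Bob', email: 'bob@example.com', homework: ['h1'] },
];

const accounts = rawAccounts.map(shapeAccount);
console.log(accounts);
```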
@jibsales has an excellent solution.
One more solution to consider is using Population with references http://mongoosejs.com/docs/populate.html from the Users collection to the Student and Teacher collections. Some benefits are:
- Entries in each of the three collections (Users, Teachers, Students) are similar in storage.
- It allows you to obtain the fields for the "User" independently of obtaining the fields for the referenced collection.
This would require that the schema is modified before an instance is created (and a model is created from the schema), where refType is the desired collection:
var userSchema = new Schema({
_id : Number,
name : String,
age : Number,
stories : [{ type: Schema.Types.ObjectId, ref: refType}]
});

MongoDB schema embedding and nested unique keys

I have a relational SQL DB that's being changed to MongoDB. In SQL there are 3 tables that are relevant: Farm, Division, Wombat (names and purpose changed for this question). There's also a Farmer table which is the equivalent of a users table.
Using Mongoose I've come up with this new schema:
var mongoose = require('mongoose');
var farmSchema = new mongoose.Schema({
// reference to the farmer collection's _id key
farmerId: mongoose.Schema.ObjectId,
name: String, // name of farm
division: [{
divisionId: mongoose.Schema.ObjectId,
name: String,
wombats: [{
wombatId: mongoose.Schema.ObjectId,
name: String,
weight: Number
}]
}]
});
Each of the (now) nested collections has a unique field in it. This will allow me to use Ajax to send just the uniqueId and the weight (for example) to adjust that value instead of updating the entire document when only the weight changes.
This feels like an incorrect SQL adaptation for MongoDB. Is there a better way to do this?
In general, I believe that people tend to embed way too much when using MongoDB.
The most important argument is that having different writers to the same objects makes things a lot more complicated. Working with arrays and embedded objects can be tricky and some modifications are impossible, for instance because there's no positional operator matching in nested arrays.
For your particular scenario, take note that unique array keys might not behave as expected, and that behavior might change in future releases.
It's often desirable to opt for a simple SQL-like schema such as
Farm {
_id : ObjectId("...")
}
Division {
_id : ObjectId("..."),
FarmId : ObjectId("..."),
...
}
Wombat {
_id : ObjectId("..."),
DivisionId : ObjectId("..."),
...
}
Whether embedding is the right approach or not very much depends on usage patterns, data size, concurrent writes, etc. - a key difference to SQL is that there is no one right way to model 1:n or n:n relationships, so you'll have to carefully weigh the pros and cons for each scenario. In my experience, having a unique ID is a pretty strong indicator that the document should be a 'first-class citizen' and have its own collection.
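The trade-off for the "update one weight" case can be sketched in plain JavaScript. The data is made up, the Map stands in for a separate Wombat collection, and both setWeightNested and setWeightFlat are hypothetical helpers rather than database calls.

```javascript
// Nested layout: wombats live inside divisions inside a farm (made-up data).
const farm = {
  name: 'Farm A',
  division: [
    { divisionId: 'd1', wombats: [{ wombatId: 'w1', weight: 20 }] },
    { divisionId: 'd2', wombats: [{ wombatId: 'w2', weight: 25 }] },
  ],
};

// Hypothetical helper: with nesting, finding one wombat means walking
// both arrays of the parent document.
function setWeightNested(farmDoc, wombatId, weight) {
  for (const division of farmDoc.division) {
    for (const wombat of division.wombats) {
      if (wombat.wombatId === wombatId) {
        wombat.weight = weight;
        return true;
      }
    }
  }
  return false;
}

// Flat layout: each wombat is its own document, keyed by _id.
const wombats = new Map([
  ['w1', { _id: 'w1', DivisionId: 'd1', weight: 20 }],
  ['w2', { _id: 'w2', DivisionId: 'd2', weight: 25 }],
]);

// Hypothetical helper: with a separate collection, the update is a
// single lookup by id.
function setWeightFlat(collection, wombatId, weight) {
  const wombat = collection.get(wombatId);
  if (!wombat) return false;
  wombat.weight = weight;
  return true;
}

console.log(setWeightNested(farm, 'w2', 27), farm.division[1].wombats[0].weight);
console.log(setWeightFlat(wombats, 'w2', 27), wombats.get('w2').weight);
```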

Confusion regarding Mongo db Schema. How to make it better?

I am using mongoose with node.js for this.
My current Schema is this:
var linkSchema = new Schema({
text: String,
tags: Array,
body: String,
user: String
})
My use-case is this: There are a list of users and each user has a list of links associated with it. Users and links are different Schemas of course. Thus, how does one get that sort of one to one relationship done using mongo-db.
Should I make a User Schema and embed linkSchema in it? Or the other way around?
Another doubt regarding that. Tags would always be an array of strings which I can use to browse through links later. Should it be an array data type or is there a better way to represent it?
If it's 1:1 then nest one document inside the other. Which way around depends on the queries, but you could easily do both if you need to.
For tags, you can index an array field and use that for searching/filtering documents and from the information you've given that sounds reasonable IMHO.
If you had a fixed set of tags it would make sense to represent those as a nested object with named fields perhaps, depending on queries. Don't forget that you can not only create nested documents in Mongo, but you can also search on sub-fields and even use entire nested documents as searchable/indexable fields. For instance, you could store an email address like this:
email: "joe@somewhere.com"
as a string, and you could also do:
email: {
user: "joe",
domain: "somewhere.com"
}
You could index email in both cases and use either for matching. In the latter case, though, you could also search on domain or user only without resorting to RegEx-style queries. You could also store both variants, so there are lots of flexible options in Mongo.
Going back to tags, I think your array of strings is a fine model given what you've described, but if you were doing more complex bulk aggregation, it wouldn't be crazy to store a document for every tag with the same document contents, since that's essentially what you'd have to do for every query during aggregation.
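As a rough picture of both ideas (filtering on an array of strings, and keeping one entry per tag for heavier aggregation), here is a plain-JavaScript sketch. The links are made-up sample data shaped like the linkSchema in the question, and linksWithTag and buildTagIndex are hypothetical helpers, not MongoDB calls.

```javascript
// Made-up link documents shaped like the linkSchema in the question.
const links = [
  { text: 'Mongoose docs', tags: ['mongodb', 'node'], user: 'joe' },
  { text: 'Express guide', tags: ['node', 'http'], user: 'joe' },
  { text: 'Indexing notes', tags: ['mongodb'], user: 'ann' },
];

// Filtering on an array-of-strings field: a link matches if any tag equals `tag`.
function linksWithTag(docs, tag) {
  return docs.filter((doc) => doc.tags.includes(tag));
}

// Hypothetical helper: a tag -> link texts lookup table, roughly the
// "one entry per tag" shape mentioned above for bulk aggregation.
function buildTagIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const tag of doc.tags) {
      if (!index.has(tag)) index.set(tag, []);
      index.get(tag).push(doc.text);
    }
  }
  return index;
}

const nodeLinks = linksWithTag(links, 'node').map((doc) => doc.text);
const tagIndex = buildTagIndex(links);
console.log(nodeLinks);
console.log(tagIndex.get('mongodb'));
```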

Why does Mongoose have both schemas and models?

The two types of objects seem to be so close to one another that having both feels redundant. What is the point of having both schemas and models?
EDIT: Although this has been useful for many people, as mentioned in the comments it answers the "how" rather than the why. Thankfully, the why of the question has been answered elsewhere also, with this answer to another question. This has been linked in the comments for some time but I realise that many may not get that far when reading.
Often the easiest way to answer this type of question is with an example. In this case, someone has already done it for me :)
Take a look here:
http://rawberg.com/blog/nodejs/mongoose-orm-nested-models/
EDIT: The original post (as mentioned in the comments) seems to no longer exist, so I am reproducing it below. Should it ever return, or if it has just moved, please let me know.
It gives a decent description of using schemas within models in mongoose and why you would want to do it, and also shows you how to push tasks via the model while the schema is all about the structure etc.
Original Post:
Let’s start with a simple example of embedding a schema inside a model.
var TaskSchema = new Schema({
name: String,
priority: Number
});
TaskSchema.virtual('nameandpriority')
  .get(function () {
    // note the space before '(' so the result reads "task one (1)"
    return this.name + ' (' + this.priority + ')';
  });
TaskSchema.method('isHighPriority', function() {
if(this.priority === 1) {
return true;
} else {
return false;
}
});
var ListSchema = new Schema({
name: String,
tasks: [TaskSchema]
});
mongoose.model('List', ListSchema);
var List = mongoose.model('List');
var sampleList = new List({name:'Sample List'});
I created a new TaskSchema object with basic info a task might have. A Mongoose virtual attribute is setup to conveniently combine the name and priority of the Task. I only specified a getter here but virtual setters are supported as well.
I also defined a simple task method called isHighPriority to demonstrate how methods work with this setup.
In the ListSchema definition you’ll notice how the tasks key is configured to hold an array of TaskSchema objects. The task key will become an instance of DocumentArray which provides special methods for dealing with embedded Mongo documents.
For now I only passed the ListSchema object into mongoose.model and left the TaskSchema out. Technically it's not necessary to turn the TaskSchema into a formal model since we won’t be saving it in its own collection. Later on I’ll show you how it doesn’t harm anything if you do, and it can help to organize all your models in the same way, especially when they start spanning multiple files.
With the List model setup let’s add a couple tasks to it and save them to Mongo.
var List = mongoose.model('List');
var sampleList = new List({name:'Sample List'});
sampleList.tasks.push(
{name:'task one', priority:1},
{name:'task two', priority:5}
);
sampleList.save(function(err) {
if (err) {
console.log('error adding new list');
console.log(err);
} else {
console.log('new list successfully saved');
}
});
The tasks attribute on the instance of our List model (sampleList) works like a regular JavaScript array and we can add new tasks to it using push. The important thing to notice is the tasks are added as regular JavaScript objects. It’s a subtle distinction that may not be immediately intuitive.
You can verify from the Mongo shell that the new list and tasks were saved to mongo.
db.lists.find()
{ "tasks" : [
{
"_id" : ObjectId("4dd1cbeed77909f507000002"),
"priority" : 1,
"name" : "task one"
},
{
"_id" : ObjectId("4dd1cbeed77909f507000003"),
"priority" : 5,
"name" : "task two"
}
], "_id" : ObjectId("4dd1cbeed77909f507000001"), "name" : "Sample List" }
Now we can use the ObjectId to pull up the Sample List and iterate through its tasks.
List.findById('4dd1cbeed77909f507000001', function(err, list) {
console.log(list.name + ' retrieved');
list.tasks.forEach(function(task, index, array) {
console.log(task.name);
console.log(task.nameandpriority);
console.log(task.isHighPriority());
});
});
If you run that last bit of code you’ll get an error saying the embedded document doesn’t have a method isHighPriority. In the current version of Mongoose you can’t access methods on embedded schemas directly. There’s an open ticket to fix it and after posing the question to the Mongoose Google Group, manimal45 posted a helpful work-around to use for now.
List.findById('4dd1cbeed77909f507000001', function(err, list) {
console.log(list.name + ' retrieved');
list.tasks.forEach(function(task, index, array) {
console.log(task.name);
console.log(task.nameandpriority);
console.log(task._schema.methods.isHighPriority.apply(task));
});
});
If you run that code you should see the following output on the command line.
Sample List retrieved
task one
task one (1)
true
task two
task two (5)
false
With that work-around in mind let’s turn the TaskSchema into a Mongoose model.
mongoose.model('Task', TaskSchema);
var Task = mongoose.model('Task');
var ListSchema = new Schema({
name: String,
tasks: [Task.schema]
});
mongoose.model('List', ListSchema);
var List = mongoose.model('List');
The TaskSchema definition is the same as before so I left it out. Once it's turned into a model we can still access its underlying Schema object using dot notation.
Let’s create a new list and embed two Task model instances within it.
var demoList = new List({name:'Demo List'});
var taskThree = new Task({name:'task three', priority:10});
var taskFour = new Task({name:'task four', priority:11});
demoList.tasks.push(taskThree.toObject(), taskFour.toObject());
demoList.save(function(err) {
if (err) {
console.log('error adding new list');
console.log(err);
} else {
console.log('new list successfully saved');
}
});
As we’re embedding the Task model instances into the List we’re calling toObject on them to convert their data into plain JavaScript objects that the List.tasks DocumentArray is expecting. When you save model instances this way your embedded documents will contain ObjectIds.
The complete code example is available as a gist. Hopefully these work-arounds help smooth things over as Mongoose continues to develop. I’m still pretty new to Mongoose and MongoDB so please feel free to share better solutions and tips in the comments. Happy data modeling!
Schema is an object that defines the structure of any documents that will be stored in your MongoDB collection; it enables you to define types and validators for all of your data items.
Model is an object that gives you easy access to a named collection, allowing you to query the collection and use the Schema to validate any documents you save to that collection. It is created by combining a Schema, a Connection, and a collection name.
Originally phrased by Valeri Karpov, MongoDB Blog
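To illustrate the split described above (a schema only describes and validates structure, while a model binds that schema to a named collection and does the reading and writing), here is a toy plain-JavaScript sketch. It is not Mongoose's API: createSchema, createModel and the store object are all hypothetical stand-ins, and the store is just an object of arrays in place of a database connection.

```javascript
// Toy "schema": only describes structure and validates a plain object.
// This is NOT Mongoose's Schema class, just an illustration of the split.
function createSchema(definition) {
  return {
    validate(doc) {
      const errors = [];
      for (const [field, rule] of Object.entries(definition)) {
        const value = doc[field];
        if (value === undefined) {
          if (rule.required) errors.push(field + ' is required');
        } else if (typeof value !== rule.type) {
          errors.push(field + ' must be a ' + rule.type);
        }
      }
      return errors;
    },
  };
}

// Toy "model": binds a schema to a named collection (here an array)
// and is the only place that reads or writes stored documents.
function createModel(name, schema, store) {
  const collection = (store[name] = store[name] || []);
  return {
    create(doc) {
      const errors = schema.validate(doc);
      if (errors.length > 0) throw new Error(errors.join('; '));
      collection.push({ ...doc });
      return doc;
    },
    find(predicate) {
      return collection.filter(predicate);
    },
  };
}

const store = {}; // stands in for a database connection
const taskSchema = createSchema({
  name: { type: 'string', required: true },
  priority: { type: 'number' },
});
const Task = createModel('tasks', taskSchema, store);

Task.create({ name: 'task one', priority: 1 });
console.log(Task.find((t) => t.priority === 1).length);
```

The same schema object could be handed to a second createModel call with a different collection name, which is one practical reason to keep the two concepts separate.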
I don't think the accepted answer actually answers the question that was posed. The answer doesn't explain why Mongoose has decided to require a developer to provide both a Schema and a Model variable. An example of a framework where they have eliminated the need for the developer to define the data schema is django--a developer writes up their models in the models.py file, and leaves it to the framework to manage the schema. The first reason that comes to mind for why they do this, given my experience with django, is ease-of-use. Perhaps more importantly is the DRY (don't repeat yourself) principle--you don't have to remember to update the schema when you change the model--django will do it for you! Rails also manages the schema of the data for you--a developer doesn't edit the schema directly, but changes it by defining migrations that manipulate the schema.
One reason I could understand that Mongoose would separate the schema and the model is instances where you would want to build a model from two schemas. Such a scenario might introduce more complexity than is worth managing--if you have two schemas that are managed by one model, why aren't they one schema?
Perhaps the original question is more a relic of the traditional relational database system. In the NoSQL/Mongo world, perhaps the schema is a little more flexible than in MySQL/PostgreSQL, and thus changing the schema is more common practice.
To understand why, you first have to understand what Mongoose actually is.
Mongoose is an object data modeling (ODM) library for MongoDB and Node.js, providing a higher level of abstraction. It's a bit like the relationship between Express and Node: Express is a layer of abstraction over regular Node, while Mongoose is a layer of abstraction over the regular MongoDB driver.
An object data modeling library is just a way for us to write JavaScript code that will then interact with a database. We could just use the regular MongoDB driver to access our database, and it would work just fine.
But instead we use Mongoose because it gives us a lot more functionality out of the box, allowing for faster and simpler development of our applications.
Some of the features Mongoose gives us are schemas to model our data and relationships, easy data validation, a simple query API, middleware, and much more.
In Mongoose, a schema is where we model our data: where we describe the structure of the data, default values, and validation. We then take that schema and create a model out of it. A model is basically a wrapper around the schema, which allows us to actually interface with the database in order to create, delete, update, and read documents.
Let's create a model from a schema.
const tourSchema = new mongoose.Schema({
name: {
type: String,
required: [true, 'A tour must have a name'],
unique: true,
},
rating: {
type: Number,
default: 4.5,
},
price: {
type: Number,
required: [true, 'A tour must have a price'],
},
});
//tour model
const Tour = mongoose.model('Tour', tourSchema);
By convention, the first letter of a model name is capitalized.
Let's create an instance of the model we built from the schema, and use it to interact with our database.
const testTour = new Tour({ // instance of our model
name: 'The Forest Hiker',
rating: 4.7,
price: 497,
});
// saving testTour document into database
testTour
.save()
.then((doc) => {
console.log(doc);
})
.catch((err) => {
console.log(err);
});
So by having both a schema and a model, Mongoose makes our life easier.
Think of a Model as a wrapper around a schema. Schemas define the structure of your document: what kind of properties you can expect and what their data types will be (String, Number, etc.). Models provide a kind of interface to perform CRUD on the schema. See this post on FCC.
Schema basically models your data (where you provide datatypes for your fields) and can do some validations on your data. It mainly deals with the structure of your collection.
Whereas the model is a wrapper around your schema to provide you with CRUD methods on collections. It mainly deals with adding/querying the database.
Having both schema and model could appear redundant when compared to other frameworks like Django (which provides only a Model) or SQL (where we create only Schemas and write SQL queries and there is no concept of model). But, this is just the way Mongoose implements it.