MongoDB/Mongoose querying subdocuments and indexing

Ok, so I am new to MongoDB and the world of document based databases. I have stored in MongoDB a collection of profiles which has two subdocuments; 'Interest' and 'Country'. Below is schema information:
var Country = new mongoose.Schema({
name: String,
countryCode: String
});
var Interest = new mongoose.Schema({
label: {
type: String
}
});
var Profile = connection.model('profile', {
uid: {
type: String,
unique: true,
sparse: true
},
username: String,
country: Country,
interests: [Interest],
member_since: Date
});
Let's say I want to select all of the users who are interested in 'music' and whose countryCode is 'AU', and I want the query to be efficient rather than scanning every document (I'm guessing I need an index?). How can I do this? Below is a sample profile as it appears in Compass:
_id:59d17efa3ed3a453e2b865f9
username:"Rudolph"
country:Object
name: "Australia"
countryCode:"AU"
_id:59d17efa3ed3a453e2b865fa
__v:0
interests:Object
label:Array
0:"music"
1:"film"
2:"dance"
member_since:2017-10-01 19:49:14.565
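One way to approach this, sketched under assumptions rather than taken from an answer in the thread: MongoDB can index fields inside embedded documents and arrays using dot notation, so a compound index on country.countryCode and interests.label should let this query avoid a full collection scan. The names profileIndexSpec and buildProfileFilter are hypothetical, and the commented lines at the end assume the Profile model from the question.

```javascript
// Compound index spec: the single-valued field first, then the array field.
// A compound index may cover at most one array field per document,
// which holds here because only the interests path is an array.
const profileIndexSpec = { 'country.countryCode': 1, 'interests.label': 1 };

// Hypothetical helper: builds a filter using dot notation for nested paths.
// 'interests.label' matches both { interests: [{ label: 'music' }] } (the schema)
// and { interests: { label: ['music', 'film'] } } (the shape shown in Compass).
function buildProfileFilter(countryCode, interest) {
  return { 'country.countryCode': countryCode, 'interests.label': interest };
}

const filter = buildProfileFilter('AU', 'music');
console.log(JSON.stringify(filter));

// Assumed usage with the Profile model from the question:
//   await Profile.collection.createIndex(profileIndexSpec);
//   const users = await Profile.find(filter).lean();
```

Running the same query with .explain('executionStats') is a reasonable way to check whether the index is actually used; an IXSCAN stage rather than COLLSCAN would be the expected sign.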

Related

MongoDB (Mongoose) data structure question

I'm curious about the best way to represent this kind of situation in Mongo. I have my own idea, but I'd like to know what the general consensus/best practice actually is.
Imagine I have two collections:-
Employees
--> _id
--> FirstName
--> Surname
--> Email
Comments
--> _id
--> PersonReference
--> CommentDate
--> Comment
Now imagine that Employees can come and go and the 'Employees' collection is always up-to-date. However, in the event that an employee has ever made a comment, the full information on the comment including who made it must be available.
The way I would propose to tackle this problem, is to organise the structure like this instead:-
Employees
--> _id: _id
--> FirstName: string
--> Surname: string
--> Email: string
Comments
--> _id: _id
--> CommentDate: date
--> Comment: string
[-] --> PersonReference
[+] --> Employee: object { _id: id, FirstName: string, Surname: string, Email:string }
So essentially, I would have a list of 'Active Employees' and at a time where a comment is made, I would duplicate the employee information into the Comments collection document (rather than use a reference).
From a high level perspective, is this considered best practise?
Many thanks

Duplicating the employee info in the comments collection is really a bad idea.
When an employee's info changes, it will also need to be updated in every comment that embeds it.
You have a few options:
1-) Embedding the comments inside the Employee schema:
In this method we have no separate Comments collection.
If you have no need to independently query comments, this method makes sense.
This way we can access a user and his/her comments in one db access and without needing any join (populate or lookup).
The schema for this can be like this:
const mongoose = require("mongoose");
const employeeSchema = new mongoose.Schema({
firstName: String,
username: String,
email: String,
comments: [
new mongoose.Schema({
commentDate: Date,
comment: String
})
]
});
module.exports = mongoose.model("Employee", employeeSchema);
2-) Parent referencing:
In this method we keep references to the comments in the Employee schema.
If you don't need to access the employee from a comment, this can be an option.
Employee Schema:
const mongoose = require("mongoose");
const employeeSchema = new mongoose.Schema({
firstName: String,
username: String,
email: String,
comments: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "Comment"
}
]
});
module.exports = mongoose.model("Employee", employeeSchema);
Comment Schema:
const mongoose = require("mongoose");
const commentSchema = new mongoose.Schema({
commentDate: Date,
comment: String
});
module.exports = mongoose.model("Comment", commentSchema);
3-) Child referencing
In this method we keep a reference to the employee in each comment.
So if we need to access the comments from an employee, we need to use the Populate Virtual feature of mongoose, because the employee schema has no reference to the comments.
Employee Schema:
const mongoose = require("mongoose");
const employeeSchema = new mongoose.Schema(
{
firstName: String,
username: String,
email: String
},
{
toJSON: { virtuals: true } // include virtuals (e.g. the populated comments) in JSON output
}
);
// Populate virtual
employeeSchema.virtual("comments", {
ref: "Comment",
foreignField: "employee",
localField: "_id"
});
module.exports = mongoose.model("Employee", employeeSchema);
Comment Schema:
const mongoose = require("mongoose");
const commentSchema = new mongoose.Schema({
commentDate: Date,
comment: String,
employee: {
type: mongoose.Schema.Types.ObjectId,
ref: "Employee"
}
});
module.exports = mongoose.model("Comment", commentSchema);
4-) Both parent and child referencing:
With this method, it is possible to select the comments from an employee, and the employee from a comment. But the relationship is stored twice, so when a comment is deleted, it has to be removed from both collections.
Employee Schema:
const mongoose = require("mongoose");
const employeeSchema = new mongoose.Schema({
firstName: String,
username: String,
email: String,
comments: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "Comment"
}
]
});
module.exports = mongoose.model("Employee", employeeSchema);
Comment Schema:
const mongoose = require("mongoose");
const commentSchema = new mongoose.Schema({
commentDate: Date,
comment: String,
employee: {
type: mongoose.Schema.Types.ObjectId,
ref: "Employee"
}
});
module.exports = mongoose.model("Comment", commentSchema);

Many databases implement a kind of no-delete collection by adding a deleted/active flag to each document.
For example, Employees collection would become :
Employees
--> _id: _id
--> FirstName: string
--> Surname: string
--> Email: string
--> Active: boolean
This way, you keep the data of employees who have been deleted, and you avoid duplicating documents if you have database size restrictions.
PS: nowadays you can get into trouble for keeping user data after they ask for its deletion (GDPR).
EDIT: This boolean-flag solution may not work if the Employees document is updated and you want to keep the employee's first name, surname, email, etc. as they were at the time the comment was made.
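If a comment must show the employee's details as they were when it was written, one option is to store both a reference and a small snapshot in the comment, which is close to what the question proposed. This is only a sketch: buildComment and the employeeSnapshot field are hypothetical names, and the sample employee is made-up data.

```javascript
// Hypothetical helper: creates a plain comment document that keeps a
// reference to the employee plus a frozen copy of the display fields.
function buildComment(employee, text, now = new Date()) {
  return {
    commentDate: now,
    comment: text,
    employee: employee._id, // reference, still usable for populate/lookup
    employeeSnapshot: {     // copy taken at comment time, never updated
      firstName: employee.firstName,
      surname: employee.surname,
      email: employee.email
    }
  };
}

// Made-up sample data for illustration.
const employee = { _id: 'e1', firstName: 'Ada', surname: 'Lovelace', email: 'ada@example.com' };
const comment = buildComment(employee, 'Looks good', new Date('2020-01-01T00:00:00Z'));

// Later changes to the employee do not affect the stored snapshot.
employee.email = 'new@example.com';
console.log(comment.employeeSnapshot.email); // ada@example.com
```

The trade-off is that the snapshot is deliberately never updated, so it should hold only the fields that are meant to stay as they were.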

Correctly inserting and/or updating many datasets to MongoDB (using mongoose)?

So from time to time I get new exports of a cities database of POIs and info about them and I want to have all that data in my MongoDB with a Loopback-API on it. Therefore I reduce the data to my desired structure and try to import it.
The first time I receive such an export, I can simply insert the data with insertMany().
When I get a new export, it includes updated POIs, and I want my existing POIs to be replaced with that new data. So I thought I'd use updateMany(), but I couldn't figure out how to do that in my case.
Here's what I have so far:
const fs = require('fs');
const mongoose = require('mongoose');
const data = JSON.parse(fs.readFileSync('data.json', 'utf8'));
// Connect to database
mongoose.connect('mongodb://localhost/test', {
useMongoClient: true
}, (err) => {
if (err) console.log('Error', err);
});
// Define schema
let poiSchema = new mongoose.Schema({
_id: Number,
name: String,
geo: String,
street: String,
housenumber: String,
phone: String,
website: String,
email: String,
category: String
});
// Set model
let poi = mongoose.model('poi', poiSchema);
// Generate specified data from given export
let reducedData = data['ogr:FeatureCollection']['gml:featureMember'].reduce((endData, iteratedItem) => {
endData = endData.length > 0 ? endData : [];
endData.push({
_id: iteratedItem['service']['fieldX'],
name: iteratedItem['service']['fieldX'],
geo: iteratedItem['service']['fieldX']['fieldY']['fieldZ'],
street: iteratedItem['service']['fieldX'],
housenumber: iteratedItem['service']['fieldX'],
phone: iteratedItem['service']['fieldX'],
website: iteratedItem['service']['fieldX'],
email: iteratedItem['service']['fieldX'],
category: iteratedItem['service']['fieldX']
});
return endData;
}, []);
//
// HERE: ?!?!? Insert/update reduced data in MongoDB collection ?!?!?
//
mongoose.disconnect();
So I just want to update everything that has changed.
Of course if I leave it to insertMany() it fails due to dup key.

For the second and later imports, use MongoDB's update command with upsert set to true.
db.collection.update(query, update, options)
In the query pass the _id, in update pass the object, and in options set upsert to true. This updates the document if it exists and creates a new one if it doesn't.
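For many documents at once, the same upsert idea can be batched into a single call. The sketch below only builds the operations array; it assumes the poi model and the reducedData array from the question, and buildUpsertOps is a hypothetical helper name. Mongoose's Model.bulkWrite() accepts operations of this shape.

```javascript
// Hypothetical helper: turns each reduced POI into an upsert operation.
// replaceOne swaps the whole stored document for the new export's version,
// and upsert: true inserts it when no document with that _id exists yet.
function buildUpsertOps(docs) {
  return docs.map((doc) => ({
    replaceOne: {
      filter: { _id: doc._id },
      replacement: doc,
      upsert: true
    }
  }));
}

// Made-up sample data standing in for reducedData from the question.
const sample = [
  { _id: 1, name: 'Town Hall', category: 'public' },
  { _id: 2, name: 'Museum', category: 'culture' }
];
const ops = buildUpsertOps(sample);
console.log(ops.length); // 2

// Assumed usage (disconnect only after the write has finished):
//   await poi.bulkWrite(ops);
//   await mongoose.disconnect();
```

Using updateOne with a $set update instead of replaceOne would keep any stored fields that are missing from the new export; which of the two fits depends on whether the export is meant to be authoritative.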

MongoDB schema design for multiple user types

I'm about to build a Node.js+Express+Mongoose app and I'd like to pick the community's brains and get some advice on best practices and going about creating an efficient schema design.
My application is going to include 2 different user types, i.e "teacher" and "student". Each will have a user profile, but will require different fields for each account type. There will also be relationships between "teacher" and "student" where a "student" will initially have 1 teacher (with the possibility of more in the future), and a "teacher" will have many students.
My initial thoughts about how to approach this is to create a general User model and a profile model for each user type (studentProfile model & teacherProfile model), then reference the appropriate profile model inside the User model, like so (A):
var UserSchema = new Schema({
name: String,
email: String,
password: String,
role: String, /* Student or Teacher */
profile: { type: ObjectID, refPath: 'role' }
});
var studentProfileSchema = new Schema({
age: Number,
grade: Number,
teachers: [{ type: ObjectID, ref: 'Teacher' }]
});
var teacherProfileSchema = new Schema({
school: String,
subject: String
});
Or do I just go ahead and directly embed all the fields for both profiles in the User model and just populate the fields required for the specific user type, like so (B):
var UserSchema = new Schema({
name: String,
email: String,
password: String,
role: String, /* Student or Teacher */
profile: {
age: Number,
grade: Number,
school: String,
subject: String
},
relationships: [{ type: ObjectID, ref: 'User' }]
});
The downside to option B is that I can't really make use of Mongoose's required property for the fields. But should I not be relying on Mongoose for validation in the first place and have my application logic do the validating?
On top of that, there will also be a separate collection/model for logging students' activities and tasks, referencing the student's ID for each logged task, i.e.:
var activitySchema = new Schema({
activity: String,
date: Date,
complete: Boolean,
student_id: ObjectID
});
Am I on the right track with the database design? Any feedback would be greatly appreciated as I value any input from this community and am always looking to learn and improve my skills. What better way than from like minded individuals and experts in the field :)
Also, you can see that I'm taking advantage of Mongoose's population feature. Is there any reason to advise against this?
Thanks again!

You could try using the .discriminator() function to build on the User model, so the other schemas directly "inherit" its attributes.
const options = {discriminatorKey: 'kind'};
const UserSchema = new Schema({
name: String,
email: String,
password: String,
/* role: String, Student or Teacher <-- NO NEED FOR THIS. */
profile: { type: ObjectID, refPath: 'role' }
}, options);
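const User = mongoose.model('User', UserSchema); // base model; the original snippet used User without defining it
const Student = User.discriminator('Student', new Schema({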
age: Number,
grade: Number,
teachers: [{ type: ObjectID, ref: 'Teacher' }]
}, options));
const Teacher = User.discriminator('Teacher', new Schema({
school: String,
subject: String
}, options));
const student = new Student({
name: "John Appleseed",
email: "john@gmail.com",
password: "123",
age: 18,
grade: 12,
teachers: [...]
});
console.log(student.kind) // Student
Check the docs.

One approach could be the following:
//Creating a user model for login purposes, where your role will define which portal to navigate to
const userSchema = new mongoose.Schema({
_id:mongoose.Schema.Types.ObjectId,
name: {type:String,required:true},
password: {type: String, required: true},
email: {type: String, required: true},
role:{type:String,required:true}
},{timestamps:true});
export default mongoose.model("User", userSchema);
//A student schema holding the important info about a student, plus the id of a teacher from the Teachers model
const studentSchema = new mongoose.Schema({
_id:mongoose.Schema.Types.ObjectId,
age:{type:Number},
grade:{type:String},
teacher:{type:mongoose.Schema.Types.ObjectId,ref:'Teachers'}
},{timestamps:true});
export default mongoose.model("Students", studentSchema);
//A teacher model in which you can keep record of teacher
const teacherSchema = new mongoose.Schema({
_id:mongoose.Schema.Types.ObjectId,
subject:{type:String},
School:{type:String},
},{timestamps:true});
export default mongoose.model("Teachers", teacherSchema);

Performance on sorting by populated field using mongoose

I have learned that it is not possible to sort by a populated field in MongoDB at query time. Suppose I have a schema like the one below, with 1 million records. I only need to return 10 records per query, depending on the column being sorted (asc/desc) and the page requested. What is an effective solution to this problem?
Simplified problem:
In the front end, I will have a data table with the columns firstname, lastname, test.columnA and test.columnB. Each of these columns is sortable by the user.
My initial solution was to query everything with Mongoose, flatten it to JSON, reorder it in JavaScript, and finally respond with only the final 10 records. But performance will degrade as the data set grows.
var testSchema = {
columnA: { type: String },
columnB: { type: String },
}
var UserSchema = {
firstname: { type: String },
lastname: { type: String },
test: {
type: ObjectId,
ref: 'Test'
}
}
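One commonly used alternative to sorting in application code is to let the server join, sort, and page with an aggregation pipeline. This is a sketch, not an answer from the thread: buildPagePipeline is a hypothetical helper, and the collection name 'tests' assumes Mongoose's default pluralization of the 'Test' model name.

```javascript
// Hypothetical helper: builds a pipeline that joins each user with its
// Test document, sorts by one of the four columns, and returns one page.
function buildPagePipeline(sortField, direction, page, pageSize = 10) {
  const allowed = ['firstname', 'lastname', 'test.columnA', 'test.columnB'];
  if (!allowed.includes(sortField)) {
    throw new Error('Unsupported sort field: ' + sortField);
  }
  return [
    // Join on the ObjectId stored in user.test.
    { $lookup: { from: 'tests', localField: 'test', foreignField: '_id', as: 'test' } },
    // $lookup yields an array; unwind it to a single embedded document.
    // Note: users without a matching Test document are dropped by this stage.
    { $unwind: '$test' },
    { $sort: { [sortField]: direction === 'desc' ? -1 : 1 } },
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize }
  ];
}

const pipeline = buildPagePipeline('test.columnA', 'desc', 3);
console.log(JSON.stringify(pipeline[3])); // {"$skip":20}

// Assumed usage with a User model built from UserSchema:
//   const rows = await User.aggregate(pipeline);
```

Sorting on the joined columns cannot use an index on the users collection, so with a million records this may still be slow. Copying columnA and columnB onto the user document and indexing them there is another option to weigh.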

Mongoose populate() returning empty array

so I've been at it for like 4 hours, read the documentation several times, and still couldn't figure out my problem. I'm trying to do a simple populate() to my model.
I have a User model and Store model. The User has a favoriteStores array which contains the _id of stores. What I'm looking for is that this array will be populated with the Store details.
user.model
var mongoose = require('mongoose'),
Schema = mongoose.Schema;
var UserSchema = new Schema({
username: String,
name: {first: String, last: String},
favoriteStores: [{type: Schema.Types.ObjectId, ref: 'Store'}],
modifiedOn: {type: Date, default: Date.now},
createdOn: Date,
lastLogin: Date
});
UserSchema.statics.getFavoriteStores = function (userId, callback) {
this
.findById(userId)
.populate('favoriteStores')
.exec(function (err, stores) {
callback(err, stores);
});
}
And another file:
store.model
var mongoose = require('mongoose'),
Schema = mongoose.Schema;
var StoreSchema = new Schema({
name: String,
route: String,
tagline: String,
logo: String
});
module.exports = mongoose.model('Store', StoreSchema);
After running this what I get is:
{
"_id": "556dc40b44f14c0c252c5604",
"username": "adiv.rulez",
"__v": 0,
"modifiedOn": "2015-06-02T14:56:11.074Z",
"favoriteStores": [],
"name": {
"first": "Adiv",
"last": "Ohayon"
}
}
The favoriteStores is empty, even though when I just do a get of the stores without the populate it does display the _id of the store.
Any help is greatly appreciated! Thanks ;)
UPDATE
After using the deepPopulate plugin it magically fixed it. I guess the problem was with the nesting of the userSchema. Still not sure what the problem was exactly, but at least it's fixed.

I think this issue happens when schemas are defined across multiple files. To solve this, try calling populate this way:
.populate({path: 'favoriteStores', model: 'Store'})
