What's the strategy for creating index for this query? - mongodb

I have a comment model for each thread,
const CommentSchema = new mongoose.Schema({
author: { type: ObjectID, required: true, ref: 'User' },
thread: { type: ObjectID, required: true, ref: 'Thread' },
parent: { type: ObjectID, required: true, ref: 'Comment' },
text: { type: String, required: true },
}, {
timestamps: true,
});
Besides looking up a single comment by _id, I want to query the database like this:
Range query
const query = {
thread: req.query.threadID,
_id: { $gt: req.query.startFrom }
};
CommentModel.find(query).limit(req.query.limit);
My intention is to find the comments that belong to a thread and then fetch part of the result. The query seems to work as expected. My questions are:
Is this the right way to fulfill my requirement?
How do I properly index the fields? Should this be a compound index, or do I need to index each field separately? I checked the result of explain(), and it seems that as long as one of the query fields has an index, inputStage.stage is always IXSCAN rather than COLLSCAN. Is this the key information for checking the performance of the query?
Does this mean that every time I need to query on a different set of fields, I need to create an index for those fields? Say I want to find all the comments posted by an author in a specific thread.
Code like this:
const query = {
thread: req.query.threadID,
author: req.query.authorID,
};
Do I need to create a compound index for this requirement?

If you want to query by multiple fields, a compound index is the right tool.
For example:
const query = {
thread: req.query.threadID,
author: req.query.authorID,
};
To support this query, create a compound index like this:
db.comments.createIndex( { "thread": 1, "author": 1 } );
The index then supports queries on the thread field alone as well as on both the thread and author fields:
db.comments.find( { thread: "threadName" } )
db.comments.find( { thread: "threadName", author: "authorName" } )
The order of the fields inside the query document does not matter, so this query uses the index too:
db.comments.find( { author: "authorName", thread: "threadName" } )
What the index does not support is a query on the author field alone, because author is not a prefix of the index:
db.comments.find( { author: "authorName" } )
In general, if you create an index on { "thread": 1, "author": 1, "parent": 1 }, the index supports queries on these prefixes:
the thread field,
the thread field and the author field,
the thread field and the author field and the parent field.
It does not support queries on:
the author field,
the parent field, or
the author and parent fields.
For more read this
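The prefix rule can be checked mechanically. The following is a minimal sketch in plain JavaScript, not MongoDB's actual query planner: usablePrefixLength is a hypothetical helper that counts how many leading fields of a compound index are covered by the fields a query filters on. It ignores sorting, selectivity and everything else the real planner weighs.

```javascript
// Hypothetical helper: how many leading fields of a compound index
// can be used by a query that filters on `queryFields`.
function usablePrefixLength(indexFields, queryFields) {
  const queried = new Set(queryFields);
  let length = 0;
  for (const field of indexFields) {
    if (!queried.has(field)) break; // the prefix ends at the first missing field
    length += 1;
  }
  return length;
}

const index = ['thread', 'author', 'parent'];

console.log(usablePrefixLength(index, ['thread']));            // 1
console.log(usablePrefixLength(index, ['author', 'thread']));  // 2
console.log(usablePrefixLength(index, ['author', 'parent']));  // 0
console.log(usablePrefixLength(['thread', '_id'], ['thread', '_id'])); // 2
```

By the same rule, an index on { thread: 1, _id: 1 } would cover the range query from the question, which filters on thread by equality and on _id by range.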

Related

MongoDB aggregate search with multiple fields

I am trying to build an API for searching jobs.
Frontend input: a single keyword field containing a string
API response: Return list of jobs that match any of the following fields
skills
location
company
Schemas
1.Job schema
title: {
type: String,
required: true,
},
location: {
type: mongoose.Schema.Types.ObjectId,
ref: 'location',
},
skills:[{
type: mongoose.Schema.Types.ObjectId,
ref: 'Skill'
}],
company: {
type: mongoose.Schema.Types.ObjectId,
ref: 'company',
},
As you can see, skills, location and company are references to other collections, and the frontend does not indicate which of them the keyword refers to, so I am not sure how to write an efficient search query.
My current approach is:
Find skill_id based on skill name and fetch all jobs that have desired skill
Follow the same for location and company
But I am not sure this is the right approach. Can somebody advise a proper way of doing this?
Many strategies can be applied in your case (yours, AdamExchange's, an aggregation with $lookup stages...), depending on the size of the collections, the indexes, etc.
But I think you really have to look at indexes and index intersection strategies to optimize your query.
I would:
First, create a single-field index on the name field of each of the skill, location and company collections, so that you can find the ids in those collections using an index.
Then create single-field indexes on the job collection: location, skills, company.
Then you can simply run your queries like this (assuming MyKeyword is the value of your frontend field) [pseudo code, I don't know the language you use]:
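// findOne returns one matching document, or null when nothing matches
skillDoc = db.skill.findOne({ name: MyKeyword });
locationDoc = db.location.findOne({ name: MyKeyword });
companyDoc = db.company.findOne({ name: MyKeyword });
// each lookup is assumed to have found a document; drop the clause of any lookup that returned null
db.job.find({
$or: [
{
skills: {
$eq: skillDoc._id
}
},
{
location: {
$eq: locationDoc._id
}
},
{
company: {
$eq: companyDoc._id
}
}
]
})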
This way the lookups in the 'secondary collections' use their indexes, and each clause of the $or condition on the main collection can use its own index.
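Since the keyword will usually match in only one or two of the secondary collections, the $or filter has to be assembled from whichever lookups succeeded. A minimal sketch of that step, assuming the ids have already been fetched; buildJobFilter is a hypothetical helper, and the string ids stand in for ObjectId values:

```javascript
// Build the $or filter for the job collection from the ids found in the
// skill, location and company collections. Lookups that found nothing
// (null or undefined) are left out of the filter.
function buildJobFilter({ skillId, locationId, companyId }) {
  const clauses = [];
  if (skillId != null) clauses.push({ skills: skillId });
  if (locationId != null) clauses.push({ location: locationId });
  if (companyId != null) clauses.push({ company: companyId });
  // $or must not be empty, so signal "no match" to the caller instead.
  return clauses.length > 0 ? { $or: clauses } : null;
}

// Placeholder strings stand in for ObjectId values.
const jobFilter = buildJobFilter({
  skillId: 'skill-1',
  locationId: null,
  companyId: 'company-9',
});
console.log(JSON.stringify(jobFilter));
// {"$or":[{"skills":"skill-1"},{"company":"company-9"}]}
```

When the helper returns null, the caller can answer with an empty result list without querying the job collection at all.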

How to define mongoose schema for nested documents

I need to define the mongoose schema for nested documents like the one given below.
Documents:
"Options":[{"Value":["28","30","32","34","36","38","40","42","44","46"],"_id":{"$oid":"5de8427af55716115dd43c8f"},"Name":"Size"},{"Value":["White"],"_id":{"$oid":"5de8427af55716115dd43c8e"},"Name":"Colour"}]
I declared it like below, but it's not working.
const Product = new Schema(
{
Options: [{ value: { _id: ObjectId, Name: String } }]
},
{
timestamps: {
createdAt: "createdAt",
updatedAt: "updatedAt"
},
collection: "products"
}
);
I need a schema such that if I directly add or update a document of this shape, it will be saved.
You need to modify your schema like this (the field names must match the document, so it is Value, not value):
{
Options: [ new Schema ({ Value: [...], _id: Schema.Types.ObjectId, Name: String })]
}
This is the way to create an array of subdocuments with Mongoose. If you don't use the "new Schema" keywords, you are actually creating a field with type "Mixed", which needs a different way to handle updates.
You can also omit the _id; it should be added automatically.
You can find more information on subdocument on this page :
https://mongoosejs.com/docs/subdocs.html
...and on mixed type fields : https://mongoosejs.com/docs/schematypes.html#mixed
...which briefly explains the problem.
{
Options: [ new Schema ({ _id: Schema.Types.ObjectId, Value: [String], Name: String }) ]
}

Pulling/deleting an item from a nested array

Note: it's a Meteor project.
My document structure looks like this:
{
_id: 'someid',
nlu: {
data: {
synonyms:[
{_id:'abc', value:'car', synonyms:['automobile']}
]
}
}
}
The schema is defined with simple-schema. Relevant parts:
'nlu.data.synonyms.$': Object,
'nlu.data.synonyms.$._id': {type: String, autoValue: ()=> uuidv4()},
'nlu.data.synonyms.$.value': {type:String, regEx:/.*\S.*/},
'nlu.data.synonyms.$.synonyms': {type: Array, minCount:1},
'nlu.data.synonyms.$.synonyms.$': {type:String, regEx:/.*\S.*/},
I am trying to remove {_id:'abc'}:
Projects.update({_id: 'someid'},
{$pull: {'nlu.data.synonyms' : {_id: 'abc'}}});
The query returns 1 (one doc was updated) but the item was not removed from the array. Any idea?
This is my insert query
db.test.insert({
"_id": "someid",
"nlu": {
"data": {
"synonyms": [
{
"_id": "abc"
},
{
"_id": "def"
},
10,
[ 5, { "_id": 5 } ]
]
}
}
})
And here is my update
db.test.update(
{
"_id": "someid",
"nlu.data.synonyms._id": "abc"
},
{
"$pull": {
"nlu.data.synonyms": {
"_id": "abc"
}
}
}
)
The problem comes down to the autoValue parameter on your _id property.
autoValue is a very powerful feature for setting values automatically in your schema. However, here it prevented the pull, because it always returned a value, indicating that the field should be set.
To fix this, make the function check whether an operator is present (as is the case for mongo updates).
Your autoValue would then look like:
'nlu.data.synonyms.$._id': {type: String, autoValue: function(){
if (this.operator) {
this.unset();
return;
}
return uuidv4();
}},
Edit: Note that the function here is not an arrow function; otherwise it loses the context that SimpleSchema binds to it.
It basically only returns a new uuid4 when there is no operator present (as in insert operations). You can extend this further to your needs with the provided functionality (see the documentation).
I just condensed my code into a reproducible example:
import uuidv4 from 'uuid/v4';
const Projects = new Mongo.Collection('PROJECTS')
const ProjectSchema ={
nlu: Object,
'nlu.data': Object,
'nlu.data.synonyms': {
type: Array,
},
'nlu.data.synonyms.$': {
type: Object,
},
'nlu.data.synonyms.$._id': {type: String, autoValue: function(){
if (this.operator) {
this.unset();
return;
}
return uuidv4();
}},
'nlu.data.synonyms.$.value': {type:String, regEx:/.*\S.*/},
'nlu.data.synonyms.$.synonyms': {type: Array, minCount:1},
'nlu.data.synonyms.$.synonyms.$': {type:String, regEx:/.*\S.*/},
};
Projects.attachSchema(ProjectSchema);
Meteor.startup(() => {
const insertId = Projects.insert({
nlu: {
data: {
synonyms:[
{value:'car', synonyms:['automobile']},
]
}
}
});
Projects.update({_id: insertId}, {$pull: {'nlu.data.synonyms' : {value: 'car'}}});
const afterUpdate = Projects.findOne(insertId);
console.log(afterUpdate, afterUpdate.nlu.data.synonyms.length); // 0
});
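The operator check can also be tried outside Meteor. The following is a rough sketch only: fakeContext is a hypothetical stand-in for the context SimpleSchema binds to autoValue (it models just operator and unset), and Node's crypto.randomUUID replaces uuidv4 so that the snippet has no dependencies.

```javascript
const { randomUUID } = require('node:crypto'); // stand-in for uuidv4()

// Same logic as the autoValue above, written as a regular function so
// that `this` is whatever context the caller binds.
function autoId() {
  if (this.operator) {
    this.unset();
    return undefined;
  }
  return randomUUID();
}

// Hypothetical stand-in for the context SimpleSchema binds to autoValue.
function fakeContext(operator) {
  return {
    operator,
    wasUnset: false,
    unset() { this.wasUnset = true; },
  };
}

const onInsert = fakeContext(null);     // insert: no operator present
const onPull = fakeContext('$pull');    // update with an operator

const insertValue = autoId.call(onInsert);
const pullValue = autoId.call(onPull);

console.log(typeof insertValue, onInsert.wasUnset); // string false
console.log(pullValue, onPull.wasUnset);            // undefined true
```

An arrow function would ignore the context passed through call, which is why the schema needs a regular function.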
Optional Alternative: Normalizing Collections
There is one additional note on optimization.
You can work around this auto-id generation issue by normalizing synonyms into their own collection, where the mongo insert provides you with an id. I am not sure how unique this id is compared to uuidv4, but I have never faced id issues with it.
A setup could look like this:
const Synonyms = new Mongo.Collection('SYNONYMS');
const SynonymsSchema = {
value: {type:String, regEx:/.*\S.*/},
synonyms: {type: Array, minCount:1},
'synonyms.$': {type:String, regEx:/.*\S.*/},
};
Synonyms.attachSchema(SynonymsSchema);
const Projects = new Mongo.Collection('PROJECTS')
const ProjectSchema ={
nlu: Object,
'nlu.data': Object,
'nlu.data.synonyms': {
type: Array,
},
'nlu.data.synonyms.$': {
type: String,
},
};
Projects.attachSchema(ProjectSchema);
Meteor.startup(() => {
// just add this entry once
if (Synonyms.find().count() === 0) {
Synonyms.insert({
value: 'car',
synonyms: ['automobile']
})
}
// get the id
const carId = Synonyms.findOne()._id;
const insertId = Projects.insert({
nlu: {
data: {
synonyms:[carId] // push the _id as reference
}
}
});
// ['MG464i9PgyniuGHpn'] => reference to Synonyms document
console.log(Projects.findOne(insertId).nlu.data.synonyms);
Projects.update({_id: insertId}, {$pull: {'nlu.data.synonyms' : carId }}); // pull the reference
const afterUpdate = Projects.findOne(insertId);
console.log(afterUpdate, afterUpdate.nlu.data.synonyms.length);
});
I know this was not part of the question but I just wanted to point out that there are many benefits of normalizing complex document structures into separate collections:
no duplicate data
decouple data that is not intended to be bound (here: Synonyms could be also used independently from Projects)
update referred documents once, and all Projects will point to the current version (since it's a reference)
finer publication/subscription handling => more control about what data flows over the wire
reduces complex auto and default value generation
changes in the referred collection's schema may have only few consequences for UI and functions that make use of the referrer's schema.
Of course this also has disadvantages:
more collections to handle
more code to write (more code = more potential errors)
more tests to write (much more time to invest)
sometimes you need to denormalize back for this one case out of 100
you have to invest a lot of time in data schema design before starting to code

How to query a document using mongoose and send the document to the client with only one relevant element from an array field?

I have the following schema:
var lessonSchema = mongoose.Schema({
_id: mongoose.Schema.Types.ObjectId,
name: String,
students: [{
_id: mongoose.Schema.Types.ObjectId,
attendance: {
type: Boolean,
default: false,
},
}],
});
The students array is an array of students who attended the particular lesson. I want to find a lesson based on whether a particular user is present in the students array and then send only the element of the students array that corresponds to the user making the request, along with all other fields as they are. For example, the query should return:
{
_id: 'objectid',
name: 'lesson-name'
students: [details of just the one student corresponding to req.user._id]
}
I tried using:
Lesson.find({'students._id': String(req.user._id)}, {"students.$": 1})
The query returns the document with just the id and the relevant element from the students array:
{
_id: 'objectid'
students: [details of the one student corresponding to req.user._id]
}
I tried using:
Lesson.find({'students._id': mongoose.Types.ObjectId(req.user._id)})
This returns the document with the details of all the students:
{
_id: 'objectid',
name: 'lesson-name'
students: [array containing details of all the students who attended the lesson]
}
How can I modify the query to return it the way I want?
You can return the name field by adding it to the projection object like this:
Lesson.find({ "students._id": String(req.user._id) }, { "name": 1, "students.$": 1 })
When you add a projection object (2nd parameter to find), the _id field is returned by default, plus whichever fields you set to 1.
Therefore, you were returning just the _id and the desired student but not the name field.
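If you want to return all other fields and limit the array to the matched item, you can use an aggregation instead of a projection: $addFields overwrites students with a filtered copy and keeps every other field. Mongoose does not cast values inside aggregation pipelines, so the id is converted to an ObjectId explicitly:
const userId = new mongoose.Types.ObjectId(String(req.user._id));
Lesson.aggregate([{ $match: { "students._id": userId } }, { $addFields: { students: { $filter: { input: "$students", as: "s", cond: { $eq: ["$$s._id", userId] } } } } }])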
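Another option, sketched here under the assumption that the lessons are small enough to post-process in application code, is to fetch the whole lesson and trim the students array before sending it. onlyStudent is a hypothetical helper and expects a plain object (for example the result of a lean() query or of toObject()); the string ids stand in for ObjectId values.

```javascript
// Return a copy of the lesson with every field kept as it is, but with
// `students` reduced to the entry whose _id matches userId.
function onlyStudent(lesson, userId) {
  const wanted = String(userId);
  return {
    ...lesson,
    students: lesson.students.filter((s) => String(s._id) === wanted),
  };
}

// Placeholder strings stand in for ObjectId values.
const lesson = {
  _id: 'lesson-1',
  name: 'lesson-name',
  students: [
    { _id: 'u1', attendance: true },
    { _id: 'u2', attendance: false },
  ],
};

const trimmed = onlyStudent(lesson, 'u2');
console.log(trimmed);
```

This transfers the full students array from the database, so it trades some bandwidth for a simpler query.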

Mongodb: How to add unique value to each element in array?

I'm a new user of mongodb and I have a model like the one below. To update list data, I have to specify the element in the array, so I think I need to store a unique value for each element, because list.name and list.price are variable data.
So is there a good way to create a unique id in mongodb? Or should I create unique ids myself?
{
name: 'AAA',
list: [
{name: 'HOGE', price: 10, id: 'XXXXXXXXXX'}, // way to add id
{name: 'FUGA', price: 12, id: 'YYYYYYYYYY'} // way to add id
]
}
MongoDB creates a unique id only for documents. There is no built-in equivalent for list or array elements, so you should create the unique ids yourself.
Also keep in mind to use $addToSet when updating your list.
For more information on $addToSet, see its documentation.
Use ObjectId() for your id field, like this:
db.test.update({name: "AAA"}, { $push: { list: {_id : ObjectId(), name: "dingles", price: 21} }});
reference: https://docs.mongodb.org/v3.0/reference/object-id/
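Once every element carries its own id, a single element can be addressed with the positional $ operator. The following is a minimal sketch, assuming the ids are generated in application code; withIds and buildPriceUpdate are hypothetical helpers, and Node's crypto.randomUUID is used here only as one possible id source.

```javascript
const { randomUUID } = require('node:crypto');

// Give every list element its own id before the document is saved.
// An id that is already present on an item is kept.
function withIds(items) {
  return items.map((item) => ({ id: randomUUID(), ...item }));
}

// Build the filter and update for changing one element's price.
// 'list.$' refers to the first array element matched by the filter.
function buildPriceUpdate(docName, itemId, price) {
  return {
    filter: { name: docName, 'list.id': itemId },
    update: { $set: { 'list.$.price': price } },
  };
}

const doc = {
  name: 'AAA',
  list: withIds([
    { name: 'HOGE', price: 10 },
    { name: 'FUGA', price: 12 },
  ]),
};

const priceChange = buildPriceUpdate(doc.name, doc.list[1].id, 15);
console.log(priceChange.filter, priceChange.update);
```

The resulting pair could then be passed to an update call such as db.test.update(filter, update).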
For whoever is seeing this in 2022: Mongoose creates unique ids automatically for array elements; we just have to provide a schema for that particular array.
Like this:
_id : {
type: String
},
list: {
type: [{
Name : {
type: String
},
price : {
type: String
}
}]
}
This schema will generate an automatic _id for every element added to the array,
but the example below will not:
_id : {
type: String
},
list: {
type: Array
}