Running aggregate query on mongodb using mongoose hook during find

I have a model from which I want to fetch the user via an aggregation with some lookups during login, which I am thinking of adding to the mongoose hook below:
schema.pre("find", async function (next) {
this.aggregate([
{$lookup: {from: "categories", localField: "categories", foreignField: "_id", as: "categories"}},
])
next();
});
But this does not have the aggregate method exposed at this point. What possible solution can I opt for? I don't want to make a separate aggregate call or a different endpoint just for the aggregation.
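One possible direction (a sketch, not a confirmed fix): find middleware exposes a Query, not an Aggregate, so this.aggregate is unavailable there. If the login code can run the query through Model.aggregate() instead of find(), a pre("aggregate") hook can append the $lookup instead; the model name User below is hypothetical.

schema.pre("aggregate", function (next) {
  // Append the lookup so every aggregate on this model joins categories.
  this.pipeline().push({
    $lookup: { from: "categories", localField: "categories", foreignField: "_id", as: "categories" },
  });
  next();
});

// Hypothetical login query; the hook adds the $lookup stage automatically:
// const users = await User.aggregate([{ $match: { email: email } }]);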

Related

Aggregate $lookup stage overwrites data

How can I make a lookup without overwriting the existing data?
I recreated the situation in a mongodb playground: mongo db playground example
The problem is that I need to look up the events and all of the object IDs they reference (subevents, tags).
The trouble starts with the first lookup, to the subevents. I need all of the looked-up data in the same place as the ID used for the lookup, but the rest of the event's data is gone; only the subevents are there.
Any ideas?
Since you are giving the lookup output the same name as an existing field, it overwrites the existing data; if you give it a new name, all of the data persists.
For example, the aggregation below returns both the main event and the sub-event details; if required, we can add $project stages to nest the subevents under the main event (see the sketch after the playground link):
[
  { $match: { type: "EVENT" } },
  { $lookup: {
      from: "events",
      localField: "markedItemID",
      foreignField: "_id",
      as: "marked_event"
  } },
  { $unwind: { path: "$marked_event" } },
  { $lookup: {
      from: "events",
      localField: "marked_event.baseData.subEvents",
      foreignField: "_id",
      as: "marked_subEvents"
  } }
]
https://mongoplayground.net/p/dlxqiK1PKdH
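A minimal sketch of that nesting step, assuming the field names from the pipeline above: an $addFields stage (MongoDB 3.4+) moves the looked-up documents under the main event, replacing the ID array, and a $project drops the now-redundant top-level field:

{ $addFields: { "marked_event.baseData.subEvents": "$marked_subEvents" } },
{ $project: { marked_subEvents: 0 } }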

How concurrency/locking works on documents while passing mongoDB aggregation pipeline

Consider we have two collections, coll1 and coll2. I am applying some aggregation stages to coll1:
db.coll1.aggregate([
  { $match: { ... } },
  { $lookup: {
      from: "coll2",
      localField: "_id",
      foreignField: "_id",
      as: "coll2"
  } },
  // followed by other stages,
  // the last stage being $merge
  { $merge: { into: "coll3", on: "_id" } }
])
So, my questions are:
While the aggregation is in progress, can the underlying collection, in this case coll1, still be modified/updated? Either way, please help me understand how this works (I went through the MongoDB docs but could not understand it).
How does it write the final coll3? That is, does it write everything in one shot, or one document at a time as each finishes the pipeline?
Regarding spring-data-mongodb, I can successfully call mongoOperation.aggregate() with the above pipeline, but it returns an AggregationResults object with zero mappedResults (when checked in the db, coll3 does get created).
Does $merge not return any such details?
I am using MongoDB 4.2.
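On the last point, a $merge stage (like $out) writes its output documents to the target collection and leaves nothing on the cursor returned to the client, which would explain the empty mappedResults. A quick shell check, as a sketch against the pipeline above:

var cursor = db.coll1.aggregate([
  // ... same stages as above ...
  { $merge: { into: "coll3", on: "_id" } }
]);
cursor.toArray();  // => [] even though coll3 has been written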

Can we apply a projection at the model level that is applied wherever that model is accessed, even in aggregate or $lookup

Mongoose provides us with the ability to apply aggregation hooks, which work only when aggregation is run on that particular model:
userSchema.pre("aggregate", function () {
  this.pipeline().push({ $project: { _id: 1, firstName: 1, lastName: 1 } });
});
The above code works fine and proper projection is applied when we do
User.aggregate([...])
But the same projection is not applied when we look up users inside another model's aggregate:
{
  $lookup: {
    from: "users",
    localField: "user",
    foreignField: "_id",
    as: "associatedUser"
  }
},
Is there a way in mongoose to apply a projection at the model level, so that it is applied wherever the model is accessed and we don't have to repeat the projection in every aggregation query?
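Not at the hook level, as far as I know: $lookup runs entirely inside the server's aggregation engine and never passes through the other model's mongoose middleware. One workaround sketch, assuming MongoDB 3.6+ for the pipeline form of $lookup, is to embed the projection in the joined pipeline itself:

{
  $lookup: {
    from: "users",
    let: { userId: "$user" },
    pipeline: [
      { $match: { $expr: { $eq: ["$_id", "$$userId"] } } },
      // The same projection the aggregate hook pushes, repeated by hand here.
      { $project: { _id: 1, firstName: 1, lastName: 1 } }
    ],
    as: "associatedUser"
  }
}

The projection still has to appear in each query, but it can live in one shared constant that both the hook and these lookups reuse.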

Mongodb query execution take too much time

I am working on a Go project and using MongoDB to store my data, but suddenly query execution started taking too much time to return it.
I have a collection named "cars" with around 25,000 documents, each containing around 200 fields (4.385 KB). I have an aggregate query like this:
db.cars.aggregate([
  { $lookup: {
      from: "users",
      localField: "uid",
      foreignField: "_id",
      as: "customer_info"
  } },
  { $unwind: "$customer_info" },
  { $lookup: {
      from: "user_addresses",
      localField: "uid",
      foreignField: "_id",
      as: "address"
  } },
  { $unwind: "$address" },
  { $lookup: {
      from: "models",
      localField: "_id",
      foreignField: "car_id",
      as: "model_info"
  } },
  { $match: {
      purchased_on: { $gt: 1538392491 },
      status: { $in: [1, 2, 3, 4] },
      "customer_info.status": { $ne: 9 },
      "model_info.status": { $ne: 9 }
  } },
  { $sort: { arrival_time: 1 } },
  { $skip: 0 },
  { $limit: 5 }
])
My document structure is like: https://drive.google.com/file/d/1hM-lPwvE45_213rQDYaYuYYbt3LRTgF0/view.
Now, if I run this query without indexes, it takes around 10 minutes to load the data. Can anyone suggest how I can reduce its execution time?
There are many things you can do to optimize this query. What I would try:
As Anthony Winzlet said in the comments, put a $match stage as early as possible, ideally first. This way, you reduce the number of documents passed to the following stages and can use indexes.
Assuming you are on at least MongoDB 3.6, rewrite your lookup stages using the 'let/pipeline' syntax (see here). This way, you can move your 'external filters' ("customer_info.status": {$ne: 9}, "model_info.status": {$ne: 9}) into a $match stage inside each lookup's pipeline, as sketched below. With indexes on the right fields/collections, you will save time and memory in your $lookup stages.
Do your $unwind stages as late as possible, to restrict the number of documents passed to the following stages.
It's important to understand how the aggregation pipeline works: each stage receives data, does its work, and passes the data on to the next stage. So the less data passed through the pipeline, the faster your query will be.
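A partial sketch of the first two points, reusing the field names from the query above (the remaining lookups would follow the same pattern, and the suggested index is illustrative):

db.cars.aggregate([
  // Filter cars first so an index such as { purchased_on: 1, status: 1 } can be used
  // and later stages see far fewer documents.
  { $match: {
      purchased_on: { $gt: 1538392491 },
      status: { $in: [1, 2, 3, 4] }
  } },
  // let/pipeline form of $lookup: the status filter now runs inside the lookup,
  // so users with status 9 are discarded before they ever join the result.
  { $lookup: {
      from: "users",
      let: { uid: "$uid" },
      pipeline: [
        { $match: { $expr: { $eq: ["$_id", "$$uid"] } } },
        { $match: { status: { $ne: 9 } } }
      ],
      as: "customer_info"
  } },
  // ... user_addresses and models lookups, then $unwind, $sort, $skip, $limit as before.
])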

MongoDB Query across Multiple Collections

I have a collection (collectionA) that stores event IDs in an events array. The event information itself comes from collectionB.
Lately, when an event is deleted from collectionB via the web app, it sometimes does not get removed from collectionA as it should.
Is there a query I can run in Mongo 3.0 to check which event IDs exist in collectionA but not in collectionB? Those will be the ones that need to be removed while the development team resolves the issue.
Here is a sample query that will give you a list of such objects, assuming collectionA has an events array with IDs from collectionB:
db.collectionA.aggregate([
  { $unwind: '$events' },
  { $lookup: {
      from: 'collectionB',
      localField: 'events',
      foreignField: '_id',
      as: 'event'
  } },
  { $unwind: { path: '$event', preserveNullAndEmptyArrays: true } },
  { $match: { 'event': { $exists: false } } }
])
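As a follow-up sketch of the interim cleanup the question mentions, assuming one dangling ID reported by the aggregation above is held in a hypothetical variable danglingEventId, a $pull update removes it from every events array:

// danglingEventId is hypothetical: one ID reported by the aggregation above.
db.collectionA.update(
  { events: danglingEventId },
  { $pull: { events: danglingEventId } },
  { multi: true }
)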