Slow MongoDB lookup - mongodb

I am quite new to MongoDB, and I have an aggregate query that takes 5 minutes on 10k+ documents, which doesn't seem right. Products and SKUs have a one-to-many relationship, and I want to find all products that have at least one SKU with a quantity greater than 0.00001.
products = await this.productModel.aggregate([
{
$lookup: {
from: 'skus',
let: { myid: '$_id' },
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ['$product', '$$myid'] },
{ $gt: ['$quantity', 0.00001] },
],
},
},
},
],
as: 'skus',
},
},
{
$addFields: {
id: '$_id',
skusize: { $size: '$skus' },
},
},
{
$match: { skusize: { $gt: 0 } },
},
]);
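One direction worth trying, offered as a sketch rather than a verified fix: start from the skus collection so the quantity filter runs once, instead of running a sub-pipeline for every product, and then join back to the products by _id. The helper name buildInStockProductsPipeline and the collection name 'products' are assumptions that do not appear in the question, and the index suggestion in the comments has not been measured.

```javascript
// Sketch only: this pipeline is meant to run on the `skus` collection,
// not on products. Assumptions: each SKU stores its product's _id in
// `product` (as in the question) and the products collection is
// named 'products'.
function buildInStockProductsPipeline(minQuantity) {
  return [
    // Keep only SKUs above the threshold; an index on `quantity` may help.
    { $match: { quantity: { $gt: minQuantity } } },
    // Collapse to one document per distinct product id.
    { $group: { _id: '$product' } },
    // Join each id to its product document through the default _id index.
    {
      $lookup: {
        from: 'products',
        localField: '_id',
        foreignField: '_id',
        as: 'product',
      },
    },
    // Drop ids with no matching product, then promote the product document.
    { $unwind: '$product' },
    { $replaceRoot: { newRoot: '$product' } },
  ];
}

const pipeline = buildInStockProductsPipeline(0.00001);
console.log(JSON.stringify(pipeline, null, 2));
```

Unlike the original query, this returns the product documents without the skus array or the skusize field. If those are needed, keeping the original $lookup and adding an index on the product field of skus is another option to test.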

Related

How to join two Mongo DB Collections together, with one being an Array of Objects inside the Other

I have two collections, one being Companies and the others being Projects. I am trying to write an aggregation function that first grabs all Companies with the status of "Client", then from there write a pipeline that will return all filtered Companies where the company._id === project.companyId, as an Array of Objects. An example of the shortened Collections are below:
Companies
{
_id: ObjectId('2341908342'),
companyName: "Meta",
address: "123 Facebook Lane",
status: "Client"
}
Projects
{
_id: ObjectId('234123840'),
companyId: '2341908342',
name: "Test Project",
price: 97450,
}
{
_id: ObjectId('23413456'),
companyId: '2341908342',
name: "Test Project 2",
price: 100000,
}
My desired outcome after the Aggregation:
Companies
{
_id: ObjectId('2341908342'),
companyName: "Meta",
address: "123 Facebook Lane",
projects: [ [Project1], [Project2] ],
}
The projects field does not currently exist on the Companies collection, so I imagine we would have to add it. I also began writing a $match stage to filter by clients, but I am not sure if this is correct. I am trying to use $lookup for this but cannot figure out the pipeline. Can anyone help me?
Where I'm currently stuck:
try {
const allClientsWithProjects = await companyCollection
.aggregate([
{
$match: {
orgId: {
$in: [new ObjectId(req.user.orgId)],
},
status: { $in: ["Client"] },
},
},
{
$addFields: {
projects: [{}],
},
},
{
$lookup: { from: "projects", (I am stuck here) },
},
])
.toArray()
Thank you for any help anyone can provide.
UPDATE:
I feel like I am very close. This is what I have currently; it returns everything, but projects is still an empty array.
try {
const allClients = await companyCollection
.aggregate([
{
$match: {
orgId: {
$in: [new ObjectId(req.user.orgId)],
},
status: {
$in: ["Client"],
},
},
},
{
$lookup: {
from: "projects",
let: {
companyId: {
$toString: [req.user.companyId],
},
},
pipeline: [
{
$match: {
$expr: {
$eq: ["$companyId", "$$companyId"],
},
},
},
],
as: "projects",
},
},
])
.toArray()
All of my company information is being returned correctly for multiple companies, but the projects array is still []. Any help would be appreciated, and I will keep troubleshooting this.
One option is using a $lookup with a pipeline:
db.company.aggregate([
{
$match: {
_id: {
$in: [
ObjectId("5a934e000102030405000000")
],
},
status: {
$in: [
"Client"
]
},
},
},
{
$lookup: {
from: "Projects",
let: {
companyId: {
$toString: "$_id"
}
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$companyId",
"$$companyId"
]
}
}
}
],
as: "projects"
}
}
])
See how it works on the playground example
Final answer for my question:
try {
const allClientsAndProjects = await companyCollection
.aggregate([
{
$match: {
orgId: {
$in: [new ObjectId(req.user.orgId)],
},
status: {
$in: ["Client"],
},
},
},
{
$lookup: {
from: "projects",
let: {
companyId: {
$toString: "$_id",
},
},
pipeline: [
{
$match: {
$expr: {
$eq: ["$companyId", "$$companyId"],
},
},
},
],
as: "projects",
},
},
])
.toArray()
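The working version differs from the earlier attempt in one place: let now reads each company's own _id ("$_id") instead of a value taken from the request, so every company is compared against its own projects. A possible equivalent, sketched below, exposes the string form of _id as a temporary field and uses the simpler localField/foreignField form of $lookup. The helper name buildClientsWithProjectsPipeline and the temporary field companyIdStr are made up for illustration, and $toString assumes MongoDB 4.0 or later.

```javascript
// Sketch only. `orgId` is expected to already be an ObjectId, e.g.
// new ObjectId(req.user.orgId) as in the question.
function buildClientsWithProjectsPipeline(orgId) {
  return [
    // Same filter as the question; $in with a single value is plain equality.
    { $match: { orgId: orgId, status: 'Client' } },
    // projects.companyId is a string, so expose the company _id as a string.
    { $addFields: { companyIdStr: { $toString: '$_id' } } },
    // Equality join between the temporary field and projects.companyId.
    {
      $lookup: {
        from: 'projects',
        localField: 'companyIdStr',
        foreignField: 'companyId',
        as: 'projects',
      },
    },
    // Remove the temporary field from the output.
    { $project: { companyIdStr: 0 } },
  ];
}

// A placeholder string stands in for the ObjectId here.
console.log(JSON.stringify(buildClientsWithProjectsPipeline('<orgId>'), null, 2));
```

Storing companyId as an ObjectId in projects would remove the need for any conversion, but that is a schema change outside the scope of the question.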

MongoDB aggregate performance

I have two collections, bids and auctions. The bids collection has almost one million records and the auctions collection has 500k records. I am able to get the customer-wise count of bids, but I need the same result to be fetched quickly.
The query below takes almost 29 seconds to respond, and I need a much faster response time.
{ $match: { customer: '00000000823026' } },
{ $group: {
_id: '$auctioncode',
data: {
$last: '$$ROOT'
}
}
},
{
$lookup: {
from: 'auctions',
let: { auctioncode: '$_id' },
pipeline: [
{
$match: {
$expr: {
$and: [{ $eq: ['$_id', '$$auctioncode'] }],
},
},
},
],
as: 'auction',
},
},
{ $match: { auction: { $exists: true, $not: { $size: 0 } } } },
{
$addFields: {
_id: '$data._id',
auctioncode: '$data.auctioncode',
amount: '$data.amount',
customer: '$data.customer',
customerName: '$data.customerName',
maxBid: '$data.maxBid',
stockcode: '$data.stockcode',
watchlistHidden: '$data.watchlistHidden',
winner: '$data.winner'
}
},
{
$match: {
$and: [
{
// to get only RUNNING auctions in bid history
'auction.status': { $ne: 'RUNNING'},
// to filter auctions based on status
// tslint:disable-next-line:max-line-length
},
],
},
},
{ $sort: { 'auction.enddate': -1 } },
{ $count: 'totalCount'}
The current result is totalCount: 2640.
How can I optimize this query to improve its performance in MongoDB?
If all you need is the count of results, the code below is more efficient.
Index the customer key for an even better execution time.
Note: You can use the pipeline form of $lookup if you are running MongoDB version >= 5.0, as it makes use of indexes, unlike lower versions.
db.collection.aggregate([
{
$match: {
customer: '00000000823026'
}
},
{
$group: {
_id: '$auctioncode',
data: {
$last: '$$ROOT'
}
}
},
{
$lookup: {
from: 'auctions',
localField: '_id',
foreignField: '_id',
as: 'auction',
},
},
{
$match: {
// auction: {$ne: []},
// to get only RUNNING auctions in bid history
'auction.status': { $ne: 'RUNNING'},
// to filter auctions based on status
},
},
// {
// $addFields: { <-- Not Required
// _id: '$data._id',
// auctioncode: '$data.auctioncode',
// amount: '$data.amount',
// customer: '$data.customer',
// customerName: '$data.customerName',
// maxBid: '$data.maxBid',
// stockcode: '$data.stockcode',
// watchlistHidden: '$data.watchlistHidden',
// winner: '$data.winner'
// }
// },
// { $sort: { 'auction.enddate': -1 } }, <-- Not Required
{ $count: 'totalCount'}
], {
allowDiskUse: true
})
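The advice to index the customer key could look like the sketch below. It assumes the official Node.js driver, a collection named bids as described in the question, and a compound index that also covers auctioncode. The helper name ensureBidIndexes and the choice of a compound index are assumptions, and the gain has not been measured. The auctions side of the join uses _id, which MongoDB always indexes.

```javascript
// Sketch only: `db` is assumed to be a connected Db instance from the
// official MongoDB Node.js driver.
function ensureBidIndexes(db) {
  // Compound index: serves the $match on `customer` and keeps
  // `auctioncode` next to it for the following $group.
  return db.collection('bids').createIndex({ customer: 1, auctioncode: 1 });
}

// Hypothetical usage, inside an async function with a connected client:
//   await ensureBidIndexes(client.db('mydb'));
```

Running the aggregation with explain before and after creating the index is the most direct way to check whether the $match stage actually uses it.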

$match in lookup stage without removing local documents in mongoDB

Right now I have a lookup stage that matches a collection of answers with their respective questions. The problem I'm having is that if a question does not have a matching answer document in the lookup collection, it is removed from the results of the aggregation. How can I avoid this?
{
$lookup: {
from: 'groupansphotos',
let: { question_id: '$questions._id' },
pipeline: [
{
$match: {
$expr: { $eq: ['$_id', '$$question_id'] },
},
},
{ $unwind: '$answers' },
{ $match: { 'answers.reported': { $lt: 1 } } },
{ $sort: { 'answers.helpful': -1 } },
{ $limit: 2 },
{
$project: {
_id: 0,
k: { $toString: '$answers._id' }, // k
v: '$$ROOT.answers', // v
},
},
],
as: 'answers',
},
},
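On its own, $lookup behaves as a left outer join: a local document with no match keeps its place and simply gets an empty answers array. The loss may therefore come from a stage that is not shown here. If that stage is a $unwind on answers, one thing to try is the preserveNullAndEmptyArrays option, sketched below. The helper name unwindKeepingUnmatched is made up for illustration, and the existence of a later $unwind is an assumption about the missing part of the pipeline.

```javascript
// Sketch only: builds a $unwind stage that keeps documents whose array
// is empty, missing, or null instead of dropping them.
function unwindKeepingUnmatched(path) {
  return {
    $unwind: {
      path: path,
      preserveNullAndEmptyArrays: true,
    },
  };
}

// Hypothetical placement: directly after the $lookup that fills `answers`.
const stage = unwindKeepingUnmatched('$answers');
console.log(JSON.stringify(stage));
```

If the later stage is a $match on a field of answers rather than a $unwind, the same idea applies: the condition would need an extra branch that also accepts an empty array.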

Referencing root _id in aggregate lookup match expression not working

This is my first experience using aggregate pipeline. I'm not able to get a "$match" expression to work inside the pipeline. If I remove the "_id" match, I get every document in the collection past the start date, but once I add the $eq expression, it returns empty.
I read a lot of other examples and tried many different ways, and this seems like it is correct. But the result is empty.
Any suggestions?
let now = new Date()
let doc = await Team.aggregate([
{ $match: { created_by: mongoose.Types.ObjectId(req.params.user_oid)} },
{ $sort: { create_date: 1 } },
{ $lookup: {
from: 'events',
let: { "team_oid": "$team_oid" },
pipeline: [
{ $addFields: { "team_oid" : { "$toObjectId": "$team_oid" }}},
{ $match: {
$expr: {
$and: [
{ $gt: [ "$start", now ] },
{ $eq: [ "$_id", "$$team_oid" ] }
]
},
}
},
{
$sort: { start: 1 }
},
{
$limit: 1
}
],
as: 'events',
}},
{
$group: {
_id: "$_id",
team_name: { $first: "$team_name" },
status: { $first: "$status" },
invited: { $first: "$invited" },
uninvited: { $first: "$uninvited" },
events: { $first: "$events.action" },
dates: { $first: "$events.start" } ,
team_oid: { $first: "$events.team_oid" }
}
}])
Example Docs (added by request)
Events:
_id:ObjectId("60350837c57b3a15a414d265")
invitees:null
accepted:null
sequence:7
team_oid:ObjectId("60350837c57b3a15a414d263")
type:"Calendar Invite"
action:"Huddle"
status:"Questions Issued"
title:"Huddle"
body:"This is a Huddle; you should receive new questions 5 days befor..."
creator_oid:ObjectId("5ff9e50a206b1924dccd691e")
start:2021-02-26T07:00:59.999+00:00
end:2021-02-26T07:30:59.999+00:00
__v:0
Team:
_id:ObjectId("60350837c57b3a15a414d263")
weekly_schedule:1
status:"Live"
huddle_number:2
reminders:2
active:true
created_by:ObjectId("5ff9e50a206b1924dccd691e")
team_name:"tESTI"
create_date:2021-02-23T13:50:47.172+00:00
__v:0
This is just a guess since you don't have a schema in your question, but it looks like you have some of your _ids mixed up: you are currently trying to $match events whose _id is equal to a team_oid, rather than events whose team_oid field is equal to the current team's _id.
I'm pretty confident this will produce the correct output. If you post any schema or sample docs I will edit it.
https://mongoplayground.net/p/5i1w2Ii7KCR
let now = new Date()
let doc = await Team.aggregate([
{ $match: { created_by: mongoose.Types.ObjectId(req.params.user_oid)} },
{ $sort: { create_date: 1 } },
{ $lookup: {
from: 'events',
// Set team_oid as the current team _id
let: { "team_oid": "$_id" },
pipeline: [
{ $match: {
$expr: {
$and: [
{ $gt: [ "$start", now ] },
// Match events whose 'team_oid' field matches the 'team' _id set above
{ $eq: [ "$team_oid", "$$team_oid" ] }
]
},
}
},
{
$sort: { start: 1 }
},
{
$limit: 1
}
],
as: 'events',
}},
{
$group: {
_id: "$_id",
team_name: { $first: "$team_name" },
status: { $first: "$status" },
invited: { $first: "$invited" },
uninvited: { $first: "$uninvited" },
events: { $first: "$events.action" },
dates: { $first: "$events.start" } ,
team_oid: { $first: "$events.team_oid" }
}
}])

MongoDB Aggregate - Get Total Count and Skip in one pipeline

I am trying to get the total count of documents which match my pipeline operators and then after I get that count of all of them I would like to use $skip and $limit to return a subset for pagination. Right now I am basically doing the same aggregation twice - once to get the count of all matches and once to do the skip/limit. Can this be done using one aggregation pipeline?
To get the count I do
let [count] = await MyModel.aggregate([
{
$match: {
store: mongoose.Types.ObjectId(storeId),
},
},
{
$lookup: {
from: "resaleitems",
let: {
store: "$store",
},
as: "itemsForSale",
pipeline: [
{
$match: {
$expr: {
$eq: ["$store", "$$store"],
},
},
},
],
},
},
{
$match: {
"itemsForSale.0": { $exists: true },
},
},
{ $count: "totalCount" },
]).exec();
and then right after to get the skipped and limited results I do basically the same thing, but have added the $skip and $limit steps
let items = await MyModel.aggregate([
{
$match: {
store: mongoose.Types.ObjectId(storeId),
},
},
{ $skip: skip.toNumber() },
{ $limit: bnLimit.toNumber() },
{
$lookup: {
from: "resaleitems",
let: {
store: "$store",
},
as: "itemsForSale",
pipeline: [
{
$match: {
$expr: {
$eq: ["$store", "$$store"],
},
},
},
],
},
},
{
$match: {
"itemsForSale.0": { $exists: true },
},
},
]).exec();
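One common way to get both results in a single round trip is a $facet stage, which runs several sub-pipelines over the same input documents. The sketch below reuses the stages from the question. The helper name buildPagedPipeline and the facet keys totalCount and items are assumptions, and storeId, skip, and limit are expected to be converted by the caller (ObjectId and plain numbers).

```javascript
// Sketch only: one pipeline that returns the total count and one page.
function buildPagedPipeline(storeId, skip, limit) {
  return [
    { $match: { store: storeId } },
    {
      $lookup: {
        from: 'resaleitems',
        let: { store: '$store' },
        as: 'itemsForSale',
        pipeline: [
          { $match: { $expr: { $eq: ['$store', '$$store'] } } },
        ],
      },
    },
    // Keep only documents that have at least one item for sale.
    { $match: { 'itemsForSale.0': { $exists: true } } },
    {
      $facet: {
        // Count of all matching documents, before pagination.
        totalCount: [{ $count: 'totalCount' }],
        // The requested page of the same matching documents.
        items: [{ $skip: skip }, { $limit: limit }],
      },
    },
  ];
}

console.log(JSON.stringify(buildPagedPipeline('<storeId>', 20, 10), null, 2));
```

The aggregation then returns a single document with two arrays: totalCount holds one element when there are matches and is empty otherwise, and items holds the page. Two caveats apply. The whole $facet output must fit in one 16 MB BSON document, so large pages can hit that limit. Also, this version paginates after the itemsForSale filter, whereas the second query in the question applies $skip and $limit before the $lookup, so the two can return different pages.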