mongodb 2 level aggregate lookup - mongodb

I have those collection schemas
Schema.users = {
name : "string",
username : "string",
[...]
}
Schema.rooms = {
name : "string",
hidden: "boolean",
user: "string",
sqmt: "number",
service: "string"
}
Schema.room_price = {
morning : "string",
afternoon: "string",
day: "string",
room:'string'
}
I need to aggregate the users with the rooms and foreach room the specific room prices.
the expected result would be
[{
_id:"xXXXXX",
name:"xyz",
username:"xyz",
rooms:[
{
_id: 1111,
name:'room1',
sqmt: '123x',
service:'ppp',
room_prices: [{morning: 123, afternoon: 321}]
}
]}]
The first part of the aggregate could be
db.collection('users').aggregate([
{$match: cond},
{$lookup: {
from: 'rooms',
let: {"user_id", "$_id"},
pipeline: [{$match:{expr: {$eq: ["$user", "$$user_id"]}}}],
as: "rooms"
}}])
but I can't figure out how to get the room prices within the same aggregate

Presuming that room from the room_prices collection has the matching data from the name of the rooms collection, then that would the expression to match on for the "inner" pipeline of the $lookup expression with yet another $lookup:
db.collection('users').aggregate([
{ $match: cond },
{ $lookup: {
from: 'rooms',
let: { "user_id": "$_id" },
pipeline: [
{ $match:{ $expr: { $eq: ["$user", "$$user_id"] } } },
{ $lookup: {
from: 'room_prices',
let: { 'name': '$name' },
pipeline: [
{ $match: { $expr: { $eq: [ '$room', '$$name'] } } },
{ $project: { _id: 0, morning: 1, afternoon: 1 } }
],
as: 'room_prices'
}}
],
as: "rooms"
}}
])
That's also adding a $project in there to select only the fields you want from the prices. When using the expressive form of $lookup you actually do get to express a "pipeline", which can be any aggregation pipeline combination. This allows for complex manipulation and such "nested lookups".
Note that using mongoose you can also get the collection name from the model object using something like:
from: RoomPrice.collection.name
This is generally future proofing against possible model configuration changes which might possibly change the name of the underlying collection.
You can also do pretty much the same with the "legacy" form of $lookup prior to the sub-pipeline syntax available from MongoDB 3.6 and upwards. It's just a bit more processing and reconstruction:
db.collection('users').aggregate([
{ $match: cond },
// in legacy form
{ $lookup: {
from: 'rooms',
localField: 'user_id',
foreignField: 'user',
as: 'rooms'
}},
// unwind the output array
{ $unwind: '$rooms' },
// lookup for the second collection
{ $lookup: {
from: 'room_prices',
localField: 'name',
foreignField: 'room',
as: 'rooms.room_prices'
}},
// Select array fields with $map
{ $addFields: {
'rooms': {
'room_prices': {
$map: {
input: '$rooms.room_prices',
in: {
morning: '$this.morning',
afternoon: '$this.afternoon'
}
}
}
}
}},
// now group back to 'users' data
{ $group: {
_id: '$_id',
name: { $first: '$name' },
username: { $first: '$username' },
// same for any other fields, then $push 'rooms'
rooms: { $push: '$rooms' }
}}
])
That's a bit more overhead mostly from usage of $unwind and also noting that the "field selection" does actually mean you did return the "whole documents" from room_prices "first", and only after that was complete can you select the fields.
So there are advantages to the newer syntax, but it still could be done with earlier versions if you wanted to.

Related

How do I use a wildcard in my lookup foreignField?

I'm trying to make a lookup, where the foreignField is dynamic:
{
$merge: {
_id: ObjectId('61e56339b528bf009feca149')
}
},
{
$lookup: {
from: 'computer',
localField: '_id',
foreignField: 'configs.?.refId',
as: 'computers'
}
}
I know that the foreignField always starts with configs and ends with refId, but the string between the two is dynamic.
Here is an example of what a document looks like:
'_id': ObjectId('6319bd1540b41d1a35717a16'),
'name': 'MyComputer',
'configs': {
'ybe': {
'refId': ObjectId('61e56339b528bf009feca149')
'name': 'Ybe Config'
},
'test': {
'refId': ObjectId('61f3d7ec47805d1443f14540')
'name': 'TestConfig'
},
...
}
As you can see the configs property contains different objects with different names ('ybe', 'test', etc...). I want to lookup based on the refId inside of all of those objects.
How do I achieve that?
Using dynamic value as a field name is considered an anti-pattern and introduces unnecessary complexity to querying. However, you can achieve your behaviour with $objectToArray by converting the object into array of k-v pairs and perform the $match in a sub-pipeline.
db.coll.aggregate([
{
"$lookup": {
"from": "computer",
"let": {
id: "$_id"
},
"pipeline": [
{
$set: {
configs: {
"$objectToArray": "$configs"
}
}
},
{
"$unwind": "$configs"
},
{
$match: {
$expr: {
$eq: [
"$$id",
"$configs.v.refId"
]
}
}
}
],
"as": "computers"
}
}
])
MongoPlayground

Mongoose lookup across 3 collections using foreign key

I have found a few questions that relate to this (here and here) but I have been unable to interpret the answers in a way that I can understand how to do what I need.
I have 3 collections: Organisations, Users, and Projects. Every project belongs to one user, and every user belongs to one organisation. From the user's id, I need to return all the projects that belong to the organisation that the logged-in user belongs to.
Returning the projects from the collection that belong to the user is easy, with this query:
const projects = await Project.find({ user: req.user.id }).sort({ createdAt: -1 })
Each user has an organisation id as a foreign key, and I think I need to do something with $lookup and perhaps $unwind mongo commands, but unlike with SQL queries I really struggle to understand what's going on so I can construct queries correctly.
EDIT: Using this query
const orgProjects = User.aggregate(
[
{
$match: { _id: req.user.id }
},
{
$project: { _id: 0, org_id: 1 }
},
{
$lookup: {
from: "users",
localField: "organisation",
foreignField: Organisation._id,
as: "users_of_org"
}
},
{
$lookup: {
from: "projects",
localField: "users_of_org._id",
foreignField: "user",
as: "projects"
}
},
{
$unset: ["organisation", "users_of_org"]
},
{
$unwind: "$projects"
},
{
$replaceWith: "$projects"
}
])
Seems to almost work, returning the following:
Aggregate {
_pipeline: [
{ '$match': [Object] },
{ '$project': [Object] },
{ '$lookup': [Object] },
{ '$lookup': [Object] },
{ '$unset': [Array] },
{ '$unwind': '$projects' },
{ '$replaceWith': '$projects' }
],
_model: Model { User },
options: {}
}
assuming your documents have a schema like this, you could do an aggregation pipeline like below with 2 $lookup stages.
db.users.aggregate(
[
{
$match: { _id: "user1" }
},
{
$project: { _id: 0, org_id: 1 }
},
{
$lookup: {
from: "users",
localField: "org_id",
foreignField: "org_id",
as: "users_of_org"
}
},
{
$lookup: {
from: "projects",
localField: "users_of_org._id",
foreignField: "user_id",
as: "projects"
}
},
{
$unset: ["org_id", "users_of_org"]
},
{
$unwind: "$projects"
},
{
$replaceWith: "$projects"
}
])

MongoDB: slow performance pipeline lookup compared to basic lookup

I have two collection:
matches:
[{
date: "2020-02-15T17:00:00Z",
players: [
{_id: "5efd9485aba4e3d01942a2ce"},
{_id: "5efd9485aba4e3d01942a2cf"}
]
},
{...}]
and players:
[{
_id: "5efd9485aba4e3d01942a2ce",
name: "Rafa Nadal"
},
{
_id: "5efd9485aba4e3d01942a2ce",
name: "Roger Federer"
},
{...}]
I need to use lookup pipeline because I'm building a graphql resolver with recursive functions and I need nested lookup. I've followed this example https://docs.mongodb.com/datalake/reference/pipeline/lookup-stage#nested-example
My problem is that with pipeline lookup I need 11 seconds but with basic lookup only 0.67 seconds. And my test database is very short! about 1300 players and 700 matches.
This is my pipeline lookup (11 seconds to resolve)
db.collection('matches').aggregate([{
$lookup: {
from: 'players',
let: { ids: '$players' },
pipeline: [{ $match: { $expr: { $in: ['$_id', '$$ids' ] } } }],
as: 'players'
}
}]);
And this my basic lookup (0.67 seconds to resolve)
db.collection('matches').aggregate([{
$lookup: {
from: "players",
localField: "players",
foreignField: "_id",
as: "players"
}
}]);
Why so much difference? In what way can I do faster pipeline lookup?
The thing is that when you do a lookup using pipeline with a match stage, then the index would be used only for the fields that are matched with $eq operator and for the rest index will not be used.
And the example you specified with pipeline will work like this ( again index will not be used here as it is not $eq )
db.matches.aggregate([
{
$lookup: {
from: "players",
let: {
ids: {
$map: {
input: "$players",
in: "$$this._id"
}
}
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$_id",
"$$ids"
]
}
}
}
],
as: "players"
}
}
])
As players is an array of object so it need to be mapped to array of ids first
MongoDB Playground
As #namar sood comments there are several tickets that refer to this issue:
https://jira.mongodb.org/browse/SERVER-37470
https://jira.mongodb.org/browse/SERVER-32549
Meanwhile a solution could be (also works nested):
db.collection('matches').aggregate([
{ $unwind: '$players' },
{
$lookup: {
from: 'players',
let: { id: '$players' },
pipeline: [{ $match: { $expr: { $eq: ['$_id', '$$id' ] } } }],
as: 'players'
},
{ $unwind: '$players' },
{
$group: {
"_id": "$_id",
"data": { "$first": "$$ROOT" },
"players": {$push: "$players"}
}
},
{ $addFields: {"data.players": "$players"} },
{ $replaceRoot: { newRoot: "$data" }}
]);

MongoDB Lookup in array

Good Morning.
I'm trying to query mongoDB by joining two collections. But I'm not able to do that when the primary identifier is inside an array in the secondary collection.
See my code and my collection
Pipe Colletion
{ _id: 1, deal_id: 25698}
{ _id: 2, deal_id: 45879}
{ _id: 3, deal_id: 54142}
Leads Colletion
{ _id: 1, name:"Teste A", deals_id:[25698,45879]}
{ _id: 2, name:"Teste B", deals_id:[54142]}
Desired result when searching for deal_id from the Pipe collection:
Results
{ _id: 1, deal_id: 25698, name:"Teste A"}
{ _id: 2, deal_id: 45879, name:"Teste A"}
{ _id: 3, deal_id: 54142, name:"Teste B"}
My Code:
db.pipedrive.aggregate([
{
$lookup: {
from: "leads",
'let': {
deal_id: '$deal_id'
},
pipeline: [
{
$match: {
$expr: {
$and: [
{$eq: ["$$deal_id", "$deals_id"]}
]
}
}
}
],
as: "Results"
}
}
]);
As I am currently doing, the analytics line is always returning me empty.
Can you help me?
With your original approach, exchanging $eq with $in should work, and since you only have one condition $and is not necessary. Note that $deals_id should exists in foreign document and must be an array, as $in requires the second parameter to be an array. So in case it doesn't exist we will have to wrap it with $ifNull
db.pipedrive.aggregate([{
$lookup: {
from: "leads",
let: {
deal_id: '$deal_id'
},
pipeline: [
{
$match: {
$expr: { $eq: ["$$deal_id", { $ifNull: ["$deals_id", []] ] }
}
}
],
as: "Results"
}
}
// optional other stages to output as desired shape
]);
A Better alternative would be just to use regular $lookup, since mongodb is able to compare the with the elements of the foreign field array. Just like with find query. This will also account for missing fields or non array fields.
db.pipedrive.aggregate([{
$lookup: {
from: "leads",
localField: "deal_id",
foreignField: "deals_id",
as: "Results"
}
}
// optional other stages to output as desired shape
{
$unwind: "$Results"
},
{
$project: {
deal_id: true,
name: "$Results.name"
}
}
]);
Mongo playground
You can use this query to get desired results. $lookup will take care of array as well.
db.pipe.aggregate([
{
$lookup: {
from: "leads",
localField: "deal_id",
foreignField: "deals_id",
as: "results"
}
},
{
$unwind: "$results"
},
{
$project: {
_id: 1,
deal_id: 1,
name: "$results.name"
}
}
])
Here is the link to check results,
https://mongoplayground.net/p/mhF6zHD9d26

$lookup when foreignField is in nested array

I have two collections :
Student
{
_id: ObjectId("657..."),
name:'abc'
},
{
_id: ObjectId("593..."),
name:'xyz'
}
Library
{
_id: ObjectId("987..."),
book_name:'book1',
issued_to: [
{
student: ObjectId("657...")
},
{
student: ObjectId("658...")
}
]
},
{
_id: ObjectId("898..."),
book_name:'book2',
issued_to: [
{
student: ObjectId("593...")
},
{
student: ObjectId("594...")
}
]
}
I want to make a Join to Student collection that exists in issued_to array of object field in Library collection.
I would like to make a query to student collection to get the student data as well as in library collection, that will check in issued_to array if the student exists or not if exists then get the library document otherwise not.
I have tried $lookup of mongo 3.6 but I didn`t succeed.
db.student.aggregate([{$match:{_id: ObjectId("593...")}}, $lookup: {from: 'library', let: {stu_id:'$_id'}, pipeline:[$match:{$expr: {$and:[{"$hotlist.clientEngagement": "$$stu_id"]}}]}])
But it thorws error please help me in regard of this. I also looked at other questions asked at stackoverflow like. question on stackoverflow,
question2 on stackoverflow but these are comapring simple fields not array of objects. please help me
I am not sure I understand your question entirely but this should help you:
db.student.aggregate([{
$match: { _id: ObjectId("657...") }
}, {
$lookup: {
from: 'library',
localField: '_id' ,
foreignField: 'issued_to.student',
as: 'result'
}
}])
If you want to only get the all book_names for each student you can do this:
db.student.aggregate([{
$match: { _id: ObjectId("657657657657657657657657") }
}, {
$lookup: {
from: 'library',
let: { 'stu_id': '$_id' },
pipeline: [{
$unwind: '$issued_to' // $expr cannot digest arrays so we need to unwind which hurts performance...
}, {
$match: { $expr: { $eq: [ '$issued_to.student', '$$stu_id' ] } }
}, {
$project: { _id: 0, "book_name": 1 } // only include the book_name field
}],
as: 'result'
}
}])
This might not be a very good answer, but if you can change your schema of Library to:
{
_id: ObjectId("987..."),
book_name:'book1'
issued_to: [
ObjectId("657..."),
ObjectId("658...")
]
},
{
_id: "ObjectId("898...")",
book_name:'book2'
issued_to: [
ObjectId("593...")
ObjectId("594...")
]
}
Then when you do:
{
$lookup: {
from: 'student',
localField: 'issued_to',
foreignField: '_id',
as: 'issued_to_students', // this creates a new field without overwriting your original 'issued_to'
}
},
You should get, based on your example above:
{
_id: ObjectId("987..."),
book_name:'book1'
issued_to_students: [
{ _id: ObjectId("657..."), name: 'abc', ... },
{ _id: ObjectId("658..."), name: <name of this _id>, ... }
]
},
{
_id: "ObjectId("898...")",
book_name:'book2'
issued_to: [
{ _id: ObjectId("593..."), name: 'xyz', ... },
{ _id: ObjectId("594..."), name: <name of this _id>, ... }
]
}
You need to $unwind the issued_to from library collection to match the issued_to.student with _id
db.student.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id) } },
{ "$lookup": {
"from": Library.collection.name,
"let": { "studentId": "$_id" },
"pipeline": [
{ "$unwind": "$issued_to" },
{ "$match": { "$expr": { "$eq": [ "$issued_to.student", "$$studentId" ] } } }
],
"as": "issued_to"
}}
])