Using object path in $lookup mongo aggregation pipeline - mongodb

For today's task I am trying to aggregate documents in a collection (let's call it collection1), and in one of the pipeline's stages I am trying to use $lookup to retrieve documents from another collection (let's call it collection2).
collection1 object model:
{
"field1": "value1",
"field2": "value2",
"field3": "value3"
}
collection2 object model:
{
"field1": "value1",
"field2": "value2",
"field3": {
"field31": "value31",
"field32": "value32"
}
}
What I am trying to do is retrieve the documents from collection2 where field3.field31 equals the value of collection1's field1.
My $lookup stage looks roughly like this, but currently it doesn't seem to work. I did not find any clue whether this should work, but I am looking forward to your replies.
{
$lookup: {
from: "collection2",
let: {
"c": "$field1",
"l": "$field2",
"t": "$field3",
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ["$field1", "$$c"] },
{ $eq: ["$field2", "$$l"] },
{ $eq: ["$field3.field31", "$$t"] },
]
}
},
},
],
as: "awesomejoin"
}
}
I want to avoid having a project or a group and then unwinding and filtering again. My wish is to get the records directly from the match stage thinking this is better in terms of performance...
Let me know your thoughts on this.
Thank you

Please try this :
db.Collection1.aggregate([
{
$lookup:
{
from: "Collection2",
localField: "field1",
foreignField: "field3.field31",
as: "docs"
}
}
])
This should be simple with a plain $lookup; I'm not sure why you're creating local variables and testing equality on them. Also, $unwind is meant for arrays; for nested objects you can access inner fields with dot notation, the same as in most programming languages.
Ref : $lookup
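To make the dot-notation point concrete, here is a minimal plain-JavaScript sketch of what an equality $lookup on a nested path does conceptually. It is not MongoDB's implementation: getPath and simulateLookup are hypothetical helpers, the sample documents are invented to match the models in the question, and array-valued fields are not handled.

```javascript
// Hypothetical helper: resolve a dotted path such as "field3.field31" on a plain object.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Hypothetical simulation of an equality $lookup (scalar fields only).
function simulateLookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((local) => ({
    ...local,
    [as]: foreignDocs.filter(
      (foreign) => getPath(foreign, foreignField) === getPath(local, localField)
    ),
  }));
}

// Invented sample data shaped like the models in the question.
const collection1 = [{ field1: "value31", field2: "value2", field3: "value3" }];
const collection2 = [
  { field1: "value1", field2: "value2", field3: { field31: "value31", field32: "value32" } },
  { field1: "other", field2: "other", field3: { field31: "nope", field32: "x" } },
];

const joined = simulateLookup(collection1, collection2, {
  localField: "field1",
  foreignField: "field3.field31",
  as: "docs",
});

console.log(JSON.stringify(joined, null, 2));
```

Only the first collection2 document ends up in docs, because its field3.field31 equals the field1 of the collection1 document.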

Related

Using conditions for both collections (original and foreign) in lookup $match

I'm not sure if it is a real problem or just a lack of documentation.
You can put conditions for documents in the foreign collection in a lookup $match.
You can also put conditions for the documents of the original collection in a lookup $match with $expr.
But when I want to use both of those features, it doesn't work. This is a sample lookup in an aggregation:
{ $lookup:
{
from: 'books',
localField: 'itemId',
foreignField: '_id',
let: { "itemType": "$itemType" },
pipeline: [
{ $match: { $expr: { $eq: ["$$itemType", "book"] } }}
],
as: 'bookData'
}
}
$expr is putting condition for original documents. But what if I want to get only foreign documents with status: 'OK' ? Something like:
{ $match: { status: "OK", $expr: { $eq: ["$$itemType", "book"] } }}
Does not work.
I tried to play with the situation you provided.
Try putting $expr as the first key of the $match object; that should do the trick.
{ $lookup:
{
from: 'books',
localField: 'itemId',
foreignField: '_id',
let: { "itemType": "$itemType" },
pipeline: [
{ $match: { $expr: { $eq: ["$$itemType", "book"] }, status: 'OK' }}
],
as: 'bookData'
}
}
The currently accepted answer is "wrong" in the sense that it doesn't actually change anything. The ordering that the fields for the $match predicate are expressed in does not make a difference. I would demonstrate this with your specific situation, but there is an extra complication there which we will get to in a moment. In the meantime, consider the following document:
{
_id: 1,
status: "OK",
key: 123
}
This query:
db.collection.find({
status: "OK",
$expr: {
$eq: [
"$key",
123
]
}
})
And this query, which just has the order of the predicates reversed:
db.collection.find({
$expr: {
$eq: [
"$key",
123
]
},
status: "OK"
})
Will both find and return that document. A playground demonstration of the first can be found here and the second one is here.
Similarly, your original $match:
{ $match: { status: "OK", $expr: { $eq: ["$$itemType", "book"] } }}
Will behave the same as the one in the accepted answer:
{ $match: { $expr: { $eq: ["$$itemType", "book"] }, status: 'OK' }}
Said another way, there is no difference in behavior based on whether or not the $expr is used first. However, I suspect the overall aggregation is not expressing your desired logic. Let's explore that a little further. First, we need to address this:
$expr is putting condition for original documents.
This is not really true. According to the documentation for $expr, that operator "allows the use of aggregation expressions within the query language."
A primary use of this functionality, and indeed the first one listed in the documentation, is to compare two fields from a single document. In the context of $lookup, this ability to refer to fields from the original documents allows you to compare their values against the collection that you are joining with. The documentation has some examples of that, such as here and other places on that page which refer to $expr.
With that in mind, let's come back to your aggregation. If I am understanding correctly, your intent with the { $expr: { $eq: ["$$itemType", "book"] } } predicate is to filter documents from the original collection. Is that right?
If so, then that is not what your aggregation is currently doing. You can see in this playground example that the $match nested inside of the $lookup pipeline does not affect the documents from the original collection. Instead, you should do that filtering via an initial $match on the base pipeline. So something like this:
db.orders.aggregate([
{
$match: {
$expr: {
$eq: [
"$itemType",
"book"
]
}
}
}
])
Or, more simply, this:
db.orders.aggregate([
{
$match: {
"itemType": "book"
}
}
])
Based on all of this, your final pipeline should probably look similar to the following:
db.orders.aggregate([
{
$match: {
"itemType": "book"
}
},
{
$lookup: {
from: "books",
localField: "itemId",
foreignField: "_id",
let: {
"itemType": "$itemType"
},
pipeline: [
{
$match: {
status: "OK"
}
}
],
as: "bookData"
}
}
])
Playground example here. This pipeline:
1. Filters the data in the original collection (orders) by their itemType. From the sample data, it removes the document with _id: 3 as it has a different itemType than the one we are looking for ("book").
2. Uses the localField/foreignField syntax to find data in books where the _id of the books document matches the itemId of the source document(s) in the orders collection.
3. Further uses the let/pipeline syntax to express the additional condition that the status of the books document is "OK". This is why the books document with the status of "BAD" does not get pulled into the bookData for the orders document with _id: 2.
Documentation for the (combined) second and third parts is here.
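The distinction between filtering the source collection and filtering the joined documents can be sketched in plain JavaScript. The sample data below is invented (the playground data is not reproduced in this text), and joinBooks is a hypothetical stand-in for the $lookup stage, not driver or server code.

```javascript
// Invented sample data; only the shape follows the question.
const orders = [
  { _id: 1, itemType: "book", itemId: 10 },
  { _id: 2, itemType: "book", itemId: 11 },
  { _id: 3, itemType: "pen", itemId: 10 },
];
const books = [
  { _id: 10, title: "A", status: "OK" },
  { _id: 11, title: "B", status: "BAD" },
];

// Hypothetical stand-in for $lookup: join on itemId === _id, then apply
// an extra predicate to each candidate foreign document.
function joinBooks(sourceDocs, foreignPredicate) {
  return sourceDocs.map((order) => ({
    ...order,
    bookData: books.filter(
      (book) => book._id === order.itemId && foreignPredicate(book, order)
    ),
  }));
}

// Source condition placed inside the lookup: no order is dropped,
// the non-book order simply gets an empty bookData array.
const nestedOnly = joinBooks(
  orders,
  (book, order) => order.itemType === "book" && book.status === "OK"
);

// Source condition applied first (the initial $match), foreign condition inside.
const matchedFirst = joinBooks(
  orders.filter((order) => order.itemType === "book"),
  (book) => book.status === "OK"
);

console.log(nestedOnly.map((o) => [o._id, o.bookData.length]));
console.log(matchedFirst.map((o) => [o._id, o.bookData.length]));
```

In this sketch nestedOnly still contains all three orders, while matchedFirst contains only the two book orders, and the order pointing at the "BAD" book has an empty bookData in both.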

MongoDB lookup with object relation instead of array

I have a collection matches like this. I'm using a players object {key: ObjectId, key: ObjectId} instead of the classic array [ObjectId, ObjectId] to reference the players collection:
{
"_id": ObjectId("5eb93f8efd259cd7fbf49d55"),
"date": "01/01/2020",
"players": {
"home": ObjectId("5eb93f8efd259cd7fbf49d59"),
"away": ObjectId("5eb93f8efd259cd7fbf49d60")
}
},
{...}
And players collection:
{
"_id": ObjectId("5eb93f8efd259cd7fbf49d59"),
"name": "Roger Federer",
"country": "Suiza"
},
{
"_id": ObjectId("5eb93f8efd259cd7fbf49d60"),
"name": "Rafa Nadal",
"country": "España"
},
{...}
What's the best way to do the MongoDB lookup? Is something like this correct?
const rows = await db.collection('matches').aggregate([
{
$lookup: {
from: "players",
localField: "players.home",
foreignField: "_id",
as: "players.home"
}
},
{
$lookup: {
from: "players",
localField: "players.away",
foreignField: "_id",
as: "players.away"
}
},
{ $unwind: "$players.home" },
{ $unwind: "$players.away" }
]).toArray()
I want output like this:
{
_id: 5eb93f8efd259cd7fbf49d55,
date: "12/05/20",
players: {
home: {
_id: 5eb93f8efd259cd7fbf49d59,
name: "Roger Federer",
country: "Suiza"
},
away: {
_id: 5eb93f8efd259cd7fbf49d60,
name: "Rafa Nadal",
country: "España"
}
}
}
{...}
You can try below aggregation query :
db.matches.aggregate([
{
$lookup: {
from: "players",
localField: "players.home",
foreignField: "_id",
as: "home"
}
},
{
$lookup: {
from: "players",
localField: "players.away",
foreignField: "_id",
as: "away"
}
},
/** Check output of lookup is not empty array `[]` & get first doc & write it to respective field, else write the same value as original */
{
$project: {
date: 1,
"players.home": { $cond: [ { $eq: [ "$home", [] ] }, "$players.home", { $arrayElemAt: [ "$home", 0 ] } ] },
"players.away": { $cond: [ { $eq: [ "$away", [] ] }, "$players.away", { $arrayElemAt: [ "$away", 0 ] } ] }
}
}
])
Test : mongoplayground
Changes or Issues with current Query :
1) You're using two $unwind stages one after the other, so if either home or away has no matching document in the players collection, the match document itself disappears from the result. Why? $lookup returns [] when nothing matches, and $unwind on an empty array removes the parent document from the output. To overcome this, use the preserveNullAndEmptyArrays option in the $unwind stage.
2) There is another way to do this without using $unwind at all. Do not use as: "players.home" or as: "players.away", because that writes back to the original field: if no matching document is found, an empty array [] overwrites "home" or "away", and you lose the ObjectId() value that was stored there in the matches doc. Write the output of the lookup to a new field instead.
A possibly more efficient alternative to two $lookup stages (each lookup has to go through the docs of the players collection again) is a single lookup with multiple join conditions:
db.matches.aggregate([
{
$lookup: {
from: "players",
let: { home: "$players.home", away: "$players.away" },
pipeline: [
{
$match: { $expr: { $or: [ { $eq: [ "$_id", "$$home" ] }, { $eq: [ "$_id", "$$away" ] } ] } }
}
],
as: "data"
}
}
])
Test : mongoplayground
Note: here all docs from players that match either the away or the home field are pushed to the data array. To keep the DB operation simple, you can fetch that array along with the matches document and offload some work to application code, which maps the objects in the data array to the players.home and players.away fields.
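The application-side mapping mentioned in the note could look roughly like the following. attachPlayers is a hypothetical helper, the sample row is invented, and plain strings stand in for ObjectId values; the sketch assumes that String(id) gives a comparable representation for real ObjectIds.

```javascript
// Hypothetical helper: move documents from the looked-up `data` array into
// players.home / players.away, keeping the original id when no player matched.
function attachPlayers(match) {
  const byId = new Map(match.data.map((player) => [String(player._id), player]));
  const resolve = (id) => (byId.has(String(id)) ? byId.get(String(id)) : id);
  const { data, ...rest } = match; // drop the temporary data array
  return {
    ...rest,
    players: {
      home: resolve(match.players.home),
      away: resolve(match.players.away),
    },
  };
}

// Invented sample row; note that `data` is not in home/away order.
const row = {
  _id: "5eb93f8efd259cd7fbf49d55",
  date: "01/01/2020",
  players: { home: "5eb93f8efd259cd7fbf49d59", away: "5eb93f8efd259cd7fbf49d60" },
  data: [
    { _id: "5eb93f8efd259cd7fbf49d60", name: "Rafa Nadal", country: "España" },
    { _id: "5eb93f8efd259cd7fbf49d59", name: "Roger Federer", country: "Suiza" },
  ],
};

const resolved = attachPlayers(row);
console.log(resolved.players.home.name, "vs", resolved.players.away.name);
```

Looking players up by id rather than by position matters because the order of the data array is not tied to the home/away fields.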

Link each element of array in a document to the corresponding element in an array of another document with MongoDB

Using MongoDB 4.2 and MongoDB Atlas to test aggregation pipelines.
I've got this products collection, containing documents with this schema:
{
"name": "TestProduct",
"relatedList": [
{id:ObjectId("someId")},
{id:ObjectId("anotherId")}
]
}
Then there's this cities collection, containing documents with this schema :
{
"name": "TestCity",
"instructionList": [
{ related_id: ObjectId("anotherId"), foo: bar},
{ related_id: ObjectId("someId"), foo: bar},
{ related_id: ObjectId("notUsefulId"), foo: bar}
...
]
}
My objective is to join both collections to output something like this (the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document) :
{
"name": "TestProduct",
"relatedList": [
{ related_id: ObjectId("someId"), foo: bar},
{ related_id: ObjectId("anotherId"), foo: bar},
]
}
I tried using the $lookup operator for aggregation like this :
$lookup:{
from: 'cities',
let: {rId:'$relatedList._id'},
pipeline: [
{
$match: {
$expr: {
$eq: ["$instructionList.related_id", "$$rId"]
}
}
},
]
}
But it's not working, I'm a bit lost with this complex pipeline syntax.
Edit
By using unwind on both arrays :
[
{$unwind: "$relatedList"},
{$lookup:{
from: "cities",
let: { "rId": "$relatedList.id" },
pipeline: [
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id","$$rId"]}}},
],
as:"instructionList",
}},
{$group: {
_id: "$_id",
instructionList: {$addToSet:"$instructionList"}
}}
]
I am able to achieve what I want, however,
I'm not getting a clean result at all :
{
"name": "TestProduct",
instructionList: [
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("someId")
}
}
],
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("anotherId")
}
}
]
]
}
How can I group everything to be as clean as stated for my original question ?
Again, I'm completely lost with the Aggregation framework.
the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document
Given an example document on cities collection:
{"_id": ObjectId("5e4a22a08c54c8e2380b853b"),
"name": "TestCity",
"instructionList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"},
{"related_id": "c", "foo": "z"}
]}
and an example document on products collection:
{"_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"id": "a"},
{"id": "b"}
]}
You can try using the following aggregation pipeline:
db.products.aggregate([
{"$lookup":{
"from": "cities",
"let": { "rId": "$relatedList.id" },
"pipeline": [
{"$unwind":"$instructionList"},
{"$match":{
"$expr":{
"$in":["$instructionList.related_id", "$$rId"]
}
}
}],
"as":"relatedList",
}},
{"$project":{
"name":"$name",
"relatedList":{
"$map":{
"input":"$relatedList",
"as":"x",
"in":{
"related_id":"$$x.instructionList.related_id",
"foo":"$$x.instructionList.foo"
}
}
}
}}
]);
To get a result as the following:
{ "_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"}
]}
The above is tested in MongoDB v4.2.x.
But it's not working, I'm a bit lost with this complex pipeline syntax.
The reason it's slightly complex here is that you have an array relatedList and also an array of subdocuments instructionList. When you refer to instructionList.related_id (which could mean multiple values) with the $eq operator, the pipeline doesn't know which one to match.
In the pipeline above, I've added a $unwind stage to turn instructionList into multiple single documents. Afterward, $in expresses a match of a single value of instructionList.related_id against the array of ids taken from relatedList.
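The same idea in plain JavaScript, using the sample documents from this answer: the path relatedList.id yields an array of ids, and each unwound instruction is tested for membership in it. This is only an illustration of the matching logic, not of how the server evaluates the pipeline.

```javascript
// Sample documents from the answer above (without _id).
const product = { name: "TestProduct", relatedList: [{ id: "a" }, { id: "b" }] };
const city = {
  name: "TestCity",
  instructionList: [
    { related_id: "a", foo: "x" },
    { related_id: "b", foo: "y" },
    { related_id: "c", foo: "z" },
  ],
};

// "$relatedList.id" on an array of subdocuments yields an array of ids.
const rId = product.relatedList.map((item) => item.id);

// After unwinding, each instruction is tested on its own for membership,
// which is the role $in plays in the pipeline.
const relatedList = city.instructionList.filter((instruction) =>
  rId.includes(instruction.related_id)
);

console.log(relatedList);
```

The instruction with related_id "c" is filtered out because "c" is not among the product's ids.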
I believe you just need to $unwind the arrays in order to lookup the relation, then $group to recollect them. Perhaps something like:
db.products.aggregate([
{$unwind:"$relatedList"},
{$lookup:{
from:"cities",
let:{rId:"$relatedList.id"},
pipeline:[
// "$instructionList.related_id" is an array at this point, so test membership with $in
{$match:{$expr:{$in:["$$rId", "$instructionList.related_id"]}}},
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id", "$$rId"]}}},
{$project:{_id:0, instruction:"$instructionList"}}
],
as: "lookedup"
}},
// "$lookedup.instruction.foo" is an array; take its first element
{$addFields: {"relatedList.foo":{$arrayElemAt:["$lookedup.instruction.foo", 0]}}},
{$group: {
_id:"$_id",
root: {$first:"$$ROOT"},
relatedList:{$push:"$relatedList"}
}},
{$addFields:{"root.relatedList":"$relatedList"}},
{$replaceRoot:{newRoot:"$root"}}
])
A little about each stage:
1. $unwind duplicates the entire document for each element of the array, replacing the array with the single element
2. $lookup can then consider each element separately. The stages in $lookup.pipeline:
a. $match so we only unwind the document with matching ID
b. $unwind the array so we can consider individual elements
c. repeat the $match so we are only left with matching elements (hopefully just 1)
3. $addFields assigns the foo field retrieved from the lookup to the object from relatedList
4. $group collects together all of the documents with the same _id (i.e. that were unwound from a single original document), stores the first as 'root', and pushes all of the relatedList elements back into an array
5. $addFields moves the relatedList into root
6. $replaceRoot returns the root, which should now be the original document with the matching foo added to each relatedList element
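The unwind, enrich, regroup round trip described above can be sketched in plain JavaScript. unwind and regroup are hypothetical stand-ins for the corresponding stages, and the sample data is invented; the sketch only mirrors the shape of the transformation.

```javascript
// Hypothetical stand-in for $unwind on a single array field.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((element) => ({ ...doc, [field]: element }))
  );
}

// Hypothetical stand-in for the $group / $addFields / $replaceRoot tail:
// collect the unwound elements back into one array per original _id.
function regroup(docs, field) {
  const byId = new Map();
  for (const doc of docs) {
    if (!byId.has(doc._id)) byId.set(doc._id, { ...doc, [field]: [] });
    byId.get(doc._id)[field].push(doc[field]);
  }
  return [...byId.values()];
}

// Invented sample data.
const products = [{ _id: 1, name: "TestProduct", relatedList: [{ id: "a" }, { id: "b" }] }];
const instructions = [
  { related_id: "a", foo: "x" },
  { related_id: "b", foo: "y" },
  { related_id: "c", foo: "z" },
];

// Enrich each unwound element with the matching foo (the $lookup + $addFields steps).
const enriched = unwind(products, "relatedList").map((doc) => {
  const hit = instructions.find((i) => i.related_id === doc.relatedList.id);
  return { ...doc, relatedList: { ...doc.relatedList, foo: hit ? hit.foo : undefined } };
});

const regrouped = regroup(enriched, "relatedList");
console.log(JSON.stringify(regrouped));
```

After regrouping there is again a single product document, with foo attached to each relatedList element.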

mongoDB find documents based on a condition with a value from linked document of another collection

I have collections of the following structure:
objects:
[{"type": "someTypeOne", "menuId": 1},
{"type": "someTypeTwo", "menuId": 1},
{"type": "someTypeOne", "menuId": 2}]
menus:
[{"id":1, "type": "someTypeOne"},
{"id":2, "type": "someTypeOne"}]
I need to find all objects whose "type" property doesn't match the "type" of its menu. In this case the desired output would be:
[{"type": "someTypeTwo", "menuId": 1}]
I think that I should use aggregation for this one and I'm fiddling with it at the moment but I was not able to formulate a working query so far.
Thanks
You can try below aggregation:
db.objects.aggregate([
{
$lookup: {
from: "menus",
localField: "menuId",
foreignField: "id",
as: "menu"
}
},
{
$unwind: "$menu"
},
{
$match: {
$expr: {
$ne: [ "$menu.type", "$type" ]
}
}
},
{
$project: {
menu: 0
}
}
])
$lookup allows you to get data from both collections; then you can run $unwind on the menu array to get a single menu per document and apply your inequality condition using $match and $expr.
Mongo Playground
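For readers who want to check the logic without a database, the same steps can be mirrored in plain JavaScript with the data from the question. This is only a sketch of the pipeline's semantics, not what the server executes.

```javascript
// Data from the question.
const objects = [
  { type: "someTypeOne", menuId: 1 },
  { type: "someTypeTwo", menuId: 1 },
  { type: "someTypeOne", menuId: 2 },
];
const menus = [
  { id: 1, type: "someTypeOne" },
  { id: 2, type: "someTypeOne" },
];

const mismatched = objects
  // $lookup + $unwind: one row per (object, matching menu) pair.
  .flatMap((object) =>
    menus
      .filter((menu) => menu.id === object.menuId)
      .map((menu) => ({ ...object, menu }))
  )
  // $match with $expr / $ne: keep rows whose types differ.
  .filter((row) => row.menu.type !== row.type)
  // $project { menu: 0 }: drop the helper field again.
  .map(({ menu, ...object }) => object);

console.log(mismatched);
```

Only the object with type "someTypeTwo" and menuId 1 remains, which is the desired output from the question.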

Extract $graphLookup matches into documents

For context, I'm using MongoDB 3.6.4 and I'm trying to build a hierarchical schema for ACL permissions, but I'll boil the problem down and save the details.
Say I have a simple collection C, where parents is a list of references to other documents in C:
{
_id: ObjectId
parents: Array(ObjectId)
}
If I do an aggregation like:
[
{
$match: {_id: ObjectId("f00...")}
},
{
$graphLookup: {
from: "C",
startWith: "$parents",
connectFromField: "parents",
connectToField: "_id",
as: "graph"
}
}
]
I get back data like:
{
"_id": ObjectId("f00..."),
"parents": [ObjectId("f01..."), ObjectId("f02..."), ...],
"graph": [<doc1>, <doc2>, <doc3>, ...]
}
Is there a way to split the graph items out into documents? e.g. from the previous output example:
{
"_id": ObjectId("f00..."),
"parents": [ObjectId("f01..."), ObjectId("f02..."), ...]
}
<doc1>
<doc2>
<doc3>
You can try adding below stages to query.
[
{"$project":{"data":{"$concatArrays":[["$$ROOT"],"$graph"]}}},
{"$unwind":"$data"},
{"$project":{"data.graph":0}},
{"$replaceRoot":{"newRoot":"$data"}}
]
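What these stages do to a single result can be illustrated in plain JavaScript. The document below is invented, with short strings standing in for ObjectId values; the one-liner mirrors the combined effect of $concatArrays, $unwind, the data.graph exclusion and $replaceRoot.

```javascript
// Invented aggregation result in the shape shown in the question.
const result = {
  _id: "f00",
  parents: ["f01", "f02"],
  graph: [
    { _id: "f01", parents: ["f03"] },
    { _id: "f02", parents: [] },
    { _id: "f03", parents: [] },
  ],
};

// [ROOT, ...graph] corresponds to $concatArrays + $unwind; dropping `graph`
// corresponds to the data.graph exclusion; each element then becomes its own
// top-level document, as with $replaceRoot.
const flattened = [result, ...result.graph].map(({ graph, ...doc }) => doc);

console.log(flattened);
```

The starting document comes first, without its graph field, followed by one document per graph entry.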