MongoDB Aggregation Lookup with Pipeline Doesn't Work - mongodb

I have two collections. I am trying to add the documents of Collection 2 to Collection 1 if number1 and number2 in Collection 2 are within a certain range, as specified in Collection 1. FYI, the ObjectId in Collection 1 and the ObjectId in Collection 2 refer to two different items/products, hence I cannot join the two collections on this id.
Example Document from Collection 1:
{'_id': ObjectId('4321'),
'number1_lb': 61.205672407820025,
'number1_ub': 61.24170844385606,
'number2_lb': -149.75074963516136,
'number2_ub': -149.71471359912533}
Example Document from Collection 2:
{'_id': ObjectId('1234'),
'number1': 1.282298,
'number2': 103.8475}
I want the output:
{'_id': ObjectId('4321'),
'number1_lb': 61.205672407820025,
'number1_ub': 61.24170844385606,
'number2_lb': -149.75074963516136,
'number2_ub': -149.71471359912533,
'recs': [ObjectId('3456'), ObjectId('4567'),...]}
I thought that a lookup stage with pipeline would work. My code is currently as follows:
{"$lookup":{
"from": "Collection 2",
"let":{
"number1_lb":"$number1_lb",
"number1_ub":"$number1_ub",
"number2_lb":"$number2_lb",
"number2_ub":"$number2_ub"
},
"pipeline": [
{"$match":
{"$expr":
{"$and":[
{"$gte":["$number1","$$number1_lb"]},
{"$gte":["$number2","$$number2_lb"]},
{"$lte":["$number1","$$number1_ub"]},
{"$lte":["$number2","$$number2_ub"]}
]}}}
],
"as": "recs"
}}
But running the above gives me no output. Am I doing something wrong??

I ran it and it seems to work fine, but I had to tweak your input data in coll1 as it didn't meet the $match criteria.
from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()["testdatabase"]
# Data Setup
db.coll1.replace_one({"_id": "4321"}, {"_id": "4321", "number1_lb": -61.205672407820025, "number1_ub": 61.24170844385606, "number2_lb": -149.75074963516136, "number2_ub": 149.71471359912533}, upsert=True)
db.coll2.replace_one({"_id": "1234"}, {"_id": "1234", "number1": 1.282298, "number2": 103.8475}, upsert=True)
# Run the aggregation
results = db.coll1.aggregate([
{"$lookup": {
"from": "coll2",
"let": {
"number1_lb": "$number1_lb",
"number1_ub": "$number1_ub",
"number2_lb": "$number2_lb",
"number2_ub": "$number2_ub"
},
"pipeline": [
{"$match":
{"$expr":
{"$and": [
{"$gte": ["$number1", "$$number1_lb"]},
{"$gte": ["$number2", "$$number2_lb"]},
{"$lte": ["$number1", "$$number1_ub"]},
{"$lte": ["$number2", "$$number2_ub"]}
]}}}
],
"as": "recs"
}}
])
# pretty up the results
print(dumps(results, indent=4))
gives:
[
{
"_id": "4321",
"number1_lb": -61.205672407820025,
"number1_ub": 61.24170844385606,
"number2_lb": -149.75074963516136,
"number2_ub": 149.71471359912533,
"recs": [
{
"_id": "1234",
"number1": 1.282298,
"number2": 103.8475
}
]
}
]

You are looking to use a $lookup and a $project:
{
  $lookup: {
    from: "Collection2",
    localField: [field of Collection1 to join on],
    foreignField: [matching field of the foreign collection, here Collection2],
    as: "nameJoint"
  }
},
{
  $project: {
    "newFieldName": "$nameJoint"
  }
},
But to make a join between two documents there has to be a common field between those two documents. I am not sure there is one in this situation, or maybe I misunderstood it.
(A $lookup is basically a SQL join in NoSQL.)
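For reference, a minimal sketch of that localField/foreignField form in pymongo. The orders/products collections and their fields are hypothetical, purely to illustrate the shape of the stage, since the original question has no common field to join on:
from pymongo import MongoClient

db = MongoClient()["testdatabase"]

# Hypothetical schema: each order document stores a product_id matching products._id
results = db.orders.aggregate([
    {"$lookup": {
        "from": "products",           # foreign collection
        "localField": "product_id",   # field on the orders documents
        "foreignField": "_id",        # matching field on the products documents
        "as": "nameJoint"             # joined documents are added here as an array
    }},
    {"$project": {"product_id": 1, "newFieldName": "$nameJoint"}}
])
for doc in results:
    print(doc)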

Related

How to merge multiple documents in MongoDB and convert fields' values into fields

I have a MongoDB collection that I have managed to process using an aggregation pipeline to produce the following result:
[
{
_id: 'Complex Numbers',
count: 2
},
{ _id: 'Calculus',
count: 1
}
]
But the result that I am aiming for is something like the following:
{
'Complex Numbers': 2,
'Calculus': 1
}
is there a way to achieve that?
Query
To convert to an object with $arrayToObject we need input shaped like [[k1, v1], ...] or [{"k": ..., "v": ...}, ...].
The first stage converts each document to [{"k": ..., "v": ...}], then $arrayToObject and $replaceRoot turn it into a document like {"Complex Numbers": 2}.
The $group stage combines all those documents into one document with $mergeObjects, and the final $replaceRoot promotes that merged document to the root.
aggregate(
[{"$replaceRoot":
{"newRoot": {"$arrayToObject": [[{"k": "$_id", "v": "$count"}]]}}},
{"$group": {"_id": null, "data": {"$mergeObjects": "$$ROOT"}}},
{"$replaceRoot": {"newRoot": "$data"}}])

performing $lookup on subset of collection

I have this data
[
{"_id":0,"a":1,"b":1,"source":1},
{"_id":1,"a":1,"c":4,"source":1},
{"_id":2,"a":2,"d":6,"source":1},
{"_id":3,"a":2,"e":6,"source":1},
{"_id":4,"a":2,"f":6,"source":1},
{"_id":5,"a":3,"d":6,"source":1},
{"_id":6,"a":3,"b":1,"source":1},
{"_id":7,"a":3,"f":6,"source":1},
{"_id":8,"a":3,"qq":3,"source":2},
{"_id":9,"a":3,"fl":6,"source":2}
]
I want to return all documents whose a field is equal to the a field of a document that has a field b. Furthermore, all must be from source 1.
The final result should be this:
[
{"_id":0,a":1,"b":1,"source":1},
{"_id":1,"a":1,"c":4,"source":1},
{"_id":5,"a":3,"d":6,"source":1},
{"_id":6,"a":3,"b":1,"source":1},
{"_id":7,"a":3,"f":6,"source":1}
]
The following query gives me the results I want:
myCollection.aggregate([{"$match":{"b":{"$exists":true},"source":1}},
{"$group":{"_id":null, "a":{"$addToSet":"$a"}}},
{"$unwind":{"path":"$a"}},
{"$project":{"_id":false}},
{"$lookup":
{"from": "myCollection",
"localField":"a",
"foreignField":"a",
"as":"results"}},
{"$project":{"a":false}},
{"$unwind":{"path":"$results"}},
{"$replaceRoot":{"newRoot":"$results"}},
{"$match":{"source":1}}
])
However, having to add that last {"$match":{"source":1}} statement got me thinking that for large sets of data the $lookup statement is going to produce a lot of unwanted results that will then be filtered out by my last $match statement. Is there any way to prevent their generation by limiting $lookup to documents from myCollection where source equals 1?
ie replace
{"$lookup":
{"from": "myCollection"
with something like
{"$lookup":
{"from": myCollection.match({"source":1})
Alternatively, is there a more efficient pipeline I could be using?
You can filter documents in the pipeline of the $lookup stage. This helps to gain some performance and avoid unnecessary results. You can use it like below:
{
  "$lookup": {
    "from": "collection",
    "let": { "a_": "$a" },
    "pipeline": [
      { "$match": {
        "$expr": {
          "$and": [
            { "$eq": ["$source", 1] },
            { "$eq": ["$a", "$$a_"] }
          ]
        }
      }}
    ],
    "as": "results"
  }
}
Your $project stage,
{"$project": {"a": false}}
is actually unnecessary; you can omit it.
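Putting that together, a sketch of the complete pipeline in pymongo with the filtered $lookup swapped in for the original one (the database name testdatabase is assumed; myCollection is taken from the question):
from pymongo import MongoClient

db = MongoClient()["testdatabase"]

results = db.myCollection.aggregate([
    # collect the distinct "a" values of source-1 documents that have a "b" field
    {"$match": {"b": {"$exists": True}, "source": 1}},
    {"$group": {"_id": None, "a": {"$addToSet": "$a"}}},
    {"$unwind": "$a"},
    # look up only source-1 documents whose "a" matches
    {"$lookup": {
        "from": "myCollection",
        "let": {"a_": "$a"},
        "pipeline": [
            {"$match": {"$expr": {"$and": [
                {"$eq": ["$source", 1]},
                {"$eq": ["$a", "$$a_"]}
            ]}}}
        ],
        "as": "results"
    }},
    # the trailing {"$match": {"source": 1}} from the original pipeline is no longer needed
    {"$unwind": "$results"},
    {"$replaceRoot": {"newRoot": "$results"}}
])
for doc in results:
    print(doc)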

Link each element of array in a document to the corresponding element in an array of another document with MongoDB

Using MongoDB 4.2 and MongoDB Atlas to test aggregation pipelines.
I've got this products collection, containing documents with this schema:
{
"name": "TestProduct",
"relatedList": [
{id:ObjectId("someId")},
{id:ObjectId("anotherId")}
]
}
Then there's this cities collection, containing documents with this schema :
{
"name": "TestCity",
"instructionList": [
{ related_id: ObjectId("anotherId"), foo: bar},
{ related_id: ObjectId("someId"), foo: bar}
{ related_id: ObjectId("notUsefulId"), foo: bar}
...
]
}
My objective is to join both collections to output something like this (the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document) :
{
"name": "TestProduct",
"relatedList": [
{ related_id: ObjectId("someId"), foo: bar},
{ related_id: ObjectId("anotherId"), foo: bar},
]
}
I tried using the $lookup operator for aggregation like this :
$lookup:{
from: 'cities',
let: {rId:'$relatedList._id'},
pipeline: [
{
$match: {
$expr: {
$eq: ["$instructionList.related_id", "$$rId"]
}
}
},
]
}
But it's not working, I'm a bit lost with this complex pipeline syntax.
Edit
By using unwind on both arrays :
[
{$unwind: "$relatedList"},
{$lookup:{
from: "cities",
let: { "rId": "$relatedList.id" },
pipeline: [
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id","$$rId"]}}},
],
as:"instructionList",
}},
{$group: {
_id: "$_id",
instructionList: {$addToSet:"$instructionList"}
}}
]
I am able to achieve what I want, however,
I'm not getting a clean result at all :
{
"name": "TestProduct",
instructionList: [
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("someId")
}
}
],
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("anotherId")
}
}
]
]
}
How can I group everything to be as clean as stated for my original question ?
Again, I'm completely lost with the Aggregation framework.
the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document)
Given an example document on cities collection:
{"_id": ObjectId("5e4a22a08c54c8e2380b853b"),
"name": "TestCity",
"instructionList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"},
{"related_id": "c", "foo": "z"}
]}
and an example document on products collection:
{"_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"id": "a"},
{"id": "b"}
]}
You can try using the following aggregation pipeline:
db.products.aggregate([
{"$lookup":{
"from": "cities",
"let": { "rId": "$relatedList.id" },
"pipeline": [
{"$unwind":"$instructionList"},
{"$match":{
"$expr":{
"$in":["$instructionList.related_id", "$$rId"]
}
}
}],
"as":"relatedList",
}},
{"$project":{
"name":"$name",
"relatedList":{
"$map":{
"input":"$relatedList",
"as":"x",
"in":{
"related_id":"$$x.instructionList.related_id",
"foo":"$$x.instructionList.foo"
}
}
}
}}
]);
To get a result as the following:
{ "_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"}
]}
The above is tested in MongoDB v4.2.x.
But it's not working, I'm a bit lost with this complex pipeline syntax.
The reason why it's slightly complex here is that you have an array relatedList and also an array of subdocuments instructionList. When you refer to instructionList.related_id (which could mean multiple values) with the $eq operator, the pipeline doesn't know which one to match.
In the pipeline above, I've added an $unwind stage to turn instructionList into multiple single documents. Afterward, $in is used to express a match of the single instructionList.related_id value against the relatedList array.
I believe you just need to $unwind the arrays in order to lookup the relation, then $group to recollect them. Perhaps something like:
.aggregate([
{$unwind:"$relatedList"},
{$lookup:{
from:"cities",
let:{rId:"$relatedList.id"},
pipeline:[
{$match:{$expr:{$in:["$$rId", "$instructionList.related_id"]}}},
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id", "$$rId"]}}},
{$project:{_id:0, instruction:"$instructionList"}}
],
as: "lookedup"
}},
{$addFields: {"relatedList.foo":"$lookedup.0.instruction.foo"}},
{$group: {
_id:"$_id",
root: {$first:"$$ROOT"},
relatedList:{$push:"$relatedList"}
}},
{$addFields:{"root.relatedList":"$relatedList"}},
{$replaceRoot:{newRoot:"$root"}}
])
A little about each stage:
1. $unwind duplicates the entire document for each element of the array, replacing the array with the single element
2. $lookup can then consider each element separately. The stages in $lookup.pipeline:
a. $match (with $in, since instructionList.related_id is still an array at this point) so we only keep documents containing the matching ID
b. $unwind the array so we can consider individual elements
c. $match again (now with $eq against the single unwound element) so we are only left with matching elements (hopefully just 1)
3. $addFields assigns the foo field retrieved from the lookup to the object from relatedList
4. $group collects together all of the documents with the same _id (i.e. that were unwound from a single original document), stores the first as 'root', and pushes all of the relatedList elements back into an array
5. $addFields moves the relatedList into root
6. $replaceRoot returns the root, which should now be the original document with the matching foo added to each relatedList element

mongoDB find documents based on a condition with a value from linked document of another collection

I have collections of the following structure:
objects:
[{"type": "someTypeOne", "menuId": 1},
{"type": "someTypeTwo", "menuId": 1},
{"type": "someTypeOne", "menuId": 2}]
menus:
[{"id":1, "type": "someTypeOne"},
{"id":2, "type": "someTypeOne"}]
I need to find all objects where the "type" property doesn't match its menu's "type". In this case the desired output would be:
[{"type": "someTypeTwo", "menuId": 1}]
I think that I should use aggregation for this one and I'm fiddling with it at the moment, but I have not been able to formulate a working query so far.
Thanks
You can try below aggregation:
db.objects.aggregate([
{
$lookup: {
from: "menus",
localField: "menuId",
foreignField: "id",
as: "menu"
}
},
{
$unwind: "$menu"
},
{
$match: {
$expr: {
$ne: [ "$menu.type", "$type" ]
}
}
},
{
$project: {
menu: 0
}
}
])
$lookup allows you to get data from both collections; then you can run $unwind on the menu array to get a single menu per document, and apply your inequality condition using $match with $expr.
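For completeness, the same pipeline run from pymongo; a minimal sketch assuming the collections are named objects and menus as in the question, while the database name testdatabase is made up:
from pymongo import MongoClient

db = MongoClient()["testdatabase"]

# find objects whose "type" differs from the "type" of their linked menu
mismatched = db.objects.aggregate([
    {"$lookup": {
        "from": "menus",
        "localField": "menuId",
        "foreignField": "id",
        "as": "menu"
    }},
    {"$unwind": "$menu"},
    {"$match": {"$expr": {"$ne": ["$menu.type", "$type"]}}},
    {"$project": {"menu": 0}}
])
for doc in mismatched:
    print(doc)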

mongo aggregate group and find one

I have a collection in MongoDB which stores documents like:
{"tag":"count1","value":100,"ts":1544423706}
{"tag":"count2","value":1002,"ts":1544423706}
{"tag":"count1","value":101,"ts":1544423806}
{"tag":"count2","value":1003,"ts":1544423806}
{"tag":"count1","value":102,"ts":1544423906}
{"tag":"count2","value":1004,"ts":1544423906}
My problem is: for each tag, how can I get the first item whose "ts" is larger than 1544423800? As I described, I want to find the result as:
{"tag":"count1","value":101,"ts":1544423806}
{"tag":"count2","value":1003,"ts":1544423806}
Do I need to use aggregate to group by tag and then get the first item with "ts" larger than the given value? I am new to the aggregation framework in MongoDB.
db.index.aggregate([{"$match": {"tag": {"$in":["count1","count2"]},"ts": {"$gt":1544423800}}
},
{"$group": {"_id": "$tag",
"tags": {"$push": "$$ROOT"}}
},
])
The result is not one item for each tag, so I want to limit it to one item per tag. What do I have to do?
Thank you, I have solved this by:
db.index.aggregate(
[
{"$match": {"tag": {"$in":["count1","count2"]},
"ts": {"$gt":1545730000}}
},
{"$group": {"_id": "$tag",
"value": {"$first": "$value"}
}
},
]
)
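One caveat: $first in a $group takes documents in whatever order they arrive, so without a $sort the "first item above the threshold" is not guaranteed. A minimal pymongo sketch of that variant (the collection name index is from the question, the database name testdatabase is assumed):
from pymongo import MongoClient

db = MongoClient()["testdatabase"]

results = db.index.aggregate([
    # keep only the wanted tags above the timestamp threshold
    {"$match": {"tag": {"$in": ["count1", "count2"]}, "ts": {"$gt": 1544423800}}},
    # sort by ts so $first picks the earliest matching document per tag
    {"$sort": {"ts": 1}},
    {"$group": {
        "_id": "$tag",
        "value": {"$first": "$value"},
        "ts": {"$first": "$ts"}
    }}
])
for doc in results:
    print(doc)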