How can I group by a string instead of ObjectId in MongoDB aggregate?

I have two collections and a many-to-one relationship between them:
{
    "_id" : ObjectId("61cc81c9585946c3b44f24411"),
    "name" : "some random name",
    "price" : 100,
    "description" : "description",
    "category_id" : ObjectId("61cc8100585946c3b44f2317d")
}
{
    "_id" : ObjectId("61cc8100585946c3b44f2317d"),
    "description" : "Category description",
    "name" : "Electronics"
}
I would like to output the maximum product price for each category:
{ "$group": {
    "_id": "$category_id",
    "max": { "$max": "$price" }
} }
This works fine and prints the following:
{ "_id" : ObjectId("61cc80fb585946c3b44f697c"), "max" : 62}
{ "_id" : ObjectId("61cc8100585946c3b44f697d"), "max" : 100}
But is there a way to get the "name" from the Category instead of its object id?
I know in SQL you would group by category_name but it does not seem to work here.

As suggested by @prasad, you should add a $lookup stage after your $group stage.
"$group": {
"_id": "$category_id",
"max": {
"$max": "$price"
"$lookup": {
"from": "category",
"localField": "_id",
"foreignField": "_id",
"as": "categoryName",
"$set": {
"categoryName": {
"$arrayElemAt": [
Mongo Playground Sample
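The stages above are flattened and the last one is cut off, so here is a minimal sketch of how they might fit together. The collection name "category" comes from the answer's $lookup. The arguments to $arrayElemAt are an assumption (they are missing above), and "product" in the usage comment is a hypothetical collection name. $set needs MongoDB 4.2 or later.

```javascript
// Sketch only: the $arrayElemAt arguments are assumed, not taken from the answer.
const maxPricePerCategory = [
  // 1. One output document per category_id, carrying the highest price.
  { $group: { _id: "$category_id", max: { $max: "$price" } } },
  // 2. Join the matching category document(s) into an array field.
  { $lookup: { from: "category", localField: "_id", foreignField: "_id", as: "categoryName" } },
  // 3. Replace the array with the name of its first (and only) element.
  { $set: { categoryName: { $arrayElemAt: ["$categoryName.name", 0] } } },
];

// Usage in the shell ("product" is a hypothetical collection name):
// db.product.aggregate(maxPricePerCategory)
```

Grouping first and joining afterwards keeps the $lookup down to one lookup per category rather than one per product.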


MongoDB count related documents (3 levels)

I need to count related documents quickly.
I have four collections:
{ "_id" : "g1", "name" : "group1" }
{ "_id" : "g2", "name" : "group2" }
{ "_id" : "c1", "name" : "course1", "group_id" : "g1" }
{ "_id" : "c2", "name" : "course2", "group_id" : "g2" }
{ "_id" : "t1", "name" : "top1c11", "course_id" : "c1" }
{ "_id" : "t2", "name" : "top1c12", "course_id" : "c1" }
{ "_id" : "t3", "name" : "top1c21", "course_id" : "c2" }
{ "_id" : "l1", "name" : "lesson111", "topic_id" : "t1" }
{ "_id" : "l2", "name" : "lesson112", "topic_id" : "t1" }
{ "_id" : "l3", "name" : "lesson121", "topic_id" : "t2" }
{ "_id" : "l4", "name" : "lesson211", "topic_id" : "t3" }
I need to count all lessons of a specific group.
I tried to run the following aggregation, but it never returned a response (it does work on a small amount of data):
[
    { "$lookup": {
        "from": "topics",
        "let": { "topicId": "$topic_id" },
        "pipeline": [
            { "$match": { "$expr": { "$eq": [ "$_id", "$$topicId" ] } } },
            { "$lookup": {
                "from": "courses",
                "let": { "courseId": "$topic_id" },
                "pipeline": [
                    { "$match": { "$expr": { "$eq": [ "$course_id", "$$courseId" ] } } }
                ],
                "as": "course"
            } },
            { "$unwind": "$course" }
        ],
        "as": "topic"
    } },
    { "$unwind" : "$topic" },
    { "$match": {
        "topic.course.group_id" : "g1"
    } },
    { $group: {
        _id: "$course",
        "amount": {$sum:1}
    } }
]
I believe this aggregation can be optimized, but I am not sure that using the aggregation framework is a good approach for this purpose. If it is, how can I optimize the aggregation?
Size of collections (test data):
courses: 30000
topics: 200000
lessons: 30000000
For now I use simple nested loops in my code to count lessons. This takes 10 seconds (for the 3000 topics of a given group).
Query 1: non-nested lookups ($lookup and $unwind)
match the group
$lookup and $unwind 3x; the last $lookup only counts the lessons and uses a pipeline $lookup
group by the group _id to get the total lessons
Indexes you need: one on every foreignField
Test code here
[{"$group":{"_id":null, "lessons":{"$sum":1}}},
{"$set":{"id":"$_id", "_id":"$$REMOVE"}}],
[{"$eq":["$lessons", []]}, 0,
{"$arrayElemAt":["$lessons.lessons", 0]}]}}},
{"$group":{"_id":"$_id", "totalLessons":{"$sum":"$lessons"}}}])
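Only the tail of the first query survives above, so the following is a sketch of how the listed steps could be assembled, not the answerer's exact code. The collection names groups, courses, topics and lessons are assumptions. The field names come from the sample documents, and the surviving fragments are reused where they fit. $set needs MongoDB 4.2 or later.

```javascript
// Assumed collection names: groups, courses, topics, lessons.
const groupId = "g1";

const countLessonsForGroup = [
  // Start from the single group we care about.
  { $match: { _id: groupId } },
  // groups -> courses
  { $lookup: { from: "courses", localField: "_id", foreignField: "group_id", as: "courses" } },
  { $unwind: "$courses" },
  // courses -> topics
  { $lookup: { from: "topics", localField: "courses._id", foreignField: "course_id", as: "topics" } },
  { $unwind: "$topics" },
  // topics -> lessons: count inside the sub-pipeline instead of returning documents.
  { $lookup: {
      from: "lessons",
      let: { ptopic: "$topics._id" },
      pipeline: [
        { $match: { $expr: { $eq: ["$$ptopic", "$topic_id"] } } },
        { $group: { _id: null, lessons: { $sum: 1 } } },
      ],
      as: "lessons",
  } },
  // A topic without lessons yields an empty array; turn that into 0.
  { $set: { lessons: { $cond: [
      { $eq: ["$lessons", []] }, 0,
      { $arrayElemAt: ["$lessons.lessons", 0] },
  ] } } },
  // Sum the per-topic counts back up to the group.
  { $group: { _id: "$_id", totalLessons: { $sum: "$lessons" } } },
];

// One index per foreign field used in the joins,
// e.g. in the shell: db.lessons.createIndex({ topic_id: 1 })
const indexes = [
  { collection: "courses", key: { group_id: 1 } },
  { collection: "topics", key: { course_id: 1 } },
  { collection: "lessons", key: { topic_id: 1 } },
];
```

With the sample documents from the question, group g1 should come out with a total of 3 (two lessons under t1, one under t2).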
Query 2: nested lookups (without $unwind)
The code is the same, just nested.
Test code here
[{"$group":{"_id":null, "lessons":{"$sum":1}}},
{"$set":{"id":"$_id", "_id":"$$REMOVE"}}],
[{"$eq":["$lessons", []]}, 0,
{"$arrayElemAt":["$lessons.lessons", 0]}]}}}],
{"_id":0, "totalLessons":{"$sum":"$topics.lessons"}}}],
[{"$eq":["$courses", []]}, 0,
{"$arrayElemAt":["$courses.totalLessons", 0]}]}}}])
If you can, please send some feedback on which one was faster.
If it is very fast for one group, you could remove the $match to run it for all groups, or let the $match pass several groups.
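As a small illustration of that last suggestion, the leading $match could be built from a list of group ids. The helper name groupFilterStages is hypothetical; the ids are the sample ones from the question.

```javascript
// Hypothetical helper: builds the first stage(s) of the counting pipeline.
// An empty list means "all groups", so no $match stage is emitted.
function groupFilterStages(groupIds) {
  if (groupIds.length === 0) return [];
  return [{ $match: { _id: { $in: groupIds } } }];
}

const forTwoGroups = groupFilterStages(["g1", "g2"]);
const forAllGroups = groupFilterStages([]);
```

The remaining stages stay the same, because the final $group is keyed on the group _id and so produces one total per group.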
Solution from Takis's comment: Query 1, adapted for MongoDB 4.2.
[{"$match":{"$expr":{"$eq":["$$ptopic", "$topic_id"]}}},
{"$group":{"_id":null, "lessons":{"$sum":1}}},
{"$set":{"id":"$_id", "_id":"$$REMOVE"}}],
[{"$eq":["$lessons", []]}, 0,
{"$arrayElemAt":["$lessons.lessons", 0]}]}}},
{"$group":{"_id":"$_id", "totalLessons":{"$sum":"$lessons"}}}])

Mongodb aggregate lookup three collections

I have been learning MongoDB for the past two days and am trying to aggregate three collections, but I have not been able to get it working.
Below are the three collections in the database:
t_credentials
{
    "_id" : "619ca68b624c41e408348406",
    "title" : "Company ID"
}
t_groups
{
    "_id" : "61a253da88ca12a37218898d",
    "group_name" : "Gold"
}
t_user_credentials
{
    "_id" : "619ca88a624c41e408348424",
    "credential_id" : "619ca68b624c41e408348406",
    "group_id" : "61a253da88ca12a37218898d",
    "identifiers" : {
        "first_name" : "Lee",
        "middle_name" : "Min",
        "last_name" : "Ho"
    },
    "created_at" : "2021-12-01T17:20:49.000Z"
}
This is the output format I am trying to achieve:
{
    "_id" : "619ca88a624c41e408348424",
    "first_name" : ,
    "middle_name" : ,
    "last_name" : ,
    "credential" : {
        "_id" : "619ca68b624c41e408348406",
        "title" : "Company ID"
    },
    "group" : {
        "_id" : "61a253da88ca12a37218898d",
        "group_name" : "Gold"
    },
    "created_at" : "2021-12-01T17:20:49.000Z"
}
However, I am only getting the fields from t_user_credentials, not the format shown above:
[
    { $lookup: {
        from: "t_credentials",
        localField: "_id",
        foreignField: "credential_id",
        as: "credentials"
    } },
    { $unwind: {
        path: '$credentials',
        preserveNullAndEmptyArrays: true
    } },
    { $lookup: {
        from: "t_groups",
        localField: "_id",
        foreignField: "group_id",
        as: "groups"
    } },
    { $unwind: {
        path: '$groups',
        preserveNullAndEmptyArrays: true
    } },
    { $project: {
        last_name: "$identifiers.last_name",
        first_name: "$identifiers.first_name",
        middle_name: "$identifiers.middle_name",
        "credentials.title": 1,
        created_at: 1,
        group_id: 1
    } }
]
Can anyone help me solve this issue? It would be very helpful.
This query uses $replaceWith to merge the identifiers sub-document into the $$ROOT document. We also use $unset to remove fields we are no longer interested in. Before all of that we make sure to unwind our credential and group fields.
You can check out a live demo of this query here
Consider the following:
"t_credentials": [
"_id": "619ca68b624c41e408348406",
"title": "Company ID"
"t_groups": [
"_id": "61a253da88ca12a37218898d",
"group_name": "Gold"
"t_user_credentials": [
"_id": "619ca88a624c41e408348424",
"credential_id": "619ca68b624c41e408348406",
"group_id": "61a253da88ca12a37218898d",
"identifiers": {
"first_name": "Lee",
"middle_name": "Min",
"last_name": "Ho"
"created_at": "2021-12-01T17:20:49.000Z"
"$lookup": {
"from": "t_credentials",
"localField": "credential_id",
"foreignField": "_id",
"as": "credential"
"$lookup": {
"from": "t_groups",
"localField": "group_id",
"foreignField": "_id",
"as": "group"
$unwind: "$group",
$unwind: "$credential"
$replaceWith: {
$mergeObjects: [
$unset: [
"_id": "619ca88a624c41e408348424",
"created_at": "2021-12-01T17:20:49.000Z",
"credential": {
"_id": "619ca68b624c41e408348406",
"title": "Company ID"
"first_name": "Lee",
"group": {
"_id": "61a253da88ca12a37218898d",
"group_name": "Gold"
"last_name": "Ho",
"middle_name": "Min"
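The pipeline above is cut off inside $mergeObjects and $unset, so here is a sketch of what the complete pipeline might look like. The arguments to those two operators are assumptions, inferred from the description (merge identifiers into $$ROOT, drop the fields that are no longer needed) and from the sample output. $replaceWith and $unset need MongoDB 4.2 or later.

```javascript
// Sketch only: the $mergeObjects and $unset arguments are assumed.
const userCredentialsPipeline = [
  // The user document holds the foreign keys, so they are the localField.
  { $lookup: { from: "t_credentials", localField: "credential_id", foreignField: "_id", as: "credential" } },
  { $lookup: { from: "t_groups", localField: "group_id", foreignField: "_id", as: "group" } },
  // Each lookup yields a one-element array; unwind turns it into an object.
  { $unwind: "$group" },
  { $unwind: "$credential" },
  // Lift first_name / middle_name / last_name to the top level.
  { $replaceWith: { $mergeObjects: ["$$ROOT", "$identifiers"] } },
  // Remove the fields that are now redundant.
  { $unset: ["identifiers", "credential_id", "group_id"] },
];

// Usage in the shell: db.t_user_credentials.aggregate(userCredentialsPipeline)
```

Compared with the query in the question, localField and foreignField are the other way round: the reference ids live on t_user_credentials and point at _id in the other two collections.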

$lookup nested array in mongodb

I am struggling with the newish (lovely) lookup operator in MongoDB. I have 3 collections:
"_id" : ObjectId("5b0d2b2c7ac4792df69a9942"),
"name" : "Dream Theater",
"started_in" : NumberInt(1985),
"active" : true,
"country" : "US",
"current_members" : [
"previous_members" : [
"albums" : [
"genres" : [
"prog metal",
"prog rock"
"_id" : ObjectId("5b0d16ee7ac4792df69a9924"),
"title" : "Images and words",
"released" : ISODate("1992-07-07T00:00:00.000+0000"),
"songs" : [
"type" : "LP"
"title" : "Awake",
"released" : ISODate("1994-10-04T00:00:00.000+0000"),
"songs" : [
"type" : "LP",
"_id" : ObjectId("5b0d47667ac4792df69a9994")
"_id" : ObjectId("5b0d15ab7ac4792df69a9916"),
"title" : "Pull me under"
"_id" : ObjectId("5b0d15ee7ac4792df69a991e"),
"title" : "Another day"
"title" : "Take the time",
"_id" : ObjectId("5b0d2db37ac4792df69a995d")
"title" : "Surrounded",
"_id" : ObjectId("5b0d2dbe7ac4792df69a995e")
"title" : "Metropolis - part I",
"_id" : ObjectId("5b0d2dcb7ac4792df69a995f")
"title" : "Under a glass moon",
"_id" : ObjectId("5b0d2dd87ac4792df69a9960")
"title" : "Wait for sleep",
"_id" : ObjectId("5b0d2de27ac4792df69a9961")
"title" : "Learning to live",
"_id" : ObjectId("5b0d2dec7ac4792df69a9962")
"title" : "6:00",
"_id" : ObjectId("5b0d470d7ac4792df69a9991")
I can easily do an aggregation with $lookup to get the detailed albums array, but how do I get also the detailed songs in the corresponding albums?
I would like to extend the following query:
db.artists.aggregate([ {
    $lookup: {
        from: "albums",
        localField: "albums",
        foreignField: "_id",
        as: "albums"
    }
} ])
If you are on MongoDB 3.6 or later, you can use a nested $lookup aggregation:
{ "$lookup": {
"let": { "albums": "$albums" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$albums" ] } } },
{ "$lookup": {
"let": { "songs": "$songs" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$songs" ] } } }
"as": "songs"
"as": "albums"
For a longer explanation, see "$lookup multiple levels without $unwind?".
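The nested $lookup above has lost its "from" lines and closing braces, so here is a sketch of the complete stage. "albums" is taken from the question's own query; "songs" is an assumption about what the third collection is called.

```javascript
// Sketch only: the "from" values are not present in the flattened answer.
const artistsWithAlbumsAndSongs = [
  { $lookup: {
      from: "albums",
      let: { albums: "$albums" },          // the artist's array of album _ids
      pipeline: [
        { $match: { $expr: { $in: ["$_id", "$$albums"] } } },
        { $lookup: {
            from: "songs",
            let: { songs: "$songs" },      // the album's array of song _ids
            pipeline: [
              { $match: { $expr: { $in: ["$_id", "$$songs"] } } },
            ],
            as: "songs",
        } },
      ],
      as: "albums",
  } },
];

// Usage in the shell: db.artists.aggregate(artistsWithAlbumsAndSongs)
```

The aggregation $in operator expects an array as its second argument, so documents that lack an albums or songs array may need an $ifNull guard in the let expressions.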
Or, if you are on a MongoDB version prior to 3.6:
{ "$lookup": {
"localField": "albums",
"foreignField": "_id",
"as": "albums"
{ "$unwind": "$albums" },
{ "$lookup": {
"localField": "albums.songs",
"foreignField": "_id",
"as": "albums.songs",
{ "$group": {
"_id": "$_id",
"name": { "$first": "$name" },
"started_in": { "$first": "$started_in" },
"active": { "$first": "$active" },
"country": { "$first": "$country" },
"albums": {
"$push": {
"_id": "$albums._id",
"title": "$albums.title",
"released": "$albums.released",
"type": "$albums.type",
"songs": "$albums.songs"

MongoDB join data inside an array of objects

I have a document like this in a collection called diagnoses:
{
    "_id" : ObjectId("582d43d18ec3f432f3260682"),
    "patientid" : ObjectId("582aacff3894c3afd7ad4677"),
    "doctorid" : ObjectId("582a80c93894c3afd7ad4675"),
    "medicalcondition" : "high fever, cough, runny nose.",
    "diagnosis" : "Viral Flu",
    "addmissiondate" : "2016-01-12",
    "dischargedate" : "2016-01-16",
    "bhtno" : "125",
    "prescription" : [
        {
            "drug" : ObjectId("58345e0e996d340bd8126149"),
            "instructions" : "Take 2 daily, after meals."
        },
        {
            "drug" : ObjectId("5836bc0b291918eb42966320"),
            "instructions" : "Take 1 daily, after meals."
        }
    ]
}
The drug id inside the prescription object array is from a separate collection called drugs; see the sample document below:
{
    "_id" : ObjectId("58345e0e996d340bd8126149"),
    "genericname" : "Paracetamol Tab 500mg",
    "type" : "X",
    "isbrand" : false
}
I am trying to create a MongoDB query using the native Node.js driver to get a result like this:
{
    "_id" : ObjectId("582d43d18ec3f432f3260682"),
    "patientid" : ObjectId("582aacff3894c3afd7ad4677"),
    "doctorid" : ObjectId("582a80c93894c3afd7ad4675"),
    "medicalcondition" : "high fever, cough, runny nose.",
    "diagnosis" : "Viral Flu",
    "addmissiondate" : "2016-01-12",
    "dischargedate" : "2016-01-16",
    "bhtno" : "125",
    "prescription" : [
        {
            "drug" : {
                "_id" : ObjectId("58345e0e996d340bd8126149"),
                "genericname" : "Paracetamol Tab 500mg",
                "type" : "X",
                "isbrand" : false
            },
            "instructions" : "Take 2 daily, after meals."
        }
    ]
}
Any advice on how to get a result like this is much appreciated, thanks.
Using MongoDB 3.4.4 and newer:
With the aggregation framework, the $lookup operator supports arrays as the localField:
{ "$addFields": {
"prescription": { "$ifNull" : [ "$prescription", [ ] ] }
} },
{ "$lookup": {
"from": "drugs",
"localField": "prescription.drug",
"foreignField": "_id",
"as": "drugs"
} },
{ "$addFields": {
"prescription": {
"$map": {
"input": "$prescription",
"in": {
"$mergeObjects": [
{ "drug": {
"$arrayElemAt": [
"$indexOfArray": [
} }
} },
{ "$project": { "drugs": 0 } }
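The middle of the pipeline above is cut off, so here is a sketch of how the complete version might read. The arguments to $arrayElemAt and $indexOfArray are assumptions: for each prescription element, the drug document whose _id equals the element's drug id is picked out of the joined "drugs" array.

```javascript
// Sketch only: the $arrayElemAt / $indexOfArray arguments are assumed.
const populatePrescriptionDrugs = [
  // Make sure $map always has an array to iterate over.
  { $addFields: { prescription: { $ifNull: ["$prescription", []] } } },
  // localField is an array path, so every referenced drug is fetched at once.
  { $lookup: { from: "drugs", localField: "prescription.drug", foreignField: "_id", as: "drugs" } },
  { $addFields: {
      prescription: {
        $map: {
          input: "$prescription",
          in: {
            // Keep the element's own fields and overwrite "drug" with the
            // joined document that has the same _id.
            $mergeObjects: [
              "$$this",
              { drug: { $arrayElemAt: [
                  "$drugs",
                  { $indexOfArray: ["$drugs._id", "$$this.drug"] },
              ] } },
            ],
          },
        },
      },
  } },
  // Drop the temporary join result.
  { $project: { drugs: 0 } },
];

// Usage in the shell: db.diagnoses.aggregate(populatePrescriptionDrugs)
```

One thing to watch: $indexOfArray returns -1 when nothing matches, and $arrayElemAt treats -1 as "last element", so a dangling drug id would silently pick up the wrong document unless it is guarded against.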
For older MongoDB versions:
You can create a pipeline that first flattens the prescription array with the $unwind operator, then uses a $lookup stage to do a "left outer join" on the "drugs" collection. Apply another $unwind to the array created in the "joined" field. Finally, $group the flattened documents by _id to rebuild the prescription array, since the first $unwind outputs one document for each element of that array.
Assembling the above pipeline, run the following aggregate operation:
db.diagnoses.aggregate([
    {
        "$project": {
            "patientid": 1,
            "doctorid": 1,
            "medicalcondition": 1,
            "diagnosis": 1,
            "addmissiondate": 1,
            "dischargedate": 1,
            "bhtno": 1,
            "prescription": { "$ifNull" : [ "$prescription", [ ] ] }
        }
    },
    {
        "$unwind": {
            "path": "$prescription",
            "preserveNullAndEmptyArrays": true
        }
    },
    {
        "$lookup": {
            "from": "drugs",
            "localField": "prescription.drug",
            "foreignField": "_id",
            "as": "prescription.drug"
        }
    },
    { "$unwind": "$prescription.drug" },
    {
        "$group": {
            "_id": "$_id",
            "patientid" : { "$first": "$patientid" },
            "doctorid" : { "$first": "$doctorid" },
            "medicalcondition" : { "$first": "$medicalcondition" },
            "diagnosis" : { "$first": "$diagnosis" },
            "addmissiondate" : { "$first": "$addmissiondate" },
            "dischargedate" : { "$first": "$dischargedate" },
            "bhtno" : { "$first": "$bhtno" },
            "prescription" : { "$push": "$prescription" }
        }
    }
])
Sample Output
{
    "_id" : ObjectId("582d43d18ec3f432f3260682"),
    "patientid" : ObjectId("582aacff3894c3afd7ad4677"),
    "doctorid" : ObjectId("582a80c93894c3afd7ad4675"),
    "medicalcondition" : "high fever, cough, runny nose.",
    "diagnosis" : "Viral Flu",
    "addmissiondate" : "2016-01-12",
    "dischargedate" : "2016-01-16",
    "bhtno" : "125",
    "prescription" : [
        {
            "drug" : {
                "_id" : ObjectId("58345e0e996d340bd8126149"),
                "genericname" : "Paracetamol Tab 500mg",
                "type" : "X",
                "isbrand" : false
            },
            "instructions" : "Take 2 daily, after meals."
        },
        {
            "drug" : {
                "_id" : ObjectId("5836bc0b291918eb42966320"),
                "genericname" : "Paracetamol Tab 100mg",
                "type" : "Y",
                "isbrand" : false
            },
            "instructions" : "Take 1 daily, after meals."
        }
    ]
}
In MongoDB 3.6 or later versions:
It seems that $lookup overwrites the original array instead of merging into it. A working solution (a workaround, if you prefer) is to write the lookup result to a different field and then merge the two fields, as shown below:
{ "$lookup": {
"from": "drugs",
"localField": "prescription.drug",
"foreignField": "_id",
"as": "prescription_drug_info"
} },
{ "$addFields": {
"merged_drug_info": {
"$map": {
"input": "$prescription",
"in": {
"$mergeObjects": [
{ "$arrayElemAt": [
] }
} }
This adds two extra fields; the desired result ends up in merged_drug_info. We can then add a $set stage to copy it into prescription and a $project stage to filter out the extra fields:
{ "$set": { "prescription": "$merged_drug_info" } },
{ "$project": { "prescription_drug_info": 0, "merged_drug_info": 0 } }

mongo $unwind and $group

I have two collections. I want documents in one to reference documents in the other, and to have those references populated in the result.
Here is an example of the JSON I am trying to get as the result:
{
    "title": "Some Title",
    "uid": "some-title",
    "created_at": "1412159926",
    "updated_at": "1412159926",
    "id": "1",
    "metadata": {
        "date": "2016-10-17",
        "description": "a description"
    },
    "tags": [
        {
            "name": "Tag 1",
            "uid": "tag-1"
        },
        {
            "name": "Tag 2",
            "uid": "tag-2"
        },
        {
            "name": "Tag 3",
            "uid": "tag-3"
        }
    ]
}
Here is the mongo query I have, which gets me close, but it nests the original body of the item within the _id object.
[{
    $unwind: "$tags"
}, {
    $lookup: {
        from: "tags",
        localField: "tags",
        foreignField: "_id",
        as: "tags"
    }
}, {
    $unwind: "$tags"
}, {
    $group: {
        "_id": {
            "title": "$title",
            "uid": "$uid",
            "metadata": "$metadata"
        },
        "tags": {
            "$push": "$tags"
        }
    }
}]
So the result is this:
{
    "_id" : {
        "title" : "Some Title",
        "uid" : "some-title",
        "metadata" : {
            "date" : "2016-10-17",
            "description" : "a description"
        }
    },
    "tags" : [
        {
            "_id" : ObjectId("580499d06fe29ce7093fb53a"),
            "name" : "Tag 1",
            "uid" : "tag-1"
        },
        {
            "_id" : ObjectId("580499d06fe29ce7093fb53b"),
            "name" : "Tag 2",
            "uid" : "tag-2"
        }
    ]
}
Is there a way to achieve the desired output? Also, is there a way to avoid defining in the $group every field I wish to return? I would like to return the original object, but with the referenced documents in the tags array.
Since you initially pivoted your original documents on the tags array field, the documents are denormalized, so your $group stage should use the _id field as its _id key and access the other fields using the $first or $last operator.
The $group pipeline operator is similar to SQL's GROUP BY clause. In SQL, every selected column that is not part of the GROUP BY has to be wrapped in an aggregate function. The same applies in MongoDB, so with this approach each field you want to return has to be listed in the $group stage with the $first or $last operator:
[
    { "$unwind": "$tags" },
    {
        "$lookup": {
            "from": "tags",
            "localField": "tags",
            "foreignField": "_id",
            "as": "resultingArray"
        }
    },
    { "$unwind": "$resultingArray" },
    {
        "$group": {
            "_id": "$_id",
            "title": { "$first": "$title" },
            "uid": { "$first": "$uid" },
            "created_at": { "$first": "$created_at" },
            "updated_at": { "$first": "$updated_at" },
            "id": { "$first": "$id" },
            "metadata": { "$first": "$metadata" },
            "tags": { "$push": "$resultingArray" }
        }
    }
]
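If listing every field is a burden, one possible alternative on MongoDB 3.6 or later is to carry the whole document through the $group with $$ROOT and rebuild it afterwards. This is offered as a sketch rather than as part of the answer above; the helper field name "doc" is an arbitrary choice.

```javascript
// Alternative sketch (assumes MongoDB 3.6+ for $mergeObjects).
const populateTags = [
  { $unwind: "$tags" },
  { $lookup: { from: "tags", localField: "tags", foreignField: "_id", as: "resultingArray" } },
  { $unwind: "$resultingArray" },
  { $group: {
      _id: "$_id",
      doc: { $first: "$$ROOT" },             // one full copy of the unwound document
      tags: { $push: "$resultingArray" },    // the joined tag documents
  } },
  // Rebuild the original shape: every original field, with tags overwritten.
  { $replaceRoot: { newRoot: { $mergeObjects: ["$doc", { tags: "$tags" }] } } },
  // $$ROOT also carried the temporary join field along; drop it.
  { $project: { resultingArray: 0 } },
];
```

The trade-off is that each group temporarily holds a full copy of the document, which is usually fine for documents of this size.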
One trick I always use when debugging a pipeline that gives unexpected results is to run the aggregation with just the first pipeline stage. If that gives the expected result, add the next.
In the answer above, you would first try aggregating with just the $unwind; if that works, add the $lookup. This helps you narrow down which stage is causing issues. In this case, since you believe the $group is the one causing issues, you could run the pipeline with just the first three stages and inspect the resulting documents:
[
    { "$unwind": "$tags" },
    {
        "$lookup": {
            "from": "tags",
            "localField": "tags",
            "foreignField": "_id",
            "as": "resultingArray"
        }
    },
    { "$unwind": "$resultingArray" }
]
which yields the output
/* 1 */
"_id" : ObjectId("5804a6c900ce8cbd028523d9"),
"title" : "Some Title",
"uid" : "some-title",
"created_at" : "1412159926",
"updated_at" : "1412159926",
"id" : "1",
"metadata" : {
"date" : "2016-10-17",
"description" : "a description"
"resultingArray" : {
"name" : "Tag 1",
"uid" : "tag-1"
/* 2 */
"_id" : ObjectId("5804a6c900ce8cbd028523d9"),
"title" : "Some Title",
"uid" : "some-title",
"created_at" : "1412159926",
"updated_at" : "1412159926",
"id" : "1",
"metadata" : {
"date" : "2016-10-17",
"description" : "a description"
"resultingArray" : {
"name" : "Tag 2",
"uid" : "tag-2"
/* 3 */
"_id" : ObjectId("5804a6c900ce8cbd028523d9"),
"title" : "Some Title",
"uid" : "some-title",
"created_at" : "1412159926",
"updated_at" : "1412159926",
"id" : "1",
"metadata" : {
"date" : "2016-10-17",
"description" : "a description"
"resultingArray" : {
"name" : "Tag 3",
"uid" : "tag-3"
From inspection you will see that, for each input document, the last stage outputs three documents, one per element of the computed resultingArray field. They share the same _id and all other fields; only resultingArray differs. You therefore get the desired result by adding a stage that groups the documents by the _id field and takes the other fields with the $first or $last operator, as in the solution given above.
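The stage-by-stage debugging approach described above can be sketched as a small helper that produces every prefix of a pipeline, so each one can be run and inspected in turn. The helper name pipelinePrefixes and the collection name "items" are hypothetical, and the $group here is shortened for brevity.

```javascript
// Hypothetical helper: returns every prefix of a pipeline, shortest first.
function pipelinePrefixes(pipeline) {
  return pipeline.map((_, i) => pipeline.slice(0, i + 1));
}

const pipeline = [
  { $unwind: "$tags" },
  { $lookup: { from: "tags", localField: "tags", foreignField: "_id", as: "resultingArray" } },
  { $unwind: "$resultingArray" },
  { $group: { _id: "$_id", title: { $first: "$title" }, tags: { $push: "$resultingArray" } } },
];

for (const prefix of pipelinePrefixes(pipeline)) {
  // Name of the stage operator that was added last, e.g. "$lookup".
  const lastStage = Object.keys(prefix[prefix.length - 1])[0];
  console.log(`${prefix.length} stage(s), ending in ${lastStage}`);
  // In the shell you would inspect the documents at this point, e.g.:
  // printjson(db.items.aggregate(prefix).toArray())  // "items" is an assumed name
}
```

The first prefix whose output looks wrong points at the stage that needs attention.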