Need a Mongo query to generate a particular result-set with aggregation - mongodb

I am new to MongoDB and need to generate a custom result set using the aggregate operator. Any help would be appreciated.
I need to group the collection by "company" and "status" and produce the result set shown below.
Collection
[
{
"company": "google",
"status": "active",
"offer": {
"job": "developer",
"salary": 10000.00
},
},
{
"company": "google",
"status": "active",
"offer": {
"job": "designer",
"salary": 500000.00
},
},
{
"company": "amazon",
"status": "inactive",
"offer": {
"job": "designer",
"salary": 500000.00
},
}
]
Expected Result-Set
[
{
"company" : "google",
"report" : [{
"status" : "active",
"totalSalary" : 60000
},
{
"status" : "inactive",
"totalSalary" : 0
}]
},
{
"company" : "amazon",
"report" : [{
"status" : "active",
"totalSalary" : 0
},
{
"status" : "inactive",
"totalSalary" : 500000.00
}]
}
]

You should definitely check the official documentation on aggregation. It is a bit complicated at first, but once you get the hang of it, aggregations are great. I also recommend https://mongoplayground.net/, which is a great site for running this kind of test.
What you're looking for is something like this:
db.collection.aggregate([
{
$group: {
_id: {
company: "$company"
},
report: {
$addToSet: "$offer"
}
}
}
])
You can test it on the playground. You will probably also want to rename the resulting _id field, which is mandatory in a $group stage; a later $project stage can do that.
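The $group above collects the offers per company but does not total the salaries per status. Here is a minimal sketch of one way to get the requested shape. It assumes the full list of statuses is known up front (the `statuses` array is a hypothetical name) and it has not been run against real data. Note that with the sample documents the google/active total comes to 510000, not the 60000 shown in the question.

```javascript
// Assumption: the complete list of statuses is known in advance.
const statuses = ["active", "inactive"];

const pipeline = [
  // 1. Total salary per (company, status) pair.
  { $group: {
      _id: { company: "$company", status: "$status" },
      totalSalary: { $sum: "$offer.salary" }
  }},
  // 2. Collect the per-status totals under each company.
  { $group: {
      _id: "$_id.company",
      found: { $push: { status: "$_id.status", totalSalary: "$totalSalary" } }
  }},
  // 3. Emit one report entry per known status, defaulting to 0
  //    when the company has no documents with that status.
  { $project: {
      _id: 0,
      company: "$_id",
      report: { $map: {
          input: statuses,
          as: "s",
          in: {
            status: "$$s",
            totalSalary: { $let: {
                vars: { hit: { $arrayElemAt: [
                  { $filter: { input: "$found", as: "f",
                               cond: { $eq: ["$$f.status", "$$s"] } } },
                  0
                ] } },
                in: { $ifNull: ["$$hit.totalSalary", 0] }
            }}
          }
      }}
  }}
];

// In the mongo shell: db.collection.aggregate(pipeline)
```

The second $group is what turns the flat (company, status) rows back into one document per company.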

Related

Get paginated records and total records in one mongodb aggregate query

My MongoDB departments data is structured as shown below:
[{
category: "ABC",
sections: [
{
section_hod: "x111",
section_name: "SECTION A",
section_staff_count: "v11111",
section_id: "a1111",
:
},
{
section_hod: "x2222",
section_name: "SECTION B",
section_staff_count: "v2222",
section_id: "a2222",
:
}
]
}
:
:
]
I wrote the MongoDB query shown below:
db.getSiblingDB("departments").getCollection("DepartmentDetails").aggregate([
{ $unwind : "$sections"},
{ $match : { $and : [{ "sections.section_name" : "SECTION A"},
{ $or : [{ "category" : "ABC"}]}]}},
{$project : { "name" : "$sections.section_name", "hod" : "$sections.section_hod", "staff_count" : "$sections.section_staff_count", "id" : "$sections.section_id"}},
{$skip: 0}, {$limit: 10}
]);
It gives me a list of section details, as shown below, containing name, hod, staff_count, id, and so on:
[
{
"name": "xxxxx",
"hod": "xxxxx",
"staff_count": "xxxxx",
"id": "xxxxx"
},
{
"name": "yyyyy",
"hod": "yyyyy",
"staff_count": "yyyyy",
"id": "yyyyy"
}
:
:
:
]
Everything looks good, but the list contains a great many records and I am trying to build pagination on top of it. I know I can use $skip and $limit to step through the pages, but to do that I need to know the total count of all the records.
I can do this in two ways. The first is to execute two queries: a count, and then the aggregate query with skip and limit. The second is to execute a single query that returns both the total count and the documents for the first page.
I am trying to implement the second way, and the expected result is shown below:
{
"documents": [
{
"name": "xxxxx",
"hod": "xxxxx",
"staff_count": "xxxxx",
"id": "xxxxx"
},
{
"name": "yyyyy",
"hod": "yyyyy",
"staff_count": "yyyyy",
"id": "yyyyy"
}
:
:
:
],
"totalCount": 5444
}
I am not sure whether this is achievable. Can someone please help me with this? My default limit is 10.
You can do it like this. It gives you the total record count and the paginated results in one go:
db.getSiblingDB("departments").getCollection("DepartmentDetails")
.aggregate([
{ $unwind : "$sections"},
{ $match : { $and : [{ "sections.section_name" : "SECTION A"},
{ $or : [{ "category" : "ABC"}]}]}},
{
$project : {
"name" : "$sections.section_name",
"hod" : "$sections.section_hod",
"staff_count" : "$sections.section_staff_count",
"id" : "$sections.section_id"
}
},
{
$facet: {
metaData: [{
$count: 'total'
}],
records: [
{$skip: 0},
{$limit: 10}
]
}
},
{
$project: {
records: 1,
total: {
$let: {
vars: {
totalObj: {
$arrayElemAt: ['$metaData', 0]
}
},
in: '$$totalObj.total'
}
},
}
}
]);
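Between page requests only the $skip value changes. Here is a small sketch of a helper that builds the $facet stage for a given page. The names `facetForPage`, `page`, and `pageSize` are hypothetical, and pages are assumed to be 1-based.

```javascript
// Hypothetical helper: builds the $facet stage for a 1-based page number.
function facetForPage(page, pageSize) {
  const skip = (page - 1) * pageSize;
  return {
    $facet: {
      metaData: [{ $count: "total" }],
      records: [{ $skip: skip }, { $limit: pageSize }]
    }
  };
}

// Third page with the default limit of 10: skips the first 20 records.
const stage = facetForPage(3, 10);
```

Substitute the returned stage for the literal $facet stage in the pipeline. Be aware that when nothing matches, $count emits no document, so metaData is empty and the final $project omits total rather than returning 0.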

group by field and count mongodb [duplicate]

I have the following collection:
{
"_id" : ObjectId("5b18d14cbc83fd271b6a157c"),
"status" : "pending",
"description" : "You have to complete the challenge...",
}
{
"_id" : ObjectId("5b18d31a27a37696ec8b5773"),
"status" : "completed",
"description" : "completed...",
}
{
"_id" : ObjectId("5b18d31a27a37696ec8b5775"),
"status" : "pending",
"description" : "pending...",
}
{
"_id" : ObjectId("5b18d31a27a37696ec8b5776"),
"status" : "inProgress",
"description" : "inProgress...",
}
I need to group by status and use each status value, whatever it happens to be, as a key in the result:
[
{
"completed": [
{
"_id": "5b18d31a27a37696ec8b5773",
"status": "completed",
"description": "completed..."
}
]
},
{
"pending": [
{
"_id": "5b18d14cbc83fd271b6a157c",
"status": "pending",
"description": "You have to complete the challenge..."
},
{
"_id": "5b18d31a27a37696ec8b5775",
"status": "pending",
"description": "pending..."
}
]
},
{
"inProgress": [
{
"_id": "5b18d31a27a37696ec8b5776",
"status": "inProgress",
"description": "inProgress..."
}
]
}
]
Not that I think it's a good idea, mostly because I don't see any "aggregation" here at all, but you can $push all the content into an array under the "status" grouping key, and then convert those groups into keys of a single document using $arrayToObject inside a $replaceRoot:
db.collection.aggregate([
{ "$group": {
"_id": "$status",
"data": { "$push": "$$ROOT" }
}},
{ "$group": {
"_id": null,
"data": {
"$push": {
"k": "$_id",
"v": "$data"
}
}
}},
{ "$replaceRoot": {
"newRoot": { "$arrayToObject": "$data" }
}}
])
Returns:
{
"inProgress" : [
{
"_id" : ObjectId("5b18d31a27a37696ec8b5776"),
"status" : "inProgress",
"description" : "inProgress..."
}
],
"completed" : [
{
"_id" : ObjectId("5b18d31a27a37696ec8b5773"),
"status" : "completed",
"description" : "completed..."
}
],
"pending" : [
{
"_id" : ObjectId("5b18d14cbc83fd271b6a157c"),
"status" : "pending",
"description" : "You have to complete the challenge..."
},
{
"_id" : ObjectId("5b18d31a27a37696ec8b5775"),
"status" : "pending",
"description" : "pending..."
}
]
}
That might be okay IF you actually "aggregated" beforehand, but on any practically sized collection all this does is try to force the whole collection into a single document. That is likely to break the BSON limit of 16MB, so I would not recommend even attempting this without "grouping" something else before this step.
Frankly, the following code does the same thing, without aggregation tricks and with no BSON limit problem:
var obj = {};
// Using forEach as a premise for representing "any" cursor iteration form
db.collection.find().forEach(d => {
if (!obj.hasOwnProperty(d.status))
obj[d.status] = [];
obj[d.status].push(d);
})
printjson(obj);
Or a bit shorter:
var obj = {};
// Using forEach as a premise for representing "any" cursor iteration form
db.collection.find().forEach(d =>
obj[d.status] = [
...(obj.hasOwnProperty(d.status)) ? obj[d.status] : [],
d
]
)
printjson(obj);
Aggregations are used for "data reduction" and anything that is simply "reshaping results" without actually reducing the data returned from the server is usually better handled in client code anyway. You're still returning all data no matter what you do, and the client processing of the cursor has considerably less overhead. And NO restrictions.
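The same client-side reshaping can be written as a plain function over any array of documents, independent of the shell or driver. This is a sketch only: `groupByStatus` and the `sample` array are made up for illustration.

```javascript
// Groups an array of documents by their "status" field.
function groupByStatus(docs) {
  const out = {};
  for (const d of docs) {
    // Create the bucket on first sight of a status, then append.
    (out[d.status] = out[d.status] || []).push(d);
  }
  return out;
}

// Illustrative data, not taken from a real collection.
const sample = [
  { status: "pending", description: "a" },
  { status: "completed", description: "b" },
  { status: "pending", description: "c" }
];
const grouped = groupByStatus(sample);
// grouped.pending holds two documents, grouped.completed holds one.
```

With a driver, the array would typically come from iterating the cursor returned by find().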

MongoDB aggregate field in array of objects

I've been trying to solve a problem for some time now, unfortunately with no luck.
I'm refactoring some old code (which used the well-known pattern of querying each doc and looping over the results) and I'm trying to aggregate the results to remove the thousands of calls the back end is making.
The current doc looks like this:
{
"_id" : ObjectId("5c176fc65f543200019f8d66"),
"category" : "New client",
"description" : "",
"createdById" : ObjectId("5c0a858da9c0f000018382bb"),
"createdAt" : ISODate("2018-12-17T09:43:34.642Z"),
"sentAt" : ISODate("2018-12-17T09:44:25.902Z"),
"scheduleToBeSentAt" : ISODate("2018-01-17T11:43:00.000Z"),
"recipients" : [
{
"user" : ObjectId("5c0a858da9c0f000018382b5"),
"status" : {
"approved" : true,
"lastUpdated" : ISODate("2018-01-17T11:43:00.000Z")
}
},
{
"user" : ObjectId("5c0a858da9c0f000018382b6"),
"status" : {
"approved" : true,
"lastUpdated" : ISODate("2018-01-17T11:43:00.000Z")
}
},
],
"recipientsGroup" : "All",
"isActive" : false,
"notificationSent" : true
}
The field recipients.user is the ObjectId of a user from the Users collection.
What is the correct way to modify this so that the result looks like this?
{
"_id": ObjectId("5c176fc65f543200019f8d66"),
"category": "New client",
"description": "",
"createdById": ObjectId("5c0a858da9c0f000018382bb"),
"createdAt": ISODate("2018-12-17T09:43:34.642Z"),
"sentAt": ISODate("2018-12-17T09:44:25.902Z"),
"scheduleToBeSentAt": ISODate("2018-01-17T11:43:00.000Z"),
"recipients": [{
"user": {
"_id": ObjectId("5c0a858da9c0f000018382b5"),
"title": "",
"firstName": "Monique",
"lastName": "Heinrich",
"qualification": "Management",
"isActive": true
},
"status": {
"approved": true,
"lastUpdated": ISODate("2018-01-17T11:43:00.000Z")
}
},
{
"user": {
"_id": ObjectId("5c0a858da9c0f000018382b6"),
"title": "",
"firstName": "Marek",
"lastName": "Pucelik",
"qualification": "Management",
"isActive": true
},
"status": {
"approved": true,
"lastUpdated": ISODate("2018-01-17T11:43:00.000Z")
}
},
],
"recipientsGroup": "All",
"isActive": false,
"notificationSent": true
}
An aggregation is a powerful tool but sometimes the simple solution makes your brain hurt.....
I tried something like this but with no luck also.
db.getCollection('Protocols').aggregate([
{
$lookup: {
from: "Users",
localField: "recipients.user",
foreignField: "_id",
as: "users"
}
},
{
$project: {
"recipients": {
"status": 1,
"user": {
$filter: {
input: "$users",
cond: { $eq: ["$$this._id", "$user"] }
}
},
}
}
}
])
You can use the $lookup operator in your aggregation pipeline:
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
For performance reasons, though, you may prefer to duplicate the user object in your recipients array to avoid such complex queries.
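Here is a sketch of how $lookup could be applied to this document shape. It assumes MongoDB 3.6 or later (for $mergeObjects) and the collection names from the question (Protocols, Users), and it has not been run against real data. Each recipient is unwound, its user id is replaced with the matching user document, and the recipients array is then rebuilt.

```javascript
const pipeline = [
  // One document per recipient.
  { $unwind: "$recipients" },
  // Replace the user ObjectId with the matching user document(s).
  { $lookup: {
      from: "Users",
      localField: "recipients.user",
      foreignField: "_id",
      as: "recipients.user"
  }},
  // $lookup yields an array; turn it back into a single object.
  { $unwind: "$recipients.user" },
  // Rebuild the recipients array, keeping the other fields from the first copy.
  { $group: {
      _id: "$_id",
      doc: { $first: "$$ROOT" },
      recipients: { $push: "$recipients" }
  }},
  // Overwrite the single unwound recipient with the rebuilt array.
  { $replaceRoot: {
      newRoot: { $mergeObjects: ["$doc", { recipients: "$recipients" }] }
  }}
];

// In the mongo shell: db.getCollection('Protocols').aggregate(pipeline)
```

Documents with an empty recipients array, and recipients whose user id matches nothing in Users, are dropped by the $unwind stages as written. Pass preserveNullAndEmptyArrays if they need to be kept.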

$lookup and get count under certain conditions MongoDB

I have two collections for handling chat: one for chat rooms and one for chat messages.
Sample data for chatRooms is as follows:
{
"data": [
{
"_id": "5a606ab0116e2c164b25ef33",
"topic": "akhil Ben chat",
"topicDesc": "question 1",
"roomName": "benakhil777akhil",
"createdOn": "2018-01-18T09:36:48.231Z",
"participants": [
"ben",
"akhil777"
],
"__v": 0
},
{
"_id": "5a4dbdaab46b426863e7ead3",
"topic": "test",
"topicDesc": "test123",
"roomName": "benakhil777test",
"createdOn": "2018-01-04T05:37:46.088Z",
"participants": [
"ben",
"akhil777"
],
"__v": 0
}
]}
Sample Data for chatMessages is as follows
{"data": [
{
"_id": "5a62281ea0652120a6668bae",
"topic": "akhil Ben chat",
"roomName": "benakhil777akhil",
"message": "test 1",
"__v": 0,
"readStatus": [
{
"recipient": "ben",
"_id": "5a62281ea0652120a6668bb0",
"status": true
},
{
"recipient": "akhil777",
"_id": "5a62281ea0652120a6668baf",
"status": true
}
],
"createdOn": "2018-01-19T17:17:18.456Z"
},
{
"_id": "5a622866a0652120a6668bb1",
"topic": "akhil Ben chat",
"roomName": "benakhil777akhil",
"message": "Test 2",
"__v": 0,
"readStatus": [
{
"recipient": "ben",
"_id": "5a622866a0652120a6668bb3",
"status": false
},
{
"recipient": "akhil777",
"_id": "5a622866a0652120a6668bb2",
"status": true
}
],
"createdOn": "2018-01-19T17:18:30.396Z"
},
{
"_id": "5a62287ca0652120a6668bb4",
"topic": "akhil Ben chat",
"roomName": "benakhil777akhil",
"message": "test 3",
"__v": 0,
"readStatus": [
{
"recipient": "ben",
"_id": "5a62287ca0652120a6668bb6",
"status": false
},
{
"recipient": "akhil777",
"_id": "5a62287ca0652120a6668bb5",
"status": true
}
],
"createdOn": "2018-01-19T17:18:52.018Z"
}
]}
In the JSON above, readStatus records whether each recipient has read the message, so that I can count the unread messages per user for each chat room.
The status inside readStatus is true when the message has been read.
There are two rooms, benakhil777akhil and benakhil777test.
What I want is the number of unread messages in each room for a given user, say ben.
There is also a userDetails collection, for example:
[{
"_id": "59e6d6ba02e11e1814481022",
"username": "ben",
"name": "Ben S",
"email": "qwerty#123.com",
},{
"_id": "59e6d6ba02e11e1814481022",
"username": "akhil777",
"name": "Akhil Clement",
"email": "qwerty#123.com",
}]
That is the user details collection, and the output JSON I need looks like this:
{
"data": [
{
"_id": "5a606ab0116e2c164b25ef33",
"topic": "akhil Ben chat",
"topicDesc": "question 1",
"roomName": "benakhil777akhil",
"createdOn": "2018-01-18T09:36:48.231Z",
"participants": [
"ben",
"akhil777"
],
"participantDetails":[{
"_id": "59e6d6ba02e11e1814481022",
"username": "ben",
"name": "Ben S",
"email": "qwerty#123.com",
},{
"_id": "59e6d6ba02e11e1814481022",
"username": "akhil777",
"name": "Akhil Clement",
"email": "qwerty#123.com",
}],
"unreadCount": 2,
"__v": 0
},
{
"_id": "5a4dbdaab46b426863e7ead3",
"topic": "test",
"topicDesc": "test123",
"roomName": "benakhil777test",
"createdOn": "2018-01-04T05:37:46.088Z",
"participants": [
"ben",
"akhil777"
],
"participantDetails":[{
"_id": "59e6d6ba02e11e1814481022",
"username": "ben",
"name": "Ben S",
"email": "qwerty#123.com",
},{
"_id": "59e6d6ba02e11e1814481022",
"username": "akhil777",
"name": "Akhil Clement",
"email": "qwerty#123.com",
}],
"unreadCount": 0,
"__v": 0
}
]}
Please try this aggregation pipeline
db.rooms.aggregate(
[
{$match : {participants : 'ben'}},
{$lookup : {
from : "chats",
localField : "roomName",
foreignField:"roomName",
as :"out"
}
},
{$unwind : {
path: "$out",
preserveNullAndEmptyArrays: true
}
},
{$unwind : {
path: "$out.readStatus",
preserveNullAndEmptyArrays: true
}
},
{$addFields : {
isMatch : { $and : [
{ $eq : ["$out.readStatus.recipient" , "ben" ] } , { $eq : [ "$out.readStatus.status" , false ] } ]
}
}
},
{$group : {
_id : {
_id : "$_id" ,
topic : "$topic",
topicDesc : "$topicDesc",
createdOn : "$createdOn",
participants : "$participants",
roomName : "$roomName"
},
unreadCount : { $sum : { $cond : [ "$isMatch" , 1, 0 ] } }
}
},
{$sort : {unreadCount : -1}}
]
).pretty()
Result:
{
"_id" : {
"_id" : "5a606ab0116e2c164b25ef33",
"topic" : "akhil Ben chat",
"topicDesc" : "question 1",
"createdOn" : "2018-01-18T09:36:48.231Z",
"participants" : [
"ben",
"akhil777"
],
"roomName" : "benakhil777akhil"
},
"unreadCount" : 2
}
{
"_id" : {
"_id" : "5a4dbdaab46b426863e7ead3",
"topic" : "test",
"topicDesc" : "test123",
"createdOn" : "2018-01-04T05:37:46.088Z",
"participants" : [
"ben",
"akhil777"
],
"roomName" : "benakhil777test"
},
"unreadCount" : 0
}
EDIT: since $addFields is not available in 3.2.17, fold the condition into the $group stage instead:
{$group : {
_id : {
_id : "$_id" ,
topic : "$topic",
topicDesc : "$topicDesc",
createdOn : "$createdOn",
participants : "$participants",
roomName : "$roomName"
},
unreadCount : { $sum : { $cond : [ { $and : [
{ $eq : ["$out.readStatus.recipient" , "ben" ] } , { $eq : [ "$out.readStatus.status" , false ] } ]
} , 1, 0 ] } }
}
}
EDIT 2: added a $project stage to flatten the _id fields back to the top level:
{$project :
{
"_id" : "$_id._id",
"topic" : "$_id.topic",
"topicDesc" : "$_id.topicDesc",
"createdOn" : "$_id.createdOn",
"participants" : "$_id.participants",
"roomName" : "$_id.roomName",
"unreadCount" : "$unreadCount"
}
}
You can simplify your code with the aggregation below.
The $cond checks each readStatus entry and outputs 1 when it is an unread entry for the user (recipient is ben and status is false), and 0 otherwise.
The inner $sum counts the unread entries in each chat message, and the outer $sum adds those counts across all matching chat messages.
db.chatRooms.aggregate(
[{
"$match":{"participants":"ben"}},
{"$lookup":{
"from":"chatMessages",
"localField":"roomName",
"foreignField":"roomName",
"as":"chatMessages"
}},
{"$project":{
"topic":1,
"topicDesc":1,
"roomName":1,
"createdOn":1,
"participants":1,
"unreadCount":{
"$sum":{
"$map":{
"input":"$chatMessages",
"as":"chatMessage",
"in":{
"$sum":{
"$map":{
"input":"$$chatMessage.readStatus",
"as":"mChatMessage",
"in":{"$cond":[{"$and":[{"$eq":["$$mChatMessage.recipient","ben"]},{"$eq":["$$mChatMessage.status",false]}]},1,0]}
}
}
}
}
}
}
}}
])
To get the result JSON with user details:
db.chatRooms.aggregate(
[
{$match : {participants : 'ben'}},
{ $unwind : {
path: "$participants",
preserveNullAndEmptyArrays: true
}
},
{ $lookup: {
from:"users",
localField:"participants",
foreignField:"username",
as:"userData"
}
},
{ $lookup: {
from:"chatmessages",
localField:"roomName",
foreignField:"roomName",
as:"out"
}
},
{ $unwind : {
path: "$out",
preserveNullAndEmptyArrays: true
}
},
{ $unwind : {
path: "$out.readStatus",
preserveNullAndEmptyArrays: true
}
},
{ $group : {
_id : {
_id : "$_id" ,
topic : "$topic",
topicDesc : "$topicDesc",
createdOn : "$createdOn",
roomName : "$roomName"
},
participants : {$addToSet : "$participants" } ,
participantDetails : {$addToSet : {$arrayElemAt : ["$userData", 0]}},
unreadCount : {
$sum : {
$cond : [ {
$and : [
{ $eq : ["$out.readStatus.recipient" , "ben" ] } ,
{ $eq : [ "$out.readStatus.status" , false ] }
]
} , 1, 0
]
}
}
}
}
,
{ $project :
{
_id : "$_id._id",
topic : "$_id.topic",
topicDesc : "$_id.topicDesc",
createdOn : "$_id.createdOn",
roomName : "$_id.roomName",
unreadCount : "$unreadCount",
participants : 1 ,
participantDetails : 1
}
}
])
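One thing to watch in the pipeline above: participants is unwound before the messages are joined, so every message is seen once per participant. With two participants, the unread count for a room comes out doubled. Here is a sketch that avoids the unwinds. It assumes MongoDB 3.4 or later (array-valued localField in $lookup, and $addFields) and the collection names from the question (chatRooms, chatMessages, userDetails). The `user` variable is a placeholder, and the pipeline has not been run against real data.

```javascript
const user = "ben"; // placeholder: the user whose unread messages are counted

const pipeline = [
  { $match: { participants: user } },
  // Array-valued localField: each participant name is matched against username.
  { $lookup: {
      from: "userDetails",
      localField: "participants",
      foreignField: "username",
      as: "participantDetails"
  }},
  { $lookup: {
      from: "chatMessages",
      localField: "roomName",
      foreignField: "roomName",
      as: "messages"
  }},
  { $addFields: {
      // For each message, count readStatus entries that are unread for `user`,
      // then add those counts across all messages in the room.
      unreadCount: { $sum: { $map: {
          input: "$messages",
          as: "m",
          in: { $size: { $filter: {
              input: { $ifNull: ["$$m.readStatus", []] },
              as: "r",
              cond: { $and: [
                { $eq: ["$$r.recipient", user] },
                { $eq: ["$$r.status", false] }
              ] }
          }}}
      }}}
  }},
  // Drop the joined messages from the output.
  { $project: { messages: 0 } }
];

// In the mongo shell: db.chatRooms.aggregate(pipeline)
```

Because nothing is unwound, each room stays a single document and no regrouping or _id flattening is needed afterwards.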

Combining multiple sub-documents into a new doc in mongo

I am trying to query multiple sub-documents in MongoDB and return them as a single doc.
I think the aggregation framework is the way to go, but I can't seem to get it exactly right.
Take the following docs:
{
"board_id": "1",
"hosts":
[{
"name": "bob",
"ip": "10.1.2.3"
},
{
"name": "tom",
"ip": "10.1.2.4"
}]
}
{
"board_id": "2",
"hosts":
[{
"name": "mickey",
"ip": "10.2.2.3"
},
{
"name": "mouse",
"ip": "10.2.2.4"
}]
}
{
"board_id": "3",
"hosts":
[{
"name": "pavel",
"ip": "10.3.2.3"
},
{
"name": "kenrick",
"ip": "10.3.2.4"
}]
}
Trying to get a query result like this:
{
"hosts":
[{
"name": "bob",
"ip": "10.1.2.3"
},
{
"name": "tom",
"ip": "10.1.2.4"
},
{
"name": "mickey",
"ip": "10.2.2.3"
},
{
"name": "mouse",
"ip": "10.2.2.4"
},
{
"name": "pavel",
"ip": "10.3.2.3"
},
{
"name": "kenrick",
"ip": "10.3.2.4"
}]
}
I've tried this:
db.collection.aggregate([ { $unwind: '$hosts' }, { $project : { name: 1, hosts: 1, _id: 0 }} ])
But it's not quite what I want.
You can definitely do this with aggregate. Let's assume your data is in a collection named board; replace that with whatever your collection is actually called. Note that $addToSet drops duplicate hosts and does not guarantee the order of the result.
db.board.aggregate([
{$unwind:"$hosts"},
{$group:{_id:null, hosts:{$addToSet:"$hosts"}}},
{$project:{_id:0, hosts:1}}
]).pretty()
It will return:
{
"hosts" : [
{
"name" : "kenrick",
"ip" : "10.3.2.4"
},
{
"name" : "pavel",
"ip" : "10.3.2.3"
},
{
"name" : "mouse",
"ip" : "10.2.2.4"
},
{
"name" : "mickey",
"ip" : "10.2.2.3"
},
{
"name" : "tom",
"ip" : "10.1.2.4"
},
{
"name" : "bob",
"ip" : "10.1.2.3"
}
]
}
Your basic problem here is that the arrays are contained in separate documents. While you are correct to $unwind the array for processing, to bring the content into a single array you need to $group the result across documents and $push the content into the result array:
db.collection.aggregate([
{ "$unwind": "$hosts" },
{ "$group": {
"_id": null,
"hosts": { "$push": "$hosts" }
}}
])
Just as $unwind "deconstructs" the array elements, the $push accumulator in $group "reconstructs" the array. And since there is no other key to "group" on, this brings all the elements into a single array.
Note that a null grouping key is only really practical when the resulting document would not exceed the BSON limit. Otherwise you are better off leaving the individual elements as documents in themselves.
Optionally remove the _id with an additional $project if required.
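As an alternative to unwinding, the per-document arrays can be collected and concatenated directly. This is a sketch assuming MongoDB 3.4 or later (for $reduce). The intermediate field name `lists` is made up for illustration, and the same BSON size caveat applies.

```javascript
const pipeline = [
  // Collect each document's hosts array into an array of arrays.
  { $group: { _id: null, lists: { $push: "$hosts" } } },
  // Flatten the array of arrays into one hosts array and drop _id.
  { $project: {
      _id: 0,
      hosts: { $reduce: {
          input: "$lists",
          initialValue: [],
          in: { $concatArrays: ["$$value", "$$this"] }
      }}
  }}
];

// In the mongo shell: db.collection.aggregate(pipeline)
```

This keeps the number of documents flowing through the pipeline equal to the number of boards rather than the number of hosts.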