Mongo aggregation groups and subgroup - mongodb

Hi, I have a Mongo aggregation:
[
{
"$match" : {
"dateTime" : {
"$gte" : ISODate("2017-01-01T00:00:00.000+0000"),
"$lt" : ISODate("2018-01-01T00:00:00.000+0000")
}
}
},
{
"$group" : {
"_id" : "dateTime",
"totals" : {
"$sum" : "$payment.totalAmount"
},
"count" : {
"$sum" : 1.0
}
}
}
],
{
"allowDiskUse" : false
}
);
This works fine: it sums and counts over the date range I supplied, and I get output like this:
{
"_id" : "dateTime",
"totals" : 2625293.825017198,
"count" : 12038.0
}
However, I also want to further refine the groupings.
I have a field called 'companyId', and I want to calculate the sum and count for each companyId within the given time range.
I would like to get output similar to this, with a sum and count for each company ID in the date range I queried, not just one sum/count over all the data:
[
{
"companyId" : "Acme Co",
"totals" : 2625293.825017198,
"count" : 12038.0
},
{
"companyId" : "Beta Co",
"totals" : 162593.82198,
"count" : 138.0
},
{
"companyId" : "Cel Co",
"totals" : 593.82,
"count" : 38.0
}
]
How do I do this? I have not been able to find a good example online.
Thanks
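One way to get per-company totals is to group on the value of the companyId field instead of a constant string. The following is a minimal sketch, assuming companyId is a top-level field of each document and that the collection is called payments (a hypothetical name, since the question does not give one):

```javascript
// Sketch only: "payments" is a hypothetical collection name, and
// "companyId" is assumed to be a top-level field on each document.
const companyTotalsPipeline = [
  // Keep only documents inside the requested date range.
  { $match: { dateTime: {
      $gte: new Date("2017-01-01T00:00:00.000Z"),
      $lt: new Date("2018-01-01T00:00:00.000Z")
  } } },
  // "$companyId" (with the dollar sign) groups by the field's value.
  // A bare string such as "dateTime" is a constant, so every document
  // lands in one single group.
  { $group: {
      _id: "$companyId",
      totals: { $sum: "$payment.totalAmount" },
      count: { $sum: 1 }
  } },
  // Rename _id to companyId to match the desired output shape.
  { $project: { _id: 0, companyId: "$_id", totals: 1, count: 1 } }
];

// In the mongo shell: db.payments.aggregate(companyTotalsPipeline)
```

The only change to the grouping itself is the "_id" expression; the two accumulators are the same as in the original pipeline.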

Related

MongoDB, Grouping by Multiple Fields

I'm pretty new to Mongo and am finding some simple things that I would do in SQL frustratingly difficult in Mongo.
I have documents similar to the ones below:
[{
"_id" : ObjectId("5870fb29a1fe030e1a2909db"),
"updatedAt" : ISODate("2017-01-07T14:28:57.224Z"),
"createdAt" : ISODate("2017-01-07T14:28:57.224Z"),
"state" : "Available",
},
{
"_id" : ObjectId("5870fb29a1fe030e1a2909dc"),
"updatedAt" : ISODate("2017-01-07T14:28:57.224Z"),
"createdAt" : ISODate("2017-01-07T14:28:57.224Z"),
"state" : "notReady",
},
{
"_id" : ObjectId("5870fb29a1fe030e1a2909d9"),
"updatedAt" : ISODate("2017-01-07T14:28:57.224Z"),
"createdAt" : ISODate("2017-01-07T14:28:57.224Z"),
"state" : "Disconnected",
}]
What I'm looking to do is group the data by the maximum date and the state.
Ideally, the result I'm looking for would be something like the following:
{
latestDate: "2017-01-07T14:28:57",
states : {
available : 10,
disconnected : 5,
notReady : 2
}}
Basically, I'm looking for the SQL equivalent of this:
SELECT createdAt, state, COUNT(rowid)
FROM db
WHERE createdAt = (SELECT MAX(createdAt) FROM db)
GROUP BY 1,2
I've searched around here and found some good info, but I'm probably missing something straightforward. I've only managed to get this far:
db.collection.aggregate([
{$project: {"_id" : 0,"state": 1, "date" : "$createdAt"}},
{$group : {"_id" : {"date":"$date", "state": "actual"}, "count":{"$sum":1}}}
])
Any help would be appreciated :)
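db.collection.aggregate([
    // Count documents per (createdAt, state) pair.
    {
        $group : {
            _id : {
                date : "$createdAt",
                state : "$state"
            },
            count : {$sum : 1}
        }
    },
    // Regroup by date, collecting each state's count into an array.
    {
        $group : {
            _id : "$_id.date",
            states : {
                $addToSet : {
                    state : "$_id.state",
                    count : "$count"
                }
            }
        }
    },
    // Keep only the most recent date.
    {
        $sort : {_id : -1}
    },
    {
        $limit : 1
    },
    // Rename _id to latestDate for the final shape.
    {
        $project : {
            _id : 0,
            latestDate : "$_id",
            states : "$states"
        }
    }
])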
output :
{
"latestDate" : ISODate("2017-01-07T14:28:57.224Z"),
"states" : [
{
"state" : "Available",
"count" : 1
},
{
"state" : "notReady",
"count" : 1
},
{
"state" : "Disconnected",
"count" : 1
}
]
}
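The question asked for states as an object keyed by state name, while the output above is an array. A possible variation, sketched under the assumption that the server provides the $arrayToObject operator (available from 3.4.4, as far as I know): collect k/v pairs in the second $group and convert them in the final $project. The pipeline name is only illustrative.

```javascript
// Sketch only: assumes a server version that supports $arrayToObject.
const latestStatesPipeline = [
  // Count documents per (createdAt, state) pair.
  { $group: { _id: { date: "$createdAt", state: "$state" }, count: { $sum: 1 } } },
  // Collect { k, v } pairs, the input shape $arrayToObject expects.
  { $group: { _id: "$_id.date", states: { $push: { k: "$_id.state", v: "$count" } } } },
  // Keep only the most recent date.
  { $sort: { _id: -1 } },
  { $limit: 1 },
  // Turn the pairs into a single object keyed by state name.
  { $project: { _id: 0, latestDate: "$_id", states: { $arrayToObject: "$states" } } }
];

// In the mongo shell: db.collection.aggregate(latestStatesPipeline)
```

The keys keep the stored spelling of each state ("Available", "notReady", "Disconnected"), so they would not be lowercased as in the question's example.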

Mongodb embedded document - aggregation query

I have got the below documents in Mongo database:
db.totaldemands.insert({ "data" : "UKToChina", "demandPerCountry" :
{ "from" : "UK" , to: "China" ,
"demandPerItem" : [ { "item" : "apples" , "demand" : 200 },
{ "item" : "plums" , "demand" : 100 }
] } });
db.totaldemands.insert({ "data" : "UKToSingapore",
"demandPerCountry" : { "from" : "UK" , to: "Singapore" ,
"demandPerItem" : [ { "item" : "apples" , "demand" : 100 },
{ "item" : "plums" , "demand" : 50 }
] } });
I need to write a query to find the count of apples exported from UK to any country.
I have tried the following query:
db.totaldemands.aggregate(
{ $match : { "demandPerCountry.from" : "UK" ,
"demandPerCountry.demandPerItem.item" : "apples" } },
{ $unwind : "$demandPerCountry.demandPerItem" },
{ $group : { "_id" : "$demandPerCountry.demandPerItem.item",
"total" : { $sum : "$demandPerCountry.demandPerItem.demand"
} } }
);
But it gives me the output with both apples and plums like below:
{ "_id" : "apples", "total" : 300 }
{ "_id" : "plums", "total" : 150 }
But, my expected output is:
{ "_id" : "apples", "total" : 300 }
So, how can I modify the above query to return only the count of apples exported from the UK?
Also, is there a better way to achieve this output without unwinding?
You can add another $match to keep only apples.
Because demandPerItem is an embedded array and you are aggregating over its elements, $unwind is the straightforward choice here. Map-reduce is an alternative, but $unwind suits this case better.
As for performance, $unwind shouldn't cause an issue here.
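db.totaldemands.aggregate([
    // Keep documents exported from the UK that contain at least one apples entry.
    { $match : { "demandPerCountry.from" : "UK" ,
                 "demandPerCountry.demandPerItem.item" : "apples" } },
    // Produce one document per element of demandPerItem.
    { $unwind : "$demandPerCountry.demandPerItem" },
    // Sum the demand per item.
    { $group : { "_id" : "$demandPerCountry.demandPerItem.item",
                 "total" : { $sum : "$demandPerCountry.demandPerItem.demand" } } },
    // Keep only the apples group.
    { $match : { "_id" : "apples" } }
]);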
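The extra $match could also sit directly after $unwind, so that the plums entries are dropped before grouping rather than after it. A minimal sketch on the same totaldemands collection (the pipeline name is only illustrative):

```javascript
// Sketch only: same stages as the answer, with the item filter moved
// in front of $group.
const applesFromUkPipeline = [
  // Keep documents exported from the UK that mention apples at all.
  { $match: { "demandPerCountry.from": "UK",
              "demandPerCountry.demandPerItem.item": "apples" } },
  // Produce one document per element of demandPerItem.
  { $unwind: "$demandPerCountry.demandPerItem" },
  // Discard every unwound element that is not apples.
  { $match: { "demandPerCountry.demandPerItem.item": "apples" } },
  // Sum the remaining demand values.
  { $group: { _id: "$demandPerCountry.demandPerItem.item",
              total: { $sum: "$demandPerCountry.demandPerItem.demand" } } }
];

// In the mongo shell: db.totaldemands.aggregate(applesFromUkPipeline)
```

The first $match only selects whole documents, which is why plums survive it; the second one filters the individual array elements after they have been unwound.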

MongoDB: How do I sum up a unique field in $group aggregation query?

I have the following dataset after completing some aggregation magic:
{ "_id" : "5700edfe03fcdb000347bebb", "comment" : { "commentor" : "56f3f70d4de8c74a69d1d5e1", "id" : ObjectId("570175e6c002e46edb922aa1")}, "max" : ObjectId("570175e6c002e46edb922aa3")}
{ "_id" : "5700edfe03fcdb000347bebb", "comment" : { "commentor" : "56f3f70d4de8c74a69d1d5e6", "id" : ObjectId("570175e6c002e46edb922aa2")}, "max" : ObjectId("570175e6c002e46edb922aa3")}
{ "_id" : "5700edfe03fcdb000347bebb", "comment" : { "commentor" : "56f3f70d4de8c74a69d1d5e1", "id" : ObjectId("570175e6c002e46edb922aa3")}, "max" : ObjectId("570175e6c002e46edb922aa3")}
The _id represents a post and in the post, there are comments. In this case, there are 3 comments; 2 by the same commentor ("56f3f70d4de8c74a69d1d5e1") and one by another commentor ("56f3f70d4de8c74a69d1d5e6").
I want to write an aggregation query that would count up all the unique comments by commentor ("56f3f70d4de8c74a69d1d5e1") only and return that the commentor commented twice on post "5700edfe03fcdb000347bebb".
I tried the following:
{ "$group" : { "_id" : "$_id", "count" : { "$sum" : "$comment.commentor" } } }
The results were:
{ "_id" : "5700edfe03fcdb000347bebb", "count" : 0 }
Please note that I'm not trying to count all the comments by all the commentors in that post, so I'm not trying to do this:
{ "$group" : { "_id" : "$_id", "count" : { "$sum" : 1 } } }
which would result in:
{ "_id" : "5700edfe03fcdb000347bebb", "count" : 3 }
I just want the count of comments by user ("56f3f70d4de8c74a69d1d5e1").
EDIT:
After some research, I see that $sum only works on numeric fields and not non-numeric fields: https://docs.mongodb.com/manual/reference/operator/aggregation/sum/#grp._S_sum
Is there any way I can get the number of comments posted by user ("56f3f70d4de8c74a69d1d5e1") per post "5700edfe03fcdb000347bebb"?
So after a bit of trial and error, I managed to figure it out.
group2 = {
    "$group" : {
        "_id" : "$_id",
        "count" : {
            // Add 1 when the commentor matches, 0 otherwise.
            "$sum" : {"$cond" : [ {"$eq" : ["$comment.commentor", "56f3f70d4de8c74a69d1d5e1"] }, 1, 0 ] }
        }
    }
}
We are summing up the 1's on the condition that comment.commentor equals user "56f3f70d4de8c74a69d1d5e1".
Result:
{ "_id" : "5700edfe03fcdb000347bebb", "count" : 2 }
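Another way to reach the same count, sketched here with the field names from the sample data in the question: filter on the commentor first, then count what is left. These two stages are meant to be appended to whatever earlier stages produced the dataset; the variable names are only illustrative.

```javascript
// Sketch only: stages to append after the ones that produced the dataset.
const commentorId = "56f3f70d4de8c74a69d1d5e1";

const countByCommentorStages = [
  // Keep only the comments written by the chosen commentor.
  { $match: { "comment.commentor": commentorId } },
  // Count the remaining comments per post.
  { $group: { _id: "$_id", count: { $sum: 1 } } }
];
```

One difference from the $cond version: a post with no comments by that user drops out of the result entirely instead of appearing with a count of 0.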

Mongodb difference in time is returning the current time

{
"_id" : ObjectId("57693a852956d5301b348a99"),
"First_Name" : "Sri Ram",
"Last_Name" : "Bandi",
"Email" : "chinni001sriram#gmail.com",
"Sessions" : [
{
"Class" : "facebook",
"ID" : "1778142655749042",
"Login_Time" : ISODate("2016-06-21T13:00:53.867Z"),
"Logout_Time" : ISODate("2016-06-21T13:01:04.640Z"),
"Duration" : null
}
],
"Count" : 1
}
This is my Mongo data, and I want to set the duration to the difference between the login and logout times. So I executed the following query:
db.sessionData.update(
{ "Sessions.ID": "1778142655749042"},
{ $set: {
"Sessions.$.Duration": ISODate("Sessions.$.Logout_Time" - "Sessions.$.Login_Time")
}
}
)
But the result I'm getting is:
{
"_id" : ObjectId("57693a852956d5301b348a99"),
"First_Name" : "Sri Ram",
"Last_Name" : "Bandi",
"Email" : "chinni001sriram#gmail.com",
"Sessions" : [
{
"Class" : "facebook",
"ID" : "1778142655749042",
"Login_Time" : ISODate("2016-06-21T13:00:53.867Z"),
"Logout_Time" : ISODate("2016-06-21T13:01:04.640Z"),
"Duration" : ISODate("2016-06-21T13:02:58.010Z")
}
],
"Count" : 1
}
and the duration was set to the current date/time instead of the difference.
You could use the aggregation framework to do the arithmetic operation using the $divide and $subtract operators to give you the difference as duration in seconds. The formula is given by
Duration (sec) = (Logout_Time - Login_Time)/1000
The aggregation pipeline should give you a new field that has this computed value and then you can use the forEach() cursor method on the aggregate() result to iterate the documents in the result and update the collection.
The following example shows this:
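db.sessionData.aggregate([
    // Select documents that contain the session.
    { "$match": { "Sessions.ID" : "1778142655749042" } },
    // One document per session, then keep only the matching session.
    { "$unwind": "$Sessions" },
    { "$match": { "Sessions.ID" : "1778142655749042" } },
    {
        "$project": {
            // Subtracting dates yields milliseconds; divide by 1000 for seconds.
            "Duration": {
                "$divide": [
                    { "$subtract": [ "$Sessions.Logout_Time", "$Sessions.Login_Time" ] },
                    1000
                ]
            }
        }
    }
]).forEach(function (doc) {
    // Write the computed duration back to the matching array element.
    db.sessionData.update(
        { "Sessions.ID": "1778142655749042", "_id": doc._id },
        {
            "$set": { "Sessions.$.Duration": doc.Duration }
        }
    );
});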
Query results
{
"_id" : ObjectId("57693a852956d5301b348a99"),
"First_Name" : "Sri Ram",
"Last_Name" : "Bandi",
"Email" : "chinni001sriram#gmail.com",
"Sessions" : [
{
"Class" : "facebook",
"ID" : "1778142655749042",
"Login_Time" : ISODate("2016-06-21T13:00:53.867Z"),
"Logout_Time" : ISODate("2016-06-21T13:01:04.640Z"),
"Duration" : 10.773
}
],
"Count" : 1
}
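The arithmetic behind the stored Duration can be checked in plain JavaScript, using the two timestamps from the sample document. The same sketch also shows why the original update could not work: its expression subtracts two string literals, and the shell evaluates that before anything is sent to the server.

```javascript
// Timestamps copied from the sample session document.
const login = new Date("2016-06-21T13:00:53.867Z");
const logout = new Date("2016-06-21T13:01:04.640Z");

// Subtracting two Date objects yields the difference in milliseconds.
const durationMs = logout - login;
const durationSec = durationMs / 1000;

// The original update subtracted two strings. They are plain text, not
// field references, so the result is NaN rather than a duration.
const fromStrings = "Sessions.$.Logout_Time" - "Sessions.$.Login_Time";

console.log(durationMs, durationSec, Number.isNaN(fromStrings));
// prints: 10773 10.773 true
```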

Mongodb sort by sum of keys

I have a collection of JSON documents:
[
    {
        "_id" : ObjectId("5715c4bbac530eb3018b456a"),
        "content_id" : "5715c4bbac530eb3018b4569",
        "views" : NumberLong(200),
        "likes" : NumberLong(100),
        "comments" : NumberLong(0)
    },
    {
        "_id" : ObjectId("5715c4bbac530eb3018b4568"),
        "content_id" : "5715c4bbac530eb3018b4567",
        "views" : NumberLong(300),
        "likes" : NumberLong(200),
        "comments" : NumberLong(0)
    },
    {
        "_id" : ObjectId("5715c502ac530ee5018b4956"),
        "content_id" : "5715c502ac530ee5018b4955",
        "views" : NumberLong(500),
        "likes" : NumberLong(0),
        "comments" : NumberLong(200)
    }
]
How can we sort the documents by SUM("views", "likes", "comments"),
something like this in MySQL:
SELECT SUM(key1, key2, key3) AS key
FROM document
ORDER BY key
Thanks in advance.
First do a projection to obtain the sum of likes, views and comments, then sort on that sum. The second snippet adds a group by content_id, in case that is needed.
db.test.aggregate([
{ $project : { "_id" : "$content_id", "total" : { $add : [ "$likes", "$views", "$comments"]}}},
{ $sort : { "total" : 1 }}
])
If content_id can be duplicated, add a group operation:
db.test.aggregate([
{ $project : { "_id" : "$content_id", "total" : { $add : [ "$likes", "$views", "$comments"]}}},
{ $group : { "_id" : "$_id" , totalPerId : { $sum : "$total" }}},
// After $group the field is called totalPerId; -1 gives the descending order shown below.
{ $sort : { "totalPerId" : -1 }}
])
Based on your test data, you will get:
{ "_id" : "5715c502ac530ee5018b4955", "totalPerId" : NumberLong(700) }
{ "_id" : "5715c4bbac530eb3018b4567", "totalPerId" : NumberLong(500) }
{ "_id" : "5715c4bbac530eb3018b4569", "totalPerId" : NumberLong(300) }
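The totals listed above can be reproduced in plain JavaScript from the sample documents. This is only a sketch for checking the arithmetic: the NumberLong values are written as ordinary numbers, and the result is sorted in descending order to match the listing.

```javascript
// Sample documents from the question, with NumberLong values as plain numbers.
const docs = [
  { content_id: "5715c4bbac530eb3018b4569", views: 200, likes: 100, comments: 0 },
  { content_id: "5715c4bbac530eb3018b4567", views: 300, likes: 200, comments: 0 },
  { content_id: "5715c502ac530ee5018b4955", views: 500, likes: 0, comments: 200 }
];

// Mirror $project + $group: add the three counters, accumulate per content_id.
const totals = new Map();
for (const d of docs) {
  const sum = d.likes + d.views + d.comments;
  totals.set(d.content_id, (totals.get(d.content_id) || 0) + sum);
}

// Mirror $sort on the grouped total, largest first.
const sorted = [...totals]
  .map(([id, totalPerId]) => ({ _id: id, totalPerId }))
  .sort((a, b) => b.totalPerId - a.totalPerId);

console.log(sorted.map((r) => r.totalPerId)); // prints: [ 700, 500, 300 ]
```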