MongoDB needs to get total counts from all objects - mongodb

My aggregate query looks like this. I need to get the total of value across all the returned documents.
db.getCollection('mydesk').aggregate([
{
$match: {
"accountId": ObjectId("616ea615edc5fa4278ccb7f6"),
"val" : { $ne : null},
"deskId": { "$in": [
ObjectId("61934f7efdb9dc5a7c1c3a01"),
ObjectId("61713730857c3243ec1d257c"),
ObjectId("629d9548e0c93e34e435e7b9"),
ObjectId("616eaf613bcd9655b8035a25"),
]}
}
},
{
$project: {
item: 1,
value: { $size: "$val.shapes" },
}
}
])
I get a result like this, but I need the total of all the value fields.
/* 1 */
{
"_id" : ObjectId("616fab4f12b90d59d03f380e"),
"value" : 11
}
/* 2 */
{
"_id" : ObjectId("616fbad35700980a041cd190"),
"value" : 4
}
/* 3 */
{
"_id" : ObjectId("61713752857c3243ec1d257e"),
"value" : 12
}
Needed result :
{
"totalValueCount" : 27
}
Thanks in advance

One option is to use $group to $sum up the values:
db.getCollection('mydesk').aggregate([
{
$match: {
"accountId": ObjectId("616ea615edc5fa4278ccb7f6"),
"val" : { $ne : null},
"deskId": { "$in": [
ObjectId("61934f7efdb9dc5a7c1c3a01"),
ObjectId("61713730857c3243ec1d257c"),
ObjectId("629d9548e0c93e34e435e7b9"),
ObjectId("616eaf613bcd9655b8035a25"),
]}
}
},
{
$group: {
_id: null,
total: {$sum: { $size: "$val.shapes"}},
}
},
{$project: {_id: 0, total: 1}}
])
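To make the arithmetic of the $group stage concrete, here is a plain JavaScript sketch of the same computation over in-memory objects. It is not MongoDB code: the docs array and the totalShapes helper are hypothetical, and only the lengths of the shapes arrays (11, 4 and 12, taken from the question's output) matter.

```javascript
// Hypothetical in-memory documents; only the length of val.shapes matters here.
const docs = [
  { _id: 1, val: { shapes: new Array(11).fill("shape") } },
  { _id: 2, val: { shapes: new Array(4).fill("shape") } },
  { _id: 3, val: { shapes: new Array(12).fill("shape") } },
];

// Mirrors { $group: { _id: null, total: { $sum: { $size: "$val.shapes" } } } }.
function totalShapes(documents) {
  return documents.reduce((total, doc) => {
    // This sketch treats a missing shapes field as an empty array.
    // In MongoDB, $size raises an error when its argument is missing,
    // so the field may need a guard such as $ifNull.
    const shapes = Array.isArray(doc.val && doc.val.shapes) ? doc.val.shapes : [];
    return total + shapes.length;
  }, 0);
}

console.log({ totalValueCount: totalShapes(docs) }); // { totalValueCount: 27 }
```

To get the field name asked for in the question, the accumulator in the pipeline can simply be named totalValueCount instead of total.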

Related

How can I count total documents and also grouped counts simultaneously in MongoDB aggregation?

I have a dataset in mongodb collection named visitorsSession like
{ip : '192.2.1.1',country : 'US', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.3.1.8',country : 'UK', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.5.1.4',country : 'UK', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.8.1.7',country : 'US', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'},
{ip : '192.1.1.3',country : 'US', type : 'Visitors',date : '2019-12-15T00:00:00.359Z'}
I am using this mongodb aggregation
[{$match: {
nsp : "/hrm.sbtjapan.com",
creationDate : {
$gte: "2019-12-15T00:00:00.359Z",
$lte: "2019-12-20T23:00:00.359Z"
},
type : "Visitors"
}}, {$group: {
_id : "$country",
totalSessions : {
$sum: 1
}
}}, {$project: {
_id : 0,
country : "$_id",
totalSessions : 1
}}, {$sort: {
country: -1
}}]
Using the above aggregation I get results like this:
[{country : 'US',totalSessions : 3},{country : 'UK',totalSessions : 2}]
But I also want the total number of visitors along with the result, like totalVisitors : 5.
How can I do this in a MongoDB aggregation?
You can use the $facet aggregation stage to calculate the total visitors as well as the visitors by country in a single pass:
db.visitorsSession.aggregate( [
{
$match: {
nsp : "/hrm.sbtjapan.com",
creationDate : {
$gte: "2019-12-15T00:00:00.359Z",
$lte: "2019-12-20T23:00:00.359Z"
},
type : "Visitors"
}
},
{
$facet: {
totalVisitors: [
{
$count: "count"
}
],
countrySessions: [
{
$group: {
_id : "$country",
sessions : { $sum: 1 }
}
},
{
$project: {
country: "$_id",
_id: 0,
sessions: 1
}
}
],
}
},
{
$addFields: {
totalVisitors: { $arrayElemAt: [ "$totalVisitors.count" , 0 ] },
}
}
] )
The output:
{
"totalVisitors" : 5,
"countrySessions" : [
{
"sessions" : 2,
"country" : "UK"
},
{
"sessions" : 3,
"country" : "US"
}
]
}
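As a rough illustration of what the two facets compute, the following plain JavaScript sketch produces the same shape of result from an in-memory array. It is not MongoDB code; the sessions array reuses the question's sample data and summarize is a hypothetical helper name.

```javascript
// Sample data from the question (only the fields the sketch needs).
const sessions = [
  { ip: "192.2.1.1", country: "US", type: "Visitors" },
  { ip: "192.3.1.8", country: "UK", type: "Visitors" },
  { ip: "192.5.1.4", country: "UK", type: "Visitors" },
  { ip: "192.8.1.7", country: "US", type: "Visitors" },
  { ip: "192.1.1.3", country: "US", type: "Visitors" },
];

// One pass over the matched documents yields both the overall count
// (the totalVisitors facet) and the per-country counts (the countrySessions facet).
function summarize(docs) {
  const byCountry = new Map();
  for (const doc of docs) {
    byCountry.set(doc.country, (byCountry.get(doc.country) || 0) + 1);
  }
  return {
    totalVisitors: docs.length,
    countrySessions: [...byCountry].map(([country, count]) => ({ country, sessions: count })),
  };
}

console.log(JSON.stringify(summarize(sessions)));
```

Each facet in the real pipeline receives the same set of matched documents, which is why the total and the grouped counts stay consistent with each other.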
You could be better off with two queries for this.
To save the two database round trips, the following aggregation can be used, although IMO it is rather verbose (and might be somewhat expensive if the documents are very large) just to count the documents.
Idea: put a $group at the top to count the documents while preserving the originals using $push and $$ROOT. Then $unwind the resulting array of original documents before any other match/filter operations.
db.collection.aggregate([
{
$group: {
_id: null,
docsCount: {
$sum: 1
},
originals: {
$push: "$$ROOT"
}
}
},
{
$unwind: "$originals"
},
{ $match: "..." }, //and other stages on `originals` which contains the source documents
{
$group: {
_id: "$originals.country",
totalSessions: {
$sum: 1
},
totalVisitors: {
$first: "$docsCount"
}
}
}
]);
Sample output:
[
{
"_id": "UK",
"totalSessions": 2,
"totalVisitors": 5
},
{
"_id": "US",
"totalSessions": 3,
"totalVisitors": 5
}
]

How to count the size of an array object and unwind it in MongoDB

I need to count the size of an array object, and I also need to get the average of the raised_amount field across the array. However, MongoDB will not let me count the array size after unwinding it (duh). It will not let me count the array size before unwinding either. This is for a class I am taking. Quite the challenge.
db.research.aggregate({$unwind:"$funding_rounds"},
{"$group": {
"_id":{"name": "$name"},
"averageFunding" : {
"$avg" : "$funding_rounds.raised_amount"
}
}
},
{$project: { count: { $size:"$funding_rounds" }}},
{ $sort: { averageFunding: -1 } },
{"$limit":10})
Take out {$project: { count: { $size:"$funding_rounds" }}} and it works! However, I wouldn't have funding_round count. Try to count the rounds by themselves, and it works.
Example of data:
{
"name": "Facebook",
"total_money_raised": "$39.8M",
"funding_round": [
{
"example": 123,
"round_code": "a",
"raised_amount": "1232"
},
{
"example": 123,
"round_code": "bat",
"raised_amount": "1232"
},
{
"example": 123,
"round_code": "cat",
"raised_amount": "1232"
}
]
}
Any ideas on how to count the array size in this aggregation?
$size expects an array, but $unwind has already turned the array into individual objects by the time you count. That is why MongoDB refuses to compute the size.
Try the code below:
db.getCollection('tests').aggregate([
{
$project: {
_id: 1,
name: 1,
total_money_raised : 1,
funding_round :1,
size: { $size:"$funding_round" }
}
},
{ $unwind : "$funding_round"},
{ $group:{
_id: "$name",
avgFunding : {"$avg" : "$funding_round.raised_amount"},
size: {$first : "$size"},
totalCount : {$sum: 1}
}
},
{ $sort: { "avgFunding": -1 } },
{ "$limit":10 }
])
Output:
/* 1 */
{
"_id" : "Facebook",
"avgFunding" : 1232.0,
"size" : 3,
"totalCount" : 3.0
}
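One thing worth checking against your own data: in the sample document raised_amount is stored as a string ("1232"), and $avg ignores non-numeric values, so the amounts may need converting to numbers before averaging. The plain JavaScript sketch below (not MongoDB code) shows the intended per-company result of "size plus average" with that conversion made explicit. The companies array and the fundingStats helper are hypothetical, and the field name funding_round follows the sample document.

```javascript
// Sample document from the question, trimmed to the fields used here.
const companies = [
  {
    name: "Facebook",
    funding_round: [
      { round_code: "a", raised_amount: "1232" },
      { round_code: "bat", raised_amount: "1232" },
      { round_code: "cat", raised_amount: "1232" },
    ],
  },
];

// For each company: the array size (what $size computes before $unwind)
// and the average of raised_amount (what $avg computes after $unwind).
function fundingStats(docs) {
  return docs.map((doc) => {
    const rounds = doc.funding_round || [];
    // Assumption: amounts arrive as numeric strings and are converted explicitly;
    // anything that does not convert to a finite number is skipped.
    const amounts = rounds.map((r) => Number(r.raised_amount)).filter(Number.isFinite);
    const sum = amounts.reduce((a, b) => a + b, 0);
    return {
      _id: doc.name,
      size: rounds.length,
      avgFunding: amounts.length > 0 ? sum / amounts.length : null,
    };
  });
}

console.log(fundingStats(companies)); // [ { _id: 'Facebook', size: 3, avgFunding: 1232 } ]
```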
If the name field is unique:
If your name field is unique and you only need the size of the array, you can $unwind first and then count the unwound documents in the $group stage, as below:
db.getCollection('tests').aggregate([
{ $unwind : "$funding_round"},
{ $group:{
_id: "$name",
"avgFunding" : {"$avg" : "$funding_round.raised_amount"},
size : {$sum: 1}
}
},
{ $sort: { "avgFunding": -1 } },
{ "$limit":10 }
])
Output:
/* 1 */
{
"_id" : "Facebook",
"avgFunding" : 1232.0,
"size" : 3.0
}
Where size is the total count of documents that are unwound from an array.

MongoDB aggregate multiple arrays

I am using MongoDB v3.4. I have a documents collection, and the sample data looks like this:
{
"mlVoters" : [
{"email" : "a#b.com", "isApproved" : false}
],
"egVoters" : [
{"email" : "a#b.com", "isApproved" : false},
{"email" : "c#d.com", "isApproved" : true}
]
},{
"mlVoters" : [
{"email" : "a#b.com", "isApproved" : false},
{"email" : "e#f.com", "isApproved" : true}
],
"egVoters" : [
{"email" : "e#f.com", "isApproved" : true}
]
}
Now, if I want the count of distinct email addresses for mlVoters:
db.documents.aggregate([
{$project: { mlVoters: 1 } },
{$unwind: "$mlVoters" },
{$group: { _id: "$mlVoters.email", mlCount: { $sum: 1 } }},
{$project: { _id: 0, email: "$_id", mlCount: 1 } },
{$sort: { mlCount: -1 } }
])
Result of the query is:
{"mlCount" : 2.0,"email" : "a#b.com"}
{"mlCount" : 1.0,"email" : "e#f.com"}
And if I want the count of distinct email addresses for egVoters, I do the same for the egVoters field. The result of that query would be:
{"egCount" : 1.0,"email" : "a#b.com"}
{"egCount" : 1.0,"email" : "c#d.com"}
{"egCount" : 1.0,"email" : "e#f.com"}
So, I want to combine these two aggregations and get the following result (sorted by totalCount):
{"email" : "a#b.com", "mlCount" : 2, "egCount" : 1, "totalCount":3}
{"email" : "e#f.com", "mlCount" : 1, "egCount" : 1, "totalCount":2}
{"email" : "c#d.com", "mlCount" : 0, "egCount" : 1, "totalCount":1}
How can I do this? What should the query look like? Thanks.
First you add a field voteType in each vote. This field indicates its type. Having this field, you don't need to keep the votes in two separate arrays mlVoters and egVoters; you can instead concatenate those arrays into a single array per document, and unwind afterwards.
At this point you have one document per vote, with a field that indicates which type it is. Now you simply need to group by email and, in the group stage, perform two conditional sums to count how many votes of each type there are for every email.
Finally you add a field totalCount as the sum of the other two counts.
db.documents.aggregate([
{
$addFields: {
mlVoters: {
$ifNull: [ "$mlVoters", []]
},
egVoters: {
$ifNull: [ "$egVoters", []]
}
}
},
{
$addFields: {
"mlVoters.voteType": "ml",
"egVoters.voteType": "eg"
}
},
{
$project: {
voters: { $concatArrays: ["$mlVoters", "$egVoters"] }
}
},
{
$unwind: "$voters"
},
{
$project: {
email: "$voters.email",
voteType: "$voters.voteType"
}
},
{
$group: {
_id: "$email",
mlCount: {
$sum: {
$cond: {
"if": { $eq: ["$voteType", "ml"] },
"then": 1,
"else": 0
}
}
},
egCount: {
$sum: {
$cond: {
"if": { $eq: ["$voteType", "eg"] },
"then": 1,
"else": 0
}
}
}
}
},
{
$addFields: {
totalCount: {
$sum: ["$mlCount", "$egCount"]
}
}
}
])
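To check the expected numbers by hand, here is a plain JavaScript sketch of the same tagging-and-tallying idea over the question's sample data. It is not MongoDB code, and countVotes is a hypothetical helper name.

```javascript
// Sample documents from the question.
const documents = [
  {
    mlVoters: [{ email: "a#b.com", isApproved: false }],
    egVoters: [
      { email: "a#b.com", isApproved: false },
      { email: "c#d.com", isApproved: true },
    ],
  },
  {
    mlVoters: [
      { email: "a#b.com", isApproved: false },
      { email: "e#f.com", isApproved: true },
    ],
    egVoters: [{ email: "e#f.com", isApproved: true }],
  },
];

// Tally votes per email, keeping a separate counter per vote type,
// then sort by the combined count in descending order.
function countVotes(docs) {
  const counts = new Map();
  const bump = (email, key) => {
    if (!counts.has(email)) {
      counts.set(email, { email, mlCount: 0, egCount: 0, totalCount: 0 });
    }
    const entry = counts.get(email);
    entry[key] += 1;
    entry.totalCount += 1;
  };
  for (const doc of docs) {
    // A missing array is treated as empty, like the $ifNull stage in the pipeline.
    for (const voter of doc.mlVoters || []) bump(voter.email, "mlCount");
    for (const voter of doc.egVoters || []) bump(voter.email, "egCount");
  }
  return [...counts.values()].sort((a, b) => b.totalCount - a.totalCount);
}

console.log(countVotes(documents));
```

Note that the pipeline above returns the email in _id and does not sort. To match the ordering requested in the question, a final { $sort: { totalCount: -1 } } stage could be appended.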

MongoDB: Project to array item with minimum value of field

Suppose my collection consists of items that looks like this:
{
"items" : [
{
"item_id": 1,
"item_field": 10
},
{
"item_id": 2,
"item_field": 15
},
{
"item_id": 3,
"item_field": 3
},
]
}
Can I somehow select the entry of items with the lowest value of item_field, in this case the one with item_id 3?
I'm ok with using the aggregation framework. Bonus point if you can give me the code for the C# driver.
You can use the $reduce expression in the following way.
The query below seeds initialValue from the first element of items, then compares each element's item_field with that of the accumulated value using $lt. If the comparison is true, the current element ($$this) becomes the new accumulated value ($$value); otherwise the previous value is kept. $reduce thus folds the array down to the element with the minimum item_field, and $project outputs it.
db.collection.aggregate([
{
$project: {
items: {
$reduce: {
input: "$items",
// Seed with the whole first element, so the result keeps item_id even when the first element is the minimum
initialValue: { $arrayElemAt: ["$items", 0] },
in: {
$cond: [{ $lt: ["$$this.item_field", "$$value.item_field"] }, "$$this", "$$value" ]
}
}
}
}
}
])
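The fold performed by $reduce is the same idea as Array.prototype.reduce in JavaScript. The sketch below is plain JavaScript rather than MongoDB code, with minItem as a hypothetical helper name. It starts the accumulator at the whole first element and keeps whichever element has the smaller item_field.

```javascript
// Sample document from the question.
const doc = {
  items: [
    { item_id: 1, item_field: 10 },
    { item_id: 2, item_field: 15 },
    { item_id: 3, item_field: 3 },
  ],
};

// Returns the element with the smallest item_field, or null for an empty array.
function minItem(items) {
  if (items.length === 0) return null;
  // With no explicit initial value, reduce starts from the first element.
  // The strict < keeps the earlier element on ties, like $lt in the pipeline.
  return items.reduce((best, current) =>
    current.item_field < best.item_field ? current : best
  );
}

console.log(minItem(doc.items)); // { item_id: 3, item_field: 3 }
```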
You can use $unwind to separate the items entries.
Then $sort by item_field ascending and $group, taking the $first item of each group.
db.coll.find().pretty()
{
"_id" : ObjectId("58edec875748bae2cc391722"),
"items" : [
{
"item_id" : 1,
"item_field" : 10
},
{
"item_id" : 2,
"item_field" : 15
},
{
"item_id" : 3,
"item_field" : 3
}
]
}
db.coll.aggregate([
{$unwind: {path: '$items', includeArrayIndex: 'index'}},
{$sort: { 'items.item_field': 1}},
{$group: {_id: '$_id', item: {$first: '$items'}}}
])
{ "_id" : ObjectId("58edec875748bae2cc391722"), "item" : { "item_id" : 3, "item_field" : 3 } }
We can get the expected result using the following query:
db.testing.aggregate([{$unwind:"$items"}, {$sort: { 'items.item_field': 1}},{$group: {_id: "$_id", minItem: {$first: '$items'}}}])
The result is:
{ "_id" : ObjectId("58edf28c73fed29f4b741731"), "minItem" : { "item_id" : 3, "item_field" : 3 } }
{ "_id" : ObjectId("58edec3373fed29f4b741730"), "minItem" : { "item_id" : 3, "item_field" : 3 } }

MongoDB Aggregation: Compute Running Totals from sum of previous rows

Sample Documents:
{ time: ISODate("2013-10-10T20:55:36Z"), value: 1 }
{ time: ISODate("2013-10-10T22:43:16Z"), value: 2 }
{ time: ISODate("2013-10-11T19:12:66Z"), value: 3 }
{ time: ISODate("2013-10-11T10:15:38Z"), value: 4 }
{ time: ISODate("2013-10-12T04:15:38Z"), value: 5 }
It's easy to get aggregated results that are grouped by date.
But what I want is a query that returns a running total
of the aggregation, like:
{ time: "2013-10-10", total: 3, runningTotal: 3 }
{ time: "2013-10-11", total: 7, runningTotal: 10 }
{ time: "2013-10-12", total: 5, runningTotal: 15 }
Is this possible with the MongoDB Aggregation?
EDIT: Since MongoDB v5.0 the preferred approach is to use the new $setWindowFields aggregation stage, as shared by Xavier Guihot.
This does what you need. I have normalised the times in the data so that they group together (you could do this normalisation in the pipeline itself). The idea is to $group and push the times and totals into separate arrays. Then $unwind the time array, which gives each time document its own copy of the totals array. You can then calculate the runningTotal (or something like a rolling average) from the array containing all the data for the different times. The 'index' generated by $unwind is the array index of the total corresponding to that time. It is important to $sort before pushing into the arrays and unwinding, since this ensures the arrays are in the correct order.
db.temp.aggregate(
[
{
'$group': {
'_id': '$time',
'total': { '$sum': '$value' }
}
},
{
'$sort': {
'_id': 1
}
},
{
'$group': {
'_id': 0,
'time': { '$push': '$_id' },
'totals': { '$push': '$total' }
}
},
{
'$unwind': {
'path' : '$time',
'includeArrayIndex' : 'index'
}
},
{
'$project': {
'_id': 0,
'time': { '$dateToString': { 'format': '%Y-%m-%d', 'date': '$time' } },
'total': { '$arrayElemAt': [ '$totals', '$index' ] },
'runningTotal': { '$sum': { '$slice': [ '$totals', { '$add': [ '$index', 1 ] } ] } },
}
},
]
);
I have used something similar on a collection with ~80 000 documents, aggregating to 63 results. I am not sure how well it will work on larger collections, but I have found that performing transformations(projections, array manipulations) on aggregated data does not seem to have a large performance cost once the data is reduced to a manageable size.
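The final $project stage is the part that does the running-total arithmetic: for the document at position index, it sums the first index + 1 entries of the totals array. A plain JavaScript sketch of that step (not MongoDB code, using the per-day totals 3, 7 and 5 from the question) looks like this:

```javascript
// Per-day totals in chronological order, as they would sit in the pushed arrays.
const times = ["2013-10-10", "2013-10-11", "2013-10-12"];
const totals = [3, 7, 5];

// For each index: total is totals[index] ($arrayElemAt), and runningTotal is the
// sum of the first index + 1 totals ($sum over $slice).
const rows = times.map((time, index) => ({
  time,
  total: totals[index],
  runningTotal: totals.slice(0, index + 1).reduce((a, b) => a + b, 0),
}));

console.log(rows);
```

Because every row re-sums a prefix of the array, the work grows quadratically with the number of groups, which fits the observation that this is fine once the data has been reduced to a manageable size.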
Here is another approach.
Pipeline:
db.col.aggregate([
{$group : {
_id : { time :{ $dateToString: {format: "%Y-%m-%d", date: "$time", timezone: "-05:00"}}},
value : {$sum : "$value"}
}},
{$addFields : {_id : "$_id.time"}},
{$sort : {_id : 1}},
{$group : {_id : null, data : {$push : "$$ROOT"}}},
{$addFields : {data : {
$reduce : {
input : "$data",
initialValue : {total : 0, d : []},
in : {
total : {$sum : ["$$this.value", "$$value.total"]},
d : {$concatArrays : [
"$$value.d",
[{
_id : "$$this._id",
value : "$$this.value",
runningTotal : {$sum : ["$$value.total", "$$this.value"]}
}]
]}
}
}
}}},
{$unwind : "$data.d"},
{$replaceRoot : {newRoot : "$data.d"}}
]).pretty()
Collection:
> db.col.find()
{ "_id" : ObjectId("4f442120eb03305789000000"), "time" : ISODate("2013-10-10T20:55:36Z"), "value" : 1 }
{ "_id" : ObjectId("4f442120eb03305789000001"), "time" : ISODate("2013-10-11T04:43:16Z"), "value" : 2 }
{ "_id" : ObjectId("4f442120eb03305789000002"), "time" : ISODate("2013-10-12T03:13:06Z"), "value" : 3 }
{ "_id" : ObjectId("4f442120eb03305789000003"), "time" : ISODate("2013-10-11T10:15:38Z"), "value" : 4 }
{ "_id" : ObjectId("4f442120eb03305789000004"), "time" : ISODate("2013-10-13T02:15:38Z"), "value" : 5 }
Result:
{ "_id" : "2013-10-10", "value" : 3, "runningTotal" : 3 }
{ "_id" : "2013-10-11", "value" : 7, "runningTotal" : 10 }
{ "_id" : "2013-10-12", "value" : 5, "runningTotal" : 15 }
Here is a solution that avoids pushing previous documents into a new array and then processing them. (If the array gets too big, you can exceed the maximum BSON document size of 16MB.)
Calculating running totals is as simple as:
db.collection1.aggregate(
[
{
$lookup: {
from: 'collection1',
let: { date_to: '$time' },
pipeline: [
{
$match: {
$expr: {
$lt: [ '$time', '$$date_to' ]
}
}
},
{
$group: {
_id: null,
summary: {
$sum: '$value'
}
}
}
],
as: 'sum_prev_days'
}
},
{
$addFields: {
sum_prev_days: {
$arrayElemAt: [ '$sum_prev_days', 0 ]
}
}
},
{
$addFields: {
running_total: {
$sum: [ '$value', '$sum_prev_days.summary' ]
}
}
},
{
$project: { sum_prev_days: 0 }
}
]
)
What we did: within the $lookup we selected all documents with an earlier datetime and immediately calculated their sum (using $group as the second step of the lookup's pipeline). The $lookup puts the value into the first element of an array. We pull out the first array element and then calculate the sum: current value + sum of previous values.
If you would like to group the transactions by day and then calculate running totals, insert a $group at the beginning and also into the $lookup's pipeline.
db.collection1.aggregate(
[
{
$group: {
_id: {
$substrBytes: ['$time', 0, 10]
},
value: {
$sum: '$value'
}
}
},
{
$lookup: {
from: 'collection1',
let: { date_to: '$_id' },
pipeline: [
{
$group: {
_id: {
$substrBytes: ['$time', 0, 10]
},
value: {
$sum: '$value'
}
}
},
{
$match: {
$expr: {
$lt: [ '$_id', '$$date_to' ]
}
}
},
{
$group: {
_id: null,
summary: {
$sum: '$value'
}
}
}
],
as: 'sum_prev_days'
}
},
{
$addFields: {
sum_prev_days: {
$arrayElemAt: [ '$sum_prev_days', 0 ]
}
}
},
{
$addFields: {
running_total: {
$sum: [ '$value', '$sum_prev_days.summary' ]
}
}
},
{
$project: { sum_prev_days: 0 }
}
]
)
The result is:
{ "_id" : "2013-10-10", "value" : 3, "running_total" : 3 }
{ "_id" : "2013-10-11", "value" : 7, "running_total" : 10 }
{ "_id" : "2013-10-12", "value" : 5, "running_total" : 15 }
Starting in MongoDB 5.0, this is a perfect use case for the new $setWindowFields aggregation stage:
// { time: ISODate("2013-10-10T20:55:36Z"), value: 1 }
// { time: ISODate("2013-10-10T22:43:16Z"), value: 2 }
// { time: ISODate("2013-10-11T12:12:66Z"), value: 3 }
// { time: ISODate("2013-10-11T10:15:38Z"), value: 4 }
// { time: ISODate("2013-10-12T05:15:38Z"), value: 5 }
db.collection.aggregate([
{ $group: {
_id: { $dateToString: { format: "%Y-%m-%d", date: "$time" } },
total: { $sum: "$value" }
}},
// e.g.: { "_id" : "2013-10-11", "total" : 7 }
{ $set: { "date": "$_id" } }, { $unset: ["_id"] },
// e.g.: { "date" : "2013-10-11", "total" : 7 }
{ $setWindowFields: {
sortBy: { date: 1 },
output: {
running: {
$sum: "$total",
window: { documents: [ "unbounded", "current" ] }
}
}
}}
])
// { date: "2013-10-10", total: 3, running: 3 }
// { date: "2013-10-11", total: 7, running: 10 }
// { date: "2013-10-12", total: 5, running: 15 }
Let's focus on the $setWindowFields stage, which:
- chronologically sorts the grouped documents by date: sortBy: { date: 1 }
- adds the running field to each document (output: { running: { ... } })
- which is the $sum of totals ($sum: "$total")
- over a specified span of documents (the window)
- which in our case is the current document and every document before it: window: { documents: [ "unbounded", "current" ] }
- where [ "unbounded", "current" ] means the window spans all documents from the first document (unbounded) to the current document (current).
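To illustrate how the window bounds behave, here is a small plain JavaScript sketch (not MongoDB code) of a documents window. The windowSum helper is hypothetical. It accepts "unbounded", "current", or an integer offset relative to the current position, which are the kinds of bounds a documents window takes.

```javascript
// Sums `field` over a positional window around each document.
// Bounds: "unbounded" (partition edge), "current", or an integer offset.
function windowSum(docs, field, [lower, upper]) {
  const resolve = (bound, i, edge) =>
    bound === "unbounded" ? edge : bound === "current" ? i : i + bound;
  return docs.map((doc, i) => {
    const start = Math.max(0, resolve(lower, i, 0));
    const end = Math.min(docs.length - 1, resolve(upper, i, docs.length - 1));
    let sum = 0;
    for (let j = start; j <= end; j++) sum += docs[j][field];
    return { ...doc, running: sum };
  });
}

// Grouped documents, deliberately out of order to show why sortBy matters.
const grouped = [
  { date: "2013-10-11", total: 7 },
  { date: "2013-10-10", total: 3 },
  { date: "2013-10-12", total: 5 },
];

// Equivalent of sortBy: { date: 1 }.
const sorted = [...grouped].sort((a, b) => (a.date < b.date ? -1 : a.date > b.date ? 1 : 0));

// Equivalent of window: { documents: ["unbounded", "current"] }.
const result = windowSum(sorted, "total", ["unbounded", "current"]);
console.log(result);
```

Swapping the bounds for [-1, 0] in the same sketch would give a two-document rolling sum instead of a running total.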