Update table with $merge in MongoDB

I have a collection with values associated with the sales of (almost a million) different products by day, and I have to create the collection with the aggregation per week. I do it with the following (working) query.
Brief explanation:
I filter the dates I want to use in the query.
I convert the weird date format to a real date.
I group by name of the object, year and week, getting the sum per week.
I group again by name to have all dates in the same document.
I save the result to a new collection.
[
{
$match: {
$and: [
{
"_id.date": {
$gte: "20220103",
},
},
{
"_id.date": {
$lte: "20230122",
},
},
],
},
},
{
$project: {
_id: 1,
realDate: {
$dateFromString: {
dateString: "$_id.date",
format: "%Y%m%d",
},
},
count: 1,
},
},
{
$group: {
_id: {
name: "$_id.name",
year: {
$isoWeekYear: "$realDate",
},
week: {
$isoWeek: "$realDate",
},
},
total: {
$sum: "$count",
},
},
},
{
$group: {
_id: "$_id.name",
dates: {
$addToSet: {
year: "$_id.year",
week: "$_id.week",
count: "$total",
},
},
},
},
{
$merge: {
into: "dataPerWeek",
on: "_id",
},
},
]
That works and generates documents like:
{
"_id": "myProduct",
"dates": [
{
"year": {
"$numberLong": "2022"
},
"week": 52,
"count": 10
},
{
"year": {
"$numberLong": "2022"
},
"week": 50,
"count": 6
},
{
"year": {
"$numberLong": "2022"
},
"week": 49,
"count": 2
},
{
"year": {
"$numberLong": "2022"
},
"week": 51,
"count": 5
},
{
"year": {
"$numberLong": "2023"
},
"week": 1,
"count": 5
},
{
"year": {
"$numberLong": "2023"
},
"week": 2,
"count": 2
},
{
"year": {
"$numberLong": "2023"
},
"week": 3,
"count": 4
}
]
}
Now I would like to update this list every week, adding only the new elements to the array (or creating a new document if it does not exist). But if I repeat the merge query above, limiting the dates to the last week, it basically removes all other data points. Is it possible to do this "update" with a single query?

You should store date values as Date objects. Storing date values as strings is a design flaw.
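Since _id is immutable, existing documents cannot have _id.date rewritten in place; a one-off conversion into a new collection is one way to migrate. A sketch (the collection names sales and sales_v2 are placeholders for whatever your collections are called):

db.sales.aggregate([
  {
    // parse the "YYYYMMDD" strings in _id.date into real Date values
    $set: {
      _id: {
        name: "$_id.name",
        date: { $dateFromString: { dateString: "$_id.date", format: "%Y%m%d" } }
      }
    }
  },
  // write the converted documents to a new collection
  { $out: "sales_v2" }
])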
Your pipeline can be shorter and your $merge stage would be like this:
[
{
$match: {
"_id.date": {
$gte: ISODate("2022-01-03"),
$lte: ISODate("2023-01-22"),
}
}
},
{
$group: {
_id: {
name: "$_id.name",
week: { $dateTrunc: { date: "$_id.date", unit: "week", startOfWeek: "monday" } }
},
total: { $sum: "$count" },
}
},
{
$group: {
_id: "$_id.name",
dates: {
$push: { // $push should be faster than $addToSet, result is the same
year: { $isoWeekYear: "$_id.week" },
week: { $isoWeek: "$_id.week" },
count: "$total",
}
}
}
},
{
$merge: {
into: "dataPerWeek",
on: "_id",
whenMatched: [
{ $set: { dates: { $concatArrays: ["$dates", "$$new.dates"] } } }
]
}
}
]
The dates elements are simply concatenated. If you'd like to update existing elements instead, you need to iterate over the array with $map or $filter, as sketched below.
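One possible whenMatched pipeline (a sketch, assuming the (year, week) keys produced above; it treats the incoming week buckets as authoritative): drop existing entries whose (year, week) is being re-delivered, then append the fresh ones, so re-running the same week stays idempotent:

{
  $merge: {
    into: "dataPerWeek",
    on: "_id",
    whenMatched: [
      {
        $set: {
          dates: {
            $concatArrays: [
              {
                // keep only existing entries whose (year, week) is not re-delivered
                $filter: {
                  input: "$dates",
                  as: "d",
                  cond: {
                    $not: {
                      $in: [
                        { year: "$$d.year", week: "$$d.week" },
                        {
                          $map: {
                            input: "$$new.dates",
                            as: "n",
                            in: { year: "$$n.year", week: "$$n.week" }
                          }
                        }
                      ]
                    }
                  }
                }
              },
              // then append the freshly computed buckets
              "$$new.dates"
            ]
          }
        }
      }
    ]
  }
}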

Related

MongoDB aggregate $group $sum that matches date inside array of objects

I'll explain my problem here and put a tl;dr at the bottom summarizing the question.
We have a collection called apple_receipt, since we have some Apple purchases in our application. That document has some fields that we will be using in this aggregation: price, currency, startedAt and history. Price, currency and startedAt are self-explanatory. History is an array of objects, each containing a price and a startedAt. What we are trying to accomplish is a query that takes a date range of our choice, for example 06-06-2022 through 10-10-2022, and gets the combined total price of all those receipts whose startedAt falls within that range. We have a document like this:
{
price: 12.9,
currency: 'BRL',
startedAt: 2022-08-10T16:23:42.000+00:00,
history: [
{
price: 12.9,
startedAt: 2022-05-10T16:23:42.000+00:00
},
{
price: 12.9,
startedAt: 2022-06-10T16:23:42.000+00:00
},
{
price: 12.9,
startedAt: 2022-07-10T16:23:42.000+00:00
}
]
}
If we query between dates 06-06-2022 and 10-10-2022, we would get a return like this: totalPrice: 38.7 (the total price of the 3 entries whose dates fall inside that range).
I have tried this so far:
AppleReceipt.aggregate([
{
$project: {
price: 1,
startedAt: 1,
currency: 1,
history: 1,
}
},
{
$unwind: {
path: "$history",
preserveNullAndEmptyArrays: true,
}
},
{
$match: {
$or: [
{ startedAt: {$gte: new Date(filters.begin), $lt: new Date(filters.end)} },
]
}
},
{
$group: {
_id: "$_id",
data: { $push: '$$ROOT' },
totalAmountHelper: { $sum: '$history.price' }
}
},
{
$unwind: "$data"
},
{
$addFields: {
totalAmount: { $add: ['$totalAmountHelper', '$data.price'] }
}
}
])
It does bring me the total value, but I couldn't figure out how to take the dates into account in the $match stage so that the sum only covers the entries between those dates.
tl;dr: I want a query that gets the total sum of the prices of all documents that have startedAt between the dates we choose. It needs to match the entries inside the history field (an array of objects) as well as the startedAt outside of the history field.
https://mongoplayground.net/p/lOvRbX24QI9
db.collection.aggregate([
{
$set: {
"history_total": {
"$reduce": {
"input": "$history",
"initialValue": 0,
"in": {
$sum: [
{
"$cond": {
"if": {
$and: [
{
$gte: [
{
$dateFromString: {
dateString: "$$this.startedAt"
}
},
new Date("2022-06-06")
]
},
{
$lt: [
{
$dateFromString: {
dateString: "$$this.startedAt"
}
},
new Date("2022-10-10")
]
},
]
},
"then": "$$this.price",
"else": 0
}
},
"$$value",
]
}
}
}
}
},
{
$set: {
"history_total": {
"$sum": [
"$price",
"$history_total"
]
}
}
}
])
Result (with the playground's sample prices; floating-point rounding noise may appear):
[
{
"_id": ObjectId("5a934e000102030405000000"),
"currency": "BRL",
"history": [
{
"price": 12.9,
"startedAt": "2022-05-10T16:23:42.000+00:00"
},
{
"price": 12.9,
"startedAt": "2022-06-10T16:23:42.000+00:00"
},
{
"price": 12.9,
"startedAt": "2022-07-10T16:23:42.000+00:00"
}
],
"history_total": 325.79999999999995,
"price": 312.9,
"startedAt": "2022-08-10T16:23:42.000+00:00"
}
]
Kudos goes to @user20042973.
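For reference, the same result can be had with $filter/$map instead of $reduce, which also makes it easy to guard the top-level price with the same date check instead of adding it unconditionally. A sketch under the same assumptions about the string dates (untested against the playground):

db.apple_receipt.aggregate([
  {
    $set: {
      history_total: {
        $sum: {
          $map: {
            // keep only the history entries whose startedAt is in range
            input: {
              $filter: {
                input: "$history",
                cond: {
                  $and: [
                    { $gte: [{ $dateFromString: { dateString: "$$this.startedAt" } }, new Date("2022-06-06")] },
                    { $lt: [{ $dateFromString: { dateString: "$$this.startedAt" } }, new Date("2022-10-10")] }
                  ]
                }
              }
            },
            in: "$$this.price"
          }
        }
      }
    }
  },
  {
    $set: {
      history_total: {
        $sum: [
          "$history_total",
          // include the top-level price only when its own date is in range
          {
            $cond: [
              {
                $and: [
                  { $gte: [{ $dateFromString: { dateString: "$startedAt" } }, new Date("2022-06-06")] },
                  { $lt: [{ $dateFromString: { dateString: "$startedAt" } }, new Date("2022-10-10")] }
                ]
              },
              "$price",
              0
            ]
          }
        ]
      }
    }
  }
])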

Aggregation: Grouping based on array element in MongoDB

I am new to MongoDB, trying to write an aggregation such that my output for the sample input below should be the same as this:
[
{
"_id": {
"month": 1,
"year": 2022
},
"childServices": [
{"service":"MCT Latency", "sli":99.9},
{"service":"MCT Packet Loss", "sli":99.9}
],
"service": "Network"
},
{
"_id": {
"month": 2,
"year": 2022
},
"childServices": [
{"service":"MCT Latency", "sli":98.9},
{"service":"MCT Packet Loss", "sli":99.9}
],
"service": "Network"
}
]
I tried the following, but it's not grouping each childService by date.
[{
$unwind: {
path: '$childServices'
}
}, {
$group: {
_id: {
month: {
$month: '$date'
},
year: {
$year: '$date'
}
},
service: {
$first: '$service'
},
childServices: {
$first: '$childServices.service'
},
sli: {
$avg: '$childServices.availability'
}
}
}, {
$sort: {
'_id.month': 1,
'_id.year': 1
}
}]
SAMPLE DATA
[{
"_id": {
"$oid": "62fc99c00f5b1cb61d5f1072"
},
"service": "Network",
"date": "01/02/2022 00:32:51",
"childServices": [
{
"service": "MCT Latency",
"availability": 99.9,
},
{
"service": "MCT Packet Loss",
"availability": 99.9,
}
]
},
{
"_id": {
"$oid": "62fc99df0f5b1cb61d5f1073"
},
"service": "Network",
"date": "02/02/2022 00:32:51",
"childServices": [
{
"service": "MCT Latency",
"availability": 98.3,
},
"service": "MCT Packet Loss",
"availability": 99.9,
}
}
]
Basically, I want to go into childServices, pick each service, group them by month + year, and get their monthly average.
Convert the date from a string to a date type, before grouping, like this:
db.collection.aggregate([
{
$unwind: {
path: "$childServices"
}
},
{
$addFields: {
date: {
"$toDate": "$date"
}
}
},
{
$group: { // group per distinct (month, year, childService) combination; needed because we use the $first accumulator
_id: {
month: {
$month: "$date"
},
year: {
$year: "$date"
},
service: "$childServices.service"
},
service: {
$first: "$service"
},
childServices: {
$first: "$childServices.service"
},
sli: {
$avg: "$childServices.availability"
}
}
},
{
"$group": { <-- In this group, we groupBy month and year, and we push the child services record into an array, using $push. This gives us, for every month and year, the average of all distinct childservices
"_id": {
month: "$_id.month",
year: "$_id.year"
},
"childServices": {
"$push": {
service: "$childServices",
sli: "$sli"
}
}
}
},
{
$sort: {
"_id.month": 1,
"_id.year": 1
}
}
])
Playground link.
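Note that $toDate relies on MongoDB's flexible date parsing. If your strings are in a non-ISO format, being explicit with $dateFromString is safer; a sketch of the conversion stage (the "%m/%d/%Y %H:%M:%S" format is an assumption, inferred from the months in the expected output above):

{
  $addFields: {
    date: {
      $dateFromString: {
        dateString: "$date",
        format: "%m/%d/%Y %H:%M:%S"
      }
    }
  }
}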

Is it possible to group (aggregate) objects with dates into incremental intervals in MongoDB?

I am currently trying to create an aggregation pipeline in MongoDB to group the items into incremental time intervals, but so far I have only succeeded in grouping them in disjoint time intervals.
Sample data:
{
"eventID": "abc",
"date": ISODate("2020-11-05T12:05:11.790Z"),
...........
},
{
"eventID": "xyz",
"date": ISODate("2020-11-05T12:12:11.790Z"),
...........
},
{
"eventID": "klm",
"date": ISODate("2020-11-05T12:28:11.790Z"),
...........
}
Current solution:
$group: {
"_id": {
"year": { $year: "$date" },
"dayOfYear": { $dayOfYear: "$date" },
"hour": { $hour: "$date" },
"interval": {
"$subtract": [
{ "$minute": "$date" },
{ "$mod": [{ "$minute": "$date"}, 10 ] }
]
}
},
"grouped_data": { "$push": { "eventID": "$eventID", "date": "$date" },
"count": { $sum: 1 } }
}
This returns the data grouped in 10-minute intervals, but those intervals are disjoint (time windows that do not intersect).
Eg:
{
"_id": {
"year": 2020,
"dayOfYear": "314",
"hour": 12,
"interval": 0, // = interval beginning at minute 0 of 12th hour of the day
},
"grouped_data": [{ "eventID": "abc", "date": ISODate("2020-11-05T12:05:11.790Z" }],
"count": 1
},
{
"_id": {
"year": 2020,
"dayOfYear": "314",
"hour": 12,
"interval": 10, // = beginning at minute 10
},
"grouped_data": [{ "eventID": "xyz", "date": ISODate("2020-11-05T12:12:11.790Z") }],
"count": 1
},
{
"_id": {
"year": 2020,
"dayOfYear": "314",
"hour": 12,
"interval": 20, // = beginning at minute 20
},
"grouped_data": [{ "eventID": "klm", "date": ISODate("2020-11-05T12:28:11.790Z") }],
"count": 1
}
What I am actually looking for is grouping them in incremental 10-minute (or whatever is needed) intervals, e.g. 0-9, 1-10, 2-11, etc., instead of 0-9, 10-19, 20-29, etc.
Edit:
The end goal here is to check whether a count threshold is surpassed within an interval length defined by the user.
If the user asks "Are there more than 2 events in a 10-minute time window?", then based on the sample data above and my current solution the condition is not met (1 event in the 0-9 interval, and 1 event in 10-19). With incremental intervals I should be able to find that there are indeed 2 events within 10 minutes, in the interval 5-14. E.g.:
{
"_id": {
*whatever logic for grouping in 10minutes window*
},
"grouped_data": [
{ "eventID": "abc", "date": ISODate("2020-11-05T12:05:11.790Z") },
{ "eventID": "xyz", "date": ISODate("2020-11-05T12:12:11.790Z") }],
"count": 2
},
{
"_id": {
*whatever logic for grouping in 10minutes window*
},
"grouped_data": [
{ "eventID": "klm", "date": ISODate("2020-11-05T12:28:11.790Z") }]
"count": 1
},
It is not clear to me exactly which output you would like to get, but this aggregation pipeline produces the sliding-window groups:
db.collection.aggregate([
{
$group: {
_id: null,
data: { $push: "$$ROOT" },
min_date: { $min: "$date" },
max_date: { $max: "$date" }
}
},
{
$addFields: {
interval: {
$range: [
{ $toInt: { $divide: [{ $toLong: "$min_date" }, 1000] } },
{ $toInt: { $divide: [{ $toLong: "$max_date" }, 1000] } },
10 * 60]
}
}
},
{
$set: {
interval: {
$map: {
input: "$interval",
in: { $toDate: { $multiply: ["$$this", 1000] } }
}
}
}
},
{ $unwind: "$interval" },
{
$project: {
grouped_data: {
$filter: {
input: "$data",
cond: {
$and: [
{ $gte: ["$$this.date", "$interval"] },
{ $lt: ["$$this.date", { $add: ["$interval", 1000 * 60 * 10] }] },
]
}
}
},
interval: 1
}
}
])
The boundaries are taken from the input data here, but you can also use fixed dates:
db.collection.aggregate([
{ $group: { _id: null, data: { $push: "$$ROOT" } } },
{
$addFields: {
interval: {
$range: [
{ $toInt: { $divide: [{ $toLong: ISODate("2020-01-01T00:00:00Z") }, 1000] } },
{ $toInt: { $divide: [{ $toLong: ISODate("2020-12-31T23:59:59Z") }, 1000] } },
10 * 60]
}
}
},
{
$set: {
interval: {
$map: {
input: "$interval",
in: { $toDate: { $multiply: ["$$this", 1000] } }
}
}
}
},
{ $unwind: "$interval" },
{
$project: {
grouped_data: {
$filter: {
input: "$data",
cond: {
$and: [
{ $gte: ["$$this.date", "$interval"] },
{ $lt: ["$$this.date", { $add: ["$interval", 1000 * 60 * 10] }] },
]
}
}
},
interval: 1
}
}
])
I will try to answer my own question; maybe it will help other people on the internet. The solution I came up with is based on the answer of @Wernfried (thanks!).
db.getCollection("events_en").aggregate([
{
$match: { eventID: "XYZ" }
},
{
$group: {
_id: null,
events: { $push: "$$ROOT" },
limit: { $push: { $toDate: { $add: [{ $toLong: "$date" }, 1000 * 60 * 10] } } }
}
},
{ $unwind: "$limit" },
{
$project: {
events: {
$filter: {
input: "$events",
cond: {
$and: [
{ $lt: ["$$this.date", "$limit"] },
{ $gte: ["$$this.date", { $subtract: ["$limit", 1000 * 60 * 10] }] },
]
}
}
},
limit: 1,
}
},
{
$addFields: {
count: {
$size: "$events"
}
}
}
])
This creates a limit for each event, based on its date + 10 minutes (or whatever is needed). Afterwards it filters the events (which are now duplicated, one copy per limit, by $unwind: "$limit") based on that limit. The result is something like this:
{
"_id" : null,
"limit" : ISODate("2020-11-05T12:28:27.000+0000"),
"events" : [
{
"_id" : 13,
"eventID" : "XYZ",
"date" : ISODate("2020-11-05T12:18:27.000+0000")
},
{
"_id" : 63,
"eventID" : "XYZ",
"date" : ISODate("2020-11-05T12:19:55.000+0000")
},
............................
{
"_id" : 90,
"eventID" : "XYZ",
"date" : ISODate("2020-11-05T12:27:57.000+0000")
}
],
"count" : 5
}
{
"_id" : null,
"limit" : ISODate("2020-11-05T12:29:55.000+0000"),
"events" : [
{
"_id" : 63,
"eventID" : "XYZ",
"date" : ISODate("2020-11-05T12:19:55.000+0000")
},
{
"_id" : 90,
"eventID" : "XYZ",
"date" : ISODate("2020-11-05T12:27:57.000+0000")
},
{
"_id" : 97,
"eventID" : "XYZ",
"date" : ISODate("2020-11-05T12:29:36.000+0000")
}
],
"count" : 3
}
As you can see, looking at the limit of each group and at the dates of the events in each group, these intervals are now incremental, not disjoint (an event can be found in multiple groups, as long as it does not exceed the 10-minute interval).
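For what it's worth, on MongoDB 5.0+ the $setWindowFields stage can express this rolling count directly, without duplicating the data per limit. A sketch (same collection and field names as above; the threshold 2 stands in for the user-defined one):

db.getCollection("events_en").aggregate([
  { $match: { eventID: "XYZ" } },
  {
    $setWindowFields: {
      sortBy: { date: 1 },
      output: {
        // number of events in the 10 minutes ending at this event's date
        count: {
          $count: {},
          window: { range: [-10, 0], unit: "minute" }
        }
      }
    }
  },
  // keep only the events where the threshold is met
  { $match: { count: { $gte: 2 } } }
])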

How to group orders by date range in MongoDB?

Suppose I have a collection like this:
[
{
"_id": ObjectId("5e3dd3d57f8bc30a7513e843"),
"deleted": true,
"date": "01/01/2020"
"total": 3
},
{
"_id": ObjectId("5e3dd3e97f8bc30a7513e99b"),
"date": "02/01/2020",
"deleted": false,
"total": 11
},
{
"_id": ObjectId("5e3dd3e97f8bc30a75137635"),
"date": "15/02/2020",
"deleted": false,
"total": 5
},
{
"_id": ObjectId("5e3dd3e97f8bc30a75131725"),
"date": "18/02/2020",
"deleted": false,
"total": 7
},
{
"_id": ObjectId("5e3dd3e97f8bc30a75131725"),
"date": "03/03/2020",
"deleted": false,
"total": 9
}
]
I need to group these orders by date range to receive something like this:
{
"january": [order1, order2],
"february": [order3, order4],
"march": [order5]
}
of course I don't need the words "january, february" etc. specifically, just something that lets me group by date ranges. Something like this:
db.sales.aggregate( [
{ $group: { date: { "$gte": new Date(req.query.minDate), "$lte": new Date(req.query.maxDate) }, mergedOrders: { ?? } } }
])
which is nowhere near a valid $group aggregation call.
So, how do I group orders by date range? (I need to get, for each date range, the entire array of orders in that range, as they are, without excluding fields.)
You can try this:
db.sales.aggregate([
{ $match: { date: { "$gte": new Date(req.query.minDate), "$lte": new Date(req.query.maxDate) } } },
{
$group: {
_id: {
$month: {
$dateFromString: {
dateString: '$date',
format: "%d/%m/%Y"
}
}
}, mergedOrders: { $push: '$$ROOT' }
}
}, {
$addFields: {
_id: {
$let: {
vars: {
monthsInString: ['', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'July', 'Aug', 'Sept', 'Oct', 'Nov', 'Dec']
},
in: {
$arrayElemAt: ['$$monthsInString', '$_id']
}
}
}
}
}])
Test : MongoDB-Playground
You can use the $group operator to group data based on a particular field value, and then use the $push operator to collect those documents into an array.
db.sales.aggregate([
{
$match: {
date: { "$gte": new Date(req.query.minDate), "$lte": new Date(req.query.maxDate) }
}
},
{
$group: {
_id: "$date",
orders: {
$push: {
deleted: "$deleted",
total: "$total"
}
}
}
},
{
$project: {
_id: 1,
orders: 1
}
}
]);
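A caveat that applies to the $match stage in both answers: date is stored as a "DD/MM/YYYY" string, so comparing it against Date objects will never match any document. A sketch of a filter that converts first (using the "%d/%m/%Y" format from the first answer):

{
  $match: {
    $expr: {
      $and: [
        { $gte: [{ $dateFromString: { dateString: "$date", format: "%d/%m/%Y" } }, new Date(req.query.minDate)] },
        { $lte: [{ $dateFromString: { dateString: "$date", format: "%d/%m/%Y" } }, new Date(req.query.maxDate)] }
      ]
    }
  }
}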

How to get last two values from array with aggregation group by date in mongodb?

UserActivity.aggregate([
{
$match: {
user_id: {$in: user_id},
"tracker_id": {$in:getTrackerId},
date: { $gte: req.app.locals._mongo_date(req.params[3]),$lte: req.app.locals._mongo_date(req.params[4]) }
}
},
{ $sort: {date: 1 } },
{ $unwind: "$duration" },
{
$group: {
_id: {
tracker: '$tracker_id',
$last:"$duration",
year:{$year: '$date'},
month: {$month: '$date'},
day: {$dayOfMonth: '$date'}
},
resultData: {$sum: "$duration"}
}
},
{
$group: {
_id: {
year: "$_id.year",
$last:"$duration",
month:"$_id.month",
day: "$_id.day"
},
resultData: {
$addToSet: {
tracker: "$_id.tracker",
val: "$resultData"
}
}
}
}
], function (err,tAData) {
tAData.forEach(function(key){
console.log(key);
});
});
I get this output from the aggregation:
{ _id: { year: 2015, month: 11, day: 1 },
resultData:[ { tracker: 55d2e6b043d77c0877105397, val: 60 },
{ tracker: 55d2e6b043d77c0877105397, val: 75 },
{ tracker: 55d2e6b043d77c0877105397, val: 25 },
{ tracker: 55d2e6b043d77c0877105397, val: 21 } ] }
{ _id: { year: 2015, month: 11, day: 2 },
resultData:[ { tracker: 55d2e6b043d77c0877105397, val: 100 },
{ tracker: 55d2e6b043d77c0877105397, val: 110 },
{ tracker: 55d2e6b043d77c0877105397, val: 40 },
{ tracker: 55d2e6b043d77c0877105397, val: 45 } ] }
But I need this output instead; I want to fetch the last two records from each group:
{ _id: { year: 2015, month: 11, day: 1 },
resultData:[ { tracker: 55d2e6b043d77c0877105397, val: 25 },
{ tracker: 55d2e6b043d77c0877105397, val: 21 } ] }
{ _id: { year: 2015, month: 11, day: 2 },
resultData:[ { tracker: 55d2e6b043d77c0877105397, val: 40 },
{ tracker: 55d2e6b043d77c0877105397, val: 45 } ] }
You have clear syntax errors in your $group statement with $last as that is not a valid usage, but I suspect this has something to do with what you are "trying" to do rather than what you are using to get your actual result.
Getting a result with the "best n values" is a bit of a problem for the aggregation framework. There is this recent answer from myself with a longer explanation of the basic case, but it all boils down to the fact that the aggregation framework lacks the basic tools to do this "limited" grouping per grouping key that you want.
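(A side note for readers on modern servers: since MongoDB 5.2 the $topN accumulator provides exactly this missing tool, so the workarounds below are only needed on older versions. A sketch against the per-day grouping used in this answer:)

// MongoDB 5.2+: top 2 entries per day by summed duration
{ "$group": {
    "_id": "$_id.date",
    "resultData": {
        "$topN": {
            "n": 2,
            "sortBy": { "val": -1 },
            "output": { "tracker_id": "$_id.tracker_id", "val": "$val" }
        }
    }
}}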
Doing it badly
The horrible way to approach this is very "iterative" per the number of results you want to return. It basically means pushing everything into an array and then using operators like $first ( after sorting in reverse ) to return the result off the stack and subsequently "filter" that result ( think an array pop or shift operation ) out of the results and then do it again to get the next one.
Basically, this is what a 2-iteration example looks like:
UserActivity.aggregate(
[
{ "$match": {
"user_id": { "$in": user_id },
"tracker_id": { "$in": getTrackerId },
"date": {
"$gte": startDate,
"$lt": endDate
}
}},
{ "$unwind": "$duration" },
{ "$group": {
"_id": {
"tracker_id": "$tracker_id",
"date": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"val": { "$sum": "$duration" }
}
}},
{ "$sort": { "_id": 1, "val": -1 } },
{ "$group": {
"_id": "$_id.date",
"resultData": {
"$push": {
"tracker_id": "$_id.tracker_id",
"val": "$val"
}
}
}},
{ "$unwind": "$resultData " },
{ "$group": {
"_id": "$_id",
"last": { "$first": "$resultData" },
"resultData": { "$push": "$resultData" }
}},
{ "$unwind": "$resultData" },
{ "$redact": {
"if": { "$eq": [ "$resultData", "$last" ] },
"then": "$$PRUNE",
"else": "$$KEEP"
}},
{ "$group": {
"_id": "$_id",
"last": { "$first": "$last" },
"secondLast": { "$first": "$resultData" }
}},
{ "$project": {
"resultData": {
"$map": {
"input": [0,1],
"as": "index",
"in": {
"$cond": {
"if": { "$eq": [ "$$index", 0 ] },
"$last",
}
}
}
}
}}
],
function (err,tAData) {
console.log(JSON.stringify(tAData,undefined,2))
}
);
I am also simplifying your date inputs to startDate and endDate as pre-determined Date object values before the pipeline code. But the principles here show this is not a performant or very scalable approach, mostly due to needing to put all results into an array and then work through it just to extract the values.
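As an aside, on MongoDB 5.0+ the $subtract/$mod date-rounding dance above collapses into $dateTrunc (a sketch):

// MongoDB 5.0+: truncate the date to day granularity directly
"date": { "$dateTrunc": { "date": "$date", "unit": "day" } }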
Doing it better
A much better approach is to send an aggregation query to the server for each date in the range, as date is what you want as the eventual key. Since you only return each "key" at once, it is easy to just apply $limit to restrict the response.
The ideal case is to perform these queries in parallel and then combine them. Fortunately the node async library provides an async.map or specifically async.mapLimit which performs this function exactly:
N.B. You don't want async.mapSeries for the best performance, since queries are "serially executed in order", which means only one operation occurs on the server at a time. The results are array-ordered, but it's going to take longer. A client-side sort makes more sense here.
var dates = [],
returnLimit = 2,
OneDay = 1000 * 60 * 60 * 24;
// Produce an array for each date in the range
for (
var myDate = startDate;
myDate < endDate;
myDate = new Date( myDate.valueOf() + OneDay )
) {
dates.push(myDate);
}
async.mapLimit(
dates,
10,
function (date,callback) {
UserActivity.aggregate(
[
{ "$match": {
"user_id": { "$in": user_id },
"tracker_id": { "$in": getTrackerId },
"date": {
"$gte": date,
"$lt": new Date( date.valueOf() + OneDay )
}
}},
{ "$unwind": "$duration" },
{ "$group": {
"_id": {
"tracker_id": "$tracker_id",
"date": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
OneDay
]}
]},
new Date(0)
]
},
"val": { "$sum": "$duration" }
}
}},
{ "$sort": { "_id": 1, "val": -1 } },
{ "$limit": returnLimit },
{ "$group": {
"_id": "$_id.date",
"resultData": {
"$push": {
"tracker_id": "$_id.tracker_id",
"val": "$val"
}
}
}}
],
function (err,result) {
callback(err,result[0]);
}
);
},
function (err,results) {
if (err) throw err;
results.sort(function (a,b) {
return a._id - b._id;
});
console.log( JSON.stringify( results, undefined, 2 ) );
}
);
Now that is a much cleaner listing and a lot more efficient and scalable than the first approach. By issuing each aggregation per single date and then combining the results, the "limit" there allows up to 10 queries to execute on the server at the same time ( tune to your needs ) and ultimately return a singular response.
Since these are "async" and not performed in series ( the best performance option ) then you just need to sort the returned array as is done in the final block:
results.sort(function (a,b) {
return a._id - b._id;
});
And then everything is ordered as you would expect in the response.
Forcing the aggregation pipeline to do this where it really is not necessary is a sure path to code that will fail in the future if it does not already do so now. Parallel query operations and combining the results "just makes sense" for efficient and scalable output.
Also note that you should not use $lte with range selections on dates. Even if you have thought about it, the better approach is "startDate" with "endDate" being the start of the next "whole day" after the range you want. This makes a cleaner selection boundary than, say, "the last second of the day".
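For example, to select all of 2015-11-01 (a sketch):

// half-open interval: from the start of the day up to, but excluding, the next day
{ "date": { "$gte": new Date("2015-11-01"), "$lt": new Date("2015-11-02") } }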