MongoDB Query Performance using $unwind

I have this MongoDB document structure:
_id: ObjectID('1234')
email: 'myEmail#gmail.com'
score: [20, 34, 45, 90...]
I want the global average across all documents, and this is what I came up with:
const globalMeta = await tests.aggregate([
{
$project: {
_id: 0,
score: 1
}
},
{
$unwind: "$score"
},
{
$group: {
_id: null,
average: { $avg: "$score" }
}
},
{
$project: {
_id: 0,
average: 1
}
}
]).toArray()
And the result is something like: average: 75
I am really concerned about performance. Would this method still work with, say, thousands of documents?

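Try this. It is not the same calculation: it averages the per-document averages, which is guaranteed to equal the global average only when every score array has the same length. Still, it can give you some ideas: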
> db.tests.aggregate([
{ $project: { scores: { $avg: "$score"}, _id:0 } },
{ $group:{ _id:"Total" , FinalScoreAvg:{ $avg:"$scores"} }}
])
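The difference between the two calculations matters when the score arrays have different lengths: averaging per-document averages weights every document equally, while the global average weights every score equally. The sketch below illustrates this in plain JavaScript. The sample `docs` array and the names `globalAverage`, `averageOfAverages` and `pipelineWithoutUnwind` are assumptions for illustration, not part of the question, and the pipeline is one possible way to avoid `$unwind` rather than a verified solution.

```javascript
// Hypothetical sample data: score arrays of different lengths.
const docs = [
  { score: [20, 34] },
  { score: [45, 90, 100, 11] },
];

const sum = (xs) => xs.reduce((a, b) => a + b, 0);

// Global average: every individual score has the same weight.
function globalAverage(documents) {
  const total = sum(documents.map((d) => sum(d.score)));
  const count = sum(documents.map((d) => d.score.length));
  return count === 0 ? null : total / count;
}

// Average of per-document averages: every document has the same weight.
function averageOfAverages(documents) {
  const averages = documents.map((d) => sum(d.score) / d.score.length);
  return averages.length === 0 ? null : sum(averages) / averages.length;
}

// One possible pipeline for the global average without $unwind.
// Assumption: every document has a non-empty numeric `score` array.
const pipelineWithoutUnwind = [
  { $project: { _id: 0, total: { $sum: "$score" }, count: { $size: "$score" } } },
  { $group: { _id: null, total: { $sum: "$total" }, count: { $sum: "$count" } } },
  { $project: { _id: 0, average: { $divide: ["$total", "$count"] } } },
];

console.log(globalAverage(docs));     // 50
console.log(averageOfAverages(docs)); // 44.25
```

The pipeline keeps one small document per input document instead of one per score, which is the main reason it may scale better than unwinding every array.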

Related

MongoDB add grand total to sortByCount() in an aggregation pipeline

I have grouped all the users by country, but I would also like to have a row showing the grand total (users are tagged to a single country in our use case).
Data Model / Sample Input
The collection is filled with objects representing a country (name) and each contains a list of user objects in an array under users.
{ _id: ObjectId("..."),
name: 'SG',
type: 'COUNTRY',
increment: 200,
users:
[ ObjectId("..."),
ObjectId("..."),
...
Query
db.collection.aggregate([{$match:{type:"COUNTRY"}},{$unwind:"$users"},{$sortByCount:"$name"}])
Current Results
{ _id: 'SG', count: 76 }
{ _id: 'IN', count: 6 }
{ _id: 'US', count: 4 }
{ _id: 'FR', count: 3 }
{ _id: 'UK', count: 2 }
{ _id: 'RU', count: 1 }
{ _id: 'CO', count: 1 }
{ _id: 'DK', count: 1 }
{ _id: 'ID', count: 1 }
{ _id: 'PH', count: 1 }
Expected Results
{ _id: 'SG', count: 76 }
{ _id: 'IN', count: 6 }
{ _id: 'US', count: 4 }
{ _id: 'FR', count: 3 }
{ _id: 'UK', count: 2 }
{ _id: 'RU', count: 1 }
{ _id: 'CO', count: 1 }
{ _id: 'DK', count: 1 }
{ _id: 'ID', count: 1 }
{ _id: 'PH', count: 1 }
{ _id: null, count: 96 } <<< TOTAL COUNT ADDED
Any tips to achieve this without resorting to complex or dirty tricks?
You can also try using $facet to calculate counts by country name and total count, and then combine them together. Something like this:
db.collection.aggregate([
{
$match: {
type: "COUNTRY"
}
},
{
"$unwind": "$users"
},
{
"$facet": {
"groupCountByCountry": [
{
"$sortByCount": "$name"
}
],
"totalCount": [
{
"$group": {
"_id": null,
"count": {
"$sum": 1
}
}
}
]
}
},
{
"$project": {
array: {
"$concatArrays": [
"$groupCountByCountry",
"$totalCount"
]
}
}
},
{
"$unwind": "$array"
},
{
"$replaceRoot": {
"newRoot": "$$ROOT.array"
}
}
])
I recommend just doing this in memory, as the alternative is "hacky". To achieve it in Mongo, you need to group all documents, add a new document, and unwind again. The stages below assume the incoming documents already carry a count field, as the $sortByCount output does:
db.collection.aggregate([
{
$group: {
_id: null,
roots: {
$push: "$$ROOT"
},
sum: {
$sum: "$count"
}
}
},
{
$addFields: {
roots: {
"$concatArrays": [
"$roots",
[
{
_id: null,
count: "$sum"
}
]
]
}
}
},
{
$unwind: "$roots"
},
{
$replaceRoot: {
newRoot: "$roots"
}
}
])
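For the in-memory option, a minimal sketch could look like the following. The helper name `withGrandTotal` and the shortened sample rows are hypothetical; the rows are shaped like the "Current Results" in the question.

```javascript
// Hypothetical helper: append a grand-total row to $sortByCount-style results.
function withGrandTotal(rows) {
  const total = rows.reduce((acc, row) => acc + row.count, 0);
  // Return a new array so the original result set is left untouched.
  return [...rows, { _id: null, count: total }];
}

// Sample rows shaped like the question's current results (shortened).
const rows = [
  { _id: "SG", count: 76 },
  { _id: "IN", count: 6 },
  { _id: "US", count: 4 },
];

const result = withGrandTotal(rows);
console.log(result[result.length - 1]); // { _id: null, count: 86 }
```

With the Node.js driver, `rows` would typically be the array returned by calling `toArray()` on the aggregation cursor.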

MongoDB - Calculate field based on previous item

I have a tricky scenario: I need to calculate an extra field based on the value from the previous document. I have no idea how to do it in a performant manner. Any thoughts?
Data:
{
_id: 1,
score: 66,
created_at: "2021-04-01"
},
{
_id: 2,
score: 12,
created_at: "2021-04-03"
},
{
_id: 3,
score: 7,
created_at: "2021-04-06"
}
What I want to achieve
{
_id: 1,
score: 66,
total_score: 66, // The oldest item, so the total score equals the current score
created_at: "2021-04-01"
},
{
_id: 2,
score: 12,
total_score: 78, // Sum of the current score and the previous total_score
created_at: "2021-04-03"
},
{
_id: 3,
score: 7,
total_score: 85, // Sum of the current score and the previous total_score
created_at: "2021-04-06"
}
Any insights appreciated.
You can try an aggregation query:
$lookup with a pipeline that matches documents whose _id is less than the current _id, which selects all previous records
$group by null to get the sum of score across those records
$arrayElemAt to get the first element of the lookup result
$ifNull to return 0 when the lookup result is empty (the oldest record), otherwise the summed value
$add to add the current score to the sum returned by the lookup
db.collection.aggregate([
{
$lookup: {
from: "collection",
let: { id: "$_id" },
pipeline: [
{ $match: { $expr: { $gt: ["$$id", "$_id"] } } },
{
$group: {
_id: null,
score: { $sum: "$score" }
}
}
],
as: "total_score"
}
},
{
$addFields: {
total_score: {
$add: [
"$score",
{
$ifNull: [
{ $arrayElemAt: ["$total_score.score", 0] },
0
]
}
]
}
}
}
])
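The $lookup approach runs a sub-pipeline for every document and orders by _id rather than created_at, so the work can grow quickly with the size of the collection. On MongoDB 5.0 or later, a window function is another option. The following is a sketch under that version assumption and is not part of the answer above; `runningTotalPipeline`, `runningTotal` and `sample` are hypothetical names, and the in-memory function is included only to show the expected shape of the output.

```javascript
// Sketch: cumulative sum with $setWindowFields (requires MongoDB 5.0+).
const runningTotalPipeline = [
  {
    $setWindowFields: {
      sortBy: { created_at: 1 },
      output: {
        total_score: {
          $sum: "$score",
          // Window from the first document up to the current one.
          window: { documents: ["unbounded", "current"] },
        },
      },
    },
  },
];

// Hypothetical in-memory equivalent of the pipeline above.
// Assumption: created_at is an ISO date string, so string order is date order.
function runningTotal(docs) {
  let total = 0;
  return [...docs]
    .sort((a, b) => (a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0))
    .map((doc) => {
      total += doc.score;
      return { ...doc, total_score: total };
    });
}

// Sample data from the question.
const sample = [
  { _id: 1, score: 66, created_at: "2021-04-01" },
  { _id: 2, score: 12, created_at: "2021-04-03" },
  { _id: 3, score: 7, created_at: "2021-04-06" },
];

console.log(runningTotal(sample).map((d) => d.total_score)); // [ 66, 78, 85 ]
```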

Mongodb aggregation , group by items for the last 5 days

I'm trying to get the result in a particular form using MongoDB aggregation.
Here are some sample documents from the collection:
[{
"_id": "34243243243",
"workType": "TESTWORK1",
"assignedDate":ISODate("2021-02-22T00:00:00Z"),
"status":"Completed",
},
{
"_id": "34243243244",
"workType": "TESTWORK2",
"assignedDate":ISODate("2021-02-21T00:00:00Z"),
"status":"Completed",
},
{
"_id": "34243243245",
"workType": "TESTWORK3",
"assignedDate":ISODate("2021-02-20T00:00:00Z"),
"status":"InProgress",
}...]
I need to group the last 5 days of data into an array of per-day counts by workType, for documents with status Completed.
Expected result:
{_id: "TESTWORK1" , value: [1,0,4,2,3] ,
_id: "TESTWORK2" , value: [3,9,,3,5],
_id : "TESTWORK3", value: [,,,3,5]}
Here is what I'm trying to do, but not sure how to get the expected result.
db.testcollection.aggregate([
{$match:{"status":"Completed"}},
{$project: {_id:0,
assignedSince:{$divide:[{$subtract:[new Date(),"$assignedDate"]},86400000]},
workType:1
}
},
{$match:{"assignedSince":{"$lte":5}}},
{$group : { _id:"$workType", test :{$push:{day:"$assignedSince"}}}}
])
Result: {_id:"TESTWORK1": test:[{5},{3}]}. Here I'm getting the day, but I need the count of each workType on that day.
Is there any easy way to do this? Any help would be really appreciated.
Try this:
db.testcollection.aggregate([
{
$match: { "status": "Completed" }
},
{
$project: {
_id: 0,
assignedDate: 1,
assignedSince: {
$toInt: {
$divide: [{ $subtract: [new Date(), "$assignedDate"] }, 86400000]
}
},
workType: 1
}
},
{
$match: { "assignedSince": { "$lte": 5 } }
},
{
$group: {
_id: {
workType: "$workType",
assignedDate: "$assignedDate"
},
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.workType",
values: { $push: "$count" }
}
}
]);
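Note that the pipeline above only produces a count for days that actually have documents, and $group does not guarantee the order of its output, so the pushed counts are not necessarily in day order. If a fixed-length array with zeros for empty days is needed, it can be built in application code. The sketch below is hypothetical: `countsPerWorkType` is an illustrative name, and it assumes five buckets where index 0 holds documents less than one day old.

```javascript
const MS_PER_DAY = 86400000;

// Hypothetical helper: per-workType counts for each of the last `days` days.
// Index 0 = less than one day old, index 1 = one to two days old, and so on.
function countsPerWorkType(docs, now, days = 5) {
  const buckets = new Map();
  for (const doc of docs) {
    if (doc.status !== "Completed") continue;
    const age = Math.floor((now - doc.assignedDate) / MS_PER_DAY);
    if (age < 0 || age >= days) continue; // outside the window
    if (!buckets.has(doc.workType)) {
      buckets.set(doc.workType, new Array(days).fill(0));
    }
    buckets.get(doc.workType)[age] += 1;
  }
  return [...buckets].map(([workType, value]) => ({ _id: workType, value }));
}

// Sample documents modelled on the question (one extra TESTWORK1 added).
const now = new Date("2021-02-22T12:00:00Z");
const sampleDocs = [
  { workType: "TESTWORK1", assignedDate: new Date("2021-02-22T00:00:00Z"), status: "Completed" },
  { workType: "TESTWORK2", assignedDate: new Date("2021-02-21T00:00:00Z"), status: "Completed" },
  { workType: "TESTWORK3", assignedDate: new Date("2021-02-20T00:00:00Z"), status: "InProgress" },
  { workType: "TESTWORK1", assignedDate: new Date("2021-02-20T00:00:00Z"), status: "Completed" },
];

console.log(JSON.stringify(countsPerWorkType(sampleDocs, now)));
// [{"_id":"TESTWORK1","value":[1,0,1,0,0]},{"_id":"TESTWORK2","value":[0,1,0,0,0]}]
```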

How can I get a sum of sums using MongoDB aggregation?

I would like to get the total (sum) of the totalHoursForYear value. Here is my current aggregation:
const byYear = await this.workHistoryModel.aggregate([
{
$group: {
_id: '$year',
history: {
$push: '$$ROOT'
},
totalHoursForYear: {
$sum: {
$add: [
{ $multiply: ['$eightHourDays', 8] },
{ $multiply: ['$tenHourDays', 10] },
{ $multiply: ['$twelveHourDays', 12] },
]
}
},
}
},
]);
Now I need to get a sum of the totalHoursForYear to get the total for all years. Is this possible? I can't seem to get the syntax correct. When I attempt to add another sum:
$group: {
...,
totalHours: {
$sum: {
$add: '$totalHoursForYear'
},
$push: '$$ROOT'
}
}
I get an error: "The field 'totalHours' must specify one accumulator"
When you need the total sum, you can add another $group with _id: null, which groups all documents together.
Append the following stages to your pipeline to get the total while keeping the same structure:
{
$group: {
_id: null,
data: {
$push: "$$ROOT"
},
total: {
$sum: "$totalHoursForYear"
}
}
},
{
"$unwind": "$data"
},
{
"$addFields": {
_id: "$data._id", // restore the year as _id
history: "$data.history",
totalHoursForYear: "$data.totalHoursForYear",
data: "$$REMOVE" // drop the temporary wrapper field
}
}
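Because the second $group pushes every year into a single document, that document is subject to the 16 MB BSON size limit. If the per-year results are already being returned to the application, the grand total can also be computed there. The sketch below assumes a hypothetical `byYear` result with made-up values and omits the `history` arrays for brevity.

```javascript
// Hypothetical output of the first $group stage (history arrays omitted).
const byYear = [
  { _id: 2019, totalHoursForYear: 1600 },
  { _id: 2020, totalHoursForYear: 1840 },
];

// Sum the per-year totals to get the total for all years.
const totalHours = byYear.reduce((acc, year) => acc + year.totalHoursForYear, 0);

// Attach the grand total to every row, mirroring the pipeline's output shape.
const withTotal = byYear.map((year) => ({ ...year, total: totalHours }));

console.log(totalHours);   // 3440
console.log(withTotal[0]); // { _id: 2019, totalHoursForYear: 1600, total: 3440 }
```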

Sum value when satisfy condition in MongoDB

I am trying to get the sum of values when a certain condition is satisfied in the document.
In the query below I want the sum of currentValue only when componentId = "ABC".
db.Pointnext_Activities.aggregate(
{ $project: {
_id: 0,
componentId:1,
currentValue:1
}
},
{ $group:
{ _id: "$componentId",
total: { $sum: "$currentValue" }
}
}
)
Please try this:
db.Pointnext_Activities.aggregate([{ $match: { componentId: 'ABC' } },
{
$group:
{
_id: "$componentId",
total: { $sum: "$currentValue" }
}
}, { $project: { 'componentId': '$_id', total: 1, _id: 0 } }])
If you just need the total value and don't care about componentId being returned, try this:
db.Pointnext_Activities.aggregate([{ $match: { componentId: 'ABC' } },
{
$group:
{
_id: "",
total: { $sum: "$currentValue" }
}
}, {$project :{total :1, _id:0}}])
In an aggregation it is ideal to always start with a filter operation, i.e. $match, so that only the needed documents are passed on to the later stages.
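If the other documents have to stay in the pipeline, for example to compute further totals in the same $group, the condition can instead be moved inside the accumulator with $cond. This is a sketch of that alternative, not part of the answer above; `conditionalSumPipeline`, `conditionalSum` and the sample `activities` are hypothetical.

```javascript
// Sketch: sum currentValue only for componentId "ABC", without a $match stage.
const conditionalSumPipeline = [
  {
    $group: {
      _id: null,
      total: {
        // Contribute currentValue when the condition holds, otherwise 0.
        $sum: { $cond: [{ $eq: ["$componentId", "ABC"] }, "$currentValue", 0] },
      },
    },
  },
  { $project: { _id: 0, total: 1 } },
];

// Hypothetical in-memory equivalent of the pipeline above.
function conditionalSum(docs, componentId) {
  return docs.reduce(
    (total, doc) => (doc.componentId === componentId ? total + doc.currentValue : total),
    0
  );
}

// Made-up sample data.
const activities = [
  { componentId: "ABC", currentValue: 10 },
  { componentId: "XYZ", currentValue: 5 },
  { componentId: "ABC", currentValue: 7 },
];

console.log(conditionalSum(activities, "ABC")); // 17
```

When only one component is of interest, the leading $match remains the simpler choice, since it reduces the number of documents that reach the $group stage.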