MongoDB - Calculate field based on previous item - mongodb

I have a tricky scenario: I need to calculate an extra field based on a value from the previous document. I have no idea how to do this in a performant manner. Any thoughts?
Data:
{
_id: 1,
score: 66,
created_at: "2021-04-01"
},
{
_id: 2,
score: 12,
created_at: "2021-04-03"
},
{
_id: 3,
score: 7,
created_at: "2021-04-06"
}
What I want to achieve
{
_id: 1,
score: 66,
total_score: 66 // The oldest item, so total score is the same as the current score
created_at: "2021-04-01"
},
{
_id: 2,
score: 12,
total_score: 78 // Sum of current score and previous total_score
created_at: "2021-04-03"
},
{
_id: 3,
score: 7,
total_score: 85 // Sum of current score and previous total_score
created_at: "2021-04-06"
}
Any insights appreciated.

You can try an aggregation query:
$lookup with a pipeline that matches every document whose _id is lower than the current _id, which selects all previous records
$group by null to get the sum of score over those previous records
$arrayElemAt to get the first element from the lookup result
$ifNull to return 0 when the lookup result is empty (the first document has no previous records), otherwise the looked-up sum
$add to sum the current score and the score returned by the lookup
db.collection.aggregate([
{
$lookup: {
from: "collection",
let: { id: "$_id" },
pipeline: [
{ $match: { $expr: { $gt: ["$$id", "$_id"] } } },
{
$group: {
_id: null,
score: { $sum: "$score" }
}
}
],
as: "total_score"
}
},
{
$addFields: {
total_score: {
$add: [
"$score",
{
$ifNull: [
{ $arrayElemAt: ["$total_score.score", 0] },
0
]
}
]
}
}
}
])
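On MongoDB 5.0 or later, the same running total can probably be computed in one pass with the $setWindowFields stage, which avoids running a $lookup for every document. The following is a minimal sketch under that version assumption. It orders by created_at rather than _id, which assumes created_at reflects the intended order. The runningTotals helper is a hypothetical plain-JavaScript reference, included only to illustrate what the window computes.

```javascript
// Sketch: cumulative sum with $setWindowFields (assumes MongoDB 5.0+).
const runningTotalPipeline = [
  {
    $setWindowFields: {
      sortBy: { created_at: 1 },
      output: {
        total_score: {
          $sum: "$score",
          // Window spans from the first document up to the current one.
          window: { documents: ["unbounded", "current"] }
        }
      }
    }
  }
];

// Hypothetical plain-JS reference of the same computation, for illustration only.
function runningTotals(docs) {
  let total = 0;
  return [...docs]
    .sort((a, b) => a.created_at.localeCompare(b.created_at))
    .map((doc) => {
      total += doc.score;
      return { ...doc, total_score: total };
    });
}
```

In mongosh this would be run as db.collection.aggregate(runningTotalPipeline).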

Related

I want to exclude results that contain a specific element from grouped results, how can I do that in mongoDB?

{
_id: ObjectId
details: array
0: Object
salesManagerId: ObjectId
1: Object
salesManagerId: ObjectId
createdAt: Date
},
{
_id: ObjectId
details: array
0: Object
salesManagerId: ObjectId
createdAt: Date
},
{
_id: ObjectId
details: array //no elements
createdAt: Date
}
The documents have the structure shown above.
What I want is:
_id A B C
20211020 30 11 8
20211019 15 14 11
20211018 23 3 0
A: The number of documents (_ids) per date, grouped by date.
B: Of the documents counted in A, the number whose details array has at least one element. A document with elements counts as 1, otherwise 0.
For example, the following document is counted in A but not in B:
{
_id: ObjectId
details: array //no elements
createdAt: Date
}
C: The documents counted in B, excluding those whose details.salesManagerId matches any ID in a given list. The list is provided as an array, for example [ObjectId(".."), ObjectId("..")].
I made query as follows.
db.sales.aggregate([
{
$group: {
_id: {
$dateToString: {
format: "%Y-%m-%d",
date: "$createdAt"
}
},
A: {
$sum: 1
},
B: {
$sum: {
$cond: [
{
$and: [
{
$isArray: "$details"
},
{
$gt: [
{
$size: "$details"
},
0
]
}
]
},
1,
0
]
}
}
}
},
{
$sort: {
_id: -1
}
}
])
It works up to B. How can I write the query to get C?
If you want to exclude a document if it has a certain nested field, you can use $exists operator.
Example:
You have these documents, and you want the ones that don't have salesManagerId as a nested field in details:
{
_id: ObjectId
details: array
0: Object
salesManagerId: ObjectId
createdAt: Date
},
{
_id: ObjectId
details: array //no elements
createdAt: Date
}
This should be your filter:
{"details.salesManagerId": {$exists: false}}
This is how I would go about your query above:
db.sales.aggregate([
{
$group: {
_id: {
$dateToString: {
format: "%Y-%m-%d",
date: "$createdAt",
},
},
A: {
$sum: 1,
},
B: {
$sum: {
$cond: [
{
$and: [
{
$isArray: "$details",
},
{
$gt: [
{
$size: "$details",
},
0,
],
},
],
},
1,
0,
],
},
},
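C: {
$sum: {
// $exists is a query operator and is not valid inside $cond; instead, check that
// no element of details carries a salesManagerId.
$cond: [{ $eq: [{ $size: { $ifNull: ["$details.salesManagerId", []] } }, 0] }, 1, 0],
},
},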
},
},
{
$sort: {
_id: -1,
},
},
]);
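If C should instead count the documents from B whose details contain none of a given list of salesManagerId values, one possible approach is to compare the IDs found in details against that list with $setIntersection. This is only a sketch: buildCAccumulator and excludedIds are hypothetical names, and it assumes details is either an array or missing.

```javascript
// Sketch: builds the accumulator for C.
// `excludedIds` is the caller-supplied array of salesManagerId values to exclude
// (hypothetical parameter name).
function buildCAccumulator(excludedIds) {
  return {
    $sum: {
      $cond: [
        {
          $and: [
            // details has at least one element (missing or null is treated as empty)
            { $gt: [{ $size: { $ifNull: ["$details", []] } }, 0] },
            // none of the salesManagerId values in details appears in excludedIds
            {
              $eq: [
                {
                  $size: {
                    $setIntersection: [
                      { $ifNull: ["$details.salesManagerId", []] },
                      excludedIds
                    ]
                  }
                },
                0
              ]
            }
          ]
        },
        1,
        0
      ]
    }
  };
}
```

Inside the $group stage it would be used as C: buildCAccumulator([ObjectId(".."), ObjectId("..")]).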

MongoDB aggregation, group by items for the last 5 days

I'm trying to get the result in some form using mongodb aggregation.
here is my sample document in the collection:
[{
"_id": "34243243243",
"workType": "TESTWORK1",
"assignedDate":ISODate("2021-02-22T00:00:00Z"),
"status":"Completed",
},
{
"_id": "34243243244",
"workType": "TESTWORK2",
"assignedDate":ISODate("2021-02-21T00:00:00Z"),
"status":"Completed",
},
{
"_id": "34243243245",
"workType": "TESTWORK3",
"assignedDate":ISODate("2021-02-20T00:00:00Z"),
"status":"InProgress",
}...]
I need to group the last 5 days of data into an array of counts per workType, for documents whose status is Completed.
Expected result:
{_id: "TESTWORK1" , value: [1,0,4,2,3] ,
_id: "TESTWORK2" , value: [3,9,,3,5],
_id : "TESTWORK3", value: [,,,3,5]}
Here is what I'm trying to do, but not sure how to get the expected result.
db.testcollection.aggregate([
{$match:{"status":"Completed"}},
{$project: {_id:0,
assignedSince:{$divide:[{$subtract:[new Date(),$assignedDate]},86400000]},
workType:1
}
},
{$match:{"assignedSince":{"lte":5}}},
{$group : { _id:"workType", test :{$push:{day:"$assignedSince"}}}}
])
result: {_id:"TESTWORK1": test:[{5},{3}]} - here I'm getting the day , but I need the count of the workTypes on that day.
Is there any easy way to do this? Any help would be really appreciated.
Try this:
db.testcollection.aggregate([
{
$match: { "status": "Completed" }
},
{
$project: {
_id: 0,
assignedDate: 1,
assignedSince: {
$toInt: {
$divide: [{ $subtract: [new Date(), "$assignedDate"] }, 86400000]
}
},
workType: 1
}
},
{
$match: { "assignedSince": { "$lte": 5 } }
},
{
$group: {
_id: {
workType: "$workType",
assignedDate: "$assignedDate"
},
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.workType",
values: { $push: "$count" }
}
}
]);
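The pipeline above pushes one count per date that actually has documents, so days without any completed work are simply missing from the array. If a fixed-length array with 0 for empty days is needed, as in the expected result, a sketch along the following lines might work. The names lastDaysPipeline, daysAgo and perDay are hypothetical, and index 0 of the resulting array corresponds to the most recent day.

```javascript
// Sketch: per-workType counts for each of the last `days` days, with 0 for
// days that have no matching documents. `now` is the reference Date.
function lastDaysPipeline(days, now) {
  const msPerDay = 86400000;
  return [
    { $match: { status: "Completed" } },
    {
      $project: {
        _id: 0,
        workType: 1,
        // Whole days elapsed since assignedDate (0 = within the last 24 hours)
        daysAgo: {
          $floor: { $divide: [{ $subtract: [now, "$assignedDate"] }, msPerDay] }
        }
      }
    },
    { $match: { daysAgo: { $gte: 0, $lt: days } } },
    // Count documents per workType and day
    {
      $group: {
        _id: { workType: "$workType", daysAgo: "$daysAgo" },
        count: { $sum: 1 }
      }
    },
    // Collect the per-day counts for each workType
    {
      $group: {
        _id: "$_id.workType",
        perDay: { $push: { daysAgo: "$_id.daysAgo", count: "$count" } }
      }
    },
    // Build a fixed-length array, substituting 0 where a day has no entry
    {
      $project: {
        value: {
          $map: {
            input: { $range: [0, days] },
            as: "d",
            in: {
              $let: {
                vars: {
                  hit: {
                    $arrayElemAt: [
                      {
                        $filter: {
                          input: "$perDay",
                          as: "p",
                          cond: { $eq: ["$$p.daysAgo", "$$d"] }
                        }
                      },
                      0
                    ]
                  }
                },
                in: { $ifNull: ["$$hit.count", 0] }
              }
            }
          }
        }
      }
    }
  ];
}
```

It would be run as db.testcollection.aggregate(lastDaysPipeline(5, new Date())).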

MongoDB Query Performance using $unwind

I have this MongoDB document structure:
_id: ObjectID('1234')
email: 'myEmail#gmail.com'
score: [20, 34, 45, 90...]
I want the global average across all documents, and this is what I came up with:
const globalMeta = await tests.aggregate([
{
$project: {
_id: 0,
score: 1
}
},
{
$unwind: "$score"
},
{
$group: {
_id: null,
average: { $avg: "$score" }
}
},
{
$project: {
_id: 0,
average: 1
}
}
]).toArray()
And the result is something like: average: 75
I am really concerned about performance. Would this method still work if you had, say, thousands of documents?
Try this (it is not the same calculation, since it averages the per-document averages, but it may give you some ideas):
db.tests.aggregate([
{ $project: { scores: { $avg: "$score"}, _id:0 } },
{ $group:{ _id:"Total" , FinalScoreAvg:{ $avg:"$scores"} }}
])
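Averaging the per-document averages only equals the global average when every document has the same number of scores. To get the exact global average without $unwind, one option is to carry a sum and a count per document and divide once at the end. The sketch below assumes every document has a score array and that at least one score exists overall; the two plain-JavaScript helpers are hypothetical and only illustrate the difference between the two calculations.

```javascript
// Sketch: exact global average without $unwind.
const globalAveragePipeline = [
  // Per document: sum of the score array and its length
  { $project: { _id: 0, sum: { $sum: "$score" }, count: { $size: "$score" } } },
  // Across documents: add up the sums and the counts
  { $group: { _id: null, sum: { $sum: "$sum" }, count: { $sum: "$count" } } },
  // Divide once at the end
  { $project: { _id: 0, average: { $divide: ["$sum", "$count"] } } }
];

// Hypothetical plain-JS illustration: average over every individual score.
function globalAverage(docs) {
  const all = docs.flatMap((doc) => doc.score);
  return all.reduce((a, b) => a + b, 0) / all.length;
}

// Hypothetical plain-JS illustration: average of the per-document averages.
function averageOfAverages(docs) {
  const means = docs.map(
    (doc) => doc.score.reduce((a, b) => a + b, 0) / doc.score.length
  );
  return means.reduce((a, b) => a + b, 0) / means.length;
}
```

For two documents with scores [10, 20] and [30], globalAverage gives 20 while averageOfAverages gives 22.5.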

How to group MongoDB data by property?

Similar questions have been asked multiple times, but I couldn't find an answer for a problem like mine.
My data looks like this:
{ _id: 1, status: 'unpaid', subtotal: 5000, total: 4750, fees: 250 },
{ _id: 2, status: 'received', subtotal: 5000, total: 4750, fees: 250 },
{ _id: 3, status: 'paidout', subtotal: 5000, total: 4750, fees: 250 },
{ _id: <id>, status: 'paidout', subtotal: 5000, total: 4750, fees: 250 },
{ _id: <id>, status: 'unpaid', subtotal: 5000, total: 4750, fees: 250 }
What I want is the sum of total for all records grouped by status, except that when the status is paidout the sum should be of subtotal. I also want the latest record with the status paidout.
This is my code right now:
const totals = await Payment.aggregate([
{
$match: {
_user: req.user._id,
}
},
{
$group: {
_id: '$status',
total: {
$sum: '$subtotal',
},
paidout: {
$sum: '$total'
},
lastPayout: {
$first: '$total'
}
}
}
])
This is the returned result:
[{ _id: 'paidout', total: 102000, paidout: 97962, lastPayout: 52825 },
{ _id: 'received', total: 60000, paidout: 57630, lastPayout: 57630 }]
Not quite the format I was looking for, so any help would be really appreciated!
Query :
db.collection.aggregate([
/** Sums `subtotal` when status is `paidout`, otherwise sums `total` */
/** Keeps the last document seen in each group when status is `paidout`; for any other status `latestDoc` will be null */
{
$group: {
_id: "$status",
totalAmount: { $sum: { $cond: [ { $eq: [ "$status", "paidout" ] }, "$subtotal", "$total" ] } },
latestDoc: { $last: { $cond: [ { $eq: [ "$status", "paidout" ] }, "$$ROOT", "$$REMOVE" ] } }
}
},
/** Stage to remove the `latestDoc` field where it is `null` */
{ $addFields: { latestDoc: { $ifNull: [ "$latestDoc", "$$REMOVE" ] } } }
])
Note :
latestDoc relies on the order in which the documents were inserted. If status can be updated on existing documents and you want the most recent document with status : 'paidout', maintain a timestamp field, sort on it before $group, and then take the last document as latestDoc.
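The note above can be sketched as follows. updatedAt is a hypothetical timestamp field and totalsPipeline a hypothetical helper name; substitute whatever field your documents actually maintain.

```javascript
// Sketch: sort by a timestamp before grouping so that $last is deterministic.
// `updatedAt` is a hypothetical field name, not part of the original data.
function totalsPipeline(userId) {
  return [
    { $match: { _user: userId } },
    // Oldest first, so $last picks the most recently updated document
    { $sort: { updatedAt: 1 } },
    {
      $group: {
        _id: "$status",
        totalAmount: {
          $sum: {
            $cond: [{ $eq: ["$status", "paidout"] }, "$subtotal", "$total"]
          }
        },
        latestDoc: {
          $last: {
            $cond: [{ $eq: ["$status", "paidout"] }, "$$ROOT", "$$REMOVE"]
          }
        }
      }
    },
    // Drop latestDoc where it is null
    { $addFields: { latestDoc: { $ifNull: ["$latestDoc", "$$REMOVE"] } } }
  ];
}
```

With Mongoose this would be called as Payment.aggregate(totalsPipeline(req.user._id)).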

Top documents per bucket

I would like to get the documents with the N highest fields for each of N categories. For example, the posts with the 3 highest scores from each of the past 3 months. So each month would have 3 posts that "won" for that month.
Here is what my work so far has gotten, simplified.
// simplified
db.posts.aggregate([
{$bucket: {
groupBy: "$createdAt",
boundaries: [
ISODate('2019-06-01'),
ISODate('2019-07-01'),
ISODate('2019-08-01')
],
default: "Other",
output: {
posts: {
$push: {
// ===
// This gets all the posts, bucketed by month
score: '$score',
title: '$title'
// ===
}
}
}
}},
{$match: {_id: {$ne: "Other"}}}
])
I attempted to use the $slice operator in between the // ===s, but got an error (below).
postResults: {
$each: [{
score: '$score',
title: '$title'
}],
$sort: {score: -1},
$slice: 2
}
An object representing an expression must have exactly one field: { $each: [ { score: \"$score\", title: \"$title\" } ], $sort: { baseScore: -1.0 }, $slice: 2.0 }
The $each, $sort and $slice modifiers you're trying to use belong to the $push update operator, not to the aggregation $push accumulator. To get the top N posts you need to $unwind the array, then $sort and $group again to get an ordered array. As a last step you can use $slice (aggregation). Try:
db.posts.aggregate([
{$bucket: {
groupBy: "$createdAt",
boundaries: [
ISODate('2019-06-01'),
ISODate('2019-07-08'),
ISODate('2019-08-01')
],
default: "Other",
output: {
posts: {
$push: {
score: '$score',
title: '$title'
}
}
}
}},
{ $match: {_id: {$ne: "Other"}}},
{ $unwind: "$posts" },
{ $sort: { "posts.score": -1 } },
{ $group: { _id: "$_id", posts: { $push: { "score": "$posts.score", "title": "$posts.title" } } } },
{ $project: { _id: 1, posts: { $slice: [ "$posts", 3 ] } } }
])
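On MongoDB 5.2 or later, the $topN accumulator may make the $unwind, $sort and $group steps unnecessary, since it can select the highest-scoring posts directly inside $bucket. The sketch below assumes that server version and uses plain Date objects for monthly boundaries; the variable name topPostsPerMonth is hypothetical.

```javascript
// Sketch: top 3 posts per month with $topN (assumes MongoDB 5.2+).
const topPostsPerMonth = [
  {
    $bucket: {
      groupBy: "$createdAt",
      boundaries: [
        new Date("2019-06-01"),
        new Date("2019-07-01"),
        new Date("2019-08-01")
      ],
      default: "Other",
      output: {
        posts: {
          $topN: {
            n: 3,
            // Highest score first
            sortBy: { score: -1 },
            output: { score: "$score", title: "$title" }
          }
        }
      }
    }
  },
  // Drop documents that fell outside the listed boundaries
  { $match: { _id: { $ne: "Other" } } }
];
```

In mongosh this would be run as db.posts.aggregate(topPostsPerMonth).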