Use case: I have the following data:
{"accountNumber":"1-1", "details":[{"version":{ "number": "1", "accountGroup":"1", "editable":"false", "amount":100 }}]}
{"accountNumber":"1-2", "details":[{"version":{ "number": "2", "accountGroup":"1", "editable":"false", "amount":200 }}]}
{"accountNumber":"2-1", "details":[{"version":{ "number": "1", "accountGroup":"2", "editable":"false", "amount":200 }}]}
Where: my document is an account. Each record has an accountGroup (1, 2). A group can have multiple versions. The accountNumber is formed by combining the accountGroup and the version number.
I want to get the latest version of each account group (accountNumber 1-2 and 2-1) along with the sum of their amounts.
Expected output:
{accountNumber: 2-1}, {accountNumber: 1-2}, total: 400 (the sum of the amounts of the latest version in each account group)
I am using the following query:
db.getCollection('account').aggregate([
{ "$sort": { "accountNumber": 1 } },
{ "$unwind": "$details"},
{ "$group": {
"_id": "$details.version.accountGroup",
"Latestversion": { "$last": "$$ROOT" },
"total": {
$sum: "$details.version.amount"
}
}
}])
It returns the sum over all the versions that belong to a group, not just the latest one.
Current output:
{"accountNumber": "1-2", total: 300}, {"accountNumber":"2-1", total: 200}
I am new to MongoDB, so any help is appreciated. Looking forward to a response.
You will need two $group stages.
The first $group finds the latest document for each account group, and the second $group sums the amounts from those latest documents.
Something like:
aggregate([
{ "$sort": { "accountNumber": 1 } },
{ "$unwind": "$details"},
{ "$group": {
"_id": "$details.version.accountGroup",
"latest": { "$last": "$$ROOT" }
}
},
{ "$group": {
"_id": null,
"accountNumbers": { $push:"$latest.accountNumber" },
"total": { $sum: "$latest.details.version.amount" }
}
}
])
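To check the logic of the two $group stages without a server, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: latestTotals is a hypothetical helper that imitates $sort, $unwind, $group with $last, and the final $group, and it assumes the details array holds objects with a version key.

```javascript
// Sample documents from the question (details written as an array of objects).
const accounts = [
  { accountNumber: "1-1", details: [{ version: { number: "1", accountGroup: "1", editable: "false", amount: 100 } }] },
  { accountNumber: "1-2", details: [{ version: { number: "2", accountGroup: "1", editable: "false", amount: 200 } }] },
  { accountNumber: "2-1", details: [{ version: { number: "1", accountGroup: "2", editable: "false", amount: 200 } }] },
];

// Hypothetical helper that imitates the pipeline stages in memory.
function latestTotals(docs) {
  // $sort + $unwind: order by accountNumber (string comparison), one row per details element.
  const rows = [...docs]
    .sort((a, b) => (a.accountNumber < b.accountNumber ? -1 : a.accountNumber > b.accountNumber ? 1 : 0))
    .flatMap(doc => doc.details.map(d => ({ accountNumber: doc.accountNumber, version: d.version })));

  // First $group: keep the last row seen for each accountGroup ($last).
  const latest = new Map();
  for (const row of rows) latest.set(row.version.accountGroup, row);

  // Second $group (_id: null): collect the account numbers and sum the amounts.
  const result = { accountNumbers: [], total: 0 };
  for (const row of latest.values()) {
    result.accountNumbers.push(row.accountNumber);
    result.total += row.version.amount;
  }
  return result;
}

console.log(latestTotals(accounts));
```

Note that accountNumber is a string, so the sort is lexicographic: "1-10" sorts before "1-2". If a group can reach ten or more versions, sorting on a numeric version field would be safer.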
You can also change your structure to the one below and drop the $unwind stage.
{"accountNumber":"1-1", detail:{"number": "1","accountGroup":"1", "editable":"false" , "amount":100 }}
I have a mongodb collection:
{
_id: ... ,
userId: ...,
gadgets: 3
}
I need to find all users who have at least 5 records in the collection. For every such user, I then need to sort the number of gadgets in descending order and get the 5th value. I am still on MongoDB 4.4, so I can't use $sortArray.
For example, if user has records with 14, 10, 7, 7, 5 gadgets (in any order), result should be:
{
userId: ....
qualifyingGadgets: 5
}
I don't have a MongoDB server at version 4.4 to test, but I think you'll have all these aggregation operations available.
db.users.aggregate([
{
"$group": {
"_id": "$userId",
"gadgets": {"$push": "$gadgets"}
}
},
{
"$match": {
"$expr": {
"$gte": [{"$size": "$gadgets"}, 5]
}
}
},
{
"$unwind": "$gadgets"
},
{
"$sort": {"gadgets": -1}
},
{
"$group": {
"_id": "$_id",
"gadgets": {"$push": "$gadgets"}
}
},
{
"$project": {
"_id": 0,
"userId": "$_id",
"qualifyingGadgets": {"$arrayElemAt": ["$gadgets", 4]}
}
}
])
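As a sanity check of what the pipeline is meant to compute, here is a small in-memory sketch in plain JavaScript. The records and the fifthLargestGadgets helper are hypothetical; user "a" carries the gadget counts from the example and user "b" has fewer than 5 records.

```javascript
// Hypothetical records: user "a" has 14, 10, 7, 7, 5 (in mixed order), user "b" has only 2 records.
const records = [
  { userId: "a", gadgets: 7 }, { userId: "a", gadgets: 14 }, { userId: "a", gadgets: 5 },
  { userId: "a", gadgets: 10 }, { userId: "a", gadgets: 7 },
  { userId: "b", gadgets: 9 }, { userId: "b", gadgets: 3 },
];

// Hypothetical helper that imitates: $group/$push, $match on $size, sort descending, $arrayElemAt 4.
function fifthLargestGadgets(docs) {
  // Collect every gadget count per user.
  const byUser = new Map();
  for (const { userId, gadgets } of docs) {
    if (!byUser.has(userId)) byUser.set(userId, []);
    byUser.get(userId).push(gadgets);
  }

  const out = [];
  for (const [userId, counts] of byUser) {
    if (counts.length < 5) continue;   // keep only users with at least 5 records
    counts.sort((x, y) => y - x);      // descending order
    out.push({ userId, qualifyingGadgets: counts[4] }); // 5th value (index 4)
  }
  return out;
}

console.log(fifthLargestGadgets(records));
```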
Try it on mongoplayground.net.
I have the collection below, and I need to find the date-wise total cost and the sum of all costs in this collection. I can find the total cost for a day, but I failed to get the sum of all costs in the collection.
[{
"date":"12-2-2015",
"cost":100
},
{
"date":"13-2-2015",
"cost":10
},
{
"date":"12-2-2015",
"cost":40
},
{
"date":"13-2-2015",
"cost":30
},
{
"date":"13-2-2015",
"cost":80
}]
I can find output like
[{
"day": "12-2-2015",
"cost": 140
},{
"day": "13-2-2015",
"cost": 120
}]
But I want output like this.
{
"day": "12-2-2015",
"cost": 140,
"total": 260
}
Use this aggregate. I didn't add a $match stage; you could add one to match a date.
db.collection.aggregate([
{
$group: {
_id: null,
orig: {
$push: "$$ROOT"
},
"total": {
$sum: "$cost"
},
}
},
{
$unwind: "$orig"
},
{
$project: {
date: "$orig.date",
cost: "$orig.cost",
total: "$total"
}
},
{
$group: {
_id: "$date",
cost: {
$sum: "$cost"
},
orig: {
$push: "$$ROOT.total"
}
},
},
{
"$unwind": "$orig"
},
{
$group: {
_id: {
_id: "$_id",
cost: "$cost",
total: "$orig"
},
},
},
{
$project: {
date: "$_id._id",
"cost": "$_id.cost",
total: "$_id.total",
_id: 0
}
}
])
https://mongoplayground.net/p/eN-pDg2Zz7u
It is like 2 queries.
There are 3 solutions that I can think of:
1. 2 queries (works no matter the collection size).
2. 1 query with $facet (the solution below): group, and pack each group in an array. Limitation: the number of groups (distinct dates) must be small enough to fit in 1 array within the 16MB document limit, which I think holds for something like 200,000 distinct days.
3. 1 query without $facet: for example, group and pack the whole collection into 1 array. Limitation: the whole collection must fit in 100MB of memory, because of $push.
*These limits are based on my own understanding, so they may not be exact.
Query
Test code here
db.collection.aggregate([
{
"$facet": {
"total": [
{
"$group": {
"_id": null,
"total": {
"$sum": "$cost"
}
}
}
],
"coll": [
{
"$group": {
"_id": "$date",
"cost": {
"$sum": "$cost"
}
}
}
]
}
},
{
"$unwind": {
"path": "$coll"
}
},
{
"$project": {
"total": {
"$let": {
"vars": {
"t": {
"$arrayElemAt": [
"$total",
0
]
}
},
"in": "$$t.total"
}
},
"date": "$coll._id",
"cost": "$coll.cost"
}
}
])
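To see what the $facet pipeline should return for the sample data in the question, here is a minimal in-memory sketch in plain JavaScript. It does not run against MongoDB; perDayWithTotal is a hypothetical helper that imitates the two facets and the final $project.

```javascript
// Sample documents from the question.
const costs = [
  { date: "12-2-2015", cost: 100 },
  { date: "13-2-2015", cost: 10 },
  { date: "12-2-2015", cost: 40 },
  { date: "13-2-2015", cost: 30 },
  { date: "13-2-2015", cost: 80 },
];

// Hypothetical helper that imitates the "total" facet, the "coll" facet, and the $project.
function perDayWithTotal(docs) {
  // "total" facet: sum of every cost in the collection.
  const total = docs.reduce((sum, d) => sum + d.cost, 0);

  // "coll" facet: sum of cost per date.
  const perDay = new Map();
  for (const d of docs) perDay.set(d.date, (perDay.get(d.date) || 0) + d.cost);

  // $unwind + $project: one output row per date, each carrying the overall total.
  return [...perDay].map(([date, cost]) => ({ date, cost, total }));
}

console.log(perDayWithTotal(costs));
```

Each date gets its own sum (140 and 120 for the sample data) and every row repeats the overall total of 260.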
I would do one query to get a cursor, then iterate the cursor while summing the total cost and pushing each group into an array, and finally add the total to each group. This way you perform only one query against MongoDB and let your application server do the rest, while keeping the code simple.
// 1. Fetch the groups
const grouped = db.data.aggregate([
{ $group: {
_id: "$date",
cost: { $sum: "$cost" }
}}
]);
// 2. Iterate the cursor, push the results into an array while summing the total cost
let total = 0;
const result = [];
grouped.forEach(group => {
total += group.cost;
result.push(group); // push as much as your limit
});
// 3. Add total to each group
result.forEach(group => group.total = total);
So I have this JSON file:
{"_id":190,"name":"Adrien Renda","scores":[{"score":64.16109192679477,"type":"exam"},{"score":66.93730600935531,"type":"quiz"},{"score":96.0560340227047,"type":"homework"}]}
{"_id":191,"name":"Efrain Claw","scores":[{"score":94.67153825229884,"type":"exam"},{"score":82.30087932110595,"type":"quiz"},{"score":75.86075840047938,"type":"homework"}]}
{"_id":192,"name":"Len Treiber","scores":[{"score":39.19832917406515,"type":"exam"},{"score":98.71679252899352,"type":"quiz"},{"score":44.8228929481132,"type":"homework"}]}
{"_id":193,"name":"Mariela Sherer","scores":[{"score":47.67196715489599,"type":"exam"},{"score":41.55743490493954,"type":"quiz"},{"score":70.4612811769744,"type":"homework"}]}
{"_id":194,"name":"Echo Pippins","scores":[{"score":18.09013691507853,"type":"exam"},{"score":35.00306967250408,"type":"quiz"},{"score":80.17965154316731,"type":"homework"}]}
{"_id":195,"name":"Linnie Weigel","scores":[{"score":52.44578368517977,"type":"exam"},{"score":90.7775054046383,"type":"quiz"},{"score":11.75008382913026,"type":"homework"}]}
{"_id":196,"name":"Santiago Dollins","scores":[{"score":52.04052571137036,"type":"exam"},{"score":33.63300076481705,"type":"quiz"},{"score":78.79257377604428,"type":"homework"}]}
{"_id":197,"name":"Tonisha Games","scores":[{"score":38.51269589995049,"type":"exam"},{"score":31.16287577231703,"type":"quiz"},{"score":79.15856355963004,"type":"homework"}]}
{"_id":198,"name":"Timothy Harrod","scores":[{"score":11.9075674046519,"type":"exam"},{"score":20.51879961777022,"type":"quiz"},{"score":64.85650354990375,"type":"homework"}]}
{"_id":199,"name":"Rae Kohout","scores":[{"score":82.11742562118049,"type":"exam"},{"score":49.61295450928224,"type":"quiz"},{"score":28.86823689842918,"type":"homework"}]}
in a MongoDB collection, and I'm trying to read the maximum and minimum score of the last 5 students and display them. I'm using mongolite in RStudio, and I've tried this:
res2 = con$aggregate(
'[{"$group":{"_id": "$_id", "MaxScore": {"$max": "$scores.score"}, "MinScore":{"$min":"$scores.score"}}},
{ "$sort" : { "_id" : -1} },
{"$limit": 5}
]'
)
The sorting and the limit work just fine, but the scores come out wrong. I'm guessing it's because they're embedded documents, but I have no idea how to fix it.
This is the end result of the above command
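You don't need a $group stage to calculate the $max / $min scores; you can calculate them in a $project stage. In $group, $max and $min compare the array produced by "$scores.score" as a whole instead of looking inside it, which is why the scores come out wrong, whereas in $project they operate on the array's elements: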
db.collection.aggregate([
{
"$project": {
"_id": 1,
"MaxScore": {
"$max": "$scores.score"
},
"MinScore": {
"$min": "$scores.score"
}
}
},
{
"$sort": {
"_id": -1
}
},
{
"$limit": 5
}
])
MongoPlayground
If you want the $group version to work, just add an $unwind stage before the $group stage, like below:
db.collection.aggregate([
{
$unwind: "$scores"
},
{
$group: {
_id: "$_id",
MaxScore: {
$max: "$scores.score"
},
MinScore: {
$min: "$scores.score"
}
}
},
{
"$sort": {
"_id": -1
}
},
{
"$limit": 5
}
])
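To verify the numbers either pipeline should produce, here is a small in-memory sketch in plain JavaScript using the last six students from the question. It is not mongolite or MongoDB code; lastFiveMinMax is a hypothetical helper that imitates computing the max and min per student, then the $sort and $limit.

```javascript
// The last six students from the question (names omitted for brevity).
const students = [
  { _id: 194, scores: [{ score: 18.09013691507853, type: "exam" }, { score: 35.00306967250408, type: "quiz" }, { score: 80.17965154316731, type: "homework" }] },
  { _id: 195, scores: [{ score: 52.44578368517977, type: "exam" }, { score: 90.7775054046383, type: "quiz" }, { score: 11.75008382913026, type: "homework" }] },
  { _id: 196, scores: [{ score: 52.04052571137036, type: "exam" }, { score: 33.63300076481705, type: "quiz" }, { score: 78.79257377604428, type: "homework" }] },
  { _id: 197, scores: [{ score: 38.51269589995049, type: "exam" }, { score: 31.16287577231703, type: "quiz" }, { score: 79.15856355963004, type: "homework" }] },
  { _id: 198, scores: [{ score: 11.9075674046519, type: "exam" }, { score: 20.51879961777022, type: "quiz" }, { score: 64.85650354990375, type: "homework" }] },
  { _id: 199, scores: [{ score: 82.11742562118049, type: "exam" }, { score: 49.61295450928224, type: "quiz" }, { score: 28.86823689842918, type: "homework" }] },
];

// Hypothetical helper: per-student max/min, then sort by _id descending and keep 5.
function lastFiveMinMax(docs) {
  return docs
    .map(doc => {
      const values = doc.scores.map(s => s.score); // plays the role of "$scores.score"
      return { _id: doc._id, MaxScore: Math.max(...values), MinScore: Math.min(...values) };
    })
    .sort((a, b) => b._id - a._id) // { "$sort": { "_id": -1 } }
    .slice(0, 5);                  // { "$limit": 5 }
}

console.log(lastFiveMinMax(students));
```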
MongoPlayground
I'm learning aggregate in mongodb. I'm working with the collection:
{
"body" : "",
"email" : "oJJFLCfA#qqlBNdpY.com",
"author" : "Linnie Weigel"
},
{
"body" : "",
"email" : "ptHfegMX#WgxhlEeV.com",
"author" : "Dinah Sauve"
},
{
"body" : "",
"email" : "kfPmikkG#SBxfJifD.com",
"author" : "Zachary Langlais"
},
{
"body" : "",
"email" : "gqEMQEYg#iiBqZCez.com",
"author" : "Jesusa Rickenbacker"
}
]
I am trying to obtain the number of bodies (comments) for each author. But when I run the aggregate with $sum, the result is 1 (because the structure has only one element). How can I do this operation? I tried $addToSet, but I don't know how to get at each element of the collection and perform the operation on it.
In order to count the comments by each author you want to $group by that author and $sum the occurrences. Basically just a "$sum: 1" operation. But it seems like you have "comments" as an array here based on your own comments and the closing bracket on your partial data listing. For that you need to process with $unwind first:
db.collection.aggregate([
{ "$unwind": "$comments" },
{ "$group": {
"_id": "$comments.author",
"count": { "$sum": 1 }
}}
])
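To make the $unwind plus $group step concrete, here is a minimal in-memory sketch in plain JavaScript. The posts array is hypothetical (it assumes each document carries a comments array, as guessed above), and commentsPerAuthor is a hypothetical helper, not a MongoDB API.

```javascript
// Hypothetical blog posts, each with a "comments" array; author names taken from the question.
const posts = [
  { _id: 1, comments: [{ author: "Linnie Weigel" }, { author: "Dinah Sauve" }, { author: "Linnie Weigel" }] },
  { _id: 2, comments: [{ author: "Dinah Sauve" }, { author: "Zachary Langlais" }] },
];

// Hypothetical helper that imitates $unwind on comments, then $group by author with { $sum: 1 }.
function commentsPerAuthor(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const comment of doc.comments) {             // $unwind: visit every comment
      counts.set(comment.author, (counts.get(comment.author) || 0) + 1); // $sum: 1
    }
  }
  return [...counts].map(([author, count]) => ({ _id: author, count }));
}

console.log(commentsPerAuthor(posts));
```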
That will obtain the total of all author comments by author for the entire collection. If you were just after getting the total comments by author per document ( or what looks like a blog post model ) then you use the document _id as part of the group statement:
db.collection.aggregate([
{ "$unwind": "$comments" },
{ "$group": {
"_id": {
"_id": "$_id",
"author": "$comments.author"
},
"count": { "$sum": 1 }
}}
])
And if you then want the summary of author counts per document with just a single document returned with all the authors in an array, then use $addToSet from here, with another $group pipeline stage:
db.collection.aggregate([
{ "$unwind": "$comments" },
{ "$group": {
"_id": {
"_id": "$_id",
"author": "$comments.author"
},
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": "$_id._id",
"comments": {
"$addToSet": {
"author": "$_id.author",
"count": "$count"
}
}
}}
])
But really, the author values are already unique and "sets" are not ordered in any way, so you might change this using $push after first introducing a $sort to have the list ordered by the number of comments made:
db.collection.aggregate([
{ "$unwind": "$comments" },
{ "$group": {
"_id": {
"_id": "$_id",
"author": "$comments.author"
},
"count": { "$sum": 1 }
}},
{ "$sort": { "_id._id": 1, "count": -1 } },
{ "$group": {
"_id": "$_id._id",
"comments": {
"$push": {
"author": "$_id.author",
"count": "$count"
}
}
}}
])
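For the per-document summary, the intended result can be sketched in memory with plain JavaScript. The postsWithComments array and the authorCountsPerPost helper are hypothetical; the helper imitates grouping by document and author, sorting by count descending, and pushing into one array per document.

```javascript
// Hypothetical blog posts, each with a "comments" array.
const postsWithComments = [
  { _id: 1, comments: [{ author: "Dinah Sauve" }, { author: "Linnie Weigel" }, { author: "Linnie Weigel" }] },
  { _id: 2, comments: [{ author: "Zachary Langlais" }] },
];

// Hypothetical helper: per post, count comments per author and list authors by count, highest first.
function authorCountsPerPost(docs) {
  return docs.map(doc => {
    const counts = new Map();
    for (const comment of doc.comments) {
      counts.set(comment.author, (counts.get(comment.author) || 0) + 1);
    }
    const comments = [...counts]
      .map(([author, count]) => ({ author, count }))
      .sort((a, b) => b.count - a.count); // like { "$sort": { "count": -1 } } before $push
    return { _id: doc._id, comments };
  });
}

console.log(JSON.stringify(authorCountsPerPost(postsWithComments), null, 2));
```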
I have MongoDB collection that stores documents in this format:
"name" : "Username",
"timeOfError" : ISODate("...")
I'm using this collection to keep track of who got an error and when it occurred.
What I want to do now is create a query that retrieves errors per user, per month or something similar. Something like this:
{
"result": [
{
"_id": "$name",
"errorsPerMonth": [
{
"month": "0",
"errorsThisMonth": 10
},
{
"month": "1",
"errorsThisMonth": 20
}
]
}
]
}
I have tried several different queries, but none have given the desired result. The closest result came from this query:
db.collection.aggregate(
[
{
$group:
{
_id: { $month: "$timeOfError"},
name: { $push: "$name" },
totalErrorsThisMonth: { $sum: 1 }
}
}
]
);
The problem here is that the $push just adds the username for each error. So I get an array with duplicate names.
You need a compound _id value in $group:
db.collection.aggregate([
{ "$group": {
"_id": {
"name": "$name",
"month": { "$month": "$timeOfError" }
},
"totalErrors": { "$sum": 1 }
}}
])
The _id is essentially the "grouping key", so whatever elements you want to group by need to be a part of that.
If you want a different order then you can change the grouping key precedence:
db.collection.aggregate([
{ "$group": {
"_id": {
"month": { "$month": "$timeOfError" },
"name": "$name"
},
"totalErrors": { "$sum": 1 }
}}
])
Or, if you want a specific order, or your pipeline has other conditions on different fields, just add a $sort pipeline stage at the end:
db.collection.aggregate([
{ "$group": {
"_id": {
"month": { "$month": "$timeOfError" },
"name": "$name"
},
"totalErrors": { "$sum": 1 }
}},
{ "$sort": { "_id.name": 1, "_id.month": 1 } }
])
Where you can essentially $sort on whatever you want.
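The question asked for one document per user with an errorsPerMonth array. In the pipeline that would presumably mean a second $group on "$_id.name" that pushes each month and its count. Here is a minimal in-memory sketch of that reshaping in plain JavaScript; the sample errors and the errorsPerUserPerMonth helper are hypothetical, and months are numbered 1 to 12 as $month does.

```javascript
// Hypothetical error documents in the shape described in the question.
const errors = [
  { name: "alice", timeOfError: new Date("2015-01-10T08:00:00Z") },
  { name: "alice", timeOfError: new Date("2015-01-22T09:30:00Z") },
  { name: "alice", timeOfError: new Date("2015-02-03T12:00:00Z") },
  { name: "bob", timeOfError: new Date("2015-02-14T17:45:00Z") },
];

// Hypothetical helper: group by (name, month), then regroup by name into an array.
function errorsPerUserPerMonth(docs) {
  // First grouping: compound key of name and month, counting occurrences.
  const counts = new Map();
  for (const { name, timeOfError } of docs) {
    const month = timeOfError.getUTCMonth() + 1; // $month returns 1-12
    const key = JSON.stringify([name, month]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  // Second grouping: one entry per name, pushing each month's count into an array.
  const byName = new Map();
  for (const [key, totalErrors] of counts) {
    const [name, month] = JSON.parse(key);
    if (!byName.has(name)) byName.set(name, []);
    byName.get(name).push({ month, errorsThisMonth: totalErrors });
  }
  return [...byName].map(([name, errorsPerMonth]) => ({ _id: name, errorsPerMonth }));
}

console.log(JSON.stringify(errorsPerUserPerMonth(errors), null, 2));
```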