Calculating the sum of specific fields from a complex array object - mongodb

I would like to migrate one of my Firebase projects to MongoDB and move the calculations from the server side to the database. I have already written most of the queries, but this one is beyond my knowledge.
Player data is saved by week, and I need to calculate the sum of donations and points for each player (the rest of the fields should be ignored).
PS: Some of the players are already banned, so it would be enough to calculate the fields for a given set of players (like: tag in ['playerId1', 'playerId2', ...]). If that is too complex, I will do this filtering later on the server side.
[
{
"week":"2021-01",
"players":[
{
"donations":20,
"games":3,
"name":"Player1",
"points":258,
"tag":"playerId1"
},
{
"donations":37,
"games":5,
"name":"Player2",
"points":634,
"tag":"playerId2"
},
{ ... }
]
},
{
"week":"2021-02",
"players":[ { ... } ]
}
]
So the result should be something like this:
[
{
"name":"Player1",
"tag":"playerId1",
"donations":90,
"points":980
},
{
"name":"Player2",
"tag":"playerId2",
"donations":80,
"points":1211
}
]
I think the $unwind and the $group operators could be the key but I can't figure out how to use them properly here.

$unwind deconstructs the players array into one document per player entry
$group groups by name, sums donations and points, and keeps the first tag
$project shapes the output to the required fields
db.collection.aggregate([
{ $unwind: "$players" },
{
$group: {
_id: "$players.name",
donations: { $sum: "$players.donations" },
points: { $sum: "$players.points" },
tag: { $first: "$players.tag" }
}
},
{
$project: {
_id: 0,
name: "$_id",
points: 1,
tag: 1,
donations: 1
}
}
])
PS: Some of the players are already banned, so it would be enough to calculate the fields for a given set of players (like: tag in ['playerId1', 'playerId2', ...]).
You can add a $match stage right after the $unwind stage:
{ $match: { "players.tag": { $in: ['playerId1', 'playerId2', ..more] } } }
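To sanity-check what the $unwind, $match and $group stages compute, the same logic can be sketched in plain JavaScript. This is only an illustration, not the pipeline itself: the data for week 2021-02 is elided in the question, so the second week below is made up, and sumByPlayer is a hypothetical helper name. It keys the totals by tag rather than name, on the assumption that the tag is the stable identifier.

```javascript
// Hypothetical sample data: week 2021-02 is elided in the question,
// so its numbers here are invented for illustration.
const weeks = [
  { week: "2021-01", players: [
    { donations: 20, games: 3, name: "Player1", points: 258, tag: "playerId1" },
    { donations: 37, games: 5, name: "Player2", points: 634, tag: "playerId2" }
  ] },
  { week: "2021-02", players: [
    { donations: 5, games: 1, name: "Player1", points: 100, tag: "playerId1" },
    { donations: 9, games: 2, name: "Banned", points: 50, tag: "playerId9" }
  ] }
];

// Mirrors $unwind -> $match ($in on tag) -> $group ($sum) in plain JavaScript.
function sumByPlayer(docs, allowedTags) {
  const allowed = new Set(allowedTags);
  const totals = new Map();
  for (const doc of docs) {
    for (const p of doc.players) {            // $unwind: one entry at a time
      if (!allowed.has(p.tag)) continue;      // $match: skip players not in the set
      const t = totals.get(p.tag) ||
        { name: p.name, tag: p.tag, donations: 0, points: 0 };
      t.donations += p.donations;             // $sum of donations
      t.points += p.points;                   // $sum of points
      totals.set(p.tag, t);
    }
  }
  return [...totals.values()];
}

const totals = sumByPlayer(weeks, ["playerId1", "playerId2"]);
console.log(totals);
```

In the pipeline itself, placing the $match immediately after $unwind (as suggested above) keeps the banned players out of the $group stage entirely.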

You were right; $unwind followed by $group does it:
db.collection.aggregate([
{//Denormalize
"$unwind": "$players"
},
{//Group by name
"$group": {
"_id": "$players.name",
"donations": {
"$sum": "$players.donations"
},
"points": {
"$sum": "$players.points"
},
}
}
])
You can add a $project stage if you really need the key to be called name rather than _id.

Related

How do I sort results based on a specific array item in MongoDB?

I have an array of documents that looks like this:
patient: {
conditions: [
{
columnToSortBy: "value",
type: "PRIMARY"
},
{
columnToSortBy: "anotherValue",
type: "SECONDARY"
},
]
}
I need to be able to $sort by columnToSortBy, but using the item in the array where type is equal to PRIMARY. PRIMARY is not guaranteed to be the first item in the array every time.
How do I set my $sort up to accommodate this? Is there something akin to:
// I know this is invalid. It's for illustration purposes
$sort: "columnToSortBy", {$where: {type: "PRIMARY"}}
Is it possible to sort a field, but only when another field matches a query? I do not want the secondary conditions to affect the sort in any way. I am sorting on that one specific element alone.
You need to use the aggregation framework:
db.collection.aggregate([
{
$unwind: "$patient.conditions" //reshape the data
},
{
"$sort": {
"patient.conditions.columnToSortBy": -1 //sort it
}
},
{
$group: {
"_id": "$_id",
"conditions": { //re group it
"$push": "$patient.conditions"
}
}
},
{
"$project": { //project it
"_id": 1,
"patient.conditions": "$conditions"
}
}
])
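Note that the pipeline above orders the elements inside each document's conditions array. If the goal is instead to order the documents themselves by the columnToSortBy of their PRIMARY condition, one possible approach (a sketch that has not been run against a server, and is not part of the original answer) is to copy that value into a temporary field using $filter and $arrayElemAt, sort on it, and then drop it. The field name primarySortKey and the helper names are hypothetical, and the sample documents are invented.

```javascript
// Candidate pipeline (a sketch); pass it to db.collection.aggregate(pipeline).
// primarySortKey is a hypothetical temporary field name.
const pipeline = [
  { $addFields: { primarySortKey: { $let: {
      vars: { primary: { $arrayElemAt: [
        { $filter: { input: "$patient.conditions", as: "c",
                     cond: { $eq: ["$$c.type", "PRIMARY"] } } },
        0
      ] } },
      in: "$$primary.columnToSortBy"
  } } } },
  { $sort: { primarySortKey: 1 } },
  { $project: { primarySortKey: 0 } }
];

// Plain-JavaScript equivalent of the same idea, runnable without a database.
function primaryKey(doc) {
  const primary = doc.patient.conditions.find(c => c.type === "PRIMARY");
  return primary ? primary.columnToSortBy : undefined;
}

function sortByPrimary(docs) {
  return [...docs].sort((a, b) => {
    const ka = primaryKey(a), kb = primaryKey(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
}

// Invented sample documents; PRIMARY is not always the first element.
const patients = [
  { _id: 1, patient: { conditions: [
    { columnToSortBy: "a", type: "SECONDARY" },
    { columnToSortBy: "z", type: "PRIMARY" } ] } },
  { _id: 2, patient: { conditions: [
    { columnToSortBy: "m", type: "PRIMARY" },
    { columnToSortBy: "zz", type: "SECONDARY" } ] } }
];

const sorted = sortByPrimary(patients);
console.log(sorted.map(d => d._id));
```

The SECONDARY values never take part in the comparison, which is what the question asks for.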

Count nested wildcard array mongodb query

I have the following data of users and model cars:
[
{
"user_id":"ebebc012-082c-4e7f-889c-755d2679bdab",
"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d":1,
"car_37c04124-cb12-436c-902b-6120f4c51782":0,
"car_b78ddcd0-1136-4f45-8599-3ce8d937911f":1
},
{
"user_id":"f3eb2a61-5416-46ba-bab4-459fbdcc7e29",
"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d":1,
"car_0d15eae9-9585-4f49-a416-46ff56cd3685":1
}
]
I want to see how many users have a car_ with the value 1 using mongodb, something like:
{"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 2}
For this example.
The issue is that I never know in advance what the car_ field names will be; they have a random structure (wildcard).
Notes:
car_id and user_id are at the same level.
The car_id is not given; I simply want to know, for the entire database, which car_ fields most commonly have the value 1.
$group by _id and convert the root document to an array of key/value pairs using $objectToArray
$unwind deconstructs the root array
$match keeps only the entries where root.v is 1
$group by root.k and count the entries
db.collection.aggregate([
{
$group: {
_id: "$_id",
root: { $first: { $objectToArray: "$$ROOT" } }
}
},
{ $unwind: "$root" },
{ $match: { "root.v": 1 } },
{
$group: {
_id: "$root.k",
count: { $sum: 1 }
}
}
])
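The counting logic of this pipeline can be sketched in plain JavaScript using the sample data from the question. This is an illustration rather than the pipeline itself, and countCars is a hypothetical helper name. It also adds a guard that is not in the pipeline above: it only counts keys that start with car_, so that any other field that happens to hold the value 1 is ignored.

```javascript
// Sample data copied from the question.
const users = [
  {
    user_id: "ebebc012-082c-4e7f-889c-755d2679bdab",
    "car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 1,
    "car_37c04124-cb12-436c-902b-6120f4c51782": 0,
    "car_b78ddcd0-1136-4f45-8599-3ce8d937911f": 1
  },
  {
    user_id: "f3eb2a61-5416-46ba-bab4-459fbdcc7e29",
    "car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 1,
    "car_0d15eae9-9585-4f49-a416-46ff56cd3685": 1
  }
];

// Mirrors $objectToArray -> $unwind -> $match -> $group with $sum: 1.
function countCars(docs) {
  const counts = {};
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc)) {   // $objectToArray + $unwind
      // $match on value 1; the car_ prefix check is an extra guard (assumption)
      if (!key.startsWith("car_") || value !== 1) continue;
      counts[key] = (counts[key] || 0) + 1;              // $group, $sum: 1
    }
  }
  return counts;
}

const carCounts = countCars(users);
console.log(carCounts);
```

In the pipeline, a comparable guard would be to extend the $match stage to { "root.k": /^car_/, "root.v": 1 }.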

mongodb - $sort child documents only

When I run a find-all on a MongoDB collection, I want maintenanceList to be sorted from the oldest maintenanceDate to the newest.
Sorting by maintenanceDate should not affect the order of the parent documents in the find-all query.
{
"_id":"507f191e810c19729de860ea",
"color":"black",
"brand":"brandy",
"prixVenteUnitaire":200.5,
"maintenanceList":[
{
"cost":100.40,
"maintenanceDate":"2017-02-07T00:00:00.000+0000"
},
{
"cost":4000.40,
"maintenanceDate":"2019-08-07T00:00:00.000+0000"
},
{
"cost":300.80,
"maintenanceDate":"2018-08-07T00:00:00.000+0000"
}
]
}
Any idea how to do that?
Thank you
$project and $group emit fields in the order you list them, so the pipeline will not change the order of the fields in the result.
Sorting by maintenanceDate inside the aggregation only orders the unwound array elements within each parent. Note, however, that $group does not guarantee the order of its output documents, so add a final $sort (for example on _id) if the order of the parents matters.
So something like this should work.
Assuming the collection is named example:
db.example.aggregate([
{
"$unwind": "$maintenanceList"
},
{
"$sort": {
"_id": 1,
"maintenanceList.maintenanceDate": 1
}
},
{
"$group": {
"_id": "$_id",
"color": {
$first: "$color"
},
"brand": {
$first: "$brand"
},
"prixVenteUnitaire": {
$first: "$prixVenteUnitaire"
},
"maintenanceList": {
"$push": "$maintenanceList"
}
}
}
])
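As a possible alternative to the $unwind, $sort and $group round trip, newer servers can sort the embedded array in place. The sketch below assumes MongoDB 5.2 or later, where the $sortArray operator is available, and it has not been run against a server. The plain-JavaScript function underneath shows the same idea so it can be checked without a database; sortMaintenance is a hypothetical helper name and the second sample document is invented. It also assumes the dates are ISO-formatted strings, as in the question, so string comparison matches chronological order.

```javascript
// Assumes MongoDB 5.2+; pass to db.example.aggregate(pipeline).
const pipeline = [
  { $set: { maintenanceList: {
      $sortArray: { input: "$maintenanceList", sortBy: { maintenanceDate: 1 } }
  } } }
];

// Plain-JavaScript equivalent: sort each child array, leave parents untouched.
function sortMaintenance(docs) {
  return docs.map(doc => ({
    ...doc,
    maintenanceList: [...doc.maintenanceList].sort((a, b) =>
      a.maintenanceDate < b.maintenanceDate ? -1 :
      a.maintenanceDate > b.maintenanceDate ? 1 : 0)
  }));
}

const bikes = [
  { _id: "507f191e810c19729de860ea", color: "black", maintenanceList: [
    { cost: 100.40, maintenanceDate: "2017-02-07T00:00:00.000+0000" },
    { cost: 4000.40, maintenanceDate: "2019-08-07T00:00:00.000+0000" },
    { cost: 300.80, maintenanceDate: "2018-08-07T00:00:00.000+0000" } ] },
  // Invented second document, to show that parent order is preserved.
  { _id: "000000000000000000000001", color: "red", maintenanceList: [
    { cost: 10, maintenanceDate: "2020-01-01T00:00:00.000+0000" },
    { cost: 20, maintenanceDate: "2016-01-01T00:00:00.000+0000" } ] }
];

const sortedBikes = sortMaintenance(bikes);
console.log(JSON.stringify(sortedBikes, null, 2));
```

Because each document is processed on its own, nothing is regrouped and the parents come out in the order they went in.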

MongoDB get different sum for each list per document

I have these documents in my collection
{
id:1,
small:[{k:'A',v:1},{k:'B',v:2},{k:'D',v:3}],
big:[{k:'A',v:2},{k:'B',v:3},{k:'C',v:1},{k:'D',v:4}]
},
{
id:2,
small:[{k:'A',v:1},{k:'B',v:2},{k:'D',v:3}],
big:[{k:'A',v:2},{k:'B',v:3},{k:'C',v:1},{k:'D',v:4}]
},
{
id:3,
small:[{k:'A',v:1},{k:'B',v:2},{k:'D',v:3}],
big:[{k:'A',v:2},{k:'B',v:3},{k:'C',v:1},{k:'D',v:4}]
}
Now, I want to get the sum for each key in both lists. I want my output to look like this:
{k:'A',small:3, big:6},
{k:'B',small:6, big:9},
{k:'D',small:9, big:12}
Notice that the output does not contain the key 'C'. This is because I only want to output the keys that exist in the 'small' list. Which MongoDB functions should I use for this?
Thanks!
Try below aggregation:
db.col.aggregate([
{ $unwind: "$small" },
{ $unwind: "$big" },
{ $redact: {
$cond: {
if: { $eq: [ "$small.k", "$big.k" ] },
then: "$$KEEP",
else: "$$PRUNE"
}
}
},
{
$group: { _id: "$small.k", small: { $sum: "$small.v" }, big: { $sum: "$big.v" } }
},
{
$sort: { "_id": 1 }
}
])
We need exactly one small element and one big element in each document, which is why there are two $unwind stages. Then we keep only the documents where the two keys are equal, using $redact. This is where C is filtered out, because it has no counterpart in small. The final aggregation is just a $group with $sum.
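The double $unwind followed by the key comparison amounts to a nested loop over small and big. A plain-JavaScript sketch of that logic, using the documents from the question, is shown below; sumSmallBig is a hypothetical helper name, and the output uses k where the pipeline would return the key as _id.

```javascript
// The three documents from the question share the same small and big lists.
const docs = [1, 2, 3].map(id => ({
  id,
  small: [{ k: "A", v: 1 }, { k: "B", v: 2 }, { k: "D", v: 3 }],
  big: [{ k: "A", v: 2 }, { k: "B", v: 3 }, { k: "C", v: 1 }, { k: "D", v: 4 }]
}));

// Mirrors $unwind small -> $unwind big -> $redact -> $group -> $sort.
function sumSmallBig(input) {
  const totals = new Map();
  for (const doc of input) {
    for (const s of doc.small) {           // $unwind: "$small"
      for (const b of doc.big) {           // $unwind: "$big"
        if (s.k !== b.k) continue;         // $redact: prune pairs with different keys
        const t = totals.get(s.k) || { k: s.k, small: 0, big: 0 };
        t.small += s.v;                    // $sum: "$small.v"
        t.big += b.v;                      // $sum: "$big.v"
        totals.set(s.k, t);
      }
    }
  }
  // $sort on the key
  return [...totals.values()].sort((x, y) => (x.k < y.k ? -1 : x.k > y.k ? 1 : 0));
}

const sums = sumSmallBig(docs);
console.log(sums);
```

The key C never appears in the result because no element of small pairs with it, which matches the expected output in the question.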

mongodb aggregation framework group + project

I have the following issue.
This query returns one result, which is what I want:
> db.items.aggregate([ {$group: { "_id": "$id", version: { $max: "$version" } } }])
{
"result" : [
{
"_id" : "b91e51e9-6317-4030-a9a6-e7f71d0f2161",
"version" : 1.2000000000000002
}
],
"ok" : 1
}
This query (I just added a projection so I can later query for the entire document) returns multiple results. What am I doing wrong?
> db.items.aggregate([ {$group: { "_id": "$id", version: { $max: "$version" } }, $project: { _id : 1 } }])
{
"result" : [
{
"_id" : ObjectId("5139310a3899d457ee000003")
},
{
"_id" : ObjectId("513931053899d457ee000002")
},
{
"_id" : ObjectId("513930fd3899d457ee000001")
}
],
"ok" : 1
}
Found the answer.
1. First I need to get all the _ids:
db.items.aggregate( [
{ '$match': { 'owner.id': '9e748c81-0f71-4eda-a710-576314ef3fa' } },
{ '$group': { _id: '$item.id', dbid: { $max: "$_id" } } }
]);
2. Then I need to query the documents:
db.items.find({ _id: { '$in': "IDs returned from aggregate" } });
which will look like this:
db.items.find({ _id: { '$in': [ '1', '2', '3' ] } });
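One thing worth noting about the second query in the question: $group and $project are written as two keys of a single stage object, whereas each stage has to be its own object in the pipeline array. As a possible single-query alternative to the two steps above (a sketch that has not been run against a server, and assumes a server version that supports $replaceRoot), the whole document with the highest version per id can be kept by sorting first and taking $first of $$ROOT. The plain-JavaScript function shows the same selection so it can be checked locally; latestPerId and the sample items are hypothetical.

```javascript
// Candidate pipeline; pass to db.items.aggregate(pipeline).
const pipeline = [
  { $sort: { version: -1 } },                               // highest version first
  { $group: { _id: "$id", doc: { $first: "$$ROOT" } } },    // keep that whole document
  { $replaceRoot: { newRoot: "$doc" } }                     // unwrap it again
];

// Plain-JavaScript equivalent: keep the item with the highest version per id.
function latestPerId(items) {
  const best = new Map();
  for (const item of items) {
    const current = best.get(item.id);
    if (!current || item.version > current.version) best.set(item.id, item);
  }
  return [...best.values()];
}

// Invented sample items.
const items = [
  { _id: 1, id: "a", version: 1.0 },
  { _id: 2, id: "a", version: 1.2 },
  { _id: 3, id: "b", version: 0.5 }
];

const latest = latestPerId(items);
console.log(latest);
```

This avoids the second round trip, at the cost of sorting the collection by version before grouping.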
(I know it's late, but I'm still answering so that other people don't have to search for the right answer somewhere else.)
See Deka's answer; it will do the job.
Not all accumulators are available in the $project stage. We need to consider what we can do with accumulators in $project and what we can do in $group. Let's take a look at this:
db.companies.aggregate([{
$match: {
funding_rounds: {
$ne: []
}
}
}, {
$unwind: "$funding_rounds"
}, {
$sort: {
"funding_rounds.funded_year": 1,
"funding_rounds.funded_month": 1,
"funding_rounds.funded_day": 1
}
}, {
$group: {
_id: {
company: "$name"
},
funding: {
$push: {
amount: "$funding_rounds.raised_amount",
year: "$funding_rounds.funded_year"
}
}
}
}, ]).pretty()
The $match stage keeps only the companies whose funding_rounds array is not empty. $unwind then emits one document per element of funding_rounds for every company, and those documents flow into $sort and the later stages. The $sort stage orders them by:
funding_rounds.funded_year
funding_rounds.funded_month
funding_rounds.funded_day
In the $group stage, which groups by company name, the array is built using $push. $push is specified as the value of a field we name in the $group stage, and it accepts any valid expression. Here we push documents built from raised_amount and funded_year, and each one is appended to the end of the array being accumulated. The output of the $group stage is therefore a stream of documents whose _id holds the company name and whose funding field holds that array.
Notice that $push is available in $group stages but not in $project stages. This is because $group stages are designed to take a sequence of documents and accumulate values across them.
$project, on the other hand, works with one document at a time. We can calculate an average over an array within an individual document inside a $project stage, but accumulating a value as documents pass through one at a time is not something $project is designed to do. For that type of operation we use $group.
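To make the accumulation concrete, here is a plain-JavaScript sketch of what the $match, $unwind, $sort and $group with $push stages do. It is only an illustration: the company data is invented and fundingByCompany is a hypothetical helper name.

```javascript
// Invented sample data; "Empty" has no funding rounds.
const companies = [
  { name: "Acme", funding_rounds: [
    { raised_amount: 500, funded_year: 2010, funded_month: 6, funded_day: 1 },
    { raised_amount: 100, funded_year: 2008, funded_month: 3, funded_day: 15 } ] },
  { name: "Beta", funding_rounds: [
    { raised_amount: 1000, funded_year: 2009, funded_month: 1, funded_day: 20 } ] },
  { name: "Empty", funding_rounds: [] }
];

function fundingByCompany(docs) {
  // $match + $unwind: one row per funding round; empty arrays yield no rows.
  const rows = [];
  for (const c of docs) {
    for (const r of c.funding_rounds) rows.push({ name: c.name, round: r });
  }
  // $sort by year, then month, then day.
  rows.sort((a, b) =>
    a.round.funded_year - b.round.funded_year ||
    a.round.funded_month - b.round.funded_month ||
    a.round.funded_day - b.round.funded_day);
  // $group by company name; $push appends to the array in arrival order.
  const groups = new Map();
  for (const { name, round } of rows) {
    const g = groups.get(name) || { _id: { company: name }, funding: [] };
    g.funding.push({ amount: round.raised_amount, year: round.funded_year });
    groups.set(name, g);
  }
  return [...groups.values()];
}

const funding = fundingByCompany(companies);
console.log(JSON.stringify(funding, null, 2));
```

The accumulation needs state that outlives a single row (the groups map here), which is the part a per-document stage like $project has no place for.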
Let's take a look at another example:
db.companies.aggregate([{
$match: {
funding_rounds: {
$exists: true,
$ne: []
}
}
}, {
$unwind: "$funding_rounds"
}, {
$sort: {
"funding_rounds.funded_year": 1,
"funding_rounds.funded_month": 1,
"funding_rounds.funded_day": 1
}
}, {
$group: {
_id: {
company: "$name"
},
first_round: {
$first: "$funding_rounds"
},
last_round: {
$last: "$funding_rounds"
},
num_rounds: {
$sum: 1
},
total_raised: {
$sum: "$funding_rounds.raised_amount"
}
}
}, {
$project: {
_id: 0,
company: "$_id.company",
first_round: {
amount: "$first_round.raised_amount",
article: "$first_round.source_url",
year: "$first_round.funded_year"
},
last_round: {
amount: "$last_round.raised_amount",
article: "$last_round.source_url",
year: "$last_round.funded_year"
},
num_rounds: 1,
total_raised: 1,
}
}, {
$sort: {
total_raised: -1
}
}]).pretty()
In the $group stage, we use the $first and $last accumulators. As with $push, they cannot be used as accumulators in $project stages, because $project stages are not designed to accumulate values across multiple documents; they reshape documents one at a time. The number of rounds is calculated with $sum: the value 1 adds one for every document grouped under a given _id value, so it counts them. The $project stage may look complex, but it only tidies the output: it reshapes first_round and last_round and carries num_rounds and total_raised over from the $group output.
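The $first, $last and $sum accumulators can be sketched the same way in plain JavaScript. Again this is an illustration with invented data, and summarizeRounds is a hypothetical helper name; the final sort mirrors the last $sort stage of the pipeline.

```javascript
// Invented sample data.
const companies = [
  { name: "Acme", funding_rounds: [
    { raised_amount: 500, funded_year: 2010, funded_month: 6, funded_day: 1 },
    { raised_amount: 100, funded_year: 2008, funded_month: 3, funded_day: 15 } ] },
  { name: "Beta", funding_rounds: [
    { raised_amount: 1000, funded_year: 2009, funded_month: 1, funded_day: 20 } ] }
];

function summarizeRounds(docs) {
  // $unwind, then $sort by year, month and day.
  const rows = [];
  for (const c of docs) {
    for (const r of c.funding_rounds) rows.push({ name: c.name, round: r });
  }
  rows.sort((a, b) =>
    a.round.funded_year - b.round.funded_year ||
    a.round.funded_month - b.round.funded_month ||
    a.round.funded_day - b.round.funded_day);

  const groups = new Map();
  for (const { name, round } of rows) {
    let g = groups.get(name);
    if (!g) {
      // $first: the first row seen for this company
      g = { company: name, first_round: round, last_round: round,
            num_rounds: 0, total_raised: 0 };
      groups.set(name, g);
    }
    g.last_round = round;                    // $last: overwritten by every later row
    g.num_rounds += 1;                       // $sum: 1
    g.total_raised += round.raised_amount;   // $sum: "$funding_rounds.raised_amount"
  }
  // Final $sort: total_raised descending.
  return [...groups.values()].sort((a, b) => b.total_raised - a.total_raised);
}

const summary = summarizeRounds(companies);
console.log(JSON.stringify(summary, null, 2));
```

$first and $last only give meaningful results here because the rows were sorted before grouping, which is why the $sort stage precedes $group in the pipeline.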