Get sum of Nested Array in Aggregate - mongodb

Ok, I have an issue I cannot seem to solve.
I have a document like this:
{
"playerId": "43345jhiuy3498jh4358yu345j",
"leaderboardId": "5b165ca15399c020e3f17a75",
"data": {
"type": "EclecticData",
"holeScores": [
{
"type": "RoundHoleData",
"xtraStrokes": 0,
"strokes": 3,
},
{
"type": "RoundHoleData",
"xtraStrokes": 1,
"strokes": 5,
},
{
"type": "RoundHoleData",
"xtraStrokes": 0,
"strokes": 4
}
]
}
}
Now, what I am trying to accomplish is to use aggregate to sum the strokes and then sort by that sum afterwards. I am trying this:
var sortedBoard = db.collection.aggregate(
{$match: {"leaderboardId": boardId}},
{$group: {
_id: "$playerId",
played: { $sum: 1 },
strokes: {$sum: '$data.holeScores.strokes'}
}
},
{$project:{
type: "$SortBoard",
avgPoints: '$played',
sumPoints: "$strokes",
played : '$played'
}}
);
The issue here is that I do not get the correct strokes sum, since the strokes are inside another array.
Hope someone can help me with this and thanks in advance :-)

You need to say $sum twice:
var sortedBoard = db.collection.aggregate([
{ "$match": { "leaderboardId": boardId}},
{ "$group": {
"_id": "$playerId",
"SortBoard": { "$first": "$SortBoard" },
"played": { "$sum": 1 },
"strokes": { "$sum": { "$sum": "$data.holeScores.strokes"} }
}},
{ "$project": {
"type": "$SortBoard",
"avgPoints": "$played",
"sumPoints": "$strokes",
"played": "$played"
}}
])
This is because $sum does two jobs here: the inner one sums the array values within each document, and the outer one is the accumulator for $group.
The other thing you appear to be missing is that $group only outputs the fields you tell it to, therefore if you want to access other fields in other stages or output, you need to keep them with something like $first or another accumulator. We also appear to be missing a pipeline stage in the question anyway, but it's worth noting just to be sure.
Also note that you really should pass the aggregation pipeline as an actual array [], because the legacy usage is deprecated and can cause problems in some language implementations.
Returns the correct details of course:
{
"_id" : "43345jhiuy3498jh4358yu345j",
"avgPoints" : 1,
"sumPoints" : 12,
"played" : 1
}
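To make the two roles of $sum concrete, here is a minimal sketch in plain JavaScript that mimics what the pipeline computes, without connecting to MongoDB. The function name sumStrokesByPlayer and the second player are made up for illustration and are not part of the question.

```javascript
// Plain-JavaScript sketch of the double $sum; no MongoDB connection is needed.
// "hypotheticalPlayer" and its scores are invented for this example.
const docs = [
  { playerId: "43345jhiuy3498jh4358yu345j",
    data: { holeScores: [{ strokes: 3 }, { strokes: 5 }, { strokes: 4 }] } },
  { playerId: "hypotheticalPlayer",
    data: { holeScores: [{ strokes: 4 }, { strokes: 4 }] } },
  { playerId: "hypotheticalPlayer",
    data: { holeScores: [{ strokes: 2 }, { strokes: 6 }] } }
];

function sumStrokesByPlayer(documents) {
  const totals = new Map();
  for (const doc of documents) {
    // Inner $sum: "$data.holeScores.strokes" resolves to an array such as
    // [3, 5, 4], which is reduced to one number for this document.
    const perDocument = doc.data.holeScores
      .map(hole => hole.strokes)
      .reduce((a, b) => a + b, 0);

    // Outer $sum: the $group accumulator adds that number to the running
    // total kept per playerId, while "played" counts the documents.
    const entry = totals.get(doc.playerId) || { played: 0, strokes: 0 };
    entry.played += 1;
    entry.strokes += perDocument;
    totals.set(doc.playerId, entry);
  }
  return totals;
}

const totals = sumStrokesByPlayer(docs);
console.log(totals.get("43345jhiuy3498jh4358yu345j")); // { played: 1, strokes: 12 }
```

With only one $sum, the accumulator would receive the array itself rather than a number, which is why the original attempt did not produce the expected total.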

Related

MongoDB $push aggregaton won't keep the right order

I tried to make a $group aggregation with MongoDB, like the following example:
"$group": {
"_id": "$test_id",
"feeling": {
"$push": "$feeling"
},
"reference_id": {
"$push": "$_id"
},
"training_start": {
"$push": "$training_start"
},
"training_duration": {
"$push": "$duration_ms"
}
}
The aggregation works fine, but the created arrays are ordered differently. That means that if I check the result of the aggregation by looking at reference_id[x] and training_start[x], the value of training_start in the source document with that _id is not equal to training_start[x].
Maybe an example shows my problem more precisely:
One document after the $group aggregation:
{
_id: "string_1",
reference_id: [1, 2, 3],
training_start: [01:00:00, 02:00:00, 03:00:00] (date times)
}
Documents from source collection:
{
_id:1,
training_start: 01:00:00,
test_id: "string_1"
},
{
_id:2,
training_start: 03:00:00,
test_id: "string_1"
},
{
_id:3,
training_start: 02:00:00,
test_id: "string_1"
}
The first elements in these arrays are always in the right order. So I checked whether each grouped field has the same number of entries by using the code below. The annoying result is that the number of entries in each array is equal, so the arrays are not shifted by missing values.
"$group": {
"_id": "$test_id",
"sum": {
"$sum": {
"$cond": {
"if": {
"$lte": [
"$training_start", null
]
},
"then": 0,
"else": 1
}
}
}
}
Does anybody know if there is another way to create arrays (I already tried $addToSet) that keeps the order in which the elements were pushed? Or am I the problem?
Greetings Max

Mongo count array item occurrences across all collection documents

I have a mongo collection where each document has an array with multiple hashtags (a simple string). I would like to count how many times each hashtag has appeared and return something like this:
{hashtag: "hashtag1",
count: numOcurrences
}
{hashtag: "hashtag2",
count: numOcurrences
}
...
It seems similar to this problem, but since I don't want to filter by any parameter and just want to count the overall occurrences, I think there has to be a cleaner way to solve this. Sadly, my mongo knowledge is very limited...
The collection that holds the hashtags looks similar to this, with the field "hashtag" being the array of hashtags:
{"_id": ...,
"hashtag" : [
"hashtag1",
"hashtag2"
],
"likes" : ...
},
{"_id": ...,
...
}
Your case is a bit easier than the other problem you mentioned, and you could solve it using the aggregation below:
db.hashtags.aggregate([
{
"$unwind": "$hashtag"
},
{
"$group": {
"_id": "$hashtag",
"count": {
"$sum": 1
}
}
},
// you can skip this projection if it's okay for you to have the result like [{ _id: "hashtag1", count: 2 }]
{
"$project": {
"_id": 0,
"hashtag": "$_id",
"count": 1
}
}
])
You can see a working example in mongoplayground
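As a rough illustration of what the three stages do, the same count can be sketched in plain JavaScript over an in-memory array. The sample posts and the function name countHashtags are assumptions made for this example only.

```javascript
// Hypothetical sample documents shaped like the collection in the question.
const posts = [
  { hashtag: ["hashtag1", "hashtag2"] },
  { hashtag: ["hashtag1"] },
  { likes: 3 } // no hashtag array; $unwind drops such documents by default
];

function countHashtags(documents) {
  const counts = new Map();
  for (const doc of documents) {
    // $unwind: treat every array element as if it were its own document.
    for (const tag of doc.hashtag || []) {
      // $group with { $sum: 1 }: add one for each occurrence of the tag.
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  // $project: expose the grouping key as "hashtag" instead of "_id".
  return [...counts].map(([hashtag, count]) => ({ hashtag, count }));
}

const hashtagCounts = countHashtags(posts);
console.log(hashtagCounts);
```

The sketch returns results in first-seen order, whereas $group does not guarantee any output order, so a $sort stage would be needed if the order matters.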

MongoDB $setUnion on object ($setUnion but with additional information)

Stack Overflow community,
I do not often work with big arrays of objects in MongoDB, so I have no idea how to solve this problem:
1. I am working within one file, so obviously it's an aggregate which first does a {$match:{"_id" : ObjectId("5c3f5cb04147b3082648278b") }},
2. OK, now I have another step that uses $project + $filter to filter out some objects, but it is not important for this (I think).
I have an array of objects, similar to this
{
"_id": ObjectId(".."),
"data":
[
{
id : 01,
groupId: 22,
noteId: 876543
},
{
id : 02,
groupId: 33,
noteId: 767676
},
{
id : 03,
groupId: 22,
noteId: 876543
},
{
id : 04,
groupId: 76,
noteId: 876543
}
]
}
but with thousands of entries and more values per object.
Every groupId can have any noteId, but the same groups have always the same noteId.
The Problem: noteIds can be shared between groups.
I added this
{ $project: {
"groupIds": {"$setUnion": "$data.groupId"}
}}
which gives me all the groupIds,
but it is very important that I also get all the related noteIds, because
a noteId is an arbitrary ID with no relation to anything else.
Is it possible to somehow union objects by a specified field?
Or is there another way to solve this? If I filter for objects with $in($data.groupId, $setUnion('union from above')), I still would not know how to extract only the 2 fields that I need.
Thanks in advance for your help
H.M.
You can use the aggregation below:
db.collection.aggregate([
{ "$unwind": "$data" },
{ "$group": {
"_id": {
"_id": "$_id",
"groupId": "$data.groupId"
},
"noteIds": {
"$push": {
"noteId": "$data.noteId",
}
}
}},
{ "$group": {
"_id": "$_id._id",
"data": {
"$push": {
"groupId": "$_id.groupId",
"noteIds": "$noteIds"
}
}
}}
])
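The effect of the two $group stages can be sketched in plain JavaScript on the sample array from the question. The _id value "doc1" and the function name groupNotes are placeholders invented for this example.

```javascript
// Sample document from the question; "doc1" stands in for the real ObjectId.
const sampleDoc = {
  _id: "doc1",
  data: [
    { id: 1, groupId: 22, noteId: 876543 },
    { id: 2, groupId: 33, noteId: 767676 },
    { id: 3, groupId: 22, noteId: 876543 },
    { id: 4, groupId: 76, noteId: 876543 }
  ]
};

function groupNotes(document) {
  // $unwind + first $group: collect one { noteId } entry per array element,
  // keyed by groupId (the document _id is the other half of the key).
  const byGroup = new Map();
  for (const item of document.data) {
    if (!byGroup.has(item.groupId)) byGroup.set(item.groupId, []);
    byGroup.get(item.groupId).push({ noteId: item.noteId });
  }
  // Second $group: fold the per-group results back into one document.
  return {
    _id: document._id,
    data: [...byGroup].map(([groupId, noteIds]) => ({ groupId, noteIds }))
  };
}

const grouped = groupNotes(sampleDoc);
console.log(JSON.stringify(grouped, null, 2));
```

Because $push keeps one entry per array element, group 22 ends up with the same noteId twice. If only distinct values are wanted, replacing $push with $addToSet in the first $group stage is one option to consider.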

MongoDB Sum Array With Objects

Say I have an aggregation that returns the following:
[
{driverId: "21312asd12", cars: 2, totalMiles: 30000, family: 4},
{driverId: "55512a23a2", cars: 3, totalMiles: 55000, family: 2},
...
]
How would I go about running a summation of each data set on a groupId basis to return the following? Do I use an $unwind? Do another grouping?
For example I would like to return:
{
totalDrivers: 2,
totalCars: 5,
totalMiles: 85000,
totalFamily: 6
}
You seem to just be referring to the documents in the output as an "array", therefore just add another $group to the end of your pipeline:
{ "$group": {
"_id": null,
"totalDrivers": { "$sum": 1 },
"totalCars": { "$sum": "$cars" },
"totalMiles": { "$sum": "$totalMiles" },
"totalFamily": { "$sum": "$family" }
}}
Here null is essentially just a constant grouping key rather than a field present in the documents, so every document falls into the same group. The result should be a single document (albeit in an array, depending on the API method call used or server version).
Or if you actually mean that each document has a field with an array like this, then $unwind and process the group either per document or with a null as above:
{ "$unwind": "$someArray" },
{ "$group": {
"_id": "$_id",
"totalDrivers": { "$sum": 1 },
"totalCars": { "$sum": "$someArray.cars" },
"totalMiles": { "$sum": "$someArray.totalMiles" },
"totalFamily": { "$sum": "$someArray.family" }
}}
At any rate, you should really post the code you are using when asking questions like this. It is very likely that your pipeline may not be as efficient to get to your end goal as you think, and if you posted that it both gives a clear picture of what you are doing as well as leaves it open for suggested improvement.
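For the first case, grouping on a null key amounts to folding every document into one running total. A minimal sketch in plain JavaScript, using the two sample rows from the question (the function name summarize is made up for illustration):

```javascript
// The two sample result documents from the question.
const drivers = [
  { driverId: "21312asd12", cars: 2, totalMiles: 30000, family: 4 },
  { driverId: "55512a23a2", cars: 3, totalMiles: 55000, family: 2 }
];

function summarize(documents) {
  // Every document lands in the same group (_id: null), so a single
  // accumulator object is enough.
  return documents.reduce(
    (acc, d) => ({
      totalDrivers: acc.totalDrivers + 1,          // { $sum: 1 }
      totalCars: acc.totalCars + d.cars,           // { $sum: "$cars" }
      totalMiles: acc.totalMiles + d.totalMiles,   // { $sum: "$totalMiles" }
      totalFamily: acc.totalFamily + d.family      // { $sum: "$family" }
    }),
    { totalDrivers: 0, totalCars: 0, totalMiles: 0, totalFamily: 0 }
  );
}

const summary = summarize(drivers);
console.log(summary);
```

One difference from the real stage: on empty input this sketch returns all zeros, while $group emits no document at all.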

MongoDB: how to do $or with $where (it doesn't do logical OR)

How can we use $or with such a $where clause?
This query should always be returning all records (because of the date in 2015), but it doesn't return anything.
In parts, it works, but when trying to apply the $or to the Date or $where, it doesn't work as intended.
Thanks to Sammaye for fixing my previous version of this, which now looks as follows (still not working though):
db.turnys.find({
$or:[
{ start:{
$lte:new Date("2015-03-31T09:52:29.338Z")
} },
{ $where:"this.users.length == this.seats" }
]
});
How can I accomplish the intended $or?
Here is a sample of the turnys collection:
[
{
"gId": "5335e4a7b8cf51bcd054b423",
"seats": 2,
"start": "2014-03-31T08:47:48.946Z",
"end": "2014-03-31T08:49:48.946Z",
"rMin": 800,
"rMax": 900,
"users": [],
"_id": "53392bb42b70450000a834d8"
},
{
"gId": "5335e4a7b8cf51bcd054b423",
"seats": 2,
"start": "2014-03-31T08:47:48.946Z",
"end": "2014-03-31T08:49:48.946Z",
"rMin": 1000,
"rMax": 1100,
"users": [],
"_id": "53392bb42b70450000a834da"
},
Thanks!
The problem is that $or does not work that way. What you actually need is:
db.turnys.find({
$or:[
{ start:{
$lte:new Date("2015-03-31T09:52:29.338Z")
} },
{ $where:"this.users.length == this.seats" }
]
});
That will now create an $or query with two clauses. Each element of the $or array is classed as a $anded clause.
As I mentioned on your other question, the $where operator should be avoided, for the reasons given there.
So again as shown what you should be doing is "allocating" a total_users value within your document, using the $inc operator on updates. But your "query" should look like this with the use of .aggregate():
db.collection.aggregate([
{ "$project": {
"gId": 1,
"start": 1,
"alloc": { "$eq": [ "$total_users", "$seats" ] }
}},
{ "$match": {
"$or": [
{ "alloc": true },
{ "start": { "$lte": new Date("2015-03-31T09:52:29.338Z") } }
]
}}
])
Or even possibly use the "array size" form that was mentioned with more recent versions ( still to be released as of writing ) of MongoDB.
But also to "clarify" you need to make sure your "test" operations are actually valid.
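One way to check whether each clause is valid is to evaluate the clauses separately. The sketch below does this in plain JavaScript and rests on an assumption that the question does not confirm: that "start" is stored as a string, as the sample output suggests, rather than as a date. A range query against a Date only matches values stored as dates, so under that assumption the first clause can never match. The helper names are invented for this example.

```javascript
const cutoff = new Date("2015-03-31T09:52:29.338Z");

// Clause 1: { start: { $lte: cutoff } }. Comparison against a Date only
// matches values of the date type, so a string "start" never qualifies.
const startClause = doc => doc.start instanceof Date && doc.start <= cutoff;

// Clause 2: { $where: "this.users.length == this.seats" }
const seatsClause = doc => doc.users.length == doc.seats;

// $or: a document matches when at least one clause matches.
const matchesOr = doc => startClause(doc) || seatsClause(doc);

// Assumption: "start" stored as a string, as in the sample output.
const storedAsString = { seats: 2, users: [], start: "2014-03-31T08:47:48.946Z" };
// The same document with "start" stored as a real date.
const storedAsDate = { seats: 2, users: [], start: new Date("2014-03-31T08:47:48.946Z") };
// Hypothetical full table: the second clause matches regardless of "start".
const fullTable = { seats: 2, users: ["a", "b"], start: "2014-03-31T08:47:48.946Z" };

console.log(matchesOr(storedAsString)); // false
console.log(matchesOr(storedAsDate));   // true
console.log(matchesOr(fullTable));      // true
```

If the strings in the sample are only an artifact of how the documents were exported, this explanation does not apply and each clause should be tested on its own against the real collection.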