MongoDB Query with sum - mongodb

I have a simple document setup:
{
  VG: "East",
  Artikellist: {
    Artikel1: "Sprite",
    Amount1: 1,
    Artikel2: "Fanta",
    Amount2: 3
  }
}
I simply want to query these documents to get a list of articles sold in each VG (or town, it doesn't matter). In addition, the query should sum the amount of each product and return it to me.
I know I'm thinking in SQL terms, but that's essentially what I need.
My idea was this:
db.collection.aggregate([{
  $group: {
    _id: {
      VG: "$VG",
      Artikel1: "$Artikellist.Artikel1",
      Artikel2: "$Artikellist.Artikel2",
      $sum: "$Artikellist.Amount1",
      $sum: "$Artikellist.Amount2"
    }
  }
}]);
The tricky part here is that I have 5 different values for VG, and there can be a maximum of 5 Artikel fields (with corresponding amounts) in one list.
So hopefully you can help me here. Sorry for my bad English and my even worse Mongo skills.

If Artikel1 is always "Sprite" and Artikel2 is always "Fanta", then you can try this one:
db.test.aggregate({
  $group: {
    _id: { VG: "$VG", Artikel1: "$Artikellist.Artikel1", Artikel2: "$Artikellist.Artikel2" },
    Amount1: { $sum: "$Artikellist.Amount1" },
    Amount2: { $sum: "$Artikellist.Amount2" }
  }
});
If the values of Artikel1 and Artikel2 can vary, I suggest changing the structure of the document, say to:
{
  VG: "East",
  Artikellist: [
    { Artikel: "Sprite", Amount: 1 },
    { Artikel: "Fanta", Amount: 3 }
  ]
}
and then use the following approach:
db.test.aggregate(
  { $unwind: "$Artikellist" },
  { $group: { _id: { VG: "$VG", Artikel: "$Artikellist.Artikel" }, Amount: { $sum: "$Artikellist.Amount" } } }
)
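For readers new to $unwind and $group, the pipeline's effect can be simulated in plain JavaScript. This is only an illustrative sketch (the sample documents and function name are made up here); the real work happens server-side in MongoDB:

```javascript
// Simulate $unwind + $group with $sum over the restructured documents.
const docs = [
  { VG: "East", Artikellist: [{ Artikel: "Sprite", Amount: 1 }, { Artikel: "Fanta", Amount: 3 }] },
  { VG: "East", Artikellist: [{ Artikel: "Sprite", Amount: 2 }] },
];

function sumByVgAndArtikel(docs) {
  const totals = {};
  for (const doc of docs) {
    for (const item of doc.Artikellist) {             // like $unwind
      const key = doc.VG + "|" + item.Artikel;        // like the compound _id
      totals[key] = (totals[key] || 0) + item.Amount; // like $sum
    }
  }
  return totals;
}

console.log(sumByVgAndArtikel(docs)); // { 'East|Sprite': 3, 'East|Fanta': 3 }
```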

Related

How to improve aggregate pipeline

I have pipeline
[
{'$match':{templateId:ObjectId('blabla')}},
{
"$sort" : {
"_id" : 1
}
},
{
"$facet" : {
"paginatedResult" : [
{
"$skip" : 0
},
{
"$limit" : 100
}
],
"totalCount" : [
{
"$count" : "count"
}
]
}
}
]
Index:
"key" : {
"templateId" : 1,
"_id" : 1
}
The collection has 10.6M documents, 500k of which have the needed templateId.
Aggregate use index
"planSummary" : "IXSCAN { templateId: 1, _id: 1 }",
But the request takes 16 seconds. What did I do wrong? How can I speed it up?
For a start, you should get rid of the $sort operator. The documents are already guaranteed to be sorted by _id by the { templateId: 1, _id: 1 } index, so the stage ends up re-sorting 500k documents that are already in order.
Next, you shouldn't use the $skip approach. For high page numbers you will skip large numbers of documents, up to almost 500k (index entries, strictly speaking, but still).
I suggest an alternative approach:
For the first page, calculate an id that you know for sure falls off the left side of the index. Say, if you know that you don't have entries backdated to 2019 or earlier, you can use a match operator similar to this:
var pageStart = ObjectId.fromDate(new Date("2020/01/01"))
Then, your match operator should look like this:
{'$match' : {templateId:ObjectId('blabla'), _id: {$gt: pageStart}}}
For the next pages, keep track of the last document of the previous page: if the rightmost document's _id is x on a given page, then pageStart should be x for the next page.
So your pipeline may look like this:
[
{'$match' : {templateId:ObjectId('blabla'), _id: {$gt: pageStart}}},
{
"$facet" : {
"paginatedResult" : [
{
"$limit" : 100
}
]
}
}
]
Note that $skip is now missing from the $facet stage as well.
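The cursor bookkeeping described above can be sketched in a small helper that builds the $match stage for each page. This is an illustrative sketch only; the function name, the string templateId, and the placeholder cursor value are assumptions, and in real code lastSeenId would be an ObjectId:

```javascript
// Keyset ("seek") pagination: build the next page's $match stage from the
// _id of the last document seen on the previous page, instead of $skip.
function buildPageMatch(templateId, lastSeenId) {
  const match = { templateId: templateId };
  if (lastSeenId !== null) {
    match._id = { $gt: lastSeenId }; // resume strictly after the cursor
  }
  return match;
}

// First page: no cursor yet, only the templateId filter.
console.log(buildPageMatch("blabla", null));
// Later pages: pass the _id of the last document of the previous page.
console.log(buildPageMatch("blabla", "x123"));
```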

MongoDB $divide on aggregate output

Is there a way to perform a mathematical operation on already aggregated, computed fields?
I have something like this:
([
{
"$unwind" : {
"path" : "$users"
}
},
{
"$match" : {
"users.r" : {
"$exists" : true
}
}
},
{
"$group" : {
"_id" : "$users.r",
"count" : {
"$sum" : 1
}
}
},
])
Which gives an output as:
{ "_id" : "A", "count" : 7 }
{ "_id" : "B", "count" : 49 }
Now I want to divide 7 by 49 or vice versa.
Is there a possibility to do that? I tried $project and $divide but had no luck.
Any help would be really appreciated.
Thank you,
From your question, it looks like you are expecting the result to have exactly 2 rows. In that case I assume users.r can have only 2 values (apart from null).
The simplest thing I suggest is to do this arithmetic in JavaScript (if you're using the mongo console), or, if you're calling from a program, in whatever language you're using to access Mongo. E.g.:
var results = db.collection.aggregate([theAggregatePipelineQuery]).toArray();
print(results[0].count/results[1].count);
EDIT: I am sharing an alternative to the above approach because the OP commented that JavaScript is not an option and it needs to be done in the query alone. Here it is:
([
  { /** your existing aggregation stages that result in two rows, as described in the question, with a count field **/ },
  { $group: { "_id": 1, firstCount: { $first: "$count" }, lastCount: { $last: "$count" } } },
  { $project: { finalResult: { $divide: ["$firstCount", "$lastCount"] } } }
])
//The returned document has your answer under `finalResult` field
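The $first/$last/$divide combination can be sanity-checked in plain JavaScript against the two-row result from the question. This is only an illustrative simulation (the function name is made up):

```javascript
// Simulate $group with $first/$last followed by $project with $divide
// on the two-row output { _id: "A", count: 7 }, { _id: "B", count: 49 }.
const rows = [
  { _id: "A", count: 7 },
  { _id: "B", count: 49 },
];

function divideFirstByLast(rows) {
  const first = rows[0].count;              // $first: "$count"
  const last = rows[rows.length - 1].count; // $last: "$count"
  return first / last;                      // $divide
}

console.log(divideFirstByLast(rows)); // ≈ 0.1429, i.e. 7/49
```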

How to match and group in multiple cases in mongodb aggregation?

I have 4 players with their scores in different matches, e.g.:
{user: score} -- JSON keys
{'a': 10}, {'a': 12}, {'b': 16}
I am trying to find a way to compute the score sum for a single player using the aggregation framework.
users.aggregate([{$match: {'user': 'a'}}, {$group: {_id: null, scores: {$sum: '$score'}}}])
Then I repeat the same thing for b, and so on.
In short, I am running the same query for different users too many times.
What is the best (or an optimized) way to write the aggregate query once for all users?
You can just match the required users with an $in clause, and then group as @Sourbh Gupta suggested.
db.users.aggregate([
{$match:{'user':{$in: ['a', 'b', 'c']}}},
{$group:{_id: '$user', scores:{$sum:'$score'}}}
])
Group the data on the basis of user, i.e.:
users.aggregate([{$group: {_id: "$user", scores: {$sum: '$score'}}}])
Not too sure about your document structure, but if you've got 2 different fields for 2 different scores, you can group and sum them separately, then project the sum of the 2 grouped sums (if that makes sense).
So, for example, I have these documents:
> db.scores.find()
{ "_id" : ObjectId("5858ed67b11b12dce194eec8"), "user" : "bob", "score" : { "a" : 10 } }
{ "_id" : ObjectId("5858ed6ab11b12dce194eec9"), "user" : "bob", "score" : { "a" : 12 } }
{ "_id" : ObjectId("5858ed6eb11b12dce194eeca"), "user" : "bob", "score" : { "b" : 16 } }
Notice we have a user bob; he has two a scores and one b score.
We can now write an aggregation query to do a match for bob then sum the scores.
db.scores.aggregate([
{ $match: { user : "bob" } },
{ $group: { _id : "$user", sumA : { $sum : "$score.a" }, sumB : { $sum : "$score.b" } } },
{ $project: { user: 1, score : { $sum: [ "$sumA", "$sumB" ] } } }
]);
This will give us the following result
{ "_id" : "bob", "score" : 38 }
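The match → group → project chain above can be simulated in plain JavaScript to see why the result is 38. This is an illustrative sketch only (the function name is made up; the data is copied from the answer):

```javascript
// Simulate $match, $group with per-field $sum, and the final $project
// that adds the two grouped sums together.
const scores = [
  { user: "bob", score: { a: 10 } },
  { user: "bob", score: { a: 12 } },
  { user: "bob", score: { b: 16 } },
];

function totalScore(docs, user) {
  let sumA = 0, sumB = 0;
  for (const d of docs) {
    if (d.user !== user) continue; // $match on user
    sumA += d.score.a || 0;        // $sum: "$score.a" (missing field counts as 0)
    sumB += d.score.b || 0;        // $sum: "$score.b"
  }
  return sumA + sumB;              // $project with $sum: ["$sumA", "$sumB"]
}

console.log(totalScore(scores, "bob")); // 38
```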

Can I group floating point numbers by range in MongoDB?

I have a MongoDB set up with documents like this
{
"_id" : ObjectId("544ced7b9f40841ab8afec4e"),
"Measurement" : {
"Co2" : 38,
"Humidity" : 90
},
"City" : "Antwerp",
"Datum" : ISODate("2014-10-01T23:13:00.000Z"),
"BikeId" : 26,
"GPS" : {
"Latitude" : 51.20711593206187,
"Longitude" : 4.424424413969158
}
}
Now I am trying to aggregate them by date and location, and also add the average of the measurements to the result. So far my code looks like this:
db.stadsfietsen.aggregate([
{$match: {"Measurement.Co2": {$gt: 0}}},
{
$group: {
_id: {
hour: {$hour: "$Datum"},
Location: {
Long: "$GPS.Longitude",
Lat: "$GPS.Latitude"
}
},
Average: {$avg: "$Measurement.Co2"}
}
},
{$sort: {"_id": 1}},
{$out: "Co2"}
]);
which gives me a nice list of all the possible combinations of hour and GPS coordinates, in this form:
{
"_id" : {
"hour" : 0,
"Location" : {
"Long" : 3.424424413969158,
"Lat" : 51.20711593206187
}
},
"Average" : 82
}
The problem is that there are so many unique coordinates, that it's not useful.
Can I group the documents together when there are values that are close together? Say from Longitude 51.207 to Longitude 51.209?
There is no standard support for ranges in $group.
Mathematically
You could calculate a new value that will be the same for several geolocations. For example you could simulate a floor method:
_id: {
  hour: { $hour: "$Datum" },
  Location: {
    Long: { $subtract: ["$GPS.Longitude", { $mod: ["$GPS.Longitude", 0.01] }] },
    Lat: { $subtract: ["$GPS.Latitude", { $mod: ["$GPS.Latitude", 0.01] }] }
  }
}
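The floor trick can be checked in plain JavaScript: snapping each coordinate down to the nearest multiple of the bucket size makes nearby points share a group key. This is an illustrative sketch (the function name and the 0.01-degree bucket size are assumptions):

```javascript
// Snap a coordinate down to the nearest multiple of the bucket size,
// so that nearby geolocations collapse into the same group key.
function bucket(value, size) {
  return Math.floor(value / size) * size;
}

// Two nearby latitudes land in the same 0.01-degree bucket:
console.log(bucket(51.2071, 0.01)); // ~51.2
console.log(bucket(51.2089, 0.01)); // ~51.2, same bucket
console.log(bucket(51.2112, 0.01)); // ~51.21, next bucket over
```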
Geospatial Indexing
You could restructure your application to use a geospatial index and search for all locations in a given range. Whether this is applicable depends very much on your use case.
Map-Reduce
Map-Reduce is more powerful than the aggregation framework. You could definitely use it for these calculations, but it is considerably more complex, so I can't present a ready-made solution here.

MongoDB custom sorting

I have a collection of records as follows:
{
"_id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"p":{
"type":"1",
"txt":"test message"
},
"users":[
{
"uid":"52872ed59542f",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"uid":"524eb460986e4",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"uid":"524179060781e",
"pt":ISODate("2013-11-27T12:48:35Z")
}
],
},
{
"_id":418,
"ptime":ISODate("2013-11-25T11:18:42.961Z"),
"p":{
"type":"1",
"txt":"test message 2"
},
"users":[
{
"uid":"524eb460986e4",
"pt":ISODate("2013-11-23T11:18:42.961Z")
},
{
"uid":"52872ed59542f",
"pt":ISODate("2013-11-24T11:18:42.961Z")
},
{
"uid":"524179060781e",
"pt":ISODate("2013-11-22T12:48:35Z")
}
],
}
How do I sort the above records in descending order of ptime and pt, where the user's uid = "52872ed59542f"?
If you want to do such a sort, you probably want to store your data in a different way. MongoDB is in general nowhere near as good at manipulating nested documents as top-level fields. In your case, I would recommend splitting ptime, pt and uid out into their own collection:
messages
{
"_id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"type":"1",
"txt":"test message"
},
users
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"52872ed59542f",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"524eb460986e4",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"524179060781e",
"pt":ISODate("2013-11-27T12:48:35Z")
}
You can then set an index on the users collection for uid, ptime and pt.
You will need to do two queries to also get the text messages themselves though.
You can use the Aggregation Framework to sort first by ptime and then by the users.pt field, as follows:
db.users.aggregate(
  { $sort: { 'ptime': -1 } },
  { $unwind: "$users" },
  { $match: { "users.uid": "52872ed59542f" } },
  { $sort: { 'users.pt': -1 } },
  { $group: { _id: { id: "$_id", "ptime": "$ptime", "p": "$p" }, users: { $push: "$users" } } },
  { $group: { _id: "$_id.id", "ptime": { $first: "$_id.ptime" }, "p": { $first: "$_id.p" }, users: { $push: "$users" } } }
);
db.yourcollection.find(
{
users:{
$elemMatch:{uid:"52872ed59542f"}
}
}).sort({ptime:-1})
But you will have problems ordering by the pt field. You should use the Aggregation Framework to project the data, or use Derick's approach.
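What the find + sort above does can be simulated in plain JavaScript: keep only documents whose users array contains the uid, then sort by ptime descending. This is an illustrative sketch (the function name is made up; the data is trimmed from the question):

```javascript
// Simulate $elemMatch on users.uid plus sort({ ptime: -1 }).
const posts = [
  { _id: 417, ptime: new Date("2013-11-26T11:18:42.961Z"),
    users: [{ uid: "52872ed59542f" }, { uid: "524eb460986e4" }] },
  { _id: 418, ptime: new Date("2013-11-25T11:18:42.961Z"),
    users: [{ uid: "52872ed59542f" }, { uid: "524179060781e" }] },
];

function findByUidSortedByPtimeDesc(docs, uid) {
  return docs
    .filter(d => d.users.some(u => u.uid === uid)) // like $elemMatch
    .sort((a, b) => b.ptime - a.ptime);            // like sort({ ptime: -1 })
}

console.log(findByUidSortedByPtimeDesc(posts, "52872ed59542f").map(d => d._id));
// [ 417, 418 ] -- newest ptime first
```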