I have the following problem with MongoDB's aggregation framework.
Suppose an item has a time in seconds, t, and the id of an event that occurred, e, like:
item:{t:11433, e:some_id}
What I want is to aggregate according to t and e, that is, to count the number of occurrences of id 'e' at a time t.
This is easy to do using the aggregation with $group.
However, I would like to use a different time scale. For example, I want to count the occurrences of the same event id in a time slot of, e.g., 5 seconds. I could do this programmatically, in JS or Python. I was just wondering whether it could be done using just Mongo, with a cascade of $group stages.
I tried to project using $divide[t,10]. For 11433, this would give 1143.3, but it seems that I can't remove the 0.3 in Mongo (otherwise I could group on this other scale).
Any hints?
Thanks
To get an integer group key for a 5-second interval, you could use the formula
t = t - (t % 5) // % is the modulo operator
In the aggregation framework this would look like this:
db.xx.aggregate([
// two projections are used here; the single-stage attempt below
// does not work as written, because the inner $mod is not wrapped in its own { }:
// { $project: { _id: 0, e: 1, t: 1, tk: { $subtract: [ "$t", $mod: [ "$t", 5 ] ] } } },
//
// get t modulo 5 (t is the time in seconds):
{ $project: { _id: 0, e: 1, t: 1, tm5: { $mod: [ "$t", 5 ] } } },
// subtract it from time:
{ $project: { _id: 0, e: 1, ti: { $subtract: [ "$t", "$tm5" ] } } },
// now group on e and interval,
{ $group: { _id: { e: "$e", interval: "$ti" }, count: { $sum: 1 } } },
])
For this example collection:
> db.xx.find()
{ "_id" : ObjectId("515e5a7157a0887a97cc8d1d"), "t" : 11433, "e" : "some_id" }
{ "_id" : ObjectId("515e60d457a0887a97cc8d1e"), "t" : 11434, "e" : "some_id" }
{ "_id" : ObjectId("515e60d857a0887a97cc8d1f"), "t" : 11438, "e" : "some_id" }
the result is:
{
"result" : [
{
"_id" : {
"e" : "some_id",
"interval" : 11435
},
"count" : 1
},
{
"_id" : {
"e" : "some_id",
"interval" : 11430
},
"count" : 2
}
],
"ok" : 1
}
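The bucketing arithmetic can be checked outside the database. What follows is a minimal plain-JavaScript sketch, not the answer's code: `bucketStart` and `countByBucket` are hypothetical helper names, and the sample documents are the three shown above. The last constant shows, as an assumption that has not been run here, how both steps might fit in one `$project` once the inner `$mod` is wrapped in its own braces.

```javascript
// Start of the bucket that contains t, for buckets `width` seconds wide.
function bucketStart(t, width) {
  return t - (t % width);
}

// Count documents per (e, bucket start) pair, mirroring the $group stage.
function countByBucket(docs, width) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc.e + "@" + bucketStart(doc.t, width);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const docs = [
  { t: 11433, e: "some_id" },
  { t: 11434, e: "some_id" },
  { t: 11438, e: "some_id" },
];
const counts = countByBucket(docs, 5);
console.log(counts.get("some_id@11430"), counts.get("some_id@11435")); // 2 1

// Assumption: a single-stage variant with the nested operator in braces.
// This object is only defined here, not sent to a server.
const singleStagePipeline = [
  { $project: { _id: 0, e: 1, ti: { $subtract: ["$t", { $mod: ["$t", 5] }] } } },
  { $group: { _id: { e: "$e", interval: "$ti" }, count: { $sum: 1 } } },
];
```

Changing the second argument of `countByBucket` (or the `5` in the pipeline) gives a different slot width.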
I am trying to build a dashboard chart in MongoDB Atlas.
The table should show the date on the x-axis and the _id on the y-axis.
The values should be the difference in count from the previous date.
I have a collection with data points such as:
_id: "someName"
timestamp: 2019-09-05T06:24:24.689+00:00
count: 50
_id: "someName"
timestamp: 2019-09-04T06:24:24.689+00:00
count: 40
...
The goal is to get the difference between the count and that of the previous data point with the same name.
_id: "someName"
timestamp: 2019-09-05T06:24:24.689+00:00
count: 50
difference: 10
_id: "someName"
timestamp: 2019-09-04T06:24:24.689+00:00
count: 40
difference: 17
...
That way I could make a table listing the differences.
So far I have created an aggregation pipeline:
[
{$sort: {
"timestamp": -1
}},
{$group: {
_id: "$_id",
count: {
$push: { count: "$count", timestamp: "$timestamp" }
}
}},
{$project: {
_id: "$_id",
count: "$count",
countBefore: { $slice: [ "$count", 1, { $size: "$count" } ] }
}}
]
I was hoping to subtract count and countBefore so that I get an array with the data points and the differences...
So I tried to follow with:
{$project: {
countDifference: {
$map: {
input: "$countBefore",
as: "before",
in: {
$subtract: ["$$before.count", "$count.count"]
/*"$count.count" seems to be the problem, since an integer works*/
}
}
}
}
}
Mongo Atlas only shows "An unknown error occurred"
I would be glad for some advice :)
The following query can get us the expected output:
db.collection.aggregate([
{
$sort:{
"timestamp":1
}
},
{
$group:{
"_id":"$id",
"counts":{
$push:"$count"
}
}
},
{
$project:{
"differences":{
$reduce:{
"input":"$counts",
"initialValue":{
"values":[],
"lastValue":0
},
"in":{
"values":{
$concatArrays:[
"$$value.values",
[
{
$subtract:["$$this","$$value.lastValue"]
}
]
]
},
"lastValue":"$$this"
}
}
}
}
},
{
$project:{
"_id":0,
"id":"$_id",
"plots":"$differences.values"
}
}
]).pretty()
Data Set:
{
"_id" : ObjectId("5d724550ef5e6630fde5b71e"),
"id" : "someName",
"timestamp" : "2019-09-05T06:24:24.689+00:00",
"count" : 50
}
{
"_id" : ObjectId("5d724550ef5e6630fde5b71f"),
"id" : "someName",
"timestamp" : "2019-09-04T06:24:24.689+00:00",
"count" : 40
}
{
"_id" : ObjectId("5d724796ef5e6630fde5b720"),
"id" : "someName",
"timestamp" : "2019-09-06T06:24:24.689+00:00",
"count" : 61
}
{
"_id" : ObjectId("5d724796ef5e6630fde5b721"),
"id" : "someName",
"timestamp" : "2019-09-07T06:24:24.689+00:00",
"count" : 72
}
{
"_id" : ObjectId("5d724796ef5e6630fde5b722"),
"id" : "someName",
"timestamp" : "2019-09-08T06:24:24.689+00:00",
"count" : 93
}
{
"_id" : ObjectId("5d724796ef5e6630fde5b723"),
"id" : "someName",
"timestamp" : "2019-09-09T06:24:24.689+00:00",
"count" : 100
}
Output:
{ "id" : "someName", "plots" : [ 40, 10, 11, 11, 21, 7 ] }
Explanation: We push the count values for the same id into the counts array (in ascending timestamp order, because of the initial $sort) and then apply $reduce to it to build a new array in which each element holds the difference between the current and the previous value of counts. For the very first value, the previous value is taken as zero.
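The same accumulator shape can be tried in plain JavaScript. This is only an illustrative sketch of the logic, with `runningDifferences` as a hypothetical helper name; it uses `Array.prototype.reduce` in place of the `$reduce` operator and the counts from the data set above, already sorted by timestamp.

```javascript
// Running differences: each output element is the current value minus the
// previous one, with the previous value taken as 0 for the first element.
function runningDifferences(values) {
  return values.reduce(
    (acc, current) => ({
      values: acc.values.concat([current - acc.lastValue]),
      lastValue: current,
    }),
    { values: [], lastValue: 0 } // same shape as the $reduce initialValue
  ).values;
}

// Counts from the data set, in ascending timestamp order.
const sortedCounts = [40, 50, 61, 72, 93, 100];
const plots = runningDifferences(sortedCounts);
console.log(plots); // [ 40, 10, 11, 11, 21, 7 ]
```

Because the first difference is taken against zero, the first element is just the first count; drop it if only day-to-day changes are wanted.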
I have the following records:
{ "_id" : 1, "c" : 120, "b" : [ { "f1" : 10 }, { "f1" : 10 } ] }
{ "_id" :2, "c" : 5, "b" : [ { "f1" : 10 }, { "f1" : 10 } ] }
I need the output this way:
{ "_id" : 1, 'total':140}
{ "_id" :2, 'total':25 }
where total = the value in 'c' plus the sum of the f1 values in the same record.
When I unwind the field 'b', it creates two documents with the same id, so the data is duplicated, and when I sum it up, I get:
db.test2.aggregate([
{'$unwind':'$b'},
{'$project':{'total':{'$add':['$c','$b.f1']}}},
{'$group':{'_id':'$_id', 'total':{'$sum':'$total'}}}
])
outputs:
{ "_id" : 1, 'total':260}
{ "_id" :2, 'total':30 }
(not what I wanted, as it has added 120 and 5 to the total again due to the duplication during unwinding)
So I tried:
db.test2.aggregate([
{'$unwind':'$b'},
{'$group':{'_id':'$_id', 'c':{'$push': '$c'},'f1':{'$sum':'$b.f1'}}},
{'$project':{'total':{'$add':[{'$arrayElemAt':['$c',0]},'$f1']}}}
])
outputs:
{ "_id" : 1, 'total':140}
{ "_id" :2, 'total':25 }
( what i wanted)
Is there any other way to achieve this?
You can try the query below. It uses $sum to first calculate the sum of the f1 values in the array, followed by $add to add that sum to the other field (using $sum inside $project requires MongoDB 3.2 or later).
db.test2.aggregate([{
$project: {
total: {"$add":["$c", {"$sum":"$b.f1"}]}
}
}])
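To see why summing the array first avoids the duplication caused by $unwind, here is a minimal plain-JavaScript sketch of the same arithmetic. It is only an illustration: `total` is a hypothetical helper name, and the records are the two from the question.

```javascript
// total = c + sum of all b[i].f1, computed per record without unwinding.
function total(doc) {
  const f1Sum = doc.b.reduce((sum, item) => sum + item.f1, 0);
  return doc.c + f1Sum;
}

const records = [
  { _id: 1, c: 120, b: [{ f1: 10 }, { f1: 10 }] },
  { _id: 2, c: 5, b: [{ f1: 10 }, { f1: 10 }] },
];

const totals = records.map((doc) => ({ _id: doc._id, total: total(doc) }));
console.log(totals); // totals are 140 and 25
```

Since `c` is added once per record rather than once per array element, it is never counted twice.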
An alternative:
db.test2.aggregate([{
$project: {
_id: 0,
c: "$c",
b: {
$reduce: {
input: "$b.f1",
initialValue: 0,
in: {
$add: ["$$value", "$$this"]
}
}
}
}
},
{
$project: {
_id: 0,
total: {
$sum: ["$c", "$b"]
}
}
}
])
That would produce the result:
{
"total" : 140
}
{
"total" : 25
}
If you need the _id field, replace _id: 0 with _id: 1 in both $project stages.
That would produce this result:
{
"_id" : 1,
"total" : 140
}
{
"_id" : 2,
"total" : 25
}
I am new to Mongo aggregation. I want to calculate the difference between two values (the last collection of each day minus the first collection of each day). The database records data every 5 minutes for many resource names. The structure of the document is:
{
_id : ObjectId("5820511a95d447ed648b45d6"),
DeviceName : "OLT01FTV",
ResourceName : "CM MAC:00-07-11-11-39-20",
CollectionTime : ISODate("2016-11-07T09:30:00.000+01:00"),
GranularityPeriod : 5,
A : 0,
B: 17,
C: 4,
D: 21,
E: 3,
F: 0
}
A, B, ... F are the different counters.
Below is an illustration of what I'm trying to get:
result
([
{ "$match": {
"CollectionTime": {
$gte: ISODate("2016-09-05T00:00:00.000Z"),
$lt: ISODate("2016-10-07T00:00:00.000Z")
}
}},
{ "$unwind": "$u2000" },
{ "$group": {
"_id": null,
"firstUC": { "$first": "$UC" },
"lastUC": { "$last": "$UC" },
"firstSM-MISS": { "$first": "$SM-MISS" },
"lastSM-MISS": { "$last": "$SM-MISS" }
}},
{ "$project": {
"diff": {
"$divide": [
{ "$subtract": [ "$firstUC", "$lastUC" ] },
{ "$subtract": [ "$firstSM-MISS", "$lastSM-MISS" ] }
]
}
}}
])
This will get you the difference between the 'A' values for your above scenario. You can add the other fields if you want to get the difference for them also.
db.collection.aggregate([
{ "$match": {
"CollectionTime": {
$gte: ISODate("2016-11-01T00:00:00.000Z"),
$lt: ISODate("2016-11-30T00:00:00.000Z")
}
}},
{ "$sort": { "CollectionTime": 1 } },
{ "$group": {
"_id": null,
"firstA": { "$first": "$A" },
"lastA": { "$last": "$A" }
}},
{ "$project": {
_id: 0,
diffA: {
$subtract: [ "$lastA", "$firstA"]
}
}}
])
* EDIT *
So I'm using the following sample documents, which I created with the code below to match your schema:
// Create 3 Documents 1 second apart
for (var i = 1; i < 4; i++) {
db.foo.insert({
DeviceName : "OLT01FTV",
ResourceName : "CM MAC:00-07-11-11-39-20",
CollectionTime : new Date(),
GranularityPeriod : 5,
A : 1*i,
B: 2*i,
C: 3*i,
D: 4*i,
E: 5*i,
F: 6*i
})
sleep(1000); // To add a delay between insertions so we can visibly see the date difference
}
This results in the following 3 documents being created:
> db.foo.find().pretty()
{
"_id" : ObjectId("582b1a6ced19a7334a5dee31"),
"DeviceName" : "OLT01FTV",
"ResourceName" : "CM MAC:00-07-11-11-39-20",
"CollectionTime" : ISODate("2016-11-15T14:23:40.934Z"),
"GranularityPeriod" : 5,
"A" : 1,
"B" : 2,
"C" : 3,
"D" : 4,
"E" : 5,
"F" : 6
}
{
"_id" : ObjectId("582b1a6ded19a7334a5dee32"),
"DeviceName" : "OLT01FTV",
"ResourceName" : "CM MAC:00-07-11-11-39-20",
"CollectionTime" : ISODate("2016-11-15T14:23:41.936Z"),
"GranularityPeriod" : 5,
"A" : 2,
"B" : 4,
"C" : 6,
"D" : 8,
"E" : 10,
"F" : 12
}
{
"_id" : ObjectId("582b1a6eed19a7334a5dee33"),
"DeviceName" : "OLT01FTV",
"ResourceName" : "CM MAC:00-07-11-11-39-20",
"CollectionTime" : ISODate("2016-11-15T14:23:42.939Z"),
"GranularityPeriod" : 5,
"A" : 3,
"B" : 6,
"C" : 9,
"D" : 12,
"E" : 15,
"F" : 18
}
The first step of the aggregation pipeline matches all documents within the date range, which I set to start at the beginning of November, so no worry there. The $sort stage then sorts them by collection time.
After the grouping we have one document with the firstA and lastA value:
{ "_id" : null, "firstA" : 1, "lastA" : 3 }
And finally - perform the subtract in the projection and hide the ID field:
{ "diffA" : 2 }
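The pipeline above groups everything into one bucket (`_id: null`), whereas the question asks for one difference per day. As a rough sketch of that per-day logic in plain JavaScript, under stated assumptions: `dailyDifference` is a hypothetical helper, days are split on UTC boundaries, and the two documents for 2016-11-16 are made up for illustration (the first three reuse the A values and timestamps of the sample documents).

```javascript
// Last-minus-first value of `field` for each UTC calendar day.
function dailyDifference(docs, field) {
  // Sort a copy by time so "first" and "last" are well defined.
  const sorted = [...docs].sort((a, b) => a.CollectionTime - b.CollectionTime);
  const perDay = new Map();
  for (const doc of sorted) {
    const day = doc.CollectionTime.toISOString().slice(0, 10); // "YYYY-MM-DD" (UTC)
    const entry = perDay.get(day);
    if (entry === undefined) {
      perDay.set(day, { first: doc[field], last: doc[field] });
    } else {
      entry.last = doc[field];
    }
  }
  const result = {};
  for (const [day, { first, last }] of perDay) {
    result[day] = last - first;
  }
  return result;
}

const samples = [
  { CollectionTime: new Date("2016-11-15T14:23:41.936Z"), A: 2 },
  { CollectionTime: new Date("2016-11-15T14:23:40.934Z"), A: 1 },
  { CollectionTime: new Date("2016-11-15T14:23:42.939Z"), A: 3 },
  { CollectionTime: new Date("2016-11-16T00:05:00.000Z"), A: 10 }, // made-up extra day
  { CollectionTime: new Date("2016-11-16T23:55:00.000Z"), A: 14 }, // made-up extra day
];
const diffs = dailyDifference(samples, "A");
console.log(diffs); // { '2016-11-15': 2, '2016-11-16': 4 }
```

In a pipeline, the equivalent idea would be to group on a day key derived from CollectionTime instead of on null; how that key is built depends on the server version and the time zone wanted.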
I'd like to get percentages from a group pipeline in a MongoDB aggregate.
My data:
{
_id : 1,
name : 'hello',
type : 'big'
},
{
_id : 2,
name : 'bonjour',
type : 'big'
},
{
_id : 3,
name : 'hi',
type : 'short'
},
{
_id : 4,
name : 'salut',
type : 'short'
},
{
_id : 5,
name : 'ola',
type : 'short'
}
My request groups by type and counts:
[{
$group : {
_id : {
type : '$type'
},
"count" : {
"$sum" : 1
}
}
}]
Result:
[
{
_id : {
type : 'big',
},
count : 2
},
{
_id : {
type : 'short',
},
count : 3
}
]
But I'd like to have count AND percentage, like this:
[
{
_id : {
type : 'big',
},
count: 2,
percentage: 40%
},
{
_id : {
type : 'short',
},
count: 3,
percentage: 60%
}
]
But I've no idea how to do that. I've tried $divide and other things, but without success. Could you please help me?
Well, I think percentage should be a string if the value contains %.
First you will need to count the number of documents.
var nums = db.collection.count();
db.collection.aggregate(
[
{ "$group": { "_id": {"type": "$type"}, "count": { "$sum": 1 }}},
{ "$project": {
"count": 1,
"percentage": {
"$concat": [ { "$substr": [ { "$multiply": [ { "$divide": [ "$count", {"$literal": nums }] }, 100 ] }, 0,2 ] }, "", "%" ]}
}
}
]
)
Result
{ "_id" : { "type" : "short" }, "count" : 3, "percentage" : "60%" }
{ "_id" : { "type" : "big" }, "count" : 2, "percentage" : "40%" }
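The percentage arithmetic itself can be sketched in plain JavaScript. This is an illustration only, with `percentages` as a hypothetical helper name and the five documents from the question; note that it rounds with `Math.round`, whereas the pipeline above truncates the string with `$substr`.

```javascript
// Count documents per type and express each count as a share of the total.
function percentages(docs) {
  const total = docs.length;
  const counts = {};
  for (const doc of docs) {
    counts[doc.type] = (counts[doc.type] || 0) + 1;
  }
  return Object.entries(counts).map(([type, count]) => ({
    type,
    count,
    percentage: Math.round((count / total) * 100) + "%", // rounded, as a string
  }));
}

const data = [
  { _id: 1, name: "hello", type: "big" },
  { _id: 2, name: "bonjour", type: "big" },
  { _id: 3, name: "hi", type: "short" },
  { _id: 4, name: "salut", type: "short" },
  { _id: 5, name: "ola", type: "short" },
];
const shares = percentages(data);
console.log(shares);
```

As in the pipeline, the total has to be known before any percentage can be computed, which is why the answer counts the collection first.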
First find the total number of documents in the collection using the count method, then use that count variable to calculate the percentage in the aggregation, like this:
var totalDocument = db.collectionName.count() //count total doc.
Use totalDocument in the aggregation as below:
db.collectionName.aggregate({"$group":{"_id":{"type":"$type"},"count":{"$sum":1}}},
{"$project":{"count":1,"percentage":{"$multiply":[{"$divide":[100,totalDocument]},"$count"]}}})
EDIT
If you need to do this in a single aggregation query, you can use $unwind, but note that $unwind expands the pushed array back into one document per element, which is costly. See the aggregation query below:
db.collectionName.aggregate({"$group":{"_id":null,"count":{"$sum":1},"data":{"$push":"$$ROOT"}}},
{"$unwind":"$data"},
{"$group":{"_id":{"type":"$data.type"},"count":{"$sum":1},
"total":{"$first":"$count"}}},
{"$project":{"count":1,"percentage":{"$multiply":[{"$divide":[100,"$total"]},"$count"]}}}
).pretty()
I recommend first finding the total count and using that count in the aggregation, as in the first query.
I have a question collection; each profile can have many questions.
{"_id":"..." , "pid":"...",.....}
Using MongoDB's new aggregation framework, how can I calculate the average number of questions per profile?
I tried the following without success:
{ "aggregate" : "question" , "pipeline" : [ { "$group" : { "_id" : "$pid" , "qCount" : { "$sum" : 1}}} , { "$group" : { "qavg" : { "$avg" : "qCount"} , "_id" : null }}]}
Can it be done with only one group operator?
Thanks.
For this you just need to know the number of questions and the number of different profiles (uniquely identified by "pid", I presume). With the aggregation framework, you need to do that in two stages:
First, you calculate the number of questions per PID.
Then you calculate the average number of questions per PID.
You'd do that like this:
Step one:
db.profiler.aggregate( [
{ $group: { _id: '$pid', count: { '$sum': 1 } } },
] );
Which outputs (in my case, with some sample data):
{
"result" : [
{ "_id" : 2, "count" : 7 },
{ "_id" : 1, "count" : 1 },
{ "_id" : 3, "count" : 3 },
{ "_id" : 4, "count" : 5 }
],
"ok" : 1
}
I have four profiles, with 7, 1, 3 and 5 questions respectively.
Now with this result, we run another group, but in this case we don't really want to group by anything, so we need to set the _id value to null, as you see in the second group below:
db.profiler.aggregate( [
{ $group: { _id: '$pid', count: { '$sum': 1 } } },
{ $group: { _id: null, avg: { $avg: '$count' } } }
] );
And then this outputs:
{
"result" : [
{ "_id" : null, "avg" : 4 }
],
"ok" : 1
}
Which tells me that I have, on average, 4 questions per profile.
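The two grouping steps can be mirrored in plain JavaScript to check the arithmetic. This is only a sketch: `averageQuestionsPerProfile` is a hypothetical helper, and the question documents are synthetic, built to match the per-profile counts in the sample output above (7, 1, 3 and 5).

```javascript
// Step 1: count questions per pid. Step 2: average those counts.
function averageQuestionsPerProfile(questions) {
  const perProfile = new Map();
  for (const q of questions) {
    perProfile.set(q.pid, (perProfile.get(q.pid) || 0) + 1);
  }
  if (perProfile.size === 0) return null; // no profiles, no average
  let sum = 0;
  for (const count of perProfile.values()) sum += count;
  return sum / perProfile.size;
}

// Synthetic data: pid -> number of questions, matching the sample output.
const sizes = { 1: 1, 2: 7, 3: 3, 4: 5 };
const questions = [];
for (const [pid, n] of Object.entries(sizes)) {
  for (let i = 0; i < n; i++) questions.push({ pid: Number(pid) });
}

const avg = averageQuestionsPerProfile(questions);
console.log(avg); // 4
```

The result is the same as dividing the total number of questions (16) by the number of distinct profiles (4), which is the observation the answer starts from.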