Why would I get a "-Infinity" result for $avg in an aggregate query? - mongodb

My aggregate query for retrieving "average percents per month" returns a -Infinity average for some months. What would cause this?
The relevant properties in mycollection are mydate and mynumericfield; the latter stores a percentage value as a double.
db.mycollection.aggregate(
  [
    {
      $match: {
        mydate: {
          $gte: new Date(Date.UTC(2014, 8, 1)),
          $lte: new Date(Date.UTC(2014, 12, 1)),
        }
      }
    },
    {
      $group : {
        _id : { month: { $month: "$mydate" }, year: { $year: "$mydate" } },
        average: { $avg: "$mynumericfield" },
        count: { $sum: 1 }
      }
    }
  ]
)
Here's a sample of the result:
/* 1 */
{
"result" : [
{
"_id" : {
"month" : 9,
"year" : 2014
},
"average" : 84.2586640583996598,
"count" : 20959.0000000000000000
},
{
"_id" : {
"month" : 11,
"year" : 2014
},
"average" : 96.9326915103888638,
"count" : 20743.0000000000000000
},
{
"_id" : {
"month" : 10,
"year" : 2014
},
"average" : -Infinity,
"count" : 20939.0000000000000000
},
{
"_id" : {
"month" : 12,
"year" : 2014
},
"average" : -Infinity,
"count" : 20913.0000000000000000
}
],
"ok" : 1.0000000000000000
}

I managed to reproduce the -Infinity result on a smaller scale.
Take a sample collection nums containing only 4 documents:
{ "_id" : 0, "grp" : 1, "mynum" : -Infinity }
{ "_id" : 1, "grp" : 1, "mynum" : 5 }
{ "_id" : 3, "grp" : 2, "mynum" : 8 }
{ "_id" : 4, "grp" : 2, "mynum" : 89 }
Running a simple aggregation on this collection:
> db.nums.aggregate([{$group:{"_id":"$grp", "average" : {$avg : "$mynum"}}}])
gives the following result:
{ "_id" : 2, "average" : 48.5 }
{ "_id" : 1, "average" : -Infinity }
which matches what you are seeing. $avg is an arithmetic mean, so a single -Infinity value in a group drags the whole average to -Infinity.
Check whether your collection contains a document whose mynumericfield is -Infinity; your situation may be the same as the one reproduced here:
> db.mycollection.find({mynumericfield : -Infinity})
I hope this helps.
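To make the arithmetic behind the reproduction concrete, here is a minimal plain-JavaScript sketch (it runs without a MongoDB connection). The mean helper and the variable names are made up for illustration, and the $match stage at the end is only an untested suggestion for skipping non-finite values before $group; it is not part of the answer above.

```javascript
// Plain-JavaScript sketch; no MongoDB connection is needed to run it.
// mynumericfield comes from the question; every other name is illustrative.

// Arithmetic mean, the same calculation $avg performs over numeric values.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Values of group 1 in the nums collection from the reproduction.
const group = [-Infinity, 5];

// One non-finite value is enough to make the whole mean -Infinity.
const poisoned = mean(group);

// Dropping non-finite values first gives a usable average again.
const cleaned = mean(group.filter(Number.isFinite));

// The same idea written as a pipeline stage to place before $group
// (assumption: untested against a real server).
const excludeNonFinite = {
  $match: { mynumericfield: { $nin: [Infinity, -Infinity] } }
};

console.log(poisoned, cleaned);
```

If the -Infinity values turn out to be bad data, fixing them at the source is probably better than filtering them on every query.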

Related

how to get count of data season wise in MongoDb

I want to get a count of documents for each three-month period in MongoDB:
count by January to March
count by March to June
count by June to September
count by September to December
{
"_id" : ObjectId("62aecc436b905928c4209e63"),
"date" : ISODate("2021-06-25T00:00:00.000Z"),
"order_id" : 1,
"total": 100
},
{
"_id" : ObjectId("62aecc436b905928c4209e64"),
"date" : ISODate("2022-03-20T00:00:00.000Z"),
"order_id" : 2,
"total": 200
}
{
"_id" : ObjectId("62aecc436b905928c4209e65"),
"date" : ISODate("2022-11-03T00:00:00.000Z"),
"order_id" : 3,
"total": 300
}
In MongoDB 5.0 or later, try this:
db.data.aggregate(
  [
    {
      $group: {
        _id: { $dateTrunc: { date: "$date", unit: "quarter" } },
        count: { $count: {} }
      }
    }
  ]
)
Result:
{
"_id" : ISODate("2021-04-01T07:00:00.000+07:00"),
"count" : 1
},
{
"_id" : ISODate("2022-01-01T07:00:00.000+07:00"),
"count" : 1
},
{
"_id" : ISODate("2022-10-01T07:00:00.000+07:00"),
"count" : 1
}
_id is the start date of the quarter (midnight UTC, displayed here in a UTC+07:00 time zone).
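As a way to check what the quarter buckets should look like, here is a small plain-JavaScript sketch that truncates each date to the start of its calendar quarter in UTC and counts per bucket. It only mimics the grouping in memory; quarterStartUTC and countByQuarter are hypothetical helper names, not MongoDB functions.

```javascript
// Returns the first instant (UTC) of the calendar quarter containing `date`.
function quarterStartUTC(date) {
  const firstMonthOfQuarter = Math.floor(date.getUTCMonth() / 3) * 3; // 0, 3, 6 or 9
  return new Date(Date.UTC(date.getUTCFullYear(), firstMonthOfQuarter, 1));
}

// Counts documents per quarter, keyed by the quarter start as an ISO string.
function countByQuarter(docs) {
  const counts = {};
  for (const doc of docs) {
    const key = quarterStartUTC(doc.date).toISOString();
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// The three sample orders from the question.
const orders = [
  { order_id: 1, date: new Date("2021-06-25T00:00:00.000Z"), total: 100 },
  { order_id: 2, date: new Date("2022-03-20T00:00:00.000Z"), total: 200 },
  { order_id: 3, date: new Date("2022-11-03T00:00:00.000Z"), total: 300 }
];

const quarterCounts = countByQuarter(orders);
console.log(quarterCounts);
```

The keys come out as 2021-04-01, 2022-01-01 and 2022-10-01 (all at 00:00 UTC), which are the same instants as the _id values in the result above.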

MongoDB: get sum by year/month with nested values

I'm trying to sum spending by month/year over a collection with nested amounts, with no luck.
This is the collection (extract):
[
{
"_id" : ObjectId("5faaf88d0657287993e541a5"),
"segment" : {
"l1" : "Segment A",
"l2" : "001"
},
"invoiceNo" : "2020.10283940",
"invoicePos" : 3,
"date" : ISODate("2019-09-06T00:00:00.000Z"),
"amount" : {
"document" : {
"amount" : NumberDecimal("125.000000000000"),
"currCode" : "USD"
},
"local" : {
"amount" : NumberDecimal("123.800000000000"),
"currCode" : "CHF"
},
"global" : {
"amount" : NumberDecimal("123.800000000000"),
"currCode" : "CHF"
}
}
},
...
]
I would like to sum up the aggregated invoice volume per month in "global" currency.
I tried this query on MongoDB:
db.invoices.aggregate(
{$project : {
month : {$month : "$date"},
year : {$year : "$date"},
amount : 1
}},
{$unwind: '$amount'},
{$group : {
_id : {month : "$month" ,year : "$year" },
total : {$sum : "$amount.global.amount"}
}})
This is the result I get:
/* 1 */
{
"_id" : ObjectId("5faaf88d0657287993e541a5"),
"amount" : {
"document" : {
"amount" : NumberDecimal("125.000000000000"),
"currCode" : "USD"
},
"local" : {
"amount" : NumberDecimal("123.800000000000"),
"currCode" : "CHF"
},
"global" : {
"amount" : NumberDecimal("123.800000000000"),
"currCode" : "CHF"
}
},
"month" : 9,
"year" : 2019
}
/* 2 */
{
"_id" : ObjectId("5faaf88d0657287993e541ac"),
"amount" : {
"document" : {
"amount" : NumberDecimal("105.560000000000"),
"currCode" : "CHF"
},
"local" : {
"amount" : NumberDecimal("105.560000000000"),
"currCode" : "CHF"
},
"global" : {
"amount" : NumberDecimal("105.560000000000"),
"currCode" : "CHF"
}
},
"month" : 11,
"year" : 2020
}
However, this does not sum all invoices per month; it looks like individual invoice lines, with no aggregation.
I would like to get a result like this:
[
{
"month": 11,
"year": 2020,
"amount" : NumberDecimal("99999.99")
},
{
"month": 10,
"year": 2020,
"amount" : NumberDecimal("99999.99")
},
{
"month": 9,
"year": 2020,
"amount" : NumberDecimal("99999.99")
}
]
What is wrong with my query?
Would this be helpful?
db.invoices.aggregate([
  {
    $group: {
      _id: {
        month: {
          $month: "$date"
        },
        year: {
          $year: "$date"
        }
      },
      total: {
        $sum: "$amount.global.amount"
      }
    }
  },
  { $sort: { "_id.year": -1, "_id.month": -1 } }
])
If you need any extra explanation let me know, but the code is pretty short and self-explanatory.
In principle your aggregation pipeline is fine, but there are a few mistakes:
An aggregation pipeline expects an array of stages.
$unwind is unnecessary because amount is not an array: one document in, one document out.
You can use the date operators directly in $group, so the $project stage is not needed.
So, short and simple:
db.invoices.aggregate([
  {
    $group: {
      _id: { month: { $month: "$date" }, year: { $year: "$date" } },
      total: { $sum: "$amount.global.amount" }
    }
  }
])
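For anyone who wants to see what that $group stage computes, here is a rough in-memory equivalent in plain JavaScript. It is only a sketch: getPath and sumByMonth are hypothetical helpers, the three invoices are made-up data, and plain numbers stand in for NumberDecimal.

```javascript
// Reads a nested value such as "amount.global.amount" from a plain object.
function getPath(obj, path) {
  return path.split(".").reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

// Sums the value at `path` per year/month (UTC), like $group with $sum.
function sumByMonth(docs, path) {
  const totals = new Map();
  for (const doc of docs) {
    const key = `${doc.date.getUTCFullYear()}-${doc.date.getUTCMonth() + 1}`;
    totals.set(key, (totals.get(key) || 0) + getPath(doc, path));
  }
  return totals;
}

// Made-up invoices; plain numbers replace NumberDecimal for this sketch.
const invoices = [
  { date: new Date("2019-09-06T00:00:00Z"), amount: { global: { amount: 100, currCode: "CHF" } } },
  { date: new Date("2019-09-20T00:00:00Z"), amount: { global: { amount: 50, currCode: "CHF" } } },
  { date: new Date("2020-11-02T00:00:00Z"), amount: { global: { amount: 25, currCode: "CHF" } } }
];

const totals = sumByMonth(invoices, "amount.global.amount");
console.log([...totals]);
```

The point is that the nested field is reached through a dotted path on each document, so no $unwind is involved anywhere.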

Aggregate value of each hour by MongoDB

My original data has an irregular time field. I want to get the average value for every hour. I thought about using $match, $group and $project, or even a for loop, but I can't work out an exact approach. The documents have this shape:
id: ObjectId,
value: Number,
time: Date()
I have a sample collection, hours:
{ "_id" : 1, "value" : 10, "dt" : ISODate("2019-10-17T00:01:32Z") }
{ "_id" : 2, "value" : 16, "dt" : ISODate("2019-10-17T00:02:12Z") }
{ "_id" : 3, "value" : 8, "dt" : ISODate("2019-10-17T01:04:09Z") }
{ "_id" : 4, "value" : 12, "dt" : ISODate("2019-10-17T02:14:21Z") }
{ "_id" : 5, "value" : 6, "dt" : ISODate("2019-10-17T02:54:02Z") }
{ "_id" : 6, "value" : 11, "dt" : ISODate("2019-10-17T04:06:31Z") }
The following aggregation query returns the average value per hour. The hour is taken from the dt field and shifted by one, so each bucket is labelled with the hour at which it ends:
db.hours.aggregate( [
  { $project: { value: 1, hr: { $hour: "$dt" } } },
  { $addFields: { hour: { $add: [ "$hr", 1 ] } } },
  {
    $group: {
      _id: "$hour",
      count: { $sum: 1 },
      totalValue: { $sum: "$value" },
      avgValue: { $avg: "$value" }
    }
  },
  { $project: { hour: "$_id", _id: 0, count: 1, totalValue: 1, avgValue: 1 } }
] )
=>
{ "count" : 2, "totalValue" : 18, "avgValue" : 9, "hour" : 3 }
{ "count" : 1, "totalValue" : 8, "avgValue" : 8, "hour" : 2 }
{ "count" : 1, "totalValue" : 11, "avgValue" : 11, "hour" : 5 }
{ "count" : 2, "totalValue" : 26, "avgValue" : 13, "hour" : 1 }
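One thing to keep in mind: grouping on $hour alone puts the same clock hour from different days into one bucket. The plain-JavaScript sketch below shows a slightly different technique, truncating each timestamp to the start of its hour (UTC) so that the date stays part of the key. It works in memory on the sample documents; averageByHour is a hypothetical helper name, not something from the answer.

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;

// Averages `value` per clock hour, keyed by the start of that hour (UTC).
function averageByHour(docs) {
  const buckets = new Map();
  for (const { value, dt } of docs) {
    // Round the timestamp down to a whole hour.
    const key = new Date(Math.floor(dt.getTime() / MS_PER_HOUR) * MS_PER_HOUR).toISOString();
    const bucket = buckets.get(key) || { count: 0, totalValue: 0 };
    bucket.count += 1;
    bucket.totalValue += value;
    buckets.set(key, bucket);
  }
  return [...buckets].map(([hourStart, b]) => ({
    hourStart,
    count: b.count,
    totalValue: b.totalValue,
    avgValue: b.totalValue / b.count
  }));
}

// The sample documents from the hours collection.
const hours = [
  { _id: 1, value: 10, dt: new Date("2019-10-17T00:01:32Z") },
  { _id: 2, value: 16, dt: new Date("2019-10-17T00:02:12Z") },
  { _id: 3, value: 8, dt: new Date("2019-10-17T01:04:09Z") },
  { _id: 4, value: 12, dt: new Date("2019-10-17T02:14:21Z") },
  { _id: 5, value: 6, dt: new Date("2019-10-17T02:54:02Z") },
  { _id: 6, value: 11, dt: new Date("2019-10-17T04:06:31Z") }
];

const hourly = averageByHour(hours);
console.log(hourly);
```

The counts, totals and averages agree with the aggregation output above; only the bucket label differs (start of the hour instead of hour + 1).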
Finally, I solved this issue. Below is my code.

Mongo aggregation $subtract between dynamic document value

I need to find the difference between two attendance counts for two dates, grouped by ward_id and matched on patient id. The result is dynamic because it depends on the contents of the values array. The key would be ward_id, and the difference is between the counts of each patient's visits to the ward on the two dates.
Sample data:
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-03T21:31:29.902Z"),
"ward_id" : 2561
},
"count" : 4112,
"values" : [
{
"count" : 9,
"patient" : ObjectId("54766f973f35473ffc644618")
},
{
"count" : 19,
"patient" : ObjectId("546680e2d660e2dc5ebfea39")
},
{
"count" : 47,
"patient" : ObjectId("546680e3d660e2dc5ebfea72")
},
{
"count" : 1,
"patient" : ObjectId("546a137bdab5f21e612ea7ef")
},
{
"count" : 93,
"patient" : ObjectId("546680e3d660e2dc5ebfea89")
}
]
}
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-03T21:31:29.902Z"),
"ward_id" : 3720
},
"count" : 1,
"values" : [
{
"count" : 1,
"patient" : ObjectId("546a136ddab5f21e612ea6a6")
}
]
}
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-04T21:31:29.902Z"),
"ward_id" : 2561
},
"count" : 4112,
"values" : [
{
"count" : 10,
"patient" : ObjectId("54766f973f35473ffc644618")
},
{
"count" : 10,
"patient" : ObjectId("546680e2d660e2dc5ebfea39")
},
{
"count" : 6,
"patient" : ObjectId("5474e9e46606f32570fa48ff")
},
{
"count" : 1,
"patient" : ObjectId("5474e9e36606f32570fa48f2")
},
{
"count" : 1,
"patient" : ObjectId("546680e3d660e2dc5ebfea77")
},
{
"count" : 543,
"patient" : ObjectId("546680e2d660e2dc5ebfea43")
},
{
"count" : 1,
"patient" : ObjectId("5485fdc8d27a9122956b1c66")
}
]
}
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-04T21:31:29.902Z"),
"ward_id" : 3720
},
"count" : 1,
"values" : [
{
"count" : 7,
"patient" : ObjectId("546a136ddab5f21e612ea6a6")
}
]
}
Expected output:
{
  "ward_id" : 2561,
  "result" : [
    { "person" : ObjectId("54766f973f35473ffc644618"), "count_1" : 9, "count_2" : 10, "difference" : 1 },
    { "person" : ObjectId("546680e2d660e2dc5ebfea39"), "count_1" : 19, "count_2" : 10, "difference" : -9 },
    ....
  ]
},
{
  "ward_id" : 3720,
  "result" : [
    { "person" : ObjectId("546a136ddab5f21e612ea6a6"), "count_1" : 9, "count_2" : 10, "difference" : 1 },
    { "person" : ObjectId("546680e2d660e2dc5ebfea39"), "count_1" : 1, "count_2" : 7, "difference" : -6 }
  ]
}
You can use the aggregation framework's $subtract operator, outlined here: http://docs.mongodb.org/manual/reference/operator/aggregation-arithmetic/
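db.wards.aggregate([
  // _id is an embedded document, so its fields are matched with dot notation.
  // my_ward_id, my_patient_id, my_first_ts and my_second_ts are placeholders.
  { $match: { "_id.ward_id": my_ward_id, "_id.ts": { $in: [my_first_ts, my_second_ts] } } },
  // Sort by timestamp so that $first/$last below are well defined.
  { $sort: { "_id.ts": 1 } },
  { $project: { values: 1 } },
  { $unwind: "$values" },
  // After $unwind the patient id lives at values.patient.
  { $match: { "values.patient": my_patient_id } },
  {
    $group: {
      _id: null,
      count1: { $first: "$values.count" },
      count2: { $last: "$values.count" }
    }
  },
  // $subtract is an expression, not a stage, so it has to go inside $project.
  { $project: { _id: 0, count1: 1, count2: 1, difference: { $subtract: ["$count2", "$count1"] } } }
])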
I haven't tested this, but it would probably look something like the above.
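To pin down the expected numbers, here is a plain-JavaScript sketch that computes the per-patient difference between two snapshot documents of one ward in memory. It is not the pipeline above, just the same arithmetic on a trimmed copy of the ward 2561 sample data; countsByPatient and attendanceDifference are hypothetical helpers, and ObjectIds are written as strings.

```javascript
// Builds a patient -> count lookup from one snapshot document's values array.
function countsByPatient(snapshot) {
  const lookup = new Map();
  for (const { patient, count } of snapshot.values) lookup.set(patient, count);
  return lookup;
}

// For patients present in both snapshots, reports later count minus earlier count.
function attendanceDifference(earlier, later) {
  const before = countsByPatient(earlier);
  const result = [];
  for (const { patient, count } of later.values) {
    if (before.has(patient)) {
      const count_1 = before.get(patient);
      result.push({ person: patient, count_1, count_2: count, difference: count - count_1 });
    }
  }
  return { ward_id: later._id.ward_id, result };
}

// Trimmed copies of the two ward 2561 documents (ObjectIds as strings).
const day1 = {
  _id: { ts: new Date("2015-02-03T21:31:29.902Z"), ward_id: 2561 },
  values: [
    { count: 9, patient: "54766f973f35473ffc644618" },
    { count: 19, patient: "546680e2d660e2dc5ebfea39" },
    { count: 47, patient: "546680e3d660e2dc5ebfea72" }
  ]
};
const day2 = {
  _id: { ts: new Date("2015-02-04T21:31:29.902Z"), ward_id: 2561 },
  values: [
    { count: 10, patient: "54766f973f35473ffc644618" },
    { count: 10, patient: "546680e2d660e2dc5ebfea39" },
    { count: 6, patient: "5474e9e46606f32570fa48ff" }
  ]
};

const diff = attendanceDifference(day1, day2);
console.log(JSON.stringify(diff, null, 2));
```

Patients that appear on only one of the two dates are skipped here; whether they should be reported with a missing count instead is a decision the question leaves open.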

How to group by seconds without the ISODate decimal part in MongoDB

I want to query the database for the number of posts per second, to feed into a graph that shows the activity trend. I use spring-data-mongo, but the first step is to get this working in the mongo shell before worrying about how to do it from Java.
I used the group() command on the collection as shown below:
db.post.group({
key:{dateCreated: 1},
cond: { dateCreated:
{
"$gt": new ISODate("2013-08-09T05:51:15Z"),
"$lt": new ISODate("2013-08-09T05:51:20Z")
}
},
reduce: function(cur, result){
result.count += 1
},
initial: {count:0}
})
The result is encouraging, but the counts are wrong: because of the fractional (millisecond) part of the ISODate, each distinct timestamp becomes its own group, so every count is 1.
[
{
"dateCreated" : ISODate("2013-08-09T05:51:15.332Z"),
"count" : 1
},
{
"dateCreated" : ISODate("2013-08-09T05:51:15.378Z"),
"count" : 1
},
{
"dateCreated" : ISODate("2013-08-09T05:51:15.377Z"),
"count" : 1
},
// many more here
]
Is there a way to consider only whole seconds, so that the result looks like this:
[
{
"dateCreated" : ISODate("2013-08-09T05:51:15Z"),
"count" : 5
},
{
"dateCreated" : ISODate("2013-08-09T05:51:16Z"),
"count" : 8
},
{
"dateCreated" : ISODate("2013-08-09T05:51:17Z"),
"count" : 3
},
{
"dateCreated" : ISODate("2013-08-09T05:51:18Z"),
"count" : 10
},
{
"dateCreated" : ISODate("2013-08-09T05:51:19Z"),
"count" : 2
},
{
"dateCreated" : ISODate("2013-08-09T05:51:20Z"),
"count" : 13
}
]
Thanks for reading this.
For those in the same situation, here is how I modified my query. Thanks to @Sammaye.
db.post.aggregate([
  {
    $match: {
      dateCreated: {
        "$gt": new ISODate("2013-08-09T05:51:15.000Z"),
        "$lt": new ISODate("2013-08-09T05:51:20.000Z")
      }
    }
  },
  {
    $group: {
      _id: {
        hour: { $hour: "$dateCreated" },
        minute: { $minute: "$dateCreated" },
        second: { $second: "$dateCreated" }
      },
      cnt: { $sum: 1 }
    }
  }
])
{
"result" : [
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 19
},
"cnt" : 26
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 18
},
"cnt" : 29
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 17
},
"cnt" : 27
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 16
},
"cnt" : 25
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 15
},
"cnt" : 16
}
],
"ok" : 1
}
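Note that an _id made of only hour, minute and second would merge the same clock time from different days if the $match window were ever widened. A different way to get one bucket per second is to truncate the timestamp itself. The plain-JavaScript sketch below shows that idea in memory; countPerSecond is a hypothetical helper, the first three timestamps come from the question and the fourth is made up.

```javascript
// Counts dates per whole second, keyed by the truncated timestamp (UTC, ISO string).
function countPerSecond(dates) {
  const counts = {};
  for (const d of dates) {
    // Drop the millisecond part by rounding down to a whole second.
    const key = new Date(Math.floor(d.getTime() / 1000) * 1000).toISOString();
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const created = [
  new Date("2013-08-09T05:51:15.332Z"),
  new Date("2013-08-09T05:51:15.378Z"),
  new Date("2013-08-09T05:51:15.377Z"),
  new Date("2013-08-09T05:51:16.010Z") // made-up extra timestamp
];

const perSecond = countPerSecond(created);
console.log(perSecond);
```

Keeping the full truncated date as the key also gives the dateCreated-style output the question originally asked for.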