As the image shows, the table above represents my original data; the time field is irregular. I want to compute the average value for each hour. I thought of using $match, $group, and $project, or even a for loop, but I can't work out an accurate approach.
id: ObjectId,
value: Number,
time: Date()
I have a sample collection, hours:
{ "_id" : 1, "value" : 10, "dt" : ISODate("2019-10-17T00:01:32Z") }
{ "_id" : 2, "value" : 16, "dt" : ISODate("2019-10-17T00:02:12Z") }
{ "_id" : 3, "value" : 8, "dt" : ISODate("2019-10-17T01:04:09Z") }
{ "_id" : 4, "value" : 12, "dt" : ISODate("2019-10-17T02:14:21Z") }
{ "_id" : 5, "value" : 6, "dt" : ISODate("2019-10-17T02:54:02Z") }
{ "_id" : 6, "value" : 11, "dt" : ISODate("2019-10-17T04:06:31Z") }
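The following aggregation query returns the average value per hour. The hour is extracted from the date field with $hour and 1 is added to it, so documents from 00:00 to 00:59 are reported under hour 1: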
db.hours.aggregate( [
{ $project: { value: 1, hr: { $hour: "$dt" } } } ,
{ $addFields: { hour: { $add: [ "$hr", 1 ] } } },
{ $group: { _id: "$hour",
count: { $sum: 1 },
totalValue: { $sum: "$value" },
avgValue: { $avg: "$value" }
}
},
{ $project: { hour: "$_id", _id: 0, count: 1, totalValue: 1, avgValue: 1} }
] )
=>
{ "count" : 2, "totalValue" : 18, "avgValue" : 9, "hour" : 3 }
{ "count" : 1, "totalValue" : 8, "avgValue" : 8, "hour" : 2 }
{ "count" : 1, "totalValue" : 11, "avgValue" : 11, "hour" : 5 }
{ "count" : 2, "totalValue" : 26, "avgValue" : 13, "hour" : 1 }
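To check the numbers above without a database, the same grouping can be sketched in plain JavaScript. This is only an illustration of what the pipeline computes, not a replacement for it; the function name averageByHour is made up for this example. Note that $hour (like getUTCHours here) only returns the hour of the day, so if the data spans several days, the same hour on different days would land in one bucket.

```javascript
// In-memory copy of the sample documents from the `hours` collection.
const docs = [
  { _id: 1, value: 10, dt: new Date("2019-10-17T00:01:32Z") },
  { _id: 2, value: 16, dt: new Date("2019-10-17T00:02:12Z") },
  { _id: 3, value: 8, dt: new Date("2019-10-17T01:04:09Z") },
  { _id: 4, value: 12, dt: new Date("2019-10-17T02:14:21Z") },
  { _id: 5, value: 6, dt: new Date("2019-10-17T02:54:02Z") },
  { _id: 6, value: 11, dt: new Date("2019-10-17T04:06:31Z") },
];

// averageByHour is a hypothetical helper: it groups by UTC hour and labels
// each bucket hour + 1, mirroring the $hour and $add stages of the pipeline.
function averageByHour(documents) {
  const buckets = new Map();
  for (const { value, dt } of documents) {
    const hour = dt.getUTCHours() + 1;
    const bucket = buckets.get(hour) || { count: 0, totalValue: 0 };
    bucket.count += 1;
    bucket.totalValue += value;
    buckets.set(hour, bucket);
  }
  // Sort by hour and derive the average from the running total.
  return [...buckets.entries()]
    .sort(([a], [b]) => a - b)
    .map(([hour, { count, totalValue }]) => ({
      hour,
      count,
      totalValue,
      avgValue: totalValue / count,
    }));
}

const hourly = averageByHour(docs);
console.log(hourly);
```

The printed buckets should match the aggregation output above, apart from the ordering, which $group does not guarantee.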
Finally, I solved this issue. Below is my code.
I have documents like this:
{
"_id" : ObjectId("588e505fcdefc41e84c184cb"),
"Id" : 58614891,
"modifyDate" : 1485567717000,
"data" : [
{
"id" : 99,
"stats" : {
"totalDepth" : 4,
"totalSpeed" : 2,
"totalLostSessions" : 2,
"KDI" : 8,
}
},
{
"id" : 18,
"stats" : {
"totalDepth" : 2,
"totalSpeed" : 1,
"totalLostSessions" : 1,
"KDI" : 2,
}
}
],
"timestampPull" : 1485721695291,
"region" : "eu",
"Status" : 200
}
{
"_id" : ObjectId("588e5060cdefc41e84c184cd"),
"Id" : 38004043,
"modifyDate" : 1485515118000,
"data" : [
{
"id" : 18,
"stats" : {
"totalDepth" : 5,
"totalSpeed" : 3,
"totalLostSessions" : 2,
"KDI" : 14,
}
},
{
"id" : 62,
"stats" : {
"totalDepth" : 1,
"totalSpeed" : 0,
"totalLostSessions" : 1,
"KDI" : 1,
}
},
{
"id" : 0,
"stats" : {
"totalDepth" : 155,
"totalSpeed" : 70,
"totalLostSessions" : 85,
"KDI" : 865,
}
}
],
"timestampPull" : 1485721696025,
"region" : "na",
"Status" : 200
}
I want to calculate the average of every stat across the "data" elements whose id matches:
{
"id" : 99,
"stats" : {
"totalDepth" : 4,
"totalSpeed" : 2,
"totalLostSessions" : 2,
"KDI" : 8,
}
},
{
"id" : 18,
"stats" : {
"totalDepth" : 3.5,
"totalSpeed" : 2,
"totalLostSessions" : 1.5,
"KDI" : 8,
}
} ...
Is it possible to perform such an operation in MongoDB? I can easily pull all the data into the application and average it there, but that's not very efficient.
You can try the aggregation below.
$unwind the data array.
$group by id, using $avg to average each stat and $sum to count the number of values.
$match to keep only the groups where count is greater than 1.
db.collection.aggregate([
    // Emit one document per element of the data array.
    { $unwind: "$data" },
    // Average each stat per data.id and count the contributing elements.
    {
        $group: {
            _id: "$data.id",
            count: { $sum: 1 },
            totalDepth: { $avg: "$data.stats.totalDepth" },
            totalSpeed: { $avg: "$data.stats.totalSpeed" },
            totalLostSessions: { $avg: "$data.stats.totalLostSessions" },
            KDI: { $avg: "$data.stats.KDI" }
        }
    },
    // Keep only the ids that occur more than once.
    { $match: { count: { $gt: 1 } } }
])
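To sanity-check what the three stages compute on the two sample documents, here is a small in-memory sketch in plain JavaScript. The helper name averageStatsById and its minCount parameter are made up for this illustration; the documents are trimmed to the fields that matter.

```javascript
// Trimmed copies of the two sample documents from the question.
const docs = [
  { Id: 58614891, data: [
    { id: 99, stats: { totalDepth: 4, totalSpeed: 2, totalLostSessions: 2, KDI: 8 } },
    { id: 18, stats: { totalDepth: 2, totalSpeed: 1, totalLostSessions: 1, KDI: 2 } },
  ] },
  { Id: 38004043, data: [
    { id: 18, stats: { totalDepth: 5, totalSpeed: 3, totalLostSessions: 2, KDI: 14 } },
    { id: 62, stats: { totalDepth: 1, totalSpeed: 0, totalLostSessions: 1, KDI: 1 } },
    { id: 0, stats: { totalDepth: 155, totalSpeed: 70, totalLostSessions: 85, KDI: 865 } },
  ] },
];

// averageStatsById is a hypothetical helper mirroring $unwind, $group, $match.
function averageStatsById(documents, minCount = 2) {
  const groups = new Map();
  for (const doc of documents) {
    for (const { id, stats } of doc.data) { // plays the role of $unwind
      const group = groups.get(id) || { count: 0, sums: {} };
      group.count += 1;
      for (const [key, val] of Object.entries(stats)) {
        group.sums[key] = (group.sums[key] || 0) + val;
      }
      groups.set(id, group);
    }
  }
  const result = [];
  for (const [id, { count, sums }] of groups) {
    if (count < minCount) continue; // plays the role of $match on count
    const stats = {};
    for (const [key, sum] of Object.entries(sums)) stats[key] = sum / count;
    result.push({ id, count, stats });
  }
  return result;
}

const averaged = averageStatsById(docs);
const withSingles = averageStatsById(docs, 1);
console.log(JSON.stringify(averaged));
```

With the default threshold only id 18 survives, because it is the only id present in both documents. The expected output in the question also lists id 99, which occurs once; passing 1 as minCount here (or dropping the $match stage in the pipeline) would keep such ids as well.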
My aggregate query for retrieving "average percents per month" returns a -Infinity average for some months. What would cause this?
The relevant properties in mycollection are mydate and mynumericfield, which stores a percentage value as a double.
db.mycollection.aggregate(
[
{
$match: {
mydate: {
$gte: new Date(Date.UTC(2014, 8, 1)),
$lte: new Date(Date.UTC(2014, 12, 1)),
}
}
},
{
$group : {
_id : { month: { $month: "$mydate" }, year: { $year: "$mydate" } },
average: { $avg: "$mynumericfield" },
count: { $sum: 1 }
}
}
]
)
Here's a sample of the result:
/* 1 */
{
"result" : [
{
"_id" : {
"month" : 9,
"year" : 2014
},
"average" : 84.2586640583996598,
"count" : 20959.0000000000000000
},
{
"_id" : {
"month" : 11,
"year" : 2014
},
"average" : 96.9326915103888638,
"count" : 20743.0000000000000000
},
{
"_id" : {
"month" : 10,
"year" : 2014
},
"average" : -Infinity,
"count" : 20939.0000000000000000
},
{
"_id" : {
"month" : 12,
"year" : 2014
},
"average" : -Infinity,
"count" : 20913.0000000000000000
}
],
"ok" : 1.0000000000000000
}
I've managed to somehow reproduce the -Infinity problem but on a smaller scale.
Let's take sample collection nums having only 4 documents in it:
{ "_id" : 0, "grp" : 1, "mynum" : -Infinity }
{ "_id" : 1, "grp" : 1, "mynum" : 5 }
{ "_id" : 3, "grp" : 2, "mynum" : 8 }
{ "_id" : 4, "grp" : 2, "mynum" : 89 }
Performing simple aggregation on this collection like:
> db.nums.aggregate([{$group:{"_id":"$grp", "average" : {$avg : "$mynum"}}}])
gives the following result:
{ "_id" : 2, "average" : 48.5 }
{ "_id" : 1, "average" : -Infinity }
which has the same effect as what you have experienced.
Please try to find out whether in your collection there is a document which has mynumericfield with value -Infinity - maybe your situation is similar to the reproduced one:
> db.mycollection.find({mynumericfield : -Infinity})
I hope this helps in some way.
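The reason a single value is enough is plain floating-point arithmetic: any finite sum that includes -Infinity is -Infinity, and so is the average. A minimal sketch in plain JavaScript, using the four documents above (the helper averageByGroup and its finiteOnly option are made up for this illustration):

```javascript
// The four documents of the sample `nums` collection.
const nums = [
  { _id: 0, grp: 1, mynum: -Infinity },
  { _id: 1, grp: 1, mynum: 5 },
  { _id: 3, grp: 2, mynum: 8 },
  { _id: 4, grp: 2, mynum: 89 },
];

// averageByGroup is a hypothetical helper: it averages mynum per grp and can
// optionally skip non-finite values (Infinity, -Infinity, NaN).
function averageByGroup(documents, { finiteOnly = false } = {}) {
  const groups = new Map();
  for (const { grp, mynum } of documents) {
    if (finiteOnly && !Number.isFinite(mynum)) continue;
    const group = groups.get(grp) || { sum: 0, count: 0 };
    group.sum += mynum;
    group.count += 1;
    groups.set(grp, group);
  }
  return new Map([...groups].map(([grp, { sum, count }]) => [grp, sum / count]));
}

const raw = averageByGroup(nums);
const cleaned = averageByGroup(nums, { finiteOnly: true });
console.log(raw.get(1), raw.get(2), cleaned.get(1));
```

Group 1 averages to -Infinity as long as the offending document is included and to 5 once it is skipped, which suggests that excluding or repairing such documents before averaging would remove the -Infinity months.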
I need to find the difference between two attendance values for two dates, grouped by ward_id and by patient id. The result has dynamic values based on the array. The key would be ward_id, and the difference would be between the counts of each patient's visits to the ward on the two dates.
Sample data
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-03T21:31:29.902Z"),
"ward_id" : 2561
},
"count" : 4112,
"values" : [
{
"count" : 9,
"patient" : ObjectId("54766f973f35473ffc644618")
},
{
"count" : 19,
"patient" : ObjectId("546680e2d660e2dc5ebfea39")
},
{
"count" : 47,
"patient" : ObjectId("546680e3d660e2dc5ebfea72")
},
{
"count" : 1,
"patient" : ObjectId("546a137bdab5f21e612ea7ef")
},
{
"count" : 93,
"patient" : ObjectId("546680e3d660e2dc5ebfea89")
}
]
}
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-03T21:31:29.902Z"),
"ward_id" : 3720
},
"count" : 1,
"values" : [
{
"count" : 1,
"patient" : ObjectId("546a136ddab5f21e612ea6a6")
}
]
}
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-04T21:31:29.902Z"),
"ward_id" : 2561
},
"count" : 4112,
"values" : [
{
"count" : 10,
"patient" : ObjectId("54766f973f35473ffc644618")
},
{
"count" : 10,
"patient" : ObjectId("546680e2d660e2dc5ebfea39")
},
{
"count" : 6,
"patient" : ObjectId("5474e9e46606f32570fa48ff")
},
{
"count" : 1,
"patient" : ObjectId("5474e9e36606f32570fa48f2")
},
{
"count" : 1,
"patient" : ObjectId("546680e3d660e2dc5ebfea77")
},
{
"count" : 543,
"patient" : ObjectId("546680e2d660e2dc5ebfea43")
},
{
"count" : 1,
"patient" : ObjectId("5485fdc8d27a9122956b1c66")
}
]
}
{
"_id" : {
"type" : "patient_attendence",
"ts" : ISODate("2015-02-04T21:31:29.902Z"),
"ward_id" : 3720
},
"count" : 1,
"values" : [
{
"count" : 7,
"patient" : ObjectId("546a136ddab5f21e612ea6a6")
}
]
}
Output
{
"ward_id":2561,
"result" : [{"person": ObjectId("54766f973f35473ffc644618"),
"count_1": 9,
"count_2": 10,
"difference":1 },{"person": ObjectId("546680e2d660e2dc5ebfea39"),
"count_1": 19,
"count_2": 10,
"difference":-9 } ....]
},
{
"ward_id":3720,
"result" : [{"person": ObjectId("546a136ddab5f21e612ea6a6"),
"count_1": 1,
"count_2": 7,
"difference":6 }]
}
You can use the aggregation framework's $subtract operator, outlined here: http://docs.mongodb.org/manual/reference/operator/aggregation-arithmetic/
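db.wards.aggregate([
    // my_ward_id, my_first_ts, my_second_ts and my_patient_id are placeholders.
    // _id is a subdocument, so its fields are matched with dot notation.
    {
        $match: {'_id.ward_id': my_ward_id, '_id.ts': {$in: [my_first_ts, my_second_ts]}}
    },
    // Sort by date so that $first and $last below are well defined.
    {
        $sort: {'_id.ts': 1}
    },
    {
        $limit: 2
    },
    {
        $project: {values: 1}
    },
    {
        $unwind: '$values'
    },
    // After $unwind the patient id lives under values.patient.
    {
        $match: {'values.patient': my_patient_id}
    },
    {
        $group: {
            _id: null,
            'count1': {$first: '$values.count'},
            'count2': {$last: '$values.count'}
        }
    },
    // $subtract is an expression, not a stage, so it is wrapped in $project.
    // Later count minus earlier count, as in the expected output.
    {
        $project: {difference: {$subtract: ['$count2', '$count1']}}
    }
])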
I haven't tested this, but it would probably look something like the above.
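Since the pipeline above is untested and handles one ward and one patient at a time, here is a plain JavaScript sketch of the full per-ward, per-patient result computed in memory, which may help to verify whatever pipeline you end up with. Several things are assumptions made for this illustration: the helper name attendanceDifference, patient ids and timestamps written as plain strings instead of ObjectId and ISODate, a trimmed copy of the sample data, and the choice to skip patients who appear on only one of the two dates.

```javascript
const FIRST_TS = "2015-02-03T21:31:29.902Z";
const SECOND_TS = "2015-02-04T21:31:29.902Z";

// Trimmed copy of the sample data; ids and dates are plain strings here.
const snapshots = [
  { _id: { ts: FIRST_TS, ward_id: 2561 }, values: [
    { count: 9, patient: "54766f973f35473ffc644618" },
    { count: 19, patient: "546680e2d660e2dc5ebfea39" },
    { count: 47, patient: "546680e3d660e2dc5ebfea72" },
  ] },
  { _id: { ts: FIRST_TS, ward_id: 3720 }, values: [
    { count: 1, patient: "546a136ddab5f21e612ea6a6" },
  ] },
  { _id: { ts: SECOND_TS, ward_id: 2561 }, values: [
    { count: 10, patient: "54766f973f35473ffc644618" },
    { count: 10, patient: "546680e2d660e2dc5ebfea39" },
    { count: 6, patient: "5474e9e46606f32570fa48ff" },
  ] },
  { _id: { ts: SECOND_TS, ward_id: 3720 }, values: [
    { count: 7, patient: "546a136ddab5f21e612ea6a6" },
  ] },
];

// attendanceDifference is a hypothetical helper. It collects count_1 (first
// date) and count_2 (second date) per ward and patient, then subtracts them.
function attendanceDifference(docs, firstTs, secondTs) {
  const wards = new Map(); // ward_id -> Map(patient -> { count_1, count_2 })
  for (const { _id, values } of docs) {
    const field = _id.ts === firstTs ? "count_1"
      : _id.ts === secondTs ? "count_2" : null;
    if (field === null) continue; // snapshot from some other date
    if (!wards.has(_id.ward_id)) wards.set(_id.ward_id, new Map());
    const patients = wards.get(_id.ward_id);
    for (const { patient, count } of values) {
      const entry = patients.get(patient) || {};
      entry[field] = count;
      patients.set(patient, entry);
    }
  }
  return [...wards].map(([ward_id, patients]) => ({
    ward_id,
    result: [...patients]
      // Assumption: only patients present on both dates are reported.
      .filter(([, e]) => e.count_1 !== undefined && e.count_2 !== undefined)
      .map(([person, e]) => ({
        person,
        count_1: e.count_1,
        count_2: e.count_2,
        difference: e.count_2 - e.count_1,
      })),
  }));
}

const report = attendanceDifference(snapshots, FIRST_TS, SECOND_TS);
console.log(JSON.stringify(report, null, 2));
```

If patients seen on only one date should be reported too, the filter could be replaced by a default of 0 for the missing count; which behavior is wanted is not clear from the question.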
The structure is the following:
{
"_id" : "79f00e2f-5ff6-42e9-a341-3d50410168de",
"bookings" : [
{
"name" : "name1",
"email" : "george_bush@gov.us",
"startDate" : ISODate("2013-12-31T22:00:00Z"),
"endDate" : ISODate("2014-01-09T22:00:00Z")
},
{
"name" : "name2",
"email" : "george_bush@gov.us",
"startDate" : ISODate("2014-01-19T22:00:00Z"),
"endDate" : ISODate("2014-01-24T22:00:00Z")
}
],
"name" : "Hotel0",
"price" : 0,
"rating" : 2
}
Now I want to generate a report telling me how many bookings were made, grouped by booking month (assume that only the booking start date matters) and by hotel rating.
I expect the answer to look like this:
{
{
rating: 0,
counts: {
month1: 10,
month2: 20,
...
month12: 7
}
}
{
rating: 1,
counts: {
month1: 5,
month2: 8,
...
month12: 9
}
}
...
{
rating: 6,
counts: {
month1: 22,
month2: 23,
...
month12: 24
}
}
}
I tried this with aggregation framework but I'm a little bit stuck.
The following query:
db.book.aggregate([
{ $unwind: '$bookings' },
{ $project: { bookings: 1, rating: 1, month: { $month: '$bookings.startDate' } } },
{ $group: { _id: { rating: '$rating', month: '$month' }, count: { $sum: 1 } } }
]);
will give you the result per rating/month, but it does not make a subdocument for the months. In general, you cannot convert a value (such as the month number) to a key (such as month1); this is something you can probably handle quite easily in your application, though.
The above aggregation results in:
"result" : [
{
"_id" : {
"rating" : 2,
"month" : 1
},
"count" : 1
},
{
"_id" : {
"rating" : 2,
"month" : 12
},
"count" : 1
}
],
"ok" : 1
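The application-side step could look roughly like the following plain JavaScript sketch. It takes the result array shown above and builds one object per rating with month1 ... month12 keys; the helper name pivotByRating is made up for this illustration, and months with no bookings are simply absent rather than set to 0.

```javascript
// The "result" array returned by the aggregation above.
const rows = [
  { _id: { rating: 2, month: 1 }, count: 1 },
  { _id: { rating: 2, month: 12 }, count: 1 },
];

// pivotByRating is a hypothetical helper: it turns each month number into a
// key such as "month1" and collects the counts per rating.
function pivotByRating(resultRows) {
  const byRating = new Map();
  for (const { _id: { rating, month }, count } of resultRows) {
    if (!byRating.has(rating)) byRating.set(rating, { rating, counts: {} });
    byRating.get(rating).counts[`month${month}`] = count;
  }
  return [...byRating.values()];
}

const reportByRating = pivotByRating(rows);
console.log(JSON.stringify(reportByRating));
```

For the sample hotel this yields a single entry for rating 2 with month1 and month12 both set to 1.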
I want to query the database for the number of posts per second, to feed a graph that shows the activity trend. I use spring-data-mongo, but the first step is to get this working in the mongo shell before worrying about how to do it from Java.
I used the group() method on it as shown below:
db.post.group({
key:{dateCreated: 1},
cond: { dateCreated:
{
"$gt": new ISODate("2013-08-09T05:51:15Z"),
"$lt": new ISODate("2013-08-09T05:51:20Z")
}
},
reduce: function(cur, result){
result.count += 1
},
initial: {count:0}
})
The result is encouraging, but the count seems wrong: because of the fractional (millisecond) part of the ISODate, it groups by the full timestamp rather than by second, so each count is 1.
[
{
"dateCreated" : ISODate("2013-08-09T05:51:15.332Z"),
"count" : 1
},
{
"dateCreated" : ISODate("2013-08-09T05:51:15.378Z"),
"count" : 1
},
{
"dateCreated" : ISODate("2013-08-09T05:51:15.377Z"),
"count" : 1
},
// many more here
]
Is there a way to consider only the seconds part, so that the result looks like this:
[
{
"dateCreated" : ISODate("2013-08-09T05:51:15Z"),
"count" : 5
},
{
"dateCreated" : ISODate("2013-08-09T05:51:16Z"),
"count" : 8
},
{
"dateCreated" : ISODate("2013-08-09T05:51:17Z"),
"count" : 3
},
{
"dateCreated" : ISODate("2013-08-09T05:51:18Z"),
"count" : 10
},
{
"dateCreated" : ISODate("2013-08-09T05:51:19Z"),
"count" : 2
},
{
"dateCreated" : ISODate("2013-08-09T05:51:20Z"),
"count" : 13
}
]
Thanks for reading this.
For those in the same situation, here is how I modified my query. Thanks to @Sammaye.
db.post.aggregate([
    // Restrict to the time window of interest.
    {
        $match: {
            dateCreated: {
                "$gt": new ISODate("2013-08-09T05:51:15.000Z"),
                "$lt": new ISODate("2013-08-09T05:51:20.000Z")
            }
        }
    },
    // Group by hour, minute and second, which drops the milliseconds.
    {
        $group: {
            _id: {
                hour: {$hour: "$dateCreated"},
                minute: {$minute: "$dateCreated"},
                second: {$second: "$dateCreated"}
            },
            cnt: {$sum : 1}
        }
    }
])
{
"result" : [
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 19
},
"cnt" : 26
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 18
},
"cnt" : 29
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 17
},
"cnt" : 27
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 16
},
"cnt" : 25
},
{
"_id" : {
"hour" : 5,
"minute" : 51,
"second" : 15
},
"cnt" : 16
}
],
"ok" : 1
}
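For comparison, here is the same idea as a plain JavaScript sketch that truncates each timestamp to the whole second and returns the counts in the shape asked for in the question (a dateCreated per second plus a count). The helper name countPerSecond is made up, and the fourth sample timestamp is invented for the illustration; the first three come from the output shown in the question.

```javascript
// The first three timestamps come from the question; the fourth is made up.
const posts = [
  { dateCreated: new Date("2013-08-09T05:51:15.332Z") },
  { dateCreated: new Date("2013-08-09T05:51:15.378Z") },
  { dateCreated: new Date("2013-08-09T05:51:15.377Z") },
  { dateCreated: new Date("2013-08-09T05:51:16.004Z") },
];

// countPerSecond is a hypothetical helper: it drops the milliseconds from
// each timestamp and counts how many posts fall into each second.
function countPerSecond(documents) {
  const counts = new Map();
  for (const { dateCreated } of documents) {
    const second = Math.floor(dateCreated.getTime() / 1000) * 1000;
    counts.set(second, (counts.get(second) || 0) + 1);
  }
  return [...counts]
    .sort(([a], [b]) => a - b)
    .map(([ms, count]) => ({ dateCreated: new Date(ms), count }));
}

const perSecond = countPerSecond(posts);
console.log(perSecond);
```

One thing to keep in mind with the aggregation above: grouping on hour, minute and second alone would merge the same second from different days if the $match window ever spans more than one day, whereas truncating the full timestamp, as in this sketch, keeps them apart.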