How to Get the Max Daily Value per Week by Grouping - mongodb

I am working on a project that requires me to first calculate how much distance was traveled per day and then, from that data, show the maximum, minimum and average distance traveled in that particular week.
This is the MongoDB script I have written.
db = connect("localhost:27017/mydb");
var result = db.trips.aggregate([
{
"$unwind" : "$trips"
},
{
"$match" : {
"trips.startTime" : {"$lte" : ISODate("2015-10-31T23:59:59Z"), "$gte" : ISODate("2015-10-25T00:00:00Z")}
}
},
{
"$group" :
{
"_id" : {
"date" : {"$dayOfMonth" : "$trips.startTime"}
},
"distance" :{"$sum" : "$trips.distance"}
}
}
]);
while(result.hasNext())
{
print(tojson(result.next()));
}
Which when replaced by dynamic dates gives me correct values.
Now it leaves me with two options, either I modify the current group query or write a double group query. Double group query seems a more valid approach. My attempt at writing such a query.
{
"$group" :
{
"_id" : {
"week" : "$_id.date"
},
"max-distance" : {
"$max" : "$distance"
}
}
}
Adding these lines didn't make a difference. Clearly I am doing something wrong, but I don't know how to correct it. I would appreciate help with that.
Thanks

You seem to want the $week operator, but of course you need a valid Date as input in order to extract the "week" from that.
What you may not know is that you can instead use "date math" to round out the date to a "day", where the result is still a Date object. Then you can use the $week operator to obtain your $max values:
db.trips.aggregate([
{ "$unwind" : "$trips" },
{ "$match": {
"trips.startTime" : {
"$lte": ISODate("2015-10-31T23:59:59Z"),
"$gte": ISODate("2015-10-25T00:00:00Z")
}
}},
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$trips.startTime", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$trips.startTime", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"distance": { "$sum": "$trips.distance" }
}},
{ "$group": {
"_id": { "$week": "$_id" },
"max-distance": { "$max": "$distance" }
}}
]);
The basic trick in the first part is that when you $subtract one Date object from another, the result is the difference in milliseconds. So by subtracting the epoch date, the date is converted to its milliseconds equivalent, and then you can use the math to round that number to a day.
(1000 * 60 * 60 * 24) is the number of milliseconds in a day, so finding the modulo ( $mod ) of that returns the remainder of milliseconds past the start of the day, which you can subtract from the date value in the document to round to a day.
The same is true of $add when adding a Date object to a number, the result is a Date. So this handles the conversion, and then the $week can be extracted from there.
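As a rough check of the rounding described above, the same arithmetic can be run in plain JavaScript outside MongoDB. The helper name floorToDay is made up for this sketch, and weeklyStatsStage is only an assumed extension of the second $group stage that adds the standard $min and $avg accumulators next to $max, since the question also asked for the minimum and average.

```javascript
// Plain JavaScript mirror of the $subtract / $mod / $add expression.
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// floorToDay is a hypothetical helper name, not a MongoDB function.
function floorToDay(date) {
  const epoch = new Date(0);
  const millis = date - epoch;            // Date minus Date gives milliseconds
  const remainder = millis % MS_PER_DAY;  // milliseconds past midnight UTC
  return new Date(epoch.getTime() + (millis - remainder)); // number back to a Date
}

// Assumed variant of the final stage: same grouping key, two more accumulators.
const weeklyStatsStage = {
  $group: {
    _id: { $week: "$_id" },
    "max-distance": { $max: "$distance" },
    "min-distance": { $min: "$distance" },
    "avg-distance": { $avg: "$distance" }
  }
};

console.log(floorToDay(new Date("2015-10-27T13:45:10Z")).toISOString());
// prints 2015-10-27T00:00:00.000Z
```

Using weeklyStatsStage as the last stage of the pipeline should return all three figures per week, assuming the daily totals from the first $group are what you want to summarise.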

Related

MongoDB date math with aggregation variable

I'm trying to build an aggregation of things that haven't reported in by some interval (heartbeat) - I need to calculate a value based on a stored heartbeat:
db.things.aggregate([
{$project: {"lastmsg":1, "props.settings":1}},
{$unwind: "$props.settings"},
{$project: {
_id:0,
"lastmsg": "$lastmsg",
"heartbeat": {$multiply: [{$toInt: "$props.settings.heartbeat"},2000]},
"now": new Date(), "subtracted": new Date(new Date().getTime()- "$heartbeat")
}
}
])
Result returned is like this:
{ "lastmsg" : ISODate("2020-04-23T12:41:37.667Z"), "heartbeat" : 240000, "now" : ISODate("2020-05-14T16:26:11.824Z"), "subtracted" : ISODate("1970-01-01T00:00:00Z") }
{ "lastmsg" : ISODate("2020-05-14T16:24:24.228Z"), "heartbeat" : 240000, "now" : ISODate("2020-05-14T16:26:11.824Z"), "subtracted" : ISODate("1970-01-01T00:00:00Z") }
The "subtracted" projection is not doing the date math as expected. I can plug in a specific number and it works but this defeats the purpose...
As a last step I will match to see what of these things hasn't checked in within the interval of heartbeat:
{ $match: { "lastmsg": { $gte: "$subtracted" } } }
Any help would be greatly appreciated...
I don't know what your data looks like (you should post your data to help), but I think this can solve the problem.
You can use the $$NOW variable (available from MongoDB 4.2), which returns the current date as an ISODate.
Test data:
[
{
"lastmsg": ISODate("2020-04-23T12:41:37.667Z"),
"heartbeat": 240000
},
{
"lastmsg": ISODate("2020-05-14T16:24:24.228Z"),
"heartbeat": 240000
}
]
Query:
db.collection.aggregate([
{
$addFields: {
"now": "$$NOW",
"subtracted": {
$subtract: [
"$$NOW",
"$heartbeat"
]
}
}
},
{
// $expr is needed to compare two fields of the same document
$match: {
$expr: {
$gte: ["$lastmsg", "$subtracted"]
}
}
}
])
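To see why the original projection returned the epoch date, it may help to run the arithmetic in plain JavaScript. The expression in the question is evaluated by the client before the pipeline is sent, where "$heartbeat" is just a string. The sketch below uses the two sample documents from the question and a fixed timestamp as a stand-in for $$NOW; the variable names are illustrative only.

```javascript
// In the client, "$heartbeat" is an ordinary string, so the subtraction is not a number.
const clientSide = Date.now() - "$heartbeat";
console.log(Number.isNaN(clientSide)); // prints true

// Per-document arithmetic that $subtract: ["$$NOW", "$heartbeat"] performs on the server.
const now = new Date("2020-05-14T16:26:11.824Z"); // fixed stand-in for $$NOW
const docs = [
  { lastmsg: new Date("2020-04-23T12:41:37.667Z"), heartbeat: 240000 },
  { lastmsg: new Date("2020-05-14T16:24:24.228Z"), heartbeat: 240000 }
];

const recent = docs
  .map(d => ({ ...d, subtracted: new Date(now - d.heartbeat) })) // Date minus number of ms
  .filter(d => d.lastmsg >= d.subtracted);                       // same test as the $gte match

console.log(recent.length); // prints 1
```

Only the second document reported within its heartbeat window. Flipping the comparison to less-than would list the things that have not checked in.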

MongoDB find in one query records created in last week and since always

I'm trying to build a query that returns a total count of users and a count of users created in the last week.
There's a field called timeStamp that represents the date of creation.
I'm trying to do this with aggregation, I guess I should first group all users by timeStamp but then I don't know exactly what could I do to achieve this.
EDIT:
Sample user document:
{
"_id" : ObjectId("57be35d6fab7762415376b1b"),
"provider" : "local",
"isValidAccount" : true,
"isActive" : true,
"timeStamp" : ISODate("2016-08-25T00:03:34.533Z"),
"scope" : "getm-user",
"tkbSponsor" : "example@example.com",
"userId" : "example@example.com",
"passwd" : "$2a$14$WARJLD4RtYOApJvTNwQHluLvWpZzQzvUxudIln.j5aQJaxYsJtHEG",
"posFavorites" : [ ],
}
What I need is a count of ALL users and another count of all users created 7 days ago.
You first need to build a date range that matches users created last week. This means defining two variables that hold Date objects for the start and the end of that range; you will need them later in the pipeline.
You can then use a single $group stage that groups all the documents in the collection and calculates the total with $sum. The same stage can calculate a conditional sum over the date range by feeding the $cond ternary operator into $sum.
The following illustrates this approach:
var today = new Date();
var lastWeekStart = new Date(today.getFullYear(), today.getMonth(), today.getDate() - 7);
var lastWeekEnd = new Date(today.getFullYear(), today.getMonth(), today.getDate() - 7);
var start = new Date(lastWeekStart.setHours(0,0,0,0));
var end = new Date(lastWeekEnd.setHours(23,59,59,999));
db.collection.aggregate([
{
"$group": {
"_id": null,
"total": { "$sum": 1 },
"usersCreatedLastWeek": {
"$sum": {
"$cond": [
{
"$and": [
{ "$gte": [ "$timeStamp", start ] },
{ "$lte": [ "$timeStamp", end ] }
]
},
1,
0
]
}
}
}
}
])
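The conditional sum above can be mirrored in plain JavaScript to check the boundaries before running it against real data. This sketch assumes one reading of "last week", namely the seven full UTC days before today; the fixed today value and the sample timestamps (apart from the one taken from the question) are made up for illustration.

```javascript
// Fixed stand-in for new Date(), so the result is reproducible.
const today = new Date("2016-08-25T12:00:00Z");
const y = today.getUTCFullYear();
const m = today.getUTCMonth();
const d = today.getUTCDate();

// Assumed window: seven full UTC days ending yesterday.
const start = new Date(Date.UTC(y, m, d - 7));                 // 2016-08-18T00:00:00.000Z
const end = new Date(Date.UTC(y, m, d - 1, 23, 59, 59, 999));  // 2016-08-24T23:59:59.999Z

const users = [
  { timeStamp: new Date("2016-08-25T00:03:34.533Z") }, // from the question, created today
  { timeStamp: new Date("2016-08-20T10:00:00Z") },     // hypothetical, inside the window
  { timeStamp: new Date("2016-07-01T09:30:00Z") }      // hypothetical, older
];

// Same shape as the $group stage: one pass, an unconditional and a conditional sum.
const counts = users.reduce((acc, u) => {
  acc.total += 1;
  acc.usersCreatedLastWeek += (u.timeStamp >= start && u.timeStamp <= end) ? 1 : 0;
  return acc;
}, { total: 0, usersCreatedLastWeek: 0 });

console.log(counts); // prints { total: 3, usersCreatedLastWeek: 1 }
```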
Users created last week: timeStamps with week equal to last calendar week.
After declaring today's date, we can use aggregation stages in a pipeline like so:
Project to get time stamp's year and week, and also current year and week.
Project again to compare :
current year with time stamp's year.
current week(-1) with time stamp's week.
Match comparison fields with 0, as 0 means equal.
Lastly group to get total of such time stamps of last week.
Execute this on mongo shell :
var today = new Date();
db.yourCollectionName.aggregate([{
$project: {
"tsYear": {$year: "$timeStamp"},
"tsWeek": {$week: "$timeStamp"},
"todYear": {$year: today},
"todWeek": {$week: today}
}
}, {
$project: {
cmpWeek: {$cmp: ['$tsWeek', {$add: [-1, '$todWeek']}]},
cmpYear: {$cmp: ['$tsYear', '$todYear']}
}
}, {
$match: {
cmpWeek: 0,
cmpYear: 0
}
}, {
$group: {
_id: "UsersCreated",
totalLastWeek: {$sum: 1}
}
}
])

How do I do a 'group by' for a datetime when I want to group by just the date using $group? [duplicate]

I am working on a project in which I am tracking number of clicks on a topic.
I am using MongoDB and I have to group the number of clicks by date (I want to group data for 15 days).
My data is stored in the following format in MongoDB:
{
"_id" : ObjectId("4d663451d1e7242c4b68e000"),
"date" : "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)",
"topic" : "abc",
"time" : "18:51:22"
}
{
"_id" : ObjectId("4d6634514cb5cb2c4b69e000"),
"date" : "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)",
"topic" : "bce",
"time" : "18:51:23"
}
I want to group the number of clicks on topic:abc by day (for 15 days). I know how to group, but how can I group by the dates as they are stored in my database?
I am looking for a result in the following format:
[
{
"date" : "date in log",
"click" : 9
},
{
"date" : "date in log",
"click" : 19
},
]
I have written code, but it only works if the dates are strings (code is here http://pastebin.com/2wm1n1ix).
Please guide me on how to group it.
New answer using Mongo aggregation framework
After this question was asked and answered, 10gen released MongoDB version 2.2 with an aggregation framework, which is now the better way to do this sort of query. This query is a little challenging because you want to group by date and the values stored are timestamps, so you have to do something to convert the timestamps to dates that match. For the purposes of this example I will just write a query that gets the right counts.
db.col.aggregate(
{ $group: { _id: { $dayOfYear: "$date"},
click: { $sum: 1 } } }
)
This will return something like:
[
{
"_id" : 144,
"click" : 165
},
{
"_id" : 275,
"click" : 12
}
]
You need to use $match to limit the query to the date range you are interested in and $project to rename _id to date. How you convert the day of year back to a date is left as an exercise for the reader. :-)
10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking. There is also a specific article on date aggregation operators.
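The day-of-year conversion left as an exercise above could be sketched as follows in plain JavaScript. It assumes you also know the year, which the first query does not return, and dayOfYearToDate is a made-up helper name. $dayOfYear counts from 1, so day 1 is January 1.

```javascript
// Hypothetical helper: turn a year and a 1-based day of year into a UTC Date.
// Date.UTC accepts day values past the end of January and rolls them forward.
function dayOfYearToDate(year, dayOfYear) {
  return new Date(Date.UTC(year, 0, dayOfYear));
}

console.log(dayOfYearToDate(2013, 144).toISOString());
// prints 2013-05-24T00:00:00.000Z
```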
Getting a little fancier, you can use:
db.col.aggregate([
{ $group: {
_id: {
$add: [
{ $dayOfYear: "$date"},
{ $multiply:
[400, {$year: "$date"}]
}
]},
click: { $sum: 1 },
first: {$min: "$date"}
}
},
{ $sort: {_id: -1} },
{ $limit: 15 },
{ $project: { date: "$first", click: 1, _id: 0} }
])
which will get you the latest 15 days and return some datetime within each day in the date field. For example:
[
{
"click" : 431,
"date" : ISODate("2013-05-11T02:33:45.526Z")
},
{
"click" : 702,
"date" : ISODate("2013-05-08T02:11:00.503Z")
},
...
{
"click" : 814,
"date" : ISODate("2013-04-25T00:41:45.046Z")
}
]
There are already many answers to this question, but I wasn't happy with any of them. MongoDB has improved over the years, and there are now easier ways to do it. The answer by Jonas Tomanga gets it right, but is a bit too complex.
If you are using MongoDB 3.0 or later, here's how you can group by date. I start with the $match aggregation because the author also asked how to limit the results.
db.yourCollection.aggregate([
{ $match: { date: { $gte: ISODate("2019-05-01") } } },
{ $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$date"} }, count: { $sum: 1 } } },
{ $sort: { _id: 1} }
])
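For a quick feel of what the $dateToString grouping produces, here is a plain JavaScript sketch of the same idea. It relies on the fact that both toISOString() and $dateToString (without a timezone option) work in UTC; the sample click dates are invented.

```javascript
// Invented sample data: three clicks across two UTC days.
const clicks = [
  { date: new Date("2019-05-01T08:00:00Z") },
  { date: new Date("2019-05-01T23:59:59Z") },
  { date: new Date("2019-05-02T00:00:01Z") }
];

// Group by the "YYYY-MM-DD" prefix of the ISO string, like format "%Y-%m-%d".
const countsByDay = {};
for (const click of clicks) {
  const key = click.date.toISOString().slice(0, 10);
  countsByDay[key] = (countsByDay[key] || 0) + 1;
}

console.log(countsByDay); // prints { '2019-05-01': 2, '2019-05-02': 1 }
```

Because the keys are zero-padded strings, sorting them alphabetically also sorts them chronologically, which is why the $sort on _id in the pipeline works.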
To fetch data grouped by date in MongoDB:
db.getCollection('supportIssuesChat').aggregate([
{
$group : {
_id :{ $dateToString: { format: "%Y-%m-%d", date: "$createdAt"} },
list: { $push: "$$ROOT" },
count: { $sum: 1 }
}
}
])
Late answer, but for the record (for anyone else that comes to this page): You'll need to use the 'keyf' argument instead of 'key', since your key is actually going to be a function of the date on the event (i.e. the "day" extracted from the date) and not the date itself. This should do what you're looking for:
db.coll.group(
{
keyf: function(doc) {
var date = new Date(doc.date);
var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear()+'';
return {'day':dateKey};
},
cond: {topic:"abc"},
initial: {count:0},
reduce: function(obj, prev) {prev.count++;}
});
For more information, take a look at MongoDB's doc page on aggregation and group: http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
This can help
return new Promise(function(resolve, reject) {
db.doc.aggregate(
[
{ $match: {} },
{ $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }, count: { $sum: 1 } } },
{ $sort: { _id: 1 } }
]
).then(doc => {
/* if you need a date object */
doc.forEach(function(value, index) {
doc[index]._id = new Date(value._id);
}, this);
resolve(doc);
}).catch(reject);
});
Haven't worked that much with MongoDB yet, so I am not completely sure. But aren't you able to use full Javascript?
So you could parse your date with Javascript Date class, create your date for the day out of it and set as key into an "out" property. And always add one if the key already exists, otherwise create it new with value = 1 (first click). Below is your code with adapted reduce function (untested code!):
db.coll.group(
{
key:{'date':true},
initial: {retVal: {}},
reduce: function(doc, prev){
var date = new Date(doc.date);
var dateKey = date.getFullYear()+''+date.getMonth()+''+date.getDate();
(typeof prev.retVal[dateKey] != 'undefined') ? prev.retVal[dateKey] += 1 : prev.retVal[dateKey] = 1;
},
cond: {topic:"abc"}
}
)
Thanks to @mindthief, your answer helped solve my problem today. The function below makes grouping by day a little easier; I hope it helps others.
/**
* group by day
* @param query document {key1:123,key2:456}
*/
var count_by_day = function(query){
return db.action.group(
{
keyf: function(doc) {
var date = new Date(doc.time);
var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear();
return {'date': dateKey};
},
cond:query,
initial: {count:0},
reduce: function(obj, prev) {
prev.count++;
}
});
}
count_by_day({this:'is',the:'query'})
Another late answer, but still. So if you wanna do it in only one iteration and get the number of clicks grouped by date and topic you can use the following code:
db.coll.group(
{
$keyf : function(doc) {
return { "date" : doc.date.getDate()+"/"+doc.date.getMonth()+"/"+doc.date.getFullYear(),
"topic": doc.topic };
},
initial: {count:0},
reduce: function(obj, prev) { prev.count++; }
})
Also, if you would like to optimize the query as suggested, you can use an integer value for the date key instead of the String (hint: use valueOf()), though in my examples the speed was the same.
Furthermore, it's always wise to check the MongoDB docs regularly, because they keep adding new features all the time. For example, with the new Aggregation framework, which will be released in version 2.2, you can achieve the same results much more easily: http://docs.mongodb.org/manual/applications/aggregation/
If you want a Date object returned directly
Then instead of applying the Date Aggregation Operators, apply "Date Math" to round the date object. This is often desirable, since all drivers represent a BSON Date in the form commonly used for date manipulation in their language:
db.datetest.aggregate([
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"click": { "$sum": 1 }
}}
])
Or if as is implied in the question that the grouping interval required is "buckets" of 15 days, then simply apply that to the numeric value in $mod:
db.datetest.aggregate([
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24 * 15
]}
]},
new Date(0)
]
},
"click": { "$sum": 1 }
}}
])
The basic math applied is that when you $subtract two Date objects, the result is the difference in milliseconds as a number. So epoch is represented by Date(0) as the base for conversion in whatever language constructor you have.
With a numeric value, the "modulo" ( $mod ) is applied to round the date ( subtract the remainder from the division ) to the required interval. Being either:
1000 milliseconds x 60 seconds x 60 minutes x 24 hours = 1 day
Or
1000 milliseconds x 60 seconds x 60 minutes x 24 hours x 15 days = 15 days
So it's flexible to whatever interval you require.
By the same token, an $add operation between a "numeric" value and a Date object returns a Date object equivalent to the milliseconds value of both combined ( epoch is 0, therefore 0 plus the difference is the converted date ).
Easily represented and reproducible in the following listing:
var now = new Date();
var bulk = db.datetest.initializeOrderedBulkOp();
for ( var x = 0; x < 60; x++ ) {
bulk.insert({ "date": new Date( now.valueOf() + ( 1000 * 60 * 60 * 24 * x ))});
}
bulk.execute();
And running the second example with 15 day intervals:
{ "_id" : ISODate("2016-04-14T00:00:00Z"), "click" : 12 }
{ "_id" : ISODate("2016-03-30T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-03-15T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-29T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-14T00:00:00Z"), "click" : 3 }
Or similar distribution depending on the current date when the listing is run, and of course the 15 day intervals will be consistent since the epoch date.
Using the "Math" method is a bit easier to tune, especially if you want to adjust time periods for different timezones in aggregation output where you can similarly numerically adjust by adding/subtracting the numeric difference from UTC.
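The interval rounding and the timezone adjustment mentioned above can be tried out in plain JavaScript. In this sketch floorToInterval and its offsetMs parameter are hypothetical names, and the offset handling is one assumed way to do the adjustment: shift into local time, round, then shift back.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical helper: round a date down to a multiple of intervalMs since the epoch.
// offsetMs is the local offset from UTC in milliseconds (negative west of Greenwich).
function floorToInterval(date, intervalMs, offsetMs = 0) {
  const shifted = date.getTime() + offsetMs;        // move into "local" milliseconds
  const rounded = shifted - (shifted % intervalMs); // drop the remainder, like $mod
  return new Date(rounded - offsetMs);              // move back to UTC
}

// 15-day buckets counted from the epoch, matching the sample output above.
console.log(floorToInterval(new Date("2016-04-20T08:30:00Z"), 15 * MS_PER_DAY).toISOString());
// prints 2016-04-14T00:00:00.000Z

// Start of the local day for a UTC-5 offset.
console.log(floorToInterval(new Date("2016-04-20T02:00:00Z"), MS_PER_DAY, -5 * 60 * 60 * 1000).toISOString());
// prints 2016-04-19T05:00:00.000Z
```

A fixed offset like this ignores daylight saving changes, so treat it as an approximation for zones that observe them.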
Of course, that is a good solution. Aside from that, you can group dates by day as strings (as that answer proposes), or you can get the beginning of each day by projecting the date field (in aggregation) like this:
{'$project': {
'start_of_day': {'$subtract': [
'$date',
{'$add': [
{'$multiply': [{'$hour': '$date'}, 3600000]},
{'$multiply': [{'$minute': '$date'}, 60000]},
{'$multiply': [{'$second': '$date'}, 1000]},
{'$millisecond': '$date'}
]}
]},
}}
It gives you this:
{
"start_of_day" : ISODate("2015-12-03T00:00:00.000Z")
},
{
"start_of_day" : ISODate("2015-12-04T00:00:00.000Z")
}
This has some advantages: you can work with your days as a date type (not a number or string), which lets you use all of the date aggregation operators in later stages and gives you a date type in the output.

Need to aggregate by hour and $avg not recognized

From a MongoDB collection storing data with time stamps I need to return a single record for each hour.
So far I have selected the set of records between two dates successfully, but I can't figure out how to build the hourly record I need in the $group clause.
var myName = "CollectionName"
//schema for mongoose
var mySchema = new Schema({
dt: Date,
value: Number
});
var myDB = mongoose.createConnection('mongodb://localhost:27017/MYDB');
myDBObj = myDB.model(myName, mySchema, myName);
The match in this aggregate call works fine, and the $hour creates a record for each hour in the day, but I don't know how to recreate a full date, and I get the error "unknown group operator $avg"...
myDBObj.aggregate([
{
$match: { "dt": { $gt: new Date("October 13, 2010 12:00:00"), $lt: new Date("November 13, 2010 12:00:00") } }
},{
$group: {
"_id": { "dt": { "$hour": "$dt" } , "price": { "$avg": "$price" }}
}], function (err, data) { if (err) { return next(err); } res.json(data); });
I think I need to use $dayOfYear so there is different records for each hour of each day, and include a new Date() somewhere ...
Can someone help me do this correctly? any help is appreciated.
The $group pipeline stage works by "grouping" all data by the "key" specified for _id. Other fields you are actually aggregating are separate from the _id value and are their own field properties.
So your $group becomes this instead:
{ "$group": {
"_id": { "$hour": "$dt" },
"price": { "$avg": "$price" }
}}
Or if you want that broken by day then make a compound key:
{ "$group": {
"_id": {
"day": { "$dayOfYear": "$dt" },
"hour": { "$hour": "$dt" }
},
"price": { "$avg": "$price" }
}}
Or just use date math to produce Date objects rounded by hour:
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$dt", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$dt", new Date(0) ] },
1000 * 60 * 60
]}
]},
new Date(0)
]
},
"price": { "$avg": "$price" }
}}
Subtracting one date object (the epoch date) from another produces a numeric value that you can round with the applied math ( 1000 milliseconds x 60 seconds x 60 minutes = 1 hour ), and adding a number to a date object produces a date corresponding to that value.
So your problem was you had everything in the _id, where the $avg accumulator is not recognised. All accumulators need to be specified outside of the grouping key. That is the intent.
If you want to make an accumulator value part of a grouping key ( does not seem relevant here though ), you instead follow with another group stage, referencing the field that was produced from the former.
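The split between the grouping key and the accumulator can be illustrated in plain JavaScript. This sketch rounds each timestamp down to the hour for the key and keeps the average as a separate value, which is the same separation $group expects; the readings are invented.

```javascript
const MS_PER_HOUR = 1000 * 60 * 60;

// Invented sample readings.
const readings = [
  { dt: new Date("2010-10-13T12:05:00Z"), price: 10 },
  { dt: new Date("2010-10-13T12:45:00Z"), price: 20 },
  { dt: new Date("2010-10-13T13:10:00Z"), price: 30 }
];

// Key: timestamp rounded down to the hour. Value: running sum and count.
const buckets = new Map();
for (const { dt, price } of readings) {
  const key = dt.getTime() - (dt.getTime() % MS_PER_HOUR);
  const bucket = buckets.get(key) || { sum: 0, count: 0 };
  bucket.sum += price;
  bucket.count += 1;
  buckets.set(key, bucket);
}

// The key becomes _id; the average sits next to it, not inside it.
const hourly = [...buckets].map(([key, b]) => ({ _id: new Date(key), price: b.sum / b.count }));

console.log(hourly.map(h => h._id.toISOString() + " " + h.price));
// prints [ '2010-10-13T12:00:00.000Z 15', '2010-10-13T13:00:00.000Z 30' ]
```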

MongoDB - Query all documents createdAt within last hours, and group by minute?

From reading various articles out there, I believe this should be possible, but I'm not sure where exactly to start.
This is what I'm trying to do:
I want to run a query that finds all documents with createdAt within the last hour and groups them by minute. Since each document has a tweets value, like 5, 6, or 19, it should add them up for each of those minutes and provide a sum.
Here's a sample of the collection:
{
"createdAt": { "$date": 1385064947832 },
"updatedAt": null,
"tweets": 47,
"id": "06E72EBD-D6F4-42B6-B79B-DB700CCD4E3F",
"_id": "06E72EBD-D6F4-42B6-B79B-DB700CCD4E3F"
}
Is this possible to do in mongodb?
@zero323 - I first tried just grouping the last hour like so:
db.tweetdatas.group( {
key: { tweets: 1, 'createdAt': 1 },
cond: { createdAt: { $gt: new Date("2013-11-20T19:44:58.435Z"), $lt: new Date("2013-11-20T20:44:58.435Z") } },
reduce: function ( curr, result ) { },
initial: { }
} )
But that just returns all the tweets within the timeframe, which technically is what I want, but now I want to group them all by each minute, and add up the sum of tweets for each minute.
@almypal
Here is the query that I'm using, based off your suggestion:
db.tweetdatas.aggregate(
{$match:{ "createdAt":{$gt: "2013-11-22T14:59:18.748Z"}, }},
{$project: { "createdAt":1, "createdAt_Minutes": { $minute : "$createdAt" }, "tweets":1, }},
{$group:{ "_id":"$createdAt_Minutes", "sum_tweets":{$sum:"$tweets"} }}
)
However, it's displaying this response:
{ "result" : [ ], "ok" : 1 }
Update: The response from @almypal is working. Apparently, putting in the date as a string like I have in the above example does not work. I'm running this query from Node, but in the shell I thought it would be easier to convert the date variable to a string and use that.
Use aggregation as below:
var lastHour = new Date();
lastHour.setHours(lastHour.getHours()-1);
db.tweetdatas.aggregate(
{$match:{ "createdAt":{$gt: lastHour}, }},
{$project: { "createdAt":1, "createdAt_Minutes": { $minute : "$createdAt" }, "tweets":1, }},
{$group:{ "_id":"$createdAt_Minutes", "sum_tweets":{$sum:"$tweets"} }}
)
and the result would be like this
{
"result" : [
{
"_id" : 1,
"sum_tweets" : 117
},
{
"_id" : 2,
"sum_tweets" : 40
},
{
"_id" : 3,
"sum_tweets" : 73
}
],
"ok" : 1
}
where _id corresponds to the specific minute and sum_tweets is the total number of tweets in that minute.
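Note that $minute only yields a value from 0 to 59, so a window longer than an hour would merge the same minute from different hours. As a rough sketch of an alternative, the per-minute sums can be keyed on the timestamp rounded down to the minute. The plain JavaScript below uses the createdAt value from the sample document plus two invented ones.

```javascript
const MS_PER_MINUTE = 1000 * 60;

const tweetdatas = [
  { createdAt: new Date(1385064947832), tweets: 47 }, // from the sample document
  { createdAt: new Date(1385064950000), tweets: 5 },  // invented, same minute
  { createdAt: new Date(1385065010000), tweets: 19 }  // invented, one minute later
];

// Sum tweets per full minute, keyed by the minute's start in milliseconds.
const sums = new Map();
for (const { createdAt, tweets } of tweetdatas) {
  const key = createdAt.getTime() - (createdAt.getTime() % MS_PER_MINUTE);
  sums.set(key, (sums.get(key) || 0) + tweets);
}

const perMinute = [...sums].map(([key, total]) => ({ _id: new Date(key), sum_tweets: total }));

console.log(perMinute.map(p => p._id.toISOString() + " " + p.sum_tweets));
// prints [ '2013-11-21T20:15:00.000Z 52', '2013-11-21T20:16:00.000Z 19' ]
```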