I have db schema that has string date format("date":"2020-09-01 16:07:45").
I need to search between given date range, I know this is possible if we're using ISO date format but I'm not sure if we can query with date format being string.
I have tried the following query, it doesn't seem to show accurate results.
db.daily_report_latest.find({"date":{$gte: "2021-01-01 00:00:00", $lte:"2021-03-01 00:00:00"}},{"date":1})
Is there any alternative to this? Appreciate your help.
You're right, you can't query a date field with a string, but you can just cast it to date type like so:
Mongo Shell:
db.daily_report_latest.find({
"date": {$gte: ISODate("2021-01-01T00:00:00Z"), $lte: ISODate("2021-03-01T00:00:00Z")}
}, {"date": 1});
For nodejs:
db.daily_report_latest.find({
"date": {$gte: new Date("2021-01-01 00:00:00"), $lte: new Date("2021-03-01 00:00:00")}
}, {"date": 1});
For any other language just check what the mongo driver date type is and do the same.
Note that the mongo shell isn't able to parse the string input in the format you provided, you should read here about the supported formats and transform your string pre-query like I did.
Another thing to consider for the nodejs usecase is timezones, the string will be parsed as the machine current timezone so again you need to adjust to that.
You can use $dateFromString feature of aggregation. (Documentation)
pipeline = []
pipeline.append({"$project": {document: "$$ROOT", "new_date" : { "$dateFromString": { "dateString": '$date', "timezone": 'America/New_York' }}}})
pipeline.append({"$match":{"new_date": {"$gte": ISODate("2021-01-01 00:00:00"), "$lte":ISODate("2021-03-01 00:00:00")}}})
data = db.daily_report_latest.aggregate(pipeline=pipeline)
So in the both the solutions, first typecast the date field in DB to date and then compare it with your input date range.
SOLUTION #1: For MongoDB Version >= 4.0 using $toDate.
db.daily_report_latest.find(
{
$expr: {
$and: [
{ $gte: [{ $toDate: "$date" }, new Date("2021-01-01 00:00:00")] },
{ $lte: [{ $toDate: "$date" }, new Date("2021-03-01 00:00:00")] }
]
}
},
{ "date": 1 }
)
SOLUTION #2: For MongoDb version >= 3.6 using $dateFromString.
db.daily_report_latest.find(
{
$expr: {
$and: [
{ $gte: [{ $dateFromString: { dateString: "$date" }}, new Date("2021-01-01 00:00:00")] },
{ $lte: [{ $dateFromString: { dateString: "$date" }}, new Date("2021-03-01 00:00:00")] }
]
}
},
{ "date": 1 }
)
I am working on a project in which I am tracking number of clicks on a topic.
I am using mongodb and I have to group number of click by date( i want to group data for 15 days).
I am having data store in following format in mongodb
{
"_id" : ObjectId("4d663451d1e7242c4b68e000"),
"date" : "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)",
"topic" : "abc",
"time" : "18:51:22"
}
{
"_id" : ObjectId("4d6634514cb5cb2c4b69e000"),
"date" : "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)",
"topic" : "bce",
"time" : "18:51:23"
}
i want to group number of clicks on topic:abc by days(for 15 days)..i know how to group that but how can I group by date which are stored in my database
I am looking for result in following format
[
{
"date" : "date in log",
"click" : 9
},
{
"date" : "date in log",
"click" : 19
},
]
I have written code but it will work only if date are in string (code is here http://pastebin.com/2wm1n1ix)
...please guide me how do I group it
New answer using Mongo aggregation framework
After this question was asked and answered, 10gen released Mongodb version 2.2 with an aggregation framework, which is now the better way to do this sort of query. This query is a little challenging because you want to group by date and the values stored are timestamps, so you have to do something to convert the timestamps to dates that match. For the purposes of example I will just write a query that gets the right counts.
db.col.aggregate(
{ $group: { _id: { $dayOfYear: "$date"},
click: { $sum: 1 } } }
)
This will return something like:
[
{
"_id" : 144,
"click" : 165
},
{
"_id" : 275,
"click" : 12
}
]
You need to use $match to limit the query to the date range you are interested in and $project to rename _id to date. How you convert the day of year back to a date is left as an exercise for the reader. :-)
10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking. There is also a specific article on date aggregation operators.
Getting a little fancier, you can use:
db.col.aggregate([
{ $group: {
_id: {
$add: [
{ $dayOfYear: "$date"},
{ $multiply:
[400, {$year: "$date"}]
}
]},
click: { $sum: 1 },
first: {$min: "$date"}
}
},
{ $sort: {_id: -1} },
{ $limit: 15 },
{ $project: { date: "$first", click: 1, _id: 0} }
])
which will get you the latest 15 days and return some datetime within each day in the date field. For example:
[
{
"click" : 431,
"date" : ISODate("2013-05-11T02:33:45.526Z")
},
{
"click" : 702,
"date" : ISODate("2013-05-08T02:11:00.503Z")
},
...
{
"click" : 814,
"date" : ISODate("2013-04-25T00:41:45.046Z")
}
]
There are already many answers to this question, but I wasn't happy with any of them. MongoDB has improved over the years, and there are now easier ways to do it. The answer by Jonas Tomanga gets it right, but is a bit too complex.
If you are using MongoDB 3.0 or later, here's how you can group by date. I start with the $match aggregation because the author also asked how to limit the results.
db.yourCollection.aggregate([
{ $match: { date: { $gte: ISODate("2019-05-01") } } },
{ $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$date"} }, count: { $sum: 1 } } },
{ $sort: { _id: 1} }
])
To fetch data group by date in mongodb
db.getCollection('supportIssuesChat').aggregate([
{
$group : {
_id :{ $dateToString: { format: "%Y-%m-%d", date: "$createdAt"} },
list: { $push: "$$ROOT" },
count: { $sum: 1 }
}
}
])
Late answer, but for the record (for anyone else that comes to this page): You'll need to use the 'keyf' argument instead of 'key', since your key is actually going to be a function of the date on the event (i.e. the "day" extracted from the date) and not the date itself. This should do what you're looking for:
db.coll.group(
{
keyf: function(doc) {
var date = new Date(doc.date);
var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear()+'';
return {'day':dateKey};
},
cond: {topic:"abc"},
initial: {count:0},
reduce: function(obj, prev) {prev.count++;}
});
For more information, take a look at MongoDB's doc page on aggregation and group: http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
This can help
return new Promise(function(resolve, reject) {
db.doc.aggregate(
[
{ $match: {} },
{ $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }, count: { $sum: 1 } } },
{ $sort: { _id: 1 } }
]
).then(doc => {
/* if you need a date object */
doc.forEach(function(value, index) {
doc[index]._id = new Date(value._id);
}, this);
resolve(doc);
}).catch(reject);
}
Haven't worked that much with MongoDB yet, so I am not completely sure. But aren't you able to use full Javascript?
So you could parse your date with Javascript Date class, create your date for the day out of it and set as key into an "out" property. And always add one if the key already exists, otherwise create it new with value = 1 (first click). Below is your code with adapted reduce function (untested code!):
db.coll.group(
{
key:{'date':true},
initial: {retVal: {}},
reduce: function(doc, prev){
var date = new Date(doc.date);
var dateKey = date.getFullYear()+''+date.getMonth()+''+date.getDate();
(typeof prev.retVal[dateKey] != 'undefined') ? prev.retVal[dateKey] += 1 : prev.retVal[dateKey] = 1;
},
cond: {topic:"abc"}
}
)
thanks for #mindthief, your answer help solve my problem today. The function below can group by day a little more easier, hope can help the others.
/**
* group by day
* #param query document {key1:123,key2:456}
*/
var count_by_day = function(query){
return db.action.group(
{
keyf: function(doc) {
var date = new Date(doc.time);
var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear();
return {'date': dateKey};
},
cond:query,
initial: {count:0},
reduce: function(obj, prev) {
prev.count++;
}
});
}
count_by_day({this:'is',the:'query'})
Another late answer, but still. So if you wanna do it in only one iteration and get the number of clicks grouped by date and topic you can use the following code:
db.coll.group(
{
$keyf : function(doc) {
return { "date" : doc.date.getDate()+"/"+doc.date.getMonth()+"/"+doc.date.getFullYear(),
"topic": doc.topic };
},
initial: {count:0},
reduce: function(obj, prev) { prev.count++; }
})
Also If you would like to optimize the query as suggested you can use an integer value for date (hint: use valueOf(), for the key date instead of the String, though for my examples the speed was the same.
Furthermore it's always wise to check the MongoDB docs regularly, because they keep adding new features all the time. For example with the new Aggregation framework, which will be released in the 2.2 version you can achieve the same results much easier http://docs.mongodb.org/manual/applications/aggregation/
If You want a Date oject returned directly
Then instead of applying the Date Aggregation Operators, instead apply "Date Math" to round the date object. This can often be desirable as all drivers represent a BSON Date in a form that is commonly used for Date manipulation for all languages where that is possible:
db.datetest.aggregate([
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"click": { "$sum": 1 }
}}
])
Or if as is implied in the question that the grouping interval required is "buckets" of 15 days, then simply apply that to the numeric value in $mod:
db.datetest.aggregate([
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24 * 15
]}
]},
new Date(0)
]
},
"click": { "$sum": 1 }
}}
])
The basic math applied is that when you $subtract two Date objects the result returned will be the milliseconds of differnce numerically. So epoch is represented by Date(0) as the base for conversion in whatever language constructor you have.
With a numeric value, the "modulo" ( $mod ) is applied to round the date ( subtract the remainder from the division ) to the required interval. Being either:
1000 milliseconds x 60 seconds * 60 minutes * 24 hours = 1 day
Or
1000 milliseconds x 60 seconds * 60 minutes * 24 hours * 15 days = 15 days
So it's flexible to whatever interval you require.
By the same token from above an $add operation between a "numeric" value and a Date object will return a Date object equivalent to the millseconds value of both objects combined ( epoch is 0, therefore 0 plus difference is the converted date ).
Easily represented and reproducible in the following listing:
var now = new Date();
var bulk = db.datetest.initializeOrderedBulkOp();
for ( var x = 0; x < 60; x++ ) {
bulk.insert({ "date": new Date( now.valueOf() + ( 1000 * 60 * 60 * 24 * x ))});
}
bulk.execute();
And running the second example with 15 day intervals:
{ "_id" : ISODate("2016-04-14T00:00:00Z"), "click" : 12 }
{ "_id" : ISODate("2016-03-30T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-03-15T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-29T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-14T00:00:00Z"), "click" : 3 }
Or similar distribution depending on the current date when the listing is run, and of course the 15 day intervals will be consistent since the epoch date.
Using the "Math" method is a bit easier to tune, especially if you want to adjust time periods for different timezones in aggregation output where you can similarly numerically adjust by adding/subtracting the numeric difference from UTC.
Of course, that is a good solution. Aside from that you can group dates by days as strings (as that answer propose) or you can get the beginning of dates by projecting date field (in aggregation) like that:
{'$project': {
'start_of_day': {'$subtract': [
'$date',
{'$add': [
{'$multiply': [{'$hour': '$date'}, 3600000]},
{'$multiply': [{'$minute': '$date'}, 60000]},
{'$multiply': [{'$second': '$date'}, 1000]},
{'$millisecond': '$date'}
]}
]},
}}
It gives you this:
{
"start_of_day" : ISODate("2015-12-03T00:00:00.000Z")
},
{
"start_of_day" : ISODate("2015-12-04T00:00:00.000Z")
}
It has some pluses: you can manipulate with your days in date type (not number or string), it allows you to use all of the date aggregation operators in following aggregation operations and gives you date type on the output.
I have a MongoDB Analytics-style collection. It contains documents with a timestamp field and various data. Now I want to get a time series with the number of documents for a time period with a granularity parameter.
I'm currently using the aggregation framework like this (assuming that the granularity is DAY) :
db.collection.aggregate([{
$match: {
timestamp: {
$gte: start_time,
$lt: end_time
}
}
}, {
$group: {
_id: {
year: { $year: '$timestamp' },
month: { $month: '$timestamp' },
day: { $dayOfMonth: '$timestamp' }
},
count: { $sum: 1 }
}
}, {
$sort: {
_id: 1
}
}])
This way I have a count value for every day.
The problem is that the counts will depend on the timezone used when computing the $dayOfMonth part (each count is from 00:00:000 UTC to 23:59:999 UTC).
I would like to be able to achieve this without being dependant on the timezone, but relying on the start_time.
For example, if I use a start_time at 07:00 UTC, I will get counts for every day at 07:00 UTC to the next day at 07:00 UTC.
TL;DR : I want something like this : https://dev.twitter.com/ads/reference/get/stats/accounts/%3Aaccount_id/campaigns
Any idea on how to perform this ?
I found a solution that works pretty good. It's not very natural but anyway.
The idea is to compute a "normalized" date based on the startDate and the date of the row. I use the $mod operator on the startDate to get the milliseconds + seconds + hours (in the case of a DAY granularity), and then I use $subtract to subtract it from the date of the row.
Here is an example for a DAY granularity :
var startDate = ISODate("2015-08-25 13:30:00.000Z")
var endDate = ISODate("2015-08-27 13:30:00.000Z")
db.collection.aggregate([{
$match: {
timestamp: {
$gte: startDate,
$lt: endDate
}
}, {
$project: {
timestamp_normalized: {
$subtract: [
"$timestamp",
{
$mod: [
{ "$subtract": [ startDate, new Date("1970-01-01") ] },
1000 * 60 * 60 * 24
]
}
]
}
}
}, {
// now $group with $dayOfMonth
}])
The $mod part computes the hours + seconds + milliseconds of the startDate after 00:00 UTC, in milliseconds.
The $subtract retrieves these milliseconds from the original timestamp.
Now I can use $dayOfMonth operator on my normalized_timestamp field to get the day if we consider intervals from 13:30 to 13:30 the next day, and use $group to get count values for these intervals.
EDIT: It's even easier to compute the value to remove from the timestamp for normalization before creating the query, using :
(startDate - new Date(0)) % (1000 * 60 * 60 * 24)
(for a DAY granularity)
Then subtract directly this value from timestamp instead of using $mod.
I was searching for this one but I couldn't find anything useful to solve my case. What I want is to get the unix timestamp in seconds out of MongoDB ISODate during aggregation. The problem is that I can get the timestamp out of ISODate but it's in milliseconds. So I would need to cut out those milliseconds. What I've tried is:
> db.data.aggregate([
{$match: {dt:2}},
{$project: {timestamp: {$concat: [{$substr: ["$md", 0, -1]}, '01', {$substr: ["$id", 0, -1]}]}}}
])
As you can see I'm trying to get the timestamp out of 'md' var and also concatenate this timestamp with '01' and the 'id' number. The above code gives:
{
"_id" : ObjectId("52f8fc693890fc270d8b456b"),
"timestamp" : "2014-02-10T16:20:56011141"
}
Then I improved the command with:
> db.data.aggregate([
{$match: {dt:2}},
{$project: {timestamp: {$concat: [{$substr: [{$subtract: ["$md", new Date('1970-01-01')]}, 0, -1]}, '01', {$substr: ["$id", 0, -1]}]}}}
])
Now I get:
{
"_id" : ObjectId("52f8fc693890fc270d8b456b"),
"timestamp" : "1392049256000011141"
}
What I really need is 1392049256011141 so without the 3 extra 000. I tried with $subtract:
> db.data.aggregate([
{$match: {dt:2}},
{$project: {timestamp: {$concat: [{$substr: [{$divide: [{$subtract: ["$md", new Date('1970-01-01')]}, 1000]}, 0, -1]}, '01', {$substr: ["$id", 0, -1]}]}}}
])
What I get is:
{
"_id" : ObjectId("52f8fc693890fc270d8b456b"),
"timestamp" : "1.39205e+009011141"
}
Not exactly what I would expect from the command. Unfortunately the $substr operator doesn't allow negative length. Does anyone have any other solution?
I'm not sure why you think you need the value in seconds rather than milliseconds as generally both forms are valid and within most language implementations the milliseconds is actually preferred. But generally speaking, trying to coerce this into a string is the wrong way to go around this, and generally you just do the math:
db.data.aggregate([
{ "$project": {
"timestamp": {
"$subtract": [
{ "$divide": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
1000
]},
{ "$mod": [
{ "$divide": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
1000
]},
1
]}
]
}
}}
])
Which returns you an epoch timestamp in seconds. Basically derived from when one BSON date object is subtracted from another one then the result is the time interval in milliseconds. Using the initial epoch date of "1970-01-01" results in essentially extracting the milliseconds value from the current date value. The $divide operator essentially takes off the milliseconds portion and the $mod does the modulo to implement rounding.
Really though you are better off doing the work in the native language for your application as all BSON dates will be returned there as a native "date/datetime" type where you can extract the timestamp value. Consider the JavaScript basics in the shell:
var date = new Date()
( date.valueOf() / 1000 ) - ( ( date.valueOf() / 1000 ) % 1 )
Typically with aggregation you want to do this sort of "math" to a timestamp value for use in something like aggregating values within a time period such as a day. There are date operators available to the aggregation framework, but you can also do it the date math way:
db.data.aggregate([
{ "$group": {
"_id": {
"$subtract": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
{ "$mod": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
1000 * 60 * 60 * 24
]}
]
},
"count": { "$sum": 1 }
}}
])
That form would be more typical to emit a timestamp rounded to a day, and aggregate the results within those intervals.
So your purposing of the aggregation framework just to extract a timestamp does not seem to be the best usage or indeed it should not be necessary to convert this to seconds rather than milliseconds. In your application code is where I think you should be doing that unless of course you actually want results for intervals of time where you can apply the date math as shown.
The methods are there, but unless you are actually aggregating then this would be the worst performance option for your application. Do the conversion in code instead.