MongoDB weekOfMonth? - mongodb

I am stuck trying to get the week of the month instead of the week of the year. Can somebody guide me on how to get this right?
db.activity.aggregate([
{
$group:{
_id: {
week: { $week: "$createdAt" },
month: { $month: "$createdAt" },
year: { $year: "$createdAt" }
},
count: { $sum: 1 }
}
},
{ $match : { "_id.year" : 2016, "_id.month" : 5 } }
])
Output
/* 1 */
{
"_id" : {
"week" : 19,
"month" : 5,
"year" : 2016
},
"count" : 133.0
}
/* 2 */
{
"_id" : {
"week" : 18,
"month" : 5,
"year" : 2016
},
"count" : 1.0
}
The output above shows the week of the year, not the week of the month. How can I get the week of the month, given that week 18 is the first week of the month?

The $week operator gives you the week of year as described in the docs.
The week of month can be calculated by taking the day of month, subtracting 1, dividing by 7, and rounding down, so that days 1 to 7 fall in week 0, days 8 to 14 in week 1, and so on.
db.activity.aggregate([
{$project: {
"year": {$year: "$createdAt"},
"month": {$month: "$createdAt"},
"weekOfMonth": {$floor: {$divide: [{$subtract: [{$dayOfMonth: "$createdAt"}, 1]}, 7]}}
}},
{$group: {
"_id": {"year": "$year", "month": "$month", "weekOfMonth": "$weekOfMonth"},
count: { $sum: 1 }
}},
{$match : { "_id.year" : 2016, "_id.month" : 5}}
])
Note that the week of month here is 0-based. If you want it to start at 1, just $add 1. Also, the $floor operator is new in version 3.2.
Edit
You can simulate the floor using $mod (which exists in version 3.0). Subtracting 1 from the day first keeps days 1 to 7 together in week 0:
"weekOfMonth": {$subtract: [{$divide: [{$subtract: [{$dayOfMonth: "$createdAt"}, 1]}, 7]}, {$mod: [{$divide: [{$subtract: [{$dayOfMonth: "$createdAt"}, 1]}, 7]}, 1]}]}
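The arithmetic can be checked outside MongoDB with a minimal plain JavaScript sketch. The helper names weekOfMonth and floorViaMod are illustrative, and the sketch assumes the convention that days 1 to 7 form week 0. The second helper shows that the same floor can be obtained with only subtraction and modulo, which matters on servers that lack $floor.

```javascript
// Floor for non-negative numbers using only subtraction and modulo.
function floorViaMod(x) {
  return x - (x % 1);
}

// Zero-based week of month, assuming days 1-7 are week 0 (an assumed convention).
// Pass oneBased = true to start counting at 1 instead.
function weekOfMonth(dayOfMonth, oneBased = false) {
  const week = Math.floor((dayOfMonth - 1) / 7);
  return oneBased ? week + 1 : week;
}

// Print the zero-based week for every possible day of month.
for (let day = 1; day <= 31; day++) {
  console.log(day, weekOfMonth(day));
}
```

Both helpers agree for every day of the month, so either form can be used to derive the bucket.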

Related

How to use $dayOfYear aggregation with epoch timestamps [duplicate]

I am trying to aggregate records in a MongoDB collection by hour and need to convert date stored as timestamp (milliseconds) to ISODate so that I can use aggregate framework's built-in date operators ($hour, $month, etc.)
Records are stored as
{
"data" : { "UserId" : "abc", "ProjId" : "xyz"},
"time" : NumberLong("1395140780706"),
"_id" : ObjectId("532828ac338ed9c33aa8eca7")
}
I am trying to use an aggregate query of following type:
db.events.aggregate(
{
$match : {
"time" : { $gte : 1395186209804, $lte : 1395192902825 }
}
},
{
$project : {
_id : "$_id",
dt : {$concat : (Date("$time")).toString()} // need to project as ISODate
}
},
// process records further in $project or $group clause
)
which produces results of the form:
{
"result" : [
{
"_id" : ObjectId("5328da21fd207d9c3567d3ec"),
"dt" : "Fri Mar 21 2014 17:35:46 GMT-0400 (EDT)"
},
{
"_id" : ObjectId("5328da21fd207d9c3567d3ed"),
"dt" : "Fri Mar 21 2014 17:35:46 GMT-0400 (EDT)"
},
...
}
I want to extract hour, day, month, and year from the date but since time is projected forward as string I am unable to use aggregate framework's built-in date operators ($hour, etc.).
How can I convert time from milliseconds to an ISO date to do something like the following:
db.events.aggregate(
{
$match : {
"time" : { $gte : 1395186209804, $lte : 1395192902825 }
}
},
{
$project : {
_id : "$_id",
dt : <ISO date from "$time">
}
},
{
$project : {
_id : "$_id",
date : {
hour : {$hour : "$dt"}
}
}
}
)
Actually, it is possible. The trick is to add your milliseconds time to a zero-milliseconds Date() object using syntax similar to:
dt : {$add: [new Date(0), "$time"]}
I modified your aggregation from above to produce the result:
db.events.aggregate(
{
$project : {
_id : "$_id",
dt : {$add: [new Date(0), "$time"]}
}
},
{
$project : {
_id : "$_id",
date : {
hour : {$hour : "$dt"}
}
}
}
);
The result is (with one entry of your sample data):
{
"result": [
{
"_id": ObjectId("532828ac338ed9c33aa8eca7"),
"date": {
"hour": 11
}
}
],
"ok": 1
}
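The same conversion can be reproduced in plain JavaScript as a sanity check. This is only a sketch of the arithmetic; the helper name epochMillisToDate is illustrative and the input is the sample "time" value from the question.

```javascript
// Plain JavaScript analogue of {$add: [new Date(0), "$time"]}:
// start at the epoch (zero milliseconds) and add the stored millisecond count.
function epochMillisToDate(ms) {
  return new Date(new Date(0).getTime() + ms);
}

const dt = epochMillisToDate(1395140780706);
console.log(dt.toISOString()); // 2014-03-18T11:06:20.706Z
console.log(dt.getUTCHours()); // 11
```

The UTC hour matches the "hour": 11 in the result above, because the aggregation date operators work in UTC.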
I assume there's no way to do it, because the aggregation framework is written in native code and does not use the V8 engine. So nothing from JavaScript is going to work within the framework (which is also why the aggregation framework runs much faster).
Map/Reduce is a way to work this out, but the aggregation framework definitely has much better performance.
About Map/Reduce performance, read this thread.
Another way to work it out would be to get a "raw" result from the aggregation framework, put it into a JSON array, and then do the conversion by running JavaScript. Something like:
var results = db.events.aggregate(...);
results.forEach(function(data) {
data.date = new Date(data.dateInMilliseconds);
// date is now stored in the "date" property
});
To return a valid BSON date, all you need is a little date "maths" using the $add operator. You need to add new Date(0) to the timestamp. new Date(0) represents the Unix epoch itself (zero milliseconds after midnight UTC on Jan 1, 1970) and is equivalent to new Date("1970-01-01").
db.events.aggregate([
{ "$match": { "time": { "$gte" : 1395136209804, "$lte" : 1395192902825 } } },
{ "$project": {
"hour": { "$hour": { "$add": [ new Date(0), "$time" ] } },
"day": { "$dayOfMonth": { "$add": [ new Date(0), "$time" ] } },
"month": { "$month": { "$add": [ new Date(0), "$time" ] } },
"year": { "$year": { "$add": [ new Date(0), "$time" ] } }
}}
])
Which yields:
{
"_id" : ObjectId("532828ac338ed9c33aa8eca7"),
"hour" : 11,
"day" : 18,
"month" : 3,
"year" : 2014
}
Starting in MongoDB 4.0, there is a new $toDate aggregation operator which can convert from various types to a date (in this case from a long). Note that the $set stage used below was added in 4.2; on 4.0, use $addFields instead:
// { time: NumberLong("1395140780706") }
db.collection.aggregate({ $set: { time: { $toDate: "$time" } } })
// { time: ISODate("2014-03-18T11:06:20.706Z") }
And to get the hour out of it:
// { time: NumberLong("1395140780706") }
db.collection.aggregate({ $project: { hour: { $hour: { $toDate: "$time" } } } })
// { hour: 11 }
Use this if {$add: [new Date(0), "$time"]} returns a string rather than an ISO date.
I tried all of the options above and they still failed, because the new date from my $project came back as a string like '2000-11-2:xxxxxxx' rather than a date like ISO('2000-11-2:xxxxxxx'). For anyone who has the same problem, use this:
db.events.aggregate(
{
$project : {
_id : "$_id",
dt : {$add: [new Date(0), "$time"]}
}
},
{
$project : {
_id : "$_id",
"year": { $substr: [ "$dt", 0, 4 ] },
"month": { $substr: [ "$dt", 5, 2] },
"day": { $substr: [ "$dt", 8, 2 ] }
}
}
);
The result will be:
{ _id: '59f940eaea87453b30f42cf5',
year: '2017',
month: '07',
day: '04'
},
You can get the hours or minutes as well, depending on which substring you extract, and then group again by the same day, month, or year.

How to handle date between India Time on Client side and Server date in US Time via mongo

I have an app where the client location will be in India. My application has to aggregate data based on the date range the client has given. So if the client gives 14-Dec-2016 to 21-Dec-2016, it should search from 14-Dec-2016 00:00:00 to 21-Dec-2016 23:59:59.
Now as soon as I send my dates to my server they get converted to
Dec 13 2016 18:30:00 GMT+0000 (UTC)
Dec 21 2016 18:29:59 GMT+0000 (UTC)
Now I write my aggregation query as
let cursor = Trip.aggregate([{
$match: {
startTime: {
$gte: startDate.toDate(),
$lte: endDate.toDate()
},
}
},{
$group: {
_id: {
date: {
$dayOfMonth: "$startTime"
},
month: {
$month: "$startTime"
},
year: {
$year: "$startTime"
}
},
count: {
$sum: 1
}
}
}]);
Which results in following output
[ { _id: { date: 17, month: 12, year: 2016 }, count: 2 },
{ _id: { date: 16, month: 12, year: 2016 }, count: 2 },
{ _id: { date: 13, month: 12, year: 2016 }, count: 2 } ]
The actual time the trip took place was
"startTime" : ISODate("2016-12-13T20:10:20.381Z")
"startTime" : ISODate("2016-12-13T19:54:56.855Z")
In India time, these actually took place on 14-12-2016 at 01:40:20 am and 14-12-2016 at 01:24:56 am.
I want everything to be in one time zone, but MongoDB does not allow storing dates in any time zone other than UTC, and it is getting difficult to manage different times between the client-side query and the database.
How should I go about solving it?
You can approach it the following way: save the records with an offset in milliseconds. Your collection will then look something like this:
{
"_id": ObjectId("585a97dcaceaaa5d2254aeb5"),
"start_date": ISODate("2016-12-17T00:00:00Z"),
"offsetmillis": -19080000
} {
"_id": ObjectId("585a97dcaceaaa5d2254aeb6"),
"start_date": ISODate("2016-11-17T00:00:00Z"),
"offsetmillis": -19080000
} {
"_id": ObjectId("585a97dcaceaaa5d2254aeb7"),
"start_date": ISODate("2016-11-13T00:00:00Z"),
"offsetmillis": -19080000
}
And you can update the aggregation query to include the offset millis while processing.
aggregate([{
$match: {
start_date: {
$gte: new ISODate("2016-01-01"),
$lte: new ISODate("2016-12-31")
},
}
}, {
$group: {
_id: {
date: {
$dayOfMonth: {
$add: ["$start_date", "$offsetmillis"]
}
},
month: {
$month: {
$add: ["$start_date", "$offsetmillis"]
}
},
year: {
$year: {
$add: ["$start_date", "$offsetmillis"]
}
}
},
count: {
$sum: 1
}
}
}]);
Sample Response
{ "_id" : { "date" : 12, "month" : 11, "year" : 2016 }, "count" : 1 }
{ "_id" : { "date" : 16, "month" : 11, "year" : 2016 }, "count" : 1 }
{ "_id" : { "date" : 16, "month" : 12, "year" : 2016 }, "count" : 1 }
You can optimize it more but I think this will give you an idea.
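The offset arithmetic can be sketched in plain JavaScript using one of the trips from the question. The helper name localDateParts and the constant IST_OFFSET_MILLIS are illustrative. The sketch assumes the offset is stored as local time minus UTC, so it is positive for India (UTC+05:30) and would be negative for zones west of UTC.

```javascript
// India Standard Time is UTC+05:30, i.e. 19800000 ms ahead of UTC.
const IST_OFFSET_MILLIS = 5.5 * 60 * 60 * 1000;

// Shift a UTC instant by the stored offset, then read the UTC fields
// of the shifted value to obtain the local calendar date.
function localDateParts(utcDate, offsetMillis) {
  const shifted = new Date(utcDate.getTime() + offsetMillis);
  return {
    date: shifted.getUTCDate(),
    month: shifted.getUTCMonth() + 1, // getUTCMonth() is zero-based
    year: shifted.getUTCFullYear()
  };
}

const parts = localDateParts(new Date("2016-12-13T20:10:20.381Z"), IST_OFFSET_MILLIS);
console.log(parts); // { date: 14, month: 12, year: 2016 }
```

With the offset applied, the trip stored as 13 December in UTC is grouped under 14 December, which is the local date the client expects.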

Group and count distinct occurrences

I am trying to derive a query to get a count of distinct values and display the relevant fields. The grouping is done by the tempId and the date where the tempId can occur one-to-many times within a single day and within a time frame.
The following is my approach:
db.getCollection('targetCollection').aggregate(
{
$match:{
"user.vendor": 'vendor1',
tool: "tool1",
date: {
"$gte": ISODate("2016-04-01"),
"$lt": ISODate("2016-04-04")
}
}
},
{
$group:{
_id: {
tempId: '$tempId',
month: { $month: "$date" },
day: { $dayOfMonth: "$date" },
year: { $year: "$date" }
},
count: {$sum : 1}
}
},
{
$group:{
_id: 1,
count: {$sum : 1}
}
})
This query generates the following output,
{
"_id" : 1,
"count" : 107
}
This is correct, but I would like to show the counts separated by date, with the particular count for each date. For example, something like this:
{
"date" : 2016-04-01
"count" : 50
},
{
"date" : 2016-04-02
"count" : 30
},
{
"date" : 2016-04-03
"count" : 27
}
P.S. I am not sure how to put this question together as I am quite new to this technology. Please let me know if refinements are required in the question.
Following is the sample data of the mongodb collection that I am trying to query,
{
"_id" : 1,
"tempId" : "temp1",
"user" : {
"_id" : "user1",
"email" : "user1#email.com",
"vendor" : "vendor1"
},
"tool" : "tool1",
"date" : ISODate("2016-03-09T08:30:42.403Z")
},...
I have come up with the solution myself. What I did was:
I first grouped by the tempId and the date.
Then I grouped by the date.
This printed out the daily distinct count of tempId, which is the result I want. The query is as follows:
db.getCollection('targetCollection').aggregate(
{
$match:{
"user.vendor": 'vendor1',
tool: "tool1",
date: {
"$gte": ISODate("2016-04-01"),
"$lt": ISODate("2016-04-13")
}
}
},
{
$group:{
_id: {
tempId: "$tempId",
month: { $month: "$date" },
day: { $dayOfMonth: "$date" },
year: { $year: "$date" }
},
count: {$sum : 1}
}
},
{
$group:{
_id: {
month:"$_id.month" ,
day: "$_id.day" ,
year: "$_id.year"
},
count: {$sum : 1}
}
})
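The two-stage grouping above can be mirrored in plain JavaScript to see why it yields a distinct count per day. This is a sketch on made-up sample documents; the function name distinctTempIdsPerDay and the sample data are illustrative, and days are taken in UTC.

```javascript
// Stage 1 equivalent: collect the distinct tempId values seen on each day.
// Stage 2 equivalent: count how many distinct values each day holds.
function distinctTempIdsPerDay(docs) {
  const perDay = new Map(); // "YYYY-MM-DD" -> Set of tempId
  for (const doc of docs) {
    const day = doc.date.toISOString().slice(0, 10);
    if (!perDay.has(day)) perDay.set(day, new Set());
    perDay.get(day).add(doc.tempId);
  }
  return [...perDay].map(([date, ids]) => ({ date, count: ids.size }));
}

// Hypothetical sample: temp1 appears twice on the first day but counts once.
const sampleDocs = [
  { tempId: "temp1", date: new Date("2016-04-01T08:30:00Z") },
  { tempId: "temp1", date: new Date("2016-04-01T09:45:00Z") },
  { tempId: "temp2", date: new Date("2016-04-01T10:00:00Z") },
  { tempId: "temp1", date: new Date("2016-04-02T11:00:00Z") }
];
console.log(distinctTempIdsPerDay(sampleDocs));
```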
Group them by date:
db.getCollection('targetCollection').aggregate([
{
$match:{
"user.vendor": 'vendor1',
tool: "tool1",
date: {
"$gte": ISODate("2016-04-01"),
"$lt": ISODate("2016-04-04")
}
}
},
{
$group: {
_id: {
date: "$date",
tempId: "$tempId"
},
count: { $sum: 1 }
}
}
]);

Get distinct ISO dates by days, months, year

I want to get a distinct set of years and months for all document objects in my MongoDB.
For example, if documents have dates:
2015/08/11
2015/08/11
2015/08/12
2015/09/14
2014/10/30
2014/10/30
2014/08/11
Return unique months and years for all documents, ex:
2015/08
2015/09
2014/10
2014/08
Schema snippet:
var myObjSchema = mongoose.Schema({
date: Date,
request: {
...
I tried using distinct against schema field date:
db.mycollection.distinct('date', {}, {})
But this gave duplicate dates. Output snippet:
ISODate("2015-08-11T20:03:42.122Z"),
ISODate("2015-08-11T20:53:31.135Z"),
ISODate("2015-08-11T21:31:32.972Z"),
ISODate("2015-08-11T22:16:27.497Z"),
ISODate("2015-08-11T22:41:58.587Z"),
ISODate("2015-08-11T23:28:17.526Z"),
ISODate("2015-08-11T23:38:45.778Z"),
ISODate("2015-08-12T06:21:53.898Z"),
ISODate("2015-08-12T13:25:33.627Z"),
ISODate("2015-08-12T14:46:59.763Z")
So the question is:
a: How can I accomplish the above?
b: Is it possible to specify which part of the date you want distinct? Like distinct('date.month'...)?
EDIT: I've found you can get these dates and such with the following query, however the results are not distinct:
db.mycollection.aggregate(
[
{
$project : {
month : {
$month: "$date"
},
year : {
$year: "$date"
},
day: {
$dayOfMonth: "$date"
}
}
}
]
);
Output: duplicates
{ "_id" : "", "month" : 7, "year" : 2015, "day" : 14 }
{ "_id" : "", "month" : 7, "year" : 2015, "day" : 15 }
{ "_id" : "", "month" : 7, "year" : 2015, "day" : 15 }
You need to group your documents after the projection and use the $addToSet accumulator operator:
db.mycollection.aggregate([
{ "$project": {
"year": { "$year": "$date" },
"month": { "$month": "$date" }
}},
{ "$group": {
"_id": null,
"distinctDate": { "$addToSet": { "year": "$year", "month": "$month" }}
}}
])
Indeed, you can get distinct values via a $group/_id: null/$addToSet stage.
I'm also including here the use of $dateToString, which formats your dates as "%Y-%m" (e.g. 2021-12).
// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-08") }
// { date: ISODate("2022-04-05") }
// { date: ISODate("2022-12-14") }
db.collection.aggregate([
{ $group: {
_id: null,
months: { $addToSet: { $dateToString: { date: "$date", format: "%Y-%m" } } }
}}
])
// { _id: null, months: ["2021-12", "2022-04", "2022-12"] }
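For comparison, the same "distinct year-month" idea can be sketched in plain JavaScript with a Set. The function name distinctYearMonths is illustrative, months are taken in UTC, and the explicit sort is an addition of this sketch, since a set on its own does not promise any particular order.

```javascript
// Format each date as "YYYY-MM" and keep only the unique values.
function distinctYearMonths(dates) {
  const months = new Set(dates.map(d => d.toISOString().slice(0, 7)));
  return [...months].sort();
}

const sampleDates = [
  new Date("2021-12-05"),
  new Date("2021-12-08"),
  new Date("2022-04-05"),
  new Date("2022-12-14")
];
console.log(distinctYearMonths(sampleDates)); // [ '2021-12', '2022-04', '2022-12' ]
```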
db.mycollection.aggregate(
[
{
"$project": {
"year": { "$year": "$date" },
"month": { "$month": "$date" }
}
},{ $group : {
"_id" :{"year" : "$year" }
}
},
{
$sort: {'_id': -1
}
}
])

Mongo aggregation within intervals of time

I have some log data stored in a mongo collection that includes basic information as a request_id and the time it was added to the collection, for example:
{
"_id" : ObjectId("55ae6ea558a5d3fe018b4568"),
"request_id" : "030ac9f1-aa13-41d1-9ced-2966b9a6g5c3",
"time" : ISODate("2015-07-21T16:00:00.00Z")
}
I was wondering if I could use the aggregation framework to aggregate some statistical data. I would like to get the counts of the objects created within each interval of N minutes for the last X hours.
So the output which I need for 10 minutes intervals for the last 1 hour should be something like the following:
{ "_id" : 0, "time" : ISODate("2015-07-21T15:00:00.00Z"), "count" : 67 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:10:00.00Z"), "count" : 113 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:20:00.00Z"), "count" : 40 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:30:00.00Z"), "count" : 10 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:40:00.00Z"), "count" : 32 }
{ "_id" : 0, "time" : ISODate("2015-07-21T15:50:00.00Z"), "count" : 34 }
I would use that to get data for graphs.
Any advice is appreciated!
There are a couple of ways of approaching this depending on which output format best suits your needs. The main note is that with the "aggregation framework" itself, you cannot actually return something "cast" as a date, but you can get values that are easily reconstructed into a Date object when processing results in your API.
The first approach is to use the "Date Aggregation Operators" available to the aggregation framework:
db.collection.aggregate([
{ "$match": {
"time": { "$gte": startDate, "$lt": endDate }
}},
{ "$group": {
"_id": {
"year": { "$year": "$time" },
"dayOfYear": { "$dayOfYear": "$time" },
"hour": { "$hour": "$time" },
"minute": {
"$subtract": [
{ "$minute": "$time" },
{ "$mod": [ { "$minute": "$time" }, 10 ] }
]
}
},
"count": { "$sum": 1 }
}}
])
This returns a composite key for _id containing all the values you want for a "date". Alternatively, if the range always falls within a single hour, just use the "minute" part and work out the actual date from the startDate of your range selection.
Or you can just use plain "Date math" to get the milliseconds since "epoch", which can again be fed to a Date constructor directly.
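Rebuilding a Date from such a composite key on the client could look like the following plain JavaScript sketch. The function name dateFromGroupKey is illustrative, and the key values are made up to match the sample document's day.

```javascript
// Rebuild a UTC Date from { year, dayOfYear, hour, minute }.
// Date.UTC accepts a day value beyond 31 and rolls it over into later months,
// so a 1-based day of year can be passed as the day of January.
function dateFromGroupKey(key) {
  return new Date(Date.UTC(key.year, 0, key.dayOfYear, key.hour, key.minute));
}

const rebuilt = dateFromGroupKey({ year: 2015, dayOfYear: 202, hour: 15, minute: 10 });
console.log(rebuilt.toISOString()); // 2015-07-21T15:10:00.000Z
```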
db.collection.aggregate([
{ "$match": {
"time": { "$gte": startDate, "$lt": endDate }
}},
{ "$group": {
"_id": {
"$subtract": [
{ "$subtract": [ "$time", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$time", new Date(0) ] },
1000 * 60 * 10
]}
]
},
"count": { "$sum": 1 }
}}
])
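The bucket arithmetic in that $group key is simply "milliseconds minus milliseconds modulo the interval". A minimal plain JavaScript sketch of the same rounding, with the illustrative name floorToInterval, shows the value that ends up as the key and how it turns back into a Date:

```javascript
// Round a date down to the start of its N-minute interval,
// mirroring the $subtract / $mod expression on epoch milliseconds.
function floorToInterval(date, intervalMinutes) {
  const intervalMs = 1000 * 60 * intervalMinutes;
  const ms = date.getTime();
  return new Date(ms - (ms % intervalMs));
}

const bucket = floorToInterval(new Date("2015-07-21T15:17:42.000Z"), 10);
console.log(bucket.getTime());     // epoch milliseconds, as produced by the pipeline
console.log(bucket.toISOString()); // 2015-07-21T15:10:00.000Z
```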
In all cases what you do not want to do is use $project before actually applying $group. As a "pipeline stage", $project must "cycle" through all documents selected and "transform" the content.
This takes time, and adds to the execution total of the query. You can simply just apply to the $group directly as has been shown.
Or if you are really "pure" about a Date object being returned without post-processing, then you can always use "mapReduce", since the JavaScript functions actually allow recasting as a date. It is slower than the aggregation framework, though, and of course comes without a cursor response:
db.collection.mapReduce(
function() {
var date = new Date(
this.time.valueOf()
- ( this.time.valueOf() % ( 1000 * 60 * 10 ) )
);
emit(date,1);
},
function(key,values) {
return Array.sum(values);
},
{ "out": { "inline": 1 } }
)
Your best bet is using aggregation though, as transforming the response is quite easy when the grouping key is the epoch milliseconds value:
db.collection.aggregate([
{ "$match": {
"time": { "$gte": startDate, "$lt": endDate }
}},
{ "$group": {
"_id": {
"$subtract": [
{ "$subtract": [ "$time", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$time", new Date(0) ] },
1000 * 60 * 10
]}
]
},
"count": { "$sum": 1 }
}}
]).forEach(function(doc) {
doc._id = new Date(doc._id);
printjson(doc);
})
And then you have your interval grouping output with real Date objects.
Something like this?
pipeline = [
{"$project":
{"date": {
"year": {"$year": "$time"},
"month": {"$month": "$time"},
"day": {"$dayOfMonth": "$time"},
"hour": {"$hour": "$time"},
"minute": {"$subtract": [
{"$minute": "$time"},
{"$mod": [{"$minute": "$time"}, 10]}
]}
}}
},
{"$group": {"_id": "$date", "count": {"$sum": 1}}}
]
Example:
> db.foo.insert({"time": new Date(2015, 7, 21, 22, 21)})
> db.foo.insert({"time": new Date(2015, 7, 21, 22, 23)})
> db.foo.insert({"time": new Date(2015, 7, 21, 22, 45)})
> db.foo.insert({"time": new Date(2015, 7, 21, 22, 33)})
> db.foo.aggregate(pipeline)
and output:
{ "_id" : { "year" : 2015, "month" : 8, "day" : 21, "hour" : 20, "minute" : 40 }, "count" : 1 }
{ "_id" : { "year" : 2015, "month" : 8, "day" : 21, "hour" : 20, "minute" : 20 }, "count" : 2 }
{ "_id" : { "year" : 2015, "month" : 8, "day" : 21, "hour" : 20, "minute" : 30 }, "count" : 1 }
A pointer in lieu of a concrete answer: you can very easily do it for minutes, hours, and given periods using the date aggregation operators. Every 10 minutes will be a bit trickier, but likely possible with some wrangling. Nevertheless, the aggregation will be slow as nuts on large data sets.
I would suggest extracting the minutes post-insert:
{
"_id" : ObjectId("55ae6ea558a5d3fe018b4568"),
"request_id" : "030ac9f1-aa13-41d1-9ced-2966b9a6g5c3",
"time" : ISODate("2015-07-21T16:00:00.00Z"),
"minutes": 16
}
and, even though it sounds utterly absurd, adding quartiles and sextiles or whatever that N might be:
{
"_id" : ObjectId("55ae6ea558a5d3fe018b4568"),
"request_id" : "030ac9f1-aa13-41d1-9ced-2966b9a6g5c3",
"time" : ISODate("2015-07-21T16:00:00.00Z"),
"minutes": 16,
"quartile": 1,
"sextile": 2
}
First try doing a $divide on the minutes. It doesn't do ceil and floor, but check out:
Is there a floor function in Mongodb aggregation framework?