I want to aggregate entries whose date is less than 1 week old from today.
{
"collection": "my-lovely-collection",
"aggregate": [
{"$match": {"some_field": { "$regex": "awesome*"}}},
{"$match": {"created": {"$lt":
{"$dateToString":
{"date":
{"$dateSubtract":
{"startDate": {"$currentDate": {"$type": "date"}},
"unit": "day",
"amount": 7
}}}}}}},
{"$group": {"_id": "$some_field", "count": {"$sum": 1 }}},
{"$sort": [{"name": "count", "direction": 1}]}
]
}
When I use a hard-coded date for today, everything works fine (but that's not what I want):
{"$match": {"created": {"$lt": "2022-08-29"}}}
You could use $dateSubtract:
{
$match: {
$expr: {
$lt: [
"$created",
{
$dateToString: {
format: "%Y-%m-%d",
date: {
$dateSubtract: {
startDate: "$$NOW",
unit: "week",
amount: 1
}
}
}
}
]
}
}
}
Mongo Playground: https://mongoplayground.net/p/QWT4ar4gbt0
Documentation: https://www.mongodb.com/docs/manual/reference/operator/aggregation/dateSubtract/
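If `created` really is stored as a "YYYY-MM-DD" string, as the hard-coded example in the question suggests, the cutoff can also be computed on the client and passed in as a literal. This is only a sketch: `cutoffDateString` and `matchStage` are hypothetical names, and it assumes the dates are meant in UTC.

```javascript
// Hypothetical helper: "YYYY-MM-DD" string for `days` days before `now`, in UTC.
function cutoffDateString(now, days) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - days * msPerDay);
  // toISOString() always reports UTC, e.g. "2022-08-29T10:00:00.000Z".
  return cutoff.toISOString().slice(0, 10);
}

// The computed literal takes the place of the hard-coded "2022-08-29".
const matchStage = {
  $match: { created: { $lt: cutoffDateString(new Date(), 7) } }
};
```

The string comparison works because zero-padded "YYYY-MM-DD" values sort in the same order as the dates they represent.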
I have a series of documents gathered by aggregation grouping. This is the result for one document:
{
"_id": {
"ip": "79.xxx.xxx.117",
"myDate": "2022-10-19"
},
"date": "2022-10-19",
"allVisitedPages": [
{
"page": "/",
"time": {
"time": "2022-10-19T11:35:44.655Z",
"tz": "-120",
"_id": "634fe1100a011986b7137da0"
}
},
{
"page": "/2",
"time": {
"time": "2022-10-19T12:14:29.536Z",
"tz": "-120",
"_id": "634fea257acb264f23d421f1"
}
},
{
"page": "/",
"time": {
"time": "2022-10-19T15:37:30.002Z",
"tz": "-120",
"_id": "634fea266001ea364eeb38ea"
}
},
],
"visitedPages": 3,
"createdAt": "2022-10-19T11:35:44.920Z"
},
I want to get this (in this case 2 documents as the time difference between array position 2 and 3 is greater than 2 hours):
{
"_id": {
"ip": "79.xxx.xxx.117",
"myDate": "2022-10-19"
},
"date": "2022-10-19",
"allVisitedPages": [
{
"page": "/",
"durationInMinutes": "39",
"time": {
"time": "2022-10-19T11:35:44.655Z",
"tz": "-120",
"_id": "634fe1100a011986b7137da0"
}
},
{
"page": "/2",
"durationInMinutes": "2",
"time": {
"time": "2022-10-19T12:14:29.536Z",
"tz": "-120",
"_id": "634fea257acb264f23d421f1"
}
}
],
"visitedPages": 2,
},
{
"_id": {
"ip": "79.xxx.xxx.117",
"myDate": "2022-10-19"
},
"date": "2022-10-19",
"allVisitedPages": [
{
"page": "/",
"durationInMinutes": "2",
"time": {
"time": "2022-10-19T15:37:30.002Z",
"tz": "-120",
"_id": "634fea266001ea364eeb38ea"
}
},
],
"visitedPages": 1,
},
I want to get a new grouping document whenever the time between one array position and the next is greater than 2 hours. The last array position should always show "2".
I tried $divide and $dateDiff, but that is not possible in the group stage, since $group only accepts accumulator operators. Another approach I tried was to calculate the sum of start and end time by dividing. But how can this be done at the array level in the group stage? Could someone point me in the right direction, if it is possible at all?
You can group and then reduce, but another option is to use $setWindowFields to calculate your grouping index before grouping:
db.collection.aggregate([
{$setWindowFields: {
partitionBy: {$concat: ["$ip", "$date"]},
sortBy: {"time.time": 1},
output: {prevtime: {
$push: "$time.time",
window: {documents: [-1, "current"]}
}}
}},
{$addFields: {
minutesDiff: {
$toInt: {
$dateDiff: {
startDate: {$first: "$prevtime"},
endDate: {$last: "$prevtime"},
unit: "minute"
}
}
}
}},
{$addFields: {deltaIndex: {$cond: [{$gt: ["$minutesDiff", 120]}, 1, 0]}}},
{$setWindowFields: {
partitionBy: {$concat: ["$ip", "$date"]},
sortBy: {"time.time": 1},
output: {
groupIndex: {
$sum: "$deltaIndex",
window: {documents: ["unbounded", "current"]}
},
duration: {
$push: "$minutesDiff",
window: {documents: ["current", 1]}
}
}
}
},
{$set: {
duration: {
$cond: [
{$and: [
{$eq: [{$size: "$duration"}, 2]},
{$lte: [{$last: "$duration"}, 120]}
]},
{$last: "$duration"},
2
]
}
}},
{$group: {
_id: {ip: "$ip", myDate: "$date", groupIndex: "$groupIndex"},
date: {$first: "$date"},
allVisitedPages: {$push: {page: "$page", time: "$time", duration: "$duration"}},
visitedPages: {$sum: 1}
}},
{$unset: "_id.groupIndex"}
])
See how it works on the playground example
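To make the logic of the pipeline easier to follow, here is the same idea in plain JavaScript, outside MongoDB. It is a sketch rather than a replacement: `splitSessions` is a hypothetical name, the input is assumed to be flat `{page, time}` objects with ISO strings, and the minute difference is taken between the two timestamps truncated to the minute, which is how I understand $dateDiff to count minutes.

```javascript
// Hypothetical helper: split visits into sessions wherever the gap exceeds maxGapMinutes.
function splitSessions(visits, maxGapMinutes = 120, lastDuration = 2) {
  // Whole minutes since the epoch, so differences count minute boundaries crossed.
  const minuteOf = (iso) => Math.floor(Date.parse(iso) / 60000);
  const sorted = [...visits].sort((a, b) => Date.parse(a.time) - Date.parse(b.time));
  const sessions = [];
  sorted.forEach((visit, i) => {
    const prev = sorted[i - 1];
    const next = sorted[i + 1];
    const gapBefore = prev ? minuteOf(visit.time) - minuteOf(prev.time) : 0;
    const gapAfter = next ? minuteOf(next.time) - minuteOf(visit.time) : null;
    // A new session starts at the first visit or after a long gap (the groupIndex step).
    if (!prev || gapBefore > maxGapMinutes) sessions.push([]);
    // Duration is the time until the next visit, or the default at the end of a session.
    const duration = gapAfter !== null && gapAfter <= maxGapMinutes ? gapAfter : lastDuration;
    sessions[sessions.length - 1].push({ page: visit.page, time: visit.time, duration });
  });
  return sessions;
}

// The three visits from the question yield two sessions with durations 39, 2 and 2.
const sessions = splitSessions([
  { page: "/", time: "2022-10-19T11:35:44.655Z" },
  { page: "/2", time: "2022-10-19T12:14:29.536Z" },
  { page: "/", time: "2022-10-19T15:37:30.002Z" }
]);
```

The running count of long gaps plays the role of `groupIndex`, and looking one element ahead corresponds to the `["current", 1]` window.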
I'm doing an aggregation where I sum all the sales by month (createdAt), and I'm trying to calculate the variation from the prior value.
How to compare value with the prior value of same field in MongoDB?
[
{"$addFields": { "createdAt": {"$convert": { "input": "$createdAt", "to": "date", "onError": null}}}},
{"$addFields": {"createdAt": {"$cond": {"if": { "$eq": [{"$type": "$createdAt" }, "date"]},
"then": "$createdAt", "else": null}}}},
{"$addFields": {"__alias_0": {"year": {"$year": "$createdAt" }, "month": {"$subtract": [{ "$month": "$createdAt"}, 1] } } }},
{ "$group": { "_id": { "__alias_0": "$__alias_0" }, "__alias_1": {"$sum": 1 }}},
{ "$project": {"_id": 0, "__alias_0": "$_id.__alias_0", "__alias_1": 1}},
{ "$project": {"group": "$__alias_0", "value": "$__alias_1", "_id": 0 }}
]
I have documents like these:
{name: 'doc1', date: "2015-01-01T02:00:12+01:00"},
{name: 'doc2', date: "2015-01-01T03:02:12+01:00"},
{name: 'doc3', date: "2015-01-01T02:17:55+01:00"}
Is it possible to count them by time-intervals (for example: 15 minutes) and get result like this:
{startDate: "2015-01-01T02:00:12+01:00", count: 15},
{startDate: "2015-01-01T02:15:12+01:00", count: 11},
{startDate: "2015-01-01T02:30:12+01:00", count: 21},
...
You can't get an "actual" date object returned this way, but you can get a timestamp value which can be used to construct a date object. It's just a simple matter of date math:
db.collection.aggregate([
{ "$group": {
"_id": {
"$subtract": [
{ "$subtract": [ "$date", new Date("1970-01-01") ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date("1970-01-01") ] },
1000 * 60 * 15
]}
]
},
"count": { "$sum": 1 }
}}
])
Subtracting the epoch date from a "date object" yields that date's timestamp value as a number of milliseconds. The basic math is then to subtract the remainder of the modulo at a 15 minute interval ( 1000 millis * 60 secs * 15 minutes ), which rounds the timestamp down to the start of its interval.
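The same arithmetic in plain JavaScript may make the rounding easier to see. This is a sketch; `bucketStart` is a hypothetical helper name, not something MongoDB provides.

```javascript
// Hypothetical helper mirroring the $subtract / $mod arithmetic of the pipeline.
function bucketStart(date, intervalMinutes) {
  const intervalMs = 1000 * 60 * intervalMinutes;
  const ts = date.getTime(); // milliseconds since 1970-01-01 UTC
  // Dropping the remainder rounds down to the start of the interval.
  return new Date(ts - (ts % intervalMs));
}

// 02:17:55+01:00 is 01:17:55 UTC, which falls into the bucket starting at 01:15:00 UTC.
const start = bucketStart(new Date("2015-01-01T02:17:55+01:00"), 15);
```

Note that the buckets are aligned to the epoch (:00, :15, :30, :45), not to the time of the first document. The numeric `_id` returned by the pipeline can be turned back into a date the same way, with `new Date(doc._id)`.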
If you prefer, there are also Date Aggregation Operators which can split up the date. As before, the results are numbers and not a date, but you can reconstruct a date object from those values.
db.collection.aggregate([
{ "$group": {
"_id": {
"year": { "$year": "$date" },
"month": { "$month": "$date" },
"dayOfMonth": { "$dayOfMonth": "$date" },
"hour": { "$hour": "$date" },
"minute": {
"$subtract": [
{ "$minute": "$date" },
{ "$mod": [
{ "$minute": "$date" },
15
]}
]
}
},
"count": { "$sum": 1 }
}}
])
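Reconstructing a date from that compound `_id` in post-processing could look like the sketch below. `dateFromGroupId` is a hypothetical name, and it assumes the date operators were evaluated in UTC, which is their default.

```javascript
// Hypothetical helper: rebuild a Date from the grouped _id parts.
function dateFromGroupId(id) {
  // Date.UTC uses zero-based months, while $month returns 1 to 12.
  return new Date(Date.UTC(id.year, id.month - 1, id.dayOfMonth, id.hour, id.minute));
}

const bucket = dateFromGroupId({ year: 2015, month: 1, dayOfMonth: 1, hour: 1, minute: 15 });
```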
I have a collection containing a date field. I'm grouping records by week and other related fields.
This is my aggregation query:
db.raw.aggregate([
{ "$match" : {
"Timestamp":{
"$gte": new Date("2012-05-30"),
"$lt": new Date("2014-07-31")
}
}},
{ "$group" : {
"_id":{
"ApplicationId": "$ApplicationId",
"Country": "$Country",
"week":{ "$week": "$Timestamp" }
},
"Date":{ "$first": "$Timestamp" },
"Visits": { "$sum": 1 }
}}
])
I want to project the Visits and the start date of the week, derived from the week number.
For mongo >= v3.4, look at weekStart.
The idea is to subtract milliseconds from the given Timestamp:
db.raw.aggregate([
// stage 1
{ "$match" : {
"Timestamp":{
"$gte": ISODate("2012-05-30"),
"$lt": ISODate("2014-07-31")
}
}},
// stage 2
{ "$project" : {
ApplicationId: 1,
Country: 1,
week: {$isoWeek: "$Timestamp"},
// [TRICK IS HERE] Timestamp - dayOfWeek * msInOneDay
weekStart: { $dateToString: { format: "%Y-%m-%d", date: { // convert date
$subtract: ["$Timestamp", {$multiply: [ {$subtract:[{$isoDayOfWeek: "$Timestamp"},1]}, 86400000]}]
}}}
}},
// stage 3
{ "$group" : {
"_id":{
"ApplicationId": "$ApplicationId",
"Country": "$Country",
"week": "$week"
},
"Date":{ "$first": "$weekStart" },
"Visits": { "$sum": 1 }
}}
])
You seem to want a "date value" representing the date at the start of the week. Your best approach is "date math" with a little help from the aggregation operator $dayOfWeek:
db.raw.aggregate([
{ "$match" : {
"Timestamp":{
"$gte": new Date("2012-05-30"),
"$lt": new Date("2014-07-31")
}
}},
{ "$group" : {
"_id":{
"ApplicationId": "$ApplicationId",
"Country": "$Country",
"weekStart":{
"$subtract": [
{ "$subtract": [
{ "$subtract": [ "$Timestamp", new Date("1970-01-01") ] },
{ "$cond": [
{ "$eq": [{ "$dayOfWeek": "$Timestamp" }, 1 ] },
0,
{ "$multiply": [
1000 * 60 * 60 * 24,
{ "$subtract": [{ "$dayOfWeek": "$Timestamp" }, 1 ] }
]}
]}
]},
{ "$mod": [
{ "$subtract": [
{ "$subtract": [ "$Timestamp", new Date("1970-01-01") ] },
{ "$cond": [
{ "$eq": [{ "$dayOfWeek": "$Timestamp" }, 1 ] },
0,
{ "$multiply": [
1000 * 60 * 60 * 24,
{ "$subtract": [{ "$dayOfWeek": "$Timestamp" }, 1 ] }
]}
]}
]},
1000 * 60 * 60 * 24
]}
]
}
},
"Date":{ "$first": "$Timestamp" },
"Visits": { "$sum": 1 }
}}
])
Or a little cleaner with $let from MongoDB 2.6 and upwards:
db.raw.aggregate([
{ "$match" : {
"Timestamp":{
"$gte": new Date("2012-05-30"),
"$lt": new Date("2014-07-31")
}
}},
{ "$group" : {
"_id":{
"ApplicationId": "$ApplicationId",
"Country": "$Country",
"weekStart":{
"$let": {
"vars": {
"dayMillis": 1000 * 60 * 60 * 24,
"beginWeek": {
"$subtract": [
{ "$subtract": [ "$Timestamp", new Date("1970-01-01") ] },
{ "$cond": [
{ "$eq": [{ "$dayOfWeek": "$Timestamp" }, 1 ] },
0,
{ "$multiply": [
1000 * 60 * 60 * 24,
{ "$subtract": [{ "$dayOfWeek": "$Timestamp" }, 1 ] }
]}
]}
]
}
},
"in": {
"$subtract": [
"$$beginWeek",
{ "$mod": [ "$$beginWeek", "$$dayMillis" ]}
]
}
}
}
},
"Date":{ "$first": "$Timestamp" },
"Visits": { "$sum": 1 }
}}
])
The resulting value in the "grouping" is the epoch milliseconds that represents the start of the day at the start of the week. The "start of the week" is generally considered to be "Sunday", so if you intend another day then you would need to adjust by the appropriate amount. The $add operator with the $$dayMillis variable value can be used here to apply "Monday" for example.
It's not a date object, but something that you can easily feed to another method to construct a date object in post processing.
Also note that other things you are using such as $first usually require that the documents are sorted in a particular order, or generally by your "Timestamp" values. If those documents are not already ordered then you either $sort first or use an operator such as $min to get the first actual timestamp in the range.
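For the post-processing step, the week-start arithmetic can be checked in plain JavaScript. This is a sketch under assumptions: `weekStart` and its `firstDay` parameter are hypothetical, everything is computed in UTC, and `getUTCDay()` (0 = Sunday) stands in for `$dayOfWeek` minus 1.

```javascript
// Hypothetical helper: midnight (UTC) of the first day of the week containing `date`.
// firstDay: 0 = Sunday (as in the pipeline above), 1 = Monday.
function weekStart(date, firstDay = 0) {
  const dayMillis = 1000 * 60 * 60 * 24;
  // Number of whole days to step back to reach the first day of the week.
  const daysBack = (date.getUTCDay() - firstDay + 7) % 7;
  const beginWeek = date.getTime() - daysBack * dayMillis;
  // Dropping the remainder strips the time of day, as the $mod step does.
  return new Date(beginWeek - (beginWeek % dayMillis));
}

// 2014-07-30 was a Wednesday.
const sundayStart = weekStart(new Date("2014-07-30T15:30:00Z"));    // 2014-07-27
const mondayStart = weekStart(new Date("2014-07-30T15:30:00Z"), 1); // 2014-07-28
```

A grouped value coming back from the pipeline only needs the last step: `new Date(doc._id.weekStart)`.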
With MongoDB 4.0 or later ($toString and the format option of $dateFromString both require it):
{
'$project' : {
'firstDateOfWeek': {
'$dateFromString': {
'dateString': {
'$concat': [
{
'$toString': '$_id.year'
},
'-',
{
'$toString': '$_id.week'
}
]
},
'format': "%G-%V"
}
}
}
}
From mongo 3.6
https://docs.mongodb.com/manual/reference/operator/aggregation/dateFromParts/
db.raw.aggregate([
{
"$match": {
"Timestamp": {
"$gte": new Date("2012-05-30"),
"$lt": new Date("2014-07-31")
}
}
},
{
"$group": {
"_id": {
"ApplicationId": "$ApplicationId",
"Country": "$Country",
"week": {
"$isoWeek": "$Timestamp"
},
"year": {
"$isoWeekYear": "$Timestamp" // ISO week-numbering year, so it pairs correctly with $isoWeek around New Year
}
},
"Visits": {
"$sum": 1
}
}
},
{
"$addFields": {
"Date": {
$dateFromParts: {
isoWeekYear: '$_id.year',
isoWeek: '$_id.week',
isoDayOfWeek: 1
}
}
}
}
])
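To see what date an ISO week-year and week number resolve to, the calculation can be reproduced in plain JavaScript. This is a sketch, not MongoDB's implementation: `isoWeekStart` is a hypothetical name, and it relies on the ISO 8601 rule that 4 January always falls in week 1.

```javascript
// Hypothetical helper: Monday (UTC midnight) of the given ISO week.
function isoWeekStart(isoWeekYear, isoWeek) {
  const dayMillis = 1000 * 60 * 60 * 24;
  const jan4 = Date.UTC(isoWeekYear, 0, 4);
  // ISO day of week: 1 = Monday ... 7 = Sunday (getUTCDay() returns 0 for Sunday).
  const isoDay = new Date(jan4).getUTCDay() || 7;
  const week1Monday = jan4 - (isoDay - 1) * dayMillis;
  return new Date(week1Monday + (isoWeek - 1) * 7 * dayMillis);
}

const firstWeek2014 = isoWeekStart(2014, 1);  // 2013-12-30, still in calendar year 2013
const week31 = isoWeekStart(2014, 31);        // 2014-07-28
```

The first example shows why the calendar year and the ISO week-numbering year can differ for dates close to New Year.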
For MongoDB >= v5.0 there is an even easier option now with the $dateTrunc operator, e.g.
$project: {
weekStart: {
$dateTrunc: {
date: "$Timestamp",
unit: "week",
startOfWeek: "Monday",
}
},
}
Here is my query. I would like to combine the parts of $_id into a YYYY-MM-DD string. Or is there any function like MySQL's DATE() to convert a DATETIME value to a DATE?
db.event.aggregate([
{
$project: {
"created": {$add: ["$created", 60*60*1000*8]},
}
},
{
$group: {
"_id": {
"year": {"$year": "$created"},
"month": {"$month": "$created"},
"day": {"$dayOfMonth": "$created"}
},
"count": { $sum: 1 }
}
}
])
You basically already are by using the date aggregation operators to split up the components into your compound _id key, and this is probably the best way to handle it. You can actually alter this though with the $substr operator and use of $concat:
db.event.aggregate([
{ "$project": {
"created": {$add: ["$created", 60*60*1000*8]},
}},
{ "$group": {
"_id": {
"year": {"$year": "$created"},
"month": {"$month": "$created"},
"day": {"$dayOfMonth": "$created"}
},
"count": { "$sum": 1 }
}},
{ "$project": {
"_id": { "$concat": [
{ "$substr": [ "$_id.year", 0, 4 ] },
"-",
{ "$cond": [
{ "$lte": [ "$_id.month", 9 ] },
{ "$concat": [
"0",
{ "$substr": [ "$_id.month", 0, 2 ] }
]},
{ "$substr": [ "$_id.month", 0, 2 ] }
]},
"-",
{ "$cond": [
{ "$lte": [ "$_id.day", 9 ] },
{ "$concat": [
"0",
{ "$substr": [ "$_id.day", 0, 2 ] }
]},
{ "$substr": [ "$_id.day", 0, 2 ] }
]}
]},
"count": 1
}}
])
So there is a bit of coercion of the values from the date parts to strings there, as well as padding out any values under two digits with a leading 0, just like in a "YYYY-MM-DD" format.
Noting that it can be done, and has been able to be done for some time, but it is notably missing from the manual page description of the $substr operator.
Not too sure about your "date math" at the start there. I would say you would be better off using the aggregation operators and then working on the values that you wanted to adjust by, or if indeed it was something like a "timezone" correction, then again you would probably be better off processing that client side.
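If the formatting is moved to the client as suggested, the padding becomes a one-liner. A minimal sketch, assuming the compound `_id` from the $group stage above; `formatDateKey` is a hypothetical name.

```javascript
// Hypothetical helper: turn { year, month, day } into a "YYYY-MM-DD" string.
function formatDateKey(id) {
  // padStart adds the leading zero for single-digit months and days.
  const pad = (n) => String(n).padStart(2, "0");
  return `${id.year}-${pad(id.month)}-${pad(id.day)}`;
}

const key = formatDateKey({ year: 2014, month: 7, day: 5 }); // "2014-07-05"
```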