mongodb group every 2 weeks

This is how to group data weekly:
db.transactions.aggregate([
  {$match: {status: "committed"}},
  {$group: {
    // each key inside _id needs its own field name
    _id: {
      year: {$year: "$date"},
      week: {$week: "$date"}
    },
    count: {$sum: 1},
    start_date: {$min: "$date"},
    end_date: {$max: "$date"}
  }}
]);
The question is: how can I group every 2 weeks instead and get the count?
Thank you.
CORRECT WAY ACCORDING TO ACCEPTED ANSWER
Thanks to Mzzl, this works for grouping every 2 weeks. Below is the complete version.
db.transactions.aggregate([
{$match: {status: "committed"}},
{$project: {
twoweekperiod: {
$subtract: [
{$week: "$date"}, {$mod: [{$week: "$date"}, 2] }
]
},
date:1,
status:1
}},
{$group: {
_id: {
year: {$year: "$date"},
twoweek: "$twoweekperiod"
},
count: {$sum: 1},
start_date: {$min: "$date"},
end_date: {$max: "$date"}
}}
])

Assuming $week contains the week number, you can create an 'every two weeks' number, and group by that. Try something like this:
{
$project: {
twoweekperiod: {
$subtract: [
'$week', {$mod: ['$week', 2] }
]
}, status:1, date:1, etc...
}
}
I don't know of a way to do an integer divide in a mongo query, so instead I subtract weeknumber mod 2 from the weeknumber, to get a number that changes every other week instead of every week. You could then try grouping by this number.
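The arithmetic behind this can be checked outside the database. Below is a minimal plain-JavaScript sketch; twoWeekPeriod is a hypothetical helper name used only for illustration, and the input is assumed to be a week number as returned by $week (0 to 53).

```javascript
// Plain-JavaScript sketch of the bucketing arithmetic (no database needed).
// twoWeekPeriod is a hypothetical helper name, not a MongoDB operator.
function twoWeekPeriod(week) {
  // Subtracting the remainder rounds the week number down to the nearest even number.
  return week - (week % 2);
}

// Weeks 0..5 collapse into three buckets: 0, 2 and 4.
const buckets = [0, 1, 2, 3, 4, 5].map(twoWeekPeriod);
console.log(buckets); // [ 0, 0, 2, 2, 4, 4 ]
```

Every pair of consecutive weeks (even week, then odd week) maps to the same even number, which is why grouping by this value yields two-week periods.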

Related

Filter nested objects

I have a collection of docs like
{'id':1, 'score': 1, created_at: ISODate(...)}
{'id':1, 'score': 2, created_at: ISODate(...)}
{'id':2, 'score': 1, created_at: ISODate(...)}
{'id':2, 'score': 20, created_at: ISODate(...)}
etc.
Does anyone know how to find docs that were created within the past 24hrs where the difference of the score value between the two most recent docs of the same id is less than 5?
So far I can only find all docs created within the past 24hrs:
[{
$project: {
_id: 0,
score: 1,
created_at: 1
}
}, {
$match: {
$expr: {
$gte: [
'$created_at',
{
$subtract: [
'$$NOW',
86400000
]
}
]
}
}
}]
Any advice appreciated.
Edit: By the two most recent docs, the oldest of the two can be created more than 24hrs ago. So the most recent doc would be created within the past 24hrs, but the oldest doc could be created over 24hrs ago.
If I understand you correctly, you want something like:
db.collection.aggregate([
{$match: {$expr: {$gte: ["$created_at", {$subtract: ["$$NOW", 86400000]}]}}},
{$sort: {created_at: -1}},
{$group: {_id: "$id", data: {$push: "$$ROOT"}}},
{$project: {pair: {$slice: ["$data", 0, 2]}, scores: {$slice: ["$data.score", 0, 2]}}},
{$match: {$expr: {
$lt: [{$abs: {$subtract: [{$first: "$scores"}, {$last: "$scores"}]}}, 5]
}}},
{$unset: "scores"}
])
See how it works on the playground example
EDIT:
According to your comment, one option is:
db.collection.aggregate([
{$setWindowFields: {
partitionBy: "$id",
sortBy: {created_at: -1},
output: {data: {$push: "$$ROOT", window: {documents: ["current", 1]}}}
}},
{$group: {
_id: "$id",
created_at: {$first: "$created_at"},
pair: {$first: "$data"}
}},
{$match: {$expr: {$and: [
{$gte: ["$created_at", {$dateAdd: {startDate: "$$NOW", unit: "day", amount: -1}}]},
{$eq: [{$size: "$pair"}, 2]},
{$lt: [{$abs: {$subtract: [{$first: "$pair.score"},
{$last: "$pair.score"}]}}, 5]}
]}}},
{$project: {_id: 0, pair: 1}}
])
See how it works on the playground example
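To make the intended logic easier to follow, here is a minimal plain-JavaScript sketch of the same idea, run over an in-memory array instead of a collection. The helper name closePairs and its parameters are assumptions made for illustration. It follows the edited question: the most recent document must be within the last 24 hours, the older one may be older, and the score difference must be less than 5.

```javascript
// Hypothetical helper: docs is an array of {id, score, created_at: Date}.
function closePairs(docs, now, maxDiff = 5, windowMs = 24 * 60 * 60 * 1000) {
  const byId = new Map();
  // Sort newest first, then collect the documents of each id in that order.
  const sorted = [...docs].sort((a, b) => b.created_at - a.created_at);
  for (const doc of sorted) {
    if (!byId.has(doc.id)) byId.set(doc.id, []);
    byId.get(doc.id).push(doc);
  }
  const result = [];
  for (const [id, group] of byId) {
    const pair = group.slice(0, 2); // the two most recent documents
    if (pair.length < 2) continue;  // a single document has no pair to compare
    const recentEnough = now - pair[0].created_at <= windowMs;
    const diff = Math.abs(pair[0].score - pair[1].score);
    if (recentEnough && diff < maxDiff) result.push({ id, pair });
  }
  return result;
}
```

Calling closePairs(docs, new Date()) returns one entry per id whose two most recent scores are close enough.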
If I've understood correctly, you can try this query:
1. First, $match as you already do, to keep only the documents created since a day ago.
2. Then $sort by date so that the most recent documents come first.
3. $group by id. Because the most recent documents come first, the first two elements of the array built with $push are the two most recent scores.
4. $sum these two values.
5. Finally, filter again to keep only the results that are less than ($lt) 5.
db.collection.aggregate([
{
$match: {
$expr: {
$gte: [
"$created_at",
{
$subtract: [
"$$NOW",
86400000
]
}
]
}
}
},
{
"$sort": {
"created_at": -1
}
},
{
"$group": {
"_id": "$id",
"score": {
"$push": "$score"
}
}
},
{
"$project": {
"score": {
"$sum": {
"$firstN": {
"n": 2,
"input": "$score"
}
}
}
}
},
{
"$match": {
"score": {
"$lt": 5
}
}
}
])
Example here
Edit: $firstN is new in version 5.2. On older versions you can use $slice instead to take the first two elements.

How do I group by date while splitting multi-day records into multiples?

I have a mongodb collection that has the processName, start and stop fields. Each document thus describes a process that has occurred during a time interval. Each time interval can range from minutes to months, and there can be multiples.
I need to find out the total time each process ran each day. That, I think, requires splitting a process which ran for, say, 30 days, into 30 records, and then grouping per day per processName. Ideally, I wouldn't count processes running simultaneously twice, but if it's very hard to do, I can sacrifice that for the sake of readability.
One option is:
1. Create a list of dates according to the difference in days
2. Use $zip to create an array of timestamp pairs, one per day
3. $unwind
4. Calculate each document's key using $dateToString and its working minutes using $dateDiff
5. Group and format
db.collection.aggregate([
{$set: {
days: {$map: {
input: {$range: [1,
{$add: [{$dateDiff: {startDate: "$start", endDate: "$stop", unit: "day"}}, 1]}]},
in: {
$dateAdd: {startDate: {$dateTrunc: {date: "$start", unit: "day"}},
unit: "day",
amount: "$$this"
}}
}}
}},
{$project: {
_id: 0,
processName: 1,
timestamp: {$zip: {
inputs: [
{$concatArrays: [["$start"], "$days"]},
{$concatArrays: ["$days", ["$stop"]]}
]
}}
}},
{$unwind: "$timestamp"},
{$project: {
processName: 1,
key: {$dateToString: {
date: {$dateTrunc: {date: {$first: "$timestamp"}, unit: "day"}},
format: "%Y-%m-%d"
}},
minutes: {$dateDiff: {
startDate: {$first: "$timestamp"},
endDate: {$last: "$timestamp"},
unit: "minute"
}}
}},
{$group: {
_id: {processName: "$processName", key: "$key"},
totalDurationHrs: {$sum: "$minutes"}
}},
{$project: {
_id: 0,
totalDurationHrs: {$round: [{$divide: ["$totalDurationHrs", 60]}, 1]},
date: "$_id.key",
processName: "$_id.processName"}}
])
See how it works on the playground example
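The splitting step can also be sketched in plain JavaScript, which may help when checking the pipeline's output by hand. The helper minutesPerDay is a hypothetical name, and the sketch assumes days are cut at UTC midnight. It uses exact millisecond arithmetic, so results may differ slightly from $dateDiff, which counts unit boundaries.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: splits the interval from start to stop at UTC midnights
// and returns the minutes that fall on each calendar day, keyed by "YYYY-MM-DD".
function minutesPerDay(start, stop) {
  const result = {};
  let cursor = start.getTime();
  const end = stop.getTime();
  while (cursor < end) {
    // First UTC midnight strictly after the cursor.
    const nextMidnight = Math.floor(cursor / DAY_MS) * DAY_MS + DAY_MS;
    const segmentEnd = Math.min(nextMidnight, end);
    const key = new Date(cursor).toISOString().slice(0, 10);
    result[key] = (result[key] || 0) + (segmentEnd - cursor) / 60000;
    cursor = segmentEnd;
  }
  return result;
}
```

Summing these per-day minutes for all documents with the same processName gives the same kind of totals that the $group stage produces.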

How to get number of working days between two dates in mongodb aggregation

I have two dates, Start Date: 2019-08-13 and End Date: 2019-08-20. I want the difference between the end date and the start date in terms of working days (excluding holidays). So my result would be 4, as 2019-08-15 is a holiday and 2019-08-17 and 2019-08-18 are weekend days.
Try this query:
db.collection.aggregate([
{$addFields: {days_in_millis: { $add: [{$subtract: ["$end_date", "$start_date"]}, 86400000] } }},
{$project: {end_date: 1, start_date: 1, millis_range: {$range: [0, "$days_in_millis", 86400000 ] } } },
{$project: {dates_in_between_inclusive: {
$map: {
input: "$millis_range",
as: "millis_count",
in: {$add: ["$start_date", "$$millis_count"]}
}
}}},
{$unwind: "$dates_in_between_inclusive"},
{$project: {date: "$dates_in_between_inclusive", is_not_weekend: {
$cond: {
if: { $in: [ {$dayOfWeek: "$dates_in_between_inclusive"}, [1, 7] ]},
then: 0,
else: 1
}}}},
{$match: {date: {$nin: holidays_dates_list}}},
{$group: {
_id: "$_id",
days: {$sum: "$is_not_weekend"}
}}
])
Assumptions:
1. Every document has at least start_date and end_date fields, which are MongoDB dates.
2. "holidays_dates_list" is an array of dates containing the holidays (it may or may not include weekends).
The query above filters out weekends itself, so "holidays_dates_list" need not include them.
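For comparison, the same counting logic can be sketched in plain JavaScript. The helper workingDays is a hypothetical name; the sketch assumes all dates are UTC midnights and, like the query above, counts both the start and the end date.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: counts days from start to end (both inclusive)
// that are neither weekend days nor listed in holidays.
function workingDays(start, end, holidays = []) {
  const holidaySet = new Set(holidays.map((d) => d.getTime()));
  let count = 0;
  for (let t = start.getTime(); t <= end.getTime(); t += MS_PER_DAY) {
    const dow = new Date(t).getUTCDay(); // 0 = Sunday, 6 = Saturday
    const isWeekend = dow === 0 || dow === 6;
    if (!isWeekend && !holidaySet.has(t)) count += 1;
  }
  return count;
}
```

Note that JavaScript numbers weekdays 0 to 6 starting on Sunday, while $dayOfWeek uses 1 to 7, which is why the query checks for [1, 7]. Because both endpoints are counted, a "difference" that excludes the start day would be one less than this result whenever the start day is itself a working day.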

Mongodb need count for specific member id per day

Collection name : activity
What I need is activity count
of "memberId" = 123
where activity "type" = 'xxx'
per day
between "11/01/2015" and "11/15/2015" // from date and to date range
Expected Output:
[
{date:"2015-02-22",count:10},
{date:"2015-02-22",count:5},
]
I have no idea how to aggregate between two dates for a specific member id.
What I have so far is far from the solution:
db.activity.aggregate(
{ $project: {
date: {
years: {$year: '$dateInserted'},
months: {$month: '$dateInserted'},
days: {$dayOfMonth: '$dateInserted'},
},
memberId: '$memberId'
}},
{ $group: {
_id: { memberId: '$memberId', date: '$date' },
number: { $sum: 1}
}})
db.activity.aggregate([
{$match : { memberId : "xxx",item:"xyz",dateInserted: {$gte: ISODate("2013-01-01T00:00:00.0Z"), $lt: ISODate("2016-02-01T00:00:00.0Z")}}},
{$project: {day: {day: {$dayOfMonth: '$dateInserted'}, month: {$month: '$dateInserted'}, year: {$year: '$dateInserted'}}}},
{$group: { _id: { day: '$day' }, count: { $sum: 1} }},{ $sort:{_id:1}}
]);
Try that
You may need to add a $match stage before $project:
{$match: {
dateInserted: {$gte: new Date("2015-01-11T00:00:00-02:00"), $lte: new Date("2015-01-15T23:59:59-02:00")},
memberId: 123,
type: "xxx"
}}
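As a rough cross-check of what the filter and the per-day grouping should return, here is a plain-JavaScript sketch over an in-memory array. The helper countPerDay is a hypothetical name; the field names memberId, type and dateInserted come from the question, and the half-open date range (from inclusive, to exclusive) and UTC day keys are assumptions.

```javascript
// Hypothetical helper: counts activities per UTC day for one member and type.
function countPerDay(activities, memberId, type, from, to) {
  const counts = new Map();
  for (const a of activities) {
    if (a.memberId !== memberId || a.type !== type) continue;
    if (a.dateInserted < from || a.dateInserted >= to) continue;
    const day = a.dateInserted.toISOString().slice(0, 10); // "YYYY-MM-DD"
    counts.set(day, (counts.get(day) || 0) + 1);
  }
  // Return the shape from the question: [{date, count}, ...], sorted by date.
  return [...counts.entries()]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([date, count]) => ({ date, count }));
}
```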

Is it possible to type cast data inside an aggregation pipeline on MongoDB?

When I need to aggregate things by date using the aggregate command on MongoDB, I usually do this:
db.mycollection.aggregate(
{
$project: {
day: {$dayOfMonth: "$date"},
mon: {$month: "$date"},
year: {$year: "$date"},
}
},
{
$group: {
_id : {day: "$day", mon: "$mon", year: "$year"},
count: {$sum: 1}
}
}
)
and eventually concatenate the day, mon, and year fields into a date string in the application. For many reasons though, sometimes I want to concatenate the fields before leaving the database, so I initially tried:
db.mycollection.aggregate(
{
$project: {
day: {$dayOfMonth: "$date"},
mon: {$month: "$date"},
year: {$year: "$date"},
}
},
{
$project: {
datestr: {
$concat : ["$year", "-", "$mon", "-", "$day"]
}
}
},
{
$group: {
_id : {day: "$day", mon: "$mon", year: "$year"},
count: {$sum: 1}
}
}
)
This won't work because $concat expects strings and day, mon and year are integers. So, my question is: can I type cast a field with the $project operation?
Yes, you can use $substr for casting a number to a string. Your missing link would look like:
{
$project: {
day_s: { $substr: [ "$day", 0, 2 ] },
mon_s: { $substr: [ "$mon", 0, 2 ] },
year_s: { $substr: [ "$year", 0, 4 ] }
}
}
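On newer servers there is also a dedicated conversion operator. The following is a sketch, assuming MongoDB 4.0 or later where $toString is available; the collection and field names are taken from the question, and grouping on the concatenated string is one possible way to finish the pipeline rather than the only one.

```javascript
// Sketch of a pipeline that casts the numeric date parts to strings.
// Assumption: MongoDB 4.0+ ($toString does not exist in older versions).
const pipeline = [
  { $project: {
      day: { $dayOfMonth: "$date" },
      mon: { $month: "$date" },
      year: { $year: "$date" }
  } },
  { $project: {
      datestr: { $concat: [
        { $toString: "$year" }, "-",
        { $toString: "$mon" }, "-",
        { $toString: "$day" }
      ] }
  } },
  // Group on the string itself, since the second $project drops day, mon and year.
  { $group: { _id: "$datestr", count: { $sum: 1 } } }
];

// In the mongo shell: db.mycollection.aggregate(pipeline)
```

The resulting strings are not zero-padded (for example "2015-2-3"), so they may need padding if they are to sort chronologically.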