Issues aggregating mongo records by the last 7 days - mongodb

I'm trying to count the occurrences of each id_atendente over the last 7 days (one week).
I have the following query:
db.atendimentos.aggregate([
{'$group' :
{'_id' :
{'id_atendente':'$id_atendente', 'date':
{ '$gte': new Date((new Date().getTime() - (7 * 24 * 60 * 60 * 1000))) }
}},
'sum': {'$sum': 1} }
])
I thought that it would work, but it didn't.
I'm aware of the $week operator but I don't think that it does what I want to do.
I've got the following error: A pipeline stage specification object must contain exactly one field.
I guess that it may be something with my 'date': { '$gte' }... part.
Hope to get some help, thanks!

If I understand correctly, you want to take the documents from the last week, group them by id_atendente, and count how many times each atendente appears.
If so, first keep only the documents from the last week with a $match stage, then follow it with a $group stage that groups by the atendente id. The error you got comes from 'sum' sitting outside the $group object, so that stage has two top-level fields ('$group' and 'sum') instead of exactly one.
I think the following code will do the job:
db.atendimentos.aggregate([
{
'$match': {
'date': {'$gte': new Date((new Date().getTime() - (7 * 24 * 60 * 60 * 1000)))}
},
},
{
'$group':
{
'_id': "$id_atendente",
'sum': {'$sum': 1}
},
}
])
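A minimal sketch of the same idea, with the cutoff pulled into a small helper so it can be checked without a database. The names daysAgo and buildWeeklyCountPipeline are hypothetical and not part of any MongoDB API; the field names date and id_atendente are taken from the question.

```javascript
// Hypothetical helper: the Date exactly `days` * 24 hours before `now`.
function daysAgo(days, now = new Date()) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// Builds the two-stage pipeline: filter by date first, then count per atendente.
function buildWeeklyCountPipeline(now = new Date()) {
  return [
    { $match: { date: { $gte: daysAgo(7, now) } } },
    { $group: { _id: "$id_atendente", sum: { $sum: 1 } } },
  ];
}

// Usage in the shell (assumes the collection from the question):
// db.atendimentos.aggregate(buildWeeklyCountPipeline());
```

Each stage object here has exactly one top-level key, which is what the error message in the question is asking for.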

Related

Get number of daily records, per month

I am trying to get some data visualization for an application I am making and I am currently having an issue.
The current query I am using to get the documents grouped by month is the following:
# Generating our pipeline
pipeline = [
{"$match": query_match
},
{"$group": {
'_id': {
'$dateTrunc': {
'date': "$date", 'unit': "month"
}
},
"total": {
"$sum": 1
}
}
},
{'$sort': {
'_id': 1
}
}
]
This, however, returns the total number of documents for each month.
I want to take this a step further and calculate the average number of documents per day, but ONLY for the days for which I have documents.
As an example, the above query currently returns something like this:
Index _id total_documents
0 2022-07-01 10425
1 2022-08-01 27981
2 2022-09-01 24872
3 2022-10-01 1633
What I want is this: for 2022-07, for example, I have documents submitted on 20 of the month's 31 days, so I want to return 10425 / 20 instead of 10425 / 31, which would technically be the daily average for that month.
Is there a way to do this in a single aggregation or would I have to use an additional query to determine how many days I have documents for first?
Thanks
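One possible way to do this in a single aggregation, sketched under assumptions and not run against the asker's data: group by day first, so that only days with at least one document produce a bucket, then roll those buckets up by month. It is written as a JavaScript pipeline; the same stages translate directly to Python dicts for pymongo. The function name buildDailyAveragePipeline and the output fields perDay, activeDays and dailyAverage are hypothetical, and $dateTrunc requires MongoDB 5.0 or later.

```javascript
// Sketch: average number of documents per *active* day, reported per month.
// Assumes a Date field named "date", as in the question's pipeline.
function buildDailyAveragePipeline(queryMatch = {}) {
  return [
    { $match: queryMatch },
    // Stage 1: one bucket per calendar day that has at least one document.
    {
      $group: {
        _id: { $dateTrunc: { date: "$date", unit: "day" } },
        perDay: { $sum: 1 },
      },
    },
    // Stage 2: roll the day buckets up into months.
    {
      $group: {
        _id: { $dateTrunc: { date: "$_id", unit: "month" } },
        total: { $sum: "$perDay" },  // documents in the month
        activeDays: { $sum: 1 },     // days that had documents
      },
    },
    // Stage 3: total / activeDays, e.g. 10425 / 20 for the July example.
    { $addFields: { dailyAverage: { $divide: ["$total", "$activeDays"] } } },
    { $sort: { _id: 1 } },
  ];
}
```

Days without any documents never form a bucket in the first $group, so they are not counted in activeDays.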

How to retrieve documents with createdAt more than 48 hours in mongoose

I need to retrieve documents whose createdAt timestamp is more than 48 hours old in mongoose.
Here's my sample code below, but it doesn't retrieve any documents even though there are documents that match the condition.
Model.find({
createdAt: { $lt: new Date(Date.now() - 2 * 24 * 60 * 60 * 1000) },
});
NB: The createdAt field is the default one mongoose adds when timestamps are enabled ({ timestamps: true }).
I would really appreciate it if anyone can help out, thanks in advance.
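Try
var days = 2;
var date = new Date(); // start from the current time
date.setDate(date.getDate() - days); // move the cutoff back by `days` days
Model.find({ createdAt: { $lt: date } }).countDocuments(); // countDocuments() replaces the deprecated count()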
With the MongoDB aggregation framework, we have access to the $$NOW variable (standalone) or $$CLUSTER_TIME (cluster), which give the current time.
If we subtract 172800000 milliseconds (48 hours) from the current date and use the $expr operator, we can get the desired result.
Try this one:
Model.aggregate([
{
$match: {
$expr: {
$lt: [ // createdAt before the cutoff, i.e. more than 48 hours old
"$createdAt",
{
$toDate: {
$subtract: [
{
$toLong: "$$NOW" // $$NOW is a Date, which $toLong accepts
},
172800000 // 2 x 24 x 60 x 60 x 1000
]
}
}
]
}
}
}
]).exec();
Thanks to everyone who tried to help out.
My solution above works. I wasn't getting any records at the time because, unknown to me, my env file was pointing to another MongoDB server.
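The 48-hour cutoff used throughout this thread can be sketched as a small client-side helper. This is only an illustration: olderThan and HOUR_IN_MS are hypothetical names, and the createdAt field is the one from the question.

```javascript
const HOUR_IN_MS = 60 * 60 * 1000;

// Hypothetical helper: filter for documents created more than `hours` ago.
// `nowMs` is injectable so the cutoff can be checked deterministically.
function olderThan(hours, nowMs = Date.now()) {
  return { createdAt: { $lt: new Date(nowMs - hours * HOUR_IN_MS) } };
}

// 48 hours expressed in milliseconds, the constant used in the aggregation above.
const FORTY_EIGHT_HOURS_MS = 48 * HOUR_IN_MS; // 172800000

// Usage with a Mongoose model (Model is assumed, as in the question):
// Model.find(olderThan(48));
```

Using $lt selects documents created before the cutoff, which is what "more than 48 hours old" means; $gte would select the newer ones instead.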

MongoDB Date range query for past hour

I'm trying to write a MongoDB query that returns data from the last hour.
There is a column time with timestamps ("time" : NumberLong("1471953787012")) and this is how it looks in SQL:
select name from table
where time between (NOW() - INTERVAL 1 HOUR) AND (NOW())
How do I write a MongoDB query to find a date range from one hour ago?
I'm trying with the new Date() function, but it doesn't work.
Does anyone know what I'm doing wrong?
db.coll.find({
"time" : {
$lt: new Date(),
$gte: new Date(new Date().setDate(new Date().getDate()-1))
}
})
db.entity.find({ $and:[
{
"timestamp": {
$gte: new Date(ISODate().getTime() - 1000 * 60 * 60)
}},
{
"timestamp": {
$lte: ISODate()
}}
]})
Hope this helps...
db.coll.find({
"time": { // 60 minutes ago (from now)
$gte: new Date(ISODate().getTime() - 1000 * 60 * 60)
}
})
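Note that the question stores time as a NumberLong of milliseconds since the epoch, not as a BSON Date, and a range query with Date bounds does not match numeric fields. A minimal sketch that compares against numbers instead, assuming the field really holds epoch milliseconds; lastHourFilter and ONE_HOUR_MS are hypothetical names:

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Filter for documents whose numeric `time` falls within the last hour.
// Both bounds are plain numbers, matching the NumberLong storage format.
function lastHourFilter(nowMs = Date.now()) {
  return { time: { $gte: nowMs - ONE_HOUR_MS, $lte: nowMs } };
}

// Usage in the shell (collection name taken from the question):
// db.coll.find(lastHourFilter());
```

If the field is later migrated to a real Date, the Date-based queries above apply unchanged.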

How do you get DayHours from Mongo date field?

I am trying to group by DayHours in a mongo aggregate function to get the past 24 hours of data.
For example: if the time of an event was 6:00 Friday the "DayHour" would be 6-5.
I'm easily able to group by hour with the following query:
db.api_log.aggregate([
{ '$group': {
'_id': {
'$hour': '$time'
},
'count': {
'$sum':1
}
}
},
{ '$sort' : { '_id': -1 } }
])
I feel like there is a better way to do this. I've tried concatenation in the $project stage; however, you can only concatenate strings in mongo (apparently).
I effectively just need to end up grouping by day and hour, however it gets done. Thank You.
I assume that the time field contains an ISODate.
If you want only the last 24 hours, you can use this:
var yesterday = new Date((new Date).setDate(new Date().getDate() - 1));
db.api_log.aggregate([ // stages passed as a single array
{$match: {time: {$gt: yesterday}}},
{$group: {
_id: {
hour: {$hour: "$time"},
day: {$dayOfMonth: "$time"},
},
count: {$sum: 1}
}}
])
If you want general grouping by day-hour you can use this:
db.api_log.aggregate([ // stages passed as a single array
{$group: {
_id: {
hour: {$hour: "$time"},
day: {$dayOfMonth: "$time"},
month: {$month: "$time"},
year: {$year: "$time"}
},
count: {$sum: 1}
}}
])
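The "DayHour" label from the question (for example 6-5 for 6:00 on a Friday) can also be built client-side once the grouped results come back. A small sketch, assuming UTC and JavaScript's day numbering (Sunday = 0, so Friday = 5); note that MongoDB's $dayOfWeek numbers days from 1 (Sunday) to 7 (Saturday), so the two differ by one. dayHourKey is a hypothetical helper name:

```javascript
// Builds an "hour-dayOfWeek" label from a Date, using UTC components.
function dayHourKey(date) {
  return `${date.getUTCHours()}-${date.getUTCDay()}`;
}

// 2013-07-05 was a Friday, so 06:00 UTC on that day gives "6-5".
console.log(dayHourKey(new Date("2013-07-05T06:00:00Z")));
```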
This is not an answer per se (I do not have mongodb at hand to come up with one), but I think that you cannot do this with the aggregation framework alone (I might be wrong, so I will explain myself).
You can obtain date and time information from the ObjectId using the .getTimestamp method. The problem is that you cannot output this information in a mongo query (something like db.find({},{_id.getTimestamp}) does not work). You also cannot search by this field (except by using a $where clause).
So if it is possible to achieve, it can be done only using mapreduce, where in the reduce function you group based on the output of getTimestamp.
If this is a query you are going to run quite often, I would recommend actually adding a date field to your document, because with this field you will be able to aggregate your data properly, and you can also use indexes to avoid scanning the whole collection (as you are doing with $sort -1) and $match only the part that is newer than the current date minus 24 hours.
I hope this can help even without code. If no one is able to answer this, I will try to play with it tomorrow.

MongoDB - Querying between a time range of hours

I have a MongoDB datastore set up with location data stored like this:
{
"_id" : ObjectId("51d3e161ce87bb000792dc8d"),
"datetime_recorded" : ISODate("2013-07-03T05:35:13Z"),
"loc" : {
"coordinates" : [
0.297716,
18.050614
],
"type" : "Point"
},
"vid" : "11111-22222-33333-44444"
}
I'd like to be able to perform a query similar to the date range example, but on a time range instead, i.e. retrieve all points recorded between 12PM and 4PM (can be done with 1200 and 1600 24-hour time as well).
e.g.
With points:
"datetime_recorded" : ISODate("2013-05-01T12:35:13Z"),
"datetime_recorded" : ISODate("2013-06-20T05:35:13Z"),
"datetime_recorded" : ISODate("2013-01-17T07:35:13Z"),
"datetime_recorded" : ISODate("2013-04-03T15:35:13Z"),
a query
db.points.find({'datetime_recorded': {
$gte: Date(1200 hours),
$lt: Date(1600 hours)}
});
would yield only the first and last point.
Is this possible? Or would I have to do it for every day?
Well, the best way to solve this is to store the minutes separately as well. But you can get around this with the aggregation framework, although that is not going to be very fast:
db.so.aggregate( [
{ $project: {
loc: 1,
vid: 1,
datetime_recorded: 1,
minutes: { $add: [
{ $multiply: [ { $hour: '$datetime_recorded' }, 60 ] },
{ $minute: '$datetime_recorded' }
] }
} },
{ $match: { 'minutes' : { $gte : 12 * 60, $lt : 16 * 60 } } }
] );
In the first step, $project, we calculate the minutes of the day as hour * 60 + minute, which we then match against in the second step, $match.
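The same arithmetic can be checked outside the database against the four sample points from the question. This is a sketch only; minutesOfDay is a hypothetical helper that mirrors $hour * 60 + $minute, and it uses UTC because that is what $hour and $minute use when no timezone is given.

```javascript
// Minutes since midnight (UTC), mirroring the $project step.
function minutesOfDay(date) {
  return date.getUTCHours() * 60 + date.getUTCMinutes();
}

// The four datetime_recorded values listed in the question.
const samplePoints = [
  "2013-05-01T12:35:13Z",
  "2013-06-20T05:35:13Z",
  "2013-01-17T07:35:13Z",
  "2013-04-03T15:35:13Z",
].map((s) => new Date(s));

// Same bounds as the $match step: from 12:00 up to, but not including, 16:00.
const inRange = samplePoints.filter((d) => {
  const m = minutesOfDay(d);
  return m >= 12 * 60 && m < 16 * 60;
});

console.log(inRange.map((d) => d.toISOString()));
// Only the 12:35 and 15:35 points remain, i.e. the first and last.
```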
Adding an answer since I disagree with the other answers in that even though there are great things you can do with the aggregation framework, this really is not an optimal way to perform this type of query.
If your identified application usage pattern is that you rely on querying for "hours" or other times of the day without wanting to look at the "date" part, then you are far better off storing that as a numeric value in the document. Something like "milliseconds from start of day" would be granular enough for as many purposes as a BSON Date, but of course gives better performance without the need to compute for every document.
Set Up
This does require some set-up in that you need to add the new fields to your existing documents and make sure you add these on all new documents within your code. A simple conversion process might be:
MongoDB 4.2 and upwards
This can actually be done in a single request due to aggregation operations being allowed in "update" statements now.
db.collection.updateMany(
{},
[{ "$set": {
"timeOfDay": {
"$mod": [
{ "$toLong": "$datetime_recorded" },
1000 * 60 * 60 * 24
]
}
}}]
)
Older MongoDB
var batch = [];
db.collection.find({ "timeOfDay": { "$exists": false } }).forEach(doc => {
batch.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": {
"$set": {
"timeOfDay": doc.datetime_recorded.valueOf() % (60 * 60 * 24 * 1000)
}
}
}
});
// write once only per reasonable batch size
if ( batch.length >= 1000 ) {
db.collection.bulkWrite(batch);
batch = [];
}
})
if ( batch.length > 0 ) {
db.collection.bulkWrite(batch);
batch = [];
}
If you can afford to write to a new collection, then looping and rewriting would not be required:
db.collection.aggregate([
{ "$addFields": {
"timeOfDay": {
"$mod": [
{ "$subtract": [ "$datetime_recorded", new Date(0) ] }, // new Date(0) is the epoch; a bare Date(0) would return a string
1000 * 60 * 60 * 24
]
}
}},
{ "$out": "newcollection" }
])
Or with MongoDB 4.0 and upwards:
db.collection.aggregate([
{ "$addFields": {
"timeOfDay": {
"$mod": [
{ "$toLong": "$datetime_recorded" },
1000 * 60 * 60 * 24
]
}
}},
{ "$out": "newcollection" }
])
All using the same basic conversion of:
1000 milliseconds in a second
60 seconds in a minute
60 minutes in an hour
24 hours a day
The milliseconds since epoch, which is the value a BSON date actually stores internally, taken modulo the milliseconds in a day, gives the milliseconds elapsed in the current day.
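As a worked example of that modulo, here is the conversion applied to the datetime_recorded value from the sample document. This is a sketch: timeOfDay as a function name is hypothetical, the result is relative to UTC midnight, and it assumes dates on or after 1970 so that the epoch value is non-negative.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24; // 86400000

// Milliseconds elapsed since 00:00:00 UTC on the date's own day.
// Assumes a non-negative epoch value (dates from 1970 onwards).
function timeOfDay(date) {
  return date.valueOf() % MS_PER_DAY;
}

const recorded = new Date("2013-07-03T05:35:13Z");
// 5 h 35 min 13 s = 20113 s = 20113000 ms
console.log(timeOfDay(recorded)); // 20113000
```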
Query
Querying is then really simple, and as per the question example:
db.collection.find({
"timeOfDay": {
"$gte": 12 * 60 * 60 * 1000, "$lt": 16 * 60 * 60 * 1000
}
})
This of course uses the same conversion from hours into milliseconds to match the stored format. But just like before, you can make this whatever scale you actually need.
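A small sketch of that hours-to-milliseconds conversion, wrapped in a helper that builds the filter. timeOfDayRange and MS_IN_HOUR are hypothetical names; the timeOfDay field is the one added in the set-up step, and the hours are interpreted in UTC because the stored value is derived from the epoch.

```javascript
const MS_IN_HOUR = 60 * 60 * 1000;

// Filter for documents whose timeOfDay lies in [fromHour, toHour), hours in UTC.
function timeOfDayRange(fromHour, toHour) {
  return {
    timeOfDay: { $gte: fromHour * MS_IN_HOUR, $lt: toHour * MS_IN_HOUR },
  };
}

// Usage in the shell, equivalent to the query above:
// db.collection.find(timeOfDayRange(12, 16));
```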
Most importantly, as real document properties which don't rely on computation at run-time, you can place an index on this:
db.collection.createIndex({ "timeOfDay": 1 })
So not only does this remove the run-time overhead of calculating, but with an index you can also avoid collection scans, as outlined on the linked page on indexing for MongoDB.
For optimal performance you never want to calculate such things at query time: at any real-world scale it simply takes an order of magnitude longer to process all documents in the collection just to work out which ones you want than to reference an index and fetch only those documents.
The aggregation framework may just be able to help you rewrite the documents here, but it really should not be used as a production system method of returning such data. Store the times separately.