Find all documents within last n days - mongodb

My daily collection has documents like:
..
{ "date" : ISODate("2013-01-03T00:00:00Z"), "vid" : "ED", "san" : 7046.25, "izm" : 1243.96 }
{ "date" : ISODate("2013-01-03T00:00:00Z"), "vid" : "UA", "san" : 0, "izm" : 0 }
{ "date" : ISODate("2013-01-03T00:00:00Z"), "vid" : "PAL", "san" : 0, "izm" : 169.9 }
{ "date" : ISODate("2013-01-03T00:00:00Z"), "vid" : "PAL", "san" : 0, "izm" : 0 }
{ "date" : ISODate("2013-01-03T00:00:00Z"), "vid" : "CTA_TR", "san" : 0, "izm" : 0 }
{ "date" : ISODate("2013-01-04T00:00:00Z"), "vid" : "CAD", "san" : 0, "izm" : 169.9 }
{ "date" : ISODate("2013-01-04T00:00:00Z"), "vid" : "INT", "san" : 0, "izm" : 169.9 }
...
I left out the _id field to save space.
My task is to "fetch all documents within the last 15 days". As you can see, I need to:
Get 15 unique dates. The newest one should be the date of the newest document in the collection (it is not necessarily today's date, just the latest value of the date field). The oldest one does not need to be defined strictly in the query; what I need is a kind of "top 15" of unique days, counting back from the newest day.
db.daily.find() all documents whose date field falls within that range of 15 days.
In the result, I should see all documents within 15 days, starting from the newest in the collection.

I just tested the following query against your data sample and it worked perfectly:
db.datecol.find(
{
"date":
{
$gte: new Date((new Date().getTime() - (15 * 24 * 60 * 60 * 1000)))
}
}
).sort({ "date": -1 })
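Note that the query above counts back from today's date. If the window should end at the newest document in the collection, as the question asks, the cutoff can be derived from that document's date instead. A minimal sketch in plain JavaScript, assuming the newest date has already been fetched (in the shell, for instance, with db.daily.find().sort({ date: -1 }).limit(1)); the helper name windowFilter is hypothetical, and the window is measured in calendar days rather than in unique dates:

```javascript
// Hypothetical helper: builds a filter covering the `days` days that end
// at `newest` (the latest date found in the collection).
function windowFilter(newest, days) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(newest.getTime() - days * msPerDay);
  return { date: { $gte: cutoff, $lte: newest } };
}

// Example: the newest document is dated 2013-01-18.
const filter = windowFilter(new Date("2013-01-18T00:00:00Z"), 15);
// In the shell: db.daily.find(filter).sort({ date: -1 })
```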

Starting in Mongo 5, it's a nice use case for the $dateSubtract operator:
// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-02") }
// { date: ISODate("2021-12-02") }
// { date: ISODate("2021-11-28") } <= older than 5 days
db.collection.aggregate([
{ $match: {
$expr: {
$gt: [
"$date",
{ $dateSubtract: { startDate: "$$NOW", unit: "day", amount: 5 } }
]
}
}}
])
// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-02") }
// { date: ISODate("2021-12-02") }
With $dateSubtract, we compute the cutoff date after which documents are kept, by subtracting 5 (amount) days (unit) from the current date $$NOW (startDate).
You can also add a $sort stage to sort the documents by date.
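As a rough sketch of the pipeline with the $sort stage added, the two stages could be assembled like this. The builder name lastDaysPipeline is hypothetical; the stages themselves use only the operators already shown above.

```javascript
// Hypothetical builder: keeps documents newer than `days` days and
// sorts them newest first. $dateSubtract requires MongoDB 5.0 or later.
function lastDaysPipeline(days) {
  return [
    { $match: { $expr: { $gt: [
      "$date",
      { $dateSubtract: { startDate: "$$NOW", unit: "day", amount: days } }
    ] } } },
    { $sort: { date: -1 } }
  ];
}

const pipeline = lastDaysPipeline(5);
// In the shell: db.collection.aggregate(pipeline)
```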

You need to run the distinct command to get all the unique dates; an example is below. The "values" array holds all the unique dates in the collection, from which you pick the most recent 15 on the client side.
db.runCommand ( { distinct: 'datecol', key: 'date' } )
{
"values" : [
ISODate("2013-01-03T00:00:00Z"),
ISODate("2013-01-04T00:00:00Z")
],
"stats" : {
"n" : 2,
"nscanned" : 2,
"nscannedObjects" : 2,
"timems" : 0,
"cursor" : "BasicCursor"
},
"ok" : 1
}
You then use the $in operator with the most recent 15 dates from step 1. The example below finds all documents that belong to one of the two dates listed.
db.datecol.find({
"date":{
"$in":[
new ISODate("2013-01-03T00:00:00Z"),
new ISODate("2013-01-04T00:00:00Z")
]
}
})
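The client-side step (picking the most recent dates out of "values" and building the $in filter) might look like the sketch below. The helper name latestDatesFilter is hypothetical, and the input array is assumed to contain JavaScript Date objects as returned by the distinct command.

```javascript
// Hypothetical helper: `values` is the array returned by the distinct
// command; the result is a filter matching the `n` most recent dates.
function latestDatesFilter(values, n) {
  const latest = values
    .slice()                                   // do not mutate the input
    .sort((a, b) => b.getTime() - a.getTime()) // newest first
    .slice(0, n);
  return { date: { $in: latest } };
}

const values = [
  new Date("2013-01-03T00:00:00Z"),
  new Date("2013-01-04T00:00:00Z"),
  new Date("2013-01-02T00:00:00Z")
];
const filter = latestDatesFilter(values, 2);
// In the shell: db.datecol.find(filter)
```

For the original task, n would be 15.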

Related

Convert month from number to string question in Mongodb query

I am trying to get some avg number per month in the financial year. The collection is called test and the month data comes from CreateDate field. I want to get the avg price per month. The collection data is like below:
{
"_id" : ObjectId("5fd289a93f7cf02c36837ca7"),
"ClientName" : "John",
"OrderNumber" : "12345A",
"Price" : 10,
"CreateDate" : ISODate("2020-09-20T06:00:00.000Z"),
}
{
"_id" : ObjectId("5fd289a93f7cf02c36837cc7"),
"ClientName" : "John",
"OrderNumber" : "12345",
"Price" : 20,
"CreateDate" : ISODate("2020-09-12T06:00:00.000Z"),
}
So I am writing the query to get the avg number per month by the following within the financial year (from Sep to Aug):
db.test.aggregate([
{
$match: {
"CreateDate": {
$lt: ISODate("2021-08-31T00:00:00.000Z"),
$gte: ISODate("2020-09-01T00:00:00.000Z")
}
}
},
{
$group: {
_id: {$month: "$CreateDate"},
"AvgPrice": {
"$avg": "$Price",
}
}
},
{ $project:{ _id : 0 , Month: '$_id' , "AvgPrice ": '$AvgPrice' } }
])
The result I am getting is with the following format:
{
"Month" : 9,
"AvgPrice " : 15.0
}
{
"Month" : 10,
"AvgPrice " : 18.6666666666667
}
How can I display the month as a string instead of a number? For example, the following would be the ideal result:
{
"Month" : Sep,
"AvgPrice" : 15.0
}
{
"Month" : Oct,
"AvgPrice" : 18.6666666666667
}
I also have two more questions:
I am using MongoDB 3.6. Is there any way to round the average price to two digits after the decimal point? For example, the value above would be "18.67" instead of "18.66666". MongoDB 4.2 has an operator called $round, but 3.6 does not seem to have it.
If I want to break the results down by client, so that the returned result looks like this:
{
"ClientName": "John",
"Month" : Sep,
"AvgPrice" : 15.0
}
{
"ClientName" : "Mary"
"Month" : Oct,
"AvgPrice" : 18.6666666666667
}
How do I put another level of the group to breakdown to the client level and then month level?
Any help will be appreciated!
If I want to break down by client
You can add ClientName field in _id,
{
$group: {
_id: {
ClientName: "$ClientName",
month: { $month: "$CreateDate" }
},
AvgPrice: { $avg: "$Price" }
}
},
How can I display the month as a string instead of a number?
There is no direct way to get the month name in MongoDB, but you can prepare an array of month names and access it by index.
Use $arrayElemAt to select the month name by its number (the empty first element makes index 1 map to "Jan"):
{
$project: {
_id: 0,
ClientName: "$_id.ClientName",
Month: {
$arrayElemAt: [
["","Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"],
"$_id.month"
]
},
AvgPrice: 1
}
}
I am using MongoDB 3.6; is there any way to round the average price to two digits after the decimal point?
There is no built-in option in MongoDB 3.6 or below; as you already know, $round is available in MongoDB 4.2.
You can refer to the question "Rounding to 2 decimal places using MongoDB aggregation framework", which lists several workarounds.
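One such workaround, sketched here on the assumption that $floor is available (it was added in MongoDB 3.2, so it should work on 3.6): multiply by 100, add 0.5, take the floor and divide by 100. The helper names are hypothetical; the plain JavaScript function mirrors the same arithmetic so the idea can be checked outside the database.

```javascript
// Builds an aggregation expression that rounds `expr` to two decimals
// (round half up) using only $multiply, $add, $floor and $divide.
function round2Expr(expr) {
  return { $divide: [
    { $floor: { $add: [ { $multiply: [expr, 100] }, 0.5 ] } },
    100
  ] };
}

// Same arithmetic in plain JavaScript, for checking the idea.
function round2(x) {
  return Math.floor(x * 100 + 0.5) / 100;
}

// Possible use in a final $project stage:
const projectStage = { $project: { _id: 0, AvgPrice: round2Expr("$AvgPrice") } };
```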

MongoDB - aggregate $group by every seven days from a date

Suppose you have any number of documents in a collection with the following structure:
{
"_id" : "1",
"totalUsers" : NumberInt(10000),
"iosUsers" : NumberInt(5000),
"androidUsers" : NumberInt(5000),
"creationTime" : ISODate("2017-12-04T06:14:21.529+0000")
},
{
"_id" : "2",
"totalUsers" : NumberInt(12000),
"iosUsers" : NumberInt(6000),
"androidUsers" : NumberInt(6000),
"creationTime" : ISODate("2017-12-04T06:14:21.529+0000")
},
{
"_id" : "3",
"totalUsers" : NumberInt(14000),
"iosUsers" : NumberInt(7000),
"androidUsers" : NumberInt(7000),
"creationTime" : ISODate("2017-12-04T06:14:21.529+0000")
}
And want to write a query that returns results between two given dates (ie: startDate and endDate) and then group the results every seven days:
db.collection.aggregate(
{ $match: {$gte: startDate, $lte: endDate } },
{ $group: { _id: { --- every seven days from endDate --- } }
)
How can I do this?
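First build the bucket boundaries:
var boundaries = [];
var sd = ISODate("2017-10-18T20:41:33.602+0000"), ed = ISODate("2017-11-22T12:41:36.348+0000");
var week = 7 * 24 * 60 * 60 * 1000;
// push sd, sd + 7 days, sd + 14 days, ... while still within the range
for (var t = sd.getTime(); t <= ed.getTime(); t += week) {
boundaries.push(new Date(t));
}
// also push ed + 1 day, because the upper bound of a bucket is exclusive
boundaries.push(new Date(ed.getTime() + 24 * 60 * 60 * 1000));
// then use the $bucket aggregation stage
db.collection.aggregate([
{ $match: { creationTime: { $gte: sd, $lte: ed } } },
{ $bucket: {
groupBy: "$creationTime",
boundaries: boundaries
} }
])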
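The question asks for groups counted back from endDate rather than forward from startDate. A possible variant, as a sketch only: the helper name boundariesFromEnd is hypothetical, and it relies on the fact that $bucket boundaries must be in ascending order with an exclusive upper bound, which is why the top boundary sits 1 ms past the end date.

```javascript
// Hypothetical helper: returns ascending $bucket boundaries stepping
// back `stepDays` at a time from `end`, clamped at `start`.
function boundariesFromEnd(start, end, stepDays) {
  const stepMs = stepDays * 24 * 60 * 60 * 1000;
  const out = [];
  // Upper bound is exclusive, so place it 1 ms past `end`.
  let t = end.getTime() + 1;
  while (t > start.getTime()) {
    out.push(new Date(t));
    t -= stepMs;
  }
  // Lowest boundary is inclusive and equals `start`.
  out.push(new Date(start.getTime()));
  return out.reverse();
}

const boundaries = boundariesFromEnd(
  new Date("2017-11-03T00:00:00Z"),
  new Date("2017-11-22T00:00:00Z"),
  7
);
// In the shell: { $bucket: { groupBy: "$creationTime", boundaries: boundaries } }
```

With this layout the oldest bucket may be shorter than seven days, since it holds whatever remains between the start date and the first full step.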

Remove redundant data from sensors by using date and value

I'm developing an application that collects data from sensors and I need to reduce the amount of data that is stored in a mongodb database by using a value (temperature) and a date (timestamp).
The documents have the following format:
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:37:50.370Z")
sensorCode:"SENSOR_A1"
}
The problem is that the sensors send data too frequently, so there are too many documents with redundant data in a short period of time (let's say 10 minutes). In other words, it is not useful to have multiple equal values within a very short period of time.
Example: here there are data from a sensor that is reporting temperature is 10
// collection: datasensors
[
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:37:50.370Z")
sensorCode:"SENSOR_A1"
},
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:38:50.555Z")
sensorCode:"SENSOR_A1"
},
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:38:51.654Z")
sensorCode:"SENSOR_A1"
}
,
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:50:20.335Z")
sensorCode:"SENSOR_A1"
}
]
Because minute-level precision is not required, I would like to remove all documents from 2016-04-29T14:37:50.370Z to 2016-04-29T14:38:51.32Z except one. So the result should be this:
[
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:38:51.654Z")
sensorCode:"SENSOR_A1"
},
{
temperature: 10,
timestamp: ISODate("2016-04-29T14:50:20.335Z")
sensorCode:"SENSOR_A1"
}
]
The remove operation I want to perform should "reduce" equal temperatures in time ranges less than 10 minutes to one value.
Is there any technique to achieve this?
I simplified my solution and decided to keep each unique measurement received within a 10-minute time window.
MongoDB 3.2 or later is required for this.
Adding a time mark separates the measurements into 10-minute groups.
Then we keep the first record in each group and store all the ids for further processing.
Then we remove the id of the document we want to keep from the array of all ids (leaving only the documents to delete).
Finally, a forEach loop deletes the unneeded ids; this line is commented out :-)
Copy the code below into the mongo console, execute it and verify the ids to delete, then uncomment the line and go!
var addTimeMark = {
$project : {
_id : 1,
temperature : 1,
timestamp : 1,
sensorCode : 1,
yearMonthDay : {
$substr : [{
$dateToString : {
format : "%Y%m%d%H%M",
date : "$timestamp"
}
}, 0, 11]
}
}
}
var getFirstRecordInGroup = {
// take only the first record from each group
$group : {
_id : {
timeMark : "$yearMonthDay",
sensorCode : "$sensorCode",
temperature : "$temperature"
},
id : {
$first : "$_id"
},
allIds : {
$push : "$_id"
},
timestamp : {
$first : "$timestamp"
},
totalEntries : {
$sum : 1
}
}
}
var removeFirstIdFromAllIds = {
$project : {
_id : 1,
id : 1,
timestamp : 1,
totalEntries : 1,
allIds : {
$filter : {
input : "$allIds",
as : "item",
cond : {
$ne : ["$$item", "$id"]
}
}
}
}
}
db.sensor.aggregate([
addTimeMark,
getFirstRecordInGroup,
removeFirstIdFromAllIds,
]).forEach(function (entry) {
printjson(entry.allIds);
// db.sensor.deleteMany({_id:{$in:entry.allIds}})
})
Below is how a document looks after each step:
{
"_id" : ObjectId("574b5d8e0ac96f88db507209"),
"temperature" : 10,
"timestamp" : ISODate("2016-04-29T14:37:50.370Z"),
"sensorCode" : "SENSOR_A1",
"yearMonthDay" : "20160429143"
}
2:
{
"_id" : {
"timeMark" : "20160429143",
"sensorCode" : "SENSOR_A1",
"temperature" : 10
},
"id" : ObjectId("574b5d8e0ac96f88db507209"),
"allIds" : [
ObjectId("574b5d8e0ac96f88db507209"),
ObjectId("574b5d8e0ac96f88db50720a"),
ObjectId("574b5d8e0ac96f88db50720b")
],
"timestamp" : ISODate("2016-04-29T14:37:50.370Z"),
"totalEntries" : 3
}
and last:
{
"_id" : {
"timeMark" : "20160429143",
"sensorCode" : "SENSOR_A1",
"temperature" : 10
},
"id" : ObjectId("574b5d8e0ac96f88db507209"),
"allIds" : [
ObjectId("574b5d8e0ac96f88db50720a"),
ObjectId("574b5d8e0ac96f88db50720b")
],
"timestamp" : ISODate("2016-04-29T14:37:50.370Z"),
"totalEntries" : 3
}
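To see how the time mark and the "keep the first, delete the rest" logic behave without touching the database, the same idea can be sketched in plain JavaScript. Both helper names are hypothetical; timeMark mirrors the $dateToString plus $substr combination above (UTC, 11 characters), and documents are assumed to arrive in the order in which they should be kept.

```javascript
// Mirrors $substr of $dateToString("%Y%m%d%H%M") cut to 11 characters:
// YYYYMMDDHH plus the tens digit of the minute, in UTC.
function timeMark(date) {
  return date.toISOString().replace(/\D/g, "").slice(0, 11);
}

// Hypothetical helper: returns the _ids to delete, keeping the first
// document seen for each (timeMark, sensorCode, temperature) group.
function idsToDelete(docs) {
  const seen = new Set();
  const remove = [];
  for (const d of docs) {
    const key = [timeMark(d.timestamp), d.sensorCode, d.temperature].join("|");
    if (seen.has(key)) {
      remove.push(d._id);
    } else {
      seen.add(key);
    }
  }
  return remove;
}
```

Note that the groups are fixed 10-minute slots on the clock (14:30 to 14:39, 14:40 to 14:49, and so on), not sliding windows, so two equal readings at 14:39 and 14:41 land in different groups.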

Group and sum day by day

This is how my collection structure looks like:
{
"_id" : ObjectId("57589d2a9108dace306602b8"),
"IDproject" : NumberLong(53),
"email" : "john.doe#gmail.com",
"dc" : ISODate("2016-06-06T22:33:13.000Z")
}
{
"_id" : ObjectId("57589d2a9108dace306602b8"),
"IDproject" : NumberLong(53),
"email" : "david.doe#gmail.com",
"dc" : ISODate("2016-06-07T22:33:13.000Z")
}
{
"_id" : ObjectId("57589d2a9108dace306602b8"),
"IDproject" : NumberLong(53),
"email" : "elizabeth.doe#gmail.com",
"dc" : ISODate("2016-06-07T22:33:13.000Z")
}
As you can see, there are two customers added on June 7th and one on June 6th. I would like to group and sum these results for the last 30 days.
It should look something like this:
{
"dc" : "2016-06-05"
"total" : 0
}
{
"dc" : "2016-06-06"
"total" : 1
}
{
"dc" : "2016-06-07"
"total" : 2
}
As you can see, there are no records on June 5th, so its total is zero, and the same should apply to any other day without records.
That would be case #1. Case #2 would be the following results:
{
"dc" : "2016-06-05"
"total" : 0
}
{
"dc" : "2016-06-06"
"total" : 1
}
{
"dc" : "2016-06-07"
"total" : 3
}
I've tried this:
db.getCollection('customer').aggregate([
{$match : { IDproject : 53}},
{ $group: { _id: "$dc", total: { $sum: "$dc" } } }, ]);
But it seems complicated. This is my first time working with a NoSQL database.
Thanks.
Here's how to get daily counts (the common idiom for counting rows is {$sum: 1}).
However, you cannot obtain zeros for days that have no data, because there are no documents that would produce the grouping key for those days. You must handle these cases in PHP by generating a list of the desired dates and then checking whether there is data for each date.
db.getCollection('customer').aggregate([
{$match : { IDproject : 53}},
{$group: {
_id: {year: {$year: "$dc"}, month: {$month: "$dc"}, day: {$dayOfMonth: "$dc"}},
total: {$sum: 1}
}}
]);
Note that MongoDB stores dates in UTC. By default the $year, $month and $dayOfMonth operators return the date parts in UTC, which may not be the same as in the local timezone. (Starting with MongoDB 3.6 these operators accept a timezone option; on older versions there is no reliable way to convert timestamps to a local timezone inside the pipeline.) Solutions include:
saving timestamps in the local timezone (= lying to MongoDB that they are in UTC),
saving the timezone offset with the timestamp,
saving the local year, month and dayOfMonth with the timestamp.
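The zero-filling step described above has to happen on the client. The answer mentions PHP; the sketch below shows the same idea in JavaScript to stay in the language of the surrounding snippets. Both helper names are hypothetical, the rows are assumed to have the shape produced by the $group stage above, and days are treated as UTC days. The second helper produces what appears to be the running total asked for as case #2.

```javascript
// Hypothetical helper: `rows` is the aggregation output
// ({ _id: { year, month, day }, total }). Returns one entry per day for
// `days` consecutive UTC days starting at `start`, with 0 for gaps.
function fillMissingDays(rows, start, days) {
  const byDay = new Map();
  for (const r of rows) {
    const d = new Date(Date.UTC(r._id.year, r._id.month - 1, r._id.day));
    byDay.set(d.toISOString().slice(0, 10), r.total);
  }
  const out = [];
  for (let i = 0; i < days; i++) {
    const d = new Date(start.getTime() + i * 24 * 60 * 60 * 1000);
    const key = d.toISOString().slice(0, 10);
    out.push({ dc: key, total: byDay.get(key) || 0 });
  }
  return out;
}

// Running total over the filled list (case #2 in the question).
function runningTotal(filled) {
  let sum = 0;
  return filled.map(function (e) {
    sum += e.total;
    return { dc: e.dc, total: sum };
  });
}
```

For the last 30 days, start would be midnight UTC of the day 29 days before today and days would be 30.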

MongoDb aggregation Group by Date

I'm trying to group by timestamp for the collection named "foo" { _id, TimeStamp }
db.foos.aggregate(
[
{$group : { _id : new Date (Date.UTC({ $year : '$TimeStamp' },{ $month : '$TimeStamp' },{$dayOfMonth : '$TimeStamp'})) }}
])
I expect many dates, but the result is just one date. The data I'm using is correct (it has many foos with different dates, none of them in 1970). There seems to be a problem with the date parsing, but I have not been able to solve it yet.
{
"result" : [
{
"_id" : ISODate("1970-01-01T00:00:00.000Z")
}
],
"ok" : 1
}
I tried this one:
db.foos.aggregate(
[
{$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 } }},
{$project : { parsedDate : new Date('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])
Result :
uncaught exception: aggregate failed: {
"errmsg" : "exception: disallowed field type Date in object expression (at 'parsedDate')",
"code" : 15992,
"ok" : 0
}
And that one:
db.foos.aggregate(
[
{$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 } }},
{$project : { parsedDate : Date.UTC('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])
I cannot see the dates in the result:
{
"result" : [
{
"count" : 412
},
{
"count" : 1702
},
{
"count" : 422
}
],
"ok" : 1
}
db.foos.aggregate(
[
{ $project : { day : {$substr: ["$TimeStamp", 0, 10] }}},
{ $group : { _id : "$day", number : { $sum : 1 }}},
{ $sort : { _id : 1 }}
]
)
Grouping by date can be done in two steps in the aggregation framework; an additional third step is needed if the result should be sorted:
$project in combination with $substr takes the first 10 characters (YYYY-MM-DD) of the ISODate object from each document (the result is a collection of documents with the fields "_id" and "day");
$group groups by day, adding (summing) the number 1 for each matching document;
$sort sorts ascending by "_id", which is the day from the previous aggregation step; this step is optional and only needed if a sorted result is desired.
This solution cannot take advantage of indexes such as db.twitter.ensureIndex( { TimeStamp: 1 } ), because it converts the ISODate object to a string on the fly. For large collections (millions of documents) this could be a performance bottleneck, and more sophisticated approaches should be used.
It depends on whether you want to have the date as ISODate type in the final output. If so, then you can do one of two things:
Extract $year, $month, $dayOfMonth from your timestamp and then reconstruct a new date out of them (you are already trying to do that, but you're using syntax that doesn't work in aggregation framework).
If the original Timestamp is of type ISODate() then you can do date arithmetic to subtract the hours, minutes, seconds and milliseconds from your timestamp to get a new date that's "rounded" to the day.
There is an example of 2 here.
Here is how you would do 1. I'm making an assumption that all your dates are this year, but you can easily adjust the math to accommodate your oldest date.
project1={$project:{_id:0,
y:{$subtract:[{$year:"$TimeStamp"}, 2013]},
d:{$subtract:[{$dayOfYear:"$TimeStamp"},1]},
TimeStamp:1,
jan1:{$literal:new ISODate("2013-01-01T00:00:00")}
} };
project2={$project:{tsDate:{$add:[
"$jan1",
{$multiply:["$y", 365*24*60*60*1000]},
{$multiply:["$d", 24*60*60*1000]}
] } } };
Sample data:
db.foos.find({},{_id:0,TimeStamp:1})
{ "TimeStamp" : ISODate("2013-11-13T19:15:05.600Z") }
{ "TimeStamp" : ISODate("2014-02-01T10:00:00Z") }
Aggregation result:
> db.foos.aggregate(project1, project2)
{ "tsDate" : ISODate("2013-11-13T00:00:00Z") }
{ "tsDate" : ISODate("2014-02-01T00:00:00Z") }
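The date-arithmetic route (option 2 above) can be sketched as follows. This is one possible formulation rather than the linked example: it subtracts from the timestamp its remainder modulo one day, relying on $subtract returning milliseconds for date minus date and a date for date minus number. The helper names are hypothetical, and the plain JavaScript function repeats the arithmetic so it can be checked outside the database.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Aggregation expression: timestamp minus (ms since epoch mod one day),
// which yields midnight UTC of the same day as a date value.
function truncateToDayExpr(field) {
  return { $subtract: [
    field,
    { $mod: [ { $subtract: [field, new Date(0)] }, MS_PER_DAY ] }
  ] };
}

// Same arithmetic in plain JavaScript (valid for dates on or after 1970).
function truncateToDayUTC(d) {
  return new Date(d.getTime() - (d.getTime() % MS_PER_DAY));
}

// Possible use as the grouping key:
const groupStage = {
  $group: { _id: truncateToDayExpr("$TimeStamp"), count: { $sum: 1 } }
};
```

Unlike the year and day-of-year reconstruction, this needs no assumption about the oldest year in the data and is not affected by leap years.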
This is what I use in one of my projects:
collection.aggregate([
// group results by date
{$group : {
_id : { date : "$date" }
// do whatever you want here, like $push, $sum...
}},
// _id holds the date, so this sorts newest first
{$sort : { _id : -1 }}
])
.toArray()
Where $date is a Date field in MongoDB. I get results keyed by date.