Find closest date in one query - mongodb

I'm currently trying to figure out a way to find the entry in MongoDB whose date is closest to the one I'm looking for.
Currently I solve the problem by using 2 queries: one using $gte and limit(1) to look for the next larger date, and then one using $lte and limit(1) to see if there is a closer one that might be lower.
I was wondering if there might be a way to find the closest date in just one query, but I was not able to find anything on that matter.
I hope you can help me with this, or at least tell me for sure that this is the only way to do so.
db.collection.find({"time":{$gte: isoDate}}).sort({"time":1}).limit(1)
db.collection.find({"time":{$lte: isoDate}}).sort({"time":-1}).limit(1)
But I am looking for a way to do this in one query so I don't have to subtract the results to find the closest one.
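For reference, the client-side step that the two queries above still leave open (choosing between the two candidates) could be sketched in plain JavaScript as follows. The name pickCloser and the sample documents are made up for illustration; `before` and `after` stand for the single documents returned by the $lte and $gte queries, and either may be null when nothing exists on that side.

```javascript
// Hypothetical helper: choose whichever candidate is nearer to the target date.
// `before` is the result of the $lte query, `after` the result of the $gte query.
function pickCloser(before, after, target) {
  if (!before) return after || null;
  if (!after) return before;
  const distBefore = Math.abs(target - before.time); // milliseconds
  const distAfter = Math.abs(after.time - target);
  // On a tie the earlier document wins; that choice is arbitrary.
  return distBefore <= distAfter ? before : after;
}

// Made-up sample documents for illustration.
const target = new Date("2020-02-01T00:18:00Z");
const before = { time: new Date("2020-02-01T00:15:00Z"), description: "record 4" };
const after = { time: new Date("2020-02-01T00:20:00Z"), description: "record 5" };
console.log(pickCloser(before, after, target).description); // record 5
```

This is only the comparison step; it still needs both queries to have run first.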

I solved a similar problem using an aggregation.
Sample data:
{
"_id" : ObjectId("5e365a1655c3f0bea76632a0"),
"time" : ISODate("2020-02-01T00:00:00Z"),
"description" : "record 1"
}
{
"_id" : ObjectId("5e365a1655c3f0bea76632a1"),
"time" : ISODate("2020-02-01T00:05:00Z"),
"description" : "record 2"
}
{
"_id" : ObjectId("5e365a1655c3f0bea76632a2"),
"time" : ISODate("2020-02-01T00:10:00Z"),
"description" : "record 3"
}
{
"_id" : ObjectId("5e365a1655c3f0bea76632a3"),
"time" : ISODate("2020-02-01T00:15:00Z"),
"description" : "record 4"
}
{
"_id" : ObjectId("5e365a1655c3f0bea76632a4"),
"time" : ISODate("2020-02-01T00:20:00Z"),
"description" : "record 5"
}
{
"_id" : ObjectId("5e365a1655c3f0bea76632a5"),
"time" : ISODate("2020-02-01T00:25:00Z"),
"description" : "record 6"
}
And I'm looking for the record nearest to ISODate('2020-02-01T00:18:00.000Z').
db.test_collection.aggregate([
{
$match:
{
time:
{
$gte: ISODate('2020-02-01T00:13:00.000Z'),
$lte: ISODate('2020-02-01T00:23:00.000Z')
}
}
},
{
$project:
{
time: 1,
description: 1,
time_dist: {$abs: [{$subtract: ["$time", ISODate('2020-02-01T00:18:00.000Z')]}]}}
},
{
$sort: {time_dist: 1}
},
{
$limit: 1
}])
The $match stage sets up a "time window". I used 5 minutes on either side of the query time for this example.
The $project stage adds a time distance field: the absolute difference, in milliseconds, between each record's time and the query time of ISODate('2020-02-01T00:18:00.000Z').
Then I sort on the time_dist field and limit the results to 1 to return the record with the time closest to ISODate('2020-02-01T00:18:00.000Z').
The result of the aggregation:
{
"_id" : ObjectId("5e365a1655c3f0bea76632a4"),
"time" : ISODate("2020-02-01T00:20:00Z"),
"description" : "record 5",
"time_dist" : NumberLong(120000)
}
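To see why record 5 comes out, the same steps can be replayed in plain JavaScript against an in-memory copy of the sample data. This is a sketch, not something MongoDB runs: nearestRecord is a made-up name, and the array stands in for the collection (ids omitted).

```javascript
// In-memory stand-in for the sample collection (ids omitted).
const records = [
  { time: new Date("2020-02-01T00:00:00Z"), description: "record 1" },
  { time: new Date("2020-02-01T00:05:00Z"), description: "record 2" },
  { time: new Date("2020-02-01T00:10:00Z"), description: "record 3" },
  { time: new Date("2020-02-01T00:15:00Z"), description: "record 4" },
  { time: new Date("2020-02-01T00:20:00Z"), description: "record 5" },
  { time: new Date("2020-02-01T00:25:00Z"), description: "record 6" },
];

// Hypothetical helper mirroring $match -> $project -> $sort -> $limit.
function nearestRecord(docs, queryTime, windowMs) {
  const sorted = docs
    .filter(d => Math.abs(d.time - queryTime) <= windowMs)         // $match
    .map(d => ({ ...d, time_dist: Math.abs(d.time - queryTime) })) // $project
    .sort((a, b) => a.time_dist - b.time_dist);                    // $sort
  return sorted.length > 0 ? sorted[0] : null;                     // $limit: 1
}

const nearest = nearestRecord(records, new Date("2020-02-01T00:18:00Z"), 5 * 60 * 1000);
console.log(nearest.description, nearest.time_dist); // record 5 120000
```

Only records 4 and 5 fall inside the window; record 5 is 2 minutes (120000 ms) away and record 4 is 3 minutes away.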

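Check this one (startDate and endDate are placeholder names for the lower and upper bounds of the range to search; using the same value for both bounds would match nothing):
db.collection.find({"time":{$gte: startDate,$lt: endDate}}).sort({"time":1}).limit(1)
Please use a date format that MongoDB supports, like the following:
ISODate("2015-10-26T00:00:00.000Z")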

In PyMongo, I used the following function. The idea is to take a datetime object, subtract some days from it and add some days to it, then look for records dated between those two dates. If there are no such records, increase the date span:
import datetime

def date_query(table, date, variance=1, max_variance=4096):
    '''Run a date query using the closest available date'''
    # `db` is assumed to be an existing pymongo Database object
    delta = datetime.timedelta(days=variance)
    result = list(db[table].find({'date': {'$gte': date - delta, '$lt': date + delta}}))
    if result:
        # return the record whose date is closest to the requested one
        return min(result, key=lambda doc: abs(doc['date'] - date))
    if variance >= max_variance:
        # max_variance is an added cap so an empty collection does not recurse forever
        return None
    return date_query(table, date, variance=variance * 2, max_variance=max_variance)
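The widening-window idea itself does not depend on PyMongo. Below is a minimal in-memory sketch of it in JavaScript; closestWithinGrowingWindow, the cap maxVarianceDays, and the sample dates are all assumptions made up for this illustration, and the array stands in for the collection.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical in-memory version of the widening-window search.
// varianceDays must be greater than 0; the window doubles until maxVarianceDays.
function closestWithinGrowingWindow(docs, date, varianceDays = 1, maxVarianceDays = 1024) {
  for (let v = varianceDays; v <= maxVarianceDays; v *= 2) {
    const lo = date.getTime() - v * DAY_MS;
    const hi = date.getTime() + v * DAY_MS;
    const hits = docs.filter(d => d.date.getTime() >= lo && d.date.getTime() < hi);
    if (hits.length > 0) {
      // Pick the hit with the smallest absolute distance to the requested date.
      return hits.reduce((best, d) =>
        Math.abs(d.date - date) < Math.abs(best.date - date) ? d : best);
    }
  }
  return null; // nothing found within the cap
}

// Made-up sample data.
const docs = [
  { date: new Date("2020-01-01T00:00:00Z") },
  { date: new Date("2020-01-10T00:00:00Z") },
  { date: new Date("2020-03-01T00:00:00Z") },
];
const hit = closestWithinGrowingWindow(docs, new Date("2020-01-14T00:00:00Z"));
console.log(hit.date.toISOString()); // 2020-01-10T00:00:00.000Z
```

The windows of 1 and 2 days find nothing here; the 4-day window reaches back to January 10, which is then returned.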

According to https://stackoverflow.com/a/33351918/4885936, you don't need ISODate.
A simple solution, if you want the tasks that are due within the next hour, is:
const tasks = await task.find({
time: {
$gt: Date.now(),
$lt: Date.now() + 3600000 // one hour in milliseconds
}
})
This code gets the tasks due between now and one hour from now.
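A small variation on the snippet above: reading the clock once and building both bounds from that single value keeps the two ends of the window exactly one hour apart. The helper name upcomingWindowFilter is made up for this sketch, and it only builds the filter object; passing it to find() is left to the caller.

```javascript
const HOUR_MS = 60 * 60 * 1000; // 3,600,000 ms

// Hypothetical helper: build a filter for documents whose `time`
// falls within `windowMs` after `nowMs`.
function upcomingWindowFilter(nowMs, windowMs = HOUR_MS) {
  return { time: { $gt: new Date(nowMs), $lt: new Date(nowMs + windowMs) } };
}

// Fixed instant used so the output is reproducible; normally pass Date.now().
const filter = upcomingWindowFilter(Date.UTC(2020, 1, 1, 0, 0, 0));
console.log(filter.time.$gt.toISOString()); // 2020-02-01T00:00:00.000Z
console.log(filter.time.$lt.toISOString()); // 2020-02-01T01:00:00.000Z
```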

Related

Iterating over a list in MongoDB

I'm a complete novice in MongoDB and I'm trying to delete some entries from a collection day by day. I have to do it day by day because the collection is huge and removing by month times out. Here's an example code I have:
days = ['2018-04-01-day','2018-04-02-day','2018-04-03-day','2018-04-04-day','2018-04-05-day','2018-04-06-day','2018-04-07-day','2018-04-08-day','2018-04-09-day','2018-04-10-day','2018-04-11-day','2018-04-12-day','2018-04-13-day','2018-04-14-day','2018-04-15-day','2018-04-16-day','2018-04-17-day','2018-04-18-day','2018-04-19-day','2018-04-20-day','2018-04-21-day','2018-04-22-day','2018-04-23-day','2018-04-24-day','2018-04-25-day','2018-04-26-day','2018-04-27-day','2018-04-28-day','2018-04-29-day','2018-04-30-day']
var day;
for(day of days)
{
print(day)
db.<colln>.remove
(
{ 'time_bucket': day },
{ 'URL':/https:\/\/abc.com/}
)
}
The above code executes, but only gives me the following:
2018-04-06-day
WriteResult({ "nRemoved" : 0 })
I would have expected to at least see all the dates printed, but even that's not happening.
I tried other methods using UTC date methods and they didn't seem to work as well.
I am able to make the following code work on a smaller collection:
db.<small colln>.remove(
{ 'time_bucket': '2018-04-month' },
{ 'URL':/https:\/\/abc.com/}
)
But the above code (removing by month) won't work for a larger collection, which is why I'm forced to do it day by day, by creating an array for multiple days. I know it's not the most efficient method, but I just need to make it work anyhow.
Any help would be much appreciated.
for(var day in days) will iterate through indexes of your array, producing
for(var day in days){ print (day) }
0
1
2
3
4
5
I believe you meant to use for(var day of days):
for(var day of days){ print (day) }
2018-04-01-day
2018-04-02-day
2018-04-03-day
2018-04-04-day
2018-04-05-day
2018-04-06-day
Just wanted to add a few more details. I have tested the following on my local MongoDB 4.2 on this collection:
{ "_id" : ObjectId("6053e2f7acf8d9b7cc48adf0"), "name" : "test 4", "time_bucket" : "2018-04-03-day" }
{ "_id" : ObjectId("6053e4ccacf8d9b7cc48adf1"), "name" : "test", "time_bucket" : "2018-04-01-day" }
{ "_id" : ObjectId("6053e4d3acf8d9b7cc48adf2"), "name" : "test 1", "time_bucket" : "2018-04-01-day" }
{ "_id" : ObjectId("6053e4ddacf8d9b7cc48adf3"), "name" : "test 34", "time_bucket" : "2018-04-02-day" }
const days = ['2018-04-01-day','2018-04-02-day']
for(let day of days){ db.testcol.remove( { 'time_bucket': day }) }
After executing it, the collection looks like this:
{ "_id" : ObjectId("6053e2f7acf8d9b7cc48adf0"), "name" : "test 4", "time_bucket" : "2018-04-03-day" }
So everything appears to work as intended.
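The difference between for...in and for...of described above can be checked without a database. The sketch below also builds the list of day buckets instead of typing it out by hand; dayBuckets is a made-up helper name, and the 'YYYY-MM-DD-day' format is copied from the question.

```javascript
// Hypothetical helper: build 'YYYY-MM-DD-day' bucket names for one month.
function dayBuckets(year, month) { // month is 1-based
  // Day 0 of the following month is the last day of this month.
  const daysInMonth = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const buckets = [];
  for (let d = 1; d <= daysInMonth; d++) {
    const mm = String(month).padStart(2, "0");
    const dd = String(d).padStart(2, "0");
    buckets.push(`${year}-${mm}-${dd}-day`);
  }
  return buckets;
}

const days = dayBuckets(2018, 4);

const keys = [];
for (const i in days) keys.push(i);       // for...in yields index strings: "0", "1", ...

const values = [];
for (const day of days) values.push(day); // for...of yields the elements themselves

console.log(keys[0], values[0], days.length); // 0 2018-04-01-day 30
```

Inside the for...of loop, each `day` could then be passed to the remove call as in the test above.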

How to group dates in mongoDB by first or second half of the month (fortnights)

With the following data structure, using mongoDB's (v3.4) aggregation framework how do you group information every 15 days?
{
"_id" : ObjectId("5cb10a201e20af7503305fea"),
"user" : ObjectId("5b21240c4e71161fdd40b27c"),
"version" : NumberLong(2),
"value" : 42,
"itemRef" : ObjectId("5cb10a201e20af7503305fe9"),
"status" : "ACCEPTED",
"date" : ISODate("2019-04-13T11:00:00.466Z")
}
the required output would be:
[date: 2019/01/01, totalValue:15],
[date: 2019/01/16, totalValue:5],
[date: 2019/02/01, totalValue:25],
[date: 2019/02/16, totalValue:30]
The way I found to resolve this problem with mongoDB 3.4 was using $cond + $dayOfMonth to define in which part of the month this date is.
db.contract.aggregate(
  [
    {$match:{...queryGoesHere...}},
    {$project:
      {dateText:
        {$cond:
          [
            {$lte:[{$dayOfMonth:'$date'},15]},
            {'$dateToString': {'format': '%Y-%m-01', 'date': '$date'}},
            {'$dateToString': {'format': '%Y-%m-16', 'date': '$date'}}
          ]
        },
        value:'$value'
      }
    },
    {$group:
      {
        _id:'$dateText',
        // sum the values per fortnight (use {'$sum':1} instead to count documents)
        total:{'$sum':'$value'}
      }
    }
  ]
)
The solution is in the projection of "dateText": it first uses $cond to determine whether the date is in the first or second part of the month. It determines this using '$dayOfMonth', which returns the day of the month. If it is less than or equal to 15, it uses '$dateToString' to format the date as year-month-01; otherwise it formats it as year-month-16.
Hope this can help someone in the future.
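The bucketing rule is easy to check outside MongoDB. The following plain JavaScript sketch applies the same "day 15 or earlier goes to the 01 bucket, otherwise the 16 bucket" rule and sums value per bucket, as the required output asks. The function names and the second and third sample documents are made up for illustration; dates are handled in UTC.

```javascript
// Hypothetical helper mirroring the $cond + $dayOfMonth + $dateToString projection.
function fortnightStart(date) {
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = date.getUTCDate() <= 15 ? "01" : "16";
  return `${yyyy}-${mm}-${dd}`;
}

// Sum `value` per fortnight, like a $group stage keyed on the bucket string.
function totalsByFortnight(docs) {
  const totals = {};
  for (const doc of docs) {
    const key = fortnightStart(doc.date);
    totals[key] = (totals[key] || 0) + doc.value;
  }
  return totals;
}

// First document taken from the question; the other two are invented.
const sample = [
  { date: new Date("2019-04-13T11:00:00.466Z"), value: 42 },
  { date: new Date("2019-04-02T09:00:00Z"), value: 8 },
  { date: new Date("2019-04-16T00:00:00Z"), value: 5 },
];
console.log(totalsByFortnight(sample)); // { '2019-04-01': 50, '2019-04-16': 5 }
```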

MongoDB - Get aggregated difference between two date fields

I have one collection called lists with following fields:
{ "_id" : ObjectId("5a7c9f60c05d7370232a1b73"), "created_date" : ISODate("2018-11-10T04:40:11Z"), "processed_date" : ISODate("2018-11-10T04:40:10Z") }
{ "_id" : ObjectId("5a7c9f85c05d7370232a1b74"), "created_date" : ISODate("2018-11-10T04:40:11Z"), "processed_date" : ISODate("2018-11-10T04:41:10Z") }
{ "_id" : ObjectId("5a7c9f89c05d7370232a1b75"), "created_date" : ISODate("2018-11-10T04:40:11Z"), "processed_date" : ISODate("2018-11-10T04:42:10Z") }
{ "_id" : ObjectId("5a7c9f8cc05d7370232a1b76"), "created_date" : ISODate("2018-11-10T04:40:11Z"), "processed_date" : ISODate("2018-11-10T04:42:20Z") }
I need to get an aggregated result in the following format (based on the difference between processed_date and created_date):
[{
"30Sec":count_for_diffrence_1,
"<=60Sec":count_for_diffrence_2,
"<=90Sec":count_for_diffrence_3
}]
One more thing: I would like to find out how many items took 30 sec, 60 sec and so on, and the items counted under <=60Sec should not be counted again under <=90Sec.
Any help will be appreciated.
You can try the aggregation query below in version 3.6.
$match with $expr limits the documents to those where the time difference is 90 seconds or less.
$group with $sum counts the occurrences in the different time slices.
db.collection.aggregate([
{"$match":{"$expr":{"$lte":[{"$subtract":["$processed_date","$created_date"]},90000]}}},
{"$group":{
"_id":null,
"30Sec":{"$sum":{"$cond":{"if":{"$eq":[{"$subtract":["$processed_date","$created_date"]},30000]},"then":1,"else":0}}},
"<=60Sec":{"$sum":{"$cond":{"if":{"$lte":[{"$subtract":["$processed_date","$created_date"]},60000]},"then":1,"else":0}}},
"<=90Sec":{"$sum":{"$cond":{"if":{"$lte":[{"$subtract":["$processed_date","$created_date"]},90000]},"then":1,"else":0}}}
}}
])
Note if the created date is greater than processed date you may want to add a condition to look only for values where difference is between 0 and your requested time slice.
Something like
{$and:[{"$gte":[{"$subtract":["$processed_date","$created_date"]},0]}, {"$lte":[{"$subtract":["$processed_date","$created_date"]},60000]}]}
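To make the "between a lower and an upper bound" idea concrete, here is a plain JavaScript sketch that counts each document in exactly one slice, so a 59-second item lands in <=60Sec and nowhere else. The function name and the key "<=30Sec" are assumptions of this sketch (the question labels the first slice "30Sec"), and negative differences are skipped. The data is the four documents from the question.

```javascript
// Hypothetical helper: count documents per non-overlapping time slice.
// Slices: 0-30 s, above 30 s up to 60 s, above 60 s up to 90 s.
function countByExclusiveSlice(docs) {
  const counts = { "<=30Sec": 0, "<=60Sec": 0, "<=90Sec": 0 };
  for (const doc of docs) {
    const diff = doc.processed_date - doc.created_date; // milliseconds
    if (diff < 0 || diff > 90000) continue; // outside every slice
    if (diff <= 30000) counts["<=30Sec"]++;
    else if (diff <= 60000) counts["<=60Sec"]++;
    else counts["<=90Sec"]++;
  }
  return counts;
}

const created = new Date("2018-11-10T04:40:11Z");
const lists = [
  { created_date: created, processed_date: new Date("2018-11-10T04:40:10Z") }, // -1 s
  { created_date: created, processed_date: new Date("2018-11-10T04:41:10Z") }, // 59 s
  { created_date: created, processed_date: new Date("2018-11-10T04:42:10Z") }, // 119 s
  { created_date: created, processed_date: new Date("2018-11-10T04:42:20Z") }, // 129 s
];
console.log(countByExclusiveSlice(lists)); // { '<=30Sec': 0, '<=60Sec': 1, '<=90Sec': 0 }
```

In the pipeline, the equivalent would be an $and of a lower and an upper bound in each $cond, along the lines of the snippet above.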

Mongodb how to find how much days ago from a timestamp field

I was trying to find the number of days ago using the timestamp, but I don't know how to do that.
{
"_id" : ObjectId("5504cc9ddd5af617caae30b3"),
"session_id" : 1,
"Timestamp" : "2014-04-07T10:51:09.277Z",
"Item_ID" : 214536502,
"Category" : 0
}
How can I calculate the number of days ago using the field "Timestamp"?
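You can use aggregate with a $project stage that converts the Timestamp string to a date on the server and then does the calculation. Note that new Date("$Timestamp") would not work here: the shell evaluates it on the client before the pipeline is sent, so it never sees the field value. On MongoDB 3.6 or later, $dateFromString does the conversion inside the pipeline, something like this:
pipe = {
  "$project" : {
    "_id" : 1,
    "daySince" : {
      "$divide" : [
        {
          "$subtract" : [
            new Date(),
            { "$dateFromString" : { "dateString" : "$Timestamp" } }
          ]
        },
        86400000
      ]
    }
  }
}
To calculate:
db.collection.aggregate([pipe])
Since Timestamp isn't an ISODate object, you just need to convert it to one, subtract it from the current date, and divide the result by 60*60*24*1000; that gives the number of days since the timestamp.
You can also change new Date() to whatever date you need to compare against.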
Updated:
Since I believe the Timestamp format might be malformed, you can alternatively use mapReduce functions to calculate this:
// in your mongo shell using the db
var mapTimestamp = function() {
  // parenthesize the subtraction so it happens before the division
  var daySince = Math.floor((new Date() - new Date(this.Timestamp)) / 86400000);
  emit(this._id, daySince);
}
// each _id is emitted only once, so reduce is never really needed here
var reduceTimestamp = function (key, values) { return values[0]; }
db.collection.mapReduce(mapTimestamp, reduceTimestamp, {out: "sample"})
To show the results:
db.sample.find()
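The arithmetic itself (subtract, then divide by the number of milliseconds in a day) can be checked in plain JavaScript. The helper name daysSince is made up for this sketch, and the `now` argument is fixed in the example only so the result is reproducible.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000; // 86,400,000

// Hypothetical helper: whole days elapsed between an ISO-8601 string and `now`.
function daysSince(timestamp, now = new Date()) {
  const then = new Date(timestamp);
  if (Number.isNaN(then.getTime())) return null; // unparseable string
  return Math.floor((now - then) / MS_PER_DAY);
}

const fixedNow = new Date("2014-04-17T10:51:09.277Z");
console.log(daysSince("2014-04-07T10:51:09.277Z", fixedNow)); // 10
```

Returning null for an unparseable string is a choice of this sketch; it makes a malformed Timestamp visible instead of silently producing NaN.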

mongo query select only first of month

Is it possible to query only the first (or last, or any single) day of the month on a mongo date field?
I use the $date aggregation operators regularly, but within a $group clause.
Basically, I have a field that is already aggregated (averaged) for each day of the month. I want to select only one of these days (with its value as a representative of the entire month).
Following is a sample of a record set from Jan 1, 2014 to Feb 1, 2015, with price as the daily price and 28day_avg as the trailing monthly average for 28 days.
{ "date" : ISODate("2014-01-01T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 59.23, "28day_avg": 54.21}
{ "date" : ISODate("2014-01-02T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 58.75, "28day_avg": 54.15}
...
{ "date" : ISODate("2015-02-01T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 123.50, "28day_avg": 122.25}
Method 1.
I'm currently running an aggregation using $month data (and summing the price), but one issue is that I'm seeking to retrieve the underlying date value ISODate("2015-02-01T00:00:00Z") rather than the 0, 1, 2 value that comes with several of the date aggregations (that loop at the first of the week, month, year). mod(28) on a date?
Method 2.
I'd like to simply pluck out a single record of the 28day_avg as representative of the period. The 1st of the month would be adequate.
the desired output is...
_id: ISODate("2015-02-01T00:00:00Z"), value: 122.25,
_id: ISODate("2015-01-01T00:00:00Z"), value: 120.78,
_id: ISODate("2014-12-01T00:00:00Z"), value: 118.71,
...
_id: ISODate("2014-01-01T00:00:00Z"), value: 53.21,
Of course, the value will vary from method 1 to method 2, but that is fine. One is 28 days trailing while the other will account for 28-, 30-, and 31-day months; I don't care about that so much.
A non-aggregation query is OK too, but this one doesn't work: {"date": { "$mod": [ 28, 0 ]} }
To pick the first of the month for each month (method 2), use the following aggregation:
db.test.aggregate([
{ "$project" : { "_id" : "$date", "day" : { "$dayOfMonth" : "$date" }, "28day_avg" : 1 } },
{ "$match" : { "day" : 1 } }
])
You can't use an index for the match, so this is not efficient. I'd suggest adding another field to each document that holds the $dayOfMonth value, so you can index it and do a simple find:
{
"date" : ISODate("2014-01-01T00:00:00Z"),
"price" : 59.23,
"28day_avg" : 54.21,
"dayOfMonth" : 1
}
db.test.createIndex({ "dayOfMonth" : 1 })
db.test.find({ "dayOfMonth" : 1 }, { "_id" : 0, "date" : 1, "28day_avg" : 1 })
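As a quick check of what "pick the first of the month" selects, here is the same filter in plain JavaScript over an in-memory copy of the sample records. firstOfMonthAverages is a made-up name; the newest-first ordering and the { _id, value } shape follow the desired output in the question. Dates are read in UTC.

```javascript
// Hypothetical helper: keep only documents dated the 1st of a month (UTC)
// and reshape them like the desired output.
function firstOfMonthAverages(docs) {
  return docs
    .filter(d => d.date.getUTCDate() === 1)
    .sort((a, b) => b.date - a.date) // newest first
    .map(d => ({ _id: d.date, value: d["28day_avg"] }));
}

// The three sample records shown in the question (ids omitted).
const prices = [
  { date: new Date("2014-01-01T00:00:00Z"), price: 59.23, "28day_avg": 54.21 },
  { date: new Date("2014-01-02T00:00:00Z"), price: 58.75, "28day_avg": 54.15 },
  { date: new Date("2015-02-01T00:00:00Z"), price: 123.50, "28day_avg": 122.25 },
];
const monthly = firstOfMonthAverages(prices);
console.log(monthly.map(m => m.value)); // [ 122.25, 54.21 ]
```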