Filtering & sorting by dates in MongoDB's oplog - mongodb

I'm trying to filter oplog.rs to view operations that were logged after a certain date. But comparison operators don't seem to work with dates in MongoDB:
db['oplog.rs'].find({ts: {$gte: ISODate("2014-12-15T00:00:00Z")}})
What's confusing is a lot of online sources are saying to do it this way but it is obviously not working. I want to return results where the ts field is at least December 15th or more recent, but I am getting results that are clearly before this period. If I switch $gte with $lte no results at all are displayed, even though there are definitely entries both before and after this specified date.
Also, I can't seem to sort the results either. If I try this:
db['oplog.rs'].find().sort({ts: -1})
I get this:
error: {
"$err" : "Runner error: Overflow sort stage buffered data usage of 33566005 bytes exceeds internal limit of 33554432 bytes",
"code" : 17144
}
If I could filter the results to make them more recent, I am hoping it would overcome this sorting error, but I can't even do that with MongoDB's basic operators. How can I filter the results of a find operation by date?

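Comparison operators work just fine with dates, but ts is a Timestamp object, not a date.
Your query needs to look something like this:
db['oplog.rs'].find({ts: {$gte: Timestamp(ISODate("2014-12-15T00:00:00Z").getTime() / 1000, 0)}})
This creates a Timestamp object based on an ISODate and uses it in the query. The division by 1000 is needed because getTime() returns milliseconds, while Timestamp expects seconds since the Unix epoch as its first argument.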
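As a rough sketch of the unit conversion involved: the shell's Timestamp constructor takes seconds since the Unix epoch, while Date.getTime() returns milliseconds. The helper name toOplogSeconds below is hypothetical, and the code is plain JavaScript so that it can run outside the shell; inside the mongo shell the resulting number would be passed as the first argument to Timestamp().

```javascript
// Hypothetical helper: convert a JavaScript Date to whole seconds since the
// Unix epoch, the unit used by the time part of the oplog's ts field.
function toOplogSeconds(date) {
  return Math.floor(date.getTime() / 1000);
}

const cutoff = new Date("2014-12-15T00:00:00Z");
const cutoffSeconds = toOplogSeconds(cutoff);
console.log(cutoffSeconds); // 1418601600

// In the mongo shell, the filter would then be built along these lines:
//   db['oplog.rs'].find({ts: {$gte: Timestamp(cutoffSeconds, 0)}})
```

Passing milliseconds instead of seconds would produce a Timestamp far in the future, which is one way a $gte filter could end up matching nothing.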

Related

Mongo Atlas filter by date is not working

I am using the following query to filter dates greater than the 25th of May, but it does not work. In the UI I chose the $match operator:
{
query: {$date:{ $gte:ISODate('2022-05-25')}}
}
In fact, the resulting collection also contains dates prior to the one that I specified in the filter condition.
Can you point me to the exact query?

Why is $eq comparison not working on MongoDB with dates?

I have the following query on mongo
db.getCollection('someCollection').find({"status":"failed","start_date": {$gt: new Date("2019/03/01")}})
that will retrieve all the records that have a failed status and the start_date is equal or greater than "2019/03/01".
But when I try to only retrieve records specifically for "2019/03/01":
db.getCollection('someCollection').find({"status":"failed","start_date": {$eq: new Date("2019/03/01")}})
It doesn't retrieve anything.
$lt and $gt queries work; it is just $eq that doesn't work. Is that the correct way to use $eq?
thank you
When you use new Date("2019/03/01") the actual date being searched is 2019/03/01 00:00:00.00. That is, the Date is exactly midnight 2019/03/01 (down to the millisecond). Unless your record also has the exact same date recorded, it will not match. That's why you need to use {$gt: new Date("2019/03/01"), $lt: new Date("2019/03/02")}. Just think of it as that 1 "day" is actually a range of 24 hours. Which is why you need to specify it as a range, and not as a simple $eq comparison.
The reason why your first query works is because when you search for $gt 2019/03/01 you're really searching for $gt 2019/03/01 00:00:00.00. And so, of course every record started on 2019/03/01 will match.
One thing to take note of: technically, if you have a document made exactly at midnight, i.e. with a timestamp of 2019/03/01 00:00:00.00, it won't match a $gt comparison. So you really should use {$gte: new Date('2019/03/01')} (note the extra 'e'). $gte is greater than or equal. This may sound trivial but is actually important, because if you don't use actual timestamps and record just the date when you create the record (i.e. you insert with {start_date: new Date('2019/03/01')}), all those timestamps will actually have 00:00:00.00 as their time component, and they won't match with a $gt comparison, only a $gte comparison. You probably use new Date() when creating the record, so you're getting the full timestamp, which is why it hasn't bitten you yet. But to be semantically correct, you should use $gte in your first query. Does that make sense?
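The "one day is a range" idea can be sketched as a small helper that builds a half-open range: $gte the start of the day and $lt the start of the next day. The helper name dayRangeFilter is hypothetical, and the sketch assumes days are delimited in UTC (new Date("2019/03/01") in the examples above is parsed in local time instead, so adjust to whichever convention your data uses).

```javascript
// Hypothetical helper: build a half-open range
// [start of day, start of next day) in UTC for a calendar date.
function dayRangeFilter(year, month, day) {
  const start = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls over correctly when day + 1 exceeds the month length.
  const end = new Date(Date.UTC(year, month - 1, day + 1));
  return { $gte: start, $lt: end };
}

const dayFilter = { status: "failed", start_date: dayRangeFilter(2019, 3, 1) };
console.log(dayFilter.start_date.$gte.toISOString()); // 2019-03-01T00:00:00.000Z
console.log(dayFilter.start_date.$lt.toISOString());  // 2019-03-02T00:00:00.000Z

// The object could then be passed to find(), for example:
//   db.getCollection('someCollection').find(dayFilter)
```

Using $gte for the lower bound and $lt for the upper bound means a record stamped exactly at midnight belongs to exactly one day.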

View last N documents using MongoDB Compass

I wish to view in MongoDB Compass the last N documents in a very large collection; too many to scroll through.
I could .skip(total - N) if I knew the syntax for that within Compass.
Alternatively, I have a date field and could use $gte with a date if I knew how to express a date in a manner acceptable to Compass.
Suggestion/example how to do this, please?
MongoDB Compass 1.6.1(Stable)
For date comparison you need to use the $date operator with a string that represents a date in ISO-8601 format.
{"date": {"$gte": {"$date": "2017-03-13T09:51:26.317Z"}}}
In my case the values of the date field in Compass and in the mongo shell are displayed differently. So I first query the documents in the shell and then copy the "2017-03-13T09:51:26.317Z" from the result to the Compass filter line. In the mongo shell it looks like:
{
...
"date" : ISODate("2017-03-13T09:51:26.317Z"),
...
}
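If the date is available as a JavaScript Date rather than copied from shell output, the same filter text can be generated programmatically. This is only a sketch; the helper name compassDateFilter is hypothetical, and it relies on Date.toISOString() producing the ISO-8601 string that the $date form shown above expects.

```javascript
// Hypothetical helper: wrap a JavaScript Date in the {"$date": "..."} form
// so the result can be pasted into the Compass filter line.
function compassDateFilter(field, date) {
  return { [field]: { $gte: { $date: date.toISOString() } } };
}

const compassFilter = compassDateFilter("date", new Date("2017-03-13T09:51:26.317Z"));
console.log(JSON.stringify(compassFilter));
// {"date":{"$gte":{"$date":"2017-03-13T09:51:26.317Z"}}}
```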
MongoDB Compass 1.7.0-beta.0 (Beta)
This version has an advanced query bar that lets you input not just the filter (as before), but also project, sort, skip and limit.
(@Oleksandr I learned from your effective answer; thank you.)
I've also been shown that the Compass Schema tab allows one to drag a date range on the _id field to apply a filter query for that range. That range can be successively narrowed as desired.
Skip is described here:
https://docs.mongodb.com/compass/current/query/skip/
In the Query Bar, click Options.
Enter an integer representing the number of documents to skip into the Skip field.
Click Find to run the query and view the updated results.

MongoDB query: Using Limit together with $near skips few documents

I am currently developing an app which gets a specific number of documents from a collection if their location coordinates fall within a certain range of distance. I am using an active record library for Codeigniter and the query that is generated is as follows:
db.updates.find({locs: { $near: [72.844102008984, 19.130207090604 ], $maxDistance: 5000 }, posted_on : { $lt :1398425538.1942 },}).sort( { posted_on: -1 } ).limit(10).toArray()
The problem I am facing is that the above query skips a few documents which should actually get pulled. But if I remove the limit(10) from the above query, then the proper documents get pulled.
I am not sure, but does using limit() in MongoDB omit a few results? Or does it limit the results to only the closest (nearest) documents?
P.S. The documents skipped when using the limit are not always the same; the results appear to be random.
I suspect you are running into problems with the special nature of the $near query. $near performs both a limit() and a sort() on the cursor returning the results -
Specifies a point for which a geospatial query returns the closest documents first. The query sorts the documents from nearest to farthest.
By default, queries that use a 2d index return a limit of 100 documents; however you may use limit() to change the number of results.
http://docs.mongodb.org/manual/reference/operator/query/near/
While the documentation does specifically discuss overriding the limit of 100 with your own limit call
You can further limit the number of results using cursor.limit().
It is silent on adding your own sort() or both sorting and overriding the limit at the same time. I suspect you are running into side effects of doing both. Note that it's not incorrect to do both - it just may not produce the results you are looking for. I'd suggest trying the same query using $geoWithin
http://docs.mongodb.org/manual/reference/operator/query/geoWithin/
$geoWithin does not apply a sort or a limit on the results, so it gives you something of a more raw result set.
Do you have any identical posted_on dates in the system? I recommend sorting by a second key, perhaps _id. If the sort order is non-deterministic, the system may skip documents in a non-deterministic manner. Adding the _id field to your sort order is generally not that expensive if you have an index on the other fields, as they will already be very close to the correct order and _id is part of all indexes. ("By default, all collections have an index on the _id field, and applications and users may add additional indexes to support important queries and operations." http://docs.mongodb.org/manual/core/index-single/ )
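To illustrate why a second sort key matters, here is a plain JavaScript sketch over hypothetical in-memory documents (the values are made up for the illustration). The comparator mirrors a sort specification of {posted_on: -1, _id: 1}; without the _id tie-breaker, which of the two documents sharing posted_on 100 lands inside the limit is not pinned down by the sort key alone.

```javascript
// Hypothetical sample documents; two of them share the same posted_on value.
const docs = [
  { _id: 3, posted_on: 100 },
  { _id: 1, posted_on: 200 },
  { _id: 2, posted_on: 100 },
];

// Mirrors a sort specification of {posted_on: -1, _id: 1}:
// newest first, ties broken by ascending _id.
function byPostedOnThenId(a, b) {
  if (a.posted_on !== b.posted_on) return b.posted_on - a.posted_on;
  return a._id - b._id;
}

// Equivalent in spirit to .sort({posted_on: -1, _id: 1}).limit(2)
const firstTwo = [...docs].sort(byPostedOnThenId).slice(0, 2);
console.log(firstTwo.map((d) => d._id)); // [ 1, 2 ]
```

In the shell, the corresponding change to the query would be to replace sort({posted_on: -1}) with sort({posted_on: -1, _id: 1}).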

Should I use the timestamp in "_id"?

I need to track when records were created, for later querying and modification.
The first thing that came to mind was to give the document a "createDateTime" field with a default value of "new Date()". But MongoDB says the document's _id has a timestamp embedded in it, and the id is generated when the document is created, so it seems redundant to add a new field for that.
Many times I've seen people set a "createDateTime" for their data, and I don't know if they know about the details of MongoDB's _id.
I want to know: should I use the _id as a "createDateTime" field? What is the best practice?
And what are the pros and cons?
Thanks for any tips.
I'd actually say it depends on how you want to use the date.
For example, the timestamp embedded in _id cannot be used with the aggregation framework's date operators.
This will fail for example:
db.test.aggregate( { $group : { _id: { $year: "$_id" } } })
The following error occurs:
"errmsg" : "exception: can't convert from BSON type OID to Date"
(The date cannot be extracted from the ObjectId.)
So, operations that are normally simple date operations become much more complex if you want to do any sort of date math in an aggregation. It would be far easier to have a createDateTime stamp. Counting the number of documents created in a particular year and month would be simple using aggregation with a distinct createDateTime field.
You can sort on an ObjectId, to some degree. Only the first 4 bytes hold the timestamp; the remaining 8 bytes of the ObjectId aren't sortable in a meaningful way. Most MongoDB drivers default to creating the ObjectId within the driver and not on the database. So, if you've got multiple clients (like web servers, for example) creating new documents (and new ObjectIds), the time stamps will only be as accurate as the clocks of the various servers.
Also, depending on the precision you need, note that an ISODate value is stored using 8 bytes (millisecond precision), rather than the 4 bytes (second precision) used in an ObjectId.
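As a sketch of the "count documents per year and month" case mentioned above, a pipeline along these lines could be used when a real date field exists. The field name createDateTime comes from the question; the collection name in the usage comment is only an assumption. The pipeline is written as a plain JavaScript value so that it can be inspected without a database connection.

```javascript
// Group documents by the year and month of createDateTime and count them.
// $year, $month, $group and $sum are standard aggregation operators.
const perMonthPipeline = [
  {
    $group: {
      _id: {
        year: { $year: "$createDateTime" },
        month: { $month: "$createDateTime" },
      },
      count: { $sum: 1 },
    },
  },
];

console.log(JSON.stringify(perMonthPipeline, null, 2));

// In the mongo shell this would be run as, for example:
//   db.test.aggregate(perMonthPipeline)
```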
Yes, you should. There is no reason not to do so, besides human readability when looking directly into the database. See also here and here.
If you want to use the aggregation framework to group by the date within _id, this is not possible yet, as WiredPrairie correctly said. There is an open Jira ticket for that, which you might want to watch. But of course you can do this with Map-Reduce and ObjectId.getTimestamp(). An example of that can be found here.
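For readers who want to see what the embedded timestamp looks like, the following plain JavaScript sketch decodes it from an ObjectId's 24-character hex string, whose first 8 hex characters (4 bytes) are seconds since the Unix epoch. Both helper names are hypothetical, and the example id is an arbitrary sample value; in the shell, ObjectId.getTimestamp() does the decoding for you.

```javascript
// Read the creation time from the first 4 bytes (8 hex characters) of an
// ObjectId hex string; those bytes hold seconds since the Unix epoch.
function objectIdToDate(hex) {
  return new Date(parseInt(hex.substring(0, 8), 16) * 1000);
}

// Hypothetical helper: the smallest ObjectId hex string for a given date,
// which could serve as a boundary in a range filter on _id.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

const created = objectIdToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString());        // 2012-10-17T21:13:27.000Z
console.log(objectIdHexFromDate(created)); // 507f1f770000000000000000
```

Because the precision is whole seconds and the clock is that of whichever client generated the id, this is a reasonable stand-in for a creation date only when those limits are acceptable.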