MongoDB: How can I verify that a TTL index is deleting data after a particular number of days?

How can we verify that data is being deleted from all collections after a particular number of days when using a TTL index?

Say your TTL is 10 days: you can count the number of documents whose date is more than 10 days in the past. To do such a count:
db.myCollection.count({"date":{"$lt":ISODate("2017-02-07T00:00:00.000Z")}})
If TTL is working, you'd expect the count to be 0. Note that you need to allow some extra time for deletion by TTL to take place. Deletion does not happen at exactly the specified moment, because it is performed by a periodically run background task.
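As a rough sketch (assuming the indexed field is called date and the TTL is 10 days, as above), you can also compute the cutoff dynamically instead of hard-coding an ISODate:
// count documents older than the 10-day TTL; if TTL is working, this should be 0
db.myCollection.count({"date": {"$lt": new Date(Date.now() - 10*24*60*60*1000)}})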

Related

How can I keep (archive) documents in MongoDB that the TTL index would delete?

I know that a TTL index in MongoDB sets a time to live for its documents. When they have lived for the specified time, those documents get deleted. So what if I want to archive those documents without losing them? Is there any way to do it?
I am sorry if I asked this in the wrong place.
You can use a change stream to capture documents as they are inserted and archive them somewhere. See also this answer on Stack Overflow.
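A minimal sketch of that idea in the mongo shell (change streams need a replica set or sharded cluster; the collection and archive database names here are hypothetical):
// watch for inserts on the TTL'd collection and copy each new document into an archive database
const cs = db.myCollection.watch([{ $match: { operationType: "insert" } }])
while (cs.hasNext()) {
    const event = cs.next()
    db.getSiblingDB("archive").myCollection.insertOne(event.fullDocument)
}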
Initially you have a document with a date field (timestamp) that is covered by the TTL index. To prevent its deletion, you can move that timestamp into the future, or remove that field from the document.
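For example (a sketch, with a hypothetical expireAt field and document _id):
// push the indexed timestamp a year into the future so the TTL monitor skips this document
db.myCollection.updateOne({ _id: someId }, { $set: { expireAt: new Date(Date.now() + 365*24*60*60*1000) } })
// or remove the indexed field entirely; documents without it are never expired
db.myCollection.updateOne({ _id: someId }, { $unset: { expireAt: "" } })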

Add Mongo TTL Index to Large Collection

I have a large collection in Mongo. Around 1.7 billion records that take up around 5TB of storage space. I no longer need to keep this data indefinitely so I'm looking at options for getting rid of most of the data, preferably based on "createdAt".
I'm wondering what to expect if I add a ttl index to only keep records around for a month at the most. I have the following index currently:
{
    "v" : 1,
    "key" : {
        "createdAt" : 1
    },
    "name" : "createdAt_1",
    "ns" : "someNS.SomeCollection",
    "background" : true
}
How quickly would mongo be able to delete all that data? From what I've read, the ttl process runs every 60 seconds. How much data does it delete each time around?
Adding a TTL index to a large collection like that can really impact performance. If you need to continue querying this collection while the TTL index is being created, you might consider initially creating it with a very large expireAfterSeconds value, so that the expiration cutoff lies far in the past and no documents are actually eligible for deletion. Once an index has been created with a TTL, you can later adjust how long documents are meant to stay around for.
Once you've created that index, you can either manually run queries to delete the old data until you're close to caught up and able to set the TTL to its target value, or gradually reduce the TTL so that you're able to control the performance impact.
(Source: advice from mLab on adding a TTL to a 1TB collection. If you don't need to maintain access to the data while removing old documents, you can ignore this advice completely.)
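A sketch of that approach in the mongo shell (the collection name is taken from the ns above; the concrete seconds values are illustrative, and it assumes the TTL index is created as a new index, which typically means dropping the existing plain createdAt index first):
// create the TTL index with a huge expiry so that no document is eligible for deletion yet
db.SomeCollection.createIndex({ "createdAt": 1 }, { expireAfterSeconds: 315360000 })  // ~10 years
// later, shrink the TTL step by step with collMod until you reach the 30-day target
db.runCommand({
    collMod: "SomeCollection",
    index: { keyPattern: { "createdAt": 1 }, expireAfterSeconds: 2592000 }  // 30 days
})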
Timing of the Delete Operation
When you build a TTL index in the background, the TTL thread can begin deleting documents while the index is building. If you build a TTL index in the foreground, MongoDB begins removing expired documents as soon as the index finishes building.
The TTL index does not guarantee that expired data will be deleted immediately upon expiration. There may be a delay between the time a document expires and the time that MongoDB removes the document from the database.
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a collection during the period between the expiration of the document and the running of the background task.
Because the duration of the removal operation depends on the workload of your mongod instance, expired data may exist for some time beyond the 60-second period between runs of the background task.

Can Mongo auto-remove collections?

I heard that Mongo can do it, but I can't find how.
Can Mongo create collections which will be auto-removed in the future, at a time I can set up? Or can't Mongo do this magic?
MongoDB cannot auto-remove collections, but it can auto-remove BSON documents. You just need to set a TTL (time to live) index on a date field that exists in the documents.
You can read more here: MongoDb: Expire Data from Collections by Setting TTL
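A minimal sketch (collection and field names are hypothetical): with this index, documents are removed roughly 24 hours after the time stored in createdAt.
db.myCollection.createIndex({ "createdAt": 1 }, { expireAfterSeconds: 86400 })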
Collections are auto created on the first write operation (insert, upsert, index creation). So this magic is covered.
If your removal is based on time, you could use cron or at to run this little script:
mongo yourDBserver/yourDB --eval 'db.yourCollection.drop()'
As Sammaye pointed out, creating indices is a costly operation. I would assume there is something wrong with your data model. For semantically distinguishing documents, I'd rather create a field on them which does that, set either an expiration date or a creation date plus a time frame in which the documents are valid, and use TTL indices to remove those documents.
For using an expiration date, you have to set a field to an ISODate and create a TTL index without a duration:
db.yourColl.ensureIndex({"yourExpirationDateField":1},{expireAfterSeconds:0})
In the case you want the documents to be valid for let's say a week after they are created, you would use the following:
db.yourColl.ensureIndex({"yourCreationDate":1},{expireAfterSeconds:604800})
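For illustration (field names as above, the payload is hypothetical), a document for each variant would be inserted like this:
// removed shortly after the given date by the expireAfterSeconds:0 index
db.yourColl.insertOne({ payload: "...", yourExpirationDateField: new Date("2024-06-01T00:00:00Z") })
// removed roughly one week after creation by the 604800-second index
db.yourColl.insertOne({ payload: "...", yourCreationDate: new Date() })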
Either way, here is what happens: Once every minute a background thread called TTLMonitor wakes up, gets all TTL indices for the server and starts processing them. It scans the TTL index, looking for the date values, adds the value given for "expireAfterSeconds" and deletes all documents which it determined to be invalid by now. This process takes some time, so don't expect the documents to be deleted on the very second they expire.
The big advantage of that approach: you don't need any triggering logic to be maintained, the deletes are done automagically in the background, and you don't put any load on your application. Plus, using an expiration date, you have very granular control over when a document expires.
The drawback is ... ... Well, if I want to find one it would be that you have to insert a creation date for every document or calculate and insert an expiration date. And you have to send an administrative command to the mongod/mongos once in the application lifetime...

How can I perform time range query and skip documents at a particular interval?

I have a collection of 10 million documents, each having a field 'timeOfReceipt' which is in milliseconds. I would like to perform a range query (a 24-hour period) and then further filter that by skipping documents at a specified interval (a 10-second interval, for example). The range query is simple enough; I just don't understand how I can skip using an interval.
Assuming you wish to keep this to one cursor/query for this operation, there is no way to programmatically iterate only a certain selection of documents for each batch you pull from the server.
The best way to achieve this would be to cache on your client side until you have all the documents you need and then return that batch to the end user. You would need to cache any leftover documents and then retrieve them at the start of the next batch.
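A sketch of that client-side filtering in the mongo shell (the collection name and the 10-second interval are assumptions, and timeOfReceipt is assumed to be stored as a numeric millisecond timestamp):
// iterate the 24-hour range in time order and keep only one document per 10-second slot
const start = ISODate("2017-02-07T00:00:00Z").getTime()
const end = start + 24*60*60*1000
let lastKept = -Infinity
const kept = []
db.readings.find({ timeOfReceipt: { $gte: start, $lt: end } }).sort({ timeOfReceipt: 1 }).forEach(function (doc) {
    if (doc.timeOfReceipt - lastKept >= 10*1000) {
        kept.push(doc)
        lastKept = doc.timeOfReceipt
    }
})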

When will a mongodb document expire after it is updated?

I have a collections of documents in mongodb, with the expireAfterSeconds property set on a date-type index.
For the sake of argument, the documents are set to expire after one hour.
When I update a document in this collection, which one of the following will happen?
a) The document will expire one hour after the original creation time.
b) The document will expire one hour after the update time.
c) The document will expire one hour after the indexed variable's time, whatever that may be.
d) None of the above
I think that it's c, but cannot find the reference to confirm it. Am I correct? Where is this documented?
[edit]: To clarify, the situation is that I'm storing password reset codes (which should expire), and I want the old codes to stop working if a new code is requested. It's not very relevant though, since I can ensure the behaviour I want is always respected by simply deleting the old transaction. This question is not about my current problem, but about Mongo's behaviour.
The correct answer is c)
The expireAfterSeconds property always requires an index on a field which contains a BSON date, because the content of this date field is used to select entries for removal.
When you want an update of a document to reset the time-to-live, also update the indexed date field to the current time.
When you want an update to not affect the TTL, just don't update the date.
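For example (a sketch using a hypothetical resetCodes collection whose TTL index is on createdAt):
// refresh the indexed date as well, so the one-hour clock restarts from now
db.resetCodes.updateOne({ _id: codeId }, { $set: { attempts: 0 }, $currentDate: { createdAt: true } })
// update other fields only; the document still expires one hour after the original createdAt
db.resetCodes.updateOne({ _id: codeId }, { $set: { attempts: 0 } })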
However, keep in mind that expireAfterSeconds doesn't guarantee immediate deletion of the document. The deletions are done by a background job which runs every minute. This job is low-priority and can be postponed by MongoDB when the current load is high. So when it's important for your use-case that the expire times are respected accurately to the second, you should add an additional check on the application level.
This feature is documented here: http://docs.mongodb.org/manual/tutorial/expire-data/
If you don't want to rely on the mongod daemon's background process for expiring documents, it's better to create an additional createdOn field on the collection and compare it with the current timestamp to decide whether to use that document or not.
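As a sketch of that application-level check (collection and field names are hypothetical, assuming a one-hour validity window):
// only treat a code as valid if it was created within the last hour,
// even if the TTL background job hasn't removed the expired document yet
db.resetCodes.findOne({ code: someCode, createdOn: { $gte: new Date(Date.now() - 60*60*1000) } })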