Is the document expiration field set with a date-time, seconds, or milliseconds?
As far as I know, it's seconds:
"expiration": 1543086426,
So 1543086426 = Saturday, November 24, 2018 7:07:06 PM UTC.
Then why is the document being removed by Cloudant?
If it's milliseconds, then:
1543086426 = Sunday, 18 January 1970 20:38:06 UTC
which is already in the past, and that would explain it. So how do I set a proper expiration value for a Cloudant document, e.g. one month from now?
Also, which Cloudant task is responsible for document removal? And how often it starts?
As pointed out in the Couchbase documentation:
Time to live (TTL) is the amount of time until a document expires in
Couchbase Server. By default, all documents have a TTL of zero, which
indicates the document is kept indefinitely. Typically when you add,
set, or replace information, you establish a custom TTL by passing it
as a parameter to your method call. As part of normal maintenance
operations, Couchbase Server periodically removes all items with
expiration times that have passed.
Depending on the amount of time you want to specify for the document
to live, you provide a TTL value as a relative number of seconds into
the future or in Unix time. Unix time represents a specific date and
time expressed as the number of seconds that have elapsed since
Thursday, 1 January 1970 at 00:00:00 Coordinated Universal Time (UTC).
For example, the value 1421454149 represents Saturday, 17 January
2015 at 00:22:29 UTC.
But Cloudant does not support Time to Live (TTL) functionality.
The reason is that IBM Cloudant documents are only 'soft' deleted, not deleted. The soft deletion involves replacing the original document with a smaller record. This small record or 'tombstone' is required for replication purposes; it helps ensure that the correct revision to use can be identified during replication.
If the TTL capability was available in IBM Cloudant, the resulting potential increase in short-lived documents and soft deletion records would mean that the database size might grow in an unbounded fashion.
For more info, refer to this link on TTL.
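For illustration, here is how seconds and milliseconds since the Unix epoch relate, and how you would compute a value one month ahead in plain JavaScript (a sketch; the variable names are illustrative):
var nowMs = Date.now();                     // milliseconds, e.g. 1543086426000
var nowSeconds = Math.floor(nowMs / 1000);  // seconds, e.g. 1543086426
// A timestamp roughly one month (30 days) from now, in seconds:
var oneMonthFromNow = nowSeconds + 30 * 24 * 60 * 60;
Note that the question's value, read as milliseconds, lands in January 1970, i.e. in the past.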
Related
Is there a possible way to check for changes in Firestore with Google Cloud Functions at the end of the day?
From the documentation I can only find the classic Firestore triggers (onCreate, onUpdate, onDelete, onWrite).
In my app, the documents are updated many times throughout the day.
At the end of the day the documents are ready for further processing.
So my intention is to save invocations by only looking for changed documents every 24 hours.
There is no built in functionality to either trigger Cloud Functions daily with all documents that were modified, or even in the Firestore API to get all documents that were modified in a certain time interval.
What you can do is:
Add a lastModified field to each document that you update whenever you update that document, setting it to the server timestamp.
Create a Cloud Function that gets triggered every day.
In the function, run a query against Firestore that gets all documents whose lastModified is in the past 24 hours, as sketched below.
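A minimal sketch of such a scheduled function (Node.js with firebase-functions and firebase-admin; the collection name "documents" and the schedule string are assumptions):
const functions = require("firebase-functions");
const admin = require("firebase-admin");
admin.initializeApp();

exports.dailyProcessing = functions.pubsub
  .schedule("every 24 hours")
  .onRun(async () => {
    // Cutoff: 24 hours ago, as a Firestore Timestamp.
    const cutoff = admin.firestore.Timestamp.fromMillis(
      Date.now() - 24 * 60 * 60 * 1000
    );
    // Fetch only documents whose lastModified is within the last day.
    const snapshot = await admin.firestore()
      .collection("documents")
      .where("lastModified", ">=", cutoff)
      .get();
    snapshot.forEach((doc) => {
      // ... further end-of-day processing of each changed document ...
    });
    return null;
  });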
How can we find out whether data is being deleted from all collections after a particular number of days using a TTL index?
Say your TTL is 10 days. You can then count the number of documents whose date is more than 10 days ago. To do such a count:
db.myCollection.count({"date":{"$lt":ISODate("2017-02-07T00:00:00.000Z")}})
If TTL is working, you'd expect the count to be 0. Note that you need to give some extra time before deletion-by-TTL takes place. Deletion is not exactly at the specified moment because deletion is via a periodically-run background task.
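If you don't want to hardcode the cutoff date, you can compute it in the mongo shell (a sketch, assuming the same 10-day TTL):
// Count documents older than 10 days; expect 0 if the TTL index is working.
var cutoff = new Date(Date.now() - 10 * 24 * 60 * 60 * 1000);
db.myCollection.count({ "date": { "$lt": cutoff } })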
I'm using Morphia to connect to MongoDB. I'm collecting daily mileage for cars. Right now, all daily mileage for all cars is stored in one collection with the following attributes:
plateNumber, date, mileage
We want to store the daily mileages all the way back from 1990 onwards. Right now, we're already maintaining around 4,500+ cars (that's roughly 1.3 million records a year). We're trying with one year's worth of data, and the performance is already lagging badly. I was thinking of splitting the storage into multiple collections based on the plate number, so each plate number would have its own collection named after it. I need some ideas. Is there any other way to solve this?
Adding details:
How we'll use the data: we want to query the mileages of multiple cars (sometimes per department, per geographic area, per make/model, etc.) over any given date range.
So, let's just say we want to monitor mileages in a suburb: we'll take the mileages of all plate numbers operating in that suburb from 01 Jan 2014 to 23 Jun 2014 and perform calculations on the data.
thanks.
Depending on your configuration, you can try sharding, or you may attempt to partition your db, though this approach is hybrid, meaning that you would mimic partitioning from SQL database systems (Oracle, SQL Server, etc.).
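As a sketch of the sharding route in the mongo shell (the database name "fleet" and collection name "mileage" are assumptions; plateNumber and date come from the question):
// Distribute documents across shards on a compound key, so queries that
// filter on plate number and a date range stay targeted to few shards.
sh.enableSharding("fleet")
db.mileage.createIndex({ plateNumber: 1, date: 1 })
sh.shardCollection("fleet.mileage", { plateNumber: 1, date: 1 })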
Also note that if you insert (basically append) a lot of entries into a single collection, it will gradually become slow, since MongoDB needs to update the unique index on the primary key (_id), and if you have defined other indexes on the collection those also need to be updated.
If you can provide more information on how you intend to use the collected data, at what time intervals, and whether these operations are online or offline, I'll update my answer.
I heard that Mongo can do it, but I can't find out how.
Can Mongo create collections that will be auto-removed at some future time that I can set? Or can't Mongo do this magic?
MongoDB cannot auto-remove collections, but it can auto-remove BSON documents. You just need to set a TTL (time to live) index on a date field that exists in the document.
You can read more here: MongoDB: Expire Data from Collections by Setting TTL.
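For example, a minimal sketch (the collection name "events" and the field name "createdAt" are illustrative):
// Documents are removed roughly an hour after their createdAt time; the
// TTL monitor runs about once a minute, so deletion is not instantaneous.
db.events.createIndex({ "createdAt": 1 }, { expireAfterSeconds: 3600 })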
Collections are auto-created on the first write operation (insert, upsert, index creation), so this magic is covered.
If your removal is based on time, you could use cron or at to run this little script:
mongo yourDBserver/yourDB --eval 'db.yourCollection.drop()'
As Sammaye pointed out, creating indices is a costly operation. I would assume there is something wrong with your data model. For semantically distinguishing documents, I'd rather create a field on them which does that, set either an expiration date or a creation date plus a time frame in which the documents are valid, and use TTL indices to remove all of those documents.
To use an expiration date, you have to set a field to an ISODate and create a TTL index with expireAfterSeconds set to 0:
db.yourColl.createIndex({ "yourExpirationDateField": 1 }, { expireAfterSeconds: 0 })
In the case you want the documents to be valid for, let's say, a week after they are created, you would use the following:
db.yourColl.createIndex({ "yourCreationDate": 1 }, { expireAfterSeconds: 604800 })
Either way, here is what happens: Once every minute a background thread called TTLMonitor wakes up, gets all TTL indices for the server and starts processing them. It scans the TTL index, looking for the date values, adds the value given for "expireAfterSeconds" and deletes all documents which it determined to be invalid by now. This process takes some time, so don't expect the documents to be deleted on the very second they expire.
The big advantage of this approach: you don't need any triggering logic to be maintained, the deletes are done automagically in the background, and you don't put any load on your application. Plus, using an expiration date, you have very granular control over when a document expires.
The drawback is... well, if I had to find one, it would be that you have to insert a creation date for every document, or calculate and insert an expiration date. And you have to send an administrative command to the mongod/mongos once in the application's lifetime...
I have a collections of documents in mongodb, with the expireAfterSeconds property set on a date-type index.
For the sake of argument, the documents are set to expire after one hour.
When I update a document in this collection, which one of the following will happen?
a) The document will expire one hour after the original creation time.
b) The document will expire one hour after the update time.
c) The document will expire one hour after the indexed variable's time, whatever that may be.
d) None of the above
I think that it's c, but cannot find the reference to confirm it. Am I correct? Where is this documented?
[edit]: To clarify, the situation is that I'm storing password reset codes (that should expire), and I want the old codes to stop working if a new code is requested. It's not very relevant though, since I can ensure the behaviour I want is always respected by simply deleting the old transaction. This question is not about my current problem, but about Mongo's behaviour.
The correct answer is c)
The expireAfterSeconds property always requires an index on a field which contains a BSON date, because the content of this date field is used to select entries for removal.
When you want an update of a document to reset the time-to-live, also update the indexed date field to the current time.
When you want an update to not affect the TTL, just don't update the date.
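For example, a sketch in the mongo shell matching the password-reset scenario from the question (collection, field, and value names are illustrative):
// The TTL index watches "createdAt"; bumping it to now restarts the
// one-hour expiry clock for this document.
db.resetCodes.update(
  { user: "alice" },
  { $set: { code: "483920", createdAt: new Date() } }
)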
However, keep in mind that expireAfterSeconds doesn't guarantee immediate deletion of the document. The deletions are done by a background job which runs every minute. This job is low-priority and can be postponed by MongoDB when the current load is high. So when it's important for your use-case that the expire times are respected accurately to the second, you should add an additional check on the application level.
This feature is documented here: http://docs.mongodb.org/manual/tutorial/expire-data/
If you don't want to rely on the MongoDB daemon's background process for expiring documents, it is better to create an additional createdOn field on the collection and compare it with the current timestamp to decide whether to use that document or not.
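A sketch of such an application-level check (the collection name is an assumption; createdOn is the field suggested above):
// Only accept documents created within the last hour, regardless of
// whether the background TTL monitor has already removed stale ones.
var oneHourAgo = new Date(Date.now() - 60 * 60 * 1000);
var doc = db.resetCodes.findOne({ user: "alice", createdOn: { $gte: oneHourAgo } });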