Automatically remove document from the collection when array field becomes empty - mongodb

I have a collection whose documents contain array fields with event triggers, and when there are no triggers left I want to remove the document. As I understand it, MongoDB doesn't support triggers. Is there any way I can delegate this job to MongoDB?

You are correct, there are no triggers in MongoDB, so there is no built-in way to do this. You have to use application logic to achieve it. One way would be to run a cleanup every n minutes that removes documents whose array has size zero. Another way (which I like more) is to remove the document right after each update if its array is empty.
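A minimal pymongo sketch of the cleanup query, assuming the array field is called triggers (the field and collection names are hypothetical):

```python
def empty_triggers_filter(field="triggers"):
    # matches documents whose array field exists but holds no elements
    return {field: {"$exists": True, "$size": 0}}

# usage, assuming `coll` is a pymongo collection; run this either on a
# schedule or right after each update that pulls a trigger:
# coll.update_one({"_id": doc_id}, {"$pull": {"triggers": trigger}})
# coll.delete_many(empty_triggers_filter())  # drop now-empty documents
```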

The only feature I know of that MongoDB provides to expire data is an expiration (TTL) index.
Expire data

Related

MongoDB TTL but to do other stuff

I have a requirement that when a date attribute field has passed, we would like to trigger two things:
to move the record to be deleted to another table.
to call a function to do other actions.
I understand TTL only deletes a record when the date field has passed. Can I hook extra logic into it?
Thanks!
Depending on the requirements there could be quite a few ways to do this.
One way is to execute a script periodically that queries for documents past a certain date value, then migrates each matching document to another collection and performs the extra actions.
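A sketch of the periodic approach with pymongo; the expireAt field, collection names, and the extra-actions hook are all assumptions:

```python
from datetime import datetime, timezone

def expired_filter(now=None, field="expireAt"):
    # match documents whose expiry date has already passed
    if now is None:
        now = datetime.now(timezone.utc)
    return {field: {"$lte": now}}

# usage, run from a scheduler (cron, Celery beat, etc.):
# for doc in source.find(expired_filter()):
#     archive.insert_one(doc)                # move to the other collection
#     do_extra_actions(doc)                  # hypothetical extra logic
#     source.delete_one({"_id": doc["_id"]})
```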
An alternative is to use MongoDB Change Streams. The catch, however, is that delete events from a change stream do not return the document itself (because it has already been deleted).
Instead, if you update a field on documents that have passed a certain date value (for example, setting expired: true), you can listen for the update events.
Worth mentioning that if you're going down the route of change stream update events, you could utilise MongoDB Stitch Triggers (which rely on change streams). MongoDB Stitch database triggers allow you to automatically execute Stitch functions in response to changes in your MongoDB database.
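With pymongo, the update-event route could look roughly like this (change streams require a replica set; the expired field name is an assumption):

```python
def expired_update_pipeline():
    # match update events that set the hypothetical `expired` flag
    return [{"$match": {
        "operationType": "update",
        "updateDescription.updatedFields.expired": True,
    }}]

# usage, assuming `coll` is a pymongo collection on a replica set:
# with coll.watch(expired_update_pipeline(),
#                 full_document="updateLookup") as stream:
#     for change in stream:
#         doc = change["fullDocument"]  # migrate / run extra actions here
```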
I suggest writing a function and calling it via a scheduler. That would be the better way to do it.

MONGODB - Add duplicate field with different value

Is there a way to write a script that updates a document by adding a duplicate field with a different value? I cannot use $set, as that replaces the existing value. I cannot use $push, as the field is in an object, not an array. I even tried creating the new field with a different name and renaming it, which also replaces the existing field.
You cannot have duplicate fields in a Mongo record. A Mongo collection is a collection of documents, otherwise known as objects. You cannot have a duplicate field in an object and Mongo is no different.
MongoDB (and any other database that I have come across so far) is built around the idea that individual fields are identifiable so they can be filtered by, grouped by, sorted by, etc... That also explains why MongoDB does not provide support for the scenario you're facing. That being said, MongoDB can be used as a dumb datastore for arbitrary JSON data. And the JSON specification does not say anything about duplicate field names which is probably why you can actually store such a document in MongoDB in the first place.
Anyway, there is no way to achieve what you want other than loading the entire document, changing it (by adding the duplicate field(s)), and then replacing the whole document. That, however, will work.
I personally cannot think of a reasonable scenario where this sort of document could make sense, though. So I would strongly suggest you revisit your document structure.
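If the goal is simply to hold several values under one key, the conventional restructuring is to turn the scalar field into an array. A hypothetical sketch using an aggregation-pipeline update (MongoDB 4.2+); the field name and value are examples:

```python
def append_value_update(field, new_value):
    # aggregation-pipeline update: wrap an existing scalar in an array
    # if needed, then append the new value under the same key
    return [{"$set": {field: {"$concatArrays": [
        {"$cond": [{"$isArray": f"${field}"}, f"${field}", [f"${field}"]]},
        [new_value],
    ]}}}]

# usage, assuming `coll` is a pymongo collection:
# coll.update_one({"_id": doc_id}, append_value_update("color", "blue"))
```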

Keep only n documents in collection - Meteor

I want to keep only the latest n documents in my activityFeed collection, in order to speed up the application. I know that I could subscribe only to n activityFeed elements in my iron-router configs, but it is not necessary to keep all the entries.
How can I do this?
Do I need to check on every insert, or is there a better way to do this?
Any help would be greatly appreciated.
As you point out, your subscription could handle this on the client side, but if you also want to purge your DB there are two obvious solutions:
Use Collection hooks to delete the oldest item on every insert.
Use a cron job to find the nth oldest element every so often (15 minutes or whatever), and then delete everything older than that. I'd use synced-cron for this.
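In pymongo terms, the cron-purge idea could be sketched like this (the createdAt field name is an assumption):

```python
def older_than_filter(cutoff, field="createdAt"):
    # everything strictly older than the nth-newest document's timestamp
    return {field: {"$lt": cutoff}}

# usage, assuming `coll` is the activityFeed collection and n is the
# number of documents to keep:
# pivot = coll.find().sort("createdAt", -1).skip(n - 1).limit(1).next()
# coll.delete_many(older_than_filter(pivot["createdAt"]))
```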
Better would be to create a capped collection: https://docs.mongodb.org/manual/core/capped-collections/
In meteor you can use it like this:
var coll = new Meteor.Collection("myCollection");
coll._createCappedCollection(numBytes, maxDocuments);
Here maxDocuments is the maximum number of documents to store in the collection and numBytes is the maximum size of the collection in bytes. If either of these limits is reached, MongoDB will automatically drop the oldest inserted documents from the collection.
No need for additional scripts and cron jobs.
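Outside Meteor, the same capped collection can be created directly. A pymongo sketch (the collection name and sizes are examples):

```python
def capped_options(num_bytes, max_documents):
    # options for create_collection: cap by total size and document count
    return {"capped": True, "size": num_bytes, "max": max_documents}

# usage, assuming `db` is a pymongo database:
# db.create_collection("activityFeed", **capped_options(2**20, 1000))
```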

MongoDB cursor and write operations

I am using MongoDB to save data about products. After writing the initial large data set (24 million items) I would like to change all the items in the collection.
Therefore I use a cursor to iterate over the whole collection. Then I want to add a field to every item in the collection. With large data sets this is not working: only 180,000 items were updated. On a small scale it works. Is that normal behavior?
Is MongoDB not supposed to support writes while iterating with a cursor over the whole collection?
What would be a good practice to do that instead?
For larger collections, you might run into snapshotting problems. When you add data to a document and save it, the document grows, forcing MongoDB to move it on disk. The cursor may then return the same document twice.
You can either use $snapshot in your query, or use a stable order such as sort({"_id":1}). Note that you can't use both.
Also make sure to use at least acknowledged write concern.
When we had a similar problem, we fetched the data in chunks of 100k documents (a size we arrived at with some testing). It's a quick and simple solution.
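The stable-order, chunked approach can be sketched with pymongo; the batch size and the newField name are assumptions:

```python
def next_batch_filter(last_id=None):
    # resume the scan after the last _id seen, keeping a stable order
    return {} if last_id is None else {"_id": {"$gt": last_id}}

# usage, assuming `coll` is a pymongo collection:
# last_id = None
# while True:
#     batch = list(coll.find(next_batch_filter(last_id))
#                      .sort("_id", 1).limit(1000))
#     if not batch:
#         break
#     for doc in batch:
#         coll.update_one({"_id": doc["_id"]}, {"$set": {"newField": 1}})
#     last_id = batch[-1]["_id"]
```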

How can I retrieve modified documents after an update operation in MongoDB with pymongo?

I'm using an update operation with upsert. I want to retrieve all documents that have been modified after an update.
for key in categories_links:
    collection.replace_one(
        {"name": key},
        {"name": key, "url": categories_links[key]},
        upsert=True,
    )
You should use a timestamp field in your documents if you ever need to find which ones were updated and when. There is a BSON type for that.
To my knowledge, pymongo will not return a list of all of the records which have been modified by an update.
However, if you are using a replicaset, you might be able to accomplish this by looking at the oplog.
According to the documentation:
The oplog must translate multi-updates into individual operations in
order to maintain idempotency. This can use a great deal of oplog
space without a corresponding increase in data size or disk use.
If you want to keep track of each element being updated, you might instead do a find(), and then loop through those to do an individual update() on each. Obviously this would be much slower, but perhaps a tradeoff for your specific use case.
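A sketch of that per-document loop, which lets you record each key that was actually modified (using the modern update_one API; error handling omitted):

```python
def category_update(name, url):
    # build the filter and update for upserting one category document
    return ({"name": name}, {"$set": {"url": url}})

# usage, assuming `collection` is a pymongo collection:
# modified = []
# for key, url in categories_links.items():
#     flt, update = category_update(key, url)
#     result = collection.update_one(flt, update, upsert=True)
#     if result.modified_count or result.upserted_id is not None:
#         modified.append(key)
```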