I am aware that with a combination of capped collections and tailable cursors Mongo clients can subscribe to additions to the collection. This however introduces a few limitations:
When the collection is full, the oldest members are removed.
Existing members cannot be updated if the update changes their size; the server rejects it with "Cannot change the size of a document in a capped collection".
Is there something more generic (such as RDBMS triggers) I can employ to listen to changes of all sorts happening to a Mongo collection?
Related
I am curious about the benefits and drawbacks of using a tailable cursor versus a change stream with a filter for insert operations over a capped collection, in terms of both usability and performance.
I have a capped collection that contains update objects. The only possible operation on this collection is the insertion of new updates (as new records), which I need to relay to my application backend and distribute through SSE/WebSockets in real time. I'll use a single parameter in the subscription query: a timestamp that is one of the object's properties. The collection has an index on that field. I'll also do some basic filtering over the newly added records, so the aggregation framework of Change Streams would be helpful.
I've read "What is the difference between a changeStream and tailable cursor in MongoDB", which summarizes the differences between tailable cursors and change streams in general.
Also, the MongoDB article on Capped Collections only describes the tailable cursor approach, and the article on Tailable Cursors states that they do not use indexes and that you should use a normal cursor to fetch the results.
In short, I need a reliable stream of collection updates based entirely on the insertion of new records in the collection. Which approach should I go for?
Tailable cursors and capped collections were created to support copying the oplog from a primary node to a secondary node. For any other activity the recommended approach is a change stream. Change streams are integrated with the MongoDB authentication model, and a change stream can be opened at the collection, database, cluster and sharded cluster level. Tailable cursors only work on a single collection and are not resumable. Change streams are supported by all official MongoDB drivers.
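For the insert-only relay described in the question, a change stream with a $match stage might look like the sketch below. It is written in Python against pymongo's Collection.watch(); the field name timestamp and the helper names build_insert_pipeline and relay_inserts are assumptions made for illustration, not taken from the original posts.

```python
def build_insert_pipeline(min_timestamp):
    """Aggregation stages that keep only insert events at or after min_timestamp."""
    return [
        {
            "$match": {
                "operationType": "insert",
                # "timestamp" is a hypothetical field on the inserted documents.
                "fullDocument.timestamp": {"$gte": min_timestamp},
            }
        }
    ]


def relay_inserts(collection, min_timestamp, handle):
    """Open a change stream on `collection` and pass each inserted document to `handle`."""
    # `collection` is expected to behave like a pymongo Collection.
    with collection.watch(build_insert_pipeline(min_timestamp)) as stream:
        for change in stream:
            handle(change["fullDocument"])
```

Here collection would be a pymongo Collection and handle whatever pushes the document to SSE/WebSocket clients. Change streams require a replica set or sharded cluster, and relay_inserts blocks while it waits for new events.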
I want to intercept chunk migration events of a particular (sharded) collection.
The changelog collection (from the config server database) is not capped, so we can't define a tailable cursor over it. One solution could be to copy the events of interest from the changelog collection into a capped one and query that with a tailable cursor. I would like to know if there is a better way to do this.
I'm trying to understand what a capped collection is, specifically in the context of MongoDB, and how it differs from a queue.
A capped collection removes its oldest document when it reaches its limit, so that could be an issue if there is a need to process ALL documents from the capped collection.
From the MongoDB documentation: Capped collections work in a way similar to circular
buffers: once a collection fills its allocated space, it makes room
for new documents by overwriting the oldest documents in the
collection.
Compared to a queue:
A queue does not remove records when it is full (it may instead throw an exception, such as out of memory).
A queue removes a record when it is dequeued; in a capped collection you would have to remove it yourself.
Capped collection cleanup: if the capped collection holds at most 40 documents, then adding the 41st document removes the 1st entry.
I think these are the most important points. Any comments are welcome!
A capped collection in MongoDB is an implementation of a circular buffer.
From the official documentation:
Capped collections are fixed-size collections that support high-throughput operations that insert and retrieve documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.
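The difference from a queue can be illustrated without a database. The sketch below uses Python's collections.deque as a stand-in: a deque with maxlen behaves like a circular buffer, while an unbounded deque behaves like a plain queue. Limiting by a count of 40 mirrors the example in the question and is a simplification; a real capped collection is limited by its data size.

```python
from collections import deque

# Circular buffer: appending to a full buffer silently drops the oldest entry.
capped = deque(maxlen=40)      # stand-in for a capped collection of 40 documents
for doc_id in range(1, 42):    # insert documents 1..41
    capped.append(doc_id)

# Plain queue: entries leave only when a consumer dequeues them.
queue = deque()
for doc_id in range(1, 42):
    queue.append(doc_id)
first = queue.popleft()

print(capped[0], len(capped))  # 2 40 (document 1 was overwritten)
print(first, len(queue))       # 1 40 (document 1 was handed to the consumer)
```

In the capped case document 1 is lost whether or not anyone has read it, which is why a consumer that must see every document has to keep up with the writers.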
I wonder if capped collections keep indexes for expired documents?
Removing documents from normal collection keeps indexes.
Capped collections remove documents by timer and do not allow db.collection.remove() at all.
I could not find anything in the docs about what happens to indexes on capped collections and would appreciate any help from those who know.
TL;DR: The only way to remove documents from a capped collection is to drop the entire collection, which also removes the collection's indexes.
I wonder if capped collections keep indexes for expired documents?
No. Documents that are no longer stored never remain in the index.
Removing documents from normal collection keeps indexes.
This is a bit misleading. Removing all documents from a normal collection with db.collection.remove() removes the documents from the collection and also deletes their entries from the index. It does not, however, remove the indexes of the collection: once you add new documents, they are added to the respective indexes again. In other words, removing the index itself is different from deleting documents from the index.
Capped collections remove documents by timer and do not allow db.collection.remove() at all.
The TTL feature you linked has nothing to do with capped collections. In fact, the documentation says:
You cannot create a TTL index on a capped collection, because MongoDB cannot remove documents from a capped collection.
A collection with a TTL index does allow db.collection.remove.
A capped collection, on the other hand, has a fixed size (in terms of data size) and the oldest documents of the collection are automatically overwritten once the collection is full. This is not based on time, but purely on the size of the collection. Capped collections are always kept in insertion order (natural order).
Since the only way to remove documents from a capped collection is to drop the entire collection, doing so also removes the collection's indexes.
I am new to MongoDB. I am building an application that requires an LRU policy on a collection. On the MongoDB site I see that capped collections support FIFO. Is there any other kind of collection that supports LRU? Throughout the documentation I see only capped collections. Are there any other kinds of collections in MongoDB?
Are collections by default capped in mongodb?
Capped collections are the only size-limited collections supported by MongoDB, and the limit is a data size in bytes. There is no built-in support for LRU or FIFO based on limiting to a particular number of documents.
Collections are not capped by default in MongoDB - a capped collection is a special case.
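Because a collection is capped only when it is explicitly created that way, it can help to create it up front and check its options. The sketch below assumes pymongo-style Database and Collection objects; the helper names ensure_capped and is_capped are made up for illustration.

```python
def ensure_capped(db, name, size_bytes):
    """Create `name` as a capped collection unless it already exists."""
    if name not in db.list_collection_names():
        # `size` is the maximum data size of the capped collection, in bytes.
        db.create_collection(name, capped=True, size=size_bytes)
    return db[name]


def is_capped(collection):
    """Return True when the collection's options report capped=True."""
    return bool(collection.options().get("capped", False))
```

With pymongo this might be used as ensure_capped(client["mydb"], "log", 5 * 1024 * 1024), where client, "mydb" and "log" are placeholder names. Note that ensure_capped does not convert an existing uncapped collection; it only returns it.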