I have 100+ worker threads that poll the database looking for a new job.
To take a job, a thread needs to change the status of a bunch of documents from NEW to IN_PROGRESS, so that no other thread can pick the same job.
This can be solved perfectly fine in PostgreSQL with a SELECT ... WHERE status = 'NEW' FOR UPDATE SKIP LOCKED statement.
Is there a way to do such an atomic update in MongoDB for a single document? For a batch?
There's a findAndModify method, which works exactly as you've described for a single document.
For a batch, it's not possible right now, as
In MongoDB, write operations, e.g. db.collection.update(), db.collection.findAndModify(), db.collection.remove(), are atomic on the level of a single document.
It will be possible in MongoDB 4.0 though, with transactions.
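A minimal sketch of the single-document claim, in mongo shell style JavaScript. The collection handle `jobs`, the `owner` field, and the helper names `claimJob` and `claimBatch` are assumptions made for illustration; only findAndModify and the NEW / IN_PROGRESS statuses come from the question. The batch helper simply repeats the single-document claim, so each job is claimed atomically but the batch as a whole is not.

```javascript
// Claim one NEW job for this worker. findAndModify matches and updates a
// single document atomically, so two workers cannot claim the same job.
// `jobs` is assumed to be a mongo shell collection handle, e.g. db.jobs.
function claimJob(jobs, workerId) {
  return jobs.findAndModify({
    query: { status: "NEW" },
    update: { $set: { status: "IN_PROGRESS", owner: workerId } },
    new: true // return the document as it looks after the update
  });
}

// Claim up to `limit` jobs one at a time. Stops early when no NEW job is
// left (findAndModify returns null in that case). Other workers may claim
// jobs in between the individual calls.
function claimBatch(jobs, workerId, limit) {
  const claimed = [];
  while (claimed.length < limit) {
    const job = claimJob(jobs, workerId);
    if (job === null) break;
    claimed.push(job);
  }
  return claimed;
}
```

A worker would call something like claimBatch(db.jobs, "worker-17", 10) and then process whatever it got back, which may be fewer than 10 jobs.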
Related
I'm confused about how MongoDB updates work.
The documentation at https://docs.mongodb.com/manual/core/write-operations-atomicity/ says:
In MongoDB, a write operation is atomic on the level of a single
document, even if the operation modifies multiple embedded documents
within a single document.
When a single write operation modifies multiple documents, the
modification of each document is atomic, but the operation as a whole
is not atomic and other operations may interleave.
I guess it means: if I'm updating all fields of a document I will be unable to see a partial update:
If I get the document before the update I will see it without any change
If I get the document after the update I will see it with all the changes
For multiple documents, the same behavior applies to each document. I guess we could say there is a transaction for each document update instead of one big transaction for all of them.
But let's say there are a lot of documents in the multi-document update, and it takes a while to update all of them. What happens to queries from other threads during the update?
Will they see the old version? Or will they be blocked until the update finishes?
Are other updates to the same documents possible during this big update? If so, could such an intermediate update exclude some documents from the big update?
Will they see the old version? Or will they be blocked until the update finishes?
I guess other threads may see the old version of a document or the new, depending on whether they query the document before or after the update is finished, but they will never see a partial update on a document (i.e. one field changed and another not changed).
Are other updates to the same documents possible during this big update? If so, could such an intermediate update exclude some documents from the big update?
Instead of big or small updates, think of 2 threads doing an update on the same document. Thread 1 sets fields {a:1, b:2} and thread 2 sets {b:3, c:4}. If the original document is {a:0, b:0, c:0} then we can have two scenarios:
Update 1 is executed before update 2:
The document will finally be {a:1, b:3, c:4}.
Update 2 is executed before update 1:
The document will finally be {a:1, b:2, c:4}.
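The two orderings can be checked with a small simulation in plain JavaScript. It does not talk to MongoDB: the helper `applySet` is a hypothetical stand-in that models one atomic single-document update setting the named fields, which is enough to show that the later write to a shared field wins.

```javascript
// Model an update that sets fields as replacing only the named fields.
// Each call stands for one atomic single-document update.
function applySet(doc, fields) {
  return { ...doc, ...fields };
}

const original = { a: 0, b: 0, c: 0 };
const update1 = { a: 1, b: 2 }; // thread 1
const update2 = { b: 3, c: 4 }; // thread 2

// Update 1 first, then update 2: the later write to b wins.
const oneThenTwo = applySet(applySet(original, update1), update2);
// Update 2 first, then update 1.
const twoThenOne = applySet(applySet(original, update2), update1);

console.log(oneThenTwo); // { a: 1, b: 3, c: 4 }
console.log(twoThenOne); // { a: 1, b: 2, c: 4 }
```

In both orderings every field ends up with a value written by one of the two updates; a half-applied update never appears.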
In Vapor MongoKitten there is an update method which accepts an array of documents. Is the update executed atomically, or is the method only a convenience? The MongoDB docs say:
When a single write operation modifies multiple documents, the
modification of each document is atomic, but the operation as a whole
is not atomic and other operations may interleave.
https://docs.mongodb.com/manual/core/write-operations-atomicity/
I'm assuming you're pointing towards the update(bulk method for multiple update queries. This function executes a single batch query with multiple update statements. Each update is executed separately but they are submitted in a single batch to MongoDB to limit network load and increase app performance. It's primarily intended for migrations, although there sure are other use cases.
EDIT: For the record, this is true for MongoKitten 4. The upcoming MongoKitten 5 release may not support this in the initial version of the high level APIs.
I understand you cannot do transactions in MongoDB, and the thinking is that they are not needed because everything locks the whole database or collection (I am not sure which). However, how then do you perform the following?
How do I chain together multiple insert, update, delete or select queries in MongoDB so that other queries that might operate on the same data wait until these queries finish? An analogy would be the serializable transaction isolation level in MS SQL Server.
more..
I want to insert/update a record in collection A and update a record in collection B and then read collections A and B, but I don't want anyone (process or thread) to read or write to collection A or B until BOTH A and B have been updated or inserted by the first queries.
Yes, that's absolutely possible.
It is called ordered bulk operations on planet Mongo and works like this in the mongo shell:
bulk = db.emptyCollection.initializeOrderedBulkOp()
bulk.insert({name:"First document"})
bulk.find({name:"First document"}).update({$set:{name:"First document, updated"}})
bulk.execute()
db.emptyCollection.findOne()
> {_id: <someObjectId>, name:"First document, updated"}
Please read the manual regarding Bulk Write Operations for details.
Edit: Somehow I misread your question. It isn't possible for two collections. Remember, though, that you can have different kinds of documents in one collection. Some ODMs even allow different models to be saved to the same collection. Exploiting this, you should be able to achieve what you want using the above bulk operations. You may want to combine this with locking to prevent writing. But preventing both reading and writing would be the same as a transaction in terms of global and possibly distributed locks.
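One common way to add such locking is an advisory lock document, sketched below in mongo shell style JavaScript. Everything here is an assumption for illustration: the `locks` collection, the helper names `tryAcquire` and `release`, and the reliance on a duplicate key error (code 11000) on `_id`. The lock is only advisory: it works if every writer checks it, and it does not stop other clients from reading.

```javascript
// Try to take an advisory lock by inserting a document with a fixed _id.
// The unique index on _id makes the insert fail if the lock is already held.
// `locks` is assumed to be a mongo shell collection handle, e.g. db.locks.
function tryAcquire(locks, name, owner) {
  try {
    locks.insertOne({ _id: name, owner: owner, acquiredAt: new Date() });
    return true;
  } catch (err) {
    if (err.code === 11000) return false; // duplicate key: lock already held
    throw err; // anything else is a real error
  }
}

// Release the lock, but only if this owner is the one holding it.
function release(locks, name, owner) {
  locks.deleteOne({ _id: name, owner: owner });
}
```

A writer would call tryAcquire before running its bulk operation and release afterwards, retrying or backing off when tryAcquire returns false. A crashed holder leaves the lock behind, so a real setup would also need some expiry rule.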
I have a collection in my database
1. I want to lock my collection while a user is updating a document.
2. While the collection is being updated, no operations except reads should be allowed for other users.
Please suggest how to lock a collection in MongoDB.
Best Regards
GSY
MongoDB already implements a writer-greedy, database-level lock.
This means that when a specific document is being written to:
The User collection would be locked
No reads will be available until the data is written
The reason that no reads are available is that MongoDB cannot do a consistent read while writing (darn you physics, you win again).
It is good to note that if you wish for a more complex lock, spanning multiple documents, then this is not available in MongoDB and there is no real way of implementing such a thing.
MongoDB locking already does that for you. See what operations acquire which lock and what does each lock mean.
See the MongoDB documentation on write operations paying special attention to this section:
Isolation of Write Operations
The modification of a single document is always atomic, even if the write operation modifies multiple sub-documents within that document. For write operations that modify multiple documents, the operation as a whole is not atomic, and other operations may interleave.
No other operations are atomic. You can, however, attempt to isolate a write operation that affects multiple documents using the isolation operator.
To isolate a sequence of write operations from other read and write operations, see Perform Two Phase Commits.
What type of locking does findAndModify() offer? Is it a write lock only, or read/write? Does it prevent simultaneous updates on the same record?
MongoDB has a global (per-instance) write lock, which serializes all updates across all data in the server (though different servers in a sharded cluster will each have their own independent locks). This means that at any given instant only one update is taking place on the whole server, and therefore at most one update on any given document.
findAndModify doesn't do anything different in this regard than an ordinary update -- it just returns the document to you.
According to the MongoDB docs for findAndModify(), listed under Atomic Operations, it should be.