Let's say I have MongoDB collection A and B, and they are in the same database.
I'm renaming B to A with the drop-target option, so the existing A is replaced.
I know a rename takes very little time.
But what if I send a query to A while MongoDB is still renaming B to A?
----------|----------------------------|----------------------------|----------
   rename B to A begins          query sent to A            rename B to A done
Will I get the result right away, or will the query wait until the rename finishes?
Per the documentation
The db.collection.renameCollection() method and renameCollection command will invalidate open cursors which interrupts queries that are currently returning data.
So queries that are already returning data may fail with dead or killed cursor errors while the collection is being renamed.
Since MongoDB 4.2, operations on the source and target collections wait until the rename has finished:
renameCollection() obtains an exclusive lock on the source and target collections for the duration of the operation. All subsequent operations on the collections must wait until renameCollection() completes. Prior to MongoDB 4.2, renaming a collection within the same database with renameCollection required obtaining an exclusive database lock.
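Since a rename can kill cursors that are mid-read, a reader may want to simply re-run the query. Below is a minimal sketch of that idea, not a definitive implementation. The helper name find_with_retry and its parameters are hypothetical. The collection argument is assumed to behave like a PyMongo collection, and retryable_errors is supplied by the caller (with PyMongo this would presumably be pymongo.errors.CursorNotFound or the broader OperationFailure; that choice is an assumption).

```python
import time


def find_with_retry(collection, query, retryable_errors, attempts=3, delay_seconds=0.1):
    """Run collection.find(query) and read all results, retrying if the read is interrupted.

    retryable_errors is a tuple of exception classes chosen by the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            # list() forces the whole cursor to be consumed inside the try block,
            # so an error raised halfway through iteration is caught as well.
            return list(collection.find(query))
        except retryable_errors:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay_seconds)
```

Re-running the whole query is only safe when the reader can tolerate starting over; results collected before the interruption are discarded.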
Related
I want to convert the MongoDB local oplog into actual queries, so that I can execute those queries and get an exact copy of the database.
Is there any package, file, built-in tool, or script for this?
It's not possible to get the exact query from the oplog entry because MongoDB doesn't save the query.
The oplog has an entry for each atomic modification performed. Multi-inserts/updates/deletes performed on the mongo instance using a single query are converted to multiple entries and written to the oplog collection. For example, if we insert 10,000 documents using Bulk.insert(), 10,000 new entries will be created in the oplog collection. Now the same can also be done by firing 10,000 Collection.insertOne() queries. The oplog entries would look identical! There is no way to tell which one actually happened.
Sorry, but that is impossible.
The reason is that the oplog doesn't contain queries. It includes only changes to data (insert, update, delete), and it exists for replication and redo.
Getting an exact copy of a database is called "replication", and that is of course supported by the system.
To "replicate" changes to, for example, one database or collection, you can use change streams: https://www.mongodb.com/docs/manual/changeStreams/.
You can reconstruct equivalent write operations from the oplog. The oplog defines multiple op types, for instance op: "i", "u", "d" for insert, update, and delete. For these types, check the "o"/"o2" fields, which hold the corresponding data and filters.
Then, based on the op type, call the corresponding driver API, for example db.collection.insertOne()/updateOne()/deleteOne().
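As a rough illustration of that mapping, here is a sketch that turns one oplog entry into a description of the driver call to make. The function name oplog_entry_to_operation is hypothetical, and the field layout is an assumption: "o" is taken to be the document for inserts and the filter for deletes, and "o2" the filter with "o" the update for updates. Newer server versions may record updates in a delta format that cannot be passed through unchanged, which this sketch does not handle.

```python
def oplog_entry_to_operation(entry):
    """Map one oplog entry to (namespace, method_name, args), or None if not a data change.

    The method names follow PyMongo's collection methods; nothing is executed here.
    """
    op = entry.get("op")
    ns = entry.get("ns")
    if op == "i":
        # Insert: "o" is assumed to hold the full inserted document.
        return ns, "insert_one", (entry["o"],)
    if op == "u":
        # Update: "o2" is assumed to hold the filter, "o" the update or replacement.
        update = entry["o"]
        has_operators = any(key.startswith("$") for key in update)
        method = "update_one" if has_operators else "replace_one"
        return ns, method, (entry["o2"], update)
    if op == "d":
        # Delete: "o" is assumed to hold the filter of the removed document.
        return ns, "delete_one", (entry["o"],)
    return None  # no-ops, commands, and anything else are skipped
```

This recovers the effect of each change, not the query the application originally sent, which is consistent with the other answers.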
In my app, I am doing the following with MongoDB:
1. Start a MongoDB session and start a transaction
2. Read a document
3. Do some calculations based on values in the document and some other arguments
4. Update the document that was read in step 2 with the results of the calculations in step 3
5. Commit the transaction and end the session
The above procedure is executed with retries on TransientTransactionError, so if the transaction fails due to a concurrency issue, the procedure is retried.
If two concurrent invocations of the procedure both read the document before either of them writes to it, I need exactly one invocation to write successfully and the other to fail. If this doesn't happen, I don't get the result I am trying to achieve.
Can I expect MongoDB to fail one invocation in this scenario, so that the procedure is retried against the updated state of the document?
MongoDB multi-document transactions are atomic (i.e. provide an “all-or-nothing” proposition). When a transaction commits, all data changes made in the transaction are saved and visible outside the transaction. That is, a transaction will not commit some of its changes while rolling back others.
This is also elaborated further in In-progress Transactions and Write Conflicts:
If a transaction is in progress and a write outside the transaction
modifies a document that an operation in the transaction later tries
to modify, the transaction aborts because of a write conflict.
If a transaction is in progress and has taken a lock to modify a
document, when a write outside the transaction tries to modify the
same document, the write waits until the transaction ends.
See also the Write Conflicts section of the video "How and When to Use Multi-Document Transactions" to understand multi-document transactions in more depth (write locks, etc.).
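The read, compute, write procedure from the question can be sketched with PyMongo's session.with_transaction(), which re-runs the callback when the transaction fails with a TransientTransactionError. This is a minimal sketch under assumptions: the function name apply_calculation and the compute callback are hypothetical, the client and collection are passed in by the caller, and the deployment is assumed to support transactions (a replica set or sharded cluster).

```python
def apply_calculation(client, collection, doc_id, compute):
    """Read a document, compute new field values, and write them in one transaction.

    compute(doc) must return a dict of fields to set. It may run more than once,
    because with_transaction() retries the callback on transient errors.
    """
    def run_in_transaction(session):
        # Step 2: read the document inside the transaction.
        doc = collection.find_one({"_id": doc_id}, session=session)
        # Step 3: derive the new values from what was read.
        new_fields = compute(doc)
        # Step 4: write back; a conflicting concurrent write aborts this transaction.
        collection.update_one({"_id": doc_id}, {"$set": new_fields}, session=session)
        return new_fields

    # Steps 1 and 5: the session is ended on exit, and with_transaction() starts
    # and commits the transaction around the callback.
    with client.start_session() as session:
        return session.with_transaction(run_in_transaction)
```

Because the callback can be retried, compute should be free of side effects outside the database.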
If both transactions write to the same document that they read, then yes, one will roll back. But make sure that your writes actually change the document, as MongoDB is smart enough not to update if nothing has changed.
This prevents lost updates.
Please see the source: https://www.mongodb.com/blog/post/how-to-select--for-update-inside-mongodb-transactions
In fact, I have the same implementation in one of my projects and it works as expected, although there I read multiple documents. In your specific example, that is not the case.
Even if you did not have transactions, you could use findAndModify with an appropriate query part (such as the example for update operation here: https://www.mongodb.com/docs/manual/core/write-operations-atomicity/) to guarantee the behavior you expect.
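A sketch of that non-transactional approach, assuming a PyMongo-style collection: the filter repeats the value that was read, so the write only applies if nobody changed it in the meantime. The function name update_if_unchanged and its parameters are hypothetical.

```python
def update_if_unchanged(collection, doc_id, field, compute, max_attempts=5):
    """Set doc[field] = compute(old value), but only if the field still has the value we read.

    Returns the new value, or None if the document does not exist.
    """
    for _ in range(max_attempts):
        doc = collection.find_one({"_id": doc_id})
        if doc is None:
            return None
        old_value = doc[field]
        new_value = compute(old_value)
        # The filter includes the value we read; if a concurrent writer changed it,
        # nothing matches and find_one_and_update returns None.
        matched = collection.find_one_and_update(
            {"_id": doc_id, field: old_value},
            {"$set": {field: new_value}},
        )
        if matched is not None:
            return new_value
        # Lost the race: loop to re-read the updated document and recompute.
    raise RuntimeError("document kept changing; giving up")
```

Comparing on the field value cannot detect a value that changed and then changed back; a dedicated version counter in the filter avoids that, at the cost of an extra field.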
I have 100+ worker threads that poll the database, looking for a new job.
To take a job, a thread needs to change the status of a bunch of documents from NEW to IN_PROGRESS, so that no other thread can pick the same job.
This can be solved perfectly well in PostgreSQL with a SELECT ... WHERE status = 'NEW' FOR UPDATE SKIP LOCKED statement.
Is there a way to do such an atomic update in MongoDB for a single document? For a batch?
There's a findAndModify method, which works exactly as you've described for a single document.
For a batch, a single write operation cannot do it, as
In MongoDB, write operations, e.g. db.collection.update(), db.collection.findAndModify(), db.collection.remove(), are atomic on the level of a single document.
Since MongoDB 4.0, multi-document transactions make an atomic batch update possible.
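For the single-document case, a minimal sketch using PyMongo's find_one_and_update (the driver counterpart of findAndModify) might look like this. The function name claim_next_job and the "worker" field are hypothetical; the "status" values come from the question.

```python
def claim_next_job(jobs, worker_id):
    """Atomically flip one NEW job to IN_PROGRESS and return it, or None if none is left.

    The match and the update happen as one atomic operation on the document,
    so two workers cannot both claim the same job.
    """
    return jobs.find_one_and_update(
        {"status": "NEW"},
        {"$set": {"status": "IN_PROGRESS", "worker": worker_id}},
    )
```

By default the returned document is the version from before the update, so its status still reads NEW; a worker that gets None back simply polls again later.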
I'd like to "rebuild" my collection atomically, which means delete all existing documents and populate it from scratch.
The thing is, since transactions are not supported, there is a small window during which the collection is empty, which is what I want to avoid.
Is there a way to perform this in an atomic manner, so that there is no point at which the collection is empty?
You can build a new collection under a different name and then use the renameCollection command to rename the new collection over the existing one, dropping the existing collection in the same step (using the dropTarget=True option).
There are several caveats though:
The command will invalidate open cursors which interrupts queries that are currently returning data.
renameCollection blocks all database activity for the duration of the operation.
renameCollection is not compatible with sharded collections.
If the renameCollection operation does not complete, the target collection and indexes will not be usable and will require manual intervention to clean up.
You can find more info in the official docs.
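The build-then-rename approach could be sketched as follows with a PyMongo-style database handle. The function name rebuild_collection and the temp_suffix parameter are hypothetical. Indexes are not copied by this sketch and would need to be created on the temporary collection before the rename.

```python
def rebuild_collection(db, name, documents, temp_suffix="_rebuild"):
    """Replace the contents of db[name] by loading a temporary collection and renaming it.

    Readers see either the old or the new contents, never an empty collection.
    """
    if not documents:
        # An empty insert would leave no temporary collection to rename.
        raise ValueError("documents must be a non-empty list")
    temp = db[name + temp_suffix]
    temp.drop()  # clear leftovers from an earlier failed run
    temp.insert_many(documents)
    # dropTarget=True drops the existing target as part of the rename.
    temp.rename(name, dropTarget=True)
```

The caveats listed above still apply, in particular the cursor invalidation and the restriction on sharded collections.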
I am learning about MongoDB. If I create a bulk write, is it all or nothing, like a transaction? I have a scenario where my users can remove people from their friends.
FRIEND 1 | FRIEND 2
User B   | User A
User A   | User B
For this to happen I need to delete both directions of the relationship. For consistency I need these deletes to be all or nothing, because if only 1 of the 2 operations succeeded I would be left with bad data. Reading the docs I could not find the answer:
https://docs.mongodb.org/manual/core/bulk-write-operations/
db.collection.initializeOrderedBulkOp()
"If an error occurs during the processing of one of the write operations, MongoDB will return without processing any remaining write operations in the list."
There is no mention of rolling back the operations already applied; it simply stops processing the remaining ones.
db.collection.insert() method
"The insert() method, when passed an array of documents, performs a bulk insert, and inserts each document atomically."
You can roll your own rollback, but use an acknowledged write concern, which has to be set through your chosen driver. The shell uses acknowledged writes, but perhaps your driver does not.
https://docs.mongodb.org/manual/core/write-concern/
try
    insert 1
catch
    delete 1
try
    insert 2
catch
    delete 1
    delete 2
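Applied to the friend-removal case from the question, the same hand-rolled compensation idea might look like the sketch below. The function name remove_friendship and the field names friend1/friend2 are hypothetical, and the collection is assumed to behave like a PyMongo collection using an acknowledged write concern. This is not a real rollback: other readers can observe the state between the two deletes.

```python
def remove_friendship(friends, user_a, user_b):
    """Delete both directions of a friendship, restoring the first row if the second delete fails."""
    first = {"friend1": user_a, "friend2": user_b}
    second = {"friend1": user_b, "friend2": user_a}

    first_result = friends.delete_one(first)
    try:
        friends.delete_one(second)
    except Exception:
        # Compensate: put the first row back only if we actually removed it.
        if first_result.deleted_count == 1:
            friends.insert_one(dict(first))
        raise
```

If the compensating insert itself fails, the data is left half-deleted, which is the limitation of this approach compared with a multi-document transaction.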