How to manage concurrency in MongoDB?

I am new to MongoDB, and for one of my applications I am planning to move a portion of it to MongoDB, where we need to handle optimistic concurrency. What are the best practices available with MongoDB?
Is MongoDB the right choice for an application that requires concurrency?

Yes, MongoDB is a good choice for an application that requires concurrency.
Locking in MongoDB is different from locking in an RDBMS.
MongoDB uses multi-granularity locking (see WiredTiger) that allows operations to lock at the global, database or collection level, and allows individual storage engines to implement their own concurrency control below the collection level (e.g., at the document level in WiredTiger).
MongoDB uses reader-writer locks that allow concurrent readers shared access to a resource, such as a database or collection, but in MMAPv1, give exclusive access to a single write operation.
WiredTiger uses optimistic concurrency control. WiredTiger uses only intent locks at the global, database and collection levels. When the storage engine detects conflicts between two operations, one will incur a write conflict causing MongoDB to transparently retry that operation.
With MMAPv1, MongoDB has a reader/writer latch for each database.
The latch is multiple-reader, single-writer, and it is writer-greedy, so we can have an unlimited number of simultaneous readers on a database,
but there can be only one writer at a time on any collection in any one database.
"Writer-greedy" means writes are given priority: when a write request arrives, incoming read requests are blocked until the write is completed.
It is called a latch rather than a lock because it is lighter-weight and is typically held for only microseconds.
MongoDB is therefore capable of serving many simultaneous queries.
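For optimistic concurrency at the application level, a common pattern (not a built-in MongoDB feature) is a version field used as a compare-and-swap. Here is a minimal mongosh sketch; the items collection and the version field are assumptions for illustration:

// Minimal optimistic-locking sketch (mongosh). The "items" collection and
// "version" field are hypothetical, not MongoDB built-ins.
const doc = db.items.findOne({ _id: 1 });

// Only apply the update if no one else bumped the version since our read.
const res = db.items.updateOne(
  { _id: 1, version: doc.version },                    // compare...
  { $set: { name: "updated" }, $inc: { version: 1 } }  // ...and swap
);

if (res.modifiedCount === 0) {
  // A concurrent writer won the race: re-read the document and retry.
  print("Write conflict detected; retry");
}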
Hope it helps!
References
https://docs.mongodb.com/manual/faq/concurrency/
https://docs.mongodb.com/manual/reference/command/findAndModify/

Related

MongoDB FindAndModify with concurrent requests

I understand that this is atomic.
Does Mongo throw a lock exception if there are concurrent requests?
Stumbled into this question while working on MongoDB upgrades. Unlike at the time this question was asked, MongoDB now supports document-level locking out of the box.
From: http://docs.mongodb.org/manual/faq/concurrency/
"How granular are locking in MongoDB?
Changed in version 3.0.
Beginning with version 3.0, MongoDB ships with the WiredTiger storage engine, which uses optimistic concurrency control for most read and write operations. WiredTiger uses only intent locks at the global, database, and collection levels. When the storage engine detects conflicts between two operations, one will incur a write conflict causing MongoDB to transparently retry that operation."
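In other words, no lock exception is thrown; concurrent findAndModify calls on the same document are simply serialized. As a hedged illustration (the counters collection and field names are made up), two clients can run this at once and each gets a distinct value:

// Sketch: safe under concurrency, no lock exception. Conflicting writes to the
// same document are serialized (and transparently retried under WiredTiger).
const doc = db.counters.findOneAndUpdate(
  { _id: "ticket" },
  { $inc: { seq: 1 } },
  { upsert: true, returnNewDocument: true }  // create on first use, return the updated doc
);
print("Got sequence number " + doc.seq);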

Can we perform concurrent read and write operations in MongoDB?

I have a query regarding MongoDB. I read the documentation but did not get a clear idea, so I am asking here. I have one collection on which I need to perform 24/7 read and write operations from different sources. So the question is: may we perform concurrent read and write operations on the same collection at the same time? If not, what is the alternative, and what is the main reason behind that?
I have some Python crons that perform read/write operations on these collections, and at the same time I have a Node.js backend API that performs read/write operations on the same collection, so will this cause any issue? Currently this all runs on the MySQL side, but now, per client requirement, I need to move from MySQL to MongoDB. So please help me get clear about this problem.
Read FAQ: Concurrency
MongoDB uses reader-writer locks that allow concurrent readers shared
access to a resource, such as a database or collection.
In addition to a shared (S) locking mode for reads and an exclusive
(X) locking mode for write operations, intent shared (IS) and intent
exclusive (IX) modes indicate an intent to read or write a resource
using a finer granularity lock. When locking at a certain granularity,
all higher levels are locked using an intent lock.
And regarding the WiredTiger engine:
For most read and write operations, WiredTiger uses optimistic
concurrency control. WiredTiger uses only intent locks at the global,
database and collection levels.
The current default MongoDB storage engine is WiredTiger, so if you use it, you're OK. To check which engine you're running, execute db.serverStatus().storageEngine, e.g.:
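// In mongosh; the exact output fields vary by server version, so treat this as a sketch.
db.serverStatus().storageEngine
// { name: 'wiredTiger', supportsCommittedReads: true, ... }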

Spring Data MongoDB Concurrent Updates Behavior

Imagine there's a document containing a single field: {availableSpots: 100}
and there are millions of users racing to get a spot by sending a request to an API server.
Each time a request comes in, the server reads the document and, if availableSpots is > 0, decrements it by 1 and creates a booking in another collection.
Now I read that MongoDB locks the document whenever an update operation is performed.
What will happen if there are a million concurrent requests? Will it take a long time because the same document keeps getting locked? Also, the server reads the value of the document before it tries to update it, so by the time it acquires the lock, the spot may not be available anymore.
It is also possible that multiple threads see "availableSpots > 0" as true at the same instant, but in reality there may not be enough spots for all the requests. How do I deal with this?
The most important things here are atomicity and concurrency.
1. Atomicity
Your operation to update (decrement by one) if availableSpots > 0:
db.collection.updateOne({ availableSpots: { $gt: 0 } }, { $inc: { availableSpots: -1 } })
is atomic.
$inc is an atomic operation within a single document.
Refer : https://docs.mongodb.com/manual/reference/operator/update/inc/
2. Concurrency
MongoDB has document-level concurrency control for write operations, so each update will take a lock on the document.
Now your questions:
What will happen if there are a million concurrent requests?
Yes, each update will be performed one by one (due to locking), so throughput will slow down.
the server reads the value of the document before it tries to update the
document, and by the time it acquires the lock, the spot may not be
available anymore.
Since the operation is atomic, this will not happen. It will work as you want: only 100 updates will actually modify the document (each reporting a modified count of 1); once availableSpots reaches 0, the filter no longer matches and subsequent updates modify nothing.
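To make that concrete, here is a hedged sketch (the spots and bookings collection names and the _id/user values are assumptions) of checking the update result before creating the booking:

// Sketch: atomically claim a spot, then book only if the claim succeeded.
// Without a transaction there is a small window where the process could crash
// between the two writes; see the multi-document transaction answer below.
const res = db.spots.updateOne(
  { _id: "event-1", availableSpots: { $gt: 0 } },
  { $inc: { availableSpots: -1 } }
);

if (res.modifiedCount === 1) {
  db.bookings.insertOne({ eventId: "event-1", userId: "u42", at: new Date() });
} else {
  print("Sold out: no spot was claimed");
}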
MongoDB uses WiredTiger as its default storage engine starting with version 3.2.
WiredTiger provides document-level concurrency:
From docs:
WiredTiger uses document-level concurrency control for write
operations. As a result, multiple clients can modify different
documents of a collection at the same time.
For most read and write operations, WiredTiger uses optimistic
concurrency control. WiredTiger uses only intent locks at the global,
database and collection levels. When the storage engine detects
conflicts between two operations, one will incur a write conflict
causing MongoDB to transparently retry that operation.
When multiple clients are trying to update a value in a document, only that document will be locked, not the entire collection.
My understanding is that you are concerned about the performance of many concurrent ACID-compliant transactions against two separate collections:
a collection (let us call it spots) with one document {availableSpots: 999..}
another collection (let us call it bookings) with multiple documents, one per booking.
Now I read that MongoDB locks the document whenever an update operation is performed.
It is also possible that multiple threads see "availableSpots > 0" as true at
the same instant, but in reality there may not be enough spots for all the
requests. How do I deal with this?
With version 4.0, MongoDB provides the ability to perform multi-document transactions against replica sets. (The forthcoming MongoDB 4.2 will extend this multi-document ACID transaction capability to sharded clusters.)
This means that no write operations within a multi-document transaction (such as updates to both the spots and bookings collections, per your proposed approach) are visible outside the transaction until the transaction commits.
Nevertheless, as noted in the MongoDB documentation on transactions, a denormalized approach will usually provide better performance than multi-document transactions:
In most cases, multi-document transaction incurs a greater performance
cost over single document writes, and the availability of
multi-document transaction should not be a replacement for effective
schema design. For many scenarios, the denormalized data model
(embedded documents and arrays) will continue to be optimal for your
data and use cases. That is, for many scenarios, modeling your data
appropriately will minimize the need for multi-document transactions.
In MongoDB, an operation on a single document is atomic. Because you can use embedded documents and arrays to capture relationships between data in a single document structure instead of normalizing across multiple documents and collections, this single-document atomicity obviates the need for multi-document transactions for many practical use cases.
But do bear in mind that your use case, if implemented within one collection as a single denormalized document containing one availableSpots sub-document and many thousands of bookings sub-documents, may not be feasible as the maximum document size is 16MB.
So, in conclusion, a denormalized approach to write atomicity will usually perform better than a multi-document approach, but is constrained by the maximum document size of 16MB.
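If you do take the multi-document route, a hedged mongosh sketch (MongoDB 4.0+ on a replica set; the database, collection, and document names are assumptions) looks like this:

// Sketch: decrement a spot and record the booking in one ACID transaction.
const session = db.getMongo().startSession();
const spots = session.getDatabase("test").spots;
const bookings = session.getDatabase("test").bookings;

session.startTransaction();
try {
  const res = spots.updateOne(
    { _id: "event-1", availableSpots: { $gt: 0 } },
    { $inc: { availableSpots: -1 } }
  );
  if (res.modifiedCount !== 1) {
    throw new Error("sold out");     // nothing to book, roll back
  }
  bookings.insertOne({ eventId: "event-1", userId: "u42", at: new Date() });
  session.commitTransaction();       // both writes become visible together
} catch (e) {
  session.abortTransaction();        // neither write is visible
  throw e;
} finally {
  session.endSession();
}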
You can try using the findAndModify() option when updating the document. Each time, you will need to cherry-pick whichever fields you want to update in that particular document. Also, since MongoDB replicates data to primary and secondary nodes, you may want to adjust your WriteConcern values as well. You can read more about this in the official documentation. I have something similar coded that handles this kind of concurrency issue in MongoDB using Spring's MongoTemplate; let me know if you want a Java reference for that.
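A minimal sketch of that suggestion in mongosh (findOneAndUpdate is the modern shell helper over the findAndModify command; the collection/field names and the chosen write concern are assumptions):

// Sketch: atomically claim a spot and get the updated document back, with an
// explicit write concern so the write is acknowledged by a replica-set majority.
const doc = db.spots.findOneAndUpdate(
  { availableSpots: { $gt: 0 } },
  { $inc: { availableSpots: -1 } },
  { returnNewDocument: true, writeConcern: { w: "majority" } }
);

if (doc === null) {
  print("No spots left");
} else {
  print("Claimed a spot; " + doc.availableSpots + " remaining");
}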

How efficient is MongoDB ISOLATION?

I came across a document saying that MongoDB maintains a global write lock, and I wanted to know how well it supports the isolation part of "ACID" compared to a SQL database.
I came across a document saying that MongoDB maintains a global write lock
That's old information; MongoDB is now on a database-level lock. Collection-level locking may come sometime in the future, but it has been pushed back in favour of other concurrency work.
wanted to know how well it supports the isolation part of "ACID" compared to a SQL database
First things first: MongoDB IS NOT AN ACID DATABASE. If you want ACID, you should go with an ACID-compliant database. Don't try to make a database do what it isn't designed to do.
As for actual isolation, currently MongoDB has isolation at the single-document level with atomic operations such as $inc, $set, $unset and all the others. Isolation does not span multiple documents; there is an $isolated ( http://docs.mongodb.org/manual/reference/operator/isolated/ ) operator, but it is highly recommended not to use it, and it isn't supported on sharded collections.
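For instance, several of those atomic operators can be combined in one update, and the whole set is applied atomically to the single matched document (the collection and field names below are made up for illustration):

// Sketch: $inc, $set and $unset applied together; the document is never
// observable in a half-updated state.
db.accounts.updateOne(
  { _id: "a1" },
  {
    $inc: { balance: -25 },          // atomic decrement
    $set: { lastDebit: new Date() }, // stamp the time in the same operation
    $unset: { pendingHold: "" }      // and remove a field, all-or-nothing
  }
);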
There is also a documentation page on providing isolation levels: http://docs.mongodb.org/manual/tutorial/isolate-sequence-of-operations/ but only findAndModify and indexing can provide some element of isolation whereby other queries will not interfere.
Fundamentally, even if it had atomic operations on multiple documents, MongoDB could not normally support isolation across many documents: one of its main concurrency features is the ability to yield operations that must fault data in from disk in favour of operations whose data is already in memory.
And so I come back to my original point: if you want ACID, go with ACID tech.

Creating a different database for each collection in MongoDB 2.2

MongoDB 2.2 has a write lock per database, as opposed to a global write lock on the server in previous versions. So would it be OK if I store each collection in a separate database, to effectively have a write lock per collection? (This would make it look like MyISAM's table-level locking.) Is this approach faulty?
There's a key limitation to the locking, and that is the local database. That database includes the oplog collection, which is used for replication.
If you're running in production, you should be running with Replica Sets. If you're running with Replica Sets, you need to be aware of the write lock effect on that database.
Breaking out your 10 collections into 10 DBs is useless if they all block waiting for the oplog.
Before taking a large step to re-write, please ensure that the oplog will not cause issues.
Also, be aware that MongoDB implements DB-level security. If you're using any security features, you are now creating more DBs to secure.
Yes, that will work; 10gen actually offers this as an option in their talks on locking.
I probably wouldn't isolate every collection, though. Most databases seem to have 2-5 high-activity collections. For the sake of simplicity it's probably better to keep the low-activity collections grouped in one DB and put the high-activity collections in their own databases.
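As a sketch of what that layout looks like from the shell (database and collection names are made up):

// Sketch: high-activity collections get their own databases (and thus their
// own database-level write locks in 2.2); low-activity ones share one database.
const eventsDB = db.getSiblingDB("app_events"); // high activity
const logsDB   = db.getSiblingDB("app_logs");   // high activity
const mainDB   = db.getSiblingDB("app");        // everything low activity

eventsDB.events.insert({ type: "click", at: new Date() });
logsDB.logs.insert({ level: "info", msg: "started" });
mainDB.settings.update({ _id: "site" }, { $set: { theme: "dark" } }, { upsert: true });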