MongoDB FindAndModify with concurrent requests

I understand that this is atomic.
Does Mongo throw a lock exception if there are concurrent requests?

Stumbled into this question while working on MongoDB upgrades. Unlike at the time this question was asked, MongoDB now supports document-level locking out of the box.
From: http://docs.mongodb.org/manual/faq/concurrency/
"How granular are locking in MongoDB?
Changed in version 3.0.
Beginning with version 3.0, MongoDB ships with the WiredTiger storage engine, which uses optimistic concurrency control for most read and write operations. WiredTiger uses only intent locks at the global, database, and collection levels. When the storage engine detects conflicts between two operations, one will incur a write conflict causing MongoDB to transparently retry that operation."
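To see what this means in practice, here is a minimal mongosh sketch (the jobs collection and its fields are illustrative, not from the question). Two clients running the findAndModify concurrently will never both claim the same document; the losing caller simply gets back null rather than a lock exception, and any WiredTiger write conflict is retried transparently by the server:

    // Seed one unclaimed job (collection and field names are illustrative).
    db.jobs.insertOne({ _id: 1, status: "pending" });

    // Atomically find an unclaimed job and claim it in the same step.
    const job = db.jobs.findAndModify({
      query:  { _id: 1, status: "pending" },
      update: { $set: { status: "running" } },
      new: true   // return the post-update document
    });

    // A second concurrent caller gets null here, not an exception.
    printjson(job);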

Related

Can we perform concurrent read and write operation in mongodb?

I have a query regarding MongoDB. I read the documentation but did not get a clear idea, so I am asking here. I have one collection on which I need to perform 24/7 read and write operations from different sources. So the question is: may we perform concurrent read and write operations on the same collection at the same time? If not, what is the alternative, and what is the main reason behind it?
I have some Python crons which perform R/W operations on collections, and at the same time I have a Node.js backend API that performs R/W operations on the same collection, so will it cause any issue? Currently this is all running on MySQL, but now, as per client requirement, I need to move from MySQL to MongoDB. So please help me get clear about this problem.
Read FAQ: Concurrency
MongoDB uses reader-writer locks that allow concurrent readers shared
access to a resource, such as a database or collection.
In addition to a shared (S) locking mode for reads and an exclusive
(X) locking mode for write operations, intent shared (IS) and intent
exclusive (IX) modes indicate an intent to read or write a resource
using a finer granularity lock. When locking at a certain granularity,
all higher levels are locked using an intent lock.
And regarding the WiredTiger engine:
For most read and write operations, WiredTiger uses optimistic
concurrency control. WiredTiger uses only intent locks at the global,
database and collection levels.
The current default MongoDB engine is WiredTiger, so if you use it you're OK. To check the engine, execute db.serverStatus().storageEngine.
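For example, from mongosh (the output shape below is abbreviated):

    // Check which storage engine the server is running.
    const engine = db.serverStatus().storageEngine;
    printjson(engine);   // e.g. { name: "wiredTiger", ... } on modern defaults

    // With WiredTiger, writes lock at the document level, so concurrent
    // reads and writes on the same collection are fine.
    if (engine.name === "wiredTiger") {
      print("document-level concurrency: concurrent R/W on one collection is OK");
    }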

How to manage concurrency in MongoDB?

I am new to the MongoDB database, and for one of my applications I am planning to move some portion of it to MongoDB, where we need to handle optimistic concurrency. What are the best practices available with MongoDB?
Is MongoDB the right choice for an application which requires concurrency?
Yes, MongoDB would be a right choice for concurrency.
MongoDB locking is different from the locking in an RDBMS.
MongoDB uses multi-granularity locking (see WiredTiger) that allows operations to lock at the global, database, or collection level, and allows individual storage engines to implement their own concurrency control below the collection level (e.g., at the document level in WiredTiger).
MongoDB uses reader-writer locks that allow concurrent readers shared access to a resource, such as a database or collection, but in MMAPv1, give exclusive access to a single write operation.
WiredTiger uses optimistic concurrency control. WiredTiger uses only intent locks at the global, database and collection levels. When the storage engine detects conflicts between two operations, one will incur a write conflict causing MongoDB to transparently retry that operation.
MongoDB has a reader/writer latch for each database.
The latch is multiple-reader, single-writer, and it is writer-greedy, so we can have an unlimited number of simultaneous readers on a database, but there can be only one writer at a time on any collection in any one database.
"Writer-greedy" means writes get priority: when a write request arrives, all incoming read requests are blocked until the write is completed.
It is called a latch rather than a lock because it is lighter-weight and does its job within microseconds.
MongoDB is capable of running many simultaneous queries.
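Since the question asks about optimistic concurrency specifically, a common application-level pattern is a compare-and-swap on a version field. Here is a minimal mongosh sketch (collection and field names are illustrative):

    // Read the current state, remembering the version we saw.
    const doc = db.accounts.findOne({ _id: 1 });

    // Update only if the version is still the one we read.
    const res = db.accounts.updateOne(
      { _id: 1, version: doc.version },        // compare
      { $set: { balance: doc.balance + 100 },  // swap in our change
        $inc: { version: 1 } }                 // bump version for the next writer
    );

    if (res.modifiedCount === 0) {
      // Another writer got in first: re-read the document and retry.
      print("write conflict, retrying");
    }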
Hope it Helps!!
References
https://docs.mongodb.com/manual/faq/concurrency/
https://docs.mongodb.com/manual/reference/command/findAndModify/

How to choose from MMAPV1, WiredTiger or In-Memory StorageEngine for MongoDB?

In the MongoDB 3.2 documentation I saw that they support 3 storage engines
(MMAPv1, WiredTiger, In-Memory), and it is very confusing which one to choose.
I have the sensation from the description that WiredTiger is better than MMAPv1, but other sources say that MMAPv1 is better for heavy reads... and WiredTiger for heavy writes...
Are there some constraints on when to choose one over the other?
Can someone suggest some best practices, for example:
when I have this type of application, this engine is usually best; otherwise choose another...
This is from personal experience; however, please have a look at this blog entry, as it explains the different types of engines very well:
Mongo Blog v3
Comparing the MongoDB WiredTiger and MMAPv1 storage engines. Higher Performance & Efficiency: Between 7x and 10x Greater Write Performance.
MongoDB 3.0 provides more granular document-level concurrency control, delivering between 7x and 10x greater throughput for most write-intensive applications, while maintaining predictable low latency.
For me the choice was very simple: I needed document-level locks, which makes WiredTiger the ideal choice, and we don't have the Enterprise version of Mongo, hence the in-memory engine is not available. MMAPv1's B-tree is a very basic technique for mapping memory to the hard drive and is not very efficient.
The MMAP storage engine uses a process called “record allocation” to grab disk space for document storage. All records are contiguously located on disk, and when a document becomes larger than the allocated record, it must allocate a new record. New allocations require moving a document and updating all indexes that refer to the document, which takes more time than in-place updates and leads to storage fragmentation. Furthermore, MMAPv1 in its current iterations usually leads to high space utilization on your filesystem due to over-allocation of record space and its lack of support for compression.
As mentioned previously, a storage engine’s locking scheme is one of the most important factors in overall database performance. MMAPv1 has collection-level locking – meaning only one insert, update or delete operation can use a collection at a time. This type of locking scheme creates a very common scenario in concurrent workloads, where update/delete/insert operations are always waiting for the operation(s) in front of them to complete. Furthermore, oftentimes those operations are flowing in more quickly than they can be completed in serial fashion by the storage engine. To put it in context, imagine a giant supermarket on Sunday afternoon that only has one checkout line open: plenty of customers, but low throughput!
Everyone has different requirements, but for most cases WiredTiger would be the ideal choice: the fact that it performs atomic operations at the document level rather than the collection level is a great advantage; you simply can't beat that.
More reads and not a lot of writes
If reading is your main concern, here is one way to address that.
You can tweak the Mongo driver's read preference modes in the following way:
Set up a replica set, say 1 primary and 3 secondaries.
Set the write concern to majority; this makes writes a bit slower (a trade-off).
Set the read preference to secondary.
This setup will perform very well when you have a lot of reads, but as a trade-off writes will be slower. However, read throughput will be great.
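As a sketch, the whole setup above can be expressed in the connection string (host names and the replica set name here are illustrative):

    // Route reads to secondaries; wait for a majority ack on writes.
    const conn = new Mongo(
      "mongodb://host1,host2,host3,host4/?replicaSet=rs0" +
      "&readPreference=secondary" +   // reads served by secondaries
      "&w=majority"                   // slower writes, safer replication
    );
    const coll = conn.getDB("mydb").getCollection("mycoll");
    coll.find().limit(1);             // this read goes to a secondary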
I hope this helps. If you have additional questions, add them as a comment and I will try to address them in this answer.
Also, you can check this MMAPv1 vs WiredTiger review and notice how the author changed his mind from MMAPv1 to WiredTiger. The selling point is document-level locking: performance you just can't beat.
For new projects, I use WiredTiger now. Since a migration from a compressed to an uncompressed WiredTiger storage is rather easy, I tend to start with compression to enhance the CPU utilization ("get more bang for the buck"). Should the compression have a noticeable impact on performance or UX, I migrate to uncompressed WiredTiger.
MongoDB database profiler
The best way of determining your database needs is to set up a test cluster and run your application on it with the MongoDB profiler.
Like most database profilers, the MongoDB profiler can be configured to only write profile information about queries that took longer than a given threshold. So once you know slow queries you can figure out if it reads vs writes or cpu vs ram and go from there.
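For instance, from mongosh (the 100 ms threshold is illustrative; pick one that fits your workload):

    // Level 1: profile only operations slower than slowms.
    db.setProfilingLevel(1, { slowms: 100 });

    // ...run your application against the test cluster, then inspect:
    db.system.profile.find({ millis: { $gt: 100 } })
      .sort({ ts: -1 })
      .limit(5)
      .pretty();
    // Each entry records the op type (query/insert/update), namespace and
    // timing, so you can tell read-heavy hot spots from write-heavy ones.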
You should use a replica set consisting of both in-memory and WiredTiger storage engines, and you should shard your MongoDB in such a way that the most frequently accessed data is served by the in-memory storage engine while the rest uses the WiredTiger storage engine.
After acquiring WiredTiger in 2014, MongoDB introduced this storage engine as their default storage engine from version 3.2. Thereafter, they themselves started to encourage users to use WiredTiger because of its following advantages over MMAPV1:
WiredTiger uses document-level concurrency whereas MMAPv1 uses collection-level locking. That means multiple users can write to a collection simultaneously using WiredTiger but not using MMAPv1.
Since WiredTiger manages its own memory, it can use compression, whereas MMAPv1 doesn't have any such feature.
WiredTiger doesn't allow any in-place updates. Thus, eventually it reclaims the space that is no longer used.
The only advantages of MMAPv1 over WiredTiger which I have found so far are:
WiredTiger is not available on the Solaris platform whereas MMAPv1 is.
Even while updating a big document with only a single element, WiredTiger re-writes the whole document, making it slower.
So you can always leave MMAPv1 out while choosing your storage engine. Now let's come to the in-memory storage engine. Starting in MongoDB Enterprise version 3.2.6, the in-memory storage engine is part of general availability (GA) in the 64-bit builds.
It has the following advantages over the other storage engines:
Similar to WiredTiger, the in-memory storage engine also allows document-level concurrency.
The in-memory storage engine is a lot faster than the others.
By avoiding disk I/O, the in-memory storage engine allows for more predictable latency of database operations.
But this storage engine has quite a few disadvantages as well:
The in-memory storage engine does not persist data after process shutdown.
If your dataset is way too large, then the in-memory engine is not a good option.
In-memory storage engine requires that all its data (including oplog if mongod is part of replica set, etc.) fit into the specified --inMemorySizeGB command-line option or storage.inMemory.engineConfig.inMemorySizeGB setting.
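As a sketch (the dbpath and the 4 GB cap below are illustrative; the engine itself requires the Enterprise build):

    // From your OS shell, start mongod with the in-memory engine:
    //   mongod --storageEngine inMemory --dbpath /var/lib/mongo --inMemorySizeGB 4
    // Then verify from mongosh:
    const engine = db.serverStatus().storageEngine;
    print(engine.name);   // "inMemory" if the option took effect
    // Remember: nothing stored this way survives a process restart.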
Check the MongoDB Manual for example Deployment Architectures using in-memory storage engine.

How Efficient is MongoDB ISOLATION

I came across a document saying that MongoDB maintains a global write lock, and I wanted to know how efficiently it supports the "Isolation" in "ACID", as a SQL database does.
I came across a document saying that MongoDB maintains a global write lock
That's old information; MongoDB now locks at the database level, with collection-level locking maybe coming sometime in the future; however, that has been put back in favour of concurrency work.
wanted to know how efficiently it supports the "Isolation" in "ACID", as a SQL database does
First things first: MongoDB IS NOT AN ACID DATABASE. If you want ACID, you should go with an ACID-compliant database. Don't try to make a database do what it isn't designed to do.
As for actual isolation, currently MongoDB has isolation at the single-document level with atomic operators such as $inc, $set, $unset and all the others. Isolation does not occur across multiple documents; there is an $isolated ( http://docs.mongodb.org/manual/reference/operator/isolated/ ) operator, but it is highly recommended not to use it, and it isn't supported on sharded collections.
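To illustrate the single-document guarantee, here is a minimal mongosh sketch (collection and field names are illustrative):

    // Seed an inventory document.
    db.stock.insertOne({ _id: "sku1", qty: 10, reserved: 0 });

    // Both $inc operations apply atomically to this one document: no reader
    // ever sees qty decremented without reserved incremented.
    db.stock.updateOne(
      { _id: "sku1", qty: { $gte: 1 } },   // guard: only if stock remains
      { $inc: { qty: -1, reserved: 1 } }
    );
    // No comparable guarantee spans multiple documents in this era of MongoDB.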
There is also a documentation page on providing isolation levels: http://docs.mongodb.org/manual/tutorial/isolate-sequence-of-operations/ but only findAndModify and indexing can provide some element of isolation whereby other queries will not interfere.
Fundamentally, even if it had atomic operations on multiple documents, MongoDB could not normally support isolation across many documents; this is due to one of its main concurrency features, the ability to set aside (yield) operations whose data is out of memory in favour of ones whose data is in memory.
And so I come back to my original point: if you want ACID, go to an ACID tech.

Why mongodb locks database instead of collection

I read some information in the MongoDB manual regarding locking of the database. It says that MongoDB implements some sort of reader-writer lock for multiple clients working with the database. It seems absolutely logical when we need to ensure data integrity.
My question is why mongodb locks databases instead of collections?
The feature simply isn't done yet. It's planned for 2.4+ (maybe 2.5?). Before 2.2, it was a global lock rather than a database-level lock.