How to achieve strong consistency in MongoDB Replica Sets? - mongodb

The MongoDB documentation mentions that in a replica set, even with majority readConcern, we only achieve eventual consistency. I am wondering how this is possible when majority reads and majority writes together form a quorum (R+W>N) in our distributed system. I would expect a strongly consistent system in this setting. This is the technique Cassandra uses as well in order to achieve strong consistency.
Can someone clarify this for me please?
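To make the R+W>N intuition from the question concrete, here is a minimal sketch. It is a toy model that treats replicas as plain node ids, not MongoDB's replication protocol, and the function name `quorums_always_overlap` is made up for illustration. It only checks that every write set shares at least one node with every read set; it does not model failovers or rollbacks.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """Return True if every write set of size w shares a node with every read set of size r."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


# Majority reads and writes on 5 nodes: 3 + 3 > 5.
print(quorums_always_overlap(5, 3, 3))
# Two out of five for both: 2 + 2 <= 5, so a read set can miss the latest write.
print(quorums_always_overlap(5, 2, 2))
```

The brute-force enumeration is only practical for small n, which is enough to see where the overlap argument holds and where it breaks.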

MongoDB is not regarded very highly when it comes to strong consistency. In a typical sharded and replicated setup, increasing consistency means trading off some of the database's performance. As you know, write operations can only be executed on the primary (master) of the replica set. By default, reads go only to the primary as well. As far as I know, this is the strongest consistency you can get from MongoDB, since the other nodes are used only for replication, failover and availability. You could read from the secondary nodes, but only for operations where having the latest data is not crucial, and for long-running operations such as aggregations.
If you set up sharding, you can offload a large portion of the read/write operations to different primary nodes. I think that, when it comes to MongoDB, that is all you can do to increase consistency and performance, particularly for larger data sets.

Related

MongoDB: Write concern w:[number>1] - why?

I'm trying to understand the advantages of using a write concern where w is greater than 1.
I understand the use cases for w:1, and w:primary. I'm trying to understand why someone would use other values. Let's use a 6-node (+arbiter) replica set as an example.
If it's guaranteed that writes with {w:majority, j:true} will survive the failure of the primary, what is the advantage of w:5?
Does w:6 help to achieve linearizable access to secondaries at the expense of availability, as a failure of any node will prevent writes? This is not documented as such.
Why would someone ever use w:2 or w:3? It doesn't guarantee the write will survive a failure of the primary.
Write concern is a mechanism for tuning availability, durability and performance.
In a w:1 environment, you essentially expect to lose data if the primary experiences any issue whatsoever.
In a w:2 environment, you still may lose data but you expect less loss because the data will be replicated. If you lose two nodes, and you are unlucky with where the second write went, you can lose the data.
In a w:3 environment, there is still potential for data loss but it is less than in the w:2 environment.
w:4 is majority write, a sensible default for most applications.
w:5 provides one "unnecessary" write. This means your writes will take longer than they would with w:majority but if you have elections, secondary nodes will generally be more up-to-date with the primary, so you would reduce election times.
w:6 writes to every node. If you are doing secondary reads and your nodes are geographically distributed, this is a way of getting the data to all of the nodes quicker at the expense of potential write unavailability and obviously longer writes.
Does w:6 help to achieve linearizable access to secondaries at the expense of availability as a failure of any node will prevent writes?
No, this is not one of the benefits you get from w=# of nodes.
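The arithmetic behind the w values above can be sketched as follows. This is only an illustration for the 6-node (+arbiter) example from the question. It assumes the arbiter counts as a voting member when computing the majority, which matches the statement above that w:4 is the majority write. The helper names are hypothetical.

```python
def majority(voting_members: int) -> int:
    """Smallest member count that is more than half of the voting members."""
    return voting_members // 2 + 1


def data_nodes_that_may_fail(data_nodes: int, w: int) -> int:
    """Data-bearing nodes that can be down while a write with this w can still be acknowledged."""
    return data_nodes - w


DATA_NODES = 6  # from the question
ARBITERS = 1    # votes in elections but holds no data (assumption stated above)

print("majority:", majority(DATA_NODES + ARBITERS))
for w in range(1, DATA_NODES + 1):
    print(f"w:{w} can still be acknowledged with {data_nodes_that_may_fail(DATA_NODES, w)} data node(s) down")
```

The last line of output shows the availability cost of w:6: no data-bearing node may be down for the write to be acknowledged.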

How to decide when to use replica sets for MongoDB in production

We are currently hosting MongoDB using its official Docker image on EC2 for our production environment; it's a 32 GB memory server dedicated to just this service.
How can replica sets help improve the performance of our MongoDB? We are currently seeing query responses get slower day by day.
Are there any measures by which we can determine that investing in a replica set will provide worthwhile benefits and will not be premature optimization?
MongoDB replication is a high availability solution (see note at the end of the post for more details on Replication). Replication is not a performance improvement solution.
MongoDB query performance depends upon various factors: size of collection, size of document, database design, query definition and indexes. Inadequate hardware (memory, hard drive, cpu and network) can affect the query performance. The number of operations at a given time can also affect the performance.
For faster query performance, the main consideration is using indexes. Indexes directly affect the query filter and sort operations. To find out whether your query is performing optimally and using the proper indexes, generate a query plan using explain with "executionStats" mode and study the plan. Explain can be run on MongoDB find, update, delete and aggregation queries, all of which can benefit from indexes. See Query Optimization.
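As a rough sketch of what "study the plan" can mean, the snippet below summarizes an explain document. The two sample documents are hand-written and heavily abbreviated for illustration; they are not real output. In practice the document would come from running explain in "executionStats" mode against your own query. The helper names `stages` and `summarize` are made up, and the sketch only follows single `inputStage` chains.

```python
def stages(plan: dict) -> list:
    """Collect stage names from a winning plan by following nested inputStage entries."""
    found = []
    while plan:
        found.append(plan.get("stage", "?"))
        plan = plan.get("inputStage")
    return found


def summarize(explain: dict) -> dict:
    """Pull out the numbers that usually show whether an index is being used well."""
    stats = explain["executionStats"]
    plan_stages = stages(explain["queryPlanner"]["winningPlan"])
    return {
        "stages": plan_stages,
        "collection_scan": "COLLSCAN" in plan_stages,
        # Documents read per document returned; close to 1 is what you hope for.
        "docs_examined_per_returned": stats["totalDocsExamined"] / max(stats["nReturned"], 1),
    }


# Hand-written, abbreviated sample documents (illustrative values only).
unindexed = {
    "queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}},
    "executionStats": {"nReturned": 3, "totalKeysExamined": 0, "totalDocsExamined": 10000},
}
indexed = {
    "queryPlanner": {"winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}}},
    "executionStats": {"nReturned": 3, "totalKeysExamined": 3, "totalDocsExamined": 3},
}

print(summarize(unindexed))
print(summarize(indexed))
```

A COLLSCAN stage combined with a high examined-to-returned ratio is the usual hint that the query filter is not covered by an index.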
Adding capabilities to the existing hardware is known as vertical scaling; and replication is not vertical scaling.
Replication:
This is configured as a replica set: a primary node and multiple secondary nodes. The primary is the main point of contact for the application: all writes happen on the primary (and reads, by default). The data written to the primary is replicated to the secondaries. This way data redundancy is accomplished. When the primary goes down, one of the secondaries takes over as primary and keeps the system running via a failover process. Data durability, high availability, redundancy and failover are the main concepts with replication. In MongoDB a replica-set cluster can have up to fifty nodes.
It is recommended to use replica-set in production due to HA functionality.
Given resource limits on the one hand and the need for HA in production on the other, I would suggest creating a minimal replica set consisting of a primary, a secondary and an arbiter (an arbiter does not hold any data and consumes very little memory).
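For illustration, a configuration document for such a three-member set could be built like this. The set name and host names are placeholders, and the helper `minimal_replica_set_config` is hypothetical. The `_id`, `members`, `host` and `arbiterOnly` fields are the ones the replica set configuration uses. Which data-bearing member becomes primary is decided by election, not by this document.

```python
import json


def minimal_replica_set_config(name: str, data_host_a: str, data_host_b: str, arbiter_host: str) -> dict:
    """Build a config document with two data-bearing members and one arbiter."""
    return {
        "_id": name,
        "members": [
            {"_id": 0, "host": data_host_a},
            {"_id": 1, "host": data_host_b},
            # The arbiter votes in elections but stores no data.
            {"_id": 2, "host": arbiter_host, "arbiterOnly": True},
        ],
    }


# Placeholder host names, assumed for this example only.
config = minimal_replica_set_config(
    "rs0", "mongo-a.example:27017", "mongo-b.example:27017", "mongo-arb.example:27017"
)
print(json.dumps(config, indent=2))
```

A document of this shape is what you would pass when initiating the replica set.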
Also, writes typically affect your memory performance much more than reads. To achieve better write performance, I would advise you to create more shards (the more masters you have, the more writes you can handle at the same time).
However, I'm not sure what causes your MongoDB performance to degrade so fast. I think you should:
Check what affects your production performance most (complicated queries or heavy writes).
Change your read preference to "nearest".
Consider disabling Read Concern "majority" (remember that by default there is a write "majority" concern. Members should be up to date).
Check for a better index.
And of course create a replica set!
Good Luck! :P

Will aggregations run on a replica slow down write rate of the primary?

I need to accept data at a high rate and I need to aggregate this data periodically. My current strategy is to have a replica on a different server which will act as a "processor" of the data by using aggregations.
My question is whether running aggregations on a replica may slow down the primary.
You can benefit from replicas for read-only workloads. Although your replicas carry the same write load as the primary, you may dedicate the primary to write operations and read only from replicas. Of course this is subject to lag, since the replicas may not be able to catch up with the primary in time. Anyway, theoretically, no read operation on a replica should impact the primary, including aggregations.
PS: Sharding is always better for load distribution, of course, but that doesn't mean replicas are completely useless for that.
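The lag mentioned above can be estimated from the replica set status. The sketch below works on a hand-written, abbreviated status document with made-up host names and timestamps. It relies only on the `members`, `name`, `stateStr` and `optimeDate` fields and compares each secondary's last applied operation time with the primary's. The function name `replication_lag` is hypothetical.

```python
from datetime import datetime, timedelta


def replication_lag(status: dict) -> dict:
    """Map each secondary's name to how far its optimeDate trails the primary's."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hand-written sample; names and times are illustrative only.
now = datetime(2024, 1, 1, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY", "optimeDate": now},
        {"name": "db2:27017", "stateStr": "SECONDARY", "optimeDate": now - timedelta(seconds=2)},
        {"name": "analytics:27017", "stateStr": "SECONDARY", "optimeDate": now - timedelta(seconds=45)},
    ]
}

for name, lag in replication_lag(sample_status).items():
    print(name, "lags by", lag.total_seconds(), "seconds")
```

If the secondary that runs the aggregations falls further and further behind, the periodic results are computed on increasingly stale data, even though the primary itself is not slowed down by the reads.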

Can someone give me detailed technical reasons why writing to a secondary in MongoDB replica set is not allowed

I know we can't write to a secondary in MongoDB. But I can't find any technical reason why. In my case, I don't really care if there is a slight delay, but writing to a secondary might be faster. Please provide some reference if you can. Thanks!!
The reason why you can not write to a secondary is the way replication works:
Secondaries connect to a special collection on the primary called the oplog. The oplog contains the operations that were run through the query optimizer. Basically, the oplog is a capped collection, and the secondaries use a tailable cursor to access its entries and process them from the oldest to the newest.
When an election takes place because the primary goes down or steps down, the secondary with the most recent oplog entry is elected primary. The other secondaries connect to the new primary, query for the oplog entries they haven't processed yet, and the cluster is in sync.
This procedure is pretty straightforward. Now imagine one could write to a secondary. Every node in the cluster would need a tailable cursor on every other node, and maintaining a consistent state when a machine fails would become very complicated and even dependent on race conditions. Effectively, there could no longer be a guarantee even of eventual consistency. It would be more or less a gamble.
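The single-writer procedure described above can be sketched as a toy model. This is not MongoDB's implementation: the `Node` and `ReplicaSet` classes are invented for illustration, the "oplog" is just a Python list, and the election only looks at oplog length. It shows why one ordered log per replica set keeps catching up simple.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.oplog = []  # ordered list of (key, value) operations
        self.data = {}

    def apply(self, op):
        key, value = op
        self.data[key] = value
        self.oplog.append(op)


class ReplicaSet:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, key, value):
        # All writes go through the primary, so there is one total order of operations.
        self.primary.apply((key, value))

    def sync(self, secondary):
        # A secondary fetches only the entries it has not applied yet, oldest first.
        for op in self.primary.oplog[len(secondary.oplog):]:
            secondary.apply(op)

    def elect(self):
        # Toy election: the secondary with the most oplog entries becomes primary.
        new_primary = max(self.secondaries, key=lambda node: len(node.oplog))
        self.secondaries.remove(new_primary)
        self.primary = new_primary


a, b, c = Node("a"), Node("b"), Node("c")
rs = ReplicaSet(a, [b, c])
rs.write("x", 1)
rs.write("x", 2)
rs.sync(b)   # b has applied both writes to x
rs.write("y", 3)
rs.sync(c)   # c has applied everything, including y
rs.elect()   # pretend primary "a" went down
rs.sync(b)   # b pulls the entry it is missing from the new primary
print(rs.primary.name, b.data == c.data)
```

With writes accepted on several nodes there would be no single list to slice from, which is the complication the answer describes.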
That being said: A replica set is not for load balancing. A replica set's purpose is to enhance the availability and durability of the data. Because reading from a secondary is not risky, MongoDB made it possible, in line with their dogma of offering the maximum of possible features without compromising scalability (which would be severely hampered if one could write to secondaries).
But MongoDB does provide a load balancing feature: sharding. Choosing the right shard key, you can distribute read and write load over (almost) as many shards as you want. Not to mention that you can provide a lot more of the precious RAM for a reasonable price when sharding.
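As a very rough illustration of how a well-distributed shard key spreads load, the sketch below hashes keys onto a fixed number of shards. This is a simplification and not how MongoDB routes documents internally (it assigns ranges of shard key values, called chunks, to shards); the modulo step, the key format and the function name `shard_for` are assumptions made for this example.

```python
import hashlib
from collections import Counter


def shard_for(shard_key: str, shard_count: int) -> int:
    """Toy routing: hash the shard key and map it onto one of shard_count shards."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Count how many of 10,000 made-up user ids land on each of 4 shards.
counts = Counter(shard_for(f"user-{i}", 4) for i in range(10_000))
print(sorted(counts.items()))
```

A key with few distinct values, or one that always grows in the same direction, would concentrate load on one shard instead, which is why the choice of shard key matters.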
There is a one liner answer:
Multi-master replication is a hairball.
If you were allowed to write to secondaries, MongoDB would have to use multi-master replication to get this working: http://en.wikipedia.org/wiki/Multi-master_replication where essentially every node copies to every other node the OPs (operations) it has received, and somehow does so without losing data.
This form of replication has many obstacles to overcome.
One would be throughput: remember that OPs need to be transferred across the entire network, so you might actually lose throughput while adding consistency problems. So getting better throughput would be a problem. It is much like having a secondary take all of the primary's OPs, plus its own for outbound replication, and then asking it to do yet another job.
Adding consistency over a distributed set like this would also be hazardous. One main question that bugs MongoDB when deciding whether a member is down is: "Is it really down or just unavailable?". It is almost impossible to ensure true consistency in a distributed set like this, or at the very least tricky.
Those are just the two problems that come to mind immediately.
Essentially, to sum up, MongoDB does not yet possess multi-master replication. It could in the future, but I would not be jumping for joy if it does. I would most likely ignore such a feature; normal replication and sharding in both ACID and non-ACID databases cause enough blood pressure.

mongodb replication + sharding consistency

I have a doubt (well, a couple). I think I grasp the answer, but I'm looking for confirmation.
Let's say I implement a sharded MongoDB cluster. Is it necessary to have a replica set alongside the shards?
I know that if I use only a replica set and decide to distribute read operations to the secondary nodes, I get eventual consistency, right?
On the other hand, if I don't enable reads on secondary nodes, the "only" advantage I get is protecting the database in case a node fails.
But what about consistency in a sharded replica set? Will it still be eventually consistent, or will it be fully consistent?
"is that necessary to have a replica set lying beside shards"
You don't have to, but if you care about availability you will.
"and i decide to distribute the reading operation on the secondary nodes, it will cause the eventual-consistency, right?"
Yes, and since secondaries take as many OPs as primaries, and most drivers will only read from one active secondary, reading from secondaries is quite useless.
"the 'only' advantage i will get is to protect the database in case of a node will fall"
The "only"? That is the whole point of replica sets: automatic failover. That is their fundamental reason for existing, and it is a big one.
"it will still be eventual-consistent or it will be full consistent?"
It depends on where you do your reads. If you read from secondaries in a sharded setup, you not only get eventual consistency but, due to chunk movement, you might also get duplicate documents.
If you are reading from primaries then you will get strong consistency in a sharded replica set setup.