Mongod slow queries happening during replica set sync - mongodb

Recently, I found that an insert op triggered a mongod slow query.
It always happens when a secondary mongod instance is syncing data from another node.
The replica set has three members and I set the client driver write concern to "w : 2".
Does the oplog sync block insert ops?
What happens when a document is inserted into a syncing node?

The writeConcern setting w:2 means that the write will be acknowledged when exactly two members of the replica set have acknowledged that the write happened (see https://docs.mongodb.com/v3.2/reference/write-concern/#w-option). In other words, it will wait until the write has replicated (via the oplog) to one other node, since the Primary is counted as one node.
This means that the "speed" of the insert/update query will be subject to your network speed. If the network is slow or congested, the insert will appear to be "slow". This is not due to replication blocking anything; it is simply the effect of specifying w:2 on a congested network.
Network congestion may be triggering both the sync source change and the slow insert, but the replication process by itself does not block any insert operation.
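To make the w: 2 behaviour concrete, here is a minimal sketch using the official Node.js driver; the host names, database name, and replica set name rs0 are placeholders:

```typescript
import { MongoClient } from 'mongodb';

// Placeholder hosts for a three-member replica set named rs0.
const uri = 'mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0';

async function insertWithW2(): Promise<void> {
  const client = new MongoClient(uri);
  try {
    await client.connect();

    // With w: 2 the driver does not acknowledge the insert until the
    // primary plus one secondary have the write, so congestion on the
    // link between members shows up directly as a "slow" insert.
    await client.db('test').collection('docs').insertOne(
      { createdAt: new Date(), payload: 'example' },
      { writeConcern: { w: 2 } }
    );
  } finally {
    await client.close();
  }
}

insertWithW2().catch(console.error);
```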

Related

MongoDB read operations

I'm new to MongoDB, so correct me if I'm wrong.
In MongoDB, read and write operations are performed on the primary node. Wouldn't it make more sense to perform read operations on both the primary and the secondaries, and write operations only on the primary node, since the primary node will eventually update the secondary nodes?
If both read and write operations have to go through the primary node, why maintain more than one secondary node? It does not reduce the traffic to a single database (ignoring the data-safety aspect for the time being).
By default, the Primary handles both reads and writes, but you can direct your reads to Secondary nodes, and MongoDB supports that. The question is: are you OK with reading stale data? Because the Secondary nodes replicate by tailing the Primary's oplog, they usually lag behind the Primary, and you may end up reading old data at times. If your requirement isn't realtime read-after-write, it is totally fine to read from Secondary nodes.
Another main purpose of maintaining more than one secondary node is High Availability (no downtime). For instance, if you have a 3-node replica set and one node goes down due to a network issue, you still have two nodes (a majority of members) online that can serve read and write requests without any impact to your application.
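As an illustration, here is a minimal sketch of directing reads to secondaries with the official Node.js driver (hosts and the set name rs0 are placeholders; 'secondaryPreferred' falls back to the primary when no secondary is available):

```typescript
import { MongoClient } from 'mongodb';

const uri = 'mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0';

async function readFromSecondaries(): Promise<void> {
  const client = new MongoClient(uri);
  try {
    await client.connect();

    // 'secondaryPreferred' routes reads to a secondary when one is
    // available and falls back to the primary otherwise. Because
    // secondaries tail the oplog asynchronously, these reads may be stale.
    const docs = client
      .db('test')
      .collection('docs', { readPreference: 'secondaryPreferred' });

    console.log(await docs.find({}).limit(10).toArray());
  } finally {
    await client.close();
  }
}

readFromSecondaries().catch(console.error);
```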

Can MongoDB manage a rollback of more than 300 MB of streaming data?

I am dealing with MongoDB's rollback procedure. The problem is that for large data sets the rollback may be 300 MB or more.
Is there any solution for this problem? The error log is:
replSet syncThread: replSet too much data to roll back
In the official MongoDB documentation, I could not find a solution.
Thanks for the answers.
The cause
The page Rollbacks During Replica Set Failover states:
A rollback is necessary only if the primary had accepted write operations that the secondaries had not successfully replicated before the primary stepped down. When the primary rejoins the set as a secondary, it reverts, or “rolls back,” its write operations to maintain database consistency with the other members.
and:
When a rollback does occur, it is often the result of a network partition.
In other words, a rollback scenario typically occurs like this:
1. You have a 3-node replica set set up as primary-secondary-secondary.
2. A network partition separates the current primary from the secondaries.
3. The two secondaries cannot see the former primary, and elect one of themselves to be the new primary. Applications that are replica-set aware are now writing to the new primary.
4. However, some writes keep coming into the old primary before it realizes that it cannot see the rest of the set and steps down.

The data written to the old primary in step 4 above are the data that are rolled back, since for a period of time it was acting as a "false" primary (i.e., the "real" primary is the one elected by the secondaries in step 3).
The fix
MongoDB will not perform a rollback if there are more than 300 MB of data to be rolled back. The message you are seeing (replSet too much data to roll back) means that you are hitting this limit, and you would either have to manually save the data in the rollback directory under the node's dbpath, or perform an initial sync of that node.
Preventing rollbacks
Configuring your application to use w: majority (see Write Concern and Write Concern for Replica Sets) would prevent acknowledged writes from being rolled back. See Avoid Replica Set Rollbacks for more details.
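For reference, a minimal sketch of setting w: majority with the Node.js driver (hosts and the set name rs0 are placeholders; other drivers expose the same write-concern fields):

```typescript
import { MongoClient } from 'mongodb';

const uri = 'mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0';

async function insertWithMajority(): Promise<void> {
  // A client-level write concern applies to every operation that does
  // not override it.
  const client = new MongoClient(uri, { writeConcern: { w: 'majority' } });
  try {
    await client.connect();

    // Acknowledged only once a majority of members have the write, so a
    // partitioned "false" primary cannot acknowledge writes that would
    // later have to be rolled back.
    await client.db('test').collection('docs').insertOne({ createdAt: new Date() });
  } finally {
    await client.close();
  }
}

insertWithMajority().catch(console.error);
```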

Cannot read from last remaining node in replica set

I have a 3 node MongoDB (2.6.3) replica set and am testing various failure scenarios.
It was my understanding that if a majority of the replica nodes are not available, the replica set becomes read-only. But what I am experiencing is that if I shut down my 2 secondary nodes, the last remaining node (which was previously primary) becomes a secondary, and I cannot even read from it.
From the docs:
Users may configure read preference on a per-connection basis to prefer that the read operations return on the secondary members. If clients configure the read preference to permit secondary reads, read operations can return from secondary members that have not replicated more recent updates or operations. When reading from a secondary, a query may return data that reflects a previous state.
It sounds like I can configure my client to allow reads from secondaries, but since it was the primary node that I left up, it should be up to date with all of the data. Does MongoDB make the last node secondary even if it is fully caught up with data?
As you've noted, once you shut down the two secondaries, your primary steps down and becomes a secondary (this is normal behaviour once a primary loses its connection to a majority of the members).
The default read preference of a replica set is to read from the primary, but since your former primary is no longer primary, as you have encountered, you "cannot even read from it."
You can change the read preference at the driver, database, collection, and even per-operation level.
since it was the primary node that I left up, it should be up to date with all of the data. Does MongoDB make the last node secondary even if it is fully caught up with data?
As said, the primary becomes a secondary when it steps down; this has nothing to do with whether or not it is up to date. It refuses reads only because of the default read preference: if you change your driver's read preference to secondary, nearest, or similar, you'll be able to continue reading even when only a single node (the former primary) remains.
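For example, with the Node.js driver a read preference in the connection string keeps reads working even when the surviving node is a secondary (hosts and set name are placeholders; this is a sketch, not the only place the preference can be set):

```typescript
import { MongoClient } from 'mongodb';

// 'primaryPreferred' uses the primary while one exists, but keeps
// serving reads from secondaries when it does not -- e.g. when the
// last surviving member has stepped down to secondary.
const uri =
  'mongodb://host1:27017,host2:27017,host3:27017/' +
  '?replicaSet=rs0&readPreference=primaryPreferred';

async function readDespiteNoPrimary(): Promise<void> {
  const client = new MongoClient(uri);
  try {
    await client.connect();
    console.log(await client.db('test').collection('docs').findOne({}));
  } finally {
    await client.close();
  }
}

readDespiteNoPrimary().catch(console.error);
```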

Replication Behavior in MongoDB

I have a single mongod instance with two replicas (secondary mongod instances) and Java code that inserts 1 million simple objects with WriteConcern = JOURNAL_SAFE.
While the Java code is executing, we kill the primary instance, and the Java code throws a "server not available" exception. I then shut down both secondary nodes, started each node separately as a standalone, and checked the record count. We observe that the record count in both secondary mongod instances is the same, while the primary is missing one record, and the missing record is the one on which the job failed (when the mongod instance was killed).
Can anyone please explain this behavior? If the record is not present in the primary, how can it exist in the secondaries?
Regards,
Bhagwant Bhobe
This is not at all unexpected; in fact, I would expect this to be the case, because replication in MongoDB is asynchronous and happens "as fast as possible": as soon as the primary records the write in memory, it is visible via the oplog to the secondaries, which apply it to themselves.
Because you killed the primary server before it had a chance to flush the write from memory to disk, that node does not have the inserted record when you examine it, but the secondaries have it because it was replicated. Replication normally takes less time than flushing data to disk (though this depends on the speed of your disk, the speed of your network, and the relative timing of events).
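The question uses the legacy Java driver's JOURNAL_SAFE constant, but the same write-concern fields exist across drivers. Here is a hedged Node.js-driver sketch (hypothetical hosts) of a write concern that waits for both the journal and replication, which JOURNAL_SAFE alone does not:

```typescript
import { MongoClient } from 'mongodb';

const uri = 'mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0';

async function insertSurvivingPrimaryCrash(): Promise<void> {
  const client = new MongoClient(uri);
  try {
    await client.connect();

    // j: true alone (the old JOURNAL_SAFE) only waits for the primary's
    // on-disk journal; adding w: 'majority' also waits for replication,
    // so an acknowledged write cannot disappear when the primary is killed.
    await client.db('test').collection('docs').insertOne(
      { createdAt: new Date() },
      { writeConcern: { w: 'majority', j: true } }
    );
  } finally {
    await client.close();
  }
}

insertSurvivingPrimaryCrash().catch(console.error);
```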

What is the advantage to explicitly connecting to a Mongo Replica Set?

Obviously, I know why to use a replica set in general.
But, I'm confused about the difference between connecting directly to the PRIMARY mongo instance and connecting to the replica set. Specifically, if I am connecting to Mongo from my node.js app using Mongoose, is there a compelling reason to use connectSet() instead of connect()? I would assume that the failover benefits would still be present with connect(), but perhaps this is where I am wrong...
The reason I ask is that, in mongoose, the connectSet() method seems to be less documented and well-used. Yet, I cannot imagine a scenario where you would NOT want to connect to the set, since it is recommended to always run Mongo on a 3x+ replica set...
If you connect only to the primary then you get failover (that is, if the primary fails, there will be a brief pause until a new master is elected). Replication within the replica set also makes backups easier. A downside is that all writes and reads go to the single primary (a MongoDB replica set only has one primary at a time), so it can be a bottleneck.
Allowing connections to slaves, on the other hand, allows you to scale for reads (not for writes; those still have to go to the primary). Your throughput is no longer limited by the spec of the machine running the primary node but can be spread around the slaves. However, you now have a new problem of stale reads; that is, there is a chance that you will read stale data from a slave.
Now think hard about how your application behaves. Is it read-heavy? How much does it need to scale? Can it cope with stale data in some circumstances?
Incidentally, the point of a minimum of 3 members in the replica set is to offer resiliency and safe replication, not to provide multiple nodes to connect to. If you have 3 nodes and you lose one, you still have enough nodes to elect a new primary and still have replication to a backup node.
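For what it's worth, recent Mongoose versions take the topology from the connection string itself, so the connect()/connectSet() split no longer arises. A minimal sketch (hosts, database name, and set name rs0 are placeholders):

```typescript
import mongoose from 'mongoose';

// Listing every member lets the driver discover the topology, follow
// elections, and fail over to a new primary automatically; a
// single-host URI pins the application to one mongod.
const uri = 'mongodb://host1:27017,host2:27017,host3:27017/mydb?replicaSet=rs0';

async function main(): Promise<void> {
  await mongoose.connect(uri);

  const Doc = mongoose.model('Doc', new mongoose.Schema({ payload: String }));
  console.log(await Doc.countDocuments());

  await mongoose.disconnect();
}

main().catch(console.error);
```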