I have a three-member replica set running MongoDB v3.2.4. Each member is a VM with 8 cores and 8GB RAM, and under normal operation these nodes run at very low CPU and memory utilization.
I have a 60GB database (30 million docs) that once a month is completely reloaded by a Map/Reduce job written in Pig. During this job the cluster receives 30k inserts/s, and within a few minutes the secondaries fall out of sync.
The current oplog size is 20GB (already increased from the default), but this has not resolved the replication sync issue.
I don't know if increasing the oplog size again will help. My concern is that replication seems to catch up only when there is no load on the primary. Since my insert job lasts 1 hour, does that mean I need an oplog the size of my database?
Is there a way to tell MongoDB to put more effort into replication, so that the workload is better balanced between accepting inserts and replicating them?
Is there a way to tell MongoDB to put more effort into replication, so that the workload is better balanced between accepting inserts and replicating them?
To ensure data has replicated to the secondaries (and to throttle your inserts) you should increase your write concern to w:majority. The default write concern (w:1) only confirms that a write operation has been accepted by the primary, so if your secondaries cannot keep up during an extended burst of inserts they will eventually fall out of sync (as you have experienced).
You can include the majority write concern as an option in your MongoDB connection string URI, e.g.:
STORE data INTO
'mongodb://user:pass@db1.example.net,db2.example.net/my_db.my_collection?replicaSet=replicaSetName&w=majority'
USING com.mongodb.hadoop.pig.MongoInsertStorage('', '');
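For completeness, the same write concern can also be set from a plain Java client instead of via the URI. This is only a hedged sketch against the 3.x Java driver, with placeholder hosts, credentials, and names:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class MajorityWriteExample {
    public static void main(String[] args) {
        // Same effect as w=majority in the URI, set explicitly on the collection here.
        MongoClientURI uri = new MongoClientURI(
                "mongodb://user:pass@db1.example.net,db2.example.net/my_db?replicaSet=replicaSetName");
        MongoClient client = new MongoClient(uri);
        try {
            MongoCollection<Document> collection = client.getDatabase("my_db")
                    .getCollection("my_collection")
                    .withWriteConcern(WriteConcern.MAJORITY);
            // insertOne() returns only after a majority of members acknowledge,
            // which naturally throttles a bulk-load job to what the set can absorb.
            collection.insertOne(new Document("loadedAt", new java.util.Date()));
        } finally {
            client.close();
        }
    }
}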
Related
I am running a fairly standard MongoDB (3.0.5) replica set with 1 primary and 2 secondaries. My PHP application's read preference is primary, so no reads take place on the secondaries; they exist only for failover. I am running a load test against my application, which generates around 600 queries/updates per second. All operations run against a collection of ~500,000 documents. The queries are optimized and supported by indexes; no query takes longer than 40ms.
My problem is that I am seeing quite high CPU load on all 3 nodes (200% - 300%), and sometimes the load on the secondaries is even higher than on the primary. Disk IO and RAM usage seem to be okay; at least they are not hitting any limits.
The primary's log file contains a huge number of getmore queries against the oplog; I would guess that every operation on the primary results in an oplog query. This looks like a lot of replication overhead to me, but I have no prior experience with MongoDB under load and no reference figures.
As the setup will have to tolerate even more load in production, my question is whether this replication overhead is to be expected, and whether it's normal for the CPU load to go that high even on the secondaries, or whether I'm missing something.
Think about it this way: whatever data-changing operation happens on the primary also needs to happen on every secondary. If there are many such operations and they create high CPU load on the primary, the same situation will repeat itself on the secondaries.
Of course, in your case you'd expect the primary's CPU to be more stressed, because in addition to the writes it also handles all the reads. Most likely, in your scenario, the reads are relatively light and few compared to the amount of writes, which would explain why the load on the primary is roughly the same as on the secondaries.
my question is whether the replication overhead is to be expected
What you call replication overhead I see as the nature of replication. A primary stressed by writes results in all secondaries being stressed by writes as well.
and whether it's normal that the CPU load goes up that high, even on the secondaries
You have 600 write queries per second, and your RAM and disk are not stressed; to me this signifies that you've set up your indexes properly. High CPU load is expected with this many write operations per second, because the indexes are being updated intensively.
Please keep in mind that once you have gathered more data, the indexes and the memory-mapped data may no longer fit in memory; then both RAM and disk will be stressed, while the CPU is unlikely to be under high load anymore. At that point you will probably want to either add more RAM or look into sharding.
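If you want to watch for that transition, one rough way is to compare the total index size against the RAM you have available. A sketch with the Java driver, where the host, database, and collection names are placeholders:

import com.mongodb.MongoClient;
import org.bson.Document;

public class IndexSizeCheck {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("db1.example.net");
        try {
            // collStats reports totalIndexSize in bytes for the collection.
            Document stats = client.getDatabase("my_db")
                    .runCommand(new Document("collStats", "my_collection"));
            double indexMb = ((Number) stats.get("totalIndexSize")).doubleValue() / (1024 * 1024);
            System.out.printf("total index size: %.1f MB%n", indexMb);
        } finally {
            client.close();
        }
    }
}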
I have a Go client that continually inserts sensor data into a MongoDB replica set at 30 inserts per second. I ran a failover test on my 3-member set using majority write concern (while still firing the inserts). The failover process took only 2 seconds, but afterward I realized some data samples were missing: they were not in the database, nor in the old primary's rollback file; they were simply gone.
In general, how can I ensure no data loss (or minimal loss) in MongoDB during failover? I'm new to MongoDB. Are there commercially available MongoDB management modules for this (I have the free edition)? For example, can MongoDB buffer incoming data during a failover and persist it to the database afterwards? I don't want to handle failovers on the client; I want failovers to be transparent to the client.
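For what it's worth, MongoDB has no server-side buffer that parks incoming writes during an election. The usual pattern is a majority write concern combined with a client-side retry of any write that fails with a network error during the failover window. A rough sketch of that pattern (in Java rather than Go, with placeholder names and an arbitrary retry policy):

import com.mongodb.ErrorCategory;
import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.MongoException;
import com.mongodb.MongoWriteException;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import org.bson.types.ObjectId;

public class FailoverSafeInsert {
    // Retry the insert through the failover window instead of dropping the sample.
    static void insertWithRetry(MongoCollection<Document> coll, Document doc)
            throws InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                coll.insertOne(doc);
                return; // acknowledged by a majority: safe from rollback
            } catch (MongoWriteException e) {
                // A duplicate _id means an earlier attempt succeeded before the
                // connection dropped; treat the retry as a success.
                if (e.getError().getCategory() == ErrorCategory.DUPLICATE_KEY) return;
                throw e;
            } catch (MongoException e) {
                if (attempt == 5) throw e;
                Thread.sleep(500L * attempt); // wait out the election
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MongoClientURI uri = new MongoClientURI(
                "mongodb://db1.example.net,db2.example.net,db3.example.net/sensors?replicaSet=rs0");
        MongoClient client = new MongoClient(uri);
        try {
            MongoCollection<Document> samples = client.getDatabase("sensors")
                    .getCollection("samples")
                    .withWriteConcern(WriteConcern.MAJORITY);
            // A client-generated _id is what makes the retry idempotent.
            insertWithRetry(samples, new Document("_id", new ObjectId())
                    .append("value", 42.0)
                    .append("ts", new java.util.Date()));
        } finally {
            client.close();
        }
    }
}

Generating the _id on the client is the key design choice here: if the first attempt actually reached the server before the connection dropped, the retry fails with a duplicate-key error that can safely be treated as success.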
I have a MongoDB v2.4 replica set on AWS and have been monitoring my stats using MMS and dbStats(). Yesterday I saw an increase in both mapped and virtual memory usage, which correlated with an increased data fileSize and looked completely normal...except that the increase occurred on the secondaries a full two hours before it occurred on the primary (all of these members being located in the same data center).
I vaguely recall that not all members of a replica set will necessarily have the same organization of data in their data files, and I know that you can use the compact() command to defragment the files.
The only difference between the primary and the secondaries in this replica set is that, at one time, the primary was taken offline for roughly 20 minutes. It was then brought back online and re-elected as the primary.
My question is: Is there any reason to be alarmed that the primary seemed to lag behind the secondaries when increasing its mapped & virtual memory usage?
I have a replica set consisting of one primary mongod instance and two secondaries, and Java code that inserts 1 million simple objects with WriteConcern.JOURNAL_SAFE.
While the Java code is executing, we kill the primary instance, and the Java code throws a "server not available" exception. Then I shut down both secondary nodes, started each node separately as a standalone instance, and checked the record count. We observed that the record count in both secondary mongod instances is the same, while on the primary one record is missing; the missing record is the one on which the job failed (when the mongod instance was killed).
Can anyone please explain this behavior? If the record is not present on the primary, how can it exist on the secondaries?
This is not at all unexpected; in fact, I would expect this to be the case, because replication in MongoDB is asynchronous and happens "as fast as possible": as soon as the primary records the write in memory, it becomes visible via the oplog to the secondaries, which apply it to themselves.
Because you killed the primary server before it had a chance to flush the write from memory to disk, that node does not have the inserted record when you examine it, but the secondaries do have it because it was replicated; replication normally takes less time than flushing data to disk (though this depends on the speed of your disks, the speed of your network, and the relative timing of events).
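If the requirement is that an acknowledged write must survive the loss of the primary, the write concern has to involve the secondaries rather than just the primary's journal (JOURNAL_SAFE by itself is w:1 plus j:true). A minimal sketch with the 3.x Java driver, assuming placeholder hosts and names:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class JournaledMajorityInsert {
    public static void main(String[] args) {
        MongoClientURI uri = new MongoClientURI(
                "mongodb://db1.example.net,db2.example.net,db3.example.net/test?replicaSet=rs0");
        MongoClient client = new MongoClient(uri);
        try {
            // Acknowledge only after a majority of members have the write,
            // with journaling required on the primary.
            MongoCollection<Document> coll = client.getDatabase("test")
                    .getCollection("objects")
                    .withWriteConcern(WriteConcern.MAJORITY.withJournal(true));
            coll.insertOne(new Document("n", 1));
        } finally {
            client.close();
        }
    }
}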
The MongoDB docs about concurrency state that the DB is "write greedy". That much I understand. However, they do not say what locks do to secondaries in a replica set.
Take my use case, which would see about 40 writes per 100 queries, and in which I don't need the most recent document at all times. A lag of 5-10 seconds is fine with me, which is roughly how far the secondaries in a replica set lag behind the primary. Now, if the write lock locks down the primary as well as the replicas, then I am locked out of reads on the secondaries too.
I want to know whether writes will lock read operations on the secondaries as well.
In a replica set, SECONDARY servers are not affected by the write lock on the PRIMARY.
You can see the status of your servers by using mongotop or mongostat.
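If you do decide to route reads to the secondaries (the stated 5-10 second lag tolerance allows it), the routing is controlled by the client's read preference. A small sketch with the Java driver; host, database, and collection names are placeholders:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.ReadPreference;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class SecondaryReads {
    public static void main(String[] args) {
        MongoClientURI uri = new MongoClientURI(
                "mongodb://db1.example.net,db2.example.net/my_db?replicaSet=rs0");
        MongoClient client = new MongoClient(uri);
        try {
            // secondaryPreferred sends queries to secondaries when available,
            // so reads are not queued behind the primary's write load.
            MongoCollection<Document> coll = client.getDatabase("my_db")
                    .getCollection("docs")
                    .withReadPreference(ReadPreference.secondaryPreferred());
            Document first = coll.find().first();
            System.out.println(first == null ? "empty" : first.toJson());
        } finally {
            client.close();
        }
    }
}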
The locks are per mongod instance. That means the read/write locks only block operations on the primary; the secondaries read the oplog from the primary and apply those operations locally.
You can read many more details in the manual's section on concurrency.
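To make the oplog point concrete: the oplog is an ordinary capped collection (local.oplog.rs) that each secondary follows with a tailable cursor, and you can watch the same stream yourself. A hedged sketch with the Java driver, using a placeholder host:

import com.mongodb.CursorType;
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import org.bson.Document;

public class OplogTail {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("db1.example.net");
        try {
            // The oplog lives in the local database as a capped collection.
            MongoCollection<Document> oplog = client.getDatabase("local")
                    .getCollection("oplog.rs");
            // TailableAwait blocks server-side waiting for new entries; each
            // wakeup is one of the getmore operations seen in the primary's log.
            MongoCursor<Document> cursor = oplog.find()
                    .cursorType(CursorType.TailableAwait)
                    .iterator();
            while (cursor.hasNext()) {
                System.out.println(cursor.next().toJson());
            }
        } finally {
            client.close();
        }
    }
}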