MongoDB failure to resync a stale member of a replica set

I have a MongoDB (version 4.2) replica set with 3 nodes: primary, secondary, and arbiter.
The primary occupies close to 250 GB of disk space; the oplog size is 15 GB.
The secondary was down for a few hours. I tried recovering it by restarting, but it stayed in RECOVERING forever.
I tried an initial sync by deleting the files on the data path; it took 15 hours, the data path grew to 140 GB, and then it failed.
I tried to copy the files from the primary and use them to seed the secondary node,
following https://www.mongodb.com/docs/v4.2/tutorial/resync-replica-set-member/
This did not work either (the member became stale again).
In the latest docs (5.0) they mention using a new member ID; does that apply to 4.2 as well?
Changing the member ID throws an error, because the IP and port are the same for the node I am trying to recover.
Since this method was also unsuccessful, I am planning to recover the node using a different data path and port so that the primary considers it a new node; once the secondary is up, I will change the port back to the one I want and restart. Will that work?
Please suggest any other way to recover a replica set node with a large data size such as 250 GB.

Shut down the primary.
Copy the data files from the primary node and place them in a new db path (different from the recovering node's db path).
Change the log path.
Start the mongod service on a different port (different from the one used by the recovering node).
Start the primary.
Add the seeded node to the replica set using rs.add("IP:new port") on the primary.
This worked; I could see the secondary node coming up successfully. A rough sketch of the commands is below.
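A minimal sketch of that seeding procedure, assuming a new db path of /data/seed, a log file at /var/log/mongod-seed.log, port 27027, and a replica set named rs0 (all of these values are placeholders for your own setup):

# on the primary host: stop mongod cleanly before copying the files
mongo --port 27017 --eval 'db.getSiblingDB("admin").shutdownServer()'

# copy the primary's data files into the new db path on the secondary host
rsync -av /data/primary/ secondary-host:/data/seed/

# bring the primary back up
mongod --config /etc/mongod.conf

# on the secondary host: start a new mongod on a different port using the seeded files
mongod --dbpath /data/seed --logpath /var/log/mongod-seed.log --port 27027 --replSet rs0 --fork

# in a mongo shell connected to the primary: add the seeded member under its new port
rs.add("secondary-host:27027")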

Related

MongoDB: primary was down for a day; when it came back up, old data is not syncing, only current data is syncing

I have a MongoDB replica set with one primary and two secondaries. The primary was down for one day due to some OS issues, and one of the secondaries was elected primary and used by the application. After a day, the old primary came back up and is currently running as a secondary. It seems that only current data is being updated on this secondary; the data from the downtime day is not syncing. Is there any way to know whether it is syncing in the background? If it is not syncing, what do I have to do? I need to make the old primary the primary again, which can only be done once all the data is synced.
If your old PRIMARY became SECONDARY, that means it successfully replicated the data it missed from your new PRIMARY. The only exception is writes that existed on your old PRIMARY immediately before the election and could not be replicated to the new PRIMARY. Those writes are kept in BSON format in the "rollback" folder on your old PRIMARY; you will need to review those BSON files manually and decide whether to re-insert them.
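To see what those rolled-back writes contain, you can dump the rollback files with bsondump; a quick sketch (the path and file name are only examples):

# list the rolled-back documents kept by the old primary
ls /data/db/rollback/

# dump one rollback file as JSON so you can review the lost writes
bsondump /data/db/rollback/mydb.mycoll.2021-01-01T00-00-00.0.bson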
Troubleshooting steps to follow:
1. Check the current replicaSet status with:
rs.status()
2. If you have 1x PRIMARY + 2x SECONDARY, everything seems to be fine and your members are in sync with the PRIMARY.
3. If some of the members are in a different state, they may still be running an initial sync and you will need to wait for some time until the process finishes, or there may be some other problem, so you may need to wait a bit.
4. If for some reason a member does not complete its initial sync successfully, you can force a fresh initial sync so it tries again from scratch.
Here are simple steps to start an initial sync:
4.1. Stop the member:
mongo --port XXX
use admin
db.shutdownServer()
4.2. Go to the data folder and remove everything inside:
cd /my_member_data_folder/
rm -rf *
4.3. Start the member again and wait until it finishes the initial sync and successfully moves to the SECONDARY state.
mongod --config my_config_file.conf
5. In case everything is fine and you just need to switch the replicaSet back to your old PRIMARY, you can do so by simply reconfiguring the priority of that member as follows:
newPRIMARY>var x=rs.conf()
newPRIMARY>x.members[0].priority=10
newPRIMARY>rs.reconfig(x)
This assumes members[0] is the old PRIMARY and the rest of the members have a priority < 10.

MongoDB replica set failed

I have a MongoDB replica set consisting of three nodes: 1 primary, 1 secondary, and 1 arbiter.
While I was performing the initial re-sync of the secondary node from the primary, the primary node got terminated. When I checked the logs of the primary node, the exception shown was:
SEVERE: Invalid access at address: 0x7fcde1e00ff0
SEVERE: Got signal: 7 (Bus error)
Since then, this primary node does not start because of this exception, and the secondary node is stuck in the STARTUP2 state.
I am able to start the primary node on a different port as a standalone node (or in maintenance mode) and read its data, but whenever I run it as part of the replica set it gets terminated with the above exception.
The primary and secondary use RAID0 for their storage. The data size is around 550 GB.
I copied all the data from the primary node (currently down) to the secondary node (in STARTUP2 state) and then restarted the secondary node, but that didn't work either. The secondary node gets elected primary on restart, but it is also terminated within a second of the election with the exception below:
SEVERE: Fatal DBException in logOp(): 10334 BSONObj size: 50359410 (0x3006C72) is invalid. Size must be between 0 and 16793600(16MB) First element: 2: ?type=111
SEVERE: terminate() called, printing stack (if implemented for platform):
0x11fd1b1 0x11fc438 0x7ff56dc01846 0x7ff56dc01873 0xe54c9e 0xc4de1b 0xc58f46 0xa0bac1 0xa0c250 0xa0f1bf 0xa0fcc1 0xa1323e 0xa2949a 0xa2af32 0xa2cd36 0xd61654 0xba21a2 0xba3780 0x7724a9 0x11b2fde
How can I recover and restore the replica set in this case?
I also have a backup of this data. Can I drop this replica set and recreate it with the backup data?
There is another replica set in this MongoDB cluster which is working fine.
Your secondary server is ineligible because of replication lag.
Can you post the output of rs.status()?
Your secondary server probably has a "could not find member to sync from" infoMessage.
I've run into something similar before due to bad RAM. The cause can be anything.
Fix it by copying the primary server's data into another folder on the secondary, starting a new instance on some other port there, and then adding it to the replica set (reconfiguring with the { force: true } option) so the secondary server has somewhere to sync from; a sketch follows below.
You can also destroy the replica set and create it again, but be careful not to lose your replica set's oplog.
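A minimal sketch of that approach, assuming the copied data lives in /data/seed, the temporary instance listens on port 27028, the set is named rs0, and the member _id of 10 is free (all placeholder values):

# start a temporary instance on the secondary host using the copied data
mongod --dbpath /data/seed --logpath /var/log/mongod-seed.log --port 27028 --replSet rs0 --fork

# in a mongo shell on a surviving member: add the temporary instance to the config;
# force is only needed when the set has no reachable primary/majority
var cfg = rs.conf()
cfg.members.push({ _id: 10, host: "secondary-host:27028" })
rs.reconfig(cfg, { force: true })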

MongoDB sharded cluster: add new secondary node

We have a sharded cluster with 2 secondaries on each shard.
Due to a space problem, one of the secondaries got corrupted.
In order to add a new node to the existing shard, we removed the data directories on the problematic secondary data node.
We then added the new data node to the existing replica set using rs.config.
We have around 1.2 TB of data.
I can see the data folder size increasing, which shows that it is synchronizing from the shard's primary.
When I run rs.status(), the replica set member list shows the new node in STARTUP2 state.
It also shows the optime as
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
The data node is able to see the primary node, as checked from "lastHeartbeatRecv".
We are using Amazon AWS.
Please advise whether there is a different method to add a new data node with a faster sync from the primary, as the data is 1.2 TB and we kicked off the sync process 7 days ago.
Copy a recent snapshot of the good secondary into the problematic secondary node after cleaning out its data directories, then add this node back into the replica set. Let the oplog be applied automatically so it syncs with the primary node. This way the synchronization time is reduced, because the secondary only has to catch up on the backlog since the day the snapshot of the good secondary was taken. A rough sketch is below.
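A rough sketch of that snapshot-seeding approach, assuming the snapshot of the healthy secondary is available locally, the data directory is /data/db, and dbPath/replSetName are already set in /etc/mongod.conf (all placeholder values):

# on the problematic node: stop mongod, then clean out the old data
rm -rf /data/db/*

# restore the snapshot of the healthy secondary into the data directory
# (for example by mounting the restored volume or rsync-ing the snapshot contents)
rsync -av /mnt/snapshot-of-good-secondary/ /data/db/

# start the member with its normal configuration and let it replay the oplog backlog
mongod --config /etc/mongod.conf

# in a mongo shell on the primary: add the member if it is not already in the config
rs.add("new-node-host:27017")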

Do you lose records when you reconfigure a MongoDB replica set?

I have a 3-member replica set in MongoDB which fell apart when I was re-configuring the host names of the server instances. I had to reconfigure the replica set; however, I am curious how MongoDB handles records that are not synced across all the members.
Case 1) There is a new record on the MongoDB server that I access to reconfigure the set.
Case 2) There is a new record on another MongoDB server that is added later to the replica set.
Each replica-set has one primary node and one or more secondary nodes.
All writes happen on the primary. The primary then sends these changes to the secondaries (the list of changes is referred to as "the oplog"). That means the primary is always the member with the most up-to-date data.
When the primary is suddenly unreachable, the replica-set is put into read-only mode and an election takes place to find a new primary. Usually the secondary which is most up-to-date is selected (more details on replica-set election). Any writes which were not propagated to that secondary yet are lost.
When the old primary goes back online, it re-joins the replica-set as a secondary. Its data gets synchronized to the state of the new primary. Any writes which only happened on the old primary which weren't propagated to the new primary before the crash are rolled back.
The rolled-back writes are backed up as BSON files in the rollback directory and can be re-added to the replica set using bsondump and mongorestore. Details about this procedure can be found in the article Rollbacks During Replica Set Failover.
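A small sketch of re-applying such rolled-back documents (the database, collection, and file names here are only placeholders):

# inspect a rolled-back file first to decide whether the documents should be re-inserted
bsondump /data/db/rollback/mydb.mycoll.2021-01-01T00-00-00.0.bson

# re-insert the documents through the current primary
mongorestore --host primary-host:27017 --db mydb --collection mycoll /data/db/rollback/mydb.mycoll.2021-01-01T00-00-00.0.bson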

MongoDB Replica-Set Disk Cleanup

I am trying to shrink the size of my MongoDB replica set (the collections are the same size but disk space keeps growing). According to the MongoDB website, I should just run mongod --repair on the master node to compact all collections. The problem would be downtime for the website. So, I have two options (that I know about):
Take a secondary node out of the replica set, run mongod --repair on it, and then add it back into the replica set. I tried this and couldn't get past permission errors on the 'local' collection.
Shut down a secondary node and delete all files in the data directory, then restart mongod and let it recover from the master. This actually worked for me, but my only concern is: what if your journal collection is full? Since it's a capped collection, will you only receive the data that is in the journal, or will you actually copy over all of the master's data?
Has anyone else run into this scenario? I'm surprised by the lack of information when trying to search for this.
Take a secondary node out of the replica set, run mongod --repair on it, and then add it back into the replica set.
This is a common practice which is usually referred to as a "rolling repair". You take each secondary out of the replica set and repair it, and eventually step down the primary and repair it as the last step. As long as you always have a majority of your replica set nodes available, this approach will minimize potential downtime. A rough outline of one round is below.
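A rough outline of one round of that rolling repair, assuming a member with db path /data/db, port 27017, and its usual settings in /etc/mongod.conf (placeholder values):

# 1. cleanly shut down the secondary you want to repair
mongo --port 27017 --eval 'db.getSiblingDB("admin").shutdownServer()'

# 2. run the repair as a standalone process, using the same OS user that owns the
#    data files (this avoids the permission errors on the 'local' database)
mongod --dbpath /data/db --repair

# 3. restart the member with its normal replica-set configuration and let it catch up
mongod --config /etc/mongod.conf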
If you are frequently deleting data you should consider using the new PowerOf2Sizes collection option in MongoDB 2.2. This changes the allocation method to allocate document space in powers of two (e.g. a 500-byte document would be allocated 512 bytes), which allows for more effective reuse of the space from deleted documents (at the slight expense of a few more bytes per document).
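If I remember correctly, this option can be enabled per collection with the collMod command; a sketch, where the collection name is a placeholder:

// enable powers-of-two record allocation for an existing collection (MongoDB 2.2+)
db.runCommand({ collMod: "mycollection", usePowerOf2Sizes: true })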
I tried this and couldn't get past permission errors on 'local' collection.
Permission errors on the 'local' collection sound like file system permissions (i.e. based on the user you were running your mongod as). You should run the repair process with the same user.
Shut down a secondary node and delete all files in the data directory, then restart mongod and let it recover from the master. This actually worked for me, but my only concern is: what if your journal collection is full? Since it's a capped collection, will you only receive the data that is in the journal, or will you actually copy over all of the master's data?
It sounds like you are conflating the journal, which is used for durability and crash recovery, with the oplog, which is used for replication.
If you resync a node from the primary, all data will be copied over. During this initial period the node will be in the RECOVERING state and is not considered a "healthy" node (i.e. available for queries). Once the node has caught up it will change to the normal SECONDARY state, at which point the oplog will be used for ongoing sync.
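One way to watch that transition from the mongo shell (a small sketch):

// print each member's name and current state (e.g. STARTUP2, RECOVERING, SECONDARY)
rs.status().members.forEach(function (m) {
    print(m.name + " : " + m.stateStr);
})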
Some further reading:
Replication fundamentals
Replica set status reference