MonogoDB Replica Set Status Not changing from Startup to Secondary - mongodb

I have setup a MongoDB replica set with 3 nodes(vm's running CentOS). One node became Primary other 2 stuck in Startup. When these 2 nodes will change their states from startup to secondary.
aryabhata:PRIMARY> rs.status()
{
"set" : "aryabhata",
"date" : ISODate("2016-04-30T08:10:45.173Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "localhost.localdomain:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 69091,
"optime" : Timestamp(1461935462, 1),
"optimeDate" : ISODate("2016-04-29T13:11:02Z"),
"electionTime" : Timestamp(1461934754, 1),
"electionDate" : ISODate("2016-04-29T12:59:14Z"),
"configVersion" : 459192,
"self" : true
},
{
"_id" : 1,
"name" : "repset1.com:27017",
"health" : 1,
"state" : 0,
"stateStr" : "STARTUP",
"uptime" : 92,
"optime" : Timestamp(0, 0),
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2016-04-30T08:10:44.485Z"),
"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
"pingMs" : 0,
"configVersion" : -2
},
{
"_id" : 2,
"name" : "repset2.com:27017",
"health" : 1,
"state" : 0,
"stateStr" : "STARTUP",
"uptime" : 68382,
"lastHeartbeat" : ISODate("2016-04-30T08:10:43.974Z"),
"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
"pingMs" : 0,
"configVersion" : -2
}
],
"ok" : 1
}

My problem fix with set ip address for Primary instead hostname
cfg = rs.conf()
cfg.members[0].host = "public-or-private-primary-ip:27017"
rs.reconfig(cfg)
after that secondary state change to STARTUP2

From primary check whether you are able to connect to secondary
mongo --host repset1.com --port 27017
When the above one fails may be firewall or BindIP issue.
Check bind_ip (should be 0.0.0.0, change in mongodb.conf is it's 127.0.0.1):
netstat -plunt | grep :27017 | grep LISTEN
Look at the log-files of secondaries, why they are stuck. Did they receive the configuration details?
Try to reconfigure, see mongo replicaset reconfigure

For me the problem was that primary had authorization enabled. In this case the secondaries always stayed in STARTUP.
To use authorization you need to set keyFile in configuration file of all nodes (primary and secondary).
Create mongodb key file on linux:
openssl rand -base64 741 > mongodb.key
chmod 600 mongodb.key
chown mongod:mongod mongodb.key
mongod.conf file:
replication:
replSetName: rs0
security:
authorization: enabled
keyFile: /home/mongodb.key
Source MongoDB replica set with simple password authentication

It requires that both the primary resolves the host name to IP of the secondary and also the secondary resolves the hostname of the primary to an IP.
In my case I forgot to add in hosts file for the secondary to resolve the hostname of the primary. Once I updated the hosts file in the secondary, the state of the secondary transitioned to STARTUP2 and then to SECONDARY.

Related

[Mongodb][OpsManager]Mongodb secondary instances are not adding to replica set

I have created 3 mongodb instances with below mongod.conf [Primary1,Secondary1,Secondary2]
net:
port: 27017
bindIp: 0.0.0.0
replication:
replSetName: Replica1
I have started the 3 instance using " sudo service mongod start "
If I connect primary server to other 2 servers using mongo –host "ip" –port 27017..its working and connected.
Issue 1:
After I do rs.initiate() and rs.conf()
Secondary 1 is included in members but when I use rs.status() but showing stateStr" : "STARTUP"
And If I check its log means below error :
"msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
Issue 2:
Because of issue 1 if I try to rs.add() for Secondary 2 ,its not working
rs.status() reponse
{
"_id" : 0,
"name" : "primary:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
.......
}
{
"_id" : 1,
"name" : "secondary1:27017",
"health" : 1,
"state" : 0,
"stateStr" : "STARTUP",
"uptime" : 1579,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2021-01-08T04:47:35.455Z"),
"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : -2,
"configTerm" : -1
}
Log Output of secondary 1's mongodb.log
{"t":{"$date":"2021-01-08T04:39:16.634+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplNetwork","msg":"Connecting","attr":{"hostAndPort":"primary:27017"}}
{"t":{"$date":"2021-01-08T04:39:18.004+00:00"},"s":"I", "c":"CONTROL", "id":20714, "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2021-01-08T04:39:18.004+00:00"},"s":"I", "c":"CONTROL", "id":20712, "ctx":"LogicalSessionCacheReap","msg":"Sessions collection is not set up; waiting until next sessions reap interval","attr":{"error":"NamespaceNotFound: config.system.sessions does not exist"}}
{"t":{"$date":"2021-01-08T04:39:36.634+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplNetwork","msg":"Connecting","attr":{"hostAndPort":"primary:27017"}}
Issue got fixed. Primary database was able to connect to secondary hosts but in return secondary host can't. So I adjusted the security groups and connected. Takeaway is all the primary and secondary hosts should be able to connect with each other in mongodb port (27017)
Same issue happened for me , networks and everything is fine . Issue with host entry missing for primary server, still secondary was trying to connect DNS name
"ctx":"ReplNetwork","msg":"Connecting","attr":{"hostAndPort":"MONGODB01:27717"}}
Immediately have added into host entry everything is synced

How mongo db replicaset works in Azure

I have deployed the Mongo DB replica set using this template mongodb-replica-set-centos.
Mongo DB VM 1 (primary):
ps aux | grep mongo
root 10161 0.7 0.5 797140 40900 ? SLl 05:18 0:05 mongod --dbpath /var/lib/mongo/ --replSet repset --logpath /var/log/mongodb/mongod.log --fork --config /etc/mongod.conf
sshuser 10347 0.0 0.0 112640 960 pts/0 S+ 05:29 0:00 grep --color=auto mongo
Mongo DB database:-
mongo -u mongoadmin -p mongoadmin admin
MongoDB shell version: 3.2.19
connecting to: admin
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
Server has startup warnings:
2018-03-23T05:18:21.137+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2018-03-23T05:18:21.137+0000 I CONTROL [initandlisten]
repset:PRIMARY> rs.status()
{
"set" : "repset",
"date" : ISODate("2018-03-23T07:38:45.694Z"),
"myState" : 1,
"term" : NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "52.170.83.3:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 8426,
"optime" : {
"ts" : Timestamp(1521782318, 3),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-03-23T05:18:38Z"),
"electionTime" : Timestamp(1521782318, 1),
"electionDate" : ISODate("2018-03-23T05:18:38Z"),
"configVersion" : 2,
"self" : true
},
{
"_id" : 1,
"name" : "10.0.1.5:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 8407,
"optime" : {
"ts" : Timestamp(1521782318, 3),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-03-23T05:18:38Z"),
"lastHeartbeat" : ISODate("2018-03-23T07:38:45.538Z"),
"lastHeartbeatRecv" : ISODate("2018-03-23T07:38:42.546Z"),
"pingMs" : NumberLong(1),
"configVersion" : 2
}
],
"ok" : 1
}
Mongo DB VM 2 (secondary):
ps aux | grep mongo
root 10115 0.4 0.5 447908 37892 ? SLl 05:11 0:17 mongod --dbpath /var/lib/mongo/ --config /etc/mongod.conf --replSet repset --logpath /var/log/mongodb/mongod.log --fork
sshuser 10269 0.0 0.0 112640 960 pts/0 S+ 06:21 0:00 grep --color=auto mongo
Mongo DB database:-
mongo -u mongoadmin -p mongoadmin admin
MongoDB shell version: 3.2.19
connecting to: admin
2018-03-23T07:38:54.311+0000 E QUERY [thread1] Error: Authentication failed. :
DB.prototype._authOrThrow#src/mongo/shell/db.js:1441:20
#(auth):6:1
#(auth):1:2
exception: login failed
Mongo DB VM 3(secondary):-
ps aux | grep mongo
root 10122 0.6 0.5 795472 40420 ? SLl 05:12 0:26 mongod --dbpath /var/lib/mongo/ --config /etc/mongod.conf --replSet repset --logpath /var/log/mongodb/mongod.log --fork
sshuser 10381 0.0 0.0 112640 960 pts/0 S+ 06:21 0:00 grep --color=auto mongo
Mongo DB database:-
mongo -u mongoadmin -p mongoadmin admin
MongoDB shell version: 3.2.19
connecting to: admin
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
Server has startup warnings:
2018-03-23T05:12:19.613+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2018-03-23T05:12:19.613+0000 I CONTROL [initandlisten]
repset:SECONDARY> rs.status()
{
"set" : "repset",
"date" : ISODate("2018-03-23T07:39:04.009Z"),
"myState" : 2,
"term" : NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "52.170.83.3:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 8425,
"optime" : {
"ts" : Timestamp(1521782318, 3),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-03-23T05:18:38Z"),
"lastHeartbeat" : ISODate("2018-03-23T07:39:02.571Z"),
"lastHeartbeatRecv" : ISODate("2018-03-23T07:39:03.573Z"),
"pingMs" : NumberLong(1),
"electionTime" : Timestamp(1521782318, 1),
"electionDate" : ISODate("2018-03-23T05:18:38Z"),
"configVersion" : 2
},
{
"_id" : 1,
"name" : "10.0.1.5:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 8806,
"optime" : {
"ts" : Timestamp(1521782318, 3),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-03-23T05:18:38Z"),
"infoMessage" : "could not find member to sync from",
"configVersion" : 2,
"self" : true
}
],
"ok" : 1
}
Questions:-
Why I am unable to login into Mongo DB VM 2?
After I shutdown Mongo DB VM 3, will Mongo DB VM 2 acts as a secondary node?
If I shutdown Mongo DB VM 1, Will any one of secondary node act as a primary node?
All three questions are answered by the same fact: DB VM2 is not part of the replica set. It's clear from the rs.status() information that only two nodes are registered as part of the replica set, VM1 and VM3.
The implications are that:
On DB VM2, it is not part of the replica set so it does not have the authentication credentials you are trying to log in with
No, DB VM2 will not act as a secondary node; because it is not part of the replica set
In the current setup, with only 2 nodes in the replica set, if you shut down either node (VM1 or VM3) then the other node will not elect itself primary, because it cannot command a majority in an election.
Take a look at the docs on Replica Set Elections to understand what the majority is and why it matters; and take a look at DB VM2 to understand why it is not part of your replica set. Did you ever actually add it?

In mongodb 3.0 replication, how elections happen when a secondary goes down

Situation: I have a MongoDB replication set over two computers.
One computer is a server that holds the primary node and the arbiter. This server is a live server and is always on. It's local IP that is used in replication is 192.168.0.4.
Second is a PC that the secondary node resides on and is on for a few hours a day. It's local IP that is used in replication is 192.168.0.5.
My expectation: I wanted the live server to be the main point of data interaction of my application, regardless of the state of the PC (whether it is reachable or not, since PC is secondary), so I wanted to make sure that server's node is always primary.
The following is the result of rs.config():
liveSet:PRIMARY> rs.config()
{
"_id" : "liveSet",
"version" : 2,
"members" : [
{
"_id" : 0,
"host" : "192.168.0.4:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 10,
"tags" : {
},
"slaveDelay" : 0,
"votes" : 1
},
{
"_id" : 1,
"host" : "192.168.0.5:5051",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : 0,
"votes" : 1
},
{
"_id" : 2,
"host" : "192.168.0.4:5052",
"arbiterOnly" : true,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : 0,
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatTimeoutSecs" : 10,
"getLastErrorModes" : {
},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
}
}
}
Also I have set the storage engine to be WiredTiger, if that matters.
What I actually get, and the problem: When I turn off the PC, or kill its mongod process, then the node on the server becomes secondary.
The following is the output of the server when I killed PC's mongod process, while connected to primary node's shell:
liveSet:PRIMARY>
2015-11-29T10:46:29.471+0430 I NETWORK Socket recv() errno:10053 An established connection was aborted by the software in your host machine. 127.0.0.1:27017
2015-11-29T10:46:29.473+0430 I NETWORK SocketException: remote: 127.0.0.1:27017 error: 9001 socket exception [RECV_ERROR] server [127.0.0.1:27017]
2015-11-29T10:46:29.475+0430 I NETWORK DBClientCursor::init call() failed
2015-11-29T10:46:29.479+0430 I NETWORK trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2015-11-29T10:46:29.481+0430 I NETWORK reconnect 127.0.0.1:27017 (127.0.0.1) ok
liveSet:SECONDARY>
There are two doubts for me:
Considering this part of MongoDB documentation:
Replica sets use elections to determine which set member will become primary. Elections occur after initiating a replica set, and also any time the primary becomes unavailable.
The election occurs when the primary is not available (or at the time of initiating, however this is part does not concern our case), but primary was always available, so why the election happens.
Considering this part of the same documentation:
If a majority of the replica set is inaccessible or unavailable, the replica set cannot accept writes and all remaining members become read-only.
Considering the part 'members become read-only', I have two nodes up vs one down, so this should not also affect our replication.
Now my question: How to keep the node on the server as primary, when the node on PC is not reachable?
Update:
This is the output of rs.status().
Thanks to Wan Bachtiar, now This makes the behavior obvious, since arbiter was not reachable.
liveSet:PRIMARY> rs.status()
{
"set" : "liveSet",
"date" : ISODate("2015-11-30T04:33:03.864Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "192.168.0.4:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 1807553,
"optime" : Timestamp(1448796026, 1),
"optimeDate" : ISODate("2015-11-29T11:20:26Z"),
"electionTime" : Timestamp(1448857488, 1),
"electionDate" : ISODate("2015-11-30T04:24:48Z"),
"configVersion" : 2,
"self" : true
},
{
"_id" : 1,
"name" : "192.168.0.5:5051",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 496,
"optime" : Timestamp(1448796026, 1),
"optimeDate" : ISODate("2015-11-29T11:20:26Z"),
"lastHeartbeat" : ISODate("2015-11-30T04:33:03.708Z"),
"lastHeartbeatRecv" : ISODate("2015-11-30T04:33:02.451Z"),
"pingMs" : 1,
"configVersion" : 2
},
{
"_id" : 2,
"name" : "192.168.0.4:5052",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"lastHeartbeat" : ISODate("2015-11-30T04:33:00.008Z"),
"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
"configVersion" : -1
}
],
"ok" : 1
}
liveSet:PRIMARY>
As stated in the documentation, if a majority of the replica set is inaccessible or unavailable, the replica set cannot accept writes and all remaining members become read-only.
In this case the primary has to step down if the arbiter and the secondary are not reachable. rs.status() should be able to determine the health of the replica members.
One thing you should also watch for is the primary oplog size. The size of the oplog determines how long a replica set member can be down for and still be able to catch up when it comes back online. The bigger the oplog size, the longer you can deal with a member being down for as the oplog can hold more operations. If it does fall too far behind, you must resynchronise the member by removing its data files and performing an initial sync.
See Check the size of the Oplog for more info.
Regards,
Wan.

Why a member of mongodb keep RECOVERING?

I set up a replica set with three members and one of them is an arbiter.
One time I restart a member, the member keep RECOVERING for a long time and did not be SECONDARY again, even though the database was not large.
The status of replica set is like that:
rs:PRIMARY> rs.status()
{
"set" : "rs",
"date" : ISODate("2013-01-17T02:08:57Z"),
"myState" : 1,
"members" : [
{
"_id" : 1,
"name" : "192.168.1.52:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 67968,
"optime" : Timestamp(1358388479000, 1),
"optimeDate" : ISODate("2013-01-17T02:07:59Z"),
"self" : true
},
{
"_id" : 2,
"name" : "192.168.1.50:29017",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 107,
"lastHeartbeat" : ISODate("2013-01-17T02:08:56Z"),
"pingMs" : 0
},
{
"_id" : 3,
"name" : "192.168.1.50:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 58,
"optime" : Timestamp(1358246732000, 100),
"optimeDate" : ISODate("2013-01-15T10:45:32Z"),
"lastHeartbeat" : ISODate("2013-01-17T02:08:55Z"),
"pingMs" : 0,
"errmsg" : "still syncing, not yet to minValid optime 50f6472f:5d"
}
],
"ok" : 1
}
How should I fix this problem?
I had exact same issue: Secondary member of replica stuck in recovering mode.
Here how to solve the issue:
stop secondary mongo db
delete all secondary db data files
start secondary mongo
It will start in startup2 mode and will replicate all data from Primary
I've fixed the issue by following the below procedure.
Step1:
Login to different node and remove the issue node from mongodb replicaset. eg.
rs.remove("10.x.x.x:27017")
Step 2:
Stop the mongodb server on the issue node
systemctl stop mongodb.service
Step 3:
Create a new new folder on the dbpath
mkdir /opt/mongodb/data/db1
Note : existing path was /opt/mongodb/data/db
Step 4:
Modify dbpath on /etc/mongod.conf or mongdb yaml file
dbPath: /opt/mongodb/data/db1
Step 5:
Start the mongodb service
systemctl start mongodb.service
Step 6:
Takebackup of the existing folder and remove it
mkdir /opt/mongodb/data/backup
mv /opt/mongodb/data/db/* /opt/mongodb/data/backup
tar -cvf /opt/mongodb/data/backup.tar.gz /opt/mongodb/data/backup
rm -rf /opt/mongodb/data/db/
This will happen if replication has been broken for a while and on the slave it's not enough data to resume replication.
You would have to re-sync the slave either by replicating data from scratch or by copying it from another server and then resume it.
Check mongodb documentation for this issue https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/#replica-set-auto-resync-stale-member

mongodb - All nodes in replica set are primary

I am trying to configure a replica set with two nodes but when I execute rs.add("node2") and then rs.status() both nodes are set to PRIMARY. Also when I run rs.status() on the other node the only node that appears is the local one.
Edit1:
rs.status() output:
{
"set" : "rs0",
"date" : ISODate("2012-09-22T01:01:12Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "node1:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 70968,
"optime" : Timestamp(1348207012000, 1),
"optimeDate" : ISODate("2012-09-21T05:56:52Z"),
"self" : true
},
{
"_id" : 1,
"name" : "node2:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 68660,
"optime" : Timestamp(1348205568000, 1),
"optimeDate" : ISODate("2012-09-21T05:32:48Z"),
"lastHeartbeat" : ISODate("2012-09-22T01:01:11Z"),
"pingMs" : 0
}
],
"ok" : 1
}
Edit2: I tried doing the same thing with 3 different nodes and I got the same result (rs.status() says I have a replica set with three primary nodes). Is it possible that this problem is caused by some specific configuration of the network?
If you issue rs.initiate() from both of your the members of the replica set before rs.add() then both will come up as primary.
You should only use rs.initiate() on one of the members of the replica set, the one that you intend to be primary initially. Then you can rs.add() the other member to the replica set.
The answer above does not answer how to fix it. I kind of got it done using trial and error.
I have cleaned up the data directory (as in rm -rf *) and restarted these PRIMARY nodes, except one. I added them back. It seems to work.
Edit1
The nice little trick below did not seem to work for me,
So, I logged into the mongod console using mongo <hostname>:27018
here is how the shell looks like:
rs2:PRIMARY> rs.conf()
{
"_id" : "rs2",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "ip-10-159-42-911:27018"
}
]
}
I decided to change it to secondary. So,
rs2:PRIMARY> var c = {
... "_id" : "rs2",
... "version" : 1,
... "members" : [
... {
... "_id" : 1,
... "host" : "ip-10-159-42-911:27018",
... "priority": 0.5
... }
... ]
... }
rs2:PRIMARY> rs.reconfig(c, { "force": true})
Mon Nov 11 19:46:39.244 DBClientCursor::init call() failed
Mon Nov 11 19:46:39.245 trying reconnect to ip-10-159-42-911:27018
Mon Nov 11 19:46:39.245 reconnect ip-10-159-42-911:27018 ok
reconnected to server after rs command (which is normal)
rs2:SECONDARY>
Now it is secondary. I do not know if there is a better way. But this seems to work.
HTH