Mongo 2.6.4 won't stepDown() because it can't find a secondary within 10 seconds, but rs.status() shows optimeDates in sync - mongodb

I'm attempting to step down my mongo primary and have one of my secondaries take over. Mongo won't step down and says my secondaries are more than 10 seconds out of sync, yet my replica set says they are in sync. I'm baffled; it is likely something silly I'm missing.
here's my output:
MongoDB shell version: 2.6.4
sessionV2:PRIMARY> rs.stepDown()
{
"closest" : NumberLong(0),
"difference" : NumberLong(1441842526),
"ok" : 0,
"errmsg" : "no secondaries within 10 seconds of my optime"
}
sessionV2:PRIMARY> rs.status()
{
"set" : "sessionV2",
"date" : ISODate("2015-09-09T23:48:53Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "sessionv2-mongo-replset-moprd1-02:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 2659,
"optime" : Timestamp(1441842533, 61),
"optimeDate" : ISODate("2015-09-09T23:48:53Z"),
"electionTime" : Timestamp(1441839881, 1),
"electionDate" : ISODate("2015-09-09T23:04:41Z"),
"self" : true
},
{
"_id" : 1,
"name" : "sessionv2-mongo-replset-moprd1-01:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 2658,
"optime" : Timestamp(1441842531, 120),
"optimeDate" : ISODate("2015-09-09T23:48:51Z"),
"lastHeartbeat" : ISODate("2015-09-09T23:48:51Z"),
"lastHeartbeatRecv" : ISODate("2015-09-09T23:48:51Z"),
"pingMs" : 0,
"syncingTo" : "sessionv2-mongo-replset-moprd1-03:27017"
},
{
"_id" : 2,
"name" : "sessionv2-mongo-replset-moprd1-03:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 2658,
"optime" : Timestamp(1441842531, 120),
"optimeDate" : ISODate("2015-09-09T23:48:51Z"),
"lastHeartbeat" : ISODate("2015-09-09T23:48:51Z"),
"lastHeartbeatRecv" : ISODate("2015-09-09T23:48:52Z"),
"pingMs" : 0,
"syncingTo" : "sessionv2-mongo-replset-moprd1-02:27017"
}
],
"ok" : 1
}
sessionV2:PRIMARY>
here's what the primary reports as far as status:
sessionV2:PRIMARY> rs.printSlaveReplicationInfo()
source: sessionv2-mongo-replset-moprd1-01:27017
syncedTo: Wed Sep 09 2015 19:15:02 GMT-0500 (CDT)
1 secs (0 hrs) behind the primary
source: sessionv2-mongo-replset-moprd1-03:27017
syncedTo: Wed Sep 09 2015 19:15:02 GMT-0500 (CDT)
1 secs (0 hrs) behind the primary
sessionV2:PRIMARY>
and an oplog view from a secondary:
sessionV2:SECONDARY> db.getReplicationInfo()
{
"logSizeMB" : 5120,
"usedMB" : 5077.25,
"timeDiff" : 12226,
"timeDiffHours" : 3.4,
"tFirst" : "Wed Sep 09 2015 15:53:29 GMT-0500 (CDT)",
"tLast" : "Wed Sep 09 2015 19:17:15 GMT-0500 (CDT)",
"now" : "Wed Sep 09 2015 19:17:15 GMT-0500 (CDT)"
}
thanks in advance!
2015 Sept 10th update:
we stopped each secondary and performed an initial sync from the primary, then attempted to step down the primary again. It looks like the PRIMARY can't find the secondaries' optimeDates (we were unsure whether forcing the step-down would free a SECONDARY to take the PRIMARY role):
sessionV2:PRIMARY> db.runCommand( { replSetStepDown: 60, force: false } )
{
"closest" : NumberLong(0),
"difference" : NumberLong(1441936029),
"ok" : 0,
"errmsg" : "no secondaries within 10 seconds of my optime"
}

Issue solved! While playing with workarounds and documenting the replica setup, I found that our initial scripts set the primary at the default priority and the secondaries at priority 0, which means they can never take the PRIMARY role. So basically: bad config, and the error message gives no hint of the root problem (the docs are pretty clear; I just missed it because I trusted our init scripts). If you run into this and your replica set oplogs are up to date, double-check that the priorities are not set to 0.
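For anyone who hits the same wall, here is a minimal shell sketch for spotting and fixing this; it assumes (as in our setup) that members 1 and 2 are the priority-0 secondaries, so adjust the indexes for your own config:

// Inspect each member's priority; 0 means the member can never become primary
// (an undefined priority means the default of 1).
cfg = rs.conf()
cfg.members.forEach(function (m) { print(m.host + " priority: " + m.priority); })

// Restore electability for the two secondaries, then push the new config.
// rs.reconfig() must be run on the current primary.
cfg.members[1].priority = 1
cfg.members[2].priority = 1
rs.reconfig(cfg)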

Related

Validate Replica Set Data

I am new here and I hope I am posting my question on the right forum. I didn't see where to pick the right forum category for MongoDB.
I have 2 questions -
I am using MongoDB 2.6, and I am in the process of migrating 2 replica sets RS0 & RS1 from a data center to AWS. I have 3 servers on each replica set, making a total of 6 servers. The approach I am using to migrate data to the new servers is to expand the replica sets onto the new hardware, let the new members catch up completely, and then remove the nodes on the old hardware from the replica set.
Question-1> How do I validate the data on both replica sets (source & destination) to make sure the data is 100% in sync before I remove the old replica set from the source? What are the proper commands I can use to check the number of collections and the document counts of all collections for all databases I am migrating?
Question-2> Correct me if I am wrong - my understanding is that when using replica sets, we have to keep an odd number of members within an RS. Right now I have 3 servers per RS, which is fine, but when I add a new member to my current RS, pointing to a new server, I will end up with 4 members - wouldn't that cause a problem? Should I add 2 members to my RS instead, so that I keep 5 members, which is an odd number?
Thank you so much in advance!
Question 1: use rs.status() on any of the replica set members; you can check the status of each member and the optime field (compare with the primary):
http://docs.mongodb.org/manual/reference/method/rs.status/
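As a supplement, here is a minimal shell sketch for the count check. Run it against each member and diff the output; rs.slaveOk() is needed on 2.6-era secondaries, and note that counts taken on a live system are only a sanity check unless writes are stopped first:

// Print document counts for every collection in every database.
rs.slaveOk();  // allow reads when connected to a secondary
db.getMongo().getDBNames().forEach(function (name) {
    var d = db.getSiblingDB(name);
    d.getCollectionNames().forEach(function (coll) {
        print(name + "." + coll + ": " + d.getCollection(coll).count());
    });
});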
Question 2: you need an odd number of members because only one member can be elected as primary, and each member gets one vote, so an even number of members could lead, during a primary election, to an equal number of votes for two or more members. To get to an odd number of members you can set up an arbiter instance: http://docs.mongodb.org/master/tutorial/add-replica-set-arbiter/
@Stefano gave you the answer. I would just like to add a few things:
Question 1:
You can use rs.status() to check the replica set status. This sorts out the primary and secondaries clearly:
{
"set" : "replset",
"date" : ISODate("2015-11-19T15:22:32.597Z"),
"myState" : 1,
"term": NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "m1.example.net:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 269,
"optime" : {
"ts" : Timestamp(1447946550, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2015-11-19T15:22:30Z"),
"infoMessage" : "could not find member to sync from",
"electionTime" : Timestamp(1447946549, 1),
"electionDate" : ISODate("2015-11-19T15:22:29Z"),
"configVersion" : 1,
"self" : true
},
{
"_id" : 1,
"name" : "m2.example.net:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 13,
"optime" : {
"ts" : Timestamp(1447946539, 1),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("2015-11-19T15:22:19Z"),
"lastHeartbeat" : ISODate("2015-11-19T15:22:31.323Z"),
"lastHeartbeatRecv" : ISODate("2015-11-19T15:22:32.045Z"),
"pingMs" : NumberLong(0),
"configVersion" : 1
},
{
"_id" : 2,
"name" : "m3.example.net:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 13,
"optime" : {
"ts" : Timestamp(1447946539, 1),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("2015-11-19T15:22:19Z"),
"lastHeartbeat" : ISODate("2015-11-19T15:22:31.325Z"),
"lastHeartbeatRecv" : ISODate("2015-11-19T15:22:31.971Z"),
"pingMs" : NumberLong(0),
"configVersion" : 1
}
],
"ok" : 1
}
To see the slave delay, run rs.printSlaveReplicationInfo():
source: localhost.localdomain:27070
syncedTo: Mon May 02 2016 12:34:36 GMT+0530 (IST)
0 secs (0 hrs) behind the primary
source: localhost.localdomain:27072
syncedTo: Mon May 02 2016 12:34:36 GMT+0530 (IST)
0 secs (0 hrs) behind the primary
source: localhost.localdomain:27073
syncedTo: Mon May 02 2016 12:34:36 GMT+0530 (IST)
0 secs (0 hrs) behind the primary
For more detail about replication catch-up in the oplog, try rs.printReplicationInfo():
configured oplog size: 700.0038909912109MB
log length start to end: 261920secs (72.76hrs)
oplog first event time: Fri Apr 29 2016 11:49:16 GMT+0530 (IST)
oplog last event time: Mon May 02 2016 12:34:36 GMT+0530 (IST)
now: Mon May 02 2016 12:49:37 GMT+0530 (IST)
Question 2:
An odd number of replicas facilitates a clear majority in elections. So if you have an even number of members in a replica set, you can add an arbiter. Arbiters are lightweight and hold no data; they can also reside on any other currently running server. A sketch of adding one is below.
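A minimal sketch, assuming a hypothetical host m4.example.net already running mongod with the same --replSet name; run this from the primary:

// Add the member as an arbiter: it votes in elections but holds no data.
rs.addArb("m4.example.net:27017")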
Hope this helps !!!

Uncaught exception 'MongoCursorException' with message 'Couldn't get connection: No candidate servers found'

Problem Description
I have a three member replica set, and a php web front end that a) writes a record, and then b) does a .find() on the collection and returns all documents in the database.
To better understand how replica sets work, I did the following:
Stopped the mongo service on the primary server (mongohost1). The web page kept working.
Stopped the mongo service on the server that got promoted to primary (mongohost2). At this point, even though I have another mongo host (mongohost3) with the same database, the PHP web app fails with the above error message.
I was expecting that the system would let me at least read the records from the database, even if the write failed.
What I've checked / tried so far:
All of the hosts are reachable. I've tried pinging by hostname from each box and it all works.
Here's how the replica set has been configured as per mongohost3:
jlrs0:SECONDARY> cfg=rs.config()
{
"_id" : "jlrs0",
"version" : 5,
"members" : [
{
"_id" : 0,
"host" : "monghost1.test.mm.org:27017",
"priority" : 3
},
{
"_id" : 1,
"host" : "mongohost2.test.mm.org:27017",
"priority" : 2
},
{
"_id" : 2,
"host" : "mongohost3.test.mm.org:27017",
"priority" : 2
}
]
}
jlrs0:SECONDARY>
and the status of each member in the replica set per mongohost3:
jlrs0:SECONDARY> rs.status()
{
"set" : "jlrs0",
"date" : ISODate("2014-11-19T15:16:21Z"),
"myState" : 2,
"members" : [
{
"_id" : 0,
"name" : "mongohost1.test.mm.org:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : Timestamp(1416419914, 1),
"optimeDate" : ISODate("2014-11-19T17:58:34Z"),
"lastHeartbeat" : ISODate("2014-11-19T15:16:20Z"),
"lastHeartbeatRecv" : ISODate("2014-11-19T14:06:49Z"),
"pingMs" : 0
},
{
"_id" : 1,
"name" : "mongohost2.test.mm.org:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : Timestamp(1416419914, 5),
"optimeDate" : ISODate("2014-11-19T17:58:34Z"),
"lastHeartbeat" : ISODate("2014-11-19T15:16:17Z"),
"lastHeartbeatRecv" : ISODate("2014-11-19T14:10:58Z"),
"pingMs" : 0
},
{
"_id" : 2,
"name" : "mongohost3.test.mm.org:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 451417,
"optime" : Timestamp(1416419914, 5),
"optimeDate" : ISODate("2014-11-19T17:58:34Z"),
"self" : true
}
],
"ok" : 1
}
Here's the PHP code to connect:
$m = new MongoClient("mongodb://mongohost1.test.mm.org:27017,mongohost2.test.mm.org:27017,mongohost3.test.mm.org:27017/?replicaSet=jlrs0");
I'm still reading up on replica sets etc. so I'm sure it's something that I've missed / neglected to set up.
For example, I haven't set up an arbiter...
Not sure if it's related, but I thought I'd mention it just in case. I'm not sure what else to check.
Thanks.
You need to set your read preference to primaryPreferred.
You need to specify that it is OK to read from a secondary when the primary is not available. By default, it is not.
Link to documentation
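For example (a sketch based on the connection string in the question; the legacy PECL driver accepts the read preference as a URI option), append readPreference=primaryPreferred:

mongodb://mongohost1.test.mm.org:27017,mongohost2.test.mm.org:27017,mongohost3.test.mm.org:27017/?replicaSet=jlrs0&readPreference=primaryPreferred

Note that with two of the three members down, the surviving member cannot form a majority and stays SECONDARY, which is why a secondary-capable read preference is needed for reads to keep working at all.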
Please also check your PHP mongo PECL library version.
Before 1.5.6 there were two bugs related to PHP not selecting a primary server after a failure in a replica set.
Make sure you have pecl mongo at least 1.5.6.

Mongodb replica set status showing "RECOVERING"

I have set up a replica set over 3 mongo servers and imported 5 GB of data.
Now the status of the secondary servers is showing "RECOVERING".
Could you let me know what "RECOVERING" means and how to solve this issue?
Status is as below
rs.status()
{
"set" : "kutendarep",
"date" : ISODate("2013-01-15T05:04:18Z"),
"myState" : 3,
"members" : [
{
"_id" : 0,
"name" : "10.1.4.138:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 86295,
"optime" : Timestamp(1357901076000, 4),
"optimeDate" : ISODate("2013-01-11T10:44:36Z"),
"errmsg" : "still syncing, not yet to minValid optime 50f04941:2",
"self" : true
},
{
"_id" : 1,
"name" : "10.1.4.21:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 86293,
"optime" : Timestamp(1358160135000, 18058),
"optimeDate" : ISODate("2013-01-14T10:42:15Z"),
"lastHeartbeat" : ISODate("2013-01-15T05:04:18Z"),
"pingMs" : 0
},
{
"_id" : 2,
"name" : "10.1.4.88:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 86291,
"optime" : Timestamp(1357900674000, 10),
"optimeDate" : ISODate("2013-01-11T10:37:54Z"),
"lastHeartbeat" : ISODate("2013-01-15T05:04:16Z"),
"pingMs" : 0,
"errmsg" : "still syncing, not yet to minValid optime 50f04941:2"
}
],
"ok" : 1
The message on the "RECOVERING" replica set nodes means that these are still performing the initial sync.
These nodes are not available for reads until they transition to the SECONDARY state.
There are several steps in the initial sync.
See here for more information about the replica set synchronization process:
http://docs.mongodb.org/manual/core/replica-set-sync/
Log in to the RECOVERING instance.
Check the RECOVERING instance's replication status with:
db.printReplicationInfo()
You will get a result like this:
oplog first event time: Tue Jul 30 2019 17:26:37 GMT+0000 (UTC)
oplog last event time: Wed Jul 31 2019 16:46:53 GMT+0000
now: Thu Aug 22 2019 07:36:38 GMT+0000 (UTC)
If you find a large difference between the oplog last event time and now,
it means this particular instance is neither PRIMARY nor SECONDARY and is not an active member of the replica set (it has fallen too far behind the primary's oplog to catch up on its own).
Now there are two solutions for this.
First:
1. Log in to the RECOVERING instance.
2. Delete the data from the existing db path, which will be /data/db.
3. Restart this RECOVERING instance.
4. (optional) If you see the following error, remove that mongod.pid from the specified location:
Error starting mongod. /var/run/mongod/mongod.pid
5. Restart the instance.
6. Your recovering instance will now be in a running state, and it will show PRIMARY or SECONDARY in place of RECOVERING.
Second:
Copy the data from another running instance onto the RECOVERING instance and restart mongod.

Should I increase the size of my MongoDB oplog file?

I understand that the oplog file will split multi updates into individual updates but what about batch inserts? Are those also split into individual inserts?
If I have a write intensive collection with batches of ~20K docs being inserted roughly every 30 seconds, do I / should I consider increasing my oplog size beyond the default? I have a 3 member replica set and mongod is running on a 64 bit Ubuntu server install with the Mongodb data sitting on a 100GB volume.
Here is some data which may or may not be helpful:
gs_rset:PRIMARY> db.getReplicationInfo()
{
"logSizeMB" : 4591.3134765625,
"usedMB" : 3434.63,
"timeDiff" : 68064,
"timeDiffHours" : 18.91,
"tFirst" : "Wed Oct 24 2012 22:35:10 GMT+0000 (UTC)",
"tLast" : "Thu Oct 25 2012 17:29:34 GMT+0000 (UTC)",
"now" : "Fri Oct 26 2012 19:42:19 GMT+0000 (UTC)"
}
gs_rset:PRIMARY> rs.status()
{
"set" : "gs_rset",
"date" : ISODate("2012-10-26T19:44:00Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "xxxx:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 77531,
"optime" : Timestamp(1351186174000, 1470),
"optimeDate" : ISODate("2012-10-25T17:29:34Z"),
"self" : true
},
{
"_id" : 1,
"name" : "xxxx:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 76112,
"optime" : Timestamp(1351186174000, 1470),
"optimeDate" : ISODate("2012-10-25T17:29:34Z"),
"lastHeartbeat" : ISODate("2012-10-26T19:44:00Z"),
"pingMs" : 1
},
{
"_id" : 2,
"name" : "xxxx:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 61301,
"optime" : Timestamp(1351186174000, 1470),
"optimeDate" : ISODate("2012-10-25T17:29:34Z"),
"lastHeartbeat" : ISODate("2012-10-26T19:43:59Z"),
"pingMs" : 1
}
],
"ok" : 1
}
gs_rset:PRIMARY> db.printCollectionStats()
dev_fbinsights
{
"ns" : "dev_stats.dev_fbinsights",
"count" : 6556181,
"size" : 3117699832,
"avgObjSize" : 475.53596095043747,
"storageSize" : 3918532608,
"numExtents" : 22,
"nindexes" : 2,
"lastExtentSize" : 1021419520,
"paddingFactor" : 1,
"systemFlags" : 0,
"userFlags" : 0,
"totalIndexSize" : 1150346848,
"indexSizes" : {
"_id_" : 212723168,
"fbfanpage_id_1_date_1_data.id_1" : 937623680
},
"ok" : 1
}
The larger the size of the current primary's oplog, the longer the window of time a replica set member will be able to remain offline without falling too far behind the primary. If it does fall too far behind, it will need a full resync.
The field timeDiffHours as returned by db.getReplicationInfo() reports how many hours worth of data the oplog currently has recorded. After the oplog has filled up and starts overwriting old entries, then start to monitor this value. Do so especially under heavy write load (in which the value will decrease). If you then assume it will never drop below N hours, then N is the maximum number of hours that you can tolerate a replica set member being temporarily offline (e.g. for regular maintenance, or to make an offline backup, or in the event of hardware failure) without performing the full resync. The member would then be able to automatically catch up to the primary after coming back online.
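As a rough monitoring sketch (meaningful only once the oplog has wrapped at least once), both the window and the churn rate can be computed from db.getReplicationInfo() in the shell:

// Oplog window and approximate write churn on the primary.
var info = db.getReplicationInfo();
print("oplog window (hours): " + info.timeDiffHours);
// usedMB / timeDiffHours approximates the MB of oplog your workload generates per hour.
print("approx churn (MB/hour): " + (info.usedMB / info.timeDiffHours).toFixed(2));

Multiplying the churn rate by the number of hours N you want to tolerate gives a ballpark for the oplog size you actually need.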
If you're not comfortable with how low N is, then you should increase the size of the oplog. It completely depends on how long your maintenance windows are, or how quickly you or your ops team can respond to disaster scenarios. Be liberal in how much disk space you allocate for it, unless you have a compelling need for that space.
I'm assuming here that you're keeping the size of the oplog constant over all replica set members, which is a reasonable thing to do. If not, then plan for the scenario where the replica set member with the smallest oplog gets elected primary.
(To answer your other question: similarly to multi-updates, batch inserts are also fanned out into multiple operations in the oplog)
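You can see that fan-out for yourself by inspecting the oplog in the shell; test.mycoll below is a hypothetical namespace, so substitute one of your own:

// Each document of a batch insert shows up as its own "i" (insert) entry.
var oplog = db.getSiblingDB("local").oplog.rs;
oplog.find({ op: "i", ns: "test.mycoll" }).sort({ $natural: -1 }).limit(5).pretty()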
Edit: Note that data imports and bulk inserts/updates will write data significantly faster to the oplog than your application might at a typical heavy load. To reiterate: be conservative in your estimation for how much time it will take for the oplog to fill.

mongodb - All nodes in replica set are primary

I am trying to configure a replica set with two nodes but when I execute rs.add("node2") and then rs.status() both nodes are set to PRIMARY. Also when I run rs.status() on the other node the only node that appears is the local one.
Edit1:
rs.status() output:
{
"set" : "rs0",
"date" : ISODate("2012-09-22T01:01:12Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "node1:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 70968,
"optime" : Timestamp(1348207012000, 1),
"optimeDate" : ISODate("2012-09-21T05:56:52Z"),
"self" : true
},
{
"_id" : 1,
"name" : "node2:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 68660,
"optime" : Timestamp(1348205568000, 1),
"optimeDate" : ISODate("2012-09-21T05:32:48Z"),
"lastHeartbeat" : ISODate("2012-09-22T01:01:11Z"),
"pingMs" : 0
}
],
"ok" : 1
}
Edit2: I tried doing the same thing with 3 different nodes and I got the same result (rs.status() says I have a replica set with three primary nodes). Is it possible that this problem is caused by some specific configuration of the network?
If you issue rs.initiate() from both of the members of the replica set before rs.add(), then both will come up as primary.
You should only use rs.initiate() on one of the members of the replica set, the one that you intend to be primary initially. Then you can rs.add() the other member to the replica set.
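A minimal sketch of that sequence, using the hostnames from the question and run from a shell connected to node1 only:

// Initiate the set on node1, then add node2; never call rs.initiate() on node2.
rs.initiate()
rs.add("node2:27017")
rs.status()  // node1 should show PRIMARY, node2 SECONDARY once it has synced

Also note that a two-member set cannot maintain a majority if either member fails, so a third member or an arbiter is usually added.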
The answer above does not explain how to fix it. I kind of got it done using trial and error.
I cleaned up the data directory (as in rm -rf *) on the extra PRIMARY nodes, all except one, restarted them, and added them back. It seems to work.
Edit1
The nice little trick above did not seem to work for me,
so I logged into the mongod console using mongo <hostname>:27018.
Here is how the shell looks:
rs2:PRIMARY> rs.conf()
{
"_id" : "rs2",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "ip-10-159-42-911:27018"
}
]
}
I decided to change it to secondary. So,
rs2:PRIMARY> var c = {
... "_id" : "rs2",
... "version" : 1,
... "members" : [
... {
... "_id" : 1,
... "host" : "ip-10-159-42-911:27018",
... "priority": 0.5
... }
... ]
... }
rs2:PRIMARY> rs.reconfig(c, { "force": true})
Mon Nov 11 19:46:39.244 DBClientCursor::init call() failed
Mon Nov 11 19:46:39.245 trying reconnect to ip-10-159-42-911:27018
Mon Nov 11 19:46:39.245 reconnect ip-10-159-42-911:27018 ok
reconnected to server after rs command (which is normal)
rs2:SECONDARY>
Now it is secondary. I do not know if there is a better way. But this seems to work.
HTH