MongoDB SECONDARY becoming RECOVERING at nighttime - mongodb

I am running a conventional MongoDB Replica Set consisting of 3 members (member1 in datacenter A, member2 and member3 in datacenter B).
member1 is the current PRIMARY, and I add members 2 and 3 via rs.add(). They perform their initial sync and become SECONDARY shortly afterwards. Everything is fine all day, and the replication delay of both members is 0 seconds until 2 AM.
Now: every night at 2 AM both members shift into the RECOVERING state and stop replicating altogether, which leads to a replication delay of several hours by the time I check rs.printSlaveReplicationInfo() in the morning. I am not aware of any massive inserts or maintenance tasks at around 2 AM.
I get the following log entries on the PRIMARY:
2015-10-09T01:59:38.914+0200 [initandlisten] connection accepted from 192.168.227.209:59905 #11954 (37 connections now open)
2015-10-09T01:59:55.751+0200 [conn11111] warning: Collection dropped or state deleted during yield of CollectionScan
2015-10-09T01:59:55.869+0200 [conn11111] warning: Collection dropped or state deleted during yield of CollectionScan
2015-10-09T01:59:55.870+0200 [conn11111] getmore local.oplog.rs cursorid:1155433944036 ntoreturn:0 keyUpdates:0 numYields:1 locks(micros) r:32168 nreturned:0 reslen:20 134ms
2015-10-09T01:59:55.872+0200 [conn11111] end connection 192.168.227.209:58972 (36 connections now open)
More interestingly, I get the following log entries on both SECONDARYs:
2015-10-09T01:59:55.873+0200 [rsBackgroundSync] repl: old cursor isDead, will initiate a new one
2015-10-09T01:59:55.873+0200 [rsBackgroundSync] replSet syncing to: member1:27017
2015-10-09T01:59:56.065+0200 [rsBackgroundSync] replSet error RS102 too stale to catch up, at least from member1:27017
2015-10-09T01:59:56.066+0200 [rsBackgroundSync] replSet our last optime : Oct 9 01:59:23 5617035b:17f
2015-10-09T01:59:56.066+0200 [rsBackgroundSync] replSet oldest at member1:27017 : Oct 9 01:59:23 5617035b:1af
2015-10-09T01:59:56.066+0200 [rsBackgroundSync] replSet See http://dochub.mongodb.org/core/resyncingaverystalereplicasetmember
2015-10-09T01:59:56.066+0200 [rsBackgroundSync] replSet error RS102 too stale to catch up
2015-10-09T01:59:56.066+0200 [rsBackgroundSync] replSet RECOVERING
Also striking: the start of the oplog "resets" itself every night at around 2 AM:
configured oplog size: 990MB
log length start to end: 19485secs (5.41hrs)
oplog first event time: Fri Oct 09 2015 02:00:33 GMT+0200 (CEST)
oplog last event time: Fri Oct 09 2015 07:25:18 GMT+0200 (CEST)
now: Fri Oct 09 2015 07:25:26 GMT+0200 (CEST)
I am not sure whether this is related to the issue. I am also surprised that such a small gap (Oct 9 01:59:23 5617035b:17f <-> Oct 9 01:59:23 5617035b:1af) is enough to make the members stale.
Could this also be a server (VM host) time issue, or is it something completely different? (Why is the first oplog event "reset" every night rather than shifting to a timestamp like NOW minus 24 hrs?)
What can I do to investigate and avoid this?
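The two optimes in the RS102 message can be compared directly to see why even a small gap matters. The following is a minimal sketch in plain JavaScript (Node.js); parseOptime, compareOptimes and isTooStale are hypothetical helper names, not MongoDB functions, and the sketch assumes the hex pair in the log is <seconds since epoch>:<counter within that second>.

```javascript
// Hypothetical helpers for illustration; these are not part of MongoDB.
// Assumes the log prints an optime as "<hex seconds since epoch>:<hex counter>".
function parseOptime(text) {
  var parts = text.split(":");
  return { secs: parseInt(parts[0], 16), inc: parseInt(parts[1], 16) };
}

// Negative if a is older than b, positive if a is newer, 0 if they are equal.
function compareOptimes(a, b) {
  return a.secs !== b.secs ? a.secs - b.secs : a.inc - b.inc;
}

// A member is treated as too stale when its last optime is older than the
// oldest entry still present in the sync source's oplog.
function isTooStale(ourLast, oldestAtSource) {
  return compareOptimes(parseOptime(ourLast), parseOptime(oldestAtSource)) < 0;
}

var ours = parseOptime("5617035b:17f");
var oldest = parseOptime("5617035b:1af");
console.log("same second:", ours.secs === oldest.secs);
console.log("counter gap:", oldest.inc - ours.inc);
console.log("too stale:", isTooStale("5617035b:17f", "5617035b:1af"));
```

Under that assumption both optimes fall in the same second and differ only in the counter, so the size of the gap is irrelevant: what matters is that the entries the secondary still needs are no longer in the primary's oplog.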

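Increasing the oplog size should solve this (as discussed in the comments). The oplog is a capped collection, so its first event time is simply the oldest entry that has not been overwritten yet. A first event time of about 2 AM every morning suggests that a burst of writes around that time cycles through the entire 990MB, overwriting entries before the secondaries have fetched them.
Some references for others who run into this issue:
Workloads that Might Require a Larger Oplog Size
Error: replSet error RS102 too stale to catch up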
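As a rough starting point for choosing a new size, the figures from rs.printReplicationInfo() can be scaled to a target window. This is a minimal sketch in plain JavaScript (Node.js); estimateOplogSizeMB is a hypothetical helper, and the linear scaling assumes the write rate over the observed window is representative, which a nightly burst may well violate.

```javascript
// Hypothetical helper for illustration; not a MongoDB function.
// Scales the configured oplog size linearly from the observed window to a
// target window. Treat the result as a lower bound: a short burst of writes
// can consume the oplog much faster than the average rate suggests.
function estimateOplogSizeMB(configuredMB, observedWindowSecs, targetWindowSecs) {
  if (observedWindowSecs <= 0) {
    throw new Error("observed window must be positive");
  }
  return Math.ceil(configuredMB * targetWindowSecs / observedWindowSecs);
}

// Values from the question: 990MB covered 19485 secs (5.41 hrs).
var neededMB = estimateOplogSizeMB(990, 19485, 24 * 60 * 60);
console.log("oplog size for a 24h window: about " + neededMB + "MB");
```

How the size is actually changed depends on the MongoDB version: newer releases provide the replSetResizeOplog admin command, while older ones such as the one in the question require the offline procedure described in the manual.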

Related

MongoDB data corruption on a replica set

I am working with a MongoDB database running in a replica set.
Unfortunately, I noticed that the data appears to be corrupted.
There should be over 10,000 documents in the database. However, there are several thousand records that are not being returned in queries.
The total count DOES show the correct total.
db.records.find().count()
10793
And some records are returned when querying by RecordID (a custom sequence integer).
db.records.find({"RecordID": 10049})
{ "_id" : ObjectId("5dfbdb35c1c2a400104edece")
However, when I query for records that I know for a fact should exist, nothing is returned:
db.records.find({"RecordID": 10048})
db.records.find({"RecordID": 10047})
db.records.find({"RecordID": 10046})
The issue appears to be very sporadic, and in some cases entire ranges of records are missing. The entire range from RecordIDs 1500 to 8000 is missing.
Questions: What could be the cause of the issue? What can I do to troubleshoot this issue further and recover the corrupted data? I looked into running repairDatabase but that is for standalone instances only.
UPDATE:
More info on replication:
rs.printReplicationInfo()
configured oplog size: 5100.880859375MB
log length start to end: 14641107secs (4066.97hrs)
oplog first event time: Wed Mar 03 2021 05:21:25 GMT-0500 (EST)
oplog last event time: Thu Aug 19 2021 17:19:52 GMT-0400 (EDT)
now: Thu Aug 19 2021 17:20:01 GMT-0400 (EDT)
rs.printSecondaryReplicationInfo()
source: node2-examplehost.com:27017
syncedTo: Thu Aug 19 2021 17:16:42 GMT-0400 (EDT)
0 secs (0 hrs) behind the primary
source: node3-examplehost.com:27017
syncedTo: Thu Aug 19 2021 17:16:42 GMT-0400 (EDT)
0 secs (0 hrs) behind the primary
UPDATE 2:
We did a restore from a backup and somehow it looks like it fixed the issue.
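For anyone troubleshooting a similar case, it can help to pin down exactly which RecordID ranges a query fails to return. The following is a minimal sketch in plain JavaScript (Node.js); findMissingRanges is a hypothetical helper and the sample input is made up, not taken from the question.

```javascript
// Hypothetical helper for illustration. Given the RecordID values that a
// query actually returned, list the inclusive ranges that are missing
// between the smallest and the largest value.
function findMissingRanges(ids) {
  var sorted = ids.slice().sort(function (a, b) { return a - b; });
  var gaps = [];
  for (var i = 1; i < sorted.length; i++) {
    if (sorted[i] - sorted[i - 1] > 1) {
      gaps.push({ from: sorted[i - 1] + 1, to: sorted[i] - 1 });
    }
  }
  return gaps;
}

// Made-up sample input, not data from the question.
console.log(JSON.stringify(findMissingRanges([1, 2, 3, 7, 8, 10])));
```

Running the same check on IDs collected once through the RecordID index and once through a forced collection scan (for example with hint({$natural: 1})) may show whether the documents themselves or only an index is affected; the question does not confirm either cause.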

Spark rdd.count() yields inconsistent results

I'm a bit baffled.
A simple rdd.count() gives different results when run multiple times.
Here is the code I run:
val inputRdd = sc.newAPIHadoopRDD(inputConfig,
  classOf[com.mongodb.hadoop.MongoInputFormat],
  classOf[Long],
  classOf[org.bson.BSONObject])
println(inputRdd.count())
It opens a connection to a MongoDB server and simply counts the objects.
Seems pretty straightforward to me.
According to MongoDB there are 3,349,495 entries.
Here is my spark output, all ran the same jar:
spark1 : 3.257.048
spark2 : 3.303.272
spark3 : 3.303.272
spark4 : 3.303.272
spark5 : 3.303.271
spark6 : 3.303.271
spark7 : 3.303.272
spark8 : 3.303.272
spark9 : 3.306.300
spark10: 3.303.272
spark11: 3.303.271
Spark and MongoDb are run on the same cluster.
We are running:
Spark version 1.5.0-cdh5.6.1
Scala version 2.10.4
MongoDb version 2.6.12
Unfortunately, we cannot update these.
Is Spark non-deterministic?
Is there anyone who can enlighten me?
Thanks in advance
EDIT/ Further Info
I just noticed an error in our mongod.log.
Could this error cause the inconsistent behaviour?
[rsBackgroundSync] replSet not trying to sync from hadoop04:27017, it is vetoed for 333 more seconds
[rsBackgroundSync] replSet syncing to: hadoop05:27017
[rsBackgroundSync] replSet not trying to sync from hadoop05:27017, it is vetoed for 600 more seconds
[rsBackgroundSync] replSet not trying to sync from hadoop04:27017, it is vetoed for 333 more seconds
[rsBackgroundSync] replSet not trying to sync from hadoop05:27017, it is vetoed for 600 more seconds
[rsBackgroundSync] replSet not trying to sync from hadoop04:27017, it is vetoed for 333 more seconds
[rsBackgroundSync] replSet error RS102 too stale to catch up, at least from hadoop05:27017
[rsBackgroundSync] replSet our last optime : Jul 2 10:19:44 57777920:111
[rsBackgroundSync] replSet oldest at hadoop05:27017 : Jul 5 15:17:58 577bb386:59
[rsBackgroundSync] replSet See http://dochub.mongodb.org/core/resyncingaverystalereplicasetmember
[rsBackgroundSync] replSet error RS102 too stale to catch up
As you already spotted, the problem does not appear to be with Spark (or Scala) but with MongoDB.
As such the question regarding the difference seems to be resolved.
You will still want to troubleshoot the actual MongoDB error, the provided link can be a good starting point for that: http://dochub.mongodb.org/core/resyncingaverystalereplicasetmember
count returns an estimated count. As such, the value returned can change even if the number of documents hasn't changed.
countDocuments was added to MongoDB 4.0 to provide an accurate count (that also works in multi-document transactions).
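The RS102 lines quoted in the question also show how far behind the stale member was. The following minimal sketch in plain JavaScript (Node.js) computes the gap between its last optime and the oldest entry at hadoop05; optimeSeconds and gapSeconds are hypothetical helpers, and the sketch assumes the hex value before the colon is seconds since the epoch.

```javascript
// Hypothetical helpers for illustration; not part of MongoDB.
// Assumes an optime is logged as "<hex seconds since epoch>:<hex counter>".
function optimeSeconds(text) {
  return parseInt(text.split(":")[0], 16);
}

// Seconds between a member's last optime and the oldest entry at its source.
function gapSeconds(ourLast, oldestAtSource) {
  return optimeSeconds(oldestAtSource) - optimeSeconds(ourLast);
}

// Optimes copied from the mongod.log excerpt in the question.
var gap = gapSeconds("57777920:111", "577bb386:59");
console.log(gap + " secs (" + (gap / 3600).toFixed(2) + " hrs)");
```

A member that far behind would serve correspondingly old data if reads reached it, which is one possible explanation for counts below the expected total; the question does not say which member Spark read from.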

mongo replica failing to sync and start with replicaset

I have a 3-node MongoDB replica set. I managed to start the first two nodes, but the third one fails with:
[rsBackgroundSync] starting rollback: OplogStartMissing our last op time fetched: (term: 33, timestamp: Jan 22 09:34:52:1). source's GTE: (term: 34, timestamp: Jan 22 09:35:25:1) hashes: (-9060984734961038872/2476820215102251535)
2017-01-22T14:01:51.206+0000 F REPL [rsBackgroundSync] need to rollback, but in inconsistent state
2017-01-22T14:01:51.206+0000 I - [rsBackgroundSync] Fatal assertion 28723 UnrecoverableRollbackError need to rollback, but in inconsistent state. minvalid: (term: 38, timestamp: Jan 22 11:13:01:1) > our last optime: (term: 33, timestamp: Jan 22 09:34:52:1) # 18750
I took a mongodump from the primary, removed the third replica (mongoreplica3) from the replica set, and restored the dump onto it, but after I tried to add the node back into the replica set it still fails with the same error.
Any idea how I can manually sync and start mongoreplica3 with my replica set?
This was solved by removing everything from /data and starting mongoreplica3 again, after which it synced from the primary.

Secondary keeps rolling back

In the last 7 days, our secondary servers went down three times with the following messages. What do these errors mean? Why does it roll back? I have attached a screenshot of the oplog window and replication lag.
The server went down around 4 AM. Around 3:50 the replication lag went up to 300 seconds, but that is just 5 minutes, and the node's oplog window is larger than that.
We take backups using MMS from one of the secondaries; could this be the cause of the issue?
Mon May 19 03:50:27.146 [rsBackgroundSync] replSet syncing to: xxxx.prod.xxxx.net:17017
Mon May 19 03:50:27.231 [rsBackgroundSync] replSet our last op time fetched: May 19 03:50:16:152
Mon May 19 03:50:27.231 [rsBackgroundSync] replset source's GTE: May 19 03:50:16:153
Mon May 19 03:50:27.231 [rsBackgroundSync] replSet rollback 0
Mon May 19 03:50:27.231 [rsBackgroundSync] replSet ROLLBACK
Mon May 19 03:50:27.231 [rsBackgroundSync] replSet rollback 1
Mon May 19 03:50:27.231 [rsBackgroundSync] replSet rollback 2 FindCommonPoint
Mon May 19 03:50:27.232 [rsBackgroundSync] replSet info rollback our last optime: May 19 03:50:16:152
Mon May 19 03:50:27.232 [rsBackgroundSync] replSet info rollback their last optime: May 19 03:50:16:155
Mon May 19 03:50:27.232 [rsBackgroundSync] replSet info rollback diff in end of log times: 0 seconds
Mon May 19 03:50:27.691 [rsBackgroundSync] replSet rollback found matching events at Mar 13 06:12:22:11
Mon May 19 03:50:27.691 [rsBackgroundSync] replSet rollback findcommonpoint scanned : 222891
Mon May 19 03:50:27.691 [rsBackgroundSync] replSet replSet rollback 3 fixup
Mon May 19 03:50:30.065 [rsBackgroundSync] replSet rollback 3.5
Mon May 19 03:50:30.065 [rsBackgroundSync] replSet rollback 4 n:7018
Mon May 19 03:50:30.065 [rsBackgroundSync] replSet minvalid=May 19 03:50:16 5379e1e8:155
Mon May 19 03:50:30.065 [rsBackgroundSync] replSet rollback 4.6
Mon May 19 03:50:30.065 [rsBackgroundSync] replSet rollback 4.7
Mon May 19 03:50:30.443 [rsBackgroundSync] ERROR: rollback cannot find object by id
Mon May 19 03:50:30.444 [rsBackgroundSync] ERROR: rollback cannot find object by id
Mon May 19 03:50:30.444 [rsBackgroundSync] replSet rollback 5 d:4 u:7016
Mon May 19 03:50:30.460 [rsBackgroundSync] replSet rollback 6
We found that the oplog on the primary had somehow become corrupted. We detected it by running the following queries:
db.oplog.rs.find().sort({$natural:1}).explain()
db.oplog.rs.find().sort({$natural:-1}).explain()
So we stepped down the primary and did a fresh sync.

MongoDB: How to remove an index on a replicaset?

I see that the MongoDB documentation says that an index is removed by calling db.accounts.dropIndex( { "tax-id": 1 } ). But it does not say whether the node needs to be removed from the replica set first.
I took a secondary node in a replica set offline, restarted it as a standalone node (on a different port), and tried to drop the index.
But after bringing the node back into the replica set with the regular process, sudo service mongod start, the mongod process dies, reporting that the index is corrupted.
Thu Oct 31 19:52:38.098 [repl writer worker 1] Assertion: 15898:error in index possibly corruption consider repairing 382
0xdddd81 0xd9f55b 0xd9fa9c 0x7edb83 0x7fb332 0x7fdc08 0x9d3b50 0x9c796e 0x9deb64 0xac45dd 0xac58df 0xa903fa 0xa924c7 0xa71f6c 0xc273d3 0xc26b18 0xdab721 0xe26609 0x7ff4d05f0c6b 0x7ff4cf9965ed
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdddd81]
/usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x9b) [0xd9f55b]
/usr/bin/mongod() [0xd9fa9c]
/usr/bin/mongod(_ZN5mongo11checkFailedEj+0x143) [0x7edb83]
/usr/bin/mongod(_ZNK5mongo12BucketBasicsINS_12BtreeData_V1EE11basicInsertENS_7DiskLocERiS3_RKNS_5KeyV1ERKNS_8OrderingE+0x222) [0x7fb332]
/usr/bin/mongod(_ZNK5mongo11BtreeBucketINS_12BtreeData_V1EE10insertHereENS_7DiskLocEiS3_RKNS_5KeyV1ERKNS_8OrderingES3_S3_RNS_12IndexDetailsE+0x68) [0x7fdc08]
/usr/bin/mongod(_ZNK5mongo30IndexInsertionContinuationImplINS_12BtreeData_V1EE22doIndexInsertionWritesEv+0xa0) [0x9d3b50]
/usr/bin/mongod(_ZN5mongo14IndexInterface13IndexInserter19finishAllInsertionsEv+0x1e) [0x9c796e]
/usr/bin/mongod(_ZN5mongo24indexRecordUsingTwoStepsEPKcPNS_16NamespaceDetailsENS_7BSONObjENS_7DiskLocEb+0x754) [0x9deb64]
/usr/bin/mongod(_ZN5mongo11DataFileMgr6insertEPKcPKvibbbPb+0x123d) [0xac45dd]
/usr/bin/mongod(_ZN5mongo11DataFileMgr16insertWithObjModEPKcRNS_7BSONObjEbb+0x4f) [0xac58df]
/usr/bin/mongod(_ZN5mongo14_updateObjectsEbPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEPNS_11RemoveSaverEbRKNS_24QueryPlanSelectionPolicyEb+0x2eda) [0xa903fa]
/usr/bin/mongod(_ZN5mongo27updateObjectsForReplicationEPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEbRKNS_24QueryPlanSelectionPolicyE+0xb7) [0xa924c7]
/usr/bin/mongod(_ZN5mongo21applyOperation_inlockERKNS_7BSONObjEbb+0x65c) [0xa71f6c]
/usr/bin/mongod(_ZN5mongo7replset8SyncTail9syncApplyERKNS_7BSONObjEb+0x713) [0xc273d3]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x48) [0xc26b18]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdab721]
/usr/bin/mongod() [0xe26609]
/lib64/libpthread.so.0(+0x7c6b) [0x7ff4d05f0c6b]
/lib64/libc.so.6(clone+0x6d) [0x7ff4cf9965ed]
Thu Oct 31 19:52:38.106 [repl writer worker 1] ERROR: writer worker caught exception: error in index possibly corruption consider repairing 382 on:
xxxxxxxx--deleted content related to the data...xxxxxxxxxxxxx
Thu Oct 31 19:52:38.106 [repl writer worker 1] Fatal Assertion 16360
0xdddd81 0xd9dc13 0xc26bfc 0xdab721 0xe26609 0x7ff4d05f0c6b 0x7ff4cf9965ed
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdddd81]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xd9dc13]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc26bfc]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdab721]
/usr/bin/mongod() [0xe26609]
/lib64/libpthread.so.0(+0x7c6b) [0x7ff4d05f0c6b]
/lib64/libc.so.6(clone+0x6d) [0x7ff4cf9965ed]
Thu Oct 31 19:52:38.108 [repl writer worker 1]
***aborting after fassert() failure
Thu Oct 31 19:52:38.108 Got signal: 6 (Aborted).
Is this due to dropping the index in offline mode on the secondary? Any suggestions on the proper way to drop the index are highly appreciated.
The proper way to remove an index from a replica set is to drop it on the primary. The idea of a replica set is that every member holds the same copy of the data (with small time lags), so whatever you do on the primary is replicated to the secondaries: once an operation finishes on the primary, it propagates to the secondaries.
If you drop an index on the primary, it will be dropped on the secondaries as well.
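A minimal sketch of that advice in plain JavaScript: dropIndexOnPrimary is a hypothetical wrapper (not a MongoDB function) that refuses to drop unless it is connected to the primary. It assumes db behaves like the mongo shell's database object, with db.isMaster() and db.getCollection(); newer shells prefer db.hello(). The fakeDb stand-in only exists so the sketch can run in Node.js without a server.

```javascript
// Hypothetical helper for illustration; not part of MongoDB.
// `db` is expected to behave like the mongo shell's database object.
function dropIndexOnPrimary(db, collectionName, keyPattern) {
  // Only drop on the primary, so the change reaches the secondaries through
  // replication instead of being applied to one member in isolation.
  if (!db.isMaster().ismaster) {
    throw new Error("not connected to the primary; reconnect and retry");
  }
  return db.getCollection(collectionName).dropIndex(keyPattern);
}

// Stand-in for a real connection so the sketch can run outside the shell.
function fakeDb(isPrimary, dropped) {
  return {
    isMaster: function () { return { ismaster: isPrimary }; },
    getCollection: function (name) {
      return {
        dropIndex: function (keys) {
          dropped.push({ coll: name, keys: keys });
          return { ok: 1 };
        }
      };
    }
  };
}

var dropped = [];
dropIndexOnPrimary(fakeDb(true, dropped), "accounts", { "tax-id": 1 });
console.log(JSON.stringify(dropped));
```

In a real shell session connected to the primary, the call would be dropIndexOnPrimary(db, "accounts", { "tax-id": 1 }), which is equivalent to the db.accounts.dropIndex( { "tax-id": 1 } ) call from the question plus the primary check.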