MongoDB ReplicaSet is in state RS_DOWN and InterruptedDueToReplStateChange Exception - mongodb

We set up a MongoDB replica set with 3 nodes (version 3.6). The MongoDB client application is now throwing this exception:
<< ErrorHandlerProcessor >> Query failed with error code 96 and error message 'Executor error during find command: InterruptedDueToReplStateChange: operation was interrupted' on server mongodb-0.mongodb-internal.dp-common-database.svc.cluster.local:27017; nested exception is com.mongodb.MongoQueryException: Query failed with error code 96 and error message 'Executor error during find command: InterruptedDueToReplStateChange: operation was interrupted' on server mongodb-0.mongodb-internal.dp-common-database.svc.cluster.local:27017
After checking the MongoDB server logs, we noticed that the set elected a new primary at that time for some reason, but we cannot find any errors. Can anyone help me identify the cause, or explain how to troubleshoot this issue?
Thanks.
Below are the MongoDB logs from two nodes for that period:
mongodb-0
2020-09-30T06:19:32.238+0000 I NETWORK [conn38321] end connection 172.28.42.10:58362 (115 connections now open)
2020-09-30T06:40:15.730+0000 I COMMAND [PeriodicTaskRunner] task: UnusedLockCleaner took: 259ms
2020-09-30T06:40:15.757+0000 I COMMAND [conn38197] command admin.$cmd command: isMaster { ismaster: 1, $db: "admin" } numYields:0 reslen:793 locks:{} protocol:op_msg 107ms
2020-09-30T06:40:15.849+0000 I REPL [replexec-2645] Member mongodb-1.mongodb-internal.dp-common-database.svc.cluster.local:27017 is now in state RS_DOWN
2020-09-30T06:40:15.854+0000 I REPL [replexec-2645] Member mongodb-2.mongodb-internal.dp-common-database.svc.cluster.local:27017 is now in state RS_DOWN
2020-09-30T06:40:15.854+0000 I REPL [replexec-2645] can't see a majority of the set, relinquishing primary
2020-09-30T06:40:15.854+0000 I REPL [replexec-2645] Stepping down from primary in response to heartbeat
2020-09-30T06:40:15.865+0000 I REPL [replexec-2648] Member mongodb-2.mongodb-internal.dp-common-database.svc.cluster.local:27017 is now in state SECONDARY
2020-09-30T06:40:15.873+0000 I REPL [replexec-2649] Member mongodb-1.mongodb-internal.dp-common-database.svc.cluster.local:27017 is now in state SECONDARY
2020-09-30T06:40:15.885+0000 E QUERY [conn38282] Plan executor error during find command: DEAD, stats: { stage: "LIMIT", nReturned: 1, executionTimeMillisEstimate: 20, works: 1, advanced: 1, needTime: 0, needYield: 0, saveState: 0, restoreState: 0, isEOF: 1, invalidates: 0, limitAmount: 1, inputStage: { stage: "FETCH", nReturned: 1, executionTimeMillisEstimate: 20, works: 1, advanced: 1, needTime: 0, needYield: 0, saveState: 0, restoreState: 0, isEOF: 0, invalidates: 0, docsExamined: 1, alreadyHasObj: 0, inputStage: { stage: "IXSCAN", nReturned: 1, executionTimeMillisEstimate: 20, works: 1, advanced: 1, needTime: 0, needYield: 0, saveState: 0, restoreState: 0, isEOF: 0, invalidates: 0, keyPattern: { consumer: 1.0, channel: 1.0, externalTransactionId: -1.0 }, indexName: "lastsequence", isMultiKey: false, multiKeyPaths: { consumer: [], channel: [], externalTransactionId: [] }, isUnique: false, isSparse: false, isPartial: false, indexVersion: 2, direction: "forward", indexBounds: { consumer: [ "["som", "som"]" ], channel: [ "["normalChannelMobile", "normalChannelMobile"]" ], externalTransactionId: [ "[MaxKey, MinKey]" ] }, keysExamined: 1, seeks: 1, dupsTested: 0, dupsDropped: 0, seenInvalidated: 0 } } }
2020-09-30T06:40:15.887+0000 I REPL [replexec-2645] transition to SECONDARY from PRIMARY
mongodb-1
2020-09-30T06:38:40.871+0000 I REPL [replication-343] Canceling oplog query due to OplogQueryMetadata. We have to choose a new sync source. Current source: mongodb-0.mongodb-internal.dp-common-database.svc.cluster.local:27017, OpTime { ts: Timestamp(1601448015, 5), t: 19 }, its sync source index:-1
2020-09-30T06:38:40.871+0000 I REPL [replication-343] Choosing new sync source because our current sync source, mongodb-0.mongodb-internal.dp-common-database.svc.cluster.local:27017, has an OpTime ({ ts: Timestamp(1601448015, 5), t: 19 }) which is not ahead of ours ({ ts: Timestamp(1601448015, 5), t: 19 }), it does not have a sync source, and it's not the primary (sync source does not know the primary)
[replexec-7147] Starting an election, since we've seen no PRIMARY in the past 10000ms
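When troubleshooting an election like this, it can help to pull the member state changes out of the mongod log and read them as a timeline. The following is only a minimal sketch: the regular expression is an assumption based on the log lines quoted above (3.6-style text logs), and `member_state_changes` is a hypothetical helper name, not part of any MongoDB tool.

```python
import re

# Matches lines such as:
#   <timestamp> I REPL [replexec-2645] Member <host:port> is now in state RS_DOWN
# The pattern is an assumption derived from the log excerpt in the question.
MEMBER_STATE = re.compile(
    r"^(?P<ts>\S+) I REPL\s+\[[^\]]+\] "
    r"Member (?P<member>\S+) is now in state (?P<state>\w+)"
)


def member_state_changes(lines):
    """Return (timestamp, member, state) tuples for every member state change."""
    changes = []
    for line in lines:
        match = MEMBER_STATE.match(line)
        if match:
            changes.append(
                (match.group("ts"), match.group("member"), match.group("state"))
            )
    return changes


# Three lines copied from the mongodb-0 log in the question.
sample = [
    "2020-09-30T06:40:15.849+0000 I REPL [replexec-2645] Member mongodb-1.mongodb-internal.dp-common-database.svc.cluster.local:27017 is now in state RS_DOWN",
    "2020-09-30T06:40:15.854+0000 I REPL [replexec-2645] can't see a majority of the set, relinquishing primary",
    "2020-09-30T06:40:15.873+0000 I REPL [replexec-2649] Member mongodb-1.mongodb-internal.dp-common-database.svc.cluster.local:27017 is now in state SECONDARY",
]

for ts, member, state in member_state_changes(sample):
    # Print only the short host name to keep the timeline readable.
    print(ts, member.split(".")[0], state)
```

In the excerpt above, both members are reported as RS_DOWN and then as SECONDARY again within about 25 ms. A timeline like this makes that kind of short gap easier to spot, though it does not by itself say what caused it.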

Primary election isn't done after primary is killed on a MongoDB Cluster

I am trying to test a failover scenario on a MongoDB cluster. When I stop the primary, I don't see any new primary election in my Java code's logs, and read/write operations are ignored with the following message:
No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=MULTIPLE, serverDescriptions=[ServerDescription{address=mongo1:30001, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused (Connection refused)}}, ServerDescription{address=mongo2:30002, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=3215664, setName='rs0', canonicalAddress=mongo2:30002, hosts=[mongo1:30001], passives=[mongo2:30002, mongo3:30003], arbiters=[], primary='null', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Fri Mar 26 02:08:27 CET 2021, lastUpdateTimeNanos=91832460163658}, ServerDescription{address=mongo3:30003, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=3283858, setName='rs0', canonicalAddress=mongo3:30003, hosts=[mongo1:30001], passives=[mongo2:30002, mongo3:30003], arbiters=[], primary='null', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Fri Mar 26 02:08:27 CET 2021, lastUpdateTimeNanos=91832459878686}]}. Waiting for 30000 ms before timing out
I am using the following config:
var cfg = {
"_id": "rs0",
"protocolVersion": 1,
"version": 1,
"members": [
{
"_id": 0,
"host": "mongo1:30001",
"priority": 4
},
{
"_id": 1,
"host": "mongo2:30002",
"priority": 3
},
{
"_id": 2,
"host": "mongo3:30003",
"priority": 2,
}
]
};
rs.initiate(cfg, { force: true });
rs.secondaryOk();
db.getMongo().setReadPref('primary');
rs.isMaster() returns this:
{
"hosts" : [
"mongo1:30001"
],
"passives" : [
"mongo2:30002",
"mongo3:30003"
],
"setName" : "rs0",
"setVersion" : 1,
"ismaster" : true,
"secondary" : false,
"primary" : "mongo1:30001",
"me" : "mongo1:30001",
"electionId" : ObjectId("7fffffff0000000000000017"),
"lastWrite" : {
"opTime" : {
"ts" : Timestamp(1616719738, 1),
"t" : NumberLong(23)
},
"lastWriteDate" : ISODate("2021-03-26T00:48:58Z"),
"majorityOpTime" : {
"ts" : Timestamp(1616719738, 1),
"t" : NumberLong(23)
},
"majorityWriteDate" : ISODate("2021-03-26T00:48:58Z")
},
"maxBsonObjectSize" : 16777216,
"maxMessageSizeBytes" : 48000000,
"maxWriteBatchSize" : 100000,
"localTime" : ISODate("2021-03-26T00:49:08.019Z"),
"logicalSessionTimeoutMinutes" : 30,
"connectionId" : 28,
"minWireVersion" : 0,
"maxWireVersion" : 8,
"readOnly" : false,
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1616719738, 1),
"signature" : {
"hash" : BinData(0,"/+QXGSyYY+M/OXbZ1UixjrDOVz4="),
"keyId" : NumberLong("6942620613131370499")
}
},
"operationTime" : Timestamp(1616719738, 1)
}
What I see here is that the hosts list contains the primary node and the passives list contains the secondaries. I don't know under what conditions all nodes are listed under hosts in a cluster setup, so that passives is empty. The only related information I found is that the priority of the secondaries should not be 0; otherwise they won't be considered candidates in the primary election.
"mongo1:30001"
],
"passives" : [
"mongo2:30002",
"mongo3:30003"
],...
From the docs:
isMaster.passives
An array of strings in the format of "[hostname]:[port]" listing all members of the replica set which have a members[n].priority of 0.
This field only appears if there is at least one member with a members[n].priority of 0.
Those nodes have been set to priority 0 somehow, and will therefore never attempt to become primary.
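To make the rule from the docs concrete, here is a small sketch that splits a member list into hosts and passives by priority. It is an illustration only: `split_hosts_passives` is a hypothetical helper, and it assumes a set with no arbiters or hidden members, which are reported separately or not at all.

```python
def split_hosts_passives(members):
    """Split replica set members the way the quoted isMaster docs describe."""
    hosts, passives = [], []
    for member in members:
        # priority defaults to 1 when the field is absent from the config.
        if member.get("priority", 1) == 0:
            passives.append(member["host"])
        else:
            hosts.append(member["host"])
    return hosts, passives


# The config the asker intended to apply (priorities 4, 3, 2).
intended = [
    {"_id": 0, "host": "mongo1:30001", "priority": 4},
    {"_id": 1, "host": "mongo2:30002", "priority": 3},
    {"_id": 2, "host": "mongo3:30003", "priority": 2},
]

# A config that would be consistent with the isMaster output in the question.
observed = [
    {"_id": 0, "host": "mongo1:30001", "priority": 4},
    {"_id": 1, "host": "mongo2:30002", "priority": 0},
    {"_id": 2, "host": "mongo3:30003", "priority": 0},
]

print(split_hosts_passives(intended))
print(split_hosts_passives(observed))
```

With the intended priorities every member lands in hosts, so the isMaster output in the question suggests that the running config differs from the one shown. Comparing against rs.conf() on the primary would confirm which priorities are actually in effect.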

MongoDB - CPU utilisation goes beyond 70% on huge update

I have a MongoDB collection named test. An update operation runs on this collection every 30 seconds. Sometimes CPU utilization goes above 80% and the connection times out. Also, when I enabled profiling and checked /var/log/mongodb/mongod.log, the update operation took more than 40 seconds to complete. The update query is very big and looks like this.
Update Query
test.update({_id: xxxxxxxx},{ $set: {
c_id: "5803a892b6646ad17b2b7a67", s_id: "58f29ee1d6ee0610543f152e",
c_id: "58f29a38c91379637619605c", m: "78a3512ad29c",
date: new Date(1505520000000), l.0.h.0.t: 0, l.0.m.0.0.t: 0,
l.0.h.0.t: 0, l.0.m.0.0.t: 0, l.0.h.0.r: 0, l.0.m.0.0.r: 0,
l.0.h.0.r: 0, l.0.m.0.0.r: 0, l.1.h.0.t: 0, l.1.m.0.0.t: 0,
l.1.h.0.t: 0, l.1.m.0.0.t: 0, l.1.h.0.r: 0, l.1.m.0.0.r: 0,
l.1.h.0.r: 0, l.1.m.0.0.r: 0, l.0.m.0.1.t: 0, l.0.m.0.1.t: 0,
l.0.m.0.1.rxb: 0, l.0.m.0.1.r: 0,
l.1.m.0.1.t: 0, l.1.m.0.1.t: 0, l.1.m.0.1.r: 0, l.1.m.0.1.r: 0,
l.0.m.0.2.t: 0,
.....................
.....................
.....................
.....................
.....................
}})
The query is very large, so I have posted only part of it, and the schema is also quite involved. How can we improve the performance of this update query? Should I optimize the schema to reduce the number of nested documents?
I would like to know what steps I can take to improve this update query.
.explain() output for Update Query
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.stats",
"indexFilterSet" : false,
"winningPlan" : {
"stage" : "UPDATE",
"inputStage" : {
"stage" : "IDHACK"
}
},
"rejectedPlans" : []
},
"serverInfo" : {
"host" : "abcxyz",
"port" : 27017,
"version" : "3.4.4",
"gitVersion" : "988390515874a9debd1b6c5d36559ca86b4babj"
},
"ok" : 1.0
}
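One detail visible in the posted update is that several paths appear more than once in the same $set (for example c_id and l.0.h.0.t). A minimal sketch of collapsing such repeats before sending the update is shown below. `build_set_document` is a hypothetical helper, and whether the repeats contribute to the slow updates is not known from the post.

```python
def build_set_document(pairs):
    """Collapse (path, value) pairs into one $set document, reporting repeats."""
    fields = {}
    repeated = []
    for path, value in pairs:
        if path in fields:
            repeated.append(path)
        fields[path] = value  # the last value given for a path wins
    return {"$set": fields}, repeated


# A few paths taken from the update shown in the question.
pairs = [
    ("c_id", "5803a892b6646ad17b2b7a67"),
    ("s_id", "58f29ee1d6ee0610543f152e"),
    ("c_id", "58f29a38c91379637619605c"),
    ("l.0.h.0.t", 0),
    ("l.0.m.0.0.t", 0),
    ("l.0.h.0.t", 0),
]

update, repeated = build_set_document(pairs)
print(len(update["$set"]), "distinct paths; repeated:", repeated)
```

Reporting the repeated paths, rather than silently dropping them, makes it easier to find where the application generates them in the first place.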

MongoDB explains totalKeysExamined more than limit

I have a very large collection (millions of documents) with data which looks like:
u'timestamp': 1481454871423.0,
u'_id': ObjectId('584d351772c4d8106cc43116'),
u'data': {
...
},
u'geocode': [{u'providerId': 2, u'placeId': 97459515},
{u'providerId': 3, u'placeId': 237},
{u'providerId': 3, u'placeId': 3}]
I want a query which targets a providerId and placeId pair, and returns 10 records only, within a timestamp range.
To that end I perform queries like:
'geocode.providerId': 3,
'geocode.placeId': 3
'timestamp': { '$gte': 1481454868360L,
'$lt': 1481454954839L }
And I provide a hint, to target the index which looks like:
[('geocode.providerId', 1), ('geocode.placeId', 1), ('timestamp', 1)]
where 1 is ascending. Before iterating over the returned cursor, it is limited to 10 records and sorted ascending on timestamp (which should be its default state due to the index).
A pymongo query looks like:
collection.find(findDic).hint(hint).sort([('timestamp', pymongo.ASCENDING)]).skip(0).limit(10)
The explain output for the query comes back looking like:
{
u'executionStats': {
u'executionTimeMillis': 1270,
u'nReturned': 10,
u'totalKeysExamined': 568686,
u'allPlansExecution': [],
u'executionSuccess': True,
u'executionStages': {
u'needYield': 0,
u'saveState': 4442,
u'memUsage': 54359,
u'restoreState': 4442,
u'memLimit': 33554432,
u'isEOF': 1,
u'inputStage': {
u'needYield': 0,
u'saveState': 4442,
u'restoreState': 4442,
u'isEOF': 1,
u'inputStage': {
u'needYield': 0,
u'docsExamined': 284964,
u'saveState': 4442,
u'restoreState': 4442,
u'isEOF': 1,
u'inputStage': {
u'saveState': 4442,
u'isEOF': 1,
u'seenInvalidated': 0,
u'keysExamined': 568686,
u'nReturned': 284964,
u'invalidates': 0,
u'keyPattern': {u'geocode.providerId': 1,
u'timestamp': 1,
u'geocode.placeId': 1},
u'isUnique': False,
u'needTime': 283722,
u'isMultiKey': True,
u'executionTimeMillisEstimate': 360,
u'dupsTested': 568684,
u'restoreState': 4442,
u'direction': u'forward',
u'indexName': u'geocode.providerId_1_geocode.placeId_1_timestamp_1',
u'isSparse': False,
u'advanced': 284964,
u'stage': u'IXSCAN',
u'dupsDropped': 283720,
u'needYield': 0,
u'isPartial': False,
u'indexBounds': {u'geocode.providerId': [u'[3, 3]'
],
u'timestamp': [u'[-inf.0, 1481455513378)'
],
u'geocode.placeId': [u'[MinKey, MaxKey]'
]},
u'works': 568687,
u'indexVersion': 1,
},
u'nReturned': 252823,
u'needTime': 315863,
u'filter': {u'$and': [{u'geocode.placeId': {u'$eq': 3}},
{u'timestamp': {u'$gte': 1481405886510L}}]},
u'executionTimeMillisEstimate': 970,
u'alreadyHasObj': 0,
u'invalidates': 0,
u'works': 568687,
u'advanced': 252823,
u'stage': u'FETCH',
},
u'nReturned': 0,
u'needTime': 315864,
u'executionTimeMillisEstimate': 1150,
u'invalidates': 0,
u'works': 568688,
u'advanced': 0,
u'stage': u'SORT_KEY_GENERATOR',
},
u'nReturned': 10,
u'needTime': 568688,
u'sortPattern': {u'timestamp': 1},
u'executionTimeMillisEstimate': 1200,
u'limitAmount': 10,
u'invalidates': 0,
u'works': 568699,
u'advanced': 10,
u'stage': u'SORT',
},
u'totalDocsExamined': 284964,
},
u'queryPlanner': {
u'parsedQuery': {u'$and': [{u'geocode.placeId': {u'$eq': 3}},
{u'geocode.providerId': {u'$eq': 3}},
{u'timestamp': {u'$lt': 1481455513378L}},
{u'timestamp': {u'$gte': 1481405886510L}}]},
u'rejectedPlans': [],
u'namespace': u'mxp957.tweet_244de17a-aa75-4da9-a6d5-97b9281a3b55',
u'winningPlan': {
u'sortPattern': {u'timestamp': 1},
u'inputStage': {u'inputStage': {u'filter': {u'$and': [{u'geocode.placeId': {u'$eq': 3}},
{u'timestamp': {u'$gte': 1481405886510L}}]},
u'inputStage': {
u'direction': u'forward',
u'indexName': u'geocode.providerId_1_geocode.placeId_1_timestamp_1',
u'isUnique': False,
u'isSparse': False,
u'isPartial': False,
u'indexBounds': {u'geocode.providerId': [u'[3, 3]'],
u'timestamp': [u'[-inf.0, 1481455513378)'
],
u'geocode.placeId': [u'[MinKey, MaxKey]'
]},
u'isMultiKey': True,
u'stage': u'IXSCAN',
u'indexVersion': 1,
u'keyPattern': {u'geocode.providerId': 1,
u'timestamp': 1,
u'geocode.placeId': 1},
}, u'stage': u'FETCH'},
u'stage': u'SORT_KEY_GENERATOR'},
u'limitAmount': 10,
u'stage': u'SORT',
},
u'indexFilterSet': False,
u'plannerVersion': 1,
},
u'ok': 1.0,
u'serverInfo': {
u'host': u'rabbit',
u'version': u'3.2.11',
u'port': 27017,
u'gitVersion': u'009580ad490190ba33d1c6253ebd8d91808923e4',
},
}
I don't understand why all of these documents need to be examined. In the case above, the size of the collection is only 284587 which means that every record was looked at twice! I want totalKeysExamined to only be 10, but am struggling to see how to achieve this.
I am using MongoDB version 3.2.11 and pymongo.
As Astro mentioned, the issue is that MongoDB was not using the index effectively.
The MongoDB team says that this is resolved in later versions:
https://jira.mongodb.org/browse/SERVER-27386
Another option is to remove providerId from the index. In my use case, providerId is one of two values and will almost always be the same value. It represents which API was used to geocode; my system only supports two, and only has one enabled at any one time.
See the commit that would resolve this:
https://github.com/watfordxp/GeoTweetSearch/commit/420536e4a138fb22e0dd0e61ef9c83c23a9263c1
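A long explain document like the one in the question is easier to read when reduced to one row per stage. The sketch below walks the executionStages tree from the top stage down to the index scan. `summarize_stages` is a hypothetical helper, the dict is trimmed to a few fields copied from the question, and it assumes a single chain of inputStage entries (plans with several input stages are not handled).

```python
def summarize_stages(stage):
    """Walk an explain() executionStages tree from the top stage to the leaf."""
    rows = []
    while stage is not None:
        rows.append(
            (stage["stage"], stage.get("nReturned"), stage.get("keysExamined"))
        )
        stage = stage.get("inputStage")
    return rows


# Trimmed copy of the executionStages from the question.
execution_stages = {
    "stage": "SORT",
    "nReturned": 10,
    "limitAmount": 10,
    "inputStage": {
        "stage": "SORT_KEY_GENERATOR",
        "nReturned": 0,
        "inputStage": {
            "stage": "FETCH",
            "nReturned": 252823,
            "docsExamined": 284964,
            "inputStage": {
                "stage": "IXSCAN",
                "nReturned": 284964,
                "keysExamined": 568686,
            },
        },
    },
}

for name, returned, keys in summarize_stages(execution_stages):
    print(name, returned, keys)
```

Read this way, the plan shows a SORT stage on top, so the limit of 10 is applied only after the sort has consumed every matching document. The index bounds on geocode.placeId are [MinKey, MaxKey], which means placeId is filtered in the FETCH stage rather than in the index scan.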

MongoDB hidden secondary stuck in startup?

I am creating a hidden secondary MongoDB instance that will eventually be used for reporting. So far I have taken these steps:
Started up my primary instance (local machine) with replSet = mySet and called rs.initiate()
Started up my secondary instance with replSet = mySet
Called rs.add("my.secondary.com") from my primary instance
Set priority = 0 and hidden = true for the secondary member using rs.reconfig(cfg)
When I do this and call rs.status() I get the following output:
{
"set": "mySet",
"date": ISODate("2016-03-22T16:40:39.515Z"),
"myState": 1,
"members": [
{
"_id": 0,
"name": "my-machine.local:27017",
"health": 1,
"state": 1,
"stateStr": "PRIMARY",
"uptime": 607,
"optime": Timestamp(1458664559, 1),
"optimeDate": ISODate("2016-03-22T16:35:59Z"),
"electionTime": Timestamp(1458664264, 2),
"electionDate": ISODate("2016-03-22T16:31:04Z"),
"configVersion": 3,
"self": true
},
{
"_id": 1,
"name": "my.secondary.com:27017",
"health": 1,
"state": 0,
"stateStr": "STARTUP",
"uptime": 384,
"optime": Timestamp(0, 0),
"optimeDate": ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat": ISODate("2016-03-22T16:40:38.332Z"),
"lastHeartbeatRecv": ISODate("1970-01-01T00:00:00Z"),
"pingMs": 106,
"configVersion": -2
}
],
"ok": 1
}
Notice that stateStr for my secondary is STARTUP - this never changes and the data never replicates. In a previous attempt I also called rs.initiate() on my secondary, but that made what was intended to be the secondary become the primary. I had to blow everything away and start again.
Why is my secondary stuck in STARTUP and how can I get my data to begin replicating from my primary to my secondary?
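As a rough aid for reading rs.status() output like the one above, the following sketch lists members that are in neither PRIMARY nor SECONDARY state. `members_not_replicating` is a hypothetical helper, and treating only those two states as healthy is an assumption (an arbiter, for instance, would also be flagged).

```python
def members_not_replicating(status):
    """Return (name, stateStr, configVersion) for members not PRIMARY/SECONDARY."""
    healthy = {"PRIMARY", "SECONDARY"}
    return [
        (m["name"], m["stateStr"], m.get("configVersion"))
        for m in status["members"]
        if m["stateStr"] not in healthy
    ]


# Trimmed copy of the rs.status() output from the question.
status = {
    "set": "mySet",
    "members": [
        {"_id": 0, "name": "my-machine.local:27017",
         "stateStr": "PRIMARY", "configVersion": 3},
        {"_id": 1, "name": "my.secondary.com:27017",
         "stateStr": "STARTUP", "configVersion": -2},
    ],
}

print(members_not_replicating(status))
```

In the output above, the stuck member reports configVersion -2 and a lastHeartbeatRecv at the epoch, while the primary is at configVersion 3. One thing worth checking (an assumption, not confirmed by the post) is whether the secondary can resolve and reach the primary under the name my-machine.local:27017.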
Here is a checklist from my black book :) Compare your steps against it; it should go without a glitch.
(assuming you started the mongod instances with the --replSet flag)
// rs.initiate()
// rs.add("host-1:29001")
// rs.add("host-2:30001")
// rs.add("host-n:40001")
// var cfg = rs.config()
// cfg.members[2].priority = 0
// cfg.members[2].hidden = true
// rs.reconfig(cfg)
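The last four steps of the checklist edit the config document and push it back. The same edit can be sketched on a plain dictionary as below. `hide_member` is a hypothetical helper, and the primary-host:27017 entry in the example is a placeholder for the member that called rs.initiate(), not a name from the post.

```python
import copy


def hide_member(config, index):
    """Return a copy of a replica set config with one member hidden."""
    new_config = copy.deepcopy(config)
    member = new_config["members"][index]
    member["priority"] = 0  # a hidden member must have priority 0
    member["hidden"] = True
    return new_config


# Hypothetical config shaped like the output of rs.config().
config = {
    "_id": "mySet",
    "members": [
        {"_id": 0, "host": "primary-host:27017"},
        {"_id": 1, "host": "host-1:29001"},
        {"_id": 2, "host": "host-2:30001"},
    ],
}

hidden_config = hide_member(config, 2)
print(hidden_config["members"][2])
```

Working on a deep copy keeps the original config intact, so the two documents can be compared before anything is applied. In the shell, the edited document corresponds to what the checklist passes to rs.reconfig(cfg).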

MongoDB: How to disable logging the warning: ClientCursor::staticYield can't unlock b/c of recursive lock?

I get the warning stated in the title
warning: ClientCursor::staticYield can't unlock b/c of recursive lock
ns....
in the log file literally a gazillion times (the log file reaches 200 GB in size in a single day with this single log message). As mentioned in this SO question, I want to adopt the "solution" of simply ignoring the message.
What I did (to no avail) to stop it is to:
set the param quiet = true
set the param oplog = 0
set the param logpath=/dev/null (hoping that nothing gets logged anymore)
set the param logappend=false
All of the above are useless - the message still floods the log file.
The solution I use now is to run a cron job every night to simply empty that log file.
Please, is there anything else I can try?
I use MongoDB 2.6.2 on Debian 6.0 and access it from Perl.
I have recently been looking into this error myself, as I was seeing 25 GB a month generated in mongod.log with a similar message. However, I noticed that a query was included in the log message (I've formatted the message to fit in this post; it was actually all on one line):
warning: ClientCursor::yield can't unlock b/c of recursive lock ns: my-database.users top:
{
opid: 1627436260,
active: true,
secs_running: 0,
op: "query",
ns: "my-database",
query:
{
findAndModify: "users",
query: { Registered: false, Completed: 0 },
sort: { Created: 1 },
update: { $set: { NextRefresh: "2014-12-07" } },
new: true
},
client: "10.1.34.175:53582",
desc: "conn10906412",
threadId: "0x7f824f0f9700",
connectionId: 10906412,
locks: { ^: "w", ^my-database: "W" },
waitingForLock: false,
numYields: 0,
lockStats: { timeLockedMicros: {}, timeAcquiringMicros: { r: 0, w: 3 } }
}
A bit of Googling revealed that this message is most commonly raised when the query cannot use any indexes. I tried using .explain() with the query in the log and sure enough it showed that a BasicCursor was being used with no index:
db.users.find( { Registered: false, Completed: 0 } ).sort( { Created: 1 } ).explain()
{
"cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 10453,
"nscanned" : 10453,
"nscannedObjectsAllPlans" : 10453,
"nscannedAllPlans" : 10453,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 7,
"indexBounds" : {
},
"server" : "mongodb-live.eu-west-1a.10_1_2_213:27017"
}
Adding an index for the elements in the query fixed the issue. The log was no longer generated and when I ran .explain() again it showed an index being used:
{
"cursor" : "BtreeCursor Registered_1_Completed_1",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 0,
"nscanned" : 0,
"nscannedObjectsAllPlans" : 0,
"nscannedAllPlans" : 1,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"Registered" : [
[
false,
false
]
],
"Completed" : [
[
0,
0
]
]
},
"server" : "mongodb-live.eu-west-1a.10_1_2_213:27017"
}
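Note that the second explain output still reports scanAndOrder as true, which indicates the sort on Created is done in memory. A possible refinement, which is not part of the answer above, is to put the equality fields first in the index and then append the sort field. The sketch below only builds the key list. `compound_index_keys` is a hypothetical helper, and using the result from Python is an assumption, since the asker works from Perl.

```python
def compound_index_keys(equality_fields, sort_spec):
    """Order index keys as equality fields first, then the sort fields."""
    keys = [(field, 1) for field in equality_fields]
    keys.extend(sort_spec)
    return keys


# Fields taken from the findAndModify in the log message:
#   query: { Registered: false, Completed: 0 }, sort: { Created: 1 }
keys = compound_index_keys(["Registered", "Completed"], [("Created", 1)])
print(keys)
```

With pymongo, a list of (field, direction) pairs like this is the form accepted by `Collection.create_index`. Whether the extra key is worth it depends on how many documents match the equality part, so it is best checked with .explain() again.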