MongoDB log files: getlasterror: 1, w: "com.mongodb.WriteConcern.REPLICAS_SAFE"

We have a MongoDB master/slave configuration. I updated the Spring MongoDB template driver to be replica-aware with the property com.mongodb.WriteConcern.REPLICAS_SAFE. I am getting the following error in the log file:
[conn2788271] runQuery called xxx.$cmd { getlasterror: 1, w: "com.mongodb.WriteConcern.REPLICAS_SAFE", wtimeout: 0 }
[conn2786380] User Assertion: 14830:unrecognized getLastError mode: com.mongodb.WriteConcern.REPLICAS_SAFE
[conn2786380] command xxx.$cmd command: { getlasterror: 1, w: "com.mongodb.WriteConcern.REPLICAS_SAFE", wtimeout: 0 } ntoreturn:1 keyUpdates:0 reslen:182 0ms
It seems the error is coming from this file:
https://bitbucket.org/wesc/debian-mongodb/src/97846fbc9d31/mongodb-2.0.2/db/repl_block.cpp
Any clues as to what I am missing?

Make sure you're using the integer 1 and not the String "1".
Please check this: https://jira.mongodb.org/browse/JAVA-464
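For illustration, a minimal Java sketch of setting the concern programmatically (host, port, and database name are placeholders, not taken from the question). The key point is that the driver must receive the WriteConcern constant or a numeric w value; a String is forwarded verbatim to the server as a getLastError "w" mode, which is exactly the unrecognized mode shown in the log:

import com.mongodb.Mongo;
import com.mongodb.WriteConcern;
import org.springframework.data.mongodb.core.MongoTemplate;

public class ReplicaSafeTemplateFactory {
    // Sketch only: "localhost", 27017 and "mydb" are placeholders.
    public static MongoTemplate create() throws Exception {
        MongoTemplate template = new MongoTemplate(new Mongo("localhost", 27017), "mydb");
        // Pass the constant itself (REPLICAS_SAFE waits for at least 2 servers).
        // Configuring the literal text "com.mongodb.WriteConcern.REPLICAS_SAFE" instead
        // makes the driver send that string as the "w" mode, triggering assertion 14830.
        template.setWriteConcern(WriteConcern.REPLICAS_SAFE);
        return template;
    }
}

However the template is actually wired (XML, properties, or Java config), the value that reaches the driver must resolve to the constant or a number, not to the constant's fully qualified name as text.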

Related

MongoDB: Lost dry run election due to internal error

I am trying to solve a replication issue in MongoDB 4.2, but I can no longer get my primary server back to PRIMARY; it always stays in SECONDARY mode. I've tried several types of suggested solutions, but without success.
In the logs, I came across these messages:
Scheduling remote command request for vote request: RemoteCommand 729 -- target:mongo2:27021 db:admin cmd:{ replSetRequestVotes: 1, setName: "MyReplicaSet", dryRun: true, term: 2, candidateIndex: 0, configVersion: 125668, lastCommittedOp: { ts: Timestamp(1570799675, 1), t: 2 } }
2019-10-11T11:17:25.041-0300 I ELECTION [replexec-2] VoteRequester(term 2 dry run) received an invalid response from mongo2:27021: NotYetInitialized: no replset config has been received; response message: { operationTime: Timestamp(0, 0), ok: 0.0, errmsg: "no replset config has been received", code: 94, codeName: "NotYetInitialized", $clusterTime: { clusterTime: Timestamp(1570799675, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } } }
2019-10-11T11:17:25.041-0300 I ELECTION [replexec-2] not running for primary, we received insufficient votes
2019-10-11T11:17:25.041-0300 I ELECTION [replexec-2] Lost dry run election due to internal error
How can I fix this?
It looks like this problem was fixed in MongoDB 4.2.10: https://jira.mongodb.org/browse/SERVER-47263

MongoDB replication doesn't start

We are trying to move from MongoDB 2.4.9 to 3.4. We have a lot of data, so we tried to set up replication, wait while the data is synced, and then swap the primary.
The configuration is done, but when replication is initiated the new server can't complete the initial sync:
2017-07-07T12:07:22.492+0000 I REPL [replication-1] Starting initial sync (attempt 10 of 10)
2017-07-07T12:07:22.501+0000 I REPL [replication-1] sync source candidate: mongo-2.blabla.com:27017
2017-07-07T12:07:22.501+0000 I STORAGE [replication-1] dropAllDatabasesExceptLocal 1
2017-07-07T12:07:22.501+0000 I REPL [replication-1] ******
2017-07-07T12:07:22.501+0000 I REPL [replication-1] creating replication oplog of size: 6548MB...
2017-07-07T12:07:22.504+0000 I STORAGE [replication-1] WiredTigerRecordStoreThread local.oplog.rs already started
2017-07-07T12:07:22.505+0000 I STORAGE [replication-1] The size storer reports that the oplog contains 0 records totaling to 0 bytes
2017-07-07T12:07:22.505+0000 I STORAGE [replication-1] Scanning the oplog to determine where to place markers for truncation
2017-07-07T12:07:22.519+0000 I REPL [replication-1] ******
2017-07-07T12:07:22.521+0000 I REPL [replication-1] Initial sync attempt finishing up.
2017-07-07T12:07:22.521+0000 I REPL [replication-1] Initial Sync Attempt Statistics: { failedInitialSyncAttempts: 9, maxFailedInitialSyncAttempts: 10, initialSyncStart: new Date(1499429233163), initialSyncAttempts: [ { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" } ] }
2017-07-07T12:07:22.521+0000 E REPL [replication-1] Initial sync attempt failed -- attempts left: 0 cause: CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find
2017-07-07T12:07:22.521+0000 F REPL [replication-1] The maximum number of retries have been exhausted for initial sync.
2017-07-07T12:07:22.522+0000 E REPL [replication-0] Initial sync failed, shutting down now. Restart the server to attempt a new initial sync.
2017-07-07T12:07:22.522+0000 I - [replication-0] Fatal assertion 40088 CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find at src/mongo/db/repl/replication_coordinator_impl.cpp 632
Please assist; we have more than 100 GB of data, so a dump and restore would mean a lot of downtime.
Configurations:
3.4.5 new machine:
storage:
  dbPath: /mnt/dbpath
  journal:
    enabled: true
  engine: wiredTiger
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
net:
  port: 27017
replication:
  replSetName: prodTest
2.4.9 old machine with data:
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
port = 27017
The task was solved in the following way (the 3.4 initial sync relies on the find command, which a 2.4 sync source does not support, hence the staged upgrade through intermediate versions):
- create a replica set: master on v2.4, 3 slaves on v2.6
- stop the app and step down the master
- stop the new master, upgrade it to v3.0, start it as master, and upgrade the slaves sequentially to 3.2 (the slave db files were removed and the new version was started on the WiredTiger engine)
- step down the master, upgrade all slaves to 3.4
This process was very fast because replica slave recovery of a 40 GB db takes around 30 minutes.

User Assertion: 1: Update query failed -- RUNNER_DEAD

We are using MongoDB (v2.6.4) to process some data, and everything works great except that, once in a while, we get a weird RUNNER_DEAD exception:
MongoDB.Driver.WriteConcernException: WriteConcern detected an error ' Update query failed -- RUNNER_DEAD'. (Response was { "lastOp" : { "$timestamp" : NumberLong("6073471510486450182") }, "connectionId" : 49, "err" : " Update query failed -- RUNNER_DEAD", "code" : 1, "n" : 0, "ok" : 1.0 }).
This is the method that causes the exception:
private void UpdateEntityClassName(EntityClassName myEntity) {
    var dateTimeNow = DateTime.UtcNow;
    var update = Update<EntityClassName>.Set(p => p.Data, myEntity.Data)
        // ...some more Sets...
        .Set(p => p.MetaData.LastModifiedDateTime, dateTimeNow);
    var result = _myCollection.Update(Query.EQ("_id", myEntity.Identifier), update, UpdateFlags.Upsert);
}
Exception in MongoDB log:
2014-10-23T13:51:29.989-0500 [conn45] update Database.Table query: { _id: "SameID" } update: { $set: { Data: BinData(0, SomeData...), ...more fields... MetaData.LastModifiedDateTime: new Date(1414090294910) } } nmoved:1 nMatched:1 nModified:1 keyUpdates:0 numYields:0 locks(micros) w:2344 2ms
2014-10-23T13:51:29.989-0500 [conn49] User Assertion: 1: Update query failed -- RUNNER_DEAD
2014-10-23T13:51:29.989-0500 [conn46] update Database.Table query: { _id: "SameID" } update: { $set: { Data: BinData(0, SomeData...), ...more fields... MetaData.LastModifiedDateTime: new Date(1414090294926) } } nMatched:1 nModified:1 fastmod:1 keyUpdates:0 numYields:0 locks(micros) w:249 0ms
2014-10-23T13:51:29.989-0500 [conn49] update Database.Table query: { _id: "SameID" } update: { $set: { Data: BinData(0, SomeData...), ...more fields... MetaData.LastModifiedDateTime: new Date(1414090294864) } } nModified:0 keyUpdates:0 exception: Update query failed -- RUNNER_DEAD code:1 numYields:1 locks(micros) w:285 8ms
I found very little documentation about this exception, so any help is appreciated.
We are running this in a 3-machine replica set, if that changes anything.
We've been running this code for a while and we didn't have that issue before (in our original tests), so we went back to MongoDB 2.4.9 (the version we first tested on) and we don't get this exception anymore. Any ideas as to what might have changed that causes this exception?
(Why couldn't you use a regular "update" with capped arrays to limit the size of the array of queries, rather than using some custom logic?)
If you have multiple threads that are doing the same thing, your code doesn't appear thread-safe: say two threads try to update the same object with _id XYZ but with different changes. Both fetch the object, both add a new attribute/value to the array, and now both call save; the first one saves, but the second one's save overwrites the first.
But that's not likely to be related to your RUNNER_DEAD error; that's more likely a case where either something is killing the operation or dropping the collection you're writing to (or the index being used).
Source: Asya Kamsky's post.
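For reference, a rough sketch of the single-statement, atomic update pattern described above, written against the legacy Java driver rather than the C# driver used in the question (database, collection, and _id values are taken from the log excerpt); for the capped-array case in the quote, the $set would be replaced by $push with $each/$slice:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import java.util.Date;

public class AtomicUpdateSketch {
    public static void main(String[] args) throws Exception {
        DBCollection col = new MongoClient("localhost").getDB("Database").getCollection("Table");
        // One atomic upsert keyed on _id: the server applies the whole $set in a single
        // operation, so two concurrent writers cannot silently overwrite each other the
        // way two separate fetch-modify-save round trips can.
        DBObject query = new BasicDBObject("_id", "SameID");
        DBObject update = new BasicDBObject("$set",
                new BasicDBObject("MetaData.LastModifiedDateTime", new Date()));
        col.update(query, update, true /* upsert */, false /* multi */);
    }
}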

MongoDs in ReplSet won't start after trying out some MapReduce

I was practicing some MapReduce inside my primary's mongo shell when it suddenly became a secondary. I SSHed into the two other VMs with the other secondaries and discovered that the mongods had been rendered inoperable. I killed them, issued mongod --config /etc/mongod.conf to start them again, and entered the mongo shell. After a few seconds they were interrupted with:
2014-09-14T22:29:54.142-0500 DBClientCursor::init call() failed
2014-09-14T22:29:54.143-0500 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-09-14T22:29:54.143-0500 warning: Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
2014-09-14T22:29:54.143-0500 reconnect 127.0.0.1:27017 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:27017 (127.0.0.1), connection attempt failed
>
This is from their (the two original secondaries in the replicaset) logs:
2014-09-14T22:09:21.879-0500 [rsBackgroundSync] replSet syncing to: vm-billing-001:27017
2014-09-14T22:09:21.880-0500 [rsSync] replSet still syncing, not yet to minValid optime 54165090:1
2014-09-14T22:09:21.882-0500 [rsBackgroundSync] replset setting syncSourceFeedback to vm-billing-001:27017
2014-09-14T22:09:21.886-0500 [rsSync] replSet SECONDARY
2014-09-14T22:09:21.886-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1_inc properties: { v: 1, key: { 0: 1 }, name: "_temp_0", ns: "test.tmp.mr.CCS.nonconforming_1_inc" }
2014-09-14T22:09:21.887-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.887-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1 properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.tmp.mr.CCS.nonconforming_1" }
2014-09-14T22:09:21.887-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.888-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1 properties: { v: 1, unique: true, key: { id: 1.0 }, name: "id_1", ns: "test.tmp.mr.CCS.nonconforming_1" }
2014-09-14T22:09:21.888-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.891-0500 [repl writer worker 2] ERROR: writer worker caught exception: :: caused by :: 11000 insertDocument :: caused by :: 11000 E11000 duplicate key error index: cisco.tmp.mr.CCS.nonconforming_1.$id_1 dup key: { : null } on: { ts: Timestamp 1410748561000|46, h: 9014687153249982311, v: 2, op: "i", ns: "cisco.tmp.mr.CCS.nonconforming_1", o: { _id: 14, value: 1.0 } }
2014-09-14T22:09:21.891-0500 [repl writer worker 2] Fatal Assertion 16360
2014-09-14T22:09:21.891-0500 [repl writer worker 2]
I can run mongo --host ... --port ... from both of the VMs that can't start mongod and reach the original primary, but I do see some connection-refused notes above in the error log.
My original primary mongod can still be connected to in the mongo shell, but it is a primary. I can kill it and restart it, and it will start up as a secondary.
How can I roll back to the last known state and restart my replica set?

Log only errors in MongoDB logs

Are there any options for logging only errors in MongoDB log files?
With the current configuration, it seems that every request to the Mongo server is logged:
Wed Sep 17 08:08:07.030 [conn117] insert my_database.myCol ninserted:1 keyUpdates:0 locks(micros) w:243505 285ms
Wed Sep 17 08:08:54.447 [conn101] command anotherDatabase.$cmd command: { findandmodify: "myCol", query: { ... }, new: 0, remove: 0, upsert: 0, fields: {...}, update: {...} } update: {...} ntoreturn:1 idhack:1 nupdated:1 fastmod:1 keyUpdates:0 locks(micros) w:124172 reslen:248 124ms
Wed Sep 17 08:10:24.370 [conn95] command my_database.$cmd command: { count: "cms.myCol", query: { ... }, fields: null } ntoreturn:1 keyUpdates:0 locks(micros) r:197368 reslen:48 197ms
...
The current configuration is:
# mongodb.conf
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
How can the configuration be updated to log only errors?
Running Mongo shell version: 2.4.10:
$ mongo --version
MongoDB shell version: 2.4.10
Appending quiet=true will reduce a lot of output.
It is probably impossible to suppress all output except errors in the current version.
Appending slowms=threshold to the configuration file can reduce normal log output further.
threshold is an integer value in milliseconds: if an operation's duration doesn't exceed this value, its normal log line is not written. The default value is 100.
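As an illustration (not part of the original answer), appending the two options to the mongodb.conf shown in the question might look like this; the 200 ms threshold is only an example value:

# mongodb.conf
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
# suppress much of the routine connection/command output
quiet=true
# only log operations slower than 200 ms (the default threshold is 100)
slowms=200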
Also, you can change this threshold another way while the instance is running:
// keep the current profiling level, only change the slow-operation threshold
var slowms = theValueYouWant;            // new threshold in milliseconds
var level = db.getProfilingStatus().was; // current profiling level
db.setProfilingLevel(level, slowms);