User Assertion: 1: Update query failed -- RUNNER_DEAD - mongodb

We are using MongoDB (v2.6.4) to process some data and everything works great except, once in a while, we get a weird RUNNER_DEAD exception...
MongoDB.Driver.WriteConcernException: WriteConcern detected an error ' Update query failed -- RUNNER_DEAD'. (Response was { "lastOp" : { "$timestamp" : NumberLong("6073471510486450182") }, "connectionId" : 49, "err" : " Update query failed -- RUNNER_DEAD", "code" : 1, "n" : 0, "ok" : 1.0 }).
This is the method that causes the exception:
private void UpdateEntityClassName(EntityClassName myEntity) {
    var dateTimeNow = DateTime.UtcNow;
    var update = Update<EntityClassName>.Set(p => p.Data, myEntity.Data)
        ...some more Sets...
        .Set(p => p.MetaData.LastModifiedDateTime, dateTimeNow);
    var result = _myCollection.Update(Query.EQ("_id", myEntity.Identifier), update, UpdateFlags.Upsert);
}
Exception in MongoDB log:
2014-10-23T13:51:29.989-0500 [conn45] update Database.Table query: { _id: "SameID" } update: { $set: { Data: BinData(0, SomeData...), ...more fields... MetaData.LastModifiedDateTime: new Date(1414090294910) } } nmoved:1 nMatched:1 nModified:1 keyUpdates:0 numYields:0 locks(micros) w:2344 2ms
2014-10-23T13:51:29.989-0500 [conn49] User Assertion: 1: Update query failed -- RUNNER_DEAD
2014-10-23T13:51:29.989-0500 [conn46] update Database.Table query: { _id: "SameID" } update: { $set: { Data: BinData(0, SomeData...), ...more fields... MetaData.LastModifiedDateTime: new Date(1414090294926) } } nMatched:1 nModified:1 fastmod:1 keyUpdates:0 numYields:0 locks(micros) w:249 0ms
2014-10-23T13:51:29.989-0500 [conn49] update Database.Table query: { _id: "SameID" } update: { $set: { Data: BinData(0, SomeData...), ...more fields... MetaData.LastModifiedDateTime: new Date(1414090294864) } } nModified:0 keyUpdates:0 exception: Update query failed -- RUNNER_DEAD code:1 numYields:1 locks(micros) w:285 8ms
I found very little documentation about this exception, so any help is appreciated.
We are running this in a 3-machine replica set, if that changes anything.
We've been running this code for a while and we didn't have this issue before (in our original tests), so we went back to MongoDB 2.4.9 (the one we first tested on) and we don't get the exception anymore. Any ideas as to what might have changed to cause this exception?

Why couldn't you use a regular update with capped arrays to
limit the size of the array of queries, rather than using some
custom logic?
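For illustration, a minimal shell sketch of that capped-array idea (the collection and field names are hypothetical, and it assumes MongoDB 2.6, where $push's $slice no longer requires $sort): a single atomic update appends the new entry and keeps only the most recent 100.
// Hypothetical names: append newQuery and trim the array to its last
// 100 elements in one atomic operation, with no fetch-modify-save cycle
db.myCollection.update(
    { _id: "XYZ" },
    { $push: { queries: { $each: [ newQuery ], $slice: -100 } } }
)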
If you have multiple threads that are doing the same thing, your code
doesn't appear thread-safe - let's say that two threads try to update
the same object with _id XYZ but with different changes. Both fetch
the object, both add a new attribute/value to the array and now both
call save - the first one saves, but the second one's save overwrites
the first one.
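A common way around that lost-update race, sketched here with a hypothetical version field that is incremented on every write, is to make the update conditional on the state you read:
// Only apply the change if nobody modified the document since we read it
var res = db.myCollection.update(
    { _id: "XYZ", version: oldVersion },
    { $set: { data: newData }, $inc: { version: 1 } }
);
// res.nModified === 0 means we lost the race: re-read and retry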
But that's not likely to be related to your RUNNER_DEAD error -
that's more likely a case where something is either killing the
operation or dropping the collection you're writing to (or the
index being used).
Source: Asya Kamsky's post.

Related

How can I debug slow MongoDB chunk migration?

I'm trying to move a chunk inside the cluster:
mongos> db.adminCommand({ moveChunk: "db.col", find: { _id: ObjectId("58171b29b9b4ebfb3e8b4e42") }, to: "shard_v2" });
{ "millis" : 428681, "ok" : 1 }
In the log I see the following record:
2016-11-08T20:27:05.972+0300 I SHARDING [conn27] moveChunk migrate
commit accepted by TO-shard: { active: false, ns: "db.col", from:
"host:27017", min: { _id: ObjectId('58171b29b9b4ebfb3e8b4e42') }, max:
{ _id: ObjectId('58171f29b9b4eb31408b4b4c') }, shardKeyPattern: { _id:
1.0 }, state: "done", cc, ok: 1.0 }
So 23 MB of data took 430 seconds to migrate. That is really slow.
I've uploaded a sample file to "host" and it transferred extremely fast (7-8 MB per second), so I don't think it is a disk or network issue (the cluster also isn't under any load: no active queries). What else can I check to improve chunk migration performance?
The performance is almost certainly not limited by your setup. It may be MongoDB's migration policy, which tries not to affect normal database tasks.
There is a great answer on this issue on DBA stack exchange: https://dba.stackexchange.com/questions/81545/mongodb-shard-chunk-migration-500gb-takes-13-days-is-this-slow-or-normal
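In particular, that answer points at the _secondaryThrottle setting, which makes the migration wait for replication after each document. A hedged sketch of disabling it via the config database (run from a mongos, and weigh the durability trade-off first):
// Run from a mongos shell; stops the migration waiting on
// secondary replication for every document it copies
use config
db.settings.update(
    { _id: "balancer" },
    { $set: { _secondaryThrottle: false } },
    { upsert: true }
)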

Replica set read preference nearest is still slow

NOTICE: I've also posted this to dba.stackexchange.com. I'm not sure where this question belongs. If it's not here, tell me and I'll delete it.
I'm testing my replica set, in particular the read preference, and I'm still getting slow reads even with a nearest read preference set.
For the purpose of this question, we can just assume there are 2 mongodb instances (there are in fact 3). PRIMARY is in Amsterdam (AMS). SECONDARY is in Singapore (SG).
I also have 2 application servers in those 2 locations where I am running my test scripts (node+mongoose).
1. From the AMS app server (so low latency with PRIMARY), if I run a simple find query, I get a response in under a second.
2. However, if I run the same query from my app server in SG, I get response times of ~4-7 seconds.
3. If I just connect to the SG SECONDARY from the SG app server, my query times drop to <1s, similar to (1).
Going back to a standard replica set setting (with nearest), if I look at the logs I notice that when I send a query to SG using 'nearest', I can see the query there, but I also see an entry for that same query (with fewer lines) in the PRIMARY log. Interestingly, there is always an entry in the PRIMARY log even when querying the SECONDARY; I'm not sure if that is somehow related.
So, if I connect directly to the nearest machine, I get a response in <1s, but when using the replica set, unless I'm next to the PRIMARY, response times are >4s.
My question is then: why? Have I set up my replica set incorrectly? Is it a problem on the client side (mongoose/mongodb), or is it in fact working as it is meant to and I've misunderstood how it works under the hood?
Here are my files (apologies for the wall of text):
test.js
var mongoose = require('mongoose');
var configDB = require('./config'); // the config file shown below
mongoose.connect(configDB.url);
var start = new Date().getTime();
Model.find({}) // Model is the mongoose model built from the Betas schema below
    .exec(function(err, betas) {
        var end = new Date().getTime();
        var time = end - start;
        console.log(time / 1000);
        console.log('finished');
        console.log(betas.length);
    });
config (also tried with server and replSet options)
module.exports = {
    'url': 'user:pwd@ip-primary/db,user:pwd@ip-secondary/db,user:pwd@ip-secondary/db'
};
Betas model
var betaSchema = mongoose.Schema({
    // .. some fields
}, { read: 'n' }); // 'n' is mongoose shorthand for the 'nearest' read preference
And the log output from doing a read query as above from the SG app server:
LOG OF PRIMARY:
2015-09-16T07:49:23.120-0400 D COMMAND [conn12520] run command db.$cmd { listIndexes: "betas", cursor: {} }
2015-09-16T07:49:23.120-0400 I COMMAND [conn12520] command db.$cmd command: listIndexes { listIndexes: "betas", cursor: {} } keyUpdates:0 writeConflicts:0 numYields:0 reslen:296 locks:{ Global: { acquireC
ount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { R: 1 } } } 0ms
LOG OF SECONDARY
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Running query:
ns=db.betas limit=1000 skip=0
Tree: $and
Sort: {}
Proj: {}
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] Running query: query: {} sort: {} projection: {} skip: 0 limit: 1000
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Beginning planning...
=============================
Options = INDEX_INTERSECTION KEEP_MUTATIONS
Canonical query:
ns=db.betas limit=1000 skip=0
Tree: $and
Sort: {}
Proj: {}
=============================
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Index 0 is kp: { _id: 1 } io: { v: 1, key: { _id: 1 }, name: "_id_", ns: "db.betas" }
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Index 1 is kp: { email: 1 } io: { v: 1, unique: true, key: { email: 1 }, name: "email_1", ns: "db.betas", background: true, safe: null }
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Rated tree:
$and
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Planner: outputted 0 indexed solutions.
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Planner: outputting a collscan:
COLLSCAN
---ns = db.betas
---filter = $and
---fetched = 1
---sortedByDiskLoc = 0
---getSort = []
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] Only one plan is available; it will be run but will not be cached. query: {} sort: {} projection: {} skip: 0 limit: 1000, planSummary: COLLSCAN
2015-09-16T07:49:19.368-0400 D QUERY [conn11831] [QLOG] Not caching executor but returning 109 results.
2015-09-16T07:49:19.368-0400 I QUERY [conn11831] query db.betas planSummary: COLLSCAN ntoreturn:1000 ntoskip:0 nscanned:0 nscannedObjects:109 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:109 resl
en:17481 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { R: 1 } } } 0ms
The information in your output shows that the database server is processing the query quickly, so the issue likely lies outside of the database itself, probably in the client.
Are you running the same query multiple times and timing each execution?
I suspect this may be due to some initial discovery on your MongoDB client's part - how can it know which node is nearest unless it initially hits every node and times the responses?
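As a sanity check, it may also help to make the read preference explicit in the connection string rather than on the schema; a sketch, where the replica-set name rs0 and the hosts/credentials are placeholders:
// readPreference in the URI applies to the whole connection
mongoose.connect(
    'mongodb://user:pwd@ip-primary/db,user:pwd@ip-secondary/db' +
    '?replicaSet=rs0&readPreference=nearest'
);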

Log only errors in MongoDB logs

Are there any options for only logging errors in MongoDB log files?
With the current configuration, it seems that every request to the Mongo server is logged in the log files:
Wed Sep 17 08:08:07.030 [conn117] insert my_database.myCol ninserted:1 keyUpdates:0 locks(micros) w:243505 285ms
Wed Sep 17 08:08:54.447 [conn101] command anotherDatabase.$cmd command: { findandmodify: "myCol", query: { ... }, new: 0, remove: 0, upsert: 0, fields: {...}, update: {...} } update: {...} ntoreturn:1 idhack:1 nupdated:1 fastmod:1 keyUpdates:0 locks(micros) w:124172 reslen:248 124ms
Wed Sep 17 08:10:24.370 [conn95] command my_database.$cmd command: { count: "cms.myCol", query: { ... }, fields: null } ntoreturn:1 keyUpdates:0 locks(micros) r:197368 reslen:48 197ms
...
The current configuration is:
# mongodb.conf
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
How can the configuration be updated to only log errors?
Running Mongo shell version: 2.4.10:
$ mongo --version
MongoDB shell version: 2.4.10
Appending quiet=true will reduce a lot of output.
It is probably impossible to suppress all output except errors at this stage.
Appending slowms=threshold to the configuration file can reduce normal log output further.
threshold is an integer value in milliseconds: operations that finish faster than this threshold are not logged. The default value is 100.
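Putting both together, a sketch of the updated mongodb.conf (the 500 ms threshold is just an example value):
# mongodb.conf
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
quiet=true
slowms=500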
Also, you can change this value another way while the instance is running:
var slowms = theValueYouWant;             // new threshold in milliseconds
var level = db.getProfilingStatus().was;  // keep the current profiling level
db.setProfilingLevel(level, slowms);

mongodb status of index creation job

I'm using MongoDB and have a collection with roughly 75 million records.
I have added a compound index on two "fields" by using the following command:
db.my_collection.ensureIndex({"data.items.text":1, "created_at":1},{background:true}).
Two days later I'm trying to see the status of the index creation. Running db.currentOp() returns {}; however, when I try to create another index I get this error message:
cannot add index with a background operation in progress.
Is there a way to check the status/progress of the index creation job?
One thing to add - I am using mongodb version 2.0.6. Thanks!
At the mongo shell, type the command below to see the current progress:
rs0:PRIMARY> db.currentOp(true).inprog.forEach(function(op){ if(op.msg!==undefined) print(op.msg) })
Index Build (background) Index Build (background): 1431577/55212209 2%
For a real-time running status log:
> while (true) { db.currentOp(true).inprog.forEach(function(op){ if(op.msg!==undefined) print(op.msg) }); sleep(1000); }
Index Build: scanning collection Index Build: scanning collection: 43687948/47760207 91%
Index Build: scanning collection Index Build: scanning collection: 43861991/47760228 91%
Index Build: scanning collection Index Build: scanning collection: 44993874/47760246 94%
Index Build: scanning collection Index Build: scanning collection: 45968152/47760259 96%
You could use currentOp with a true argument, which returns more verbose output, including idle connections and system operations.
db.currentOp(true)
... and then you could use db.killOp() to kill the desired operation.
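For example, a sketch that finds an in-progress index build by its msg field and kills it by opid (both field names come from the currentOp output shown above):
// Locate the background index build and kill it by operation id
var op = db.currentOp(true).inprog.filter(function(o) {
    return o.msg && o.msg.indexOf("Index Build") === 0;
})[0];
if (op) db.killOp(op.opid);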
The following should print out index progress:
db
.currentOp({"command.createIndexes": { $exists : true } })
.inprog
.forEach(function(op){ print(op.msg) })
outputs:
Index Build (background) Index Build (background): 5311727/27231147 19%
Unfortunately, DR9885's answer didn't work for me: it has spaces in the code (a syntax error), and even with the spaces removed it returns nothing.
This works as of Mongo Shell v3.6.0
db.currentOp().inprog.forEach(function(op){ if(op.msg) print(op.msg) })
I didn't read Bajal's answer until after I posted mine, but it's almost exactly the same, except slightly shorter, and it also works.
I like:
db.currentOp({
    'msg': { $exists: true },
    'command': { $exists: true },
    $or: [
        { 'command.createIndexes': { $exists: true } },
        { 'command.reIndex': { $exists: true } }
    ]
}).inprog.forEach(function(op) {
    print(op.msg);
});
Output example:
Index Build Index Build: 84826/335739 25%
Documentation suggests:
db.adminCommand({
    currentOp: true,
    $or: [
        { op: "command", "command.createIndexes": { $exists: true } },
        { op: "none", "msg": /^Index Build/ }
    ]
})
See the "Active Indexing Operations" example in the documentation.
A simple one to check the progress of a single ongoing index build:
db.currentOp({"msg":/Index/}).inprog[0].progress;
outputs:
{ "done" : 86007212, "total" : 96868386 }
To find the progress of index jobs, a nice one-liner:
> db.currentOp().inprog.map(a => a.msg)
[
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
"Index Build: scanning collection Index Build: scanning collection: 16448156/54469342 30%",
undefined,
undefined
]

mongoDB log files getlasterror: 1, w: "com.mongodb.WriteConcern.REPLICAS_SAFE"

We have a MongoDB master/slave configuration. I updated the Spring MongoDB template driver to be replica-aware with the property com.mongodb.WriteConcern.REPLICAS_SAFE. I am getting the following error in the log file.
[conn2788271] runQuery called xxx.$cmd { getlasterror: 1, w: "com.mongodb.WriteConcern.REPLICAS_SAFE", wtimeout: 0 }
[conn2786380] User Assertion: 14830:unrecognized getLastError mode: com.mongodb.WriteConcern.REPLICAS_SAFE
[conn2786380] command xxx.$cmd command: { getlasterror: 1, w: "com.mongodb.WriteConcern.REPLICAS_SAFE", wtimeout: 0 } ntoreturn:1 keyUpdates:0 reslen:182 0ms
It seems that the error is coming from this file:
https://bitbucket.org/wesc/debian-mongodb/src/97846fbc9d31/mongodb-2.0.2/db/repl_block.cpp
Any clues if I am missing something?
The log shows the literal string "com.mongodb.WriteConcern.REPLICAS_SAFE" being sent as the getLastError w mode, which means the configuration is passing the property name as a string instead of resolving it to the actual WriteConcern constant. Make sure you're using the integer 1 and not the String "1".
Please check this: https://jira.mongodb.org/browse/JAVA-464
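For reference, a hedged shell sketch of what a well-formed getLastError command looks like: REPLICAS_SAFE corresponds to w: 2 (acknowledgement from two members), and w must be a number or a recognized getLastError mode, not a class name:
// w must be numeric (or a tag set name configured via getLastErrorModes)
db.runCommand({ getlasterror: 1, w: 2, wtimeout: 0 })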