I'm having trouble with the following findOneAndUpdate MongoDB query:
planSummary: IXSCAN { id: 1 } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keysInserted:1 keysDeleted:1 numYields:0 reslen:3044791
locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 } }, Collection: { acquireCount: { w: 1 } } }
storage:{} protocol:op_query 135ms
writeConcern: { w: 0, j: false }
As you can see, the execution time is over 100 ms. The query part uses an index and takes less than 1 ms (according to 'Explain query'), so it is the write part that is slow.
The Mongo instance is the primary of a 3-member replica set. Write concern is set to 0 and journaling is disabled.
What could be the cause of the slow write? Could it be the update of indices?
MongoDB version 4.0
Driver: Node.js native mongodb version 3.2
Edit: I think it might be the length of the result. After querying a document smaller in size, the execution time is halved.
reslen:3044791
This was the source of the bad performance: reslen is the response length in bytes, so each call was returning about 3 MB. Adding a projection option so that only a specific field is returned reduced the average execution time from ~90 ms to ~7 ms.
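A minimal sketch of how such a projection could be passed with the Node.js driver 3.x. The helper name projectionOptions and the field name status are hypothetical; projection is the option name that findOneAndUpdate accepts in the 3.x driver.

```javascript
// Build findOneAndUpdate options that return only the named fields.
// `fields` is an array of field names chosen by the caller.
function projectionOptions(fields) {
  const projection = {};
  for (const name of fields) {
    projection[name] = 1; // include this field in the returned document
  }
  return { projection };
}

// Usage with a driver 3.x collection (not executed here, names are made up):
//   const opts = projectionOptions(['status']);
//   const result = await collection.findOneAndUpdate(
//     { id: 42 }, { $set: { status: 'done' } }, opts);
//   // result.value now holds _id and status rather than the whole document
```

The _id field is still returned unless it is excluded explicitly, so the response stays small but not empty.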
Related
For the past few days I have occasionally seen delayed requests on my single-shard cluster database, but I am not sure where they come from. Storage, CPU and memory all look fine, and normally these index-based queries execute in < 50 ms even at 10000 requests/sec. Any ideas are highly welcome.
Example delayed query:
2022-04-19T11:11:34.702+0200 I COMMAND [conn7420156] command testdb.testcol command: find { find: "testcol", readConcern: { level: "linearizable" }, filter: { a.p.prid: "37011" }, maxTimeMS: 16000, $db: "test", $clusterTime: { clusterTime: Timestamp(1650359489, 2054), signature: { hash: BinData(0, EFB437ED731ED3ADA61B61AAAD45EB516A82A8A3), keyId: 7036792998071370037 } }, lsid: { id: UUID("6e8e0cbb-a52e-4c92-a259-0e836bde61f3") } } nShards:1 cursorExhausted:1 numYields:0 nreturned:1 reslen:66731 protocol:op_msg 5196ms
I have 2 applications that use listCollections to get information. Both go through the C client, but they use different versions. In my MongoDB output I can see that they run slightly different statements on the server:
2019-09-22T04:24:31.707+0000 I COMMAND [conn9] command datalake.$cmd command: listCollections { listCollections: 1, $readPreference: { mode: "secondaryPreferred" }, $db: "test" } numYields:0 reslen:333 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 0ms
2019-09-22T04:26:34.183+0000 I COMMAND [conn12] command datalake.$cmd command: listCollections { listCollections: 1.0, $readPreference: { mode: "secondaryPreferred" }, $db: "test" } numYields:0 reslen:333 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 0ms
The one difference I can see between the two is that one is passing in listCollections: 1.0 and the other passes in listCollections: 1.
Is there any way I can convert the log output above into a db.runCommand() that I can execute to see if I get different results ?
The one difference I can see between the two is that one is passing in listCollections: 1.0 and the other passes in listCollections: 1.
The value parameter for the listCollections command is irrelevant -- it exists to create a key/value pair when constructing a BSON document to pass as a parameter to runCommand but isn't used by listCollections.
All of the following are equivalent (although a numeric value makes more semantic sense):
db.runCommand({'listCollections': 1})
db.runCommand({'listCollections': 1.0})
db.runCommand({'listCollections': 'foo'})
If you are using MongoDB 4.0+, the document format can be useful to provide additional optional fields like filter and nameOnly.
Instead of passing a document as the first parameter for runCommand() in the shell, you can also just provide the command name to run it with the default options:
db.runCommand('listCollections')
Is there any way I can convert the log output above into a db.runCommand() that I can execute to see if I get different results ?
Equivalent commands in the mongo shell to your log output would be:
use datalake
db.getMongo().setReadPref('secondaryPreferred')
db.runCommand({'listCollections': 1})
db.runCommand({'listCollections': 1.0})
The listCollections queries will be processed identically. However, since you are using the secondaryPreferred read preference, the two queries might be answered by different members of your replica set when your two applications run them concurrently. Normally this will not be an issue, but if your use case requires strong consistency you should use a primary read preference instead.
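To check the "different results" question directly, a small helper along these lines could be pasted into the mongo shell. It is only a sketch: the function name is made up, and it compares just the first batch of each reply.

```javascript
// Run listCollections once per command value and report whether every run
// returned the same collection names. `db` is the shell's database object
// (or anything exposing runCommand); only cursor.firstBatch is compared.
function sameListCollectionsResult(db, values) {
  const runs = values.map(function (value) {
    const res = db.runCommand({ listCollections: value });
    const names = res.cursor.firstBatch.map(function (c) { return c.name; });
    return JSON.stringify(names.sort());
  });
  return runs.every(function (r) { return r === runs[0]; });
}

// In the mongo shell, after `use datalake`:
//   sameListCollectionsResult(db, [NumberInt(1), 1.0, 'foo'])
```

In the shell a plain 1 is already a double, so it is logged as 1.0. NumberInt(1) is what reproduces the integer form seen in the first log line.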
As per the MongoDB documentation, the mongo shell command:
show dbs
Print a list of all databases on the server.
and
show databases
Print a list of all available databases.
I'm confused: from what I read and understood, these two commands do not have the same effect, right? Is show databases not an alias of show dbs?
Could there be a database that is listed by show dbs but is not available, and therefore not listed by show databases?
If so, how is it possible for a database to be on the server but not available? Is it the access rights of the user? Is that what is behind the filtering in show databases?
I don't think there is a difference between the two commands. Both of the operations call the listDatabases command with the same option.
Increasing the log level, the show dbs command logged:
2018-11-30T15:40:59.539-0800 I COMMAND [conn23] command admin.$cmd appName: "MongoDB Shell" command: listDatabases { listDatabases: 1.0, $clusterTime: { clusterTime: Timestamp(1543621253, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "admin" } numYields:0 reslen:708 locks:{ Global: { acquireCount: { r: 22 } }, Database: { acquireCount: { r: 10 } } } protocol:op_msg 38ms
whereas show databases logged:
2018-11-30T15:41:01.722-0800 I COMMAND [conn23] command admin.$cmd appName: "MongoDB Shell" command: listDatabases { listDatabases: 1.0, $clusterTime: { clusterTime: Timestamp(1543621253, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "admin" } numYields:0 reslen:708 locks:{ Global: { acquireCount: { r: 22 } }, Database: { acquireCount: { r: 10 } } } protocol:op_msg 5ms
For reference, this is from MongoDB 3.6.7.
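The same command can also be run directly, which gives a third result to compare against both helpers. A minimal sketch; the function name databaseNames is made up for illustration:

```javascript
// Return the database names reported by the listDatabases command.
// `db` is the shell's database object (or anything exposing adminCommand).
function databaseNames(db) {
  const res = db.adminCommand({ listDatabases: 1 });
  return res.databases.map(function (d) { return d.name; });
}

// In the mongo shell:
//   databaseNames(db)
```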
I did a mongorestore of a gzipped mongodump:
mongorestore -v --drop --gzip --db bigdata /Volumes/Lacie2TB/backup/mongo20170909/bigdata/
But it kept going. I left it, because I figure if I 'just' close it now, my (important) data will be corrupted. Check the percentages:
2017-09-10T14:45:58.385+0200 [########################] bigdata.logs.sets.log 851.8 GB/85.2 GB (999.4%)
2017-09-10T14:46:01.382+0200 [########################] bigdata.logs.sets.log 852.1 GB/85.2 GB (999.7%)
2017-09-10T14:46:04.381+0200 [########################] bigdata.logs.sets.log 852.4 GB/85.2 GB (1000.0%)
And it keeps going!
Note that the other collections have finished. Only this one goes beyond 100%. I do not understand.
This is mongo 3.2.7 on Mac OSX.
There is obviously a problem with the amount of data reported as imported, because there is not even that much disk space.
$ df -h
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk3 477Gi 262Gi 214Gi 56% 68749708 56193210 55% /
The amount of disk space used could be right, because the gzipped backup is about 200GB. I do not know if this would result in the same amount of data on the WiredTiger database with snappy compression.
However, the log keeps showing inserts:
2017-09-10T16:20:18.986+0200 I COMMAND [conn9] command bigdata.logs.sets.log command: insert { insert: "logs.sets.log", documents: 20, writeConcern: { getLastError: 1, w: 1 }, ordered: false } ninserted:20 keyUpdates:0 writeConflicts:0 numYields:0 reslen:40 locks:{ Global: { acquireCount: { r: 19, w: 19 } }, Database: { acquireCount: { w: 19 } }, Collection: { acquireCount: { w: 19 } } } protocol:op_query 245ms
2017-09-10T16:20:19.930+0200 I COMMAND [conn9] command bigdata.logs.sets.log command: insert { insert: "logs.sets.log", documents: 23, writeConcern: { getLastError: 1, w: 1 }, ordered: false } ninserted:23 keyUpdates:0 writeConflicts:0 numYields:0 reslen:40 locks:{ Global: { acquireCount: { r: 19, w: 19 } }, Database: { acquireCount: { w: 19 } }, Collection: { acquireCount: { w: 19 } } } protocol:op_query 190ms
Update
Disk space is still being consumed. This is roughly 2 hours later, and roughly 30 GB later:
$ df -h
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk3 477Gi 290Gi 186Gi 61% 76211558 48731360 61% /
The question is: Is there a bug in the progress indicator, or is there some kind of loop that keeps inserting the same documents?
Update
It finished.
2017-09-10T19:35:52.268+0200 [########################] bigdata.logs.sets.log 1604.0 GB/85.2 GB (1881.8%)
2017-09-10T19:35:52.268+0200 restoring indexes for collection bigdata.logs.sets.log from metadata
2017-09-10T20:16:51.882+0200 finished restoring bigdata.logs.sets.log (3573548 documents)
2017-09-10T20:16:51.882+0200 done
1604.0 GB/85.2 GB (1881.8%)
Interesting. :)
It looks similar to this bug: https://jira.mongodb.org/browse/TOOLS-1579
There seems to be a fix backported to 3.5 and 3.4. The fix might not be backported to 3.2. I'm thinking the problem might have something to do with using gzip and/or snappy compression.
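To put a number on the overshoot, the two sizes in a progress line can be divided. This is only a sketch: the helper name is made up, and the idea that the first figure counts uncompressed bytes while the second is the size of the gzipped file is a guess that nothing here confirms.

```javascript
// Parse a mongorestore progress line containing "<done> GB/<total> GB"
// and return how many times larger the processed amount is than the total.
// Returns null when the line does not contain that pattern.
function progressRatio(line) {
  const m = /([\d.]+) GB\/([\d.]+) GB/.exec(line);
  if (!m) {
    return null;
  }
  return parseFloat(m[1]) / parseFloat(m[2]);
}

// Final progress line from the question:
//   progressRatio('bigdata.logs.sets.log 1604.0 GB/85.2 GB (1881.8%)')
```

For the final line this gives a factor of roughly 18.8. If the guess about compressed and uncompressed sizes holds, that factor would simply be the compression ratio of the dump file.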
I have a query that is seeing some pretty long execution times. Query:
db.legs.find(
    {
        effectiveDate: {$lte: startDate},
        discontinuedDate: {$gte: startDate}
    }
).count()
and below is the output in my logs:
2016-11-21T08:58:50.470-0800 I COMMAND [conn2] command myDB.legs
command: count { count: "legs", query: { effectiveDate: { $lte: new Date(1412121600000) }, discontinuedDate: { $gte: new Date(1412121600000) } }, fields: {} }
planSummary: IXSCAN { discontinuedDate: 1 } keyUpdates:0 writeConflicts:0 numYields:82575 reslen:47 locks:{ Global: { acquireCount: { r: 165152 } }, MMAPV1Journal: { acquireCount: { r: 82576 } }, Database: { acquireCount: { r: 82576 } }, Collection: { acquireCount: { R: 82576 } } } protocol:op_command 13940ms
I have an index on {effectiveDate: 1, discontinuedDate: 1} and it is using an IXSCAN to get the data. I'm wondering if anyone can suggest any ways to speed up this query? Isn't IXSCAN the fastest operation we can hope for in this situation?
The explain output doesn’t help much, because the dates in the query were compared to strings like “1/1/2015”, resulting in 0 matches.
Since you have 2 range filters, index intersection doesn’t work, so basically mongo uses 1 index, fetches the documents, and applies the second filter. It may still work for covered queries, but it might be a better idea to try a query without indexes at all:
db.legs.find({
    effectiveDate: {$lte: startDate},
    discontinuedDate: {$gte: startDate}
})
.hint({$natural: true})
.count()
Even though it does a COLLSCAN, it uses a COUNT stage instead of FETCH, which may be quicker.
Store the date in milliseconds in a new field and apply the filter on that field. Convert the input date to milliseconds as well before filtering. This should be faster. To convert a date to milliseconds, see epochconverter.
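A rough sketch of that suggestion. The field names effectiveMs and discontinuedMs and both helper names are hypothetical, and whether this is actually faster than comparing Date values would need measuring.

```javascript
// Build the $set document that copies both dates into millisecond fields.
// effectiveMs and discontinuedMs are hypothetical field names.
function msFields(doc) {
  return {
    effectiveMs: doc.effectiveDate.getTime(),
    discontinuedMs: doc.discontinuedDate.getTime()
  };
}

// Build the count filter on the millisecond fields for a given start date.
function msFilter(startDate) {
  const ms = startDate.getTime();
  return { effectiveMs: { $lte: ms }, discontinuedMs: { $gte: ms } };
}

// In the mongo shell (not executed here):
//   db.legs.find().forEach(function (doc) {
//     db.legs.update({ _id: doc._id }, { $set: msFields(doc) });
//   });
//   db.legs.count(msFilter(new Date(1412121600000)));
```

Date.getTime() already returns milliseconds since the epoch, so no external converter is needed inside the shell.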