I have quite a bit of experience with Mongo, but I'm on the verge of tears of frustration over this problem (which, of course, popped up out of nowhere a day before release).
Basically, I am querying a database to retrieve a document, but the query is often an order of magnitude (or even two) slower than it should be, which is especially puzzling since it returns nothing.
Query:
//searchQuery ex: { "atomic.Basic^SessionId" : "a8297898-7fc9-435c-96be-9c5e60901e40" }
var doc = FindOne(searchQuery);
Explain:
{
    "cursor" : "BtreeCursor atomic.Basic^SessionId",
    "isMultiKey" : false,
    "n" : 0,
    "nscannedObjects" : 0,
    "nscanned" : 0,
    "nscannedObjectsAllPlans" : 0,
    "nscannedAllPlans" : 0,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "atomic.Basic^SessionId" : [
            [
                "a8297898-7fc9-435c-96be-9c5e60901e40",
                "a8297898-7fc9-435c-96be-9c5e60901e40"
            ]
        ]
    }
}
It is often taking 50-150 ms, even though mongotop reports at most 15 ms of read time (and that should be over several queries). There are only 6k documents in the database (only 2k or so are in the index, and the explain says it's using the index) and since the document being searched for isn't there, it can't be a deserialization issue.
It's not this bad on every query (sub ms most of the time) and surely the B-tree isn't large enough to have that much variance.
Any ideas will have my eternal gratitude.
MongoTop is not reporting the total query time. It reports the amount of time MongoDB is spending holding particular locks.
That query returned in 0 ms according to the explain (which is extremely fast). What you are describing sounds like network latency. What is the latency when you ping the node? Is it possible that the network is flaky?
What version of MongoDB are you using? Consider upgrading both MongoDB and the C# driver to the latest stable versions.
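To rule the network in or out quickly, one rough check (the host name is a placeholder) is to ping the node from the application server and then time a trivial command from the mongo shell:

// OS level: ping your-mongo-host and note the round-trip time.
// From the mongo shell connected to the same node, time a no-op command:
var start = new Date();
db.runCommand({ ping: 1 });
print("round trip: " + (new Date() - start) + " ms");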
Related
1st stage:
$match
{
  delFlg: false
}
2nd stage:
$lookup
{
  from: "parkings",
  localField: "parking_id",
  foreignField: "_id",
  as: "parking"
}
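Written out as a single shell command, the pipeline looks roughly like this (the source collection name is a placeholder):

// "invoices" is a placeholder for the collection the aggregation runs against.
db.invoices.aggregate([
  { $match: { delFlg: false } },
  {
    $lookup: {
      from: "parkings",
      localField: "parking_id",
      foreignField: "_id",
      as: "parking"
    }
  }
]);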
Using 2 indexes (_id & parking_id).
In Compass against our production MongoDB server (Linux), the explain plan shows almost 9000 ms. Against a local MongoDB server, the explain plan shows about 1400 ms.
Is there any problem? Why is our production server running so slowly?
Full Explain Plan

Query Performance Summary
Documents returned: 71184
Actual query execution time (ms): 8763
Query used the following indexes: 2 (_id_, delFlg_1)

explainVersion: "1"
stages: Array (truncated)
serverInfo: Object (truncated)
serverParameters: Object (truncated)
command: Object (truncated)
ok: 1
$clusterTime: Object (truncated)
operationTime: Timestamp({ t: 1666074389, i: 3 })
Production server: MongoDB 5.0.5
Local server: MongoDB 6.0.2
Unfortunately the explain output that was provided is mostly truncated (eg Array and Object hiding most of the important information). Even without that there should be some meaningful observations that we can make.
As always, I suggest starting with more clearly defining what your ultimate goals are. What is considered "so slow", and how quickly would the operation need to return in order to be acceptable for your use case? As outlined below, I doubt there is much room for improvement with the current schema. You may need to rethink the approach entirely if you are looking for improvements that are several orders of magnitude in nature.
Let's consider what this operation is doing:
The database goes to an entry in the { delFlg: 1 } index that has a value of false.
It then goes to FETCH that full document from the collection.
Based on the value of the parking_id field, it performs an IXSCAN of the { _id: 1 } index on the parkings collection. If a matching index entry is discovered then the database will FETCH the full document from the other collection and add it to the new parking array field that is being generated.
This full process then repeats 71,183 more times.
Using 8763ms as the duration, that means that the database performed each full iteration of the work described above more than 8 times per millisecond (8.12 ~= 71184 (iters) / 8763 (ms)). I think that sounds pretty reasonable and is unlikely to be something that you can meaningfully improve on. Scaling up the hardware that the database is running on may provide some incremental improvements, but it is generally costly, not scalable, and likely not worthwhile if you are looking for more substantial improvements.
You also mentioned two different timings and database versions in the question. If I understood correctly, it was mentioned that the (explain) operation takes ~9 seconds on version 5.0 and about 1.4 seconds on version 6.0. While there is far too little information here to say for sure, one reason for that may be associated with improvements for $lookup that were introduced in version 6.0. Indeed from their 7 Big Reasons to Upgrade to MongoDB 6.0 article, they claim the following:
The performance of $lookup has also been upgraded. For instance, if there is an index on the foreign key and a small number of documents have been matched, $lookup can get results between 5 and 10 times faster than before.
This seems to match your situation and observations pretty directly.
Apart from upgrading, what else might improve performance here? Well, if many or most documents in the source collection have { delFlg: false }, then you may wish to get rid of the { delFlg: 1 } index. Indexes provide benefits when the size of the results they are retrieving is small relative to the overall size of the data in the collection. But as that percentage grows, the overhead of scanning most of the index plus randomly fetching all of the data quickly becomes less effective than just scanning the full collection directly. It is mentioned in the comments that the invoices collection contains 70k documents, so this predicate on delFlg hardly removes any results at all.
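If you conclude that the index isn't earning its keep, dropping it is a one-liner (the index name is taken from the explain summary above, and the collection name from the comments):

// Dropping the single-field index lets the planner fall back to a plain
// collection scan for this low-selectivity predicate.
db.invoices.dropIndex("delFlg_1")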
One other thing that really stands out is this statement from the comments: "parkings contains 16 documents". Should the information from those 16 documents just be embedded in the invoices documents directly instead? If the two are commonly accessed together and the parkings information doesn't change very often, then combining the two would remove a lot of overhead and would further improve performance.
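As a rough illustration of that embedding, a hypothetical one-off migration (collection and field names taken from the question) could copy each parking document into the invoices that reference it, after which the $lookup stage could be dropped entirely:

// One-off denormalization sketch: embed each of the 16 parking documents
// into the invoices that reference them.
db.parkings.find().forEach(function (p) {
  db.invoices.updateMany(
    { parking_id: p._id },
    { $set: { parking: p } }
  );
});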
Two additional components about explain that are worth keeping in mind:
The command does not factor in network time to actually transmit the results to the client. Depending on the size of the documents and the physical topology of the environment, this network time could add another meaningful delay to the operation. Be sure that the client really needs all of the data that is being sent.
Depending on the verbosity, explain will measure the time it takes to run the operation through completion. Since there are no blocking stages in your aggregation pipeline, we would expect your initial batch of documents to be returned to the client in much less time. Apart from networking, that time may be around 12ms (which is approximately (101 docs / 71184 ) * 8763) for 5.0.
I'm having issues with my database right now with hundreds of people on our site. Looking at my current operations in my Compose.io dashboard, I see one query that has been running for a very long time.
I have an index on _id: 1, owner: 1 and date: -1. Any idea why this query could be taking so long and locking my entire database? I can also see that the number of yields is 268, which is very high and seems problematic.
There is another query that has been running for half a second. It also seems to be locking the entire database. Why is this?
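For reference, the same view is available outside the dashboard; a minimal shell sketch (the opid value is hypothetical):

// List operations that have been running for at least one second.
db.currentOp({ secs_running: { $gte: 1 } })
// If one of them has to be stopped, kill it by its opid (hypothetical value).
db.killOp(12345)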
I am using mongodb via the mongo shell to query a large collection. For some reason after 90 seconds the mongo shell seems to be stopping my query and nothing is returned.
I have tried the following two commands but neither will return anything. After 90 seconds it just gives me a new line that I can type in another command.
db.cards.find({ "Field": "Something" }).maxTimeMS(9999999)
db.cards.find({ "Field": "Something" }).addOption(DBQuery.Option.tailable)
db.cards.find() returns results, but anything with parameters times out at exactly 90 seconds and nothing is returned.
Any help would be greatly appreciated.
Given the level of detail in your question, I am going to focus on 'query a large collection' and guess that you are using the MMAPv1 storage engine with no index coverage on your query.
Are you disk bound?
Given the above assumptions, you could be cycling data between RAM and disk. Mongo has a default 100MB RAM limit, so if your query has to examine a lot of documents (no index coverage), paging data from disk to RAM could be the culprit. I have heard of mongo shell acting as you describe or locking/terminating when memory constraints are exceeded.
32bit systems can also impose severe memory limits for large collections.
You could look at your OS specific disk activity monitor to get a clue into whether this is your problem.
Just how large is your collection?
Next, how big is your collection? You can run show collections to see the physical size of the collection, and db.cards.count() to see your record count. This helps quantify "large collection".
NOTE: you might need the mongo-hacker extensions to see collection disk use in show collections.
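If the mongo-hacker extensions are not installed, a minimal alternative is to read the numbers from collStats directly:

// Document count, logical data size, and on-disk size for the collection.
var s = db.cards.stats();
print("count: " + s.count + ", size: " + s.size + " bytes, storageSize: " + s.storageSize + " bytes");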
Mongo shell investigation
Within the mongo shell, you have a couple more places to look.
By default, mongo will log slow queries (> 100ms). After your 90 sec timeout:
db.adminCommand({getLog: "global" })
and look for slow query log entries.
Next look at your winning query plan.
var e = db.cards.explain()
e.find({ "Field": "Something" })
I am guessing you will see
"stage": "COLLSCAN",
which means you are doing a full collection scan, and you need index coverage for your query (a good idea for queries and sorts in general).
Suggestions
You should have at least partial index coverage on any production query. A proper index should solve your problem (assuming you don't have documents > 16MB).
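For example, a single-field index on the queried field (field name taken from your query) would give the planner an IXSCAN to use instead of the COLLSCAN:

// Build the index; on older MMAPv1 deployments the background option avoids
// blocking the database while the index is created.
db.cards.createIndex({ "Field": 1 }, { background: true })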
Another approach (that I don't recommend - indexing is better) is to use a cursor instead
var cursor = db.cards.find({ "Field": "Something" })
while (cursor.hasNext()) {
print(tojson(cursor.next()));
}
Depending on the root cause, this may work for you.
db.collection.find in my app that uses mongodb java driver (latest) are super slow. I investigated one of them as follows
// about 300 ids at a time (I've tried lower and higher numbers - no impact)
db.users.find({_id : {$in : [1,2,3,4,5,6....]}})
Once I get the cursor I do cursor.toArray() and then iterate over the results.
The toArray operation is extremely slow; on average it takes about a minute. IMPORTANT: my database is under very heavy load at all times. This particular collection has over 50 million entries.
I've narrowed down the issue in mongo java driver to com.mongodb.Response - specifically to this line:
final byte [] b = new byte[36];
Bits.readFully(in, b);
Incredibly, readFully of just 36 bytes sometimes takes over a minute!
When I bring down the load on the databases, the improvement is drastic: from about a minute to 5-6 seconds. I mean, 5-6 seconds to get 300 documents is still super slow, but definitely better than 1 minute.
What can I do to troubleshoot this further? Are there settings on MongoDB that I need to look at?
What happens
You are loading all of the 300 user documents.
What happens is that the _id index is searched and the matching documents are sent completely to your app. So MongoDB will access its data files, read the first document and send it to you, then jump to the next document and send it to you, and so forth. If you used the cursor, you could start iterating over the returned documents as soon as a number of documents equal to your defined cursor size had been returned, with the rest lazily loaded from the cursor on the server on demand. (A bit of a simplification, but sufficient for answering this question.) What you are doing instead is explicitly waiting until the index has been scanned, the documents have been located and sent back to your app, and the last byte of the last document has arrived. As @wdberkeley (who works for 10gen) correctly pointed out, this is a Very Bad Idea™.
What might cause or intensify the problem
Under heavy load, two things might happen. The more likely one is that your _id index isn't in RAM any more, causing thousands, if not millions, of reads from disk, which is slow - much slower than if the indices are kept in RAM (by several orders of magnitude). So it is not the code snippet you mentioned but the response time of MongoDB that causes the delay. The other possibility under heavy load is that your disk IO is simply too low or (more likely) the random file read latency is too high. I assume you are using spinning disks plus not enough RAM for a database that size.
What to do to find the cause
Try to find out your index size using db.users.stats() (see the sketch after this list). I am pretty sure that your index sizes combined exceed your available RAM.
Measure the disk IO and latency. If you use a GNU/Linux OS, you might want to find out how high your IOwait percentage is. A high percentage shows that your disk latency is too high for the load put on the server. It might even be that you are reaching the disk's IO limits.
Do your queries on a mongo shell. In case they are fast, you can be pretty sure that your toArray call is the cause of the problem.
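A minimal sketch of that first check, assuming the standard collStats output:

// Per-index sizes in bytes plus the combined total, from db.collection.stats().
var stats = db.users.stats();
printjson(stats.indexSizes);
print("total index size: " + stats.totalIndexSize + " bytes");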
What to do to resolve the problem
If you have not enough RAM, either scale up or scale out.
If your disk latency or throughput is too high, either scale out or ( better and cheaper in most cases ) use SSDs for storing MongoDB's data.
Use a cursor object to iterate over the documents. This is a better solution in almost every use case I can think of.
Upgrading the MongoDB driver to 3.6.4 will fetch the data in no time.
We have around 2 million documents in our collection, and with the previous version it was taking around ~3 minutes, but after upgrading to 3.6.4 it took only 5-7 seconds. So what I feel is that there is some issue with the old version of the MongoDB driver.
I have a simple data set: a few collections, with no more than 20 documents in each, in MongoDB 2.0 (previously 1.8). I'm getting poor results when it comes to querying data (at least I think they could be much better, looking at http://mongoid.org/performance.html). At first I thought that the mapper I use in Ruby (Mongoid) was the problem, but I made some more tests and it seems more related to the database itself.

I've made a simple benchmark where I query the same document 10000 times by its ID, first using the Ruby Mongo driver, then Mongoid. The results:
user system total real
driver 7.670000 0.380000 8.050000 ( 8.770334)
mongoid 9.180000 0.380000 9.560000 ( 10.384077)
The code is here: https://gist.github.com/1303536
The machine I'm testing this on is a Core 2 Duo P8400 2.27 GHz with 4 GB of RAM running Ubuntu 11.04.

I also made a similar test using pymongo to check if the problem lies in the Ruby driver, but the result was only slightly better (5-6 s for 10000 requests).

The bsonsize of the document I'm fetching is 67. It has some small embedded documents, but not more than 100. Some of the embedded documents refer to documents from other collections by ID, but AFAIR this relationship is handled by the mapper, so it shouldn't influence the performance. Fetching this document directly in the database with explain() results in millis = 0.

The odd thing is that the HDD LED keeps blinking all the time during the tests. Shouldn't this document be cached in RAM by Mongo after the first read? Is there something obvious I could be missing? Or is this not a poor result at all (although comparing with http://mongoid.org/performance.html it does seem bad)?
I dropped and recreated the database. Maybe it was because of going from 1.8 to 2.0. Anyway, the HDD LED stopped blinking and everything is now 2-3x faster.

I also looked carefully at the test that was used to benchmark Mongoid, and that result (0.001 s) is just for one find(), not a million. I told Mongoid's author that I think it's not stated clearly on the web site that the quoted number of operations applies only to some of the benchmarks.

Sorry for the confusion.