There are some existing questions (1, 2) that discuss the MongoDB warning Query Targeting: Scanned Objects / Returned has gone above 1000; however, my question is a different case.
The schema of our document is
{
"_id" : ObjectId("abc"),
"key" : "key_1",
"val" : "1",
"created_at" : ISODate("2021-09-25T07:38:04.985Z"),
"a_has_sent" : false,
"b_has_sent" : false,
"updated_at" : ISODate("2021-09-25T07:38:04.985Z")
}
The indexes of this collection are
{
"key" : {
"updated_at" : 1
},
"name" : "updated_at_1",
"expireAfterSeconds" : 5184000,
"background" : true
},
{
"key" : {
"updated_at" : 1,
"a_has_sent" : 1,
"b_has_sent" : 1
},
"name" : "updated_at_1_a_has_sent_1_b_has_sent_1",
"background" : true
}
The total number of documents with updated_at after 2021-09-24 is over 600,000, and there are only 5 distinct values of key.
The above warning is caused by the query
db.collectionname.find({ "updated_at": { "$gte": ISODate("2021-09-24")}, "$or": [{ "a_has_sent": false }, {"b_has_sent": false}], "key": "key_1"})
Our server sends each document to a and b simultaneously, with a batch size of 2000. After sending to a successfully, we mark a_has_sent as true; the same logic applies to b. As the sending process goes on, the number of documents with a_has_sent: false decreases, and the above warning comes up.
Checking the explain result of this query shows that the index named updated_at_1 is used rather than updated_at_1_a_has_sent_1_b_has_sent_1.
What we have tried:
We added a new index {"updated_at": 1, "key": 1}, expecting the query to use it and reduce the number of scanned documents. Unfortunately, it did not: the index named updated_at_1 is still used.
We tried replacing find with aggregate:
aggregate([{"$match": { "updated_at": { "$gte": ISODate("2021-09-24") }, "$or": [{ "a_has_sent": false }, { "b_has_sent": false }], "key": "key_1"}}])
Unfortunately, the index named updated_at_1 is still used.
We want to know how to eliminate this warning: Scanned Objects / Returned has gone above 1000.
MongoDB 4.0 is used in our case.
Follow the ESR rule
For compound indexes, this rule of thumb is helpful in deciding the order of fields in the index:
First, add those fields against which Equality queries are run.
The next fields to be indexed should reflect the Sort order of the query.
The last fields represent the Range of data to be accessed.
We created the index {"key" : 1, "a_has_sent" : 1, "b_has_sent" : 1, "updated_at" : 1} (equality field first, range field last), and the query uses this index now.
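A minimal sketch of creating that index and confirming the plan, assuming the collection is named collectionname:
db.collectionname.createIndex(
    { "key": 1, "a_has_sent": 1, "b_has_sent": 1, "updated_at": 1 },
    { background: true }
)
// The winning plan should now be an IXSCAN over the new index:
db.collectionname.find({
    "key": "key_1",
    "$or": [{ "a_has_sent": false }, { "b_has_sent": false }],
    "updated_at": { "$gte": ISODate("2021-09-24") }
}).explain("queryPlanner").queryPlanner.winningPlan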
Update 08/15/2022
Query Targeting alerts indicate inefficient queries.
Query Targeting: Scanned Objects / Returned occurs if the number of documents examined to fulfill a query relative to the actual number of returned documents meets or exceeds a user-defined threshold. The default is 1000, which means that a query must scan more than 1000 documents for each document returned to trigger the alert.
Here are some steps to address this issue:
First, the Performance Advisor provides the easiest and quickest way to create an index. If it suggests any indexes, you can create the recommended index.
Then, if there is no recommended index in the Performance Advisor, you can check the Query Profiler. The Query Profiler contains several metrics you can use to pinpoint specific inefficient queries. It can show the Examined : Returned Ratio (index keys examined to documents returned) of logged queries, which might help you identify the queries that triggered a Query Targeting: Scanned / Returned alert. The chart shows the number of index keys examined to fulfill a query relative to the actual number of returned documents.
You can use the following resources to determine which query generated the alert:
The Real-Time Performance Panel monitors and displays current network traffic and database operations on machines hosting MongoDB in your Atlas clusters.
The MongoDB logs maintain an account of activity, including queries, for each mongod instance in your Atlas clusters.
The following mongod log entry shows statistics generated from an inefficient query:
<Timestamp> COMMAND <query>
planSummary: COLLSCAN keysExamined:0
docsExamined: 10000 cursorExhausted:1 numYields:234
nreturned:4 protocol:op_query 358ms
This query scanned 10,000 documents and returned only 4, for a ratio of 2500, which is highly inefficient. No index keys were examined, so MongoDB scanned all documents in the collection, known as a collection scan.
The cursor.explain() command for mongosh provides performance details for all queries.
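For instance, a quick sketch of pulling the examined/returned numbers out of explain (the collection name and filter here are placeholders):
// Run the query with execution statistics:
var stats = db.collectionname.find({ "key": "key_1" }).explain("executionStats").executionStats
// The alert's ratio is roughly totalDocsExamined / nReturned for the query:
print("examined:", stats.totalDocsExamined, "returned:", stats.nReturned)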
The Database Profiler records operations that Atlas considers slow when compared to the average execution time for all operations on your cluster.
Note - Enabling the Database Profiler incurs a performance overhead.
MongoDB cannot use a single index to process an $or that looks at different field values.
The index on
{
"updated_at" : 1,
"a_has_sent" : 1,
"b_has_sent" : 1
}
cannot be used with the $or expression to match both a_has_sent and b_has_sent; each branch would need its own suitable index.
To minimize the number of documents examined, create 2 indexes, one for each branch of the $or, each combined with the enclosing $and (the filter implicitly combines the top-level query predicates with $and). For example:
{
"updated_at" : 1,
"a_has_sent" : 1
}
and
{
"updated_at" : 1,
"b_has_sent" : 1
}
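A minimal sketch of creating the two indexes (collection name assumed; options such as background omitted):
db.collectionname.createIndex({ "updated_at": 1, "a_has_sent": 1 })
db.collectionname.createIndex({ "updated_at": 1, "b_has_sent": 1 })
With both in place, the planner can satisfy each branch of the $or from its own index and merge the results, which should show up as an OR stage in the explain output.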
Also note that the alert for Query Targeting: Scanned Objects / Returned has gone above 1000 does not refer to a single query.
The MongoDB server keeps a counter (64-bit?) that tracks the number of documents examined since the server was started, and another counter for the number of documents returned.
The scanned-per-returned ratio is derived by simply dividing the examined counter by the returned counter.
This means that if you have something like a count query that requires examining documents, you may have hundreds or thousands of documents examined but only 1 returned. It won't take many of these kinds of queries to push the ratio over the 1000 alert limit.
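If you want to look at those counters yourself, they are exposed through serverStatus (a rough sketch; the exact metric paths may vary between versions):
// Server-wide counters since startup (not per query):
var m = db.serverStatus().metrics
var examined = m.queryExecutor.scannedObjects  // documents examined
var returned = m.document.returned             // documents returned
print("ratio:", examined / returned)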
I am getting the 'Sort operation used more than the maximum 33554432 bytes of RAM' error on some query. However, what is even more troublesome is that I can't even run it with .explain():
> db.collection.find({...}, {...}).limit(5000).sort({...})
Error: error: {
"ok" : 0,
"errmsg" : "errmsg: \"Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.\"",
"code" : 96,
"codeName" : "OperationFailed"
}
> db.collection.find({...}, {...}).limit(5000).sort({...}).explain()
2019-07-22T09:15:38.246+0000 E QUERY [thread1] Error: explain failed: {
"ok" : 0,
"errmsg" : "errmsg: \"Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.\"",
"code" : 96,
"codeName" : "OperationFailed"
} :
_getErrorWithCode#src/mongo/shell/utils.js:25:13
throwOrReturn#src/mongo/shell/explainable.js:31:1
constructor/this.finish#src/mongo/shell/explain_query.js:172:24
DBQuery.prototype.explain#src/mongo/shell/query.js:521:12
#(shell):1:1
I tried recreating the index, but nothing changed.
What could be happening here?
Edit: To clarify, the sort field is indexed and in the correct sort order. I was wondering why the explain fails, which seemed odd to me and I thought there might be data corruption going on here. How are we meant to diagnose a problematic query with explain if it fails on what we are trying to diagnose?
Edit 2: Upon further diagnosis I could pinpoint it exactly: .limit(4308) works and .limit(4309) barfs. However, there is nothing wrong with the 4309th record...
Furthermore, this happens in one environment and not in the other, and the two are identical except for the data.
For any time travelers from the future:
Re .explain(): this seems to be just a quirk in Mongo. To see the query plan, the limit must be reduced. As silly as this sounds, I guess Mongo actually runs the query and then shows the query plan...
Worth noting that this is Mongo 3.4. Might have changed by now...
Our performance problem came down to having a huge subobject property (let's call it .metaData). Since we knew it was problematic, we didn't include it in the projection. But it does appear in the find criteria as {metaData: {$exists: true}}. I guess Mongo fetches the whole thing and keeps it in memory only to do {$exists: true} on it. That caused the query to blow past the 32MB memory limit even though the actual result requires much less memory and the sort field is indexed.
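For illustration, the query had roughly this shape (the other field names below are made up); excluding metaData from the projection did not stop the filter from dragging it into memory:
db.collection.find(
    { metaData: { $exists: true }, user: "someUser" },  // hypothetical extra predicate
    { metaData: 0 }                 // excluded from the result, but still fetched for $exists
).sort({ createdAt: 1 }).limit(5000)  // createdAt stands in for the indexed sort field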
So we live to write more bugs another day...
I am using maxTimeMS to terminate queries in Mongo if they exceed 1000ms.
Is there a way to find out whether a query was terminated due to maxTimeMS, or whether it simply returned no results because nothing matched?
MongoDB will throw an exception if the query times out.
Like this:
error: { "$err" : "operation exceeded time limit", "code" : 50 }
Hence you are able to distinguish between the timeout case and the no-result case.
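A minimal sketch of telling the two cases apart in the shell (the filter and collection are placeholders, and we assume the thrown error exposes the numeric code, as in mongosh):
try {
    var docs = db.collection.find({ status: "pending" }).maxTimeMS(1000).toArray()
    if (docs.length === 0) {
        print("query completed but matched no documents")
    }
} catch (e) {
    // code 50 corresponds to the "operation exceeded time limit" error above
    if (e.code === 50) {
        print("query was terminated by maxTimeMS")
    } else {
        throw e
    }
}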
I am getting an error when I execute a sort on a non-indexed field:
com.mongodb.MongoQueryException: Query failed with error code 17144 and error message 'Executor error: Overflow sort stage buffered data usage of 33555047 bytes exceeds internal limit of 33554432 bytes' on server
I tried using
db.adminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes: 50151432})
{ "was" : 33554432, "ok" : 1 }
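For reference, a read-back of the parameter looks like this:
db.adminCommand({ getParameter: 1, internalQueryExecMaxBlockingSortBytes: 1 })
// { "internalQueryExecMaxBlockingSortBytes" : 50151432, "ok" : 1 }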
When I use getParameter, I can see the value set to 50151432.
But I still see the same error mentioning 33554432 for the same query.
Any ideas how to get past this?
My Mongo version is 3.4
I have run into a problem where I have exceeded the allowed BSON size of 16MB and I am getting this error now whenever I try to do something on my collection.
Now my question is, how do I repair and solve the problem?
How do I check whether it is an individual document within my collection, or the collection itself, that exceeds the limit?
How do I remove the offending document? I just keep getting this error whenever I try doing something with this collection now.
I have already tried db.repairDatabase(), but just keep getting the same error:
"errmsg" : "exception: BSONObj size: 1718558820 (0x666F2064) is invalid. Size must be between 0 and 16793600(16MB) First element: ...: ?type=32",
"code" : 10334,
"ok" : 0
Look at the size. It's obviously not a size; it's four ASCII characters. Go and find your bug.
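To see this for yourself, decode the bytes of that bogus "size" (a throwaway check, runnable in the shell):
// 1718558820 == 0x666F2064; every byte is a printable ASCII code
var n = 1718558820
var chars = [24, 16, 8, 0].map(function (shift) {
    return String.fromCharCode((n >> shift) & 0xff)
})
print(chars.join(""))  // "fo d" -- text sitting where a BSON length prefix should be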