Why does Mongo FETCH on count() with $nin? - mongodb

I am trying to understand why Mongo can't use a covered index with my query using $nin, and how to resolve it. My issue is with a compound index, but it happens with a simple index too.
Take a simple document:
{b: "text1"}
And a simple index:
{
"v" : 1,
"key" : {
"b" : 1
},
"name" : "b_1",
"ns" : "mytest"
}
And what I thought was a simple count() query:
db.mytest.count( {b: {$nin: [ "foo" ]}}, {b:1, _id:0} )
The winningPlan unexpectedly includes a FETCH:
"winningPlan" : {
"stage" : "COUNT",
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"b" : 1
},
"indexName" : "b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[MinKey, \"foo\")",
"(\"foo\", MaxKey]"
]
}
}
}
}
But with a simple equality condition it uses COUNT_SCAN (as expected):
> db.mytest.count( {b: "bar" }, {b:1, _id:0} )
"winningPlan" : {
"stage" : "COUNT",
"inputStage" : {
"stage" : "COUNT_SCAN",
"keyPattern" : {
"b" : 1
},
"indexName" : "b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1
}
},
To make things more interesting, a find() instead of a count() doesn't look at any documents:
> db.mytest.find({b:{ $nin: [ 3 ] }}, {b:1, _id:0})
"winningPlan" : {
"stage" : "PROJECTION",
"transformBy" : {
"b" : 1,
"_id" : 0
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"b" : 1
},
"indexName" : "b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[MinKey, 3.0)",
"(3.0, MaxKey]"
]
}
}
}
Why does Mongo need to FETCH with $nin? It should be able to fulfill this exclusively from the index.

So it appears that this is a bug that was fixed in 3.6. There was definitely an unnecessary FETCH in many COUNT situations.
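As a rough illustration of why the index alone should be enough for this count, here is a toy model in plain JavaScript (not MongoDB internals). A sorted array of strings stands in for the b_1 index, and the helpers ninBounds and countFromIndex are hypothetical names made up for this sketch. The bounds it builds mirror the indexBounds in the plan above: everything below "foo" and everything above it.

```javascript
// Toy model only: a sorted array of key values stands in for the b_1 index.
// null marks an open end (MinKey on the left, MaxKey on the right).
function ninBounds(excluded) {
  const points = [...new Set(excluded)].sort(); // string keys only in this sketch
  const bounds = [];
  let lower = null;
  for (const p of points) {
    bounds.push({ gt: lower, lt: p }); // gap just below the excluded value
    lower = p;
  }
  bounds.push({ gt: lower, lt: null }); // gap above the last excluded value
  return bounds;
}

// Count the index keys that fall inside any of the bounds.
// No document is ever looked at, only the keys.
function countFromIndex(indexKeys, bounds) {
  let n = 0;
  for (const key of indexKeys) {
    const inside = bounds.some(
      (b) => (b.gt === null || key > b.gt) && (b.lt === null || key < b.lt)
    );
    if (inside) n++;
  }
  return n;
}

const indexKeys = ["bar", "foo", "text1", "text1"]; // made-up index contents
console.log(countFromIndex(indexKeys, ninBounds(["foo"]))); // 3
```

The point of the sketch is only that the key values carry everything the count needs when the index is not multikey, which is why the FETCH in the plan looks unnecessary.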

Related

Mongo $or query with ranges is doing an in-memory sort?

I'm running into a unique situation where one query seems to do an in-memory sort. Query 1 is the one that does the in-memory sort, while Query 2 is doing a merge sort correctly.
There are a few parts to the query, so I want to know which part causes the sort to be done in memory.
I do have a workaround, but I would like to know the reason behind this. They both have 2 input stages, so I'm not sure what is the cause.
Schema:
schema = {
date: Date, // date that can change
createTime: Date, // create time of document
value: Number
}
Index:
schema.index({value: 1, createTime: -1, date: 1});
Query 1: I have $or at the top level to avoid using incorrect index: MongoDB query to slow when using $or operator
db.getCollection('dates').find({
$or: [
{value: {$in: [1, 2]}, date: null},
{value: {$in: [1, 2]}, date: {$gt: ISODate("2020-06-16T23:59:59.999Z")}}
]
}).sort({createTime:-1}).explain()
Query 1 plan: As you can see it does a sort in-memory. I'm not sure exactly why this is occurring.
{
"stage" : "SUBPLAN",
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "SORT",
"sortPattern" : {
"createTime" : -1.0
},
"inputStage" : {
"stage" : "SORT_KEY_GENERATOR",
"inputStage" : {
"stage" : "OR",
"inputStages" : [
{
"stage" : "FETCH",
"filter" : {
"date" : {
"$eq" : null
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"value" : 1,
"createTime" : -1,
"date" : 1
},
"indexName" : "value_1_createTime_-1_date_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"value" : [],
"createTime" : [],
"date" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"value" : [
"[1.0, 1.0]",
"[2.0, 2.0]"
],
"createTime" : [
"[MaxKey, MinKey]"
],
"date" : [
"[undefined, undefined]",
"[null, null]"
]
}
}
},
{
"stage" : "IXSCAN",
"keyPattern" : {
"value" : 1,
"createTime" : -1,
"date" : 1
},
"indexName" : "value_1_createTime_-1_date_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"value" : [],
"createTime" : [],
"date" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"value" : [
"[1.0, 1.0]",
"[2.0, 2.0]"
],
"createTime" : [
"[MaxKey, MinKey]"
],
"date" : [
"(new Date(1592351999999), new Date(9223372036854775807)]"
]
}
}
]
}
}
}
}
}
Query 2:
db.getCollection('dates').find({
value: {$in: [1, 2]},
date: {$not: {$lte: ISODate("2020-06-16T23:59:59.999Z")}}
}).sort({createTime:-1}).explain()
Query 2 plan: The workaround query I used, which does a merge sort successfully.
{
"stage" : "FETCH",
"inputStage" : {
"stage" : "SORT_MERGE",
"sortPattern" : {
"createTime" : -1.0
},
"inputStages" : [
{
"stage" : "IXSCAN",
"keyPattern" : {
"value" : 1,
"createTime" : -1,
"date" : 1
},
"indexName" : "value_1_createTime_-1_date_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"value" : [],
"createTime" : [],
"date" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"value" : [
"[1.0, 1.0]"
],
"createTime" : [
"[MaxKey, MinKey]"
],
"date" : [
"[MinKey, true]",
"(new Date(1592351999999), MaxKey]"
]
}
},
{
"stage" : "IXSCAN",
"keyPattern" : {
"value" : 1,
"createTime" : -1,
"date" : 1
},
"indexName" : "value_1_createTime_-1_date_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"value" : [],
"createTime" : [],
"date" : []
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"value" : [
"[2.0, 2.0]"
],
"createTime" : [
"[MaxKey, MinKey]"
],
"date" : [
"[MinKey, true]",
"(new Date(1592351999999), MaxKey]"
]
}
}
]
}
}
Each branch of the $or can use the index, but that still leaves two separate result sets. When a sort is applied on top of them, the database has to sort the combined results in memory, so an in-memory sort over an $or is to be expected here.
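Comparing the two plans above gives a more specific hint. In Query 2, each IXSCAN covers a single value ([1.0, 1.0] or [2.0, 2.0]), so each scan returns its keys in createTime order and SORT_MERGE only has to interleave them. In Query 1, each $or branch scans both values in one IXSCAN, so its output is ordered by value first and is not in createTime order overall. The following is a minimal sketch of that difference in plain JavaScript, using made-up data and a hypothetical mergeSortedDesc helper; it is not how the server implements SORT_MERGE.

```javascript
// Merge streams that are each already sorted descending by `key`.
// Only the head of each stream is compared, so no full sort is needed.
function mergeSortedDesc(streams, key) {
  const pos = streams.map(() => 0);
  const out = [];
  for (;;) {
    let best = -1;
    for (let s = 0; s < streams.length; s++) {
      if (pos[s] >= streams[s].length) continue; // stream exhausted
      if (best === -1 || streams[s][pos[s]][key] > streams[best][pos[best]][key]) {
        best = s;
      }
    }
    if (best === -1) return out; // all streams exhausted
    out.push(streams[best][pos[best]++]);
  }
}

// Made-up index entries for {value: 1, createTime: -1, date: 1}.
const scanValue1 = [{ value: 1, createTime: 9 }, { value: 1, createTime: 5 }, { value: 1, createTime: 2 }];
const scanValue2 = [{ value: 2, createTime: 8 }, { value: 2, createTime: 4 }, { value: 2, createTime: 1 }];

// One scan per value (as in Query 2): a merge is enough.
const merged = mergeSortedDesc([scanValue1, scanValue2], "createTime");
console.log(merged.map((d) => d.createTime)); // [ 9, 8, 5, 4, 2, 1 ]

// One scan over both values (as in each branch of Query 1):
// the output follows index order and is not sorted by createTime.
const singleScan = [...scanValue1, ...scanValue2];
console.log(singleScan.map((d) => d.createTime)); // [ 9, 5, 2, 8, 4, 1 ]
```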

Why aggregation framework is slower than simple find query

I am new to mongodb and came across some strange behaviour of aggregation framework.
I have a collection named 'billingData', this collection has approximately 2M documents.
I am comparing two queries that give the same output but have very different execution times.
Query 1:
db.billingData.find().sort({"_id":-1}).skip(100000).limit(50)
Execution Plan:
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "billingDetails.billingData",
"indexFilterSet" : false,
"parsedQuery" : {},
"winningPlan" : {
"stage" : "LIMIT",
"limitAmount" : 50,
"inputStage" : {
"stage" : "SKIP",
"skipAmount" : 100000,
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"_id" : 1
},
"indexName" : "_id_",
"isMultiKey" : false,
"multiKeyPaths" : {
"_id" : []
},
"isUnique" : true,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "backward",
"indexBounds" : {
"_id" : [
"[MaxKey, MinKey]"
]
}
}
}
}
},
"rejectedPlans" : []
},
"serverInfo" : {
"host" : "ip-172-60-62-125",
"port" : 27017,
"version" : "3.6.3",
"gitVersion" : "9586e557d54ef70f9ca4b43c26892cd55257e1a5"
},
"ok" : 1.0
}
Query 2:
db.billingData.aggregate([
{$sort : {"_id":-1}},
{$skip:100000},
{$limit:50}
])
Execution Plan:
{
"stages" : [
{
"$cursor" : {
"query" : {},
"sort" : {
"_id" : -1
},
"limit" : NumberLong(100050),
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "billingDetails.billingData",
"indexFilterSet" : false,
"parsedQuery" : {},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"_id" : 1
},
"indexName" : "_id_",
"isMultiKey" : false,
"multiKeyPaths" : {
"_id" : []
},
"isUnique" : true,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "backward",
"indexBounds" : {
"_id" : [
"[MaxKey, MinKey]"
]
}
}
},
"rejectedPlans" : []
}
}
},
{
"$skip" : NumberLong(100000)
}
],
"ok" : 1.0
}
I expected the aggregation framework and the find query to perform the same, but the find query returned its results in 2 seconds and the aggregation took 16 seconds, even though both queries sort the documents in descending order of _id and fetch 50 records after skipping 100,000.
Can someone explain why the aggregation framework behaves this way?
What can I do to make its performance similar to the find query's?
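One difference that is visible in the two explain outputs: in the find plan, SKIP and LIMIT sit inside the query plan itself, while in the aggregation the $cursor stage carries a limit of 100050 and $skip runs afterwards as a separate pipeline stage. The toy model below, in plain JavaScript with hypothetical function names, only counts how many documents cross from the query layer to the next layer in each shape. Whether that hand-off accounts for the 2 second versus 16 second gap is an assumption, not something the explain output proves.

```javascript
// Stand-in for a backward scan of the _id index: yields ids n, n-1, ..., 1.
function* indexScanDesc(n) {
  for (let id = n; id >= 1; id--) yield { _id: id };
}

// find(): skip and limit are applied inside the plan,
// so only the final page leaves the query layer.
function findPlan(n, skip, limit) {
  const out = [];
  let seen = 0;
  for (const doc of indexScanDesc(n)) {
    if (seen++ < skip) continue; // dropped by the SKIP stage
    out.push(doc);
    if (out.length === limit) break; // LIMIT stage reached
  }
  return { out, handedOver: out.length };
}

// aggregate(): the cursor is limited to skip + limit documents,
// all of which are handed to the pipeline before $skip drops most of them.
function aggregatePlan(n, skip, limit) {
  const fromCursor = [];
  for (const doc of indexScanDesc(n)) {
    fromCursor.push(doc);
    if (fromCursor.length === skip + limit) break;
  }
  return { out: fromCursor.slice(skip), handedOver: fromCursor.length };
}

const viaFind = findPlan(200000, 100000, 50);
const viaAggregate = aggregatePlan(200000, 100000, 50);
console.log(viaFind.handedOver, viaAggregate.handedOver); // 50 100050
```

Both shapes return the same 50 documents; only the amount of work done outside the query plan differs.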

What is multiKeyPaths in explain, and why does its absence cause bad index usage?

I have a large collection with 4 shards.
When I run a query over an indexed array field "array.number" like so:
var query = { "array" : { $elemMatch: { "number" : { $gte : "10", $lt : "20" } } } };
and check explain, I'll get these winning plans (abbreviated for clarity):
Shards 0/2/3:
"inputStage": {
"stage": "IXSCAN",
...
"isMultiKey": true,
"indexBounds": {
"array.number": [
"[\"10\", {})"
]
}
}
Shard1:
"inputStage": {
"stage": "IXSCAN",
...
"isMultiKey": true,
"multiKeyPaths": {
"array.number": [
"array"
]
},
"indexBounds": {
"array.number": [
"[\"10\", \"20\")"
]
}
}
So shard1 makes the expected, optimal use of the index, limiting the inputStage to the range 10-20, while the other shards only use the lower bound on the index. The only difference between the shards' plans is the multiKeyPaths part, which is missing on shards 0/2/3.
Any idea why that is, and how we can make the other shards use the index properly?
UPDATE
Here's the full explain response for the following query:
var query = { "array" : { $elemMatch: { "number" : { $gte : "10", $lt : "20" } } } };
db.collection.find(query).explain()
Response:
{
"queryPlanner" : {
"mongosPlannerVersion" : 1,
"winningPlan" : {
"stage" : "SHARD_MERGE",
"shards" : [
{
"shardName" : "company_rs0",
"connectionString" : "company_rs0/shard0-db0:27017,shard0-db1:27017",
"serverInfo" : {"host":"shard0-db0","port":27017,"version":"3.4.7","gitVersion":"cf38c1b8a0a8dca4a11737581beafef4fe120bcd"},
"plannerVersion" : 1,
"namespace" : "company_database.collection",
"indexFilterSet" : false,
"parsedQuery" : {"array":{"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}},
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"array" : {
"$elemMatch" : {"$and":[{"number":{"$gte":"10"}},{"number":{"$lt":"20"}}]}
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"10\", {})"]}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"\", \"20\")"]}
}
}
]
},
{
"shardName" : "company_rs1",
"connectionString" : "company_rs1/shard1-db0:27017,shard1-db1:27017",
"serverInfo" : {"host":"shard1-db0","port":27017,"version":"3.4.7","gitVersion":"cf38c1b8a0a8dca4a11737581beafef4fe120bcd"},
"plannerVersion" : 1,
"namespace" : "company_database.collection",
"indexFilterSet" : false,
"parsedQuery" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"multiKeyPaths" : {"array.number":["array"]},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"10\", \"20\")"]}
}
},
"rejectedPlans" : []
},
{
"shardName" : "company_rs2",
"connectionString" : "company_rs2/shard2-db0:27017,shard2-db1:27017",
"serverInfo" : {"host":"shard2-db0","port":27017,"version":"3.4.7","gitVersion":"cf38c1b8a0a8dca4a11737581beafef4fe120bcd"},
"plannerVersion" : 1,
"namespace" : "company_database.collection",
"indexFilterSet" : false,
"parsedQuery" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$gte":"10"}},{"number":{"$lt":"20"}}]}}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"10\", {})"]}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"\", \"20\")"]}
}
}
]
},
{
"shardName" : "company_rs3",
"connectionString" : "company_rs3/shard3-db0:27017,shard3-db1:27017",
"serverInfo" : {"host":"shard3-db0","port":27017,"version":"3.4.7","gitVersion":"cf38c1b8a0a8dca4a11737581beafef4fe120bcd"},
"plannerVersion" : 1,
"namespace" : "company_database.collection",
"indexFilterSet" : false,
"parsedQuery" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$gte":"10"}},{"number":{"$lt":"20"}}]}}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"10\", {})"]}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"array" : {"$elemMatch":{"$and":[{"number":{"$lt":"20"}},{"number":{"$gte":"10"}}]}}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {"array.number":1.0},
"indexName" : "array.number_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {"array.number":["[\"\", \"20\")"]}
}
}
]
}
]
}
},
"ok" : 1.0
}
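In the explain output above, shards 0/2/3 each have a winning plan bounded only from below (["10", {})) and a rejected plan bounded only from above (["", "20")), while shard1 uses a single scan bounded on both sides. The sketch below, in plain JavaScript with a hypothetical intersect helper, shows that shard1's bound is exactly the intersection of the other two. The reading that the planner needs the multiKeyPaths information before it will intersect bounds on a multikey index is an assumption on my part, not something the output states.

```javascript
// A bound is { lo, hi }: lo is inclusive, hi is exclusive.
// null means "no limit on that side" (the {} upper end in the explain output).
function intersect(a, b) {
  const lo = a.lo === null ? b.lo : b.lo === null ? a.lo : a.lo > b.lo ? a.lo : b.lo;
  const hi = a.hi === null ? b.hi : b.hi === null ? a.hi : a.hi < b.hi ? a.hi : b.hi;
  if (lo !== null && hi !== null && lo >= hi) return null; // empty range
  return { lo, hi };
}

const lowerOnly = { lo: "10", hi: null }; // ["10", {})  winning plan, shards 0/2/3
const upperOnly = { lo: "", hi: "20" };   // ["", "20")  rejected plan, shards 0/2/3

console.log(intersect(lowerOnly, upperOnly)); // { lo: '10', hi: '20' }, the shard1 bound
```

Note that the values are compared as strings here, as in the query, so "10" to "20" is a lexicographic range.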

explain shows fast execution time, but running the query never returns

I have a query that seems to never return.
When I run explain on that query, it shows me executionStats.executionTimeMillis of 27ms, and that the initial input-stage is IXSCAN that should return 4 objects only.
I've confirmed that querying for the input-stage query returns only 4 results.
This is my query:
{"$or":[
{"field1.key":{"$in":["name1","name2",/^prefix.*suffix$/]},"field2.key":"foobar"},
{"field1.key":{"$in":["name1","name2",/^prefix.*suffix$/]},"field3.key":"foobar"}
]}
This is the explain({ verbose : "executionStats" }) output (sorry for the long paste):
{
"queryPlanner" : {
"mongosPlannerVersion" : 1,
"winningPlan" : {
"stage" : "SHARD_MERGE",
"shards" : [
{
"shardName" : "...",
"plannerVersion" : 1,
"indexFilterSet" : false,
"parsedQuery" : { ... },
"winningPlan" : {
"stage" : "SUBPLAN",
"inputStage" : {
"stage" : "OR",
"inputStages" : [
{
"stage" : "FETCH",
"filter" : {"field1.key":{"$in":["name1","name2",/^prefix.*suffix$/]}},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : { "field3.key" : 1.0 },
"indexName" : "field3.key_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"field3.key" : [ "[\"foobar\", \"foobar\"]" ]
}
}
},
{
"stage" : "FETCH",
"filter" : {"field1.key":{"$in":["name1","name2",/^prefix.*suffix$/]}},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : { "field2.key" : 1.0 },
"indexName" : "field2.key_1",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"field2.key" : [ "[\"foobar\", \"foobar\"]" ]
}
}
}
]
}
},
"rejectedPlans" : []
},
...
// same plan for the 3 other shards
...
]
}
},
"executionStats" : {
"nReturned" : 0,
"executionTimeMillis" : 27,
"totalKeysExamined" : 4,
"totalDocsExamined" : 4,
"executionStages" : {
"stage" : "SHARD_MERGE",
"nReturned" : 0,
"executionTimeMillis" : 27,
"totalKeysExamined" : 4,
"totalDocsExamined" : 4,
"totalChildMillis" : NumberLong(63),
...
// execution times for each shard
...
},
"allPlansExecution" : []
},
"ok" : 1.0
}
UPDATE
It seems that despite explain mentioning it uses "field2.key" for the first part of the $or and "field3.key" for the second part of the $or, when looking at db.currentOp().inprog it shows:
"planSummary": "IXSCAN { field1.key: 1.0 }, IXSCAN { field3.key: 1.0 }"
So it selected the wrong index for one of the $or parts, which makes the query scan a huge number of documents.
Any idea why explain gets the indexes right, but the query itself doesn't?
How can we hint mongo to use the correct indexes, when using $or?

Sorting with $in not returning all docs

I have the following query.
db.getCollection('logs').find({'uid.$id': {
'$in': [
ObjectId("580e3397812de36b86d68c04"),
ObjectId("580e33a9812de36b86d68c0b"),
ObjectId("580e339a812de36b86d68c09"),
ObjectId("580e339a812de36b86d68c08"),
ObjectId("580e33a9812de36b86d68c0a"),
ObjectId("580e33bd812de36b86d68c11"),
ObjectId("580e33c0812de36b86d68c13")
]}, levelno: { '$gte': 10 }
}).sort({_id: 1})
This should return 1847 documents. However, when executing it, I only get 1000 documents, which is the cursor's batchSize and then the cursor closes (setting its cursorId to 0), as if all documents were returned.
If I take out the sorting, then I get all 1847 documents.
So my question is, why does it silently fail when using sorting with the $in operator?
EDIT
Using explain gives the following output
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "session.logs",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"levelno" : {
"$gte" : 10
}
},
{
"uid.$id" : {
"$in" : [
ObjectId("580e3397812de36b86d68c04"),
ObjectId("580e339a812de36b86d68c08"),
ObjectId("580e339a812de36b86d68c09"),
ObjectId("580e33a9812de36b86d68c0a"),
ObjectId("580e33a9812de36b86d68c0b"),
ObjectId("580e33bd812de36b86d68c11"),
ObjectId("580e33c0812de36b86d68c13")
]
}
}
]
},
"winningPlan" : {
"stage" : "SORT",
"sortPattern" : {
"_id" : 1
},
"inputStage" : {
"stage" : "SORT_KEY_GENERATOR",
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"uid.$id" : 1,
"levelno" : 1,
"_id" : 1
},
"indexName" : "uid.$id_1_levelno_1__id_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"uid.$id" : [
"[ObjectId('580e3397812de36b86d68c04'), ObjectId('580e3397812de36b86d68c04')]",
"[ObjectId('580e339a812de36b86d68c08'), ObjectId('580e339a812de36b86d68c08')]",
"[ObjectId('580e339a812de36b86d68c09'), ObjectId('580e339a812de36b86d68c09')]",
"[ObjectId('580e33a9812de36b86d68c0a'), ObjectId('580e33a9812de36b86d68c0a')]",
"[ObjectId('580e33a9812de36b86d68c0b'), ObjectId('580e33a9812de36b86d68c0b')]",
"[ObjectId('580e33bd812de36b86d68c11'), ObjectId('580e33bd812de36b86d68c11')]",
"[ObjectId('580e33c0812de36b86d68c13'), ObjectId('580e33c0812de36b86d68c13')]"
],
"levelno" : [
"[10.0, inf.0]"
],
"_id" : [
"[MinKey, MaxKey]"
]
}
}
}
}
},
"rejectedPlans" : [
{
"stage" : "SORT",
"sortPattern" : {
"_id" : 1
},
"inputStage" : {
"stage" : "SORT_KEY_GENERATOR",
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"levelno" : 1,
"_id" : 1,
"uid.$id" : 1
},
"indexName" : "levelno_1__id_1_uid.$id_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"levelno" : [
"[10.0, inf.0]"
],
"_id" : [
"[MinKey, MaxKey]"
],
"uid.$id" : [
"[ObjectId('580e3397812de36b86d68c04'), ObjectId('580e3397812de36b86d68c04')]",
"[ObjectId('580e339a812de36b86d68c08'), ObjectId('580e339a812de36b86d68c08')]",
"[ObjectId('580e339a812de36b86d68c09'), ObjectId('580e339a812de36b86d68c09')]",
"[ObjectId('580e33a9812de36b86d68c0a'), ObjectId('580e33a9812de36b86d68c0a')]",
"[ObjectId('580e33a9812de36b86d68c0b'), ObjectId('580e33a9812de36b86d68c0b')]",
"[ObjectId('580e33bd812de36b86d68c11'), ObjectId('580e33bd812de36b86d68c11')]",
"[ObjectId('580e33c0812de36b86d68c13'), ObjectId('580e33c0812de36b86d68c13')]"
]
}
}
}
}
},
{
"stage" : "FETCH",
"filter" : {
"$and" : [
{
"levelno" : {
"$gte" : 10
}
},
{
"uid.$id" : {
"$in" : [
ObjectId("580e3397812de36b86d68c04"),
ObjectId("580e339a812de36b86d68c08"),
ObjectId("580e339a812de36b86d68c09"),
ObjectId("580e33a9812de36b86d68c0a"),
ObjectId("580e33a9812de36b86d68c0b"),
ObjectId("580e33bd812de36b86d68c11"),
ObjectId("580e33c0812de36b86d68c13")
]
}
}
]
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"_id" : 1
},
"indexName" : "_id_",
"isMultiKey" : false,
"isUnique" : true,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"_id" : [
"[MinKey, MaxKey]"
]
}
}
}
]
},
"ok" : 1
}
What's happening is that this sorted query must be performed in-memory as it's not supported by an index, and this limits the results to 32 MB. This behavior is documented here, with a JIRA about addressing this here.
Furthermore, you can't define an index to support this query as you're sorting on a field that isn't part of the query, and neither of these cases applies:
If the sort keys correspond to the index keys or an index prefix,
MongoDB can use the index to sort the query results. A prefix of a
compound index is a subset that consists of one or more keys at the
start of the index key pattern.
...
An index can support sort operations on a non-prefix subset of the
index key pattern. To do so, the query must include equality
conditions on all the prefix keys that precede the sort keys.
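The two quoted rules can be turned into a small check. This is a simplified sketch in plain JavaScript under stated assumptions: indexCanSort is a hypothetical helper, it only models the two rules quoted above (prefix match, or equality on every key preceding the sort keys), and it ignores anything else the planner may do.

```javascript
// indexKeys and sortKeys are arrays of [field, direction] pairs.
// equalityFields is a Set of fields the query pins with an equality condition.
function indexCanSort(indexKeys, equalityFields, sortKeys) {
  // Skip leading index keys that are pinned by equality,
  // stopping as soon as the first sort key is reached.
  let i = 0;
  while (
    i < indexKeys.length &&
    equalityFields.has(indexKeys[i][0]) &&
    indexKeys[i][0] !== sortKeys[0][0]
  ) {
    i++;
  }
  if (i + sortKeys.length > indexKeys.length) return false;

  // The sort keys must now line up with the index keys, either all in the
  // same direction or all reversed (a backward scan of the index).
  let same = true;
  let reversed = true;
  for (let j = 0; j < sortKeys.length; j++) {
    const [field, dir] = indexKeys[i + j];
    const [sortField, sortDir] = sortKeys[j];
    if (field !== sortField) return false;
    if (dir !== sortDir) same = false;
    if (dir !== -sortDir) reversed = false;
  }
  return same || reversed;
}

const index = [["uid.$id", 1], ["levelno", 1], ["_id", 1]];

// The query above uses $in and $gte, so no prefix key has an equality condition.
console.log(indexCanSort(index, new Set(), [["_id", 1]])); // false

// With equality on both preceding keys, the same index could provide the sort.
console.log(indexCanSort(index, new Set(["uid.$id", "levelno"]), [["_id", 1]])); // true
```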
You should be able to work around the limitation by using the aggregation framework which can be instructed to use temporary files for its pipeline stage outputs if required via the allowDiskUse: true option:
db.getCollection('logs').aggregate([
{$match: {'uid.$id': {
'$in': [
ObjectId("580e3397812de36b86d68c04"),
ObjectId("580e33a9812de36b86d68c0b"),
ObjectId("580e339a812de36b86d68c09"),
ObjectId("580e339a812de36b86d68c08"),
ObjectId("580e33a9812de36b86d68c0a"),
ObjectId("580e33bd812de36b86d68c11"),
ObjectId("580e33c0812de36b86d68c13")
]}, levelno: { '$gte': 10 }
}},
{$sort: {_id: 1}}
], { allowDiskUse: true })
You can use the cursor's objsLeftInBatch() method to determine how many objects are left in the current batch and iterate over them.
You can override the cursor's batch size and result limit using cursor.batchSize(size) and cursor.limit(limit).