Mongo index not being used (simple one column query) - mongodb

Explain of find query:
> db.datasources.find({nid: 19882}).explain();
{
"cursor" : "BtreeCursor nid_1",
"nscanned" : 10161684,
"nscannedObjects" : 10161684,
"n" : 10161684,
"millis" : 8988,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"nid" : [
[
19882,
19882
]
]
}
}
Here are the indexes for the collection:
> db.datasources.getIndexes()
[
{
"name" : "_id_",
"ns" : "rocdocs_dev.datasources",
"key" : {
"_id" : 1
}
},
{
"_id" : ObjectId("4edcd725c605da5f200000a2"),
"ns" : "rocdocs_dev.datasources",
"key" : {
"nid" : 1
},
"name" : "nid_1"
},
{
"v" : 1,
"key" : {
"is_indexed" : 1
},
"ns" : "rocdocs_dev.datasources",
"name" : "is_indexed_1"
}
]

This query is using an index, as shown by BtreeCursor in the cursor field. If it weren't, the cursor would be BasicCursor.
That said, the query takes 9 seconds and appears to scan the entire collection: nscanned and n are both 10161684, so every scanned entry was returned as a match.
Did you add this index after inserting those documents? Perhaps it hasn't finished building yet.
I would consider rebuilding the index:
db.datasources.reIndex()
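As a rough illustration of how to read the explain output above, here is a minimal sketch in plain JavaScript. The helper name summarizeExplain and the matchRatio field are made up for this example; only the cursor, nscanned and n fields come from the legacy explain format shown in the question.

```javascript
// Hypothetical helper (not a MongoDB API): summarize a legacy explain() document.
function summarizeExplain(explain) {
  // "BtreeCursor <index name>" means an index was used;
  // "BasicCursor" means a full collection scan.
  const usesIndex = explain.cursor.startsWith("BtreeCursor");
  const indexName = usesIndex ? explain.cursor.slice("BtreeCursor ".length) : null;
  // Share of scanned entries that were actually returned as matches.
  const matchRatio = explain.nscanned === 0 ? 0 : explain.n / explain.nscanned;
  return { usesIndex, indexName, matchRatio };
}

// Values copied from the explain output in the question.
const summary = summarizeExplain({
  cursor: "BtreeCursor nid_1",
  nscanned: 10161684,
  nscannedObjects: 10161684,
  n: 10161684,
  millis: 8988
});
console.log(summary);
```

A matchRatio of 1 with an nscanned in the millions suggests that the time is going into returning a very large result set rather than into a missing index.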

Related

Mongo $near query takes 6s on 1.2 million documents

I inserted about 1.2 million identical documents to test the speed of a geospatial index in MongoDB.
Here is the query:
db.spreads.find({ loc: { '$near': { '$geometry': {type: "Point" , coordinates: [40,40]}, '$maxDistance': 10000000 } } }).explain();
And the result:
{
"cursor" : "S2NearCursor",
"isMultiKey" : false,
"n" : 1568220,
"nscannedObjects" : 12545154,
"nscanned" : 12545154,
"nscannedObjectsAllPlans" : 12545154,
"nscannedAllPlans" : 12545154,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 11413,
"indexBounds" : {
},
"server" : "s1.heychat.io:27017",
"filterSet" : false
}
Indexes:
db.spreads.getIndexes();
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.spreads"
},
{
"v" : 1,
"key" : {
"loc" : "2dsphere"
},
"name" : "loc_2dsphere",
"ns" : "test.spreads",
"2dsphereIndexVersion" : 2
}
]
Why is it so slow?
"n" : 1568220 in the explain output means that the query returned 1.5 million docs. So that would explain why it took so long.
Using a much smaller $maxDistance is probably a better test.
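A minimal sketch of that suggestion follows. buildNearQuery is a hypothetical helper and the 1000 m radius is an arbitrary choice; the loc field and the [40, 40] point come from the question. For GeoJSON points, $maxDistance is given in meters, so the original value of 10000000 covers a radius of 10,000 km.

```javascript
// Hypothetical helper: build a $near filter for a 2dsphere-indexed "loc" field.
function buildNearQuery(longitude, latitude, maxDistanceMeters) {
  return {
    loc: {
      $near: {
        // GeoJSON coordinates are ordered [longitude, latitude].
        $geometry: { type: "Point", coordinates: [longitude, latitude] },
        $maxDistance: maxDistanceMeters
      }
    }
  };
}

const original = buildNearQuery(40, 40, 10000000); // 10,000 km, as in the question
const narrower = buildNearQuery(40, 40, 1000);     // 1 km, an arbitrary smaller radius

// In the mongo shell the filter would be used like this:
//   db.spreads.find(narrower).explain()
console.log(JSON.stringify(narrower));
```

If the test documents really are identical, they all share one location, so a smaller radius only reduces n when that location falls outside it. Test data spread over many locations would probably give a more realistic measurement.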

Why indexOnly attribute is false for this covered query

I have a test db with fields _id, name, age, date
Indexes:
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "blogger.users"
},
{
"v" : 1,
"key" : {
"name" : 1,
"age" : 1
},
"name" : "name_1_age_1",
"ns" : "blogger.users"
},
{
"v" : 1,
"key" : {
"age" : 1,
"name" : 1
},
"name" : "age_1_name_1",
"ns" : "blogger.users"
}
]
When running the following query:
> db.users.find({"name":"user10"},{"_id":0,"date":0}).explain()
I get the following:
{
"cursor" : "BtreeCursor name_1_age_1",
"isMultiKey" : false,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 2,
"nscannedAllPlans" : 2,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"name" : [
[
"user10",
"user10"
]
],
"age" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
"server" : "Johny-PC:27017",
"filterSet" : false
}
Without explain the result is:
{ "name" : "user10", "age" : 68 }
Even though this is a covered query with proper projections, the indexOnly field is still false. I have also tried explicitly providing the index using hint, but nothing changed. In that case the values of nscannedObjectsAllPlans and nscannedAllPlans are 1, as the query does not try other indexes.
For a query to be "indexOnly" (covered), every field it returns must be contained in the index. Even though the index "name_1_age_1" exists, the projection {"_id":0,"date":0} only excludes fields, so the query engine cannot know that name and age are the only fields left without inspecting each document. You have to tell it explicitly that the only fields you want are those in the index:
db.users.find({"name":"user10"},{"_id":0, "name": 1, "age": 1 }).explain()
That will report "indexOnly" : true, because the query engine knows that the selected index contains all of the fields required for the output. There is no need to go back to the collection in case there are other fields to return.
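The rule can be sketched in plain JavaScript as below. isProjectionCovered is a hypothetical helper, not part of MongoDB, and it is a simplification: it looks only at the projection and ignores other conditions, such as the query fields themselves or multikey indexes.

```javascript
// Hypothetical helper: can this projection be answered from the index alone?
function isProjectionCovered(indexKey, projection) {
  const indexFields = Object.keys(indexKey);
  // Fields the projection explicitly includes (value 1 or true).
  const included = Object.entries(projection)
    .filter(([, value]) => value === 1 || value === true)
    .map(([field]) => field);
  // An exclusion-only projection names no fields to return,
  // so the engine has to fetch each document to see what is left.
  if (included.length === 0) return false;
  // _id is returned by default, so it must be excluded or be part of the index.
  const idExcluded = projection._id === 0 || projection._id === false;
  const idOk = idExcluded || indexFields.includes("_id");
  return idOk && included.every((field) => indexFields.includes(field));
}

const nameAgeIndex = { name: 1, age: 1 };
const exclusionOnly = isProjectionCovered(nameAgeIndex, { _id: 0, date: 0 });
const explicitFields = isProjectionCovered(nameAgeIndex, { _id: 0, name: 1, age: 1 });
console.log(exclusionOnly, explicitFields);
```

The first projection is the one from the question and the second is the one from the answer; only the second names the fields to return.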

why is mongodb hitting this index

Given that I have an index in my collection asd:
> db.system.indexes.find().pretty()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "asd.test", "name" : "_id_" },
{
"v" : 1,
"key" : {
"a" : 1,
"b" : 1,
"c" : 1
},
"ns" : "asd.test",
"name" : "a_1_b_1_c_1"
}
As far as I know, in theory the order of the fields in the query is important in order to hit an index...
That is why I'm wondering how and why I'm actually hitting the index with this query:
> db.asd.find({c:{$gt: 5000},a:{$gt:5000}}).explain()
{
"cursor" : "BtreeCursor a_1_b_1_c_1",
"isMultiKey" : false,
"n" : 90183,
"nscannedObjects" : 90183,
"nscanned" : 94885,
"nscannedObjectsAllPlans" : 90288,
"nscannedAllPlans" : 94990,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 272,
"indexBounds" : {
"a" : [
[
5000,
1.7976931348623157e+308
]
],
"b" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
],
"c" : [
[
5000,
1.7976931348623157e+308
]
]
}
}
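The order in which you pass fields in your query does not affect the index selection process. If it did, it would be a very fragile system.
The order of fields in the index definition, on the other hand, is very important. You may be confusing these two cases. Here the query constrains a, the leading field of a_1_b_1_c_1, so the index is usable; b is simply left unbounded, as the indexBounds show.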
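To make the distinction concrete, here is a small sketch in plain JavaScript. leadingFieldsUsed is a hypothetical helper and a strong simplification of real index selection: it only reports which contiguous leading fields of an index definition are constrained by a query, regardless of the order in which the query lists them.

```javascript
// Hypothetical helper: which leading index fields does the query constrain?
function leadingFieldsUsed(indexKey, query) {
  const queried = new Set(Object.keys(query));
  const prefix = [];
  // Walk the index fields in definition order and stop at the first gap.
  for (const field of Object.keys(indexKey)) {
    if (!queried.has(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const abcIndex = { a: 1, b: 1, c: 1 };

// Same fields, different order in the query: the result is identical.
const cThenA = leadingFieldsUsed(abcIndex, { c: { $gt: 5000 }, a: { $gt: 5000 } });
const aThenC = leadingFieldsUsed(abcIndex, { a: { $gt: 5000 }, c: { $gt: 5000 } });

// A query on c alone does not touch the leading field of the index.
const cOnly = leadingFieldsUsed(abcIndex, { c: { $gt: 5000 } });

console.log(cThenA, aThenC, cOnly);
```

In the explain output above, the bound on c is applied during the index scan as well, but it is the constraint on the leading field a that makes the index usable in the first place.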

force use of index on complex MongoDB query?

I have a large collection of "messages" with 'to', 'from', 'type', and 'visible_to' fields that I want to query against with a fairly complex query that pulls only the messages to/from a particular user of a particular set of types that are visible to that user. Here is an actual example:
{
"$and": [
{
"$and": [
{
"$or": [
{
"to": "52f65f592f1d88ebcb00004f"
},
{
"from": "52f65f592f1d88ebcb00004f"
}
]
},
{
"$or": [
{
"type": "command"
},
{
"type": "image"
}
]
}
]
},
{
"$or": [
{
"public": true
},
{
"visible_to": "52f65f592f1d88ebcb00004f"
}
]
}
]
}
With indexes:
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"expires" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "expires_1",
"background" : true,
"safe" : null
},
{
"v" : 1,
"key" : {
"from" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "from_1",
"background" : true,
"safe" : null
},
{
"v" : 1,
"key" : {
"type" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "type_1",
"background" : true,
"safe" : null
},
{
"v" : 1,
"key" : {
"ts" : 1,
"type" : -1
},
"ns" : "n2-mongodb.messages",
"name" : "ts_1_type_-1",
"background" : true,
"safe" : null
},
{
"v" : 1,
"key" : {
"to" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "to_1",
"background" : true,
"safe" : null
},
{
"v" : 1,
"key" : {
"visible_to" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "visible_to_1",
"background" : true,
"safe" : null
},
{
"v" : 1,
"key" : {
"public" : 1,
"visible_to" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "public_1_visible_to_1"
},
{
"v" : 1,
"key" : {
"to" : 1,
"from" : 1
},
"ns" : "n2-mongodb.messages",
"name" : "to_1_from_1"
}
]
And here is the explain(true) output from our MongoDB 2.2.2 instance, which looks like a full scan:
{
"cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 35702,
"nscanned" : 35702,
"nscannedObjectsAllPlans" : 35702,
"nscannedAllPlans" : 35702,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 85,
"indexBounds" : {
},
"allPlans" : [
{
"cursor" : "BasicCursor",
"n" : 0,
"nscannedObjects" : 35702,
"nscanned" : 35702,
"indexBounds" : {
}
}
],
"server" : "XXXXXXXX"
}
Looking at the explain output, MongoDB is not using any indexes for this - is there a way to get it to use at least the compound index {to: 1, from: 1} to dramatically narrow the search space? Or is there a better way to optimize this query? Or is MongoDB wholly unsuited for a query like this?
To force the MongoDB query optimizer to adopt a specific approach, you can use the $hint operator.
From the docs:
"The $hint operator forces the query optimizer to use a specific index to fulfill the query. Specify the index either by the index name or by document."
The query optimizer in MongoDB 2.6 will include support for applying indexes to complex queries.
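A minimal sketch of applying a hint follows, using the cursor's hint() method, which is the shell counterpart of $hint. findWithHint is a hypothetical wrapper, and the query is reduced to the to/from clause for brevity. Whether the hinted index actually narrows the scan for an $or query on this server version is not guaranteed and should be checked with explain().

```javascript
// Hypothetical wrapper: `collection` is assumed to expose find(), and the
// returned cursor is assumed to expose hint(), as in the mongo shell.
function findWithHint(collection, query, indexSpec) {
  return collection.find(query).hint(indexSpec);
}

// The to/from clause from the question, without the type and visibility clauses.
const userId = "52f65f592f1d88ebcb00004f";
const toOrFrom = { $or: [{ to: userId }, { from: userId }] };

// The index can be named by its key document or by its name.
const hintByDocument = { to: 1, from: 1 };
const hintByName = "to_1_from_1";

// In the mongo shell this would be used as:
//   findWithHint(db.messages, toOrFrom, hintByDocument).explain()
//   findWithHint(db.messages, toOrFrom, hintByName).explain()
```

Comparing nscanned with and without the hint shows whether forcing the index is actually an improvement over the plan the optimizer picked.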

Nested queries Date range

I have a project where I embed date ranges in a document.
Something like the following:
{ "availabilities" : [
{ "start_date" : ISODate("2012-06-28T00:00:00Z"), "end_date" : ISODate("2012-10-03T00:00:00Z") },
{ "start_date" : ISODate("2012-10-08T00:00:00Z"), "end_date" : ISODate("2012-10-28T00:00:00Z") }]
}
What I need to do is find all the documents that are available during a certain period.
I use a query like this one:
db.faces.find({"availabilities" : {"$elemMatch" : {"$and" : [{"start_date" : {"$lte" : ISODate('2012-10-01 00:00:00 UTC')}}, {"end_date" : {"$gte": ISODate('2012-10-07 00:00:00 UTC')}}]}}})
But it won't use my index:
{
"v" : 1,
"key" : {
"availabilities.start_date" : 1,
"availabilities.end_date" : 1
},
"ns" : "faces_development.faces",
"name" : "availabilities.start_date_1_availabilities.end_date_1"
}
When I do an explain on the query, the output for the indexBounds is quite strange and I don't understand it.
{
"cursor" : "BtreeCursor availabilities.start_date_1_availabilities.end_date_1",
"isMultiKey" : true,
"n" : 71725,
"nscannedObjects" : 143019,
"nscanned" : 143019,
"nscannedObjectsAllPlans" : 143221,
"nscannedAllPlans" : 143221,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 2,
"nChunkSkips" : 0,
"millis" : 1608,
"indexBounds" : {
"availabilities.start_date" : [
[
true,
ISODate("2012-10-01T00:00:00Z")
]
],
"availabilities.end_date" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
"server" : "foobar.local:27017"
}
Current version of MongoDB: MongoDB shell version: 2.2.0
What must I do to make the query use the index?
I tried to find related questions and bugs for MongoDB, without much success.
This will scan less of the index in 2.3: https://jira.mongodb.org/browse/SERVER-3104
Meanwhile, I suggest moving each availability into its own document, instead of having many in one array, for more efficient querying.
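One possible shape for that restructuring is sketched below in plain JavaScript. splitAvailabilities and covers are hypothetical helpers, and the face_id field, the "face-1" id and the availabilities collection name in the comment are assumptions made for illustration.

```javascript
// Hypothetical helper: turn one document with an embedded array into
// one flat document per availability.
function splitAvailabilities(face) {
  return face.availabilities.map((availability) => ({
    face_id: face._id, // assumed back-reference to the original document
    start_date: availability.start_date,
    end_date: availability.end_date
  }));
}

// The example document from the question, with a placeholder _id.
const face = {
  _id: "face-1",
  availabilities: [
    { start_date: new Date("2012-06-28T00:00:00Z"), end_date: new Date("2012-10-03T00:00:00Z") },
    { start_date: new Date("2012-10-08T00:00:00Z"), end_date: new Date("2012-10-28T00:00:00Z") }
  ]
};
const availabilityDocs = splitAvailabilities(face);

// Same condition as the $lte/$gte query in the question, applied to a flat document.
function covers(doc, periodStart, periodEnd) {
  return doc.start_date <= periodStart && doc.end_date >= periodEnd;
}

// Against a separate collection (name assumed), the query needs no $elemMatch:
//   db.availabilities.find({ start_date: { $lte: periodStart }, end_date: { $gte: periodEnd } })
const matches = availabilityDocs.filter((doc) =>
  covers(doc, new Date("2012-10-01T00:00:00Z"), new Date("2012-10-07T00:00:00Z")));
console.log(availabilityDocs.length, matches.length);
```

With one range per document, both conditions apply to the same start_date and end_date pair, so a compound index on { start_date: 1, end_date: 1 } is no longer multikey.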