Why the indexonly attribute value is false from mongodb explain query output - mongodb

I have a simple collection named customers as shown below
db.customers.find().pretty()
{
"_id" : ObjectId("524eb09ca71b72672e65ebb6"),
"name" : "kiran",
"occupation" : "SelfEmployeed",
"country" : "IND"
}
{
"_id" : ObjectId("524eb0a4a71b72672e65ebb7"),
"name" : "Mark",
"occupation" : "Architect",
"country" : "US"
}
{
"_id" : ObjectId("524eb0aba71b72672e65ebb8"),
"name" : "beast",
"occupation" : "housewife",
"country" : "UK"
}
{
"_id" : ObjectId("524eb0b2a71b72672e65ebb9"),
"name" : "Philip",
"occupation" : "Engineer",
"country" : "SWE"
}
I have created indexes on the name and country fields as shown below
db.customers.ensureIndex({name : 1}, {"unique" : false})
db.customers.ensureIndex({country : 1}, {"unique" : false})
The indexes have been created
db.customers.getIndexKeys()
[ { "_id" : 1 }, { "name" : 1 }, { "country" : 1 } ]
This was the result of my query explain
db.customers.find({name : "Mark"}).explain()
{
"cursor" : "BtreeCursor name_1",
"isMultiKey" : false,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 1,
"nscannedAllPlans" : 1,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"name" : [
[
"Mark",
"Mark"
]
]
},
"server" : "************"
}
Why the indexonly attribute value is false ??
I have seen similar question which explains it is due to
indexonly is false as It wont use index only because you'll be retrieving other fields via that query that aren't indexed.
Please let me know what does other fields mean here ??

Some searching would have actually have got you a question I answered some time ago on this very subject.
The reason is this:
db.customers.find({name : "Mark"}).explain()
There is no projection, how can MongoDB know that the index covers the return without looking at the actual documents in this case?
It is similar to
SELECT * from customers
And
SELECT d from customers
How can you know * is d without looking?

Related

MongoDB object field and range query index

I have the following structure in the database:
{
"_id" : {
"user" : 14197,
"date" : ISODate("2014-10-24T00:00:00.000Z")
},
...
}
I have a performance problem when I try to select data by user & date-range. Monogo doesn't use index & runs full-scan over collection.
db.timeuse.daily.find({ "_id.user": 289006, "_id.date" : {$gt: ISODate("2014-10-23T00:00:00Z"), $lte: ISODate("2014-10-30T00:00:00Z")}}).explain()
{
"cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 6,
"nscannedObjects" : 66967,
"nscanned" : 66967,
"nscannedObjectsAllPlans" : 66967,
"nscannedAllPlans" : 66967,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 523,
"nChunkSkips" : 0,
"millis" : 1392,
"server" : "mongo-shard0003:27018",
"filterSet" : false,
"stats" : {
"type" : "COLLSCAN",
"works" : 66969,
"yields" : 523,
"unyields" : 523,
"invalidates" : 16,
"advanced" : 6,
"needTime" : 66962,
"needFetch" : 0,
"isEOF" : 1,
"docsTested" : 66967,
"children" : [ ]
},
"millis" : 1392
}
So far I found only one way - use $in.
db.timeuse.daily.find({"_id": { $in: [
{"user": 289006, "date": ISODate("2014-10-23T00:00:00Z")},
{"user": 289006, "date": ISODate("2014-10-24T00:00:00Z")}
]}}).explain()
{
"cursor" : "BtreeCursor _id_",
"isMultiKey" : false,
"n" : 2,
"nscannedObjects" : 2,
"nscanned" : 2,
"nscannedObjectsAllPlans" : 2,
"nscannedAllPlans" : 2,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"_id" : [
[
{
"user" : 289006,
"date" : ISODate("2014-10-23T00:00:00Z")
},
{
"user" : 289006,
"date" : ISODate("2014-10-23T00:00:00Z")
}
],
[
{
"user" : 289006,
"date" : ISODate("2014-10-24T00:00:00Z")
},
{
"user" : 289006,
"date" : ISODate("2014-10-24T00:00:00Z")
}
]
]
},
If there's a more elegant way to run this kind of query?
TL;DR: Don't put your data in the _id field and use a compound index: db.timeuse.daily.ensureIndex( { "user" : 1, "date": 1 } ).
Explanation:
You're abusing the _id key convention, or more precisely the fact that MongoDB can index entire objects. What you want to achieve requires index intersection or a compound index, that is, either two separate indexes that can be combined (that feature is called index intersection and by now, it should be available in MongoDB, but it has limitations) or a special index for the set of keys which in MongoDB is called a compound index.
The _id field is indexed by default, but it's indexed as a whole, i.e. the _id index with only support equality queries on the entire object, rather than parts of the object. That also explains why the $in query works.
In general, that data structure with the default index will behave oddly. Consider this:
> db.sort.insert({"_id" : {"name" : "foo", value : 1} });
> db.sort.insert({"_id" : {"name" : "foo", value : 1, bla : "foo"} });
> db.sort.find();
{ "_id" : { "name" : "foo", "value" : 4343 } }
{ "_id" : { "name" : "foo", "value" : 4343, "bla" : "fooffo" } }
> db.sort.find({"_id" : {"name" : "foo", value : 4343} });
{ "_id" : { "name" : "foo", "value" : 4343 } }
// no second result here...
Imagine MongoDB basically hashed the entire object and was simply looking for the object hash - such an index can't support range queries based on some part of the hash.

Why indexOnly attribute is false for this covered query

I have a test db with fields _id, name, age, date
Indexes:
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "blogger.users"
},
{
"v" : 1,
"key" : {
"name" : 1,
"age" : 1
},
"name" : "name_1_age_1",
"ns" : "blogger.users"
},
{
"v" : 1,
"key" : {
"age" : 1,
"name" : 1
},
"name" : "age_1_name_1",
"ns" : "blogger.users"
}
]
When running the following query:
> db.users.find({"name":"user10"},{"_id":0,"date":0})
.explain()
I get following:
{
"cursor" : "BtreeCursor name_1_age_1",
"isMultiKey" : false,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 2,
"nscannedAllPlans" : 2,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"name" : [
[
"user10",
"user10"
]
],
"age" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
"server" : "Johny-PC:27017",
"filterSet" : false
}
Without explain the result is:
{ "name" : "user10", "age" : 68 }
Even though this is a covered query with proper projections, the indexOnly field is still false. I have also tried explicitly providing index using hint, but no change. In that case values of nscannedObjectsAllPlans and nscannedAllPlans are 1 as the query doesnt try other indexes.
For a query to be "indexOnly" or "covered" the only fields returned must be contained in the index. So even though you have an index for "name_1_age_1", the query engine still expects to be "told" that the only fields you want are those in the index. It does not know this about the document until you inspect it:
db.users.find({"name":"user10"},{"_id":0, "name": 1, "age": 1 }).explain()
That will return "indexOnly" as the query engine knows that the selected index contains all of the fields that are required for output. As such there is no need to go back through the collection in case there are other fields to return.

MongoDB index not covering trivial query

I am working on a MongoDB application, and am having trouble with covered queries. After reworking many of my queries to perform better when paginating I found that my previously covered queries were no longer being covered by the index. I tried to distill the working set down as far as possible to isolate the issue but I'm still confused.
First, on a fresh (empty) collection, I inserted the following documents:
devdb> db.test.find()
{ "_id" : ObjectId("53157aa0dd2cab043ab92c14"), "metadata" : { "created_by" : "bcheng" } }
{ "_id" : ObjectId("53157aa6dd2cab043ab92c15"), "metadata" : { "created_by" : "albert" } }
{ "_id" : ObjectId("53157aaadd2cab043ab92c16"), "metadata" : { "created_by" : "zzzzzz" } }
{ "_id" : ObjectId("53157aaedd2cab043ab92c17"), "metadata" : { "created_by" : "thomas" } }
{ "_id" : ObjectId("53157ab9dd2cab043ab92c18"), "metadata" : { "created_by" : "bbbbbb" } }
Then, I created an index for the 'metadata.created_by' field:
devdb> db.test.getIndices()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "devdb.test",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"metadata.created_by" : 1
},
"ns" : "devdb.test",
"name" : "metadata.created_by_1"
}
]
Now, I tried to lookup a document by the field:
devdb> db.test.find({'metadata.created_by':'bcheng'},{'_id':0,'metadata.created_by':1}).sort({'metadata.created_by':1}).explain()
{
"cursor" : "BtreeCursor metadata.created_by_1",
"isMultiKey" : false,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 1,
"nscannedAllPlans" : 1,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"metadata.created_by" : [
[
"bcheng",
"bcheng"
]
]
},
"server" : "localhost:27017"
}
The correct index is being used and no extraneous documents are being scanned. Regardless of the presence of .hint(), limit(), or sort(), indexOnly remains false.
Digging through the documentation, I've seen that covered indices will fail to cover queries on array elements, but that isn't the case here (and isMultiKey shows false).
What am I missing? Are there other reasons for this behavior (eg. insuffient RAM, disk space, etc.)? And if so, how can I best diagnose these issues in the future?
It is currently not supported. See this Jira issue.

Nested queries Date range

I have a project where I embeds date ranges in a document.
Something like the following:
{ "availabilities" : [
{ "start_date" : ISODate("2012-06-28T00:00:00Z"), "end_date" : ISODate("2012-10-03T00:00:00Z") },
{ "start_date" : ISODate("2012-10-08T00:00:00Z"), "end_date" : ISODate("2012-10-28T00:00:00Z") }]
}
What I need to do is find all the documents that are available during a certain period
I use a query like this one:
db.faces.find({"availabilities" : {"$elemMatch" : {"$and" : [{"start_date" : {"$lte" : ISODate('2012-10-01 00:00:00 UTC')}}, {"end_date" : {"$gte": ISODate('2012-10-07 00:00:00 UTC')}}]}}})
But it won't use my indexes:
{
"v" : 1,
"key" : {
"availabilities.start_date" : 1,
"availabilities.end_date" : 1
},
"ns" : "faces_development.faces",
"name" : "availabilities.start_date_1_availabilities.end_date_1"
}
When I do an explain on the query, the output for the indexBounds is quite strange and I don't understand it.
{
"cursor" : "BtreeCursor availabilities.start_date_1_availabilities.end_date_1",
"isMultiKey" : true,
"n" : 71725,
"nscannedObjects" : 143019,
"nscanned" : 143019,
"nscannedObjectsAllPlans" : 143221,
"nscannedAllPlans" : 143221,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 2,
"nChunkSkips" : 0,
"millis" : 1608,
"indexBounds" : {
"availabilities.start_date" : [
[
true,
ISODate("2012-10-01T00:00:00Z")
]
],
"availabilities.end_date" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
"server" : "foobar.local:27017"
}
Current version of mongoDB: MongoDB shell version: 2.2.0
How must I do to use indexes?
Trying to find related questions and bugs on mongodb without great success.
This will scan less of the index in 2.3: https://jira.mongodb.org/browse/SERVER-3104
Meanwhile, I suggest moving each availability into its own document, instead of having many in one array, for more efficient querying.

Mongo index not being used (simple one column query)

Explain of find query:
> db.datasources.find({nid: 19882}).explain();
{
"cursor" : "BtreeCursor nid_1",
"nscanned" : 10161684,
"nscannedObjects" : 10161684,
"n" : 10161684,
"millis" : 8988,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"nid" : [
[
19882,
19882
]
]
}
}
Here are the indexes for the collection:
> db.datasources.getIndexes()
[
{
"name" : "_id_",
"ns" : "rocdocs_dev.datasources",
"key" : {
"_id" : 1
}
},
{
"_id" : ObjectId("4edcd725c605da5f200000a2"),
"ns" : "rocdocs_dev.datasources",
"key" : {
"nid" : 1
},
"name" : "nid_1"
},
{
"v" : 1,
"key" : {
"is_indexed" : 1
},
"ns" : "rocdocs_dev.datasources",
"name" : "is_indexed_1"
}
]
This is using an index as noted by BtreeCursor If it werent, it would say BasicCursor
Though I do see that the query takes 9 seconds and scans what appears to be the entire collection.
Did you add this index after inserting those documents? Perhaps its not done building yet?
I would consider rebuilding the index
db.datasources.reIndex()