Query without projection is not covered - mongodb

See the shell example below (assumes db.test does not exist):
db.test.ensureIndex({info: 1, _id: 1})
db.test.insert({info: "info1"})
db.test.insert({info: "info2"})
db.test.insert({info: "info3"})
db.test.find({info: "info1"}).explain().indexOnly //is false
db.test.find({info: "info1"}, {_id: 1, info: 1}).explain().indexOnly //is true
The first explain has indexOnly: false whereas the second has indexOnly: true, although the two queries return exactly the same result here.
Why isn't db.test.find({info: "info1"}) a covered query?

I have been thinking about and testing this more, and it does make sense now. If you supply no projection, MongoDB has no way of "knowing" whether the index holds everything the query must return. How could it know that the index covers every field of the matching documents without looking at the documents?
It is the same as select * versus select d,e in SQL. How can you know that * is the same as d,e without actually looking?
If you supply a projection, MongoDB can "know" that the index alone will give you your full result set; without a projection it cannot.
So, after some thinking, I do not think this is a bug; it is just a "quirk".
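The reasoning above can be sketched in plain JavaScript. canBeCovered is a hypothetical helper, not a MongoDB API, and it is deliberately simplified: it assumes the query fields are already in the index, that the index is not multikey, and that the projection is an inclusion projection.

```javascript
// Hypothetical helper (not part of MongoDB): can the returned fields be
// served from the index keys alone?
function canBeCovered(indexKeys, projection) {
  // No projection means "return the whole document", so a fetch is needed.
  if (!projection) return false;

  // Fields explicitly included with 1.
  const included = Object.keys(projection).filter((f) => projection[f] === 1);
  // Nothing included (empty or exclusion-only projection): other fields of
  // the document are still returned, so the index cannot be enough.
  if (included.length === 0) return false;

  const wanted = new Set(included);
  // _id is returned by default unless it is hidden with _id: 0.
  if (projection._id !== 0) wanted.add("_id");

  // Covered only if every returned field is an index key.
  return [...wanted].every((f) => f in indexKeys);
}

const index = { info: 1, _id: 1 };
console.log(canBeCovered(index, undefined));           // no projection
console.log(canBeCovered(index, { _id: 1, info: 1 })); // projection within the index
console.log(canBeCovered({ info: 1 }, { info: 1 }));   // _id returned but not indexed
```

Under these assumptions the first and third calls report false and the second reports true, which mirrors the two explain() results in the question.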

Related

Mongodb searching on array / indexing

I'm using the airbnb sample set and it has a field that looks like:
"amenities": ["TV", "Cable TV", "Wifi"....
So I'm trying to do a case-INsensitive, wildcard search (on one or more values passed in).
Only thing I've found that works is:
{ amenities: { $in: [ /wi/ ] }}
Is that the best way?
So I ran it in Compass against the dataset as imported (5600 docs). Explain says it took ~20ms on my machine and warned that there was no index. I then created an index on the amenities field and the same search jumped up to ~100ms. I just created the index through the Compass UI, so I'm not sure why it's taking 5x as long with an index. Or is there a better way to do this?
The way to run that query is:
{ amenities: /wi/i }
//better but not always useful
{ amenities: /wi/i }, { amenities:1, _id:0 }
A regex match already traverses the array elements, and to make it case insensitive the i flag must be set in the regex options.
With a multikey index (an index on an array field) the second query cannot be a covered query. Otherwise, it would be blazing fast.
I've tested a similar search with and without an index, though. Execution time dropped 10x (from 1500ms to 150ms, in a huge collection), measured with Mongo Hacker.
As you report, executionTimeMillis is not that different, but it is still smaller.
The reason you don't see a huge decrease in time is that the index stores each array entry separately. When it finds a match, it goes back to the collection to fetch the whole array field instead of answering from the index.
Probably indexes aren't very useful for arrays.
When querying with an unanchored regex, the query executor will have to scan every index key to see if there is a match.
You might find a collated index helpful.
Create an index with the appropriate collation (strengths 1 and 2 are case-insensitive), like:
db.collection.createIndex({amenities:1},{collation:{locale:"en",strength:1}})
Then query using the same collation:
db.collection.find({amenities:"wifi"}).collation({locale:"en",strength:1})
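The search will be case insensitive, and it can use the index efficiently. Note that this matches whole array values such as "Wifi" or "WIFI", not substrings, so it does not replace the wildcard part of the regex.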
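To make the difference between the two access patterns concrete, here is a toy model in plain JavaScript. It is only a sketch: a sorted array stands in for the index, lowercasing stands in for a case-insensitive collation, and regexScan and collatedLookup are hypothetical names, not MongoDB functions.

```javascript
// Illustrative index keys: one key per array element, as in a multikey index.
const amenities = ["TV", "Cable TV", "Wifi", "Kitchen", "Heating",
                   "Washer", "Dryer", "Essentials"];

// Unanchored regex: every key has to be tested.
function regexScan(keys, regex) {
  let examined = 0;
  const matches = [];
  for (const key of keys) {
    examined += 1;
    if (regex.test(key)) matches.push(key);
  }
  return { matches, examined };
}

// Equality under a case-insensitive ordering: keys are stored normalized
// and sorted, so a binary search can find the value.
function collatedLookup(keys, value) {
  const sorted = keys.map((k) => k.toLowerCase()).sort();
  const target = value.toLowerCase();
  let lo = 0;
  let hi = sorted.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined += 1;
    if (sorted[mid] === target) return { found: true, examined };
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, examined };
}

console.log(regexScan(amenities, /wi/i));       // examines all 8 keys
console.log(collatedLookup(amenities, "WIFI")); // examines only a few keys
```

The regex version touches every key regardless of how many match, while the equality lookup narrows the search at each step. The trade-off is that the lookup only finds whole values.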

MongoDB: Indexes, Sorting

After having read the official documentation on indexes, sorting, and index intersection, I'm a little confused about how everything works together.
I'm having trouble making my query use the indexes I've created. I work with MongoDB 3.0.3, on a collection of ~4 million documents.
To simplify, let's say my document is composed of 6 fields:
{
a:<text>,
b:<boolean>,
c:<text>,
d:<boolean>,
e:<date>,
f:<date>
}
The query I want to achieve is the following :
db.mycoll.find({ a:"OK", b:true, c:"ProviderA", d:true, e:{ $gte:ISODate("2016-10-28T12:00:01Z"),$lt:ISODate("2016-10-28T12:00:02") } }).sort({f:1});
So, intuitively, I created two indexes:
db.mycoll.createIndex({a: 1, b: 1, c: 1, d:1, e:1 }, {background: true,name: "test1"})
db.mycoll.createIndex({f:1}, {background: true,name: "test2"})
But explain() tells me that the first index is not used at all.
I know there is some kind of limitation when ranges are in play in the filter (on the e field), but I can't find my way around it.
Also, instead of a single-field index on f, I tried a compound index on {e:1,f:1}, but it didn't change anything.
So what have I misunderstood?
Thanks for your support.
Update: I also found the following rule of thumb, written for MongoDB 2.6:
A good rule of thumb for queries with sort is to order the indexed fields in this order:
First, the field(s) on which you will query for exact values.
Second, the field(s) on which you will sort.
Finally, field(s) on which you will query for a range of values (e.g., $gt, $lt, $in)
An example of using this rule of thumb is in the section on “Sorting the results of a complex query on a range of values” below, including a link to further reading.
Does this also apply to 3.x versions?
Update 2: following the rule above, I created the following index:
db.mycoll.createIndex({a: 1, b: 1, c: 1, d:1 , f:1, e:1}, {background: true,name: "test1"})
And for the same query :
db.mycoll.find({ a:"OK", b:true, c:"ProviderA", d:true, e:{ $gte:ISODate("2016-10-28T12:00:01Z"),$lt:ISODate("2016-10-28T12:00:02") } }).sort({f:1});
the index is indeed used. However, too many keys seem to be scanned, so I may need to find a better order for the fields in the query/index.
Mongo sometimes acts a bit strangely when it comes to index selection.
Mongo decides automatically which index to use. In my experience, the smaller an index is, the more likely it is to be used (especially single-field indexes). Maybe this happens because it is more often already loaded in RAM? To choose an index, Mongo runs the candidate plans on a trial basis and keeps the winner; however, the result is sometimes unexpected.
Therefore, if you know which index to use, you can force a query to use it with hint(). You should try that.
The two indexes used by the query and by the sort do not overlap, so MongoDB cannot use them for index intersection:
Index intersection does not apply when the sort() operation requires an index completely separate from the query predicate.
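The rule of thumb quoted in the question (equality fields first, then sort fields, then range fields) can be expressed as a small helper. This is a sketch in plain JavaScript; buildIndexSpec is a hypothetical name, and the list of operators treated as ranges is an assumption taken from the quoted text.

```javascript
// Hypothetical helper: derive a compound index key order from a filter
// and a sort, following the equality / sort / range rule of thumb.
function buildIndexSpec(filter, sort) {
  const rangeOps = ["$gt", "$gte", "$lt", "$lte", "$in"];
  const isRange = (cond) =>
    cond !== null &&
    typeof cond === "object" &&
    !(cond instanceof Date) &&
    Object.keys(cond).some((op) => rangeOps.includes(op));

  const equality = [];
  const range = [];
  for (const [field, cond] of Object.entries(filter)) {
    (isRange(cond) ? range : equality).push(field);
  }

  const spec = {};
  for (const field of equality) spec[field] = 1;                  // 1. exact matches
  for (const [field, dir] of Object.entries(sort)) spec[field] = dir; // 2. sort keys
  for (const field of range) if (!(field in spec)) spec[field] = 1;   // 3. ranges
  return spec;
}

const spec = buildIndexSpec(
  {
    a: "OK", b: true, c: "ProviderA", d: true,
    e: { $gte: new Date("2016-10-28T12:00:01Z"), $lt: new Date("2016-10-28T12:00:02Z") },
  },
  { f: 1 }
);
console.log(Object.keys(spec).join(",")); // a,b,c,d,f,e
```

For the query in the question this yields the same key order as the index created in Update 2. It says nothing about how many keys get scanned, which still depends on how selective the range on e is.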

what is the meaning of first {} curly brace in mongodb projection query

I want to know why the first pair of curly braces {} after find is needed in this query.
db.mycol.find({},{"title":1,_id:0})
It is an empty query filter: it places no restriction on which documents match. So
db.mycol.find({},{"title":1,_id:0})
would basically translate to
Show me the title, but not the _id (as you would do by default) for all documents of the mycol collection in the current database.
Let's say you want all the titles written by Alan Turing. So you could modify the query part like this:
db.mycol.find({"author":"Alan Turing"},{"title":1,_id:0})
In general, MongoDB's find operation can be described like this
db.collection.find(query,projection)
For more detailed information, you might want to read find's documentation.
The first pair of curly braces plays the role of the WHERE condition in MySQL.
Check out this link: SQL to MongoDB Mapping Chart
MySQL
SELECT user_id, status FROM users WHERE status = "A"
MongoDB
db.users.find(
{ status: "A" },
{ user_id: 1, status: 1, _id: 0 }
)
The second argument is called the projection and tells MongoDB which fields to show. For example, here you show only title. It is something like select title from mycol; if you do not specify a projection, everything is returned, which is close to select * from mycol;
_id is always shown unless you hide it explicitly with _id: 0. Also, you cannot mix 0 and 1 in one projection (except for _id). But once more: read the FAQ.
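The two arguments can be mimicked with a few lines of plain JavaScript over an in-memory array. This is only a sketch: the find function below is a hypothetical stand-in that supports exact-match filters and inclusion projections only, and the sample documents are made up for illustration.

```javascript
// Illustrative data standing in for the mycol collection.
const mycol = [
  { _id: 1, title: "On Computable Numbers", author: "Alan Turing" },
  { _id: 2, title: "A Mathematical Theory of Communication", author: "Claude Shannon" },
];

// Hypothetical stand-in for db.collection.find(query, projection).
function find(docs, query = {}, projection = null) {
  // First argument: which documents match ({} matches everything).
  const matched = docs.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
  // No projection: return whole documents, like select *.
  if (!projection) return matched.map((doc) => ({ ...doc }));

  // Second argument: which fields to keep for each match.
  return matched.map((doc) => {
    const out = {};
    // _id is included unless it is explicitly hidden with _id: 0.
    if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
    for (const [field, flag] of Object.entries(projection)) {
      if (flag === 1 && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

console.log(find(mycol, {}, { title: 1, _id: 0 }));
console.log(find(mycol, { author: "Alan Turing" }, { title: 1, _id: 0 }));
console.log(find(mycol, { author: "Alan Turing" }, { title: 1 }));
```

The first call returns a title for every document, the second only Turing's title, and the third adds _id back because it was not hidden.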
You should refer to this link for a better explanation:
http://docs.mongodb.org/manual/reference/operator/projection/positional/
First of all, a projection does not return the first result it finds; it tells Mongo which fields to return.
.findOne(query) will limit the result to one document, and find(query).limit(1) will do the same.
You say you are trying to "get all" your data.
A standard find query will get you started:
find({mongo:query},{mongo:projection})
but in the mongo shell the results come back as a cursor.

mongodb: why indexOnly=false when collection is empty

Let's say I have an empty db without any collections. Then I run db.qqq.ensureIndex({a:1}).
In the output of db.qqq.find().explain() I see BasicCursor and "indexOnly" : false. That seems OK.
db.qqq.find({a:"somevalue"}).explain() outputs BtreeCursor a_1, but it also reports "indexOnly" : false. Why does this happen?
Why isn't the given index enough for MongoDB to fulfill my query?
UPD: OK, so I need to use a projection, since my index does not contain all fields. But what I don't understand is this: if Mongo can see from the index that no documents match the query, why should it scan the actual documents?
You need to add a projection to that query; index only means it gets ALL data from the index. MongoDB cannot use an index-only cursor if you want to get the full document back. For example:
db.qqq.find({a:"somevalue"},{a:1,_id:0}).explain()
Should work.
MongoDB doesn't know that there are no documents until it searches for them, so it will have to at least check in the index if it can. A "BasicCursor" with "n=0" is not really a bad thing of course as no actual documents are read (or index elements, as there are none).
Also, if you want to use a covered index, you need to use a projection so that only fields are returned that are actually part of the index. You do that with:
db.qqq.find({a:"somevalue"},{a:1,_id:0}).explain()

How to do an efficient query to replace $exists in MongoDB

I have a MongoDB collection holding a large amount of varied data (millions of documents).
The documents have a structure like {k: {a:1,b:2,c:{},...}}, and I don't know exactly what is in k.
Now I want to count the documents in this collection whose k is not empty, using {k:{$exists:true}}, but that turns out to be very slow...
Then I added an index on k and tried querying with {k:{$gt:{}}}, but that does not return the correct results.
So, how to do this counting on the collection now?
Note that I don't know the data structure of k.
If you are using a version before 2.0, $exists is not able to use an index.
See this answer: https://stackoverflow.com/a/7503114/131809
So, try upgrading your version of MongoDB
From the docs:
Before v2.0, $exists is not able to use an index. Indexes on other fields are still used.
$exists is not very efficient even with an index, and esp. with {$exists:true} since it will effectively have to scan all indexed values.
The second part of that is perhaps the important bit.
It sounds like sparse index may be the key here...
db.collection.count({k:{$ne:null}})
By the way use sparse index on k.
db.collection.ensureIndex({k:1}, {sparse: true});
Try using $ne : null
So, as per your code example:
{k:{$ne : null}}
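One caveat when swapping $exists: true for $ne: null is that the two filters do not match exactly the same documents. The plain JavaScript sketch below models both predicates over made-up sample documents; existsTrue and notNull are hypothetical helper names that mirror the two filters.

```javascript
// Illustrative documents covering the four cases for field k.
const docs = [
  { _id: 1, k: { a: 1, b: 2 } }, // k present with content
  { _id: 2, k: {} },             // k present but an empty object
  { _id: 3, k: null },           // k present and explicitly null
  { _id: 4 },                    // k missing
];

// Models {k: {$exists: true}}: the field is present, whatever its value.
const existsTrue = (doc) => "k" in doc;

// Models {k: {$ne: null}}: the field is present and not null.
const notNull = (doc) => doc.k !== null && doc.k !== undefined;

const countExists = docs.filter(existsTrue).length;
const countNotNull = docs.filter(notNull).length;

console.log(countExists);  // documents 1, 2 and 3
console.log(countNotNull); // documents 1 and 2
```

The counts only agree when no document stores an explicit null in k. Neither filter excludes the empty object {}, so if "not empty" is meant to rule that case out, it needs an extra condition.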