MongoDB - explain.executionStats - mongodb

Are there any elements within the output of MongoDB's explain("executionStats") that give a hint about whether the query is using a given index for filtering, for sorting, or for both?
I read the following URLs:
Mongodb compound indexes for filtering and sorting on BIG collection [points to the URL below and has a brief discussion]
https://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ [this one gives the general idea, but the explain output uses an older format with elements that don't exist in MongoDB 4.0, which I am using]
https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/ [documents how to determine the index and leverage index prefixes, but does not show explain output confirming the usage]

From MongoDB Docs:
If MongoDB can use an index scan to obtain the requested sort order,
the result will not include a SORT stage. Otherwise, if MongoDB cannot
use the index to sort, the explain result will include a SORT stage.
Example:
Look at the sample data from sortop collection.
Explain plan for a query without index:
Create Index on the collection:
Run the same query and check SORT stage in explain plan:
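The output for the steps above is not reproduced here. As a rough sketch of what to look for, the plain JavaScript below walks a winningPlan tree (using the stage, inputStage, inputStages and indexName fields that appear in explain output) and reports whether a SORT stage is present and which indexes are scanned. The two sample plans are hand-written, simplified stand-ins, not real output from the sortop collection, and the index name a_1 is hypothetical.

```javascript
// Collect every stage name and index name found in an explain() plan tree.
function collectStages(plan, out = { stages: [], indexes: [] }) {
  if (!plan) return out;
  out.stages.push(plan.stage);
  if (plan.indexName) out.indexes.push(plan.indexName);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// Summarize how an index is involved in a winning plan.
function describeIndexUse(winningPlan) {
  const { stages, indexes } = collectStages(winningPlan);
  return {
    usesIndex: indexes.length > 0,
    indexes,
    // A SORT stage means the sort was done in memory, not via index order.
    sortedInMemory: stages.includes("SORT"),
  };
}

// Hypothetical, simplified plans (real output has more fields and stages).
const planWithoutIndex = { stage: "SORT", inputStage: { stage: "COLLSCAN" } };
const planWithIndex = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "a_1" },
};

console.log(describeIndexUse(planWithoutIndex));
console.log(describeIndexUse(planWithIndex));
```

In the shell, the object to pass in would be the queryPlanner.winningPlan part of the explain result. For the filtering half of the question, the indexBounds of the IXSCAN stage are worth a look as well: bounds narrower than the full [MinKey, MaxKey] range suggest the index is also being used to filter.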

Related

How does mongodb decide which index to use for a query?

When a certain query is done on a mongodb collection, if there are multiple indexes that can be used to perform the query, how does mongodb choose the index for the query?
For example, in an 'order' collection with two indexes, one on the field 'customer' and one on 'vendor', if a query specifies both customer and vendor, how does mongodb decide whether to use the customer index or the vendor index?
Is there a way to instruct mongodb to prefer a certain index over another, for a given query?
When a certain query is done on a mongodb collection, if there are
multiple indexes that can be used to perform the query, how does
mongodb choose the index for the query?
You can generate a query plan for the query you are trying to analyze to see which indexes are used and how they are used. Use the explain method for this; e.g. db.collection.explain().find(). The explain method takes a parameter with the values "queryPlanner" (the default), "executionStats" and "allPlansExecution". Each of these produces different plan output.
The query optimizer generates plans for all the indexes that could be used for a given query. In your example order collection, the two single field indexes (one each for the fields customer and vendor) are possible candidates (for a query filter with both the fields). The optimizer executes each candidate plan for a trial period and chooses the best performing one (this is determined by factors such as which plan returned the most documents in the least time, among others). Based on this, it outputs the winning and rejected plans, which can be viewed in the plan output. You will see one of the indexes in the winning plan and the other in the rejected plans.
MongoDB caches the plans for a given query shape. Query plans are cached so that plans need not be generated and compared against each other every time a query is executed.
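To make the winning/rejected split concrete, here is a small sketch in plain JavaScript that pulls the index names out of the queryPlanner.winningPlan and queryPlanner.rejectedPlans sections of an explain result. The sample object is hand-written and simplified; which of customer_1 and vendor_1 actually wins depends on your data, so the choice below is only an assumption for illustration.

```javascript
// Return the index names used anywhere in a plan tree.
function indexNames(plan) {
  if (!plan) return [];
  const own = plan.indexName ? [plan.indexName] : [];
  const children = [plan.inputStage, ...(plan.inputStages || [])];
  return own.concat(...children.map(indexNames));
}

// List the indexes of the winning plan and of each rejected plan.
function summarizePlans(explainOutput) {
  const planner = explainOutput.queryPlanner;
  return {
    winning: indexNames(planner.winningPlan),
    rejected: (planner.rejectedPlans || []).map(indexNames),
  };
}

// Hypothetical, simplified explain output for the 'order' example.
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "customer_1" },
    },
    rejectedPlans: [
      {
        stage: "FETCH",
        inputStage: { stage: "IXSCAN", indexName: "vendor_1" },
      },
    ],
  },
};

console.log(JSON.stringify(summarizePlans(sampleExplain)));
```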
Is there a way to instruct mongodb to prefer a certain index over
another, for a given query?
There are a couple of ways you can do this:
Force MongoDB to use a specific index using the hint() method.
Set Index Filters to specify which indexes the optimizer will evaluate for a query shape. Note that this setting is not persisted after a server shutdown.
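As a hedged sketch of the two options, the snippet below builds the command document for planCacheSetFilter in plain JavaScript and shows the hint() form in a comment. The filter values "c1" and "v1" and the helper name buildIndexFilter are made up for illustration; check the command's documentation for your server version before relying on it.

```javascript
// Build a planCacheSetFilter command document for one query shape.
// 'indexes' lists the only indexes the optimizer should consider.
function buildIndexFilter(collection, query, indexes) {
  return { planCacheSetFilter: collection, query, indexes };
}

// Hypothetical query on the 'order' collection from the question.
const orderQuery = { customer: "c1", vendor: "v1" };
const filterCmd = buildIndexFilter("order", orderQuery, [{ customer: 1 }]);

console.log(JSON.stringify(filterCmd));

// In the mongo shell the two approaches would look roughly like:
//   db.order.find(orderQuery).hint({ customer: 1 })   // force one index
//   db.runCommand(filterCmd)                          // restrict candidates
```

The hint applies to a single query, while the index filter applies to every query of that shape until it is cleared or the server restarts.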
Their official website states:
MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays.
You can check out this article for more information.
For your second question, you can try creating custom indexes for your documents. Check out their documentation on this.

Why does mongodb not use index scan but collection scan with find()?

I am using mongodb 3.2.4
When I execute db.mytable.find().explain(), the winning plan is 'COLLSCAN'.
But when I execute db.mytable.find().hint({_id:1}).explain(), the winning plan is 'IXSCAN'.
So here comes a question: since _id is the default index of a collection, why does mongodb not use this index for the query?
An index can be used when there is a filter criteria or a sort operation, that is, when the fields in the index are used in the filter predicate and/or the sort. In your case, the find method doesn't have a filter criteria or a sort, so no index is used, and you can see that in the query plan as a collection scan. It is as expected. But when you provide a hint to the find method, the query optimizer tries to use the index, and in your case it did (you see it in the query plan as an IXSCAN). In either case, with or without the hint, the find has to scan all the documents or all the keys in the index.
The _id field has a default unique index, yes, but unless you use the _id field in the query filter predicate or in a sort, the query will not use it (unless you explicitly ask for the index with a hint). You can verify this with the queries db.mytable.find( { _id: 123 } ) and db.mytable.find( { } ).sort( { _id: -1 } ): the query planner will show index use even though you do not specify the hint (an equality match on _id may appear as an IDHACK stage, which is a fast path over the _id index, rather than as an IXSCAN).
The main purpose of the indexes is to make your queries run fast; it is about query performance. It has to be a query with filter predicate and/or a sort operation to use an index (and the fields used in the filter or sort must be indexed for performance). With the find method, in your case, without any of the two you are just accessing all the documents as they are in the collection and the index is of no use (and the query optimizer shows that in the plan).
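To see why the hinted plan is no cheaper, compare the counters in the executionStats section (nReturned, totalKeysExamined, totalDocsExamined). The sketch below does that arithmetic in plain JavaScript; the numbers are hypothetical figures for a 1000-document collection queried without a filter, not measurements.

```javascript
// Summarize the work recorded in an executionStats section.
function scanWork(stats) {
  const examined = stats.totalKeysExamined + stats.totalDocsExamined;
  return {
    examined,
    // Index keys plus documents touched for each document returned.
    perResult: stats.nReturned > 0 ? examined / stats.nReturned : null,
  };
}

// Hypothetical counters: plain find() versus find().hint({ _id: 1 }).
const collscanStats = { nReturned: 1000, totalKeysExamined: 0, totalDocsExamined: 1000 };
const hintedStats = { nReturned: 1000, totalKeysExamined: 1000, totalDocsExamined: 1000 };

console.log(scanWork(collscanStats)); // { examined: 1000, perResult: 1 }
console.log(scanWork(hintedStats));   // { examined: 2000, perResult: 2 }
```

With no filter to narrow the scan, walking the index first only adds key reads on top of fetching every document.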

CosmosDB MongoDB 3.6 fails sort() query with compounded index

Newbie MongoDB & CosmosDB user here. I've read the answer to the question "How does MongoDB treat find().sort() queries with respect to single and compound indexes?" and the official MongoDB docs, and I believe my index creation mirrors that answer, so I am leaning towards this being a CosmosDB issue. But according to their documentation, CosmosDB 3.6 supports compound indexes as well, so I am at a loss right now.
I am able to run sort() queries like db.Videos.find().sort({"PublishedOn": 1}) from the mongo command line on a collection with an index created as db.Videos.createIndex({"PublishedOn": 1}) or db.Videos.createIndex({"PublishedOn": -1}).
And when I add a 'where' clause to the find like this db.Videos.find({"IsPinned": false}).sort({"PublishedOn": 1}) the above index still works.
However, I now have document lookups which I want to avoid, so I drop the above single field index and create a compound index like this db.Videos.createIndex({"IsPinned": 1, "PublishedOn": 1}) or db.Videos.createIndex({"PublishedOn": 1, "IsPinned": 1}), but now the query always fails with the error "The index path corresponding to the specified order-by item is excluded."
Is this a limitation of CosmosDB or is my 'ordering' in the index bad?
The issue with CosmosDB is that it expects all WHERE fields to be used in the ORDER BY clause (the sort) as well, in exactly the same order as in the index; otherwise it won't use the index.
Creating an index as db.Videos.createIndex({"IsPinned": 1, "PublishedOn": 1}) and then updating the query to be db.Videos.find({"IsPinned": false}).sort({"IsPinned": 1, "PublishedOn": 1}) works like a charm.
I inferred this from reading the CosmosDB documentation on indexing policies (https://learn.microsoft.com/en-us/azure/cosmos-db/index-policy) as the MongoDB documentation suddenly stops after the index creation (https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb-indexing) section.
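The rule as I understand it can be written down as a small check in plain JavaScript. This only encodes my inference above (same fields, same order); it is not CosmosDB's actual planner logic, and the helper name sortFollowsIndex is made up.

```javascript
// True when the sort spec lists exactly the index fields, in the same order.
// Sort directions are deliberately ignored in this sketch.
function sortFollowsIndex(indexKeys, sortSpec) {
  const indexFields = Object.keys(indexKeys);
  const sortFields = Object.keys(sortSpec);
  return (
    indexFields.length === sortFields.length &&
    indexFields.every((field, i) => field === sortFields[i])
  );
}

const videosIndex = { IsPinned: 1, PublishedOn: 1 };

// The failing query sorted on PublishedOn only.
console.log(sortFollowsIndex(videosIndex, { PublishedOn: 1 })); // false

// The working query repeats the filter field in the sort.
console.log(sortFollowsIndex(videosIndex, { IsPinned: 1, PublishedOn: 1 })); // true
```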

Fundamental misunderstanding of MongoDB indices

So, I read the following definition of indexes from the MongoDB Docs.
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
Indexes are special data structures that store a small portion of the
collection’s data set in an easy to traverse form. The index stores
the value of a specific field or set of fields, ordered by the value
of the field. The ordering of the index entries supports efficient
equality matches and range-based query operations. In addition,
MongoDB can return sorted results by using the ordering in the index.
I have a sample database with a collection called pets. Pets have the following structure.
{
  "_id": ObjectId("123abc123abc"),
  "name": "My pet's name"
}
I created an index on the name field using the following code.
db.pets.createIndex({"name":1})
What I expect is that the documents in the collection, pets, will be indexed in ascending order based on the name field during queries. The result of this index can potentially reduce the overall query time, especially if a query is strategically structured with available indices in mind. Under that assumption, the following query should return all pets sorted by name in ascending order, but it doesn't.
db.pets.find({},{"_id":0})
Instead, it returns the pets in the order that they were inserted. My conclusion is that I lack a fundamental understanding of how indices work. Can someone please help me to understand?
Yes, it is a misunderstanding about how indexes work.
Indexes don't change the output of a query, only the way the query is processed by the database engine. So db.pets.find({},{"_id":0}) will always return the documents in natural order, irrespective of whether there is an index or not.
Indexes are used only when your query makes use of them. Thus, db.pets.find({name : "My pet's name"},{"_id":0}) and db.pets.find({}, {_id : 0}).sort({name : 1}) will use the {name : 1} index.
You should run explain on your queries to check whether indexes are being used or not.
You may want to refer to the documentation on how indexes work.
https://docs.mongodb.com/manual/indexes/
https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/
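A toy model in plain JavaScript may help with the mental picture. It is only an analogy and not how MongoDB stores anything: the array stands for the collection in natural order, and the index is a separate sorted list of (key, position) entries. The pet names are made up.

```javascript
// Documents in insertion ("natural") order.
const pets = [{ name: "Rex" }, { name: "Bella" }, { name: "Milo" }];

// Toy index on name: sorted entries pointing back into the collection.
const nameIndex = pets
  .map((doc, pos) => ({ key: doc.name, pos }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// find({}) analogue: the index is never consulted.
function findAll() {
  return pets.map((doc) => doc.name);
}

// find({}).sort({ name: 1 }) analogue: walk the index in key order.
function findSortedByName() {
  return nameIndex.map((entry) => pets[entry.pos].name);
}

// find({ name: x }) analogue: binary search over the sorted index.
function findByName(name) {
  let lo = 0;
  let hi = nameIndex.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const key = nameIndex[mid].key;
    if (key === name) return pets[nameIndex[mid].pos];
    if (key < name) lo = mid + 1;
    else hi = mid - 1;
  }
  return null;
}

console.log(findAll());          // [ 'Rex', 'Bella', 'Milo' ]
console.log(findSortedByName()); // [ 'Bella', 'Milo', 'Rex' ]
console.log(findByName("Rex"));  // { name: 'Rex' }
```

Creating the index never reorders the pets array itself, which is why a plain find returns insertion order.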

Where do compound indexes in mongodb come into play

What are the advantages we get from compound indexes? Suppose we have a collection in which I have to index 2 fields, say key1 and key2. How is having two separate indexes different from having a compound index {key1:1, key2:1}? What's the problem with having 2 separate indexes? Can't mongodb make use of 2 or more indexes to satisfy a query?
As of MongoDB 2.2:
Every query, including update operations, uses one and only one index.
The query optimizer selects the index empirically by occasionally running alternate query plans and by selecting the plan with the best response time for each query type.
An exception to the above rule is $or queries; each clause is executed in parallel and can use a separate index.
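Given the one-index-per-query rule above, a toy count in plain JavaScript shows what a compound index buys. This is an analogy using Maps rather than MongoDB's real index structures, and the documents are made up: with only a key1 index, every document matching key1 has to be examined to check key2, while a (key1, key2) index leads straight to the matches.

```javascript
// Hypothetical documents with the two fields from the question.
const docs = [
  { key1: "a", key2: 1 }, { key1: "a", key2: 2 }, { key1: "a", key2: 3 },
  { key1: "b", key2: 1 }, { key1: "b", key2: 2 },
];

// Append a document position under a key in a toy index.
function addEntry(index, key, pos) {
  if (!index.has(key)) index.set(key, []);
  index.get(key).push(pos);
}

const singleIndex = new Map();   // analogue of { key1: 1 }
const compoundIndex = new Map(); // analogue of { key1: 1, key2: 1 }
docs.forEach((doc, pos) => {
  addEntry(singleIndex, doc.key1, pos);
  addEntry(compoundIndex, `${doc.key1}|${doc.key2}`, pos);
});

// Use only the key1 index, then check key2 on each fetched document.
function viaSingleIndex(k1, k2) {
  const candidates = singleIndex.get(k1) || [];
  const matches = candidates.filter((pos) => docs[pos].key2 === k2);
  return { examined: candidates.length, returned: matches.length };
}

// Use the compound index: both conditions are resolved in the index.
function viaCompoundIndex(k1, k2) {
  const matches = compoundIndex.get(`${k1}|${k2}`) || [];
  return { examined: matches.length, returned: matches.length };
}

console.log(viaSingleIndex("a", 2));   // { examined: 3, returned: 1 }
console.log(viaCompoundIndex("a", 2)); // { examined: 1, returned: 1 }
```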
For more information see:
Indexing Overview
Query Optimizer
Explain