Order of Fields in Mongo Query vs Ordered Checked In - mongodb

Say you're querying documents based on 2 data points. One is a simple bool parameter, and the other is a complicated $geoWithin calculation.
db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} )
Will mongo reorder these parameters, so that it checks the boolField 1st, before running the complicated check?

MongoDB uses indexes like any other database, so what matters is whether the queried fields are indexed, not the order in which they appear in the query. At least, nothing in the documentation says that MongoDB tries to check primitive query fields first. In your example, if boolField has an index, MongoDB checks that field first and eliminates the documents whose boolField is false. If geoField has an index instead, MongoDB runs the query on that field first.
So what happens if neither field has an index, or both do? Presumably the fields are checked in the order given in the query, because the query optimization page of the MongoDB documentation mentions nothing besides indexes. You can always test a query's performance by appending .explain("executionStats").
So compare the performance of db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} ) and db.collection.find( { "boolField" : true, "geoField": { "$geoWithin" : ...} } ). And let us know :)
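To make the two explain outputs easier to compare side by side, something like the following could help. It is only a sketch: summarizeExplain is a hypothetical helper, not part of MongoDB, and the sample object is made up to resemble the classic explain("executionStats") shape (queryPlanner.winningPlan plus executionStats). Newer server versions may nest the plan differently.

```javascript
// Hypothetical helper: pull the numbers worth comparing out of an
// explain("executionStats") result.
function summarizeExplain(explainResult) {
  // Walk down to the innermost stage of the winning plan (e.g. COLLSCAN or IXSCAN).
  let stage = explainResult.queryPlanner.winningPlan;
  while (stage.inputStage) stage = stage.inputStage;
  const stats = explainResult.executionStats;
  return {
    accessStage: stage.stage,
    indexName: stage.indexName || null,
    nReturned: stats.nReturned,
    totalKeysExamined: stats.totalKeysExamined,
    totalDocsExamined: stats.totalDocsExamined,
    executionTimeMillis: stats.executionTimeMillis,
  };
}

// Made-up sample, for illustration only (not real measurements).
const sample = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "boolField_1" } },
  },
  executionStats: { nReturned: 12, totalKeysExamined: 40, totalDocsExamined: 40, executionTimeMillis: 3 },
};
console.log(summarizeExplain(sample));
```

In the shell you would pass it the result of each find(...).explain("executionStats") call, once per field order, and compare the two summaries.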

To add to the above response: if you want MongoDB to use a specific index, you can use cursor.hint(). https://docs.mongodb.com/manual/core/query-plans/ explains how the default index selection is done.

Related

Using object as _id in MongoDb causes collscan on queries

I'm having some issues with using a custom object as my _id value in MongoDb.
The objects I'm storing in _id looks like this:
"_id" : {
"EDIEL" : "1010101010101",
"StartDateTicks" : NumberLong(636081120000000000)
}
Now, when I'm performing the following query:
.find({
"_id.EDIEL": { $eq: "1010101010101" },
"_id.StartDateTicks": { $gte: 636082776000000000, $lt: 636108696000000000 }
}).explain()
It does a COLLSCAN. I can't figure out why exactly. Is it because I'm not querying against the _id object with an object?
Does anyone know what I'm doing wrong here? :-)
Edit:
I tried creating a compound index containing the EDIEL and StartDateTicks fields, ran the query again, and now it uses the index instead of a collection scan. While this works, it would still be nice to avoid the extra index and just use _id (since it's basically a "free" index). So the question still stands: why can't I query against _id.EDIEL and _id.StartDateTicks and make use of the index?
Indexes are built on keys, not on the fields inside an object. When you use an object for _id, the _id index covers the object as a whole, so it can't be used for a query on individual fields of that object.
This is true not only for _id but for any subdocument.
{
"name":"awesome book",
"detail" :{
"pages":375,
"alias" : "AB"
}
}
Now, when you have an index on detail and you query by detail.pages or detail.alias, the index on detail cannot be used, and certainly not for range queries. You need indexes on detail.pages and detail.alias.
When an index is created on an object, it indexes the object as a whole and not per field. That's why queries on the object's fields are not able to use the object's index.
Hope that helps
You will need to index the two fields themselves, since an index on an embedded document only covers the document as a whole. Your options are therefore a compound index on both fields, or separate single-field indexes that MongoDB may combine through index intersection.
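The point about whole-object indexes can be illustrated with a toy model in plain JavaScript. This is not how MongoDB stores indexes internally; the Map, the JSON serialization and the second sample document are assumptions made purely for illustration.

```javascript
// Toy model of an index built on a whole embedded document.
const docs = [
  { _id: 1, name: "awesome book", detail: { pages: 375, alias: "AB" } },
  { _id: 2, name: "other book", detail: { pages: 120, alias: "OB" } }, // made-up second document
];

// The index key is the entire sub-document, including its field order.
const wholeDetailIndex = new Map();
for (const d of docs) wholeDetailIndex.set(JSON.stringify(d.detail), d._id);

// An exact match on the whole sub-document is a single index lookup.
function findByWholeDetail(detail) {
  return wholeDetailIndex.get(JSON.stringify(detail));
}

// A query on one inner field cannot use those keys and has to scan every document.
function findByPagesAtLeast(minPages) {
  return docs.filter((d) => d.detail.pages >= minPages).map((d) => d._id);
}

console.log(findByWholeDetail({ pages: 375, alias: "AB" })); // 1
console.log(findByWholeDetail({ alias: "AB", pages: 375 })); // undefined: field order differs
console.log(findByPagesAtLeast(200)); // [ 1 ], found by scanning
```

An index on detail.pages would correspond to a second structure keyed by pages alone, which is what makes the range query cheap.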

In MongoDB, is db.collection.find() same as db.collection.find().sort({$natural:1})?

I'm sure this is an easy one, but I just wanted to make sure. Is find() with some search and projection criterion same as applying a sort({$natural:1}) on it?
Also, what is the default natural sort order? How is it different from a sort({_id:1}), say?
db.collection.find() returns the same result as db.collection.find().sort({$natural:1}).
{"$natural" : 1} forces the find query to do a table scan (the default sort); when used in a sort, it specifies on-disk order.
When you update a document, MongoDB may move it to another place on disk.
For example, insert the documents below:
{
_id : 0,
},
{
_id : 1,
}
then update:
db.collection.update({ _id : 0} , { $set : { blob : BIG DATA}})
And when you perform the find query you will get
{
"_id" : 1
},
{
"_id" : 0,
"blob" : BIG DATA
}
As you can see, the order of the documents has changed, so the default order is not by _id.
If you don't specify a sort, find() returns documents in the order they are stored on disk. That order may coincide with insertion order, but that's not always going to be true. It is also worth noting that the location of a document on disk may change. For instance, on an update MongoDB may move a document from one place to another if needed.
If the query uses an index, the default order is the order in which the index entries are found.
$natural is the order in which documents are found on disk.
It is recommended that you specify the sort explicitly to be sure of the sorting order.
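The effect described above can be mimicked with a toy model in plain JavaScript. It is only an illustration: the array standing in for the disk and the growAndMove function are assumptions, and real storage engines decide for themselves whether a document is relocated.

```javascript
// Toy model only: "disk" is an array, and an update that grows a document
// relocates it to the end of the array.
const disk = [{ _id: 0 }, { _id: 1 }];

function growAndMove(id, extraFields) {
  const pos = disk.findIndex((d) => d._id === id);
  const [doc] = disk.splice(pos, 1); // remove from the old location
  disk.push(Object.assign({}, doc, extraFields)); // rewrite at a new location
}

growAndMove(0, { blob: "x".repeat(1000) });

// Order as stored, which is what an unsorted read would follow in this model.
const naturalOrder = disk.map((d) => d._id);
// Order after an explicit sort on _id.
const sortedById = [...disk].sort((a, b) => a._id - b._id).map((d) => d._id);

console.log(naturalOrder); // [ 1, 0 ]
console.log(sortedById); // [ 0, 1 ]
```

Only the explicitly sorted result is independent of where the documents happen to sit.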

How to create index for a query which does sorting on some fields?

We are using MongoDB for our application.
I have a question about creating indexes on a collection.
First of all, below is my query, which does a find on the DB and returns the data:
db.mycollection.find({ symbol: "UGHNG", date: "2013-11-08", mainsymbol: "HIJ" }).sort( { "price": 1,"surv": 1} ).pretty()
As you can see from the above query, there are two fields, price and surv, which are used for sorting only.
What is the best way to create an index for the above?
Create a compound index which includes every field of the above query:
db.mycollection.ensureIndex({"symbol":1,"date":1,"mainsymbol":1,"price":1,"surv":1},{"unique" : false})
OR
Create 2 single indexes and one compound index, as shown below, to serve the above query:
db.mycollection.ensureIndex({"price" : 1}, {"unique" : false})
db.mycollection.ensureIndex({"surv" : 1}, {"unique" : true})
db.mycollection.ensureIndex({"symbol":1,"date":1,"mainsymbol":1},{"unique" : false})
Please share your views on the best approach in all respects (RAM size, query performance).
Unless you have queries that use only one of price and surv, I think the compound index should work best.
The best approach is to use .explain().
This will let you know how much work is done with each version of your index.
It is really worth the time to explore the explain method and learn to interpret its output.
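As a rough way to reason about the two candidates before running explain(), the check below tests whether an index has the shape "equality fields first, then the sort fields in order". It is a deliberately simplified sketch: indexCoversEqualityThenSort is a hypothetical helper, and the real query planner accepts more index shapes than this conservative rule does.

```javascript
// Hypothetical, conservative check: does the index start with the equality
// fields (in any order) and continue with the sort fields in sort order?
function indexCoversEqualityThenSort(indexSpec, equalityFields, sortSpec) {
  const indexFields = Object.keys(indexSpec);
  const sortFields = Object.keys(sortSpec);
  const n = equalityFields.length;

  // The leading index fields must be exactly the equality fields.
  const prefix = indexFields.slice(0, n);
  if (prefix.length !== n || !equalityFields.every((f) => prefix.includes(f))) return false;

  // The next index fields must be the sort fields, in the same order.
  const rest = indexFields.slice(n, n + sortFields.length);
  if (rest.length !== sortFields.length) return false;
  if (!sortFields.every((f, i) => rest[i] === f)) return false;

  // Directions must all match, or all be inverted (index walked backwards).
  const same = sortFields.every((f) => indexSpec[f] === sortSpec[f]);
  const inverted = sortFields.every((f) => indexSpec[f] === -sortSpec[f]);
  return same || inverted;
}

const equality = ["symbol", "date", "mainsymbol"];
const sort = { price: 1, surv: 1 };

console.log(indexCoversEqualityThenSort(
  { symbol: 1, date: 1, mainsymbol: 1, price: 1, surv: 1 }, equality, sort)); // true
console.log(indexCoversEqualityThenSort(
  { symbol: 1, date: 1, mainsymbol: 1 }, equality, sort)); // false: sort not covered
console.log(indexCoversEqualityThenSort({ price: 1 }, equality, sort)); // false
```

Under this rule only the five-field compound index serves both the filter and the sort from a single index; explain() remains the way to confirm it on real data.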

Can you match sub-fields with $all in Mongo?

I have a collection of documents, where each document looks like this:
{'name' : 'John', 'locations' :
[
{'place' : 'Paris', 'been' : true},
{'place' : 'Moscow', 'been' : false},
{'place' : 'Berlin', 'been' : true}
]
}
Where the locations array could have any length.
I want to match documents where the been field is true for all elements in the locations array. Looking at the documentation it looks like I should use $and somehow but I'm not sure if it works with sub-fields.
There are several options:
use $ne: db.destinations.find({"locations.been":{$ne:false}})
change your business logic to precompute that value before saving the document. Otherwise, this search must look through all records and then all places. This value could be indexed.
use the $where operator, but, understand the performance implications. It may require a full table scan. In this case, it would.
write a map-reduce function with the filter logic and only emit those that are valid. You'd need to incrementally update it per the docs.
write a query using the aggregation framework. There are a lot of good examples here. Although, like other solutions, this could end up looping through the entire collection.
I think it's impossible to do with standard MongoDB operators like $elemMatch or $all. The only possible way is to write a custom JS query:
db.test.find("return this.locations.every(function(loc){return loc.been});")
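The $ne approach and the custom JS query above do not test exactly the same thing, which the plain JavaScript sketch below tries to show. It runs on an in-memory array rather than a collection; the documents for Ann and Raj are made up, and the second filter is only my reading of how {"locations.been": {$ne: false}} behaves (it matches when no element has been equal to false, including when the field is missing).

```javascript
const people = [
  { name: "John", locations: [
    { place: "Paris", been: true },
    { place: "Moscow", been: false },
    { place: "Berlin", been: true },
  ] },
  { name: "Ann", locations: [{ place: "Rome", been: true }, { place: "Oslo", been: true }] },
  { name: "Raj", locations: [{ place: "Lima" }] }, // "been" is missing here
];

// Same predicate as the custom JS query: every element has a truthy "been".
const allBeen = people
  .filter((p) => p.locations.every((loc) => loc.been))
  .map((p) => p.name);

// Mirrors the $ne: false query: no element has been === false.
const noneFalse = people
  .filter((p) => !p.locations.some((loc) => loc.been === false))
  .map((p) => p.name);

console.log(allBeen); // [ 'Ann' ]
console.log(noneFalse); // [ 'Ann', 'Raj' ]
```

If every element is guaranteed to carry a boolean been field, the two give the same answer and the $ne form avoids running JavaScript on the server.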

Sorting on Multiple fields mongo DB

I have a query in mongo such that I want to give preference to the first field and then the second field.
Say I have to query such that
db.col.find({category: A}).sort({updated: -1, rating: -1}).limit(10).explain()
So I created the following index
db.col.ensureIndex({category: 1, rating: -1, updated: -1})
It worked just fine, scanning only as many objects as needed, i.e. 10.
But now I need to query
db.col.find({category: { $ne: A}}).sort({updated: -1, rating: -1}).limit(10)
So I created the following index:
db.col.ensureIndex({rating: -1, updated: -1})
but this leads to a scan of the whole collection, and when I create
db.col.ensureIndex({ updated: -1 ,rating: -1})
it scans fewer documents.
I just want to be clear about sorting on multiple fields and what order has to be preserved when doing so. From reading the MongoDB documentation, it's clear that the field on which we need to sort should be the last field in the index. That is what I assumed in my $ne query above. Am I doing anything wrong?
The MongoDB query optimizer works by trying different plans to determine which approach works best for a given query. The winning plan for that query pattern is then cached for the next ~1,000 queries or until you do an explain().
To understand which query plans were considered, you should use explain(1), eg:
db.col.find({category:'A'}).sort({updated: -1}).explain(1)
The allPlans detail will show all plans that were compared.
If you run a query which is not very selective (for example, if many records match your criteria of {category: { $ne:'A'}}), it may be faster for MongoDB to find results using a BasicCursor (table scan) rather than matching against an index.
The order of fields in the query generally does not make a difference for the index selection (there are a few exceptions with range queries). The order of fields in a sort does affect the index selection. If your sort() criteria does not match the index order, the result data has to be re-sorted after the index is used (you should see scanAndOrder:true in the explain output if this happens).
It's also worth noting that MongoDB will only use one index per query (with the exception of $ors).
So if you are trying to optimize the query:
db.col.find({category:'A'}).sort({updated: -1, rating: -1})
You will want to include all three fields in the index:
db.col.ensureIndex({category: 1, updated: -1, rating: -1})
FYI, if you want to force a particular query to use an index (generally not needed or recommended), there is a hint() option you can try.
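The remark about sort order and index order can be turned into a small check. This is a simplified sketch, not the planner's actual logic: sortCanUseIndex is a hypothetical helper that only asks whether the sort fields are the leading index fields, in the same order, with directions either all equal or all reversed.

```javascript
// Hypothetical check: can this index return documents already in sort order?
function sortCanUseIndex(indexSpec, sortSpec) {
  const indexFields = Object.keys(indexSpec);
  const sortFields = Object.keys(sortSpec);
  if (sortFields.length > indexFields.length) return false;
  // Sort fields must be the leading index fields, in the same order.
  if (!sortFields.every((f, i) => indexFields[i] === f)) return false;
  // Either walk the index forwards (same directions) or backwards (all inverted).
  const forward = sortFields.every((f) => indexSpec[f] === sortSpec[f]);
  const backward = sortFields.every((f) => indexSpec[f] === -sortSpec[f]);
  return forward || backward;
}

const wanted = { updated: -1, rating: -1 };
console.log(sortCanUseIndex({ updated: -1, rating: -1 }, wanted)); // true
console.log(sortCanUseIndex({ updated: 1, rating: 1 }, wanted)); // true, walked backwards
console.log(sortCanUseIndex({ rating: -1, updated: -1 }, wanted)); // false, wrong field order
console.log(sortCanUseIndex({ updated: -1, rating: 1 }, wanted)); // false, mixed directions
```

This matches what the question observed for the $ne query: the index on {updated: -1, rating: -1} lines up with the sort, while {rating: -1, updated: -1} does not.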
That is true, but there are two layers of ordering here since you are sorting on a compound index.
As you noticed, when the first field of the index matches the first field of the sort, it worked and the index was used. When it is the other way around, it does not.
So, by your own observations, the order that needs to be preserved is the query's order of fields from first to last. The MongoDB query analyser can sometimes move fields around to match an index, but normally it will just try to match the first field, and if it cannot, it will skip that index.
Try this code. It sorts the data first by name and then, within each name, by filter:
var cursor = db.collection('vc').find({ "name" : { $in: [ /cpu/, /memo/ ] } }, { _id: 0, }).sort( { "name":1 , "filter": 1 } );
Sort and Index Use
MongoDB can obtain the results of a sort operation from an index which
includes the sort fields. MongoDB may use multiple indexes to support
a sort operation if the sort uses the same indexes as the query
predicate. ... Sort operations that use an index often have better
performance than blocking sorts.
db.restaurants.find().sort( { "borough": 1, "_id": 1 } )
more information :
https://docs.mongodb.com/manual/reference/method/cursor.sort/