MongoDB index for nested document is not being used - mongodb

In my collection, I've say the following structure
{
_id: ObjectId("ssxxdfasfsadf"),
a: {
b: "somevalue"
}
}
I've created an index for a.b, which works fine if I use find query as db.collection.find({"a.b": "someothervalue"}).
If I change my query to db.collection.find({a: {b: "somevalue"}}), it's doing a complete collection scan. (Source - find().explain())
Sure, I can modify my application to do the query as "a.b", but I want to avoid that, as I've few other fields in a, on which in future I may need to query.
Is there anyway {a: {b: "somevalue"}} could work with tweaking the index?
Also, is there any advantage/disadvantage of using one or the other?

I would go with the first approach. A quick read through MongoDB's documentation, states the following:
MongoDB uses the dot notation to access the elements of an array and to access the fields of an embedded document.
See MongoDB Dot Notation and Query on Embedded/Nested Documents.
About tweaking the index, you could index the embedded document as a whole:
db.myColl.createIndex({ "a": 1 });
But I don't see the reason of doing this if you only need specific properties indexed. I would be sensitive on the Index Size, especially if the property will be holding a lot of data.

Related

Mongodb: searching embedded documents by the '_id' field

If I have a data with a structure like this as a single document in a collection:
{
_id: ObjectId("firstid"),
"name": "sublimetest",
"child": {
_id: ObjectId("childid"),
"name": "materialtheme"
}
}
is there a way to search for the embedded document by the id "childid" ?
because mongo doesn't index the _id fields of embedded documents (correct me if I am wrong here),
as this query doesn't work :
db.collection.find({_id:"childid"});
Also please suggest me if there is any other document database that would be suitable for this kind of retreiving data that is structured as a tree, where the requirement is to :
query children without having to issue joins
find any node in the tree as fast as you would find the root node, as if all these nodes were stored as separate documents in a collection.
Why this is not a duplicate of question(s) suggested :
the potential-duplicate-question, queries document by using dot notation. But what if the document is nested 7 levels deep ? In such case it would not be suitable to write a query using dot notation. what I want is that, all documents, whether top level, or nested, if they have the _id field, should be in the bucket of _id indexes, so that when you search db.collection.find({_id: "asdf"}), it should take into account documents that are nested too that have the _id field matching "asdf". In short, it should be as if the inner document weren't nested, but present parallel to the outer one.
You can use the dot notation:
db.posts.find({"child._id": "childid"})

Fundamental misunderstanding of MongoDB indices

So, I read the following definition of indexes from [MongoDB Docs][1].
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
Indexes are special data structures that store a small portion of the
collection’s data set in an easy to traverse form. The index stores
the value of a specific field or set of fields, ordered by the value
of the field. The ordering of the index entries supports efficient
equality matches and range-based query operations. In addition,
MongoDB can return sorted results by using the ordering in the index.
I have a sample database with a collection called pets. Pets have the following structure.
{
"_id": ObjectId(123abc123abc)
"name": "My pet's name"
}
I created an index on the name field using the following code.
db.pets.createIndex({"name":1})
What I expect is that the documents in the collection, pets, will be indexed in ascending order based on the name field during queries. The result of this index can potentially reduce the overall query time, especially if a query is strategically structured with available indices in mind. Under that assumption, the following query should return all pets sorted by name in ascending order, but it doesn't.
db.pets.find({},{"_id":0})
Instead, it returns the pets in the order that they were inserted. My conclusion is that I lack a fundamental understanding of how indices work. Can someone please help me to understand?
Yes, it is misunderstanding about how indexes work.
Indexes don't change the output of a query but the way query is processed by the database engine. So db.pets.find({},{"_id":0}) will always return the documents in natural order irrespective of whether there is an index or not.
Indexes will be used only when you make use of them in your query. Thus,
db.pets.find({name : "My pet's name"},{"_id":0}) and db.pets.find({}, {_id : 0}).sort({name : 1}) will use the {name : 1} index.
You should run explain on your queries to check if indexes are being used or not.
You may want to refer the documentation on how indexes work.
https://docs.mongodb.com/manual/indexes/
https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/

MongoDB: Indexes, Sorting

After having read the official documentations on indexes, sort, intersection, i'm a little bit confuse on how everything work together.
I've trouble making my query use the indexes i've created. I work on a mongodb 3.0.3, on a collection having ~4millions of document.
To simplify, let's say my document is composed of 6 fields:
{
a:<text>,
b:<boolean>,
c:<text>,
d:<boolean>,
e:<date>,
f:<date>
}
The query I want to achieve is the following :
db.mycoll.find({ a:"OK", b:true, c:"ProviderA", d:true, e:{ $gte:ISODate("2016-10-28T12:00:01Z"),$lt:ISODate("2016-10-28T12:00:02") } }).sort({f:1});
So intuitively I've created two indexes
db.mycoll.createIndex({a: 1, b: 1, c: 1, d:1, e:1 }, {background: true,name: "test1"})
db.mycoll.createIndex({f:1}, {background: true,name: "test2"})
But the explain() give me that the first index is not used at all.
I known there is some kind of limitation when there is ranges in play in the filter (in the e field), but I can't find my way around it.
Also instead of having a single index on f, I try a compound index on {e:1,f:1} but it didn't change anything.
So What I have misunderstood?
Thanks for your support.
Update: also I find some time the following predicate for mongodb 2.6 :
A good rule of thumb for queries with sort is to order the indexed fields in this order:
First, the field(s) on which you will query for exact values.
Second, the field(s) on which you will sort.
Finally, field(s) on which you will query for a range of values (e.g., $gt, $lt, $in)
An example of using this rule of thumb is in the section on “Sorting the results of a complex query on a range of values” below, including a link to further reading.
Does this also apply for 3.X version?
Update 2: following above predicate, I created the following index
db.mycoll.createIndex({a: 1, b: 1, c: 1, d:1 , f:1, e:1}, {background: true,name: "test1"})
And for the same query :
db.mycoll.find({ a:"OK", b:true, c:"ProviderA", d:true, e:{ $gte:ISODate("2016-10-28T12:00:01Z"),$lt:ISODate("2016-10-28T12:00:02") } }).sort({f:1});
the index is indeed used. However too much keys seems to be scan, I may need to find a better order the fields in the query/index.
Mongo acts sometimes a bit strange when it comes to the index selection.
Mongo automagically decides what index to use. The smaller an index is the more likely it is used (especially indexes with only one field) - this is my experience. May be this happens because it is more often already loaded in RAM? To find out what index to use when Mongo performs test queries when it is idle. However the result is sometimes unexpected.
Therefore if you know what index to use you can force a query to use a specific index using the $hint option. You should try that.
Your two indexes used in the query and the sort does not overlap so MongoDB can not use them for index intersection:
Index intersection does not apply when the sort() operation requires an index completely separate from the query predicate.

Mongo _id for subdocument array

I wish to add an _id as property for objects in a mongo array.
Is this good practice ?
Are there any problems with indexing ?
I wish to add an _id as property for objects in a mongo array.
I assume:
{
g: [
{ _id: ObjectId(), property: '' },
// next
]
}
Type of structure for this question.
Is this good practice ?
Not normally. _ids are unique identifiers for entities. As such if you are looking to add _id within a sub-document object then you might not have normalised your data very well and it could be a sign of a fundamental flaw within your schema design.
Sub-documents are designed to contain repeating data for that document, i.e. the addresses or a user or something.
That being said _id is not always a bad thing to add. Take the example I just stated with addresses. Imagine you were to have a shopping cart system and (for some reason) you didn't replicate the address to the order document then you would use an _id or some other identifier to get that sub-document out.
Also you have to take into consideration linking documents. If that _id describes another document and the properties are custom attributes for that document in relation to that linked document then that's okay too.
Are there any problems with indexing ?
An ObjectId is still quite sizeable so that is something to take into consideration over a smaller, less unique id or not using an _id at all for sub-documents.
For indexes it doesn't really work any different to the standard _id field on the document itself and a unique index across the field should work across the collection (scenario dependant, test your queries).
NB: MongoDB will not add an _id to sub-documents for you.

Add _id when ensuring index?

I am building a webapp using Codeigniter (PHP) and MongoDB.
I am creating indexes and have one question.
If I am querying on three fields (_id, status, type) and want to
create an index do I need to include _id when ensuring the index like this:
db.comments.ensureIndex({_id: 1, status : 1, type : 1});
or will this due?
db.comments.ensureIndex({status : 1, type : 1});
You would need to explicitly include _id in your ensureIndex call if you wanted to include it in your compound index. But because filtering by _id already provides selectivity of a single document that's very rarely the right thing to do. I think it would only make sense if your documents are very large and you're trying to use covered indexes.
MongoDB will currently only use one index per query with the exception of $or queries. If your common query will always be searching on those three fields (_id, status, type) then a compound index would be helpful.
From within the DB shell you can use the explain() command on your query to get information on the indexes used.
You don't need to implicitly create index on the _id field, it's done automatically. See the mongo documentation:
The _id Index
For all collections except capped collections, an index is automatically created for the _id field. This index is special and cannot be deleted. The _id index enforces uniqueness for its keys (except for some situations with sharding).