MongoDB multikeys on _id + some value - mongodb

In MongoDB I have a query which looks like this to find out for which comments the user has already voted:
db.comments.find({
_id: { $in: [...some ids...] },
votes.uid: "4fe1d64d85d4f4c00d000002"
});
As the documentation says you should have
One index per query
So what's better creating a multikey on _id + votes.uid or is it enough to just index on votes.uid because Mongo handles _id automatically in any way?

There is automatically an index on _id.
Depending of your queries (how many ids you have in the $in array) and your data, (how many votes you have on one object) you may create a index on votes.uid.
Take care of which index is used during query execution and remember you can force Mongo to use the index you want by adding .hints(field:1) or hints('indexname')

Related

MongoDB createIndex() when using populate() in my query

I have a query into my MongoDB database like this:
WineUniversal.find({"scoreTotal": {
"$gte": 50
}})
.populate([{
path: 'userWines',
model: 'Wine',
match: {$and: [{mode: {$nin: ['wishlist', 'cellar']}}, {scoreTotal: {$gte: 50}}] }
}])
Do I need to create an compound index for this query?
OR
Do I just need a single-item index, and then do a separate index for the other collection I am populating?
In MongoDB it is not possible to index across collections, so the most optimal solution would be to create different indexes for both collections you are joining.
PS: You could use explain to see the performance of your query and the indexes you add: https://www.mongodb.com/docs/manual/reference/command/explain/#mongodb-dbcommand-dbcmd.explain
mongoose's "populate" method just executes an additional find query behind the scenes into the other collection. it's not all "magically" executed as one query.
Now with this knowledge is very easy to decide what would be optimal.
For the first query you just need to make sure "WineUniversal" has an index for scoreTotal field.
For the second "populated" query assuming you use _id as the population field (which is standard) this field is already indexed.
mongoose creates a condition similar to this:
const idsToPopulate = [...object id list from wine matches];
const match = {$and: [{mode: {$nin: ['wishlist', 'cellar']}}, {scoreTotal: {$gte: 50}}] };
match._id = {$in: idsToPopulate}
const userWinesToPopulate = collection.find(match);
So as long as you're using _id then creating additional indexes could help performance, but in a very very negligible way as the _id index does the heavy lifting as it is.
If you're using a different field to populate yes, you should make sure userWines collection has a compound index on that field.

How to index and sorting with Pagination using custom field in MongoDB ex: name instead of id

https://scalegrid.io/blog/fast-paging-with-mongodb/
Example : {
_id,
name,
company,
state
}
I've gone through the 2 scenarios explained in the above link and it says sorting by object id makes good performance while retrieve and sort the results. Instead of default sorting using object id , I want to index for my own custom field "name" and "company" want to sort and pagination on this two fields (Both fields holds the string value).
I am not sure how we can use gt or lt for a name, currently blocked on how to resolve this to provide pagination when a user sort by name.
How to index and do pagination for two fields?
Answer to your question is
db.Example.createIndex( { name: 1, company: 1 } )
And for pagination explanation the link you have shared on your question is good enough. Ex
db.Example.find({name = "John", country = "Ireland"}). limit(10);
For Sorting
db.Example.find().sort({"name" = 1, "country" = 1}).limit(userPassedLowerLimit).skip(userPassedUpperLimit);
If the user request to fetch 21-30 first documents after sorting on Name then country both in ascending order
db.Example.find().sort({"name" = 1, "country" = 1}).limit(30).skip(20);
For basic understand of Indexing in MonogDB
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
Indexes are special data structures, that store a small portion of the collection’s data set in an easy to traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field.
Default _id Index
MongoDB creates a unique index on the _id field during the creation of a collection. The _id index prevents clients from inserting two documents with the same value for the _id field. You cannot drop this index on the _id field.
Create an Index
Syntax to execute on Mongo Shell
db.collection.createIndex( <key and index type specification>, <options> )
Ex:
db.collection.createIndex( { name: -1 } )
for ascending use 1,for descending use -1
The above rich query only creates an index if an index of the same specification does not already exist.
Index Types
MongoDB provides different index types to support specific types of data and queries. But i would like to mention 2 important types
1. Single Field
In addition to the MongoDB-defined _id index, MongoDB supports the creation of user-defined ascending/descending indexes on a single field of a document.
2. Compound Index
MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes.
The order of fields listed in a compound index has significance. For instance, if a compound index consists of { name: 1, company: 1 }, the index sorts first by name and then, within each name value, sorts by company.
Source for my understanding and answer and to know more about MongoDB indexing MongoDB Indexing

Using object as _id in MongoDb causes collscan on queries

I'm having some issues with using a custom object as my _id value in MongoDb.
The objects I'm storing in _id looks like this:
"_id" : {
"EDIEL" : "1010101010101",
"StartDateTicks" : NumberLong(636081120000000000)
}
Now, when I'm performing the following query:
.find({
"_id.EDIEL": { $eq: "1010101010101" },
"_id.StartDateTicks": { $gte: 636082776000000000, $lt: 636108696000000000 }
}).explain()
I does a COLLSCAN. I can't figure out why exactly. Is it because I'm not querying against the _id object with an object?
Does anyone know what I'm doing wrong here? :-)
Edit:
Tried to create a compound index containing the EDIEL and StartDateTicks fields, ran the query again and now it uses the index instead of a column scan. While this works, it would still be nice to avoid having the extra index and just having the _id (since it's basically a "free" index) So, the question still stands: why can't I query against the _id.EDIEL and _id.StartDateTicks and make use of the index?
Indexes are used on keys and not on objects, so when you use object for _id, the indexing on object can't be used for the specific query you do on the field of the object.
This is true not only for _id but subdocument also.
{
"name":"awesome book",
"detail" :{
"pages":375,
"alias" : "AB"
}
}
Now when you have index on detail and you query by detail.pages or detail.alias, the index on detail cannot be used and certainly not for range queries. You need to have indexes on detail.pages and detail.alias.
when index is applied on object it maintains the index of object as a whole and not per field, that's why queries on object fields are not able to use object indexes.
Hope that helps
You will need to index the two fields separately, since indexes cant be on embedded documents. Thus creating a compound index is the only option available, or creating multiple indexes on the fields which in turn use intersection index are the options for you.

how to build index in mongodb in this situation

I have a mongodb database, which has following fields:
{"word":"ipad", "date":20140113, "docid": 324, "score": 98}
which is a reverse index for a log of docs(about 120 millions).
there are two kinds of queries in my system:
one of which is :
db.index.find({"word":"ipad", "date":20140113}).sort({"score":-1})
this query fetch the word "ipad" in date 20140113, and sort the all docs by score.
another query is:
db.index.find({"word":"ipad", "date":20140113, "docid":324})
to speed up these two kinds of query, what index should I build?
Should I build two indexes like this?:
db.index.ensureIndex({"word":1, "date":1, "docid":1}, {"unique":true})
db.index.ensureIndex({"word":1, "date":1, "score":1}
but I think build the two index use two much hard disk space.
So do you have some good ideas?
You are sorting by score descending (.sort({"score":-1})), which means that your index should also be descending on the score-field so it can support the sorting:
db.index.ensureIndex({"word":1, "date":1, "score":-1});
The other index looks good to speed up that query, but you still might want to confirm that by running the query in the mongo shell followed with .explain().
Indexes are always a tradeoff of space and write-performance for read-performance. When you can't afford the space, you can't have the index and have to deal with it. But usually the write-performance is the larger concern, because drive space is usually cheap.
But maybe you could save one of the three indexes you have. "Wait, three indexes?" Yes, keep in mind that every collection must have an unique index on the _id field which is created implicitely when the collection is initialized.
But the _id field doesn't have to be an auto-generated ObjectId. It can be anything you want. When you have another index with an uniqueness-constraint and you have no use for the _id field, you can move that unique-constraint to the _id field to save an index. Your documents would then look like this:
{ _id: {
"word":"ipad",
"date":20140113,
"docid": 324
},
"score": 98
}

Add _id when ensuring index?

I am building a webapp using Codeigniter (PHP) and MongoDB.
I am creating indexes and have one question.
If I am querying on three fields (_id, status, type) and want to
create an index do I need to include _id when ensuring the index like this:
db.comments.ensureIndex({_id: 1, status : 1, type : 1});
or will this due?
db.comments.ensureIndex({status : 1, type : 1});
You would need to explicitly include _id in your ensureIndex call if you wanted to include it in your compound index. But because filtering by _id already provides selectivity of a single document that's very rarely the right thing to do. I think it would only make sense if your documents are very large and you're trying to use covered indexes.
MongoDB will currently only use one index per query with the exception of $or queries. If your common query will always be searching on those three fields (_id, status, type) then a compound index would be helpful.
From within the DB shell you can use the explain() command on your query to get information on the indexes used.
You don't need to implicitly create index on the _id field, it's done automatically. See the mongo documentation:
The _id Index
For all collections except capped collections, an index is automatically created for the _id field. This index is special and cannot be deleted. The _id index enforces uniqueness for its keys (except for some situations with sharding).