I have a collection with ~ 100k rows of very small objects - 3 fields of integers values and 3 dateTime.
The queries will be on 3 fields - around 3/sec.
How much benefit do i get for adding indexes on those fields?
Is it something that should be consider about or should i just add indexes on all heavily used queries ?
Thanks
Related
I am looking for ways to improve my aggregation queries. In that context, I was wondering if the number of fields that get grouped impacts the performance of the aggregation? Because in certain cases we group by more than 5 to 6 fields.
In other words, as the number of fields to group by in $group pipeline increases will there be any performance impact compared to let's say when grouped by 1 or 2 fields?
More details:
If the answer to this is yes, it could negatively impact performance, then I plan to create a new field let's call it "cocatenated_field" inside which I will string concatenate values of all the multiple fields and then use only this one field for grouping.
I am trying to sort a field, but I need to do it like how mysql does and specify the value,
instead of field number sory by -1, 4 3 2 1.
I need to sort it by 2,1,3,4.
is this possible in mongodb?
This is not possible in MongoDB. MongoDB doesn't know which and how many fields exist in a particular document unless the document is retrieved.
Hence MongoDB practically cannot give numbers to specific fields.
MongoDB is a JSON-style data store. The documents stored in the
database can have varying sets of fields, with different types for
each field.
In Databases the number and types of the column is fixed by particular table definition and it's easier for a database system to give numbers to the columns.
How costly is it to index some fields in MongoDB,
I have a table where i want uniqueness combining two fields, Every where i search they suggested compound index with unique set to true. But what i was doing is " Appending both field1_field2 and making it a key, so that field2 will be always unique for field1.(and add Application logic) As i thought indexing is costly.
And also as MongoDB documentation advices us not to use Custom Object ID like auto incrementing number, I end up giving big numbers to Models like Classes, Students etc, (where i could have used easily used 1,2,3 in sql lite), I didn't think to add a new field for numbering and index that field for querying.
What are the best practices advice for production
The advantage of using compound indexes vs your own indexed field system is that compound indexes allows sorting quicker than regular indexed fields. It also lowers the size of every documents.
In your case, if you want to get the documents sorted with values in field1 ascending and in field2 descending, it is better to use a compound index. If you only want to get the documents that have some specific value contained in field1_field2, it does not really matter if you use compound indexes or a regular indexed field.
However, if you already have field1 and field2 in seperate fields in the documents, and you also have a field containing field1_field2, it could be better to use a compound index on field1 and field2, and simply delete the field containing field1_field2. This could lower the size of every document and ultimately reduce the size of your database.
Regarding the cost of the indexing, you almost have to index field1_field2 if you want to go down that route anyways. Queries based on unindexed fields in MongoDB are really slow. And it does not take much more time adding a document to a database when the document has an indexed field (we're talking 1 millisecond or so). Note that adding an index on many existing documents can take a few minutes. This is why you usually plan the indexing strategy before adding any documents.
TL;DR:
If you have limited disk space or need to sort the results, go with a compound index and delete field1_field2. Otherwise, use field1_field2, but it has to be indexed!
We have two types of high-volume queries. One looks for docs involving 5 attributes: a date (lte), a value stored in an array, a value stored in a second array, one integer (gte), and one float (gte).
The second includes these five attributes plus two more.
Should we create two compound indices, one for each query? Assume each attribute has a high cardinality.
If we do, because each query involves multiple arrays, it doesn't seem like we can create an index because of Mongo's restriction. How do people structure their Mongo databases in this case?
We're using MongoMapper.
Thanks!
Indexes for queries after the first ranges in the query the value of the additional index fields drops significantly.
Conceptually, I find it best to think of the addition fields in the index pruning ever smaller sub-trees from the query. The first range chops off a large branch, the second a smaller, the third smaller, etc. My general rule of thumb is only the first range from the query in the index is of value.
The caveat to that rule is that additional fields in the index can be useful to aid sorting returned results.
For the first query I would create a index on the two array values and then which ever of the ranges will exclude the most documents. The date field is unlikely to provide high exclusion unless you can close the range (lte and gte). The integer and float is hard to tell without knowing the domain.
If the second query's two additional attributes also use ranges in the query and do not have a significantly higher exclusion value then I would just work with the one index.
Rob.
Is MongoDB a good fit when there are several combinations of columns used for querying , thus creating indexes on all of the columns is not feasible? How does MongoDB performs when, say, you have no index on the column and you have millions of entries for that column?
If you have no index, a table scan is performed, as with any database system.
If the documents are in memory this will still be relatively fast but still take a given amount of time based on the number of documents in the collection as the database must look at each one. O(n)
Is the problem that you have a small set of varying keys per document or a large numer of keys that every document must have?
Column oriented datastores must store a large amount of columns to model varying attributes but mongodb is more flexible because of the document data model.
If you have documents that have a small number of varying attributes (out of a large set of attributes), this is indexable and will be O(logn).
Your documents would look like this:
{
"name":"some name",
"attrs":[
{"n":"subject","v":"the subject"},
{"n":"description","v":"Some amazing description"},
{"n":"comments","v":"Comments on this thing"},
]
}
Be indexible like this:
db.mycollection.ensureIndex({"attrs.n":1, "attrs.v":1})
and be queryable like this:
db.mycollection.find({attrs: {$elemMatch: {n: "subject", v: "the subject"}}})