I wonder, does the order of conditions in an $or query matter?
E.g. can this query be relied on to return either a document where role matches userrole, or, only if no document with userrole is found, any other document?
{$or: [{role: 'userrole'},{}]}, {limit: 1}
Order does not matter.
Since the $or operator gets evaluated document-wise, the query
{$or: [{role: 'userrole'},{}]}
will evaluate to true on each document, hence it will always return every document in the collection.
If, in addition, you use the limit() method, then the first n documents in the collection (according to their internal ordering) will be returned.
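To see this concretely, here is a minimal sketch (plain Node, with a hand-rolled matcher standing in for MongoDB's per-document evaluation; the documents are made up) showing that the empty clause {} matches every document, so the $or is always true:

```javascript
// Simulate MongoDB's per-document evaluation of a simple $or filter.
// An empty clause {} imposes no condition, so it matches every document.
function matchesClause(doc, clause) {
  return Object.entries(clause).every(([field, value]) => doc[field] === value);
}

function matchesOr(doc, clauses) {
  return clauses.some((clause) => matchesClause(doc, clause));
}

const docs = [
  { _id: 1, role: 'userrole' },
  { _id: 2, role: 'admin' },
];

const filter = [{ role: 'userrole' }, {}];

// Every document matches, because {} is vacuously true:
console.log(docs.filter((d) => matchesOr(d, filter)).length); // 2
```

Whichever document limit(1) then returns depends only on the scan order, not on the order of the $or clauses.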
Related
Mongo's documentation on $or operator says:
When evaluating the clauses in the $or expression, MongoDB either performs a collection scan or, if all the clauses are supported by indexes, MongoDB performs index scans. That is, for MongoDB to use indexes to evaluate an $or expression, all the clauses in the $or expression must be supported by indexes. Otherwise, MongoDB will perform a collection scan.
So effectively, if you want the query to be efficient, both of the attributes used in the $or condition should be indexed.
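For example (mongosh; the collection and field names here are assumptions for illustration), each clause of the $or would need its own index before MongoDB can answer the query with index scans:

```javascript
// Every $or clause must be index-supported, or MongoDB falls back
// to a collection scan for the whole query.
db.users.createIndex({ email: 1 });
db.users.createIndex({ username: 1 });
```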
However, I'm not sure if this applies to "findOne" operations as well since I can't use Mongo's explain functionality for a findOne operation. It seems logical to me that if you only care about returning one document, you would check an indexed condition first, since you could bail after just finding one without needing to care about the non-indexed fields.
Example
Let's pretend in the below that "email" is indexed, but username is not. It's a contrived example, but bear with me.
db.users.findOne({
  $or: [
    { email: 'someUser@gmail.com' },
    { username: 'some-user' }
  ]
})
Would the above use the email index, or would it do a full collection scan until it finds a document that matches the criteria?
I could not find anything documenting what would be expected here. Does someone know the answer?
Well I feel a little silly - it turns out that the docs I found saying you can only run explain on "find" may have been just referring to when you're in MongoDB Compass.
I ran the below (userEmail not indexed)
this.userModel
.findOne()
.where({ $or: [{ _id: id }, { userEmail: email }] })
.explain();
this.userModel
.getModelInstance()
.findOne()
.where({ _id: id })
.explain();
...and found that the first does do a COLLSCAN and the second's plan was IDHACK (basically it does a normal ID lookup).
When running performance tests, I can see that the first is about 4x slower in a collection with 20K+ documents (it depends on where in the natural order the document you're finding sits).
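For anyone reading the explain output: the interesting bit is the winning plan's bottom-most stage. A small sketch (plain Node; the explain documents below are hand-made, heavily simplified shapes, not real server output):

```javascript
// Walk a (simplified) explain document down to the bottom-most stage,
// which tells you whether the plan was a COLLSCAN, IXSCAN, IDHACK, etc.
function bottomStage(plan) {
  return plan.inputStage ? bottomStage(plan.inputStage) : plan.stage;
}

const collscanExplain = {
  queryPlanner: { winningPlan: { stage: 'COLLSCAN' } },
};
const indexedExplain = {
  queryPlanner: {
    winningPlan: { stage: 'FETCH', inputStage: { stage: 'IXSCAN' } },
  },
};

console.log(bottomStage(collscanExplain.queryPlanner.winningPlan)); // COLLSCAN
console.log(bottomStage(indexedExplain.queryPlanner.winningPlan));  // IXSCAN
```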
I have a query into my MongoDB database like this:
WineUniversal.find({ "scoreTotal": { "$gte": 50 } })
  .populate([{
    path: 'userWines',
    model: 'Wine',
    match: { $and: [{ mode: { $nin: ['wishlist', 'cellar'] } }, { scoreTotal: { $gte: 50 } }] }
  }])
Do I need to create a compound index for this query?
OR
Do I just need a single-field index, and then a separate index for the other collection I am populating?
In MongoDB it is not possible to index across collections, so the best solution is to create separate indexes for the two collections you are joining.
PS: You could use explain to see the performance of your query and the indexes you add: https://www.mongodb.com/docs/manual/reference/command/explain/#mongodb-dbcommand-dbcmd.explain
mongoose's "populate" method just executes an additional find query behind the scenes against the other collection; it's not all "magically" executed as one query.
With this knowledge it is easy to decide what would be optimal.
For the first query you just need to make sure "WineUniversal" has an index for scoreTotal field.
For the second "populated" query, assuming you use _id as the population field (which is standard), this field is already indexed.
mongoose creates a condition similar to this:
const idsToPopulate = [...object id list from wine matches];
const match = {$and: [{mode: {$nin: ['wishlist', 'cellar']}}, {scoreTotal: {$gte: 50}}] };
match._id = {$in: idsToPopulate}
const userWinesToPopulate = collection.find(match);
So as long as you're using _id, creating additional indexes could help performance, but only in a very negligible way, as the _id index does the heavy lifting as it is.
If you're populating via a different field then yes, you should make sure the userWines collection has an index on that field.
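To make the two-query behaviour concrete, here is a sketch (plain Node; buildPopulateFilter is a made-up helper illustrating the shape of the second query, not mongoose API):

```javascript
// mongoose's populate() runs a second find() whose filter is the
// user-supplied match plus an $in on the foreign field (_id by default).
function buildPopulateFilter(foreignField, idsToPopulate, match) {
  return { ...match, [foreignField]: { $in: idsToPopulate } };
}

const match = {
  $and: [
    { mode: { $nin: ['wishlist', 'cellar'] } },
    { scoreTotal: { $gte: 50 } },
  ],
};

const filter = buildPopulateFilter('_id', ['id1', 'id2'], match);
console.log(JSON.stringify(filter._id)); // {"$in":["id1","id2"]}
```

Because the $in lands on _id, the default _id index already serves the populate step.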
I'm using MongoDB 2.6.8. According to the $maxTimeMS reference, these two queries should behave identically:
> db.collection.find({}).maxTimeMS(100)
> db.collection.find({}, {$maxTimeMS: 100})
The first query does exactly what I want, but the second only returns the object IDs of the documents. I tried increasing $maxTimeMS to 100000000 and there was no change in behavior.
Why am I getting different results for these two commands?
You found a bug in the documentation.
The reason that db.collection.find({}, {$maxTimeMS: 100}) returns only the _id of each object is because mongoDB is interpreting the {$maxTimeMS: 100} portion of the query as a projection.
So it thinks you want to see all the documents, and you want to see the _id field and the $maxTimeMS field. Of course, none of your documents have a $maxTimeMS field, so they only show the _id.
The proper way to perform the query you want without the shortcut is:
db.collection.find({ $query: {}, $maxTimeMS: 100 })
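A sketch of why the shortcut misbehaves (plain Node; project is a simplified stand-in for MongoDB's inclusion-projection logic): the second argument to find() is a projection, so {$maxTimeMS: 100} is read as "include the field named $maxTimeMS", and _id is included by default:

```javascript
// Simplified simulation of an inclusion projection: keep _id (default)
// plus any field whose projection value is truthy.
function project(doc, projection) {
  const out = { _id: doc._id };
  for (const [field, include] of Object.entries(projection)) {
    if (include && field in doc) out[field] = doc[field];
  }
  return out;
}

const doc = { _id: 1, name: 'a', value: 42 };

// No document has a field literally named "$maxTimeMS",
// so only _id survives the projection:
console.log(project(doc, { $maxTimeMS: 100 })); // { _id: 1 }
```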
I would like to only return a subdocument if its expire field is in the future. My problem is that I would like to get the root document anyway.
I am currently using:
{"sub.expire": {"$gte": now}}
but it only returns the root document if the condition on sub.expire matches.
How would I return the root document regardless of whether sub gets returned or not?
Thanks
This isn't quite how MongoDB is designed. You're phrasing it as though you're querying a bunch of subdocuments and returning them conditionally, while returning the parent documents unconditionally; but as far as MongoDB is concerned there isn't a collection of subdocuments, there's a collection of documents, each of which has fields, some of which may be subdocuments. When you say you're "using" this condition on a subdocument -- presumably with find() -- what you're really asking the collection for is documents which satisfy this condition in a subdocument. So documents that don't satisfy that condition aren't returned.
From MongoDB's point of view, what you actually want is closer to, return every document in the collection, but with a certain field (a subdocument, as it happens) stripped out if it's expired, which falls under "projection". There are two ways projection happens in MongoDB, and I'm afraid neither is exactly what you want:
find() lets you project the fields to return, but that's limited to: 1 (do), 0 (don't), or a few special operators if the field is an array, which a subdocument is not.
Aggregation gives you a $project operator to transform the results, but doesn't let you disappear this field (subdocument) entirely: if it has an expiration date in the past, you can replace it with null or (probably better) an empty document, but that's as close as you can get.
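As a sketch of the aggregation option (plain Node, simulating the result a $project with a $cond would produce; the sub and expire names are taken from the question): the expired subdocument is replaced, not removed:

```javascript
// Simulate {$project: {sub: {$cond: [{$gte: ['$sub.expire', now]},
//                                    '$sub', null]}}}:
// keep the subdocument while it is unexpired, otherwise emit null.
function projectSub(doc, now) {
  const sub = doc.sub && doc.sub.expire >= now ? doc.sub : null;
  return { _id: doc._id, sub };
}

const now = 100;
const fresh = { _id: 1, sub: { expire: 150, data: 'keep me' } };
const stale = { _id: 2, sub: { expire: 50, data: 'expired' } };

console.log(projectSub(fresh, now).sub); // { expire: 150, data: 'keep me' }
console.log(projectSub(stale, now).sub); // null
```

Both root documents come back either way; only the expired subdocument is blanked out.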
Suppose I have a query that looks something like this:
db.things.find({
  deleted: false,
  type: 'thing',
  $or: [
    { 'creator._id': someid },
    { 'parent._id': someid },
    { 'somerelation._id': someid }
  ]
}).sort({
  'date.created': -1
})
That is, I want to find documents that meet one of those three conditions and sort them by newest. However, $or queries do not use indexes in parallel when used with a sort. Thus, how would I index this query?
http://docs.mongodb.org/manual/core/indexes/#index-behaviors-and-limitations
You can assume the following selectivity:
deleted - 99%
type - 25%
creator._id, parent._id, somerelation._id - < 1%
Now you are going to need more than one index for this query; there is no doubt about that.
The question is what indexes?
Now you have to take into consideration that none of your $or clauses will be able to sort their data optimally using the index, due to a bug in MongoDB's query optimizer: https://jira.mongodb.org/browse/SERVER-1205 .
So you know that the $or will have some performance problems with a sort, and that putting the sort field into the $or clause indexes is useless at the moment.
So, considering this, the first index you want is one that covers the base query you are making. As @Leonid said, you could make this into a compound index; however, I would not order it the way he has. Instead, I would do:
db.col.ensureIndex({type: -1, deleted: -1, 'date.created': -1})
I am very unsure about the deleted field being in the index at all, due to its super low selectivity; keeping it in the index could, in fact, make the operation less performant than leaving it out (this is true for most databases, including SQL). This part will need testing by you; maybe the field should be last (?).
As to the order of the index, again I have just guessed. I have said DESC for all fields because your sort is DESC, but you will need to explain this yourself here.
So that should be able to handle the master clause of your query. Now to deal with those $ors.
Each $or clause will use an index separately, and the MongoDB query optimizer will look for indexes for them separately too, as though they were separate queries altogether. One snag worth noting about compound indexes ( http://docs.mongodb.org/manual/core/indexes/#compound-indexes ) is that they work upon prefixes ( an example note here: http://docs.mongodb.org/manual/core/indexes/#id5 ), so you can't make one single compound index to cover all three clauses. A more optimal way to declare indexes for the $or (considering the bug above) is:
db.col.ensureIndex({'creator._id': 1});
db.col.ensureIndex({'parent._id': 1});
db.col.ensureIndex({'somerelation._id': 1});
That should get you started on making optimal indexes for your query.
I should stress however that you need to test this yourself.
MongoDB can generally use only one index per query, so I can't see a way to use indexes to query someid across three different fields in your model.
So, the best approach is to add special field for this task:
ids = [creator._id, parent._id, somerelation._id]
In this case you'll be able to query without using $or operator:
db.things.find({
deleted: false,
type: 'thing',
ids: someid
}).sort({
'date.created': -1
})
In this case your index will look something like this:
{deleted:1, type:1, ids:1, 'date.created': -1}
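A sketch of maintaining that denormalized field on write (plain Node; buildIds is a made-up helper): collect the three ids into one array, so a single multikey index covers all of them:

```javascript
// Denormalize the three relation ids into one array field so a single
// multikey index on ids can serve what previously needed $or.
function buildIds(doc) {
  return [doc.creator._id, doc.parent._id, doc.somerelation._id];
}

const thing = {
  creator: { _id: 'c1' },
  parent: { _id: 'p1' },
  somerelation: { _id: 'r1' },
};

console.log(buildIds(thing)); // [ 'c1', 'p1', 'r1' ]
// Stored on the document, the query becomes
// {deleted: false, type: 'thing', ids: someid} with no $or at all.
```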
If you have the flexibility to adjust the schema, I would suggest adding a new field, associatedIds: [], which would hold creator._id, parent._id, somerelation._id - you can update that field atomically when you update the main corresponding field, but now you can have a compound index on this field, type, and created_date, which eliminates the need for $or in your query entirely.
Considering your requirement for indexing, I would suggest using the $orderby operator alongside your $or query. By that I mean you should index the criteria in your $or expressions and then use $orderby to sort the result.
For example:
db.things.find({
  $query: {
    deleted: false,
    type: 'thing',
    $or: [
      { 'creator._id': someid },
      { 'parent._id': someid },
      { 'somerelation._id': someid }
    ]
  },
  $orderby: { 'date.created': -1 }
})
The above query would require a compound index on each of the fields in the $or expressions, combined with the sort key specified in the $orderby criteria.
for example:
db.things.ensureIndex({'parent._id': 1, 'date.created': -1})
and so on for other fields.
It is good practice to specify a "limit" for the result to prevent MongoDB from performing a huge in-memory sort.
Read more on the $orderby operator here