MongoDB query with an 'or' condition - mongodb

So I have an embedded document that tracks group memberships. Each embedded document has an ID pointing to the group in another collection, a start date, and an optional expire date.
I want to query for current members of a group. "Current" means the start time is less than the current time, and the expire time is greater than the current time OR null.
This conditional query is totally blocking me up. I could do it by running two queries and merging the results, but that seems ugly and requires loading in all results at once. Or I could default the expire time to some arbitrary date in the far future, but that seems even uglier and potentially brittle. In SQL I'd just express it with "(expires >= Now()) OR (expires IS NULL)" -- but I don't know how to do that in Mongo.
Any ideas? Thanks very much in advance.

Just thought I'd update in-case anyone stumbles across this page in the future. As of 1.5.3, mongo now supports a real $or operator: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24or
Your query of "(expires >= Now()) OR (expires IS NULL)" can now be rendered as:
{$or: [{expires: {$gte: new Date()}}, {expires: null}]}

In case anyone finds it useful, www.querymongo.com does translation between SQL and MongoDB, including OR clauses. It can be really helpful for figuring out syntax when you know the SQL equivalent.
In the case of OR statements, it looks like this
SQL:
SELECT * FROM collection WHERE columnA = 3 OR columnB = 'string';
MongoDB:
db.collection.find({
"$or": [{
"columnA": 3
}, {
"columnB": "string"
}]
});

MongoDB query with an 'or' condition
db.getCollection('movie').find({$or:[{"type":"smartreply"},{"category":"small_talk"}]})
MongoDB query with an 'or', 'and', condition combined.
db.getCollection('movie').find({"applicationId":"2b5958d9629026491c30b42f2d5256fa8",$or:[{"type":"smartreply"},{"category":"small_talk"}]})

Query objects in Mongo by default AND expressions together. Mongo currently does not include an OR operator for such queries, however there are ways to express such queries.
Use "in" or "where".
Its gonna be something like this:
db.mycollection.find( { $where : function() {
return ( this.startTime < Now() && this.expireTime > Now() || this.expireTime == null ); } } );

db.Lead.find(
{"name": {'$regex' : '.*' + "Ravi" + '.*'}},
{
"$or": [{
'added_by':"arunkrishna#aarux.com"
}, {
'added_by':"aruna#aarux.com"
}]
}
);

Using a $where query will be slow, in part because it can't use indexes. For this sort of problem, I think it would be better to store a high value for the "expires" field that will naturally always be greater than Now(). You can either store a very high date millions of years in the future, or use a separate type to indicate never. The cross-type sort order is defined at here.
An empty Regex or MaxKey (if you language supports it) are both good choices.

Related

Mongodb $elemMatch get documents matching empty array [duplicate]

So I have an embedded document that tracks group memberships. Each embedded document has an ID pointing to the group in another collection, a start date, and an optional expire date.
I want to query for current members of a group. "Current" means the start time is less than the current time, and the expire time is greater than the current time OR null.
This conditional query is totally blocking me up. I could do it by running two queries and merging the results, but that seems ugly and requires loading in all results at once. Or I could default the expire time to some arbitrary date in the far future, but that seems even uglier and potentially brittle. In SQL I'd just express it with "(expires >= Now()) OR (expires IS NULL)" -- but I don't know how to do that in Mongo.
Any ideas? Thanks very much in advance.
Just thought I'd update in-case anyone stumbles across this page in the future. As of 1.5.3, mongo now supports a real $or operator: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24or
Your query of "(expires >= Now()) OR (expires IS NULL)" can now be rendered as:
{$or: [{expires: {$gte: new Date()}}, {expires: null}]}
In case anyone finds it useful, www.querymongo.com does translation between SQL and MongoDB, including OR clauses. It can be really helpful for figuring out syntax when you know the SQL equivalent.
In the case of OR statements, it looks like this
SQL:
SELECT * FROM collection WHERE columnA = 3 OR columnB = 'string';
MongoDB:
db.collection.find({
"$or": [{
"columnA": 3
}, {
"columnB": "string"
}]
});
MongoDB query with an 'or' condition
db.getCollection('movie').find({$or:[{"type":"smartreply"},{"category":"small_talk"}]})
MongoDB query with an 'or', 'and', condition combined.
db.getCollection('movie').find({"applicationId":"2b5958d9629026491c30b42f2d5256fa8",$or:[{"type":"smartreply"},{"category":"small_talk"}]})
Query objects in Mongo by default AND expressions together. Mongo currently does not include an OR operator for such queries, however there are ways to express such queries.
Use "in" or "where".
Its gonna be something like this:
db.mycollection.find( { $where : function() {
return ( this.startTime < Now() && this.expireTime > Now() || this.expireTime == null ); } } );
db.Lead.find(
{"name": {'$regex' : '.*' + "Ravi" + '.*'}},
{
"$or": [{
'added_by':"arunkrishna#aarux.com"
}, {
'added_by':"aruna#aarux.com"
}]
}
);
Using a $where query will be slow, in part because it can't use indexes. For this sort of problem, I think it would be better to store a high value for the "expires" field that will naturally always be greater than Now(). You can either store a very high date millions of years in the future, or use a separate type to indicate never. The cross-type sort order is defined at here.
An empty Regex or MaxKey (if you language supports it) are both good choices.

Order of Fields in Mongo Query vs Ordered Checked In

Say you're querying documents based on 2 data points. One is a simple bool parameter, and the other is a complicated $geoWithin calculation.
db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} )
Will mongo reorder these parameters, so that it checks the boolField 1st, before running the complicated check?
MongoDB uses indexes like any other DBs. So the important thing for mongoDB is if any query fields has an index or not, not the order of query fields. At least there is no information in their documentation that mongoDB try to checks primitive query fields first. So for your example if boolField has an index mongoDB first check this field and eliminate documents whose boolField is false. But If geoField has an index then mongoDB first execute query on this field.
So what happens if none of them have index or both of them have? It should be the given order of fields in query because there is no suggestion or info beside of indexes in query optimization page of mongoDB. Additionally you can always test your queries performances with just adding .explain("executionStats").
So check the performance of db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} ) and db.collection.find( { "boolField" : true, "geoField": { "$geoWithin" : ...} } ). And let us know :)
To add to above response, if you want mongo to use specific index you can use cursor.hint . This https://docs.mongodb.com/manual/core/query-plans/ explains how default index selection is done.

How to index an $or query with sort

Suppose I have a query that looks something like this:
db.things.find({
deleted: false,
type: 'thing',
$or: [{
'creator._id': someid
}, {
'parent._id': someid
}, {
'somerelation._id': someid
}]
}).sort({
'date.created': -1
})
That is, I want to find documents that meets one of those three conditions and sort it by newest. However, $or queries do not use indexes in parallel when used with a sort. Thus, how would I index this query?
http://docs.mongodb.org/manual/core/indexes/#index-behaviors-and-limitations
You can assume the following selectivity:
deleted - 99%
type - 25%
creator._id, parent._id, somerelation._id - < 1%
Now you are going to need more than one index for this query; there is no doubt about that.
The question is what indexes?
Now you have to take into consideration that none of your $ors will be able to sort their data cardinally in an optimal manner using the index due to a bug in MongoDBs query optimizer: https://jira.mongodb.org/browse/SERVER-1205 .
So you know that the $or will have some performance problems with a sort and that putting the sort field into the $or clause indexes is useless atm.
So considering this the first index you want is one that covers the base query you are making. As #Leonid said you could make this into a compound index, however, I would not do it the order he has done it. Instead, I would do:
db.col.ensureIndex({type:-1,deleted:-1,date.created:-1})
I am very unsure about the deleted field being in the index at all due to its super low selectivity; it could, in fact, create a less performant operation (this is true for most databases including SQL) being in the index rather than being taken out. This part will need testing by you; maybe the field should be last (?).
As to the order of the index, again I have just guessed. I have said DESC for all fields because your sort is DESC, but you will need to explain this yourself here.
So that should be able to handle the master clause of your query. Now to deal with those $ors.
Each $or will use an index separately, and the MongoDB query optimizer will look for indexes for them separately too as though they are separate queries altogether, so something worth noting here is a little snag about compound indexes ( http://docs.mongodb.org/manual/core/indexes/#compound-indexes ) is that they work upon prefixes ( an example note here: http://docs.mongodb.org/manual/core/indexes/#id5 ) so you can't make one single compound index to cover all three clauses, so a more optimal method of declaring indexes on the $or (considering the bug above) is:
db.col.ensureindex({creator._id:1});
db.col.ensureindex({aprent._id:1});
db.col.ensureindex({somrelation._id:1});
It should be able to get you started on making optimal indexes for your query.
I should stress however that you need to test this yourself.
Mongodb can use only one index per query, so I can't see the way to use indexes to query someid in your model.
So, the best approach is to add special field for this task:
ids = [creator._id, parent._id, somerelation._id]
In this case you'll be able to query without using $or operator:
db.things.find({
deleted: false,
type: 'thing',
ids: someid
}).sort({
'date.created': -1
})
In this case your index will look something like this:
{deleted:1, type:1, ids:1, 'date.created': -1}
If you had flexibility to adjust the schema, I would suggest adding a new field, associatedIds : [ ] which would hold creator._id, parent._id, some relation._id - you can update that field atomically when you update the main corresponding field, but now you can have a compound index on this field, type and created_date which eliminates the need for $or in your query entirely.
Considering your requirement for indexing , I would suggest you to use $orderBy operator along side your $or query. By that I mean you should be able to index on the criteria's in your $or expressions used in your $or query and then you can $orderBy to sort the result.
For example:
db.things.find({
deleted: false,
type: 'thing',
$or: [{
'creator._id': someid
}, {
'parent._id': someid
}, {
'somerelation._id': someid
}]
},{$orderBy:{'date.created': -1}})
The above query would require compound indexes on each of the fields in the $or expressions combined with the sort object specified in the orderBy criteria.
for example:
db.things.ensureIndex{'parent._id': 1,"date.created":-1}
and so on for other fields.
It is a good practice to specify "limit" for the result to prevent mongodb from performing a huge in memory sort.
Read More on $orderBy operator here

How to optimize this query with $and and $or

Coming from SQL I have this search condition
WHERE (col1 LIKE "%foo%" OR col2 LIKE "%foo%") AND
(col1 LIKE "%bar%" OR col2 LIKE "%bar%")
which I want to convert to MongoDB.
I came up wit this, hopefully semantically identical query:
{
$and: [
{
$or: [
{ col1: /.*foo.*/ },
{ col2: /.*foo.*/ }
]
},
{
$or: [
{ col1: /.*bar.*/ },
{ col2: /.*bar.*/ }
]
}
]
}
Is this the correct way or can it be improved?
Any suggestions about indexes (if they can be used at all)?
Yes, that's the correct way to implement that query for MongoDB.
If you want an index to fully assist the query, it needs to be a compound index that includes both fields, because MongoDB queries can only use one index per query. So an index like this:
db.coll.ensureIndex({col1: 1, col2: 1})
You can confirm your query is using the index you expect by using explain().
and and or queries are optimised in different ways. Quoting directly from 50 tips and tricks MongoDB:
Tip #27: AND-queries should match as little as possible as fast
as possible
and
Tip #28: OR-queries should match as much as possible as soon
as possible
So depending on the complexity and flexibility of your actual queries, you should make foo and bar more generic, yet try to limit the results from both $or statements.
Hope it helps
You have three problems here:
$and is two evaluated queries
$or is two evaluated queries
None pre-fixed regex
$and does not use a multiple index plan, only a single index plan however $or does as such using a single compound index here will not help that much.
In fact no index will help since MongoDB cannot use indexes for none pre-prefixed regexs.
So actually adding any index at all would be useless to this query.
There is no way to optimise this query in it's current form.
Normally a good way of doing searches like this is to split out the words you are searching for into a subdocument of words that can be searched on directly and indexed. Take a look at: http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo for examples.
When using this you would probably house one index on the array of words and that's it. MongoDB should be able to use that index for the entire query (twice over for the $or).
edit
Also an $or works different to how you have posted.
You don't need the $and and the two $ors. You only need one $or with a multi condition:
Edit again
I decided to take out the $or altogether. Note this uses the index unfriendly regex but still it is an interesting concept.
{ col1: /.*(foo|bar).*/ , col2: /.*(bar|foo).*/ }
If you want to use index friendly stuff then you will need to change the way your query works entirely as described above.

In MongoDB how do you query for records that contain ONLY certain fields and no others

In MongoDB,
To query for records that contain certain fields you can do:
collection.find({'field_name1': {'$exists': true}})
And that will return any record that has the 'field_name1' field set...
But how do you query mongo to find records that contains ONLY 'field_name1' (and no other fields)? I'd like to be able to do this for, say, a list of fields.
The sad answer, as you'll often find with MongoDB and other NoSQL databases is probably that it would be best to structure your data in a way that allows you to query it as simply as possible.
That said, there are ways of doing this, but as far as I know, it requires you execute JavaScript server side. This will be slow, and cannot possibly take advantage of indexes and other logical features of MongoDB, so use it only if it's absolutely necessary, if performance is at all important.
So, the easiest way to do this, is probably to create a function that returns the number of fields in an object, which we can use with the $where query syntax. This allows you to run arbitrary JavaScript queries against your data, and can be combined with normal syntax queries.
Sadly, my JavaScript-fu is a little weak, so I don't know how (or if) you can get at the count of members of an object in JS in a one-liner, so to do this, I would store a function server side.
From the mongo shell, execute the following:
db.system.js.save(
{
"_id" : "countFields",
"value" : function(x) { i=0; for(p in x) { i++; } return i}
}
)
With that, you have a saved JavaScript function, server side, called countFields that returns the number of elements in an object. Now, you need to execute your find-operation with the $where query:
db.collection.find({
'field_name1': {'$exists': True},
'$where' : 'countFields(this)==2'
})
This would give you only the documents that meet both the $exists condition, and the $where clause. Note that I'm comparing with 2 in the example, since the countFields function counts _id as a field.