I ran the following two queries and they returned different results.
// my query 1
> db.events.count({"startTimeUnix":{$lt:1533268800000},"startTimeUnix":{$gte:1533182400000}})
131
// existing app query 2
> db.events.count({"startTimeUnix":{"$lt":1533268800000,"$gte":1533182400000}})
0
Query 2 is already used in the batch application, which was reported to be pulling fewer records than expected. These queries confirm that.
//these counts are confusing
> db.events.count()
2781
> db.events.count({"startTimeUnix":{$lt:1533268800000}})
361
> db.events.count({"startTimeUnix":{$gte:1533182400000}})
2780
Use the second query. You can add explain() to find out the query plans. The first query
db.events.count({"startTimeUnix":{$lt:1533268800000},"startTimeUnix":{$gte:1533182400000}})
is evaluated the same as
db.events.explain().count({"startTimeUnix":{$gte:1533182400000}})
Use the command below to view the query plans.
db.events.explain().count({"startTimeUnix":{$lt:1533268800000},"startTimeUnix":{$gte:1533182400000}})
Query 2 is the implicit (and proper) way of building an AND condition; query 1 is incorrect in terms of MongoDB syntax. The way it gets analyzed is pretty simple: MongoDB takes the first condition and then overrides it with the second one, so it has the same meaning as:
db.events.count({"startTimeUnix":{$gte:1533182400000}})
The first condition simply gets ignored, and that's why you're getting more results (described here).
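The override can be reproduced without a database, since the shell reads the query as a JavaScript object literal and a repeated key keeps only its last value. A minimal sketch in plain JavaScript; the explicit $and form at the end is shown as one way to keep both conditions as separate entries:

```javascript
// Query 1 as typed: the same key appears twice in one object literal.
const query1 = {
  startTimeUnix: { $lt: 1533268800000 },
  startTimeUnix: { $gte: 1533182400000 },
};

// Only one startTimeUnix key survives, holding the last value written.
console.log(JSON.stringify(query1));
// {"startTimeUnix":{"$gte":1533182400000}}

// Explicit form: each condition is its own array element, so nothing is overwritten.
const explicitAnd = {
  $and: [
    { startTimeUnix: { $lt: 1533268800000 } },
    { startTimeUnix: { $gte: 1533182400000 } },
  ],
};
console.log(explicitAnd.$and.length); // 2
```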
The problem is that mongo doesn't parse operators if they are in quotes.
db.events.count({"startTimeUnix":{"$lt":1533268800000,"$gte":1533182400000}})
means that it looks for the entries where startTimeUnix is an object and contains fields "$lt" and "$gte"
If you run the next command, this query starts returning 1:
db.events.insert({"startTimeUnix":{"$lt":1533268800000,"$gte":1533182400000}})
Related
I have two simple queries on a collection of 22 millions documents.
query 1:
db.audits.find({"w.em": /^name.lastname/i})
It returns in less than 1 second.
query 2:
db.audits.find({"w.d": /^name.lastname/i})
It runs for more than 30 seconds (and correctly finds no results).
The only difference between the two queries is the field I am searching on. Both fields are indexed; you can find the explain output for both here: it is identical!
Here are the explains with executionStats
How can the queries perform so differently??
I am on mongodb 3.4.23
My collection name is trial and the data size is 112 MB.
My query is
db.trial.find()
and I have added a limit of 10:
db.trial.find.limit(10)
But the limit is not working; the entire query is running.
Replace
db.trial.find.limit(10)
with
db.trial.find().limit(10)
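The difference between the two lines is plain JavaScript: without parentheses, find is only a reference to the function, and that function object has no limit property. A small sketch using a hypothetical stand-in object (not the real shell or driver API) illustrates this:

```javascript
// Hypothetical stand-in for a collection; it only mimics the shape find().limit().
const trial = {
  find() {
    const docs = Array.from({ length: 100 }, (_, i) => ({ _id: i }));
    return {
      limit(n) {
        return docs.slice(0, n);
      },
    };
  },
};

// Without parentheses, `find` is just a function object with no limit property,
// so calling trial.find.limit(10) would throw a TypeError.
console.log(typeof trial.find.limit); // undefined

// With parentheses, find() returns the cursor-like object that does have limit().
console.log(trial.find().limit(10).length); // 10
```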
Also you mention that the entire database is being queried? Run this
db.trial.find().limit(10).explain()
It will tell you how many documents it looked at before stopping the query (nscanned). You will see that nscanned will be 10.
The .limit() modifier on its own will only "limit" the results of the query that is processed, so it works as designed to "limit" the results returned. In a raw form with no query, though, nscanned should simply equal the limit you set:
db.trial.find().limit(10)
If your intent is to only operate on a set number of documents you can alter this with the $maxScan modifier:
db.trial.find({})._addSpecial( "$maxScan" , 11 )
Which causes the query engine to "give up" after the set number of documents have been scanned. But that should only really matter when there is something meaningful in the query.
If you are actually trying to do "paging" then you are better off using "range" queries with $gt and $lt and cousins to effectively change the range of selection that is done in your query.
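Range-based paging can be sketched roughly as follows. This is a plain JavaScript simulation over an in-memory array, not shell code; the data, the nextPage helper, and the choice of _id as the paging key are all assumptions made for illustration. In the shell the equivalent step would presumably be db.trial.find({_id: {$gt: lastId}}).sort({_id: 1}).limit(pageSize).

```javascript
// In-memory stand-in for a collection, already sorted by _id (hypothetical data).
const docs = Array.from({ length: 25 }, (_, i) => ({ _id: i + 1 }));

// Return the next page: documents whose _id is greater than the last one seen.
function nextPage(lastId, pageSize) {
  return docs.filter((d) => d._id > lastId).slice(0, pageSize);
}

// Walk the whole collection page by page, remembering the last _id of each page.
let lastId = 0;
const pages = [];
for (;;) {
  const page = nextPage(lastId, 10);
  if (page.length === 0) break;
  pages.push(page);
  lastId = page[page.length - 1]._id;
}

console.log(pages.map((p) => p.length)); // [ 10, 10, 5 ]
```

Each request starts from the last key seen instead of skipping over earlier documents, which is the point of using a range rather than skip().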
I'm running MongoDB v.2-4-4-pre- under Linux. I use simple find() operation with "skip" parameter to select some elements from my DB.
Is there any way to find out how many objects were skipped by my query?
It skips the number of objects that you provide as the argument to skip(). Example from here:
db.article.aggregate(
{ $skip : 5 }
);
This operation skips the first 5 documents passed to it by the
pipeline. $skip has no effect on the content of the documents it
passes along the pipeline.
EDIT #1
Based on your comment, I believe you need to execute the count command.
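One way to combine the two numbers, as a rough sketch: the number of documents actually skipped cannot exceed the number that match, so it is the smaller of the count and the skip value. The helper name and its arguments are hypothetical; matchingCount stands for whatever the count command reports for the same query.

```javascript
// matchingCount: what count would report for the same query (assumed to be known).
// requestedSkip: the value passed to skip().
function skippedCount(matchingCount, requestedSkip) {
  return Math.min(matchingCount, requestedSkip);
}

console.log(skippedCount(100, 5)); // 5
console.log(skippedCount(3, 5)); // 3
```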
How will MongoDB evaluate this query:
db.testCol.find(
{
"$or" : [ {a:1, b:12}, {b:9, c:15}, {c:10, d:"foo"} ]
});
When scanning values in a document, if the first OR clause is TRUE, will the other clauses also be evaluated?
Logically, if MongoDB is optimized, the other values in the OR statement should not be evaluated, but I don't know how MongoDB is implemented.
UPDATE:
I updated my query because it was wrong and it didn't explain correctly what I was trying to accomplish. I need to find a set of documents that have different properties and if an exact combination of these properties is found the document must be returned.
The SQL equivalent of my query would be:
SELECT * FROM testCol
WHERE (a = 1 AND b = 12) OR (b = 9 AND c = 15) OR (c = 10 AND d = 'foo');
MongoDB will execute each clause of the $or operation as a separate query and remove duplicates as a post-processing pass. As such, each clause can use a separate index, which is often very useful.
In other words, it will NOT look at 1 document, see which of the OR clauses apply and do an early-out if the first clause is a match. Rather it does a full dataset query per clause and de-dupe after the fact. This may seem less than efficient but in practice it's almost always faster since the first approach would only be able to hit at most one index for all clauses which is rarely efficient.
EDIT: Mongo only skips documents during the de-duplication process, not during the table scans.
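The behaviour described in this answer (one pass per clause, then de-duplication) can be modelled in plain JavaScript. This is only an illustration of the description above, not MongoDB's actual implementation; the sample documents and the matches and orQuery helpers are made up for the sketch.

```javascript
// Hypothetical sample data; document 3 satisfies both the first and third clause.
const docs = [
  { _id: 1, a: 1, b: 12 },
  { _id: 2, b: 9, c: 15 },
  { _id: 3, a: 1, b: 12, c: 10, d: "foo" },
  { _id: 4, a: 2, b: 12 },
];
const clauses = [{ a: 1, b: 12 }, { b: 9, c: 15 }, { c: 10, d: "foo" }];

// A document matches a clause when every field in the clause is equal (implicit AND).
const matches = (doc, clause) =>
  Object.keys(clause).every((key) => doc[key] === clause[key]);

// Run every clause as its own query over all documents, then de-duplicate by _id.
function orQuery(allDocs, orClauses) {
  const seen = new Map();
  for (const clause of orClauses) {
    for (const doc of allDocs.filter((d) => matches(d, clause))) {
      if (!seen.has(doc._id)) seen.set(doc._id, doc);
    }
  }
  return [...seen.values()];
}

console.log(orQuery(docs, clauses).map((d) => d._id)); // [ 1, 3, 2 ]
```

Document 3 is found by two clauses but appears once in the result, because the duplicate is dropped after the per-clause passes rather than by stopping early.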
Mongo won't check documents that are already part of the result set. So if your first {a:1, b:12} returns 100% of the documents, Mongo is done.
You want to put whatever will grab the most documents as your first evaluated statement because of this. If your first item only grabs 1% of documents, the subsequent item will need to scan the other 99%.
That being said, you are using $or to look for values in a single key. I think you want to use $in for this.
See here for more:
http://books.google.com/books?id=BQS33CxGid4C&lpg=PA48&ots=PqvQJPRUoe&dq=mongo%20tips%20and%20tricks%20%22OR-query%22&pg=PA48#v=onepage&q&f=false
I am using MongoDB 1.8.1. I have a collection containing more than 1.8 million records. All records in this collection are simple objects, meaning no nested objects or arrays,
like the following:
{ name : "xyz" , "id" : 123 ,"a" : "na" , "c" : "in" , "cmp" : "pq" , "ttl" : "sd"}
All records are like this.
At any one time, 5 or more queries fire against this collection. Two are simple queries: one contains $exists, and the other is a simple query that uses an index properly.
Another 2 are group queries whose condition fields are indexed; one of them contains $exists.
The last one is a distinct query with a proper condition, which is also indexed.
The queries fire in this order: first the group queries, then one simple query, then the distinct query, and finally the other simple query.
As a result, the data loads slowly.
If 2-3 such calls are made at once, it loads very slowly and sometimes fails with a read timeout error.
The collection has more than one index.
$exists queries do not use indexes (fixed from 1.9.1 onwards)
group commands use the JS context of mongodb which is exclusively locked while it's being used. This will affect performance of concurrent group queries. A new aggregation framework is under development that should help with this (2.1 onwards). Monitor https://jira.mongodb.org/browse/SERVER-447 for progress. In my experience it's usually more performant to do "group" like aggregation app-side.
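As a rough idea of what app-side "group" aggregation could look like, here is a sketch that counts records per value of one field after they have been fetched with an ordinary find. The sample records reuse the field names from the question, but their values and the groupCount helper are hypothetical.

```javascript
// Hypothetical records, shaped like the documents in the question.
const records = [
  { name: "xyz", id: 123, c: "in", cmp: "pq" },
  { name: "abc", id: 124, c: "in", cmp: "rs" },
  { name: "def", id: 125, c: "us", cmp: "pq" },
];

// Count how many records share each value of the given field.
function groupCount(rows, field) {
  const counts = new Map();
  for (const row of rows) {
    counts.set(row[field], (counts.get(row[field]) || 0) + 1);
  }
  return counts;
}

const byCountry = groupCount(records, "c");
console.log(byCountry.get("in")); // 2
console.log(byCountry.get("us")); // 1
```

The grouping runs in the application process, so it does not touch the server-side JS context mentioned above.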