In MongoDB, how do I get the count of the total results returned, without the limit?

Let's say I put a limit and a skip on a MongoDB query. I want to know the total number of results there would be if the limit were not there.
Of course, I could do this the shitty way, which is to query twice.

In MongoDB the default behavior of count() is to ignore skip and limit and count the number of results in the entire original query. So running count will give you exactly what you want.
Passing a Boolean true to count or calling size instead would give you a count WITH skip or limit.

Note that count() is itself a separate command sent to the server, so getting both a page of results and the total count still means executing the query twice.
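The two round trips described above can be sketched with PyMongo. This is a minimal sketch, assuming a recent PyMongo where Collection.count_documents() is available; the helper name fetch_page_with_total is made up for illustration, and collection is expected to be a pymongo Collection.

```python
def fetch_page_with_total(collection, query, skip, limit):
    """Return (documents_on_this_page, total_matching_documents).

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    # First round trip: count every match, ignoring skip and limit.
    total = collection.count_documents(query)
    # Second round trip: fetch only the requested page.
    page = list(collection.find(query).skip(skip).limit(limit))
    return page, total
```

With 50 matching documents, skip=20 and limit=10, this would return 10 documents and a total of 50.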

Related

How to Set Limit for options.FindOne() in go mongo-driver

I see there is a way to SetLimit() for the Find() func, but I don't see any option to set a limit for FindOne(). Since FindOne() returns a single result, do we even have to limit it? Does it handle the limit automatically?
I tried setting a limit using `options.FindOne()`, but I do not see a way to do that.
It's not documented, but it's common sense that Collection.FindOne() implies Limit=1. The return value of Collection.FindOne() doesn't give access to multiple result documents, which is why options.FindOne doesn't even have a SetLimit() method.
If you check the source code, it's in there:
// Unconditionally send a limit to make sure only one document is returned and the cursor is not kept open
// by the server.
findOpts = append(findOpts, options.Find().SetLimit(-1))
Note that FindOptions.Limit documents that:
// Limit is the maximum number of documents to return. The default value is 0, which means that all documents matching the
// filter will be returned. A negative limit specifies that the resulting documents should be returned in a single
// batch. The default value is 0.
Limit *int64

Why does MongoDB find have the same performance as count?

I am running tests against my MongoDB and for some reason find has the same performance as count.
Stats:
orders collection size: ~20M,
orders with product_id 6: ~5K
product_id is indexed for improved performance.
Query: db.orders.find({product_id: 6}) vs db.orders.find({product_id: 6}).count()
Result: the orders for the product vs the count of 5K, each after 0.08ms
Why isn't count dramatically faster? It could find the positions of the first and last elements using the product_id index.
As the Mongo documentation for count states, calling count is the same as calling find, but instead of returning the docs, it just counts them. In order to perform this count, it iterates over the cursor. It can't just read the index and determine the number of documents based on the first and last value of some ID, especially since you can have an index on some other field that's not an ID (and Mongo IDs are not auto-incrementing). So basically find and count are the same operation, but instead of returning the documents, count goes over them, tallies them, and returns the total to you.
Also, if you want a faster result, you could use estimatedDocumentCount, which goes straight to the collection's metadata. The trade-off is that you lose the ability to ask "How many documents can I expect if I run this query?". If you need a count of docs for a specific query, then you could use countDocuments, which is a wrapper around an aggregate query. From my knowledge of Mongo, that looks like the fastest way to count query results without calling count. I guess it should be the preferred way, performance-wise, to count docs from now on (it was introduced in version 4.0.3).
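The choice between the two counting methods can be sketched in PyMongo, where the equivalents are estimated_document_count() and count_documents(). This is only an illustration: the helper name count_orders is hypothetical, and collection is assumed to be a pymongo Collection holding the orders from the question.

```python
def count_orders(collection, product_id=None):
    """Count orders, picking the cheaper method when no filter is needed.

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    if product_id is None:
        # No filter: read the approximate total from collection metadata.
        return collection.estimated_document_count()
    # With a filter: the server has to walk the matching entries.
    return collection.count_documents({"product_id": product_id})
```

The metadata-based count cannot take a filter, so it only answers "how many documents are in the collection", not "how many match this query".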

Why does MongoDB (version 3.0.6) return the wrong number of records when we use count with the limit option?

According to the MongoDB docs, we can use count with limit.
The limit option specifies the maximum number of documents the cursor will return. But if we use limit with count, it returns the total count rather than the limited count.
Why?
Suppose we have 50 records in a collection. Count alone returns 50, and if we apply limit(10) it should return 10, not 50. But count with limit returns 50.
db.collection.find(<query>).count();
You will get the count of all records matched by the query, i.e. count = 50.
db.collection.find(<query>).limit(10).count(true);
You will get the count of the limited documents, i.e. count = 10.
To make count respect skip and limit, set applySkipLimit to true, as in the second call.
http://docs.mongodb.org/manual/reference/method/cursor.count/
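For comparison, here is how the same two counts might look in PyMongo. This is a sketch that assumes a recent PyMongo, where count_documents() accepts skip and limit as keyword options; the helper name count_total_and_limited is made up for illustration.

```python
def count_total_and_limited(collection, query, limit):
    """Return (total_matches, matches_capped_at_limit).

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    # No limit passed: every matching document is counted.
    total = collection.count_documents(query)
    # limit passed: the count stops once `limit` documents have been seen.
    limited = collection.count_documents(query, limit=limit)
    return total, limited
```

With the 50-record collection from the question and limit=10, this would give a total of 50 and a limited count of 10.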

Implementation of limit in mongodb

My collection is named trial and its data size is 112 MB.
My query is
db.trial.find()
and I have added a limit of 10:
db.trial.find.limit(10)
But the limit is not working. The entire query is running.
Replace
db.trial.find.limit(10)
with
db.trial.find().limit(10)
You also mention that the entire query is running. Run this:
db.trial.find().limit(10).explain()
It will tell you how many documents it looked at before stopping the query (nscanned). You will see that nscanned will be 10.
The .limit() modifier on its own only limits the results returned by the query that is processed, so it is working as designed. In raw form, with no query conditions, nscanned should simply equal the limit you set:
db.trial.find().limit(10)
If your intent is to only operate on a set number of documents you can alter this with the $maxScan modifier:
db.trial.find({})._addSpecial( "$maxScan" , 11 )
Which causes the query engine to "give up" after the set number of documents have been scanned. But that should only really matter when there is something meaningful in the query.
If you are actually trying to do "paging", then you are better off using "range" queries with $gt and $lt and their cousins to change the range of selection that your query covers.
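A range-based paging query of the kind suggested above could look like this in PyMongo. It is a minimal sketch that assumes paging by ascending _id; the helper name fetch_next_page and its parameters are hypothetical, and collection is expected to be a pymongo Collection.

```python
def fetch_next_page(collection, last_id=None, page_size=10):
    """Return the next page of documents after `last_id`, ordered by _id.

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    # First page: no lower bound. Later pages: start after the last _id seen.
    query = {} if last_id is None else {"_id": {"$gt": last_id}}
    # Sorting on _id lets the server walk the _id index and stop after
    # page_size documents instead of skipping over earlier pages.
    cursor = collection.find(query).sort("_id", 1).limit(page_size)
    return list(cursor)
```

The caller passes the _id of the last document from the previous page as last_id, so no skip is needed.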

Is it faster to use with_limit_and_skip=True when counting query results in pymongo

I'm doing a query where all I want to know is whether there is at least one row in the collection that matches the query, so I pass limit=1 to find(). All I care about is whether the count() of the returned cursor is > 0. Would it be faster to use count(with_limit_and_skip=True) or just count()? Intuitively it seems to me that I should pass with_limit_and_skip=True, because if there are a whole bunch of matching records then the count could stop at my limit of 1.
Maybe this merits an explanation of how limits and skips work under the covers in mongodb/pymongo.
Thanks!
Your intuition is correct. That's the whole point of the with_limit_and_skip flag.
With with_limit_and_skip=False, count() has to count all the matching documents, even if you use limit=1, which is pretty much guaranteed to be slower.
From the docs:
Returns the number of documents in the results set for this query. Does not take limit() and skip() into account by default - set with_limit_and_skip to True if that is the desired behavior.
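If the only question is whether at least one document matches, an alternative that avoids counting altogether is find_one(), which returns as soon as a match is found. This is a sketch rather than the approach from the answer above; the helper name has_match is hypothetical, and collection is assumed to be a pymongo Collection.

```python
def has_match(collection, query):
    """Return True if at least one document matches `query`.

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    # find_one returns the first matching document or None; projecting only
    # _id keeps the response small, since the content is not needed.
    return collection.find_one(query, projection={"_id": 1}) is not None
```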