How to Set Limit for options.FindOne() in go mongo-driver - mongodb

I see there is a way to call SetLimit() for the Find() func, but I don't see any option to set a limit for FindOne(). Since FindOne() returns a single result, do we not need to limit it at all? Does it handle the limit automatically?
I tried setting a limit using options.FindOne(), but I do not see a way to do that.

It's not documented, but it's common sense that Collection.FindOne() implies the behavior of Limit=1. The return value of Collection.FindOne() doesn't give access to multiple result documents, which is why options.FindOne doesn't even have a SetLimit() method.
If you check the source code, it's in there:
// Unconditionally send a limit to make sure only one document is returned and the cursor is not kept open
// by the server.
findOpts = append(findOpts, options.Find().SetLimit(-1))
Note that FindOptions.Limit documents that:
// Limit is the maximum number of documents to return. The default value is 0, which means that all documents matching the
// filter will be returned. A negative limit specifies that the resulting documents should be returned in a single
// batch. The default value is 0.
Limit *int64

Related

Is there a way to know whether .limit() actually removes any documents?

Using mongocxx driver (c++ project).
Working on a mongodb query to paginate some results from a query. I'm trying to return the first 10 results, while also informing whether or not there are more results to fetch with another query (using an offset) - so as to inform the recipient if there are more documents to fetch. The results are stored in a std::vector after the db find query.
Is there any elegant way to do this, preferably without returning all the result documents and then comparing the vector size to the specified page limit?
Current query (without specifics):
db.collection.find({"<some_field>" : <some value>}).limit(10);
This, however, will not inform whether or not any documents were removed, in the case that exactly 10 results were found.
Currently I'm simply returning the full vector of results and looping through it, breaking if the loop goes over 10 iterations (and setting a "more_items" bool to true).
You have 2 ways to do this:
Count all documents found by query:
db.collection.count({"<some_field>" : <some value>});
If there are more documents than you need (10 here), you can set the "more_items" bool to true.
Find with the limit set to one more than you need (11 here):
db.collection.find({"<some_field>" : <some value>}).limit(11);
That way you find 11 documents or less.
If you find 11 documents, there are more documents than the actual limit of 10, so another page exists. If you find fewer than 11, there is nothing left beyond this page.
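The "limit plus one" approach can be sketched as follows. This is a minimal illustration in Python against a PyMongo-style collection rather than mongocxx; fetch_page and its parameters are hypothetical names, and the sketch assumes the driver's cursor supports skip() and limit(). The same idea should carry over to other drivers.

```python
def fetch_page(collection, query, page_size, offset=0):
    """Return (documents, more_items) for one page of results.

    Assumes `collection` exposes a PyMongo-style find() whose cursor
    supports skip() and limit(); fetch_page is a hypothetical helper.
    """
    # Ask for one extra document so we can tell whether another page exists.
    cursor = collection.find(query).skip(offset).limit(page_size + 1)
    docs = list(cursor)
    more_items = len(docs) > page_size
    # Drop the extra document before returning the page.
    return docs[:page_size], more_items
```

Only one query is sent, and at most one document beyond the page size is transferred.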

Avoid sending full table with MongoDB and Pymongo

I'm trying to get the min and max value from some fields inside a collection. I'm not sure if this:
result = collection.find(date_filter, expected_projection).sort({'attribute': -1}).limit(1)
is equivalent to this:
result_a = collection.find(date_filter, expected_projection)
result_b = result_a.sort({'attribute': -1}).limit(1)
I don't want the server to query all the data in result_a from the database. Is the first line of code actually fetching every document in my collection and THEN sorting it, or just fetching the max element in the attribute field?
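Yes, they are equivalent, and MongoDB will not return the entire collection to the client, whether or not the attribute field is indexed. In PyMongo, find() only builds a cursor; no query is sent to the server until you start iterating over it.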
When you chain operators together in a MongoDB command (e.g. find().sort().limit()), it is not treated by the MongoDB server as a set of separate functions to be called sequentially; it is treated as a single query which should be optimised as a whole and executed as a whole on the MongoDB server.
See the documentation on Combining Cursor Methods for another example of how the chaining is not taken as a sequence of independent operations:
The following statements chain cursor methods limit() and sort():
db.bios.find().sort( { name: 1 } ).limit( 5 )
db.bios.find().limit( 5 ).sort( { name: 1 } )
The two statements are equivalent; i.e. the order in which you chain the limit() and the sort() methods is not significant. Both statements return the first five documents, as determined by the ascending sort order on ‘name’.
The first line of code tells MongoDB to return only the document with the highest value for "attribute" (a sort direction of -1 is descending). If "attribute" is indexed, then MongoDB can directly access only that one document, and not even consider the rest of the collection.
Do this once:
collection.create_index([('attribute', 1)])
Having that index in place means you can find the highest-sorting or lowest-sorting document practically instantly.
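As a rough sketch of the min/max lookup, the helper below builds the cursor in two steps to show that splitting the chain changes nothing. It assumes a PyMongo-style collection whose cursor accepts sort(key, direction) and limit(); extreme_value is a hypothetical name, not part of PyMongo.

```python
def extreme_value(collection, query, field, highest=True):
    """Return the document with the highest (or lowest) value of `field`.

    Assumes a PyMongo-style collection; extreme_value is a hypothetical helper.
    """
    direction = -1 if highest else 1  # -1 = descending, 1 = ascending
    # Nothing is sent to the server until the cursor is iterated, so
    # building the cursor step by step costs the same as one chained call.
    cursor = collection.find(query)
    cursor = cursor.sort(field, direction).limit(1)
    # next() with a default returns None when no document matches.
    return next(iter(cursor), None)
```

Calling it once with highest=True and once with highest=False gives the max and min documents, each fetched as a single-document query.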

Implementation of limit in mongodb

My collection name is trial and data size is 112mb
My query is,
db.trial.find()
and I have added a limit of 10:
db.trial.find.limit(10).
But the limit is not working; the entire query is running.
Replace
db.trial.find.limit(10)
with
db.trial.find().limit(10)
You also mention that the entire collection is being queried. Run this:
db.trial.find().limit(10).explain()
It will tell you how many documents it looked at before stopping the query (nscanned). You will see that nscanned will be 10.
The .limit() modifier on its own only limits the results of the query that is processed, so it works as designed to limit the results returned. In its raw form, with no query conditions, the number of documents scanned should simply equal the limit you set:
db.trial.find().limit(10)
If your intent is to only operate on a set number of documents you can alter this with the $maxScan modifier:
db.trial.find({})._addSpecial( "$maxScan" , 11 )
Which causes the query engine to "give up" after the set number of documents have been scanned. But that should only really matter when there is something meaningful in the query.
If you are actually trying to do "paging", then you are better off using "range" queries with $gt, $lt and their cousins to change the range of selection that is done in your query.
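Range-based paging can be sketched like this. The example is in Python against a PyMongo-style collection and assumes pages are ordered by a unique, sortable _id field; next_page is a hypothetical helper name.

```python
def next_page(collection, page_size, last_id=None):
    """Fetch the page that follows `last_id`, ordered by _id.

    Assumes a PyMongo-style collection and a sortable, unique _id field;
    next_page is a hypothetical helper.
    """
    # Start after the last document of the previous page instead of skipping.
    query = {} if last_id is None else {"_id": {"$gt": last_id}}
    docs = list(collection.find(query).sort("_id", 1).limit(page_size))
    # The caller passes this value back in to request the following page.
    new_last_id = docs[-1]["_id"] if docs else last_id
    return docs, new_last_id
```

Because each page starts from a known _id rather than an offset, the server does not have to walk over the documents of earlier pages.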

Is it faster to use with_limit_and_skip=True when counting query results in pymongo

I'm doing a query where all I want to know is whether there is at least one row in the collection that matches the query, so I pass limit=1 to find(). All I care about is whether the count() of the returned cursor is > 0. Would it be faster to use count(with_limit_and_skip=True) or just count()? Intuitively it seems to me like I should pass with_limit_and_skip=True, because if there are a whole bunch of matching records then the count could stop at my limit of 1.
Maybe this merits an explanation of how limits and skips work under the covers in mongodb/pymongo.
Thanks!
Your intuition is correct. That's the whole point of the with_limit_and_skip flag.
With with_limit_and_skip=False, count() has to count all the matching documents, even if you use limit=1, which is pretty much guaranteed to be slower.
From the docs:
Returns the number of documents in the results set for this query. Does not take limit() and skip() into account by default - set with_limit_and_skip to True if that is the desired behavior.
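In PyMongo 4 and later, Cursor.count() is no longer available, so with_limit_and_skip does not apply there. A comparable existence check passes the limit to the count itself. The sketch below assumes a PyMongo-style collection that provides count_documents() with a limit keyword; has_match is a hypothetical helper name.

```python
def has_match(collection, query):
    """Return True if at least one document matches `query`.

    Assumes a PyMongo-style collection; has_match is a hypothetical helper.
    """
    # limit=1 lets the server stop counting after the first matching document.
    return collection.count_documents(query, limit=1) > 0
```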

In Mongodb, how do I get the count of the total results returned, without the limit?

Let's say I put a limit and skip on the MongoDB query... I want to know the total number of results as if there were no limit.
Of course, I could do this the shitty way...which is to query twice.
In MongoDB the default behavior of count() is to ignore skip and limit and count the number of results in the entire original query. So running count will give you exactly what you want.
Passing a Boolean true to count or calling size instead would give you a count WITH skip or limit.
There is no way to get the count without executing the query twice.
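The two-query approach might look like the sketch below. It is written in Python against a PyMongo-style collection and assumes count_documents() and a cursor with skip() and limit() are available; page_with_total is a hypothetical helper name.

```python
def page_with_total(collection, query, skip, limit):
    """Return (page_documents, total_matches) using two round trips.

    Assumes a PyMongo-style collection; page_with_total is a hypothetical helper.
    """
    # The total ignores skip/limit: it counts every document matching the query.
    total = collection.count_documents(query)
    # The page itself applies skip and limit.
    docs = list(collection.find(query).skip(skip).limit(limit))
    return docs, total
```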