MongoDB Geospatial Query Count Issue (Always 100) - mongodb

It appears there is an issue with the count operation on a geospatial query that contains more than 100 results. If I run the following query I still get a count of 100 no matter what.
db.locations.find({"loc":{$nearSphere:[50, 50]}}).limit(1000).count()
I understand that the default size limit on a query that uses the "near" syntax is 100 but it appears you cannot return more than that. Am I doing something wrong or is there a workaround for this?

Try "within" instead of "near".
This works for me,
center = [50, 50];
radius = 1/111.12; // 1 km expressed in degrees (one degree is roughly 111.12 km)
db.places.count({"loc" : {"$within" : {"$center" : [center, radius]}}})
I found this at https://jira.mongodb.org/browse/SERVER-856
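The degree conversion above can be wrapped in a small helper. This is a minimal Python sketch, assuming the 111.12 km-per-degree figure from the answer and the newer `$geoWithin` operator name (which replaced `$within` in MongoDB 2.4). `circle_filter` is a hypothetical helper name, and the flat `$center` shape treats degrees as planar units, so the result is only an approximation.

```python
KM_PER_DEGREE = 111.12  # approximation taken from the answer above


def circle_filter(field, center, radius_km):
    """Build a filter matching points within radius_km of center.

    Hypothetical helper: `$center` measures in the coordinate system's own
    units (degrees here), so the radius is converted from kilometres first.
    """
    radius_degrees = radius_km / KM_PER_DEGREE
    return {field: {"$geoWithin": {"$center": [list(center), radius_degrees]}}}


query = circle_filter("loc", (50, 50), 1)
# With PyMongo, this dict could be passed to collection.count_documents(query).
print(query)
```

Because the helper only builds a dictionary, it can be inspected or unit-tested without a running server.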

For what it's worth, even though the count reported for the cursor is 100 (regardless of the limit you give the query, even "10"), you actually get the "limit" number of results back when you iterate over the cursor.
Here's a link to an issue where they assert it isn't broken:
https://jira.mongodb.org/browse/SERVER-739
Hope that helps!

It is not an issue with the mongo query. Try using cursor.size() rather than cursor.count(). count() ignores the limit, whereas size() takes it into account. So if you use size() instead of count(), you should get the correct number of returned items.
Check out this Stack Overflow Q&A for a clear explanation:
Difference between cursor.count() and cursor.size() in MongoDB
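The general difference between the two calls can be modelled with plain arithmetic. The sketch below is not driver code: `cursor_count` and `cursor_size` are hypothetical names that only mimic the documented behaviour (count ignores skip and limit, size applies them), and it does not model the default cap of 100 that "near" queries add on their own.

```python
def cursor_count(total_matches):
    """Model of count(): skip and limit are ignored."""
    return total_matches


def cursor_size(total_matches, skip=0, limit=0):
    """Model of size(): skip and limit are applied (limit 0 means no limit)."""
    remaining = max(0, total_matches - skip)
    return remaining if limit == 0 else min(remaining, limit)


# 627 matching documents, as in a query with .limit(10)
print(cursor_count(627), cursor_size(627, limit=10))
```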

Related

Can't iterate entire pymongo cursor when using $in query

I'm trying to iterate a result set returned by CosmosDB's MongoDB API. I'm using pymongo to connect to the database.
The query looks something like:
items = item_collection.find(filter={'store_id': 151, 'dept_id': {'$in': [17, 19]}})
However, I am only able to iterate through 101 items before the cursor is empty. Calling items.count() reveals there are definitely more results in the result set:
tally = 0
for item in items:
    tally += 1
print('Cursor total: %s --- Tally: %s' % (items.count(), tally))
# prints 'Cursor total: 627 --- Tally: 101'
Perhaps not coincidentally, 101 is the default size of the first batch returned by a Mongo query.
Now, if I remove the $in part of the query, and issue something like:
item_collection.find(filter={'store_id': 151, 'dept_id': 17})
then tally and items.count() yield the same number.
Any insight into why this is happening would be welcomed!
I've seen similar situations. According to the discussion in this case, Cosmos Mongo API "In" Array expression issue, the $in problem appears to be on Microsoft's side. You could wait until the bug is fixed.
Microsoft's feedback is below:
Thank you David for reporting this! I investigated the issue, it’s a
bug on our side manifesting under a combination of conditions. I
already have made a fix for it and will check it in by end of week
(then it’s up to our deployment cycle to propagate the fix to all
datacenters around the world). Let me know if you have queries that
don’t work and are blocking you. Best regards, Orestis
Hope it helps you.

Implementation of limit in mongodb

My collection name is trial and data size is 112mb
My query is,
db.trial.find()
I then added a limit of 10:
db.trial.find.limit(10)
but the limit does not work; the entire query still runs.
Replace
db.trial.find.limit(10)
with
db.trial.find().limit(10)
You also mention that the entire collection is being queried. Run this:
db.trial.find().limit(10).explain()
It will tell you how many documents it looked at before stopping the query (nscanned). You will see that nscanned will be 10.
The .limit() modifier on its own only limits the results returned by the query, so it works as designed. With no query conditions, nscanned should simply equal the limit you asked for:
db.trial.find().limit(10)
If your intent is to only operate on a set number of documents you can alter this with the $maxScan modifier:
db.trial.find({})._addSpecial( "$maxScan" , 11 )
Which causes the query engine to "give up" after the set number of documents have been scanned. But that should only really matter when there is something meaningful in the query.
If you are actually trying to do "paging", then you are better off using "range" queries with $gt, $lt and related operators to change the range selected by your query.
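The range-based paging idea can be sketched as follows. This is an illustration under assumptions: it pages on `_id` in ascending order, `page_filter` and `fetch_page` are hypothetical helper names, and `fetch_page` works on an in-memory list as a stand-in for something like `find(page_filter(last_id)).sort("_id", 1).limit(page_size)` in PyMongo.

```python
def page_filter(last_seen_id=None):
    """Filter selecting documents after last_seen_id (first page if None)."""
    return {} if last_seen_id is None else {"_id": {"$gt": last_seen_id}}


def fetch_page(documents, last_seen_id=None, page_size=10):
    """In-memory stand-in for a sorted, limited range query on _id."""
    ordered = sorted(documents, key=lambda doc: doc["_id"])
    if last_seen_id is not None:
        ordered = [doc for doc in ordered if doc["_id"] > last_seen_id]
    return ordered[:page_size]


docs = [{"_id": i} for i in range(1, 26)]
first = fetch_page(docs)
# The next page starts after the last _id seen, instead of using skip().
second = fetch_page(docs, last_seen_id=first[-1]["_id"])
print([doc["_id"] for doc in second])
```

Unlike skip(), each page here starts from a key value, so the amount of work per page does not grow with the page number when the key is indexed.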

Mongoengine geo spatial query with text search does not work as expected

The following is my query:
items = Item.objects(
    location__near=[item_obj.longitude, item_obj.latitude],
    location__max_distance=item_obj.range,
    status__status=ITEM_STATUS_DISPLAYED
).filter(
    Q(title__icontains=item_obj.search) |
    Q(description__icontains=item_obj.search)
).hint([('location', '2dsphere')])
This query does not seem to work: objects outside the range are returned, and the item status also seems to be ignored. The range is given in meters.
The strange thing is the following query works without any issues:
items = Item.objects(
    location__near=[item_obj.longitude, item_obj.latitude],
    location__max_distance=item_obj.range,
    status__status=ITEM_STATUS_DISPLAYED
)
I am not sure what is wrong.
I recommend that you stay away from the $near operator, since mongoengine is using a deprecated call to PyMongo and your application will fail if you update MongoDB to the latest version.
What I did was to use the geo_within query, specifically the geo_within_sphere since I am finding points within a circle around locations on earth. You can find reference here: MongoEngine geo query
One thing they don't explain there is how to convert the radius. The sphere query expects the radius in radians, so divide the distance by the Earth's radius: radius/6371.0 if you use kilometres, radius/3959.0 if you use miles.
My queries look like this:
data = data_set.objects(
    location__geo_within_sphere=[[longitude, latitude], radius/6371.0]
)
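The radius conversion can be made explicit with a short helper. This is a minimal sketch using the Earth radii quoted in the answer; `to_radians` and `sphere_filter` are hypothetical names, and the dictionary returned is my assumption of roughly what the raw `$geoWithin` / `$centerSphere` filter would look like, not MongoEngine's own output.

```python
EARTH_RADIUS = {"km": 6371.0, "mi": 3959.0}  # figures quoted in the answer


def to_radians(distance, unit="km"):
    """Convert a surface distance to the radians a sphere query expects."""
    return distance / EARTH_RADIUS[unit]


def sphere_filter(field, longitude, latitude, distance, unit="km"):
    """Build a raw filter for points within `distance` of a location."""
    center = [longitude, latitude]  # longitude first, as in the query above
    return {field: {"$geoWithin": {"$centerSphere": [center, to_radians(distance, unit)]}}}


print(sphere_filter("location", 50.0, 50.0, 10, unit="km"))
```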

Is it faster to use with_limit_and_skip=True when counting query results in pymongo

I'm doing a query where all I want to know is whether there is at least one document in the collection that matches, so I pass limit=1 to find(). All I care about is whether the count() of the returned cursor is > 0. Would it be faster to use count(with_limit_and_skip=True) or just count()? Intuitively it seems to me that I should pass with_limit_and_skip=True, because if there are a whole bunch of matching records then the count could stop at my limit of 1.
Maybe this merits an explanation of how limits and skips work under the covers in mongodb/pymongo.
Thanks!
Your intuition is correct. That's the whole point of the with_limit_and_skip flag.
With with_limit_and_skip=False, count() has to count all the matching documents, even if you use limit=1, which is pretty much guaranteed to be slower.
From the docs:
Returns the number of documents in the results set for this query. Does not take limit() and skip() into account by default - set with_limit_and_skip to True if that is the desired behavior.
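For newer drivers, the same idea can be sketched with `count_documents`, which accepts a `limit` option. As far as I know, `Cursor.count()` was deprecated in PyMongo 3.7 and removed in 4.0, so this is an alternative to the flag discussed above rather than the original method. `has_match` is a hypothetical helper, and `FakeCollection` is a tiny stand-in (exact-match filters only) so the sketch runs without a server.

```python
def has_match(collection, query):
    """True if at least one document matches; counting stops after one hit."""
    return collection.count_documents(query, limit=1) > 0


class FakeCollection:
    """In-memory stand-in for a collection; not part of PyMongo."""

    def __init__(self, documents):
        self.documents = list(documents)

    def count_documents(self, query, limit=0):
        matched = 0
        for doc in self.documents:
            if all(doc.get(key) == value for key, value in query.items()):
                matched += 1
                if limit and matched >= limit:
                    break  # mirrors the early stop that limit allows
        return matched


stores = FakeCollection([{"store_id": 151}, {"store_id": 151}, {"store_id": 7}])
print(has_match(stores, {"store_id": 151}), has_match(stores, {"store_id": 99}))
```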

In Mongodb, how do I get the count of the total results returned, without the limit?

Let's say I put a limit and skip on a MongoDB query. I want to know the total number of results as if the limit were not there.
Of course, I could do this the shitty way...which is to query twice.
In MongoDB the default behavior of count() is to ignore skip and limit and count the number of results in the entire original query. So running count will give you exactly what you want.
Passing a Boolean true to count or calling size instead would give you a count WITH skip or limit.
There is no way to get both the page of results and the total count without executing the query twice.
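The two-call pattern described above might look like this in Python. It is a sketch under assumptions: `page_with_total` is a hypothetical helper written against the PyMongo-style calls `count_documents()` and `find().skip().limit()`, and the `Fake*` classes are minimal stand-ins (exact-match filters only) so the example runs without a database.

```python
def page_with_total(collection, query, skip, limit):
    """Return (total matches ignoring paging, one page of documents)."""
    total = collection.count_documents(query)  # first pass: full count
    page = list(collection.find(query).skip(skip).limit(limit))  # second pass
    return total, page


class FakeCursor:
    """Stand-in cursor supporting skip(), limit() and iteration."""

    def __init__(self, documents):
        self._documents = list(documents)

    def skip(self, count):
        return FakeCursor(self._documents[count:])

    def limit(self, count):
        return FakeCursor(self._documents[:count] if count else self._documents)

    def __iter__(self):
        return iter(self._documents)


class FakeCollection:
    """Stand-in collection; not part of PyMongo."""

    def __init__(self, documents):
        self._documents = list(documents)

    def _matching(self, query):
        return [doc for doc in self._documents
                if all(doc.get(key) == value for key, value in query.items())]

    def count_documents(self, query):
        return len(self._matching(query))

    def find(self, query):
        return FakeCursor(self._matching(query))


items = FakeCollection([{"dept_id": 17, "n": n} for n in range(25)])
total, page = page_with_total(items, {"dept_id": 17}, skip=10, limit=5)
print(total, [doc["n"] for doc in page])
```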