Pymongo ignoring my limit parameter - mongodb

I am using Pymongo (v3.5.1) in a Python v3.6.3 Jupyter notebook.
Problem
Even though I am limiting my results, db.collection.find() still retrieves all results before returning.
My code:
for post in posts.find({'subreddit_1':"the_donald"}, limit=2):
    print(post)
exit
Background
I have imported the Reddit comment data set (RC_2017-01) from files.pushshift.io and created an index on the subreddit field (subreddit_1).
My Indexes

I believe this is caused by the collection having no index on your query term, as exhibited by the line:
planSummary: COLLSCAN
which means that to answer your query, MongoDB is forced to look at each document in the collection one by one.
Creating an index to support your query should help. You can create an index in the mongo shell by executing:
db.posts.createIndex({'subreddit_1': 1})
This is assuming your collection is named posts.
Please note that creating that index would only help with the query you posted. Different types of queries will likely need different indexes.
To read more about how indexing works in MongoDB, check out https://docs.mongodb.com/manual/indexes/
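One way to check whether a query still falls back to a collection scan is to walk the plan returned by explain(). The following is a minimal Python sketch, assuming the classic explain layout with queryPlanner.winningPlan and nested inputStage / inputStages keys. The helper names and the two sample dictionaries are hypothetical and hand-written for illustration; they are not captured server output.

```python
def plan_stages(plan):
    """Yield every stage name found in an explain() winning-plan tree."""
    yield plan.get("stage")
    # Single-child stages nest under "inputStage", multi-child ones under "inputStages".
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children = [plan["inputStage"]] + children
    for child in children:
        yield from plan_stages(child)


def uses_collection_scan(explain_output):
    """Return True if the winning plan contains a COLLSCAN stage."""
    winning = explain_output["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in plan_stages(winning)


# Hand-written samples shaped like explain() output (assumed layout, not real logs).
sample_without_index = {"queryPlanner": {"winningPlan": {
    "stage": "LIMIT", "inputStage": {"stage": "COLLSCAN"}}}}
sample_with_index = {"queryPlanner": {"winningPlan": {
    "stage": "LIMIT",
    "inputStage": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}}}}}

print(uses_collection_scan(sample_without_index))  # True
print(uses_collection_scan(sample_with_index))     # False
```

With PyMongo, the index could be created with posts.create_index([("subreddit_1", 1)]) and the helper fed with posts.find({...}, limit=2).explain(), assuming posts is the collection object from the question.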

I think you need to change the query, because the second positional parameter of find() is the projection. find() always returns a cursor, and limit() always works on that cursor.
So the syntax should look like this:
for post in posts.find({'subreddit_1':"the_donald"})[<start_index>:<end_index>]:
    print(post)
exit
OR
for post in posts.find({'subreddit_1':"the_donald"}).limit(2):
    print(post)
exit
Please read the docs for details.
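For comparison, PyMongo's find() also accepts limit as a keyword argument, so both spellings can be written side by side. This is a minimal sketch: the function names are hypothetical, and posts is assumed to be a PyMongo collection (or anything with a compatible find()).

```python
def first_posts(posts, subreddit, n=2):
    """Return at most n matching documents using the limit keyword argument."""
    return list(posts.find({"subreddit_1": subreddit}, limit=n))


def first_posts_chained(posts, subreddit, n=2):
    """Same query, with limit() applied to the cursor returned by find()."""
    return list(posts.find({"subreddit_1": subreddit}).limit(n))
```

Both functions should return the same documents. If the query is still slow, the limit is probably not the cause, and the query plan is the next thing to look at.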

Related

Return single document in mongo aggregation in Go driver

I am using official mongo driver for Golang: go.mongodb.org/mongo-driver/mongo
Preface
In this driver, I could not find any method for returning a single object from an aggregation query.
driver aggregation documentation
Problem
The problem is that if I have documents that should be filtered and only the first one returned, I am forced to fetch all matching documents and take the one at index 0. As far as I know, this is not optimal.
I have only one method for aggregation here, which returns cursor of multiple objects:
Is it possible to get a single object from an aggregation in this driver?
Aggregation always returns a list of documents, but you may use the $limit stage to only return one document.
bson.M{"$limit": 1}
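The same idea can be sketched in Python (PyMongo-style) rather than Go, since the pipeline is just a list of stage documents. The helper names are hypothetical, and collection is assumed to expose an aggregate() method that returns an iterable of documents, as PyMongo collections do.

```python
def first_match_pipeline(match_filter):
    """Build a pipeline that filters, then keeps only the first document."""
    return [{"$match": match_filter}, {"$limit": 1}]


def aggregate_one(collection, match_filter):
    """Run the pipeline and return the single document, or None if nothing matched."""
    cursor = collection.aggregate(first_match_pipeline(match_filter))
    # The cursor still yields a sequence, but $limit caps it at one document.
    return next(iter(cursor), None)
```

Because $limit runs on the server, at most one document is returned to the client, even though the driver still hands back a cursor.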

CosmosDB MongoDB 3.6 fails sort() query with compounded index

Newbie MongoDB and CosmosDB user here. I've read the answer to the question "How does MongoDB treat find().sort() queries with respect to single and compound indexes?" and the official MongoDB docs, and I believe my index creation mirrors that answer, so I am leaning towards this being a CosmosDB issue. But according to their documentation, CosmosDB 3.6 supports compound indexes as well, so I am at a loss right now.
I am able to run sort() queries like db.Videos.find().sort({"PublishedOn": 1}) from the mongo command line on a collection with an index created as db.Videos.createIndex({"PublishedOn": 1}) or db.Videos.createIndex({"PublishedOn": -1}).
And when I add a 'where' clause to the find like this db.Videos.find({"IsPinned": false}).sort({"PublishedOn": 1}) the above index still works.
However, I now have document lookups that I want to avoid, so I dropped the above single-field index and created a compound index like this: db.Videos.createIndex({"IsPinned": 1, "PublishedOn": 1}) or db.Videos.createIndex({"PublishedOn": 1, "IsPinned": 1}). Now the query always fails with the error "The index path corresponding to the specified order-by item is excluded."
Is this a limitation of CosmosDB or is my 'ordering' in the index bad?
The issue with CosmosDB is that it expects all fields used in the filter (the WHERE clause) to also appear in the sort (the ORDER BY clause), in exactly the same order as in the index; otherwise it won't use the index.
Creating an index as db.Videos.createIndex({"IsPinned": 1, "PublishedOn": 1}) and then updating the query to be db.Videos.find({"IsPinned": false}).sort({"IsPinned": 1, "PublishedOn": 1}) works like a charm.
I inferred this from reading the CosmosDB documentation on indexing policies (https://learn.microsoft.com/en-us/azure/cosmos-db/index-policy) as the MongoDB documentation suddenly stops after the index creation (https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb-indexing) section.
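In PyMongo terms, the workaround amounts to reusing the compound index keys as the sort specification. A minimal sketch, assuming videos is a PyMongo-style collection; INDEX_KEYS and the function name are hypothetical.

```python
# Same key order as the compound index created in the shell example.
INDEX_KEYS = [("IsPinned", 1), ("PublishedOn", 1)]


def unpinned_by_publish_date(videos):
    """Filter on IsPinned and sort on every index key, in index order."""
    # Including IsPinned in the sort mirrors the index definition, which is
    # what the answer reports CosmosDB requires before it will use the index.
    return videos.find({"IsPinned": False}).sort(INDEX_KEYS)
```

The same list could be passed to videos.create_index(INDEX_KEYS), which keeps the index definition and the sort from drifting apart.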

MONGODB: $in operator not matching any record

community!
I am in a weird situation. The direct equality check returns a result, but when using $in I am not getting any records.
db.getCollection("voter").find({"id":{$in:["db1eefc5-09ad-4d4f-a31a-db63d8261913"]}})
db.voter.find({"id":{$in:["db1eefc5-09ad-4d4f-a31a-db63d8261913"]}})
Doesn't return anything.
db.voter.find({id: "db1eefc5-09ad-4d4f-a31a-db63d8261913"})
Returns the desired record.
Being more of a full-stack developer, I don't know what's happening in depth, but I am sure that both queries should work, which is not the case here.
Extra info:
I have defined hashed unique indexes on id.
Thanks.
The problem is pretty simple:
In the first screenshot you are running your query against the admin database,
while the second query is executed against the crmadmin database.

Mongo Get Count While Returning Whole Documents and Should Queries

I am new to Mongo and can't seem to figure out the following after reading posts and the documentation. I am executing the following query:
db.collection.find({'name':'example name'})
Which returns 14 results. I can get the count correctly by executing:
db.collection.find({'name':'example name'}).count()
However, I want to return the full documents and the count in a single query, similar to the way Elasticsearch does. Is there any way to do this?
Additionally, is there an equivalent to Elasticsearch's Bool should query (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html)? Essentially I want to rank the results so that those with attribute 'onSale=True' are returned before 'onSale=False'.
I'm not sure about your second question, whether MongoDB provides some mechanism equivalent to Elasticsearch's Bool should query.
But for your first question, I think you can use a cursor.
var cursor = db.collection.find({'name':'example name'});
Once you've got the cursor, you can use it for getting the count in the following way:
cursor.count()
as well as for getting the documents wrapped in an array in the following way:
cursor.toArray()
For more info on cursors, see the link below:
http://docs.mongodb.org/manual/tutorial/iterate-a-cursor/
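A rough Python equivalent of the cursor approach is to materialize the results once and count them locally, so that only one query is sent. This is a sketch under the assumption that the result set is small enough to hold in memory (14 documents in the question); find_with_count is a hypothetical helper, and collection is assumed to be a PyMongo-style collection.

```python
def find_with_count(collection, query):
    """Return (documents, count) from a single find() round trip."""
    docs = list(collection.find(query))  # plays the role of cursor.toArray()
    return docs, len(docs)               # counted locally, no second query
```

For example, docs, total = find_with_count(collection, {"name": "example name"}) yields both values at once.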

mongoDB Object DBCursor has no method 'sort'

So I created a collection called food with 2 objects that were saved without a problem. Using find() yielded no issues either. However, when I entered the following:
db.food.find().sort({averageRating:-1}).select({_id: 1}).limit(2)
I get the error:
JS Error: TypeError: Object DBCursor has no method 'sort'
What am I doing wrong?
Is this what you are looking for?
db.food.find({},{_id:1}).sort({"averageRating":-1}).limit(2);
It selects only 2 _id fields, ordered by average rating descending. The fields to be returned are specified by the second parameter of find(), which in this case is _id.
select is not a valid cursor method in MongoDB, as far as I know.
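The same query shape can be sketched with PyMongo, where the projection is likewise the second argument to find() and sort() and limit() are chained on the cursor. The function name is hypothetical, and food is assumed to be a PyMongo-style collection.

```python
def top_rated_ids(food, n=2):
    """Return the _id values of the n highest-rated documents."""
    cursor = (
        food.find({}, {"_id": 1})      # empty filter, project only _id
        .sort("averageRating", -1)     # -1 means descending
        .limit(n)
    )
    return [doc["_id"] for doc in cursor]
```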
It should be selector, not select. See if that fixes it.
As per shargors' comment, it looks like try.mongodb.org doesn't support sort(). I would recommend downloading and installing mongodb itself, and playing around with the real shell.