Mongo Cursor Hangs after first fetch - mongodb

I am running a lookup query using the aggregation pipeline.
While iterating over the cursor returned by the aggregation, I have noticed that the program hangs when the driver sends the getMore command. I believe this command is used to fetch the next batch of records from the DB.
Initially I tested with a batch size of 100: it brings back the first 100 records fairly quickly, but when it tries to fetch the next batch, the cursor hangs.
I tried the same with a batch size of 2 and got the same result.
I am using the MongoDB Java driver version 3.6 and server version 3.6.2.
I also found a similar issue raised for the Python driver in the past (https://jira.mongodb.org/browse/PYTHON-276).
Has anyone experienced this before? Any suggestions would be helpful.
Let me know if any other details are required.
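For context, my setup looks roughly like this (a minimal sketch; the database, collection and field names are placeholders, not my actual schema):

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCursor;
import org.bson.Document;
import java.util.Arrays;

MongoClient client = new MongoClient("localhost", 27017);
// $lookup joins each order with the matching documents from the "items" collection
MongoCursor<Document> cursor = client.getDatabase("test")
        .getCollection("orders")
        .aggregate(Arrays.asList(
                new Document("$lookup", new Document("from", "items")
                        .append("localField", "itemId")
                        .append("foreignField", "_id")
                        .append("as", "joined"))))
        .batchSize(100)     // the first batch of 100 comes back quickly
        .iterator();
while (cursor.hasNext()) {  // hangs here once the driver issues getMore for the next batch
    Document doc = cursor.next();
}
cursor.close();
client.close();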

Related

Why is MongoDB not displaying all results

This might seem quite simple, but does anyone know why MongoDB is only returning the first 20 results despite my importing 800 from a CSV file?
Thanks
The mongo shell is not intended to be a production client; it is an administrative and test tool.
When you run a query, the mongo shell returns and displays the first batch of results (20 documents by default). You will need to either request an additional batch (i.e. type "it" for more), use a method like toArray() to exhaust the cursor, or save the cursor to a variable so you can iterate it.

MongoDB closes connection on read operation

I run MongoDB 4.0 on WiredTiger under Ubuntu Server 16.04 to store complex documents. There is an issue with one of the collections: the documents have many images stored as base64 strings. I understand this is bad practice, but I need some time to fix it.
Because of this, some find operations fail, but only those which have a non-empty filter or a skip. For example, db.collection('collection').find({}) runs OK, while db.collection('collection').find({category: 1}) just closes the connection after a timeout. It doesn't matter how many documents should be returned: if there's a filter, the error occurs every time (even if it should return 0 docs), while an empty query always executes well until the skip is too big.
UPD: some skip values make queries fail. db.collection('collection').find({}).skip(5000).limit(1) runs well, db.collection('collection').find({}).skip(9000).limit(1) takes much longer but still executes, while db.collection('collection').find({}).skip(10000).limit(1) fails every time. It looks like there's some kind of buffer where the DB stores query-related data, and at around 10000 docs it runs out of resources. The collection itself has ~10500 docs. Also, searching by _id runs OK. Unfortunately, I have no opportunity to create new indexes because that operation fails just like the reads.
What temporary solution can I use before removing the base64 images from the collection?
This happens because such a problematic data scheme causes huge RAM usage. The more entities there are in the collection, the more RAM is needed not only to perform well but even to run find.
Increasing MongoDB's default cache size with the storage.wiredTiger.engineConfig.cacheSizeGB config option allowed all the operations to run fine.
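For reference, that option goes in mongod.conf; a minimal excerpt might look like the following (the 2 GB value is purely illustrative; size the cache to your hardware):

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 2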

How to disable MongoDB aggregation timeout

I want to run an aggregation on my large data set (it's about 361K documents) and insert the results into another collection.
I am getting this error:
I tried to increase Max Time, but it has a maximum and it's not enough for my data set. I found https://docs.mongodb.com/manual/reference/method/cursor.noCursorTimeout/ but it seems noCursorTimeout only applies to find, not to aggregation.
Please tell me how I can disable the cursor timeout, or suggest another solution for this task.
I am no MongoDB expert, but I will interpret what I know.
MongoDB aggregation cursors don't have a mechanism to adjust the batch size or set cursor timeouts.
Therefore there is no direct way to alter this, and the timeout of an aggregation query depends solely on the cursorTimeoutMillis parameter of the mongod or mongos instance. Its default value is 10 minutes.
Your only option is to change this value with the command below.
use admin
db.runCommand({setParameter:1, cursorTimeoutMillis: 1800000})
However, I strongly advise against using this command, because it weakens a safety mechanism built into MongoDB: cursors that sit idle for more than 10 minutes are automatically closed so that there is less load on the MongoDB server. If you change this parameter (say, to 30 minutes), MongoDB will allow idle cursors to stay open in the background for those 30 minutes, which will not only make new queries slower to execute but also increase load and memory usage on the MongoDB side.
You have a couple of workarounds: reduce the number of documents if you are working in MongoDB Compass, or copy and run the commands in the mongo shell (I have had success with this method so far).

With the MongoDB Java driver, maxAwaitTime not working on a change stream

I'm using the MongoDB Java driver version 3.8.0 and MongoDB 3.6.3.
I created a watch on a collection with this:
MongoCursor<ChangeStreamDocument<Document>> cursor = collection.watch().maxAwaitTime(500, TimeUnit.MILLISECONDS).iterator();
The documentation here states the following about maxAwaitTime:
The maximum amount of time for the server to wait on new documents to satisfy a change stream query.
However, what I'm seeing is that cursor.hasNext() returns only when there is a change on the collection, not when the time passed to maxAwaitTime has elapsed.
When I turn on MongoDB's verbose logging I see maxAwaitTime set as expected in the getMore command.
How do I cause my watch to time out when there are no changes? It seems as though maxAwaitTime should do this. Why doesn't it work?
MongoDB drivers implement a change stream as an abstraction over a TAILABLE_AWAIT cursor, and maxAwaitTimeMS specifies a time limit for a getMore command on a TAILABLE_AWAIT cursor.
The way it works, the MongoCursor continues to send the getMore command to the server in a loop until either:
A document is found
The cursor is closed
An exception occurs
Only when one of the events above happens will the cursor's next() or hasNext() method return.
While none of these events has happened, the getMore command will continue to be issued by the Iterator interface. maxAwaitTime specifies how long the server waits for documents before the getMore command times out and returns, even if no documents were found.
How do I cause my watch to time out when there are no changes?
If your application requires a timeout after maxAwaitTime, the mongo-java-driver offers the tryNext() method on MongoCursor. This method will return null after maxAwaitTime if no documents were found, and it can be called repeatedly by the application. See also JAVA-2965.
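For illustration, a minimal polling sketch (assuming a replica set reachable at localhost and a collection named "events"; change streams require a replica set, and both names are placeholders):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import org.bson.Document;
import java.util.concurrent.TimeUnit;

MongoClient client = MongoClients.create("mongodb://localhost:27017");
MongoCursor<ChangeStreamDocument<Document>> cursor = client.getDatabase("test")
        .getCollection("events")
        .watch()
        .maxAwaitTime(500, TimeUnit.MILLISECONDS)
        .iterator();
while (true) {
    // tryNext() returns a change event, or null once maxAwaitTime elapses with no changes
    ChangeStreamDocument<Document> change = cursor.tryNext();
    if (change == null) {
        // timed out with no changes: do any periodic work here, then poll again
        continue;
    }
    System.out.println(change.getOperationType() + " on " + change.getDocumentKey());
}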

How do I get feedback on whether a MongoDB query succeeded and how many rows were changed?

I have a MongoDB database and am using MongoChef to write scripts for it. I have a script that reads in data from a collection and inserts the records into another collection. The script runs fine, but I don't get any feedback on what occurred. Is there a way to get acknowledgement that the script has finished running (that is, all the records are inserted)? Is there a way to get an output of how many records were (in this case) inserted? (I realize I could write another statement to count records, but I want to know how many records were actually inserted by the insert statement). What I'd like to see is something like "Script successful. 1200 records inserted into collection properties." Can someone show me how to turn on this output for MongoChef? Thank you.
Below is an image of my script, after it has been run. Notice that there's nothing in the results tabs; there's no indication the queries have been run, that they ran successfully, or how many records were updated.
You can go through the MongoDB documentation for write concerns and see what information matches your MongoDB version. Previously, getLastError was used to get error information about the last executed CRUD statement.
getLastError can give you information about any error that occurred after execution of a CRUD operation.
You can also use the WriteResult that is returned by the insert, update, remove and save operations to get the number of affected documents. It also contains properties like writeError to get information specific to that operation.
Sample (mongo shell; not specific to MongoChef):
var wr = db.properties.insert(doc);  // WriteResult exposes nInserted for inserts
print("Script successful. " + wr.nInserted + " record(s) inserted into collection properties.");