Can we set a different cursor idle timeout per query in MongoDB?

Please advise how to set the cursor idle timeout to a larger value for only certain queries, to get around the default threshold of 10 minutes that works perfectly well for all other queries.
(The only option I see is to use db.col.find().noCursorTimeout() for those long-running queries, but I'm wondering whether the timeout can be customized per query, or at least per group of queries.)
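For reference, a minimal sketch of that per-query option in the mongo shell (the collection name col and the filter are placeholders); noCursorTimeout() disables the idle timeout for that one cursor only, so you then become responsible for closing it yourself:

var cursor = db.col.find({ status: "pending" }).noCursorTimeout()
// ... iterate the cursor, possibly over more than 10 minutes ...
cursor.close()   // close explicitly, since the server will no longer time it out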

Related

How to disable MongoDB aggregation timeout

I want to run an aggregation on my large data set (about 361K documents) and insert the results into another collection.
I am getting this error:
I tried to increase the max time, but it has a maximum and it's not enough for my data set. I found https://docs.mongodb.com/manual/reference/method/cursor.noCursorTimeout/ but it seems noCursorTimeout only applies to find, not to aggregation.
Please tell me how I can disable the cursor timeout, or suggest another solution.
I am no MongoDB expert, but I will share what I know.
MongoDB aggregation cursors don't have a mechanism to adjust the batch size or set cursor timeouts per query.
Therefore there is no direct way to alter this, and the timeout of an aggregation query depends solely on the cursorTimeoutMillis parameter of the mongod or mongos instance. Its default value is 10 minutes.
Your only option is to change this value with the commands below.
use admin
db.runCommand({setParameter:1, cursorTimeoutMillis: 1800000})
However, I strongly advise you against using this command. That's because it's a safety mechanism built into MongoDB: it automatically kills cursors that have been idle for more than 10 minutes, so that there is less load on the MongoDB server. If you change this parameter (say, to 30 minutes), MongoDB will allow idle cursors to stay open in the background for those 30 minutes, which will not only make new queries slower to execute, but also increase load and memory usage on the MongoDB side.
You have a couple of workarounds: reduce the number of documents if you are working in MongoDB Compass, or copy and run the commands in the mongo shell (I have had success so far with this method).
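As an additional sketch, not from the answer above: since the results are meant to go into another collection anyway, writing the output on the server side with $out avoids keeping a long-lived client-side cursor open at all. The collection names source_coll and target_coll are placeholders, and allowDiskUse is only needed if a pipeline stage exceeds the memory limit.

db.source_coll.aggregate(
  [
    { $match: {} },             // process all documents
    { $out: "target_coll" }     // server writes the results directly into target_coll
  ],
  { allowDiskUse: true }        // let large stages spill to disk instead of failing
)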

How to increase pagesize temporarily?

Just for testing purposes I would like to get 100, 500, 1000, 5000, 10000, 20000 ... records from a collection. At the moment the largest pagesize is 1000. How can I increase it to whatever I would like, just for testing?
RESTHeart has a pagesize limit of 1000 documents per request, and that's hardcoded in the class org.restheart.handlers.injectors.RequestContextInjectorHandler.
If you, for any reason, want to increase that limit then you have to change the source code and build your own jar.
However, RESTHeart speeds up the execution of GET requests to collection resources via its db cursors pre-allocation engine. This applies when several documents need to be read from a big collection, and it moderates the effect of the MongoDB cursor.skip() method, which slows down linearly. So it already optimizes navigating large MongoDB collections, if this is what you are looking for.
Please have a look at Speedup Requests with Cursor Pools and Performances page in the official documentation for more information.
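For context, here is a rough illustration in the mongo shell of the skip()/limit() pattern that page-based access maps to (coll, page and pagesize are placeholder names); the server still has to walk over every skipped document, which is why later pages get linearly slower and why the cursor pre-allocation engine helps:

var page = 5, pagesize = 1000;   // placeholder values
db.coll.find().skip((page - 1) * pagesize).limit(pagesize)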

What is use of negative Limit in MongoDB

When I was reading about the limit() method, I found this line:
A negative limit is similar to a positive limit but closes the cursor
after returning a single batch of results. As such, with a negative
limit, if the limited result set does not fit into a single batch, the
number of documents received will be less than the specified limit.
I can't understand this explanation, so can anyone explain it with a suitable example?
If your query returns 100 elements before applying the limit operation, and the shell batch size is 10 documents, then you have 10 batches of data for your query (if you run it via the mongo shell). You can iterate through the data with the it command until the end (10 iterations).
If you add limit(30) to your query, you indicate that you want to get only 30 elements. mongod will keep the cursor from the mongo shell open until you have gone through all of the data.
However, if you set limit(-30), your query result on the server still has 30 elements, but mongod only returns the first batch (10 elements) to the shell, and you cannot go through the rest with the it command because the cursor is closed.
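A small illustration of the difference in the mongo shell (assuming, as above, a collection coll with 100 matching documents and a batch size of 10):

db.coll.find().limit(30)     // cursor stays open; typing "it" pages through all 30 documents
db.coll.find().limit(-30)    // server sends a single batch (here 10 documents) and closes the cursor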

Aggregation-framework: optimization

I have a document structure like this
{
id,
companyid,
fieldA1,
valueA1,
fieldA2,
valueA2,
.....
fieldB15,
valueB15,
fieldF150,
valueF150
}
My job is to multiply fieldA1 * valueA1, fieldA2 * valueA2, ... and sum them up into new fields: A_sum = sum(A fields * A values), B_sum = sum(B fields * B values), C_sum, etc.
Then in the next step I have to generate final_sum = (A_sum * A_val + B_sum * B_val + .....).
I have modeled this with the aggregation framework, using 3 projections for the three steps of the calculation. At this point I get about 100 seconds for 750,000 docs; I have an index only on _id, which is a GUID. CPU is at 15%.
I tried to group in order to force parallel ops and load the CPU more, but it seems to take longer.
What else can I do to make it faster, i.e. load the CPU more and use more parallelism?
I don't need a match since I have to process all docs.
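For illustration, a minimal sketch of the kind of pipeline described above (only fieldA1, fieldA2 and fieldB1 are shown; the field names and the final weights A_val / B_val are assumptions based on the description, not the poster's actual code):

// placeholder weights for the final step; in the real pipeline they may be fields or constants
var A_val = 1.0, B_val = 1.0;

db.coll.aggregate([
  // step 1: per-group sums of field * value pairs
  { $project: {
      companyid: 1,
      A_sum: { $add: [
          { $multiply: [ "$fieldA1", "$valueA1" ] },
          { $multiply: [ "$fieldA2", "$valueA2" ] }
      ] },
      B_sum: { $multiply: [ "$fieldB1", "$valueB1" ] }
  } },
  // step 2: combine the group sums into the final figure
  { $project: {
      companyid: 1,
      final_sum: { $add: [
          { $multiply: [ "$A_sum", A_val ] },
          { $multiply: [ "$B_sum", B_val ] }
      ] }
  } }
])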
You might get it done using sharding, as the scanning of the documents would be done in parallel.
Simply measure the time your aggregation needs now, and calculate the number of shards you need using
((t/100)+1)*s
where t is the time the aggregation took in seconds and s is the number of existing shards (1 if you have a standalone or a replica set), rounded up, of course. The 1 is added to make sure that the overhead of doing an aggregation in a sharded environment is offset by the additional shard.
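(As a worked example, not from the answer itself: the poster's aggregation below takes roughly 395 seconds on an unsharded setup, so s = 1 and ((395/100) + 1) * 1 = 4.95, which rounds up to 5 shards.)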
My only solution was to split the collection into smaller collections (it takes the same space after all) and run the computation per smaller collection (via a C# console app) using the Parallel library, so I can raise CPU usage to 70%.
That reduces the time from approx. 395 s at 15% CPU (script via Robomongo, all docs) to 25-28 s at 65-70% CPU (C# console app with parallelism).
Using grouping did not help in my case.
Sharding is not an option right now.

Set low priority for long mongodb query

guys!
I have a long query which executes 1-2 times at night. This query extracts data only (via find) and runs for about 15-20 minutes.
While this query is executing, MongoDB can't process other queries.
Is it possible to set a low priority for a query?
It would take some doing, but the best way to execute a long-running query like this would be against a hidden (read-only) replica set member.
http://docs.mongodb.org/manual/core/replica-set-hidden-member/#replica-set-hidden-members
As a hidden member, it won't be hit by your clients in the standard replica set rotation. As described here: http://docs.mongodb.org/manual/core/read-preference/ you can set your read preference to secondary to force reads off of your primary db.
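For reference, a minimal sketch of the read-preference part in the mongo shell (the collection name col and the empty filter are placeholders); this assumes you are connected to the replica set rather than to a single node:

db.getMongo().setReadPref("secondary")   // route reads on this connection to a secondary
db.col.find({})                          // the long nightly extract now avoids the primary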