How to automatically kill slow MongoDB queries? - mongodb

Is there a way that I can protect my app against slow queries in MongoDB?
My application supports many possible filter combinations. I'm monitoring all of these queries, but at the same time I don't want to compromise performance because of a missing index definition.

The 'notablescan' option, as @ghik mentioned, will prevent you from running queries that are slow because they don't use an index. However, that option is global to the server, and it is not appropriate for use in a production environment. It also won't protect you from any other source of slow queries besides table scans.
Unfortunately, I don't think there is a way to directly do what you want right now. There is a JIRA ticket proposing the addition of a $maxTime or $maxScan query parameter, which sounds like it would help you, so please vote for it: https://jira.mongodb.org/browse/SERVER-2212.
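For completeness, here is roughly how notablescan is enabled; treat this as a sketch for a development environment only, since the setting is server-wide:
// At startup:
//   mongod --setParameter notablescan=1
// Or at runtime, from the shell, against a running dev server:
db.adminCommand({ setParameter: 1, notablescan: 1 })
// With this set, any query that would require a collection scan fails
// instead of running slowly, which surfaces missing indexes during testing.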

There are options available on the client side (maxTimeMS, starting in the 2.6 release).
On the server side, there is no appealing global option, because it would impact all databases and all operations, even ones that the system needs to keep long-running for internal purposes (for example, tailing the oplog for replication). In addition, some of your queries may be long-running by design.
The correct way to solve this would be to monitor currently running queries via a script and kill the ones that are long running and user/client initiated - you can then build in exceptions for queries that are long running by design, or have different thresholds for different queries/collections/etc.
Use the db.currentOp() method (in the shell) to see all currently running operations. The field "secs_running" indicates how long an operation has been running. Be careful not to kill long-running operations that are not initiated by your application/client; they may be necessary system operations, like chunk migration in a sharded cluster (to name just one example).
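As a rough illustration of that approach, a shell sketch along these lines could be run periodically. The 30-second threshold and the appName filter are assumptions; adapt them to identify your own clients (if your driver sets an application name) so system operations are left alone:
db.currentOp().inprog.forEach(function (op) {
  // Only touch active, client-issued queries that have run too long.
  if (op.active && op.secs_running > 30 && op.op === "query" &&
      op.appName === "myApp") {            // hypothetical application name
    printjson({ killing: op.opid, ns: op.ns, secs: op.secs_running });
    db.killOp(op.opid);
  }
});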

Right now with version 2.6 this is possible. In their press release you can see the following:
with MaxTimeMS, operators and developers can specify auto-cancellation
of queries, providing better control of resource utilization;
Therefore, with maxTimeMS you can specify how much time you allow your query to run. For example, say I do not want a specific query to run for more than 200 ms:
db.collection.find({
// my query
}).maxTimeMS(200)
What is cool about this, is that you can specify different timeouts for different operations.
To answer the OP's question in the comment: there is no global setting for this. One reason is that different queries can tolerate different maximum times. For example, you might have a query that finds userInfo by its ID. This is a very common operation and should run super fast (otherwise we are doing something wrong), so we cannot tolerate it running longer than 200 ms.
But we might also have an aggregation query that we run once a day. For this operation it is OK to run for 4 seconds, but we cannot tolerate it taking longer than 10 seconds, so we can use 10000 as the maxTimeMS value.
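For illustration, both budgets could look like this in the shell (the collection names, fields, and pipeline are made up):
// A point lookup that should always be fast:
db.users.find({ email: "user@example.com" }).maxTimeMS(200)

// A once-a-day aggregation that is allowed to take up to 10 seconds:
db.orders.aggregate(
  [ { $group: { _id: "$status", count: { $sum: 1 } } } ],
  { maxTimeMS: 10000 }
)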

I guess there is currently no support for killing a query by passing a time argument. However, on the development side you can set the profiler level to 2. It will log every query that has been issued. From there you can see how much time each query takes. I know it's not exactly what you wanted, but it helps in getting insight into which queries are fat, and then in your app logic you can have some way to gracefully handle the cases where those queries might originate. I usually go with this approach, and it helps.
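A minimal sketch of that workflow in the shell (the 200 ms threshold is just an example):
// Development only: log every operation to the system.profile collection.
db.setProfilingLevel(2)

// Later, inspect the slowest recorded operations:
db.system.profile.find({ millis: { $gt: 200 } }).sort({ ts: -1 }).limit(10)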

Just putting this here since I was struggling with the same for a while:
Here is how you can do it in Python 3.
Tested on MongoDB 4.0 and pymongo 3.11.4:
import logging

import pymongo

client = pymongo.MongoClient("mongodb://mongodb0.example.com:27017")
admin_db = client.get_database("admin")

milliseconds_running = 10000

# List currently active operations that have been running too long,
# restricted to our own namespaces and to plain queries.
query = [
    {"$currentOp": {"allUsers": True, "idleSessions": True}},
    {
        "$match": {
            "active": True,
            "microsecs_running": {"$gte": milliseconds_running * 1000},
            "ns": {"$in": ["mydb.collection1", "mydb.collection2"]},
            "op": {"$in": ["query"]},
        }
    },
]

ops = admin_db.aggregate(query)
count = 0
for op in ops:
    # Kill each matching operation by its opid.
    admin_db.command({"killOp": 1, "op": op["opid"]})
    count += 1
logging.info("ops found: %d", count)
I wrote a more robust and configurable script for it here.
It also has a Dockerfile in case anyone wants to use this as a container. I am currently using it as a periodically running cleanup task.

Related

MongoDB closes connection on read operation

I run MongoDB 4.0 on WiredTiger under Ubuntu Server 16.04 to store complex documents. There is an issue with one of the collections: the documents have many images written as strings in base64. I understand this is a bad practice, but I need some time to fix it.
Because of this, some find operations fail, but only those with a non-empty filter or skip. For example, db.collection('collection').find({}) runs OK, while db.collection('collection').find({category: 1}) just closes the connection after a timeout. It doesn't matter how many documents should be returned: if there's a filter, the error pops up every time (even if it should return 0 docs), while an empty query always executes fine until skip is too big.
UPD: some skip values make queries fail. db.collection('collection').find({}).skip(5000).limit(1) runs well, db.collection('collection').find({}).skip(9000).limit(1) takes much longer but still executes, while db.collection('collection').find({}).skip(10000).limit(1) fails every time. It looks like there's some kind of buffer where the DB stores query-related data, and at 10000 docs it runs out of resources. The collection itself has ~10500 docs. Also, searching by _id runs OK. Unfortunately, I have no way to create new indexes because that operation fails just like reads do.
What temporary solution can I use before removing the base64 images from the collection?
This happens because such a problematic data scheme causes huge RAM usage. The more entities there are in the collection, the more RAM is needed, not only to perform well but even to run find at all.
Increasing MongoDB's default RAM allowance with the storage.wiredTiger.engineConfig.cacheSizeGB config option allowed all the operations to run fine.
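For reference, that option lives in the storage section of mongod.conf; the 4 GB value below is only an example and should be sized to the host:
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4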

Continuously run MongoDB aggregation pipeline

I have an ETL pipeline that is sinking timeseries records to MongoDB.
I need to compute time-based aggregations for daily, weekly, and similar resolutions. I assumed MongoDB's aggregation engine would be the way to go, so after writing the aggregation queries for each resolution I wrapped them in MongoDB views like "daily_view", "weekly_view", etc.
There is a REST service that fetches from MongoDB. Depending on the requested period resolution, it pulls from the corresponding view, filtering by start and end dates.
The response times with these views/aggregations are quite poor, around 10-15 seconds. That might not be outrageous for batch-computing a report, but in my case the service needs to issue these requests live to serve the frontend, so a 10-second wait is too much.
From the MongoDB reference I know that views are computed on demand during read operations, but I'm a bit disappointed with such response times because the same aggregations took a split second in Elasticsearch or InfluxDB, which unfortunately are not an option for me at the moment.
I have also exhausted the research on optimizing the queries, and there is no room for further improvement beyond what is already in place.
My intuition tells me that if the aggregations have to be done via the aggregation engine, I need the pipelines executing continuously on the fly (so the views already contain records for the service), as opposed to being run ad hoc every time.
I've tried dropping the views and instead having an aggregation whose last stage is an $out to a real collection... but I still have the same problem: it needs to be run "on demand". I composed the pipelines using the Compass UI, and in the $out stage it presents a button to run the aggregation.
Would there be a way to schedule such pipelines/aggregation queries??
Something I can think of is copy-pasting the aggregation code into JavaScript functions of the REST service... but still, something would have to invoke those functions at a regular interval. I know there are libraries I can bring into the service for scheduling, but this option makes me a bit uncomfortable in terms of architecture.
In the worst-case scenario, my backup plan is to implement the time-based aggregations as part of the logic of the initial ETL and sink all the different resolutions to different collections, so the service will find records already waiting in the aggregated collections. But the intention was to delegate the time aggregations to the datastore engine.
I'm having a bit of last-minute architecture distress now.
$out aggregation stage. Documentation.
Takes the documents returned by the aggregation pipeline and writes them to a specified collection. The $out operator must be the last stage in the pipeline.
mongo accepts a JavaScript file as an argument, so this is the easiest way to package your aggregation. Reference.
mongo file.js --username username --password
Then, to execute it on a schedule, common tools like cron jobs come to the rescue.
You might need to account for the differences between the mongo shell and plain JavaScript, such as using db = db.getSiblingDB('<db>') instead of use <db>. See Write Scripts for the mongo Shell.
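A sketch of what such a script might look like (the database, collection, and field names are hypothetical):
// aggregate_daily.js -- run on a schedule, e.g. a crontab entry like:
//   0 * * * * mongo aggregate_daily.js --username username --password
db = db.getSiblingDB("mydb");   // script equivalent of `use mydb`
db.timeseries.aggregate([
  { $group: {
      _id: { $dateToString: { format: "%Y-%m-%d", date: "$ts" } },
      total: { $sum: "$value" }
  } },
  { $out: "daily_view_materialized" }   // rewrites the target collection each run
]);
The REST service then reads from the materialized collection instead of a view, so it never pays the aggregation cost at request time.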

MongoDB : Kill all tasks in mongoDB that takes too long and exceeds certain wait-time

I want to kill all queries that take too long to execute. I have tried socketTimeOut, but it is not a good way to do this and it did not achieve the expected result.
Read this for an explanation: http://mongodb.github.io/node-mongodb-native/2.2/reference/faq/. There is also the maxTimeMS option, but we cannot make it general and pass it in the MongoClient URI options; it is query-specific. Is there any other parameter that I can set to terminate all queries that take too long to execute?
Any setting in the mongo config could also help, but it should work for all types of operations: insert, find, update, remove. I have also seen the db.currentOp() option, but I want to know if it can be done in a better way, and if db.currentOp() is the only option, how can I use it to find and terminate a query with the Java driver?
Thank you
I have gone through all the documentation; there is no way we can implement it.
The thing is, maxTimeMS is implemented per query, not at the global URI level.

How long will a mongo internal cache sustain?

I would like to know how long Mongo's internal cache lasts. I have a scenario in which I have about one million records and I have to perform a search on them using the mongo-java driver.
The initial search takes a lot of time (nearly one minute), whereas consecutive runs of the same query take much less time (a few seconds) due to Mongo's internal caching mechanism.
But I do not know how long this cache lasts: is it until the system reboots, until the collection undergoes any write operation, or something like that?
Any help in understanding this is appreciated!
PS:
Regarding the fields the search is performed on, some are indexed and some are not.
The Mongo version used is 2.6.1.
It will depend on a lot of factors, but the most prominent are the amount of memory in the server and how active the server is, as MongoDB leaves much of the caching to the OS (by MMAP'ing files).
You need to take a long hard look at your log files for the initial query and try to figure out why it takes nearly a minute.
In most cases there is an internal cache-invalidation mechanism that drops the cached data for a query when a write operation occurs. That is the simplest way to describe the process, just from my own experience.
But, as mentioned earlier, there are many factors besides simple invalidation that can come into play.
MongoDB automatically uses all free memory on the machine as its cache. It would be better to use MongoDB 3.0+ because it ships with two storage engines, MMAP and WiredTiger.
The major difference between the two is that a write operation in MMAP locks the whole database, whereas locking is at the document level in WiredTiger.
If you are using MongoDB 2.6, you can check query performance and execution time with the explain() method; in 3.0+ you can use explain("executionStats") in the shell.
You need an index on the fields you query to get results faster. A single collection cannot have more than 64 indexes, and the more indexes a collection has, the greater the performance impact on write/update operations.
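A quick sketch of that check in the 3.0+ shell (the collection and field names are placeholders):
// Index the field you query on, then confirm the plan actually uses it.
db.records.createIndex({ userId: 1 })
db.records.find({ userId: 12345 }).explain("executionStats")
// In the output, an IXSCAN stage and a totalDocsExamined close to nReturned
// indicate the index is being used; a COLLSCAN means it is not.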

Prioritize specific long-running operation

I have a Mongo collection with a little under 2 million documents in it, and I have a query that I wish to run that will delete around 700,000 of them, based on a date field.
The remove query looks something like this:
db.collection.remove({'timestamp': { $lt: ISODate('XXXXX') }})
The exact date is not important in this case, the syntax is correct and I know it will work. However, I also know it's going to take forever (last time we did something similar it took a little under 2 hours).
There is another process inserting and updating records at the same time that I cannot stop. However, as long as those insertions/updates "eventually" get executed, I don't mind them being deferred.
My question is: Is there any way to set the priority of a specific query / operation so that it runs faster / before all the queries sent afterwards? In this case, I assume mongo has to do a lot of swapping data in and out of the database which is not helping performance.
I don't know whether the priority can be fine-tuned, so there might be a better answer.
A simple workaround might be what is suggested in the documentation:
Note: For large deletion operations it may be more effect [sic] to copy the documents that you want to save to a new collection and then use drop() on the original collection.
Another approach is to write a simple script that fetches e.g. 500 elements and then deletes them using $in. You can add some kind of sleep() to throttle the deletion process. This was recommended in the newsgroup.
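A rough shell sketch of that throttled batch delete (the cutoff date, batch size, and sleep interval are placeholders to adjust):
var cutoff = ISODate("2015-01-01T00:00:00Z");
var ids;
do {
  // Grab up to 500 _ids of documents older than the cutoff, then delete them.
  ids = db.collection.find({ timestamp: { $lt: cutoff } }, { _id: 1 })
          .limit(500).toArray().map(function (d) { return d._id; });
  if (ids.length > 0) {
    db.collection.remove({ _id: { $in: ids } });
    sleep(1000);   // throttle so concurrent inserts/updates are not starved
  }
} while (ids.length > 0);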
If you encounter this problem again in the future, you might want to:
Use a day-by-day collection so you can simply drop the entire collection once the data becomes old enough (this makes aggregation harder), or
use a TTL collection, where items time out automatically and don't need to be deleted in bulk.
If your application needs to delete data older than a certain amount of time, I suggest using TTL indexes. Example (from the MongoDB site):
db.log.events.ensureIndex( { "status": 1 }, { expireAfterSeconds: 3600 } )
This works like a capped collection, except data is deleted based on time. The biggest win for you is that it works in a background thread, so your inserts/updates will be mostly unhurt. I use this technique on a SaaS product in production, and it works like a charm.
This may not be your use case, but I hope that helps.