Getting opid and killOp using mongodb native client - mongodb

I need to kill specific mongodb operations.
But mongo commands (like aggregate or mapReduce), whether they're called from the native Node client or the shell, do not return their opids, so I can't match the entries in db.currentOp() with the specific query I want to kill.
I tried matching my queries against the query property of db.currentOp().inprog, but that is far from reliable: in many cases the value of this property is "$msg" : "query not recording (too large)", so it cannot be matched.
How can I get/assign an ID to my async MongoDB queries, so that I can use it to find and kill the query from a (different) connection?

Are you using MongoDB 2.6? With 2.6 you can set a server-side query timeout via $maxTimeMS:
http://docs.mongodb.org/manual/reference/operator/meta/maxTimeMS/
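For example (a minimal sketch; the collection name and filter are placeholders):
// mongo shell: the server aborts the query if it runs longer than 5 seconds
db.collection.find({status: 'active'}).maxTimeMS(5000)
// Node.js native driver: the same limit via the cursor method
collection.find({status: 'active'}).maxTimeMS(5000).toArray(function (err, docs) {
  // on timeout, err is an ExceededTimeLimit error from the server
});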

Related

Get collections with the highest query traffic

I've seen runtime statistics for a given MongoDB instance via the db.currentOp() method, or the server-wide db.stats() method. But is there a way to retrieve the collections hit by the most queries, i.e. the ones causing the most read traffic? Currently running on MongoDB v4.4.

How to disable MongoDB aggregation timeout

I want to run an aggregation on my large data set (about 361K documents) and insert the results into another collection.
I'm getting a cursor timeout error.
I tried to increase the max time, but it has a maximum and it's not enough for my data set. I found https://docs.mongodb.com/manual/reference/method/cursor.noCursorTimeout/ but it seems noCursorTimeout only applies to find, not aggregation.
Please tell me how I can disable the cursor timeout, or suggest another solution.
I am no MongoDB expert, but I'll share what I know.
MongoDB aggregation cursors don't have a mechanism to adjust the batch size or set cursor timeouts.
Therefore there is no direct way to alter this; the timeout of an aggregation query depends solely on the cursorTimeoutMillis parameter of the mongod or mongos instance. Its default value is 10 minutes.
Your only option is to change this value with the command below.
use admin
db.runCommand({setParameter:1, cursorTimeoutMillis: 1800000})
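You can check that the new value took effect with getParameter:
db.adminCommand({getParameter: 1, cursorTimeoutMillis: 1})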
However, I strongly advise against raising this parameter, because the timeout is a safety mechanism built into MongoDB: it automatically closes cursors that have been idle for more than 10 minutes, so that there is less load on the MongoDB server. If you change this parameter (say, to 30 minutes), MongoDB will keep idle cursors around in the background for those 30 minutes, which will not only make all new queries slower to execute, but also increase load and memory usage on the MongoDB side.
You have a couple of workarounds: reduce the number of documents if you're working in MongoDB Compass, or copy and run the commands in the mongo shell (I've had success with this method so far).

Querying MongoDB with RestHeart returns different results than with a MongoDB client

I have configured RestHeart to get data from MongoDB. Most requests work well and return the same results as when I query MongoDB with a client (RoboMongo, MongoDB Compass...). However, some requests, like the following one involving a filter on a date stored as a string, take so long that Nginx closes the connection after 60 s (the same query takes 0.163 s with a client).
## Request
https://IP/DB/Collection/?filter={'DATE_A':'2017-08-24'}
## Query
db.getCollection('collection').find({'DATE_A':'2017-08-24'})
The collection has an index on the DATE_A attribute, and it is used when the query is executed from a client.
The RestHeart configuration is the default one from the documentation, except for the connection to MongoDB: in this case I use a cluster with 3 instances (1 primary and 2 secondaries). Furthermore, the RestHeart log file shows all the requests that are executed except these ones, so I can't see what happens with them.
Any suggestions for figuring out what the issue with these queries is, and where? Thanks in advance.
RestHeart also sorts results by _id descending by default.
Try adding sort={'date':-1} to the request, or build a compound index.
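For example, a compound index covering both the filter field and RestHeart's default _id sort might look like this (a sketch reusing the field name from the question):
db.getCollection('collection').createIndex({DATE_A: 1, _id: -1})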

ReactiveMongo - Aggregation Framework - first batch behavior not clear

I have a question about cursor batches.
In the MongoDB documentation it is stated:
cursor.batchSize(size)
Specifies the number of documents to return in each batch of the response from the MongoDB instance.
In most cases, modifying the batch size will not affect the user or the application, as the mongo shell and most drivers return results as if MongoDB returned a single batch.
and in the ReactiveMongo documentation we have:
About the type AggregationResult the property documents has been renamed to firstBatch, to clearly indicate it returns the first batch from result (which is frequently the single one). [my emphasis].
My questions are:
In what situations is the first batch returned the only one, and in what situations is it not?
How can I control that?
Neither the ReactiveMongo nor the MongoDB documentation seems to be clear on this topic.
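For reference, this is the shell behavior the MongoDB quote describes (a minimal sketch; the collection name is a placeholder):
// the server hands documents back in batches of 100, and the shell issues
// getMore requests transparently as iteration crosses a batch boundary
db.myCollection.find().batchSize(100).forEach(printjson)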

Mongoid: why fetching a count is slower than fetching documents

I noticed some strange behavior. It might be Mongoid or MongoDB, I am not sure, but counting documents is slower than fetching them. Here are the queries I fired:
Institution.all.any_of(:portaled_at.ne => nil).any_of(portaled: true).order_by(:portaled_at.desc).count
# mongodb query and timing as per mongoid logs,
# times are consistent over multiple runs
# MONGODB (236ms) db['$cmd'].find({"count"=>"institutions", "query"=>{"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}, "fields"=>nil}).limit(-1)
# MONGODB (245ms) db['$cmd'].find({"count"=>"institutions", "query"=>{"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}, "fields"=>nil}).limit(-1)
Institution.all.any_of(:portaled_at.ne => nil).any_of(portaled: true).order_by(:portaled_at.desc).to_a
# mongodb query and timing as per mongoid logs
# times are not so consistent over multiple runs,
# but consistently much lower than count query
# MONGODB (9ms) db['institutions'].find({"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}).sort([[:portaled_at, :desc]])
# MONGODB (18ms) db['institutions'].find({"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}).sort([[:portaled_at, :desc]])
I believe MongoDB does not use indexes for $and and $or queries, but in case it matters, I have a sparse index on portaled_at in descending order. Out of around 200,000 documents, only around 50-60 have portaled_at set.
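For reference, that index was created roughly like this (a sketch; ensureIndex was the shell helper for this MongoDB version, with the collection name taken from the logs above):
db.institutions.ensureIndex({portaled_at: -1}, {sparse: true})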
rails 3.2.12
mongoid 2.6.0
mongodb 2.2.3
This is against my common sense and if anybody can explain what is going on I would really appreciate it.
While the two run through different subsystems in MongoDB (one uses runCommand and the other the standard query engine), the specific issue in this case is very likely a known issue in the current version of MongoDB.
The quick summary is that counting without fetching is extremely slow, as MongoDB does a lot of extra work that often isn't necessary. It has been fixed in the development branch, so it should be in 2.4 when it is released.
For some reason, Mongo does not count records using only indexes by default. However, if you construct the query correctly, Mongo will count from the index. The trick is to fetch only fields that are in the index, and you have to specify a query.
In Mongo Shell:
db.MyCollection.find({"_id":{$ne:''}},{"_id":1}).count()
You can check with the explain method:
db.MyCollection.find({"_id":{$ne:''}},{"_id":1}).explain()
Which will include "indexOnly" : true in the output.
And similarly the command can be executed via the Moped driver directly like so:
Mongoid::Sessions.default.command(:count => "MyCollection", :query=>{"_id"=>{"$ne"=>""}}, :fields => {:_id=>1})
Which, in my benchmarks (on my live data, YMMV), is about 100x faster than simply calling MyMongoidDocumentClass.count.
Unfortunately, there doesn't seem to be a way to do this quickly through the Mongoid gem.