Is db.stats() a blocking call for MongoDB?

While researching how to check the size of a MongoDB database, I found this comment:
Be warned that dbstats blocks your database while it runs, so it's not suitable in production. https://jira.mongodb.org/browse/SERVER-5714
Looking at the linked bug report (which is still open), it quotes the Mongo docs as saying:
Command takes some time to run, typically a few seconds unless the .ns file is very large (via use of --nssize). While running, other operations may be blocked.
However, when I check the current Mongo docs, I don't find that text. Instead, they say:
The time required to run the command depends on the total size of the database. Because the command must touch all data files, the command may take several seconds to run.
For MongoDB instances using the WiredTiger storage engine, after an unclean shutdown, statistics on size and count may be off by up to 1000 documents as reported by collStats, dbStats, and count. To restore the correct statistics for the collection, run validate on the collection.
Does this mean the WiredTiger storage engine changed this to a non-blocking call by keeping ongoing stats?

A bit late to the game, but I found this question while looking for the answer, and the answer is: yes. Until 3.6.12 / 4.0.5, dbStats acquired a "shared" lock ("R"), which blocked all write requests during execution. Since those versions it takes an "intent shared" lock ("r"), which doesn't block write requests. Read requests were not impacted either way.
Source: https://jira.mongodb.org/browse/SERVER-36437
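If you want to verify this on your own deployment, here is a minimal shell sketch, assuming a second shell session; the currentOp filter shape is an assumption and may need adjusting per server version:

// Session 1: run db.stats() on a large database.
// Session 2, while it runs: look up the operation and inspect its "locks" field.
db.currentOp({ "command.dbStats": 1 })
// Before 3.6.12 / 4.0.5 the entry shows a database-level "R" (shared) lock;
// on later versions it shows "r" (intent shared), which does not block writes.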

Related

Calculate WiredTiger cache miss from db.serverStatus output

I've been reading the article at https://medium.com/dbkoda/the-notorious-database-cache-hit-ratio-c7d432381229, which seems to calculate the WiredTiger cache miss rate from data in the db.serverStatus() output.
However, after running the command (and checking that the Java API doesn't have such a method, so I don't really know how the author is using the API), I can't find the properties he retrieves from the resulting Document, which are basically 'pages requested from the cache' and 'pages read into cache'.
The only related metrics I can see are a couple included within extra_fields, namely page_faults and page_reclaims; if I understand correctly, those are cache misses and cache hits respectively, right?
I'm trying to obtain cache performance (if it's hitting the cache or not after performing certain aggregations) when using certain queries.
Is there any way to obtain this metric straight away via MongoDB commands?
The code given in the article is intended to be run in the mongo shell.
The driver equivalent is the serverStatus command: https://docs.mongodb.com/manual/reference/command/serverStatus/.
You would execute it using your driver's facility for running admin, arbitrary, or database commands. For the Ruby driver, this is https://docs.mongodb.com/ruby-driver/current/tutorials/ruby-driver-database-tasks/#arbitrary-comands.
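As a concrete example, a minimal shell sketch of the calculation the article describes; the two statistic names below come from the WiredTiger section of serverStatus and may differ across server versions:

var cache = db.serverStatus().wiredTiger.cache;
var requested = cache["pages requested from the cache"];
var readIn = cache["pages read into cache"]; // pages that had to come from disk
print("cache miss ratio: " + (readIn / requested));

A request that has to read a page into the cache is a miss, so readIn / requested approximates the miss rate; subtract it from 1 for the hit ratio.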

Is it possible to see the incoming queries in mongodb to debug/trace issues?

I have mongo running on my MacBook (macOS).
Is it possible to run some kind of 'monitor' that will display any incoming requests to my mongodb?
I need to trace if I have the correct query formatting from my application.
You will find the following tools (or utilities) useful for monitoring as well as diagnostic purposes. All the tools except mtools are packaged with the MongoDB server (though sometimes they are installed separately).
1. Database Profiler
The profiler stores every CRUD operation coming into the database; it is off by default. Having it on is quite expensive: it turns every read into a read+insert, and every write into a write+insert. CAUTION: Keeping it on can quickly overwhelm the server with incoming operations, saturating the I/O.
But it is a very useful tool when used for a short time to find out what is going on with database operations. It is recommended for use in development environments.
The current profiler setting can be read with db.getProfilingLevel(). To activate the profiler, use the db.setProfilingLevel(level) command. Verify what the profiler captures in the db.system.profile collection; you can query it like any other collection using the find or aggregate methods. The op field of a db.system.profile document specifies the type of database operation; e.g., for queries it is "query".
The profiler has three levels:
0 captures no information (the profiler is turned off; this is the default).
1 captures every operation that takes longer than 100 ms.
2 captures every operation; this can be used to find the actual load that is coming in.
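For example, a short shell session combining these commands (run against the database you want to profile):

db.setProfilingLevel(2) // capture every operation
db.getProfilingLevel() // verify: returns 2
// inspect the five most recent captured queries
db.system.profile.find({ op: "query" }).sort({ ts: -1 }).limit(5).pretty()
db.setProfilingLevel(0) // turn the profiler back off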
2. mongoreplay
mongoreplay is a traffic capture and replay tool for MongoDB that you can use to inspect and record commands sent to a MongoDB instance, and then replay those commands back onto another host at a later time. NOTE: Available for Linux and macOS.
3. mongostat
The mongostat command-line utility provides a quick overview of the status of a currently running mongod instance.
You can view the incoming operations in real time. The statistics are displayed every second by default. There are various options to customize the output, the time interval, etc.
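A minimal invocation, assuming a mongod on the default local port (the trailing argument is the refresh interval in seconds):
mongostat --host localhost:27017 5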
4. mtools
mtools is a collection of helper scripts to parse, filter, and visualize (through graphs) MongoDB log files.
You will find the mlogfilter script useful; it reduces the amount of information from MongoDB log files using various command options. For example, mlogfilter mongod.log --operation query filters the log by query operations only.

mongo write lock behavior

I have a question about Mongo locks. Basically, I have to perform some write operations on a collection (insert/delete/update). When I read this link, Locking in Mongodb, it says: "Locks are 'writer greedy,' and when a write lock exists, a single write operation holds the lock exclusively, and no other read or write operations may share the lock."
My question is: is the lock per memory block, or is there a single lock on the entire db? What I was thinking is to concurrently run 2 scripts scanning 2 memory blocks of mongodb (planning to scan 2 million documents in one query) and perform write operations side by side, thereby increasing performance and saving time.
I searched the net for this but didn't find anything satisfactory.
Any help will be deeply appreciated
The write lock has nothing to do with memory. MongoDB is not an in-memory database; the OS merely caches the working set of the mongod process in RAM. MongoDB has no memory hooks of that kind in its program.
The write lock is also taken at the database level, so your plan is not feasible.
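If you want to see which locks a running server actually takes, a minimal shell sketch (the exact layout of the output varies by server version):

// serverStatus().locks reports acquisition counts per resource and mode:
// "R"/"W" are shared/exclusive, "r"/"w" are intent shared/intent exclusive.
printjson(db.serverStatus().locks)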

difference between db.runCommand({getlasterror:1,fsync:true}) and db.runCommand({getlasterror:1}) in MongoDB?

I understand that getlasterror guarantees that the write has been made to a file.
This means that even if the computer's power goes off, the previous write is still okay.
But what is the use of fsync:true?
Essentially, getLastError checks for an error in the last database operation on the current connection. If you run the command with the fsync option, it will also flush the data to the data files (by default, MongoDB does this every 60 seconds).
You can find more details here and here.
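For illustration, the two forms side by side in the mongo shell (getLastError is a legacy command; it is shown here only to match the question):

db.runCommand({ getlasterror: 1 }) // report the outcome of the last write on this connection
db.runCommand({ getlasterror: 1, fsync: true }) // additionally flush the data files to disk before returning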

How to track how long some Mongo queries take

I have a few Mongo queries in the JS format, such as:
db.hello.update(params,data);
How do I run them in such a way that I can see exactly how long they've taken to run later?
There are a couple of options:
Do your updates with safe=true, which will cause the update call to block until mongod has written the data (the exact syntax for this depends on the driver you're using). You can add timing code around your updates in your application code and log the results as appropriate, as in the sketch after this list.
Enable verbose (or more-verbose) logging, and use the log files to determine the time spent during your updates. See the mongo docs on logging for more information.
Enable the profiler, which stores information about queries and updates in a capped collection, db.system.profile, including the time spent servicing the query or update. Note that enabling the profiler affects performance, though not severely. See the mongo docs on profiling for more information.
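For the first option, a minimal shell sketch that brackets the update with timestamps; params and data stand in for your actual query and update documents:

var start = Date.now();
db.hello.update(params, data); // the shell blocks until it receives the write result
print("update took " + (Date.now() - start) + " ms");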