Why did serverStatus have a bad effect on mongod write operations? - mongodb

I have 1 mongos, 3 mongod instances and 3 config servers. When I write documents, the insert speed on one of the mongod instances is sometimes very slow, and "serverstatus was very slow" appears in that mongod's log file. Why?
The version is 2.0.4.

That message actually reflects the fact that your server was slow, not that serverStatus is causing the problem. If the serverStatus command (which is run periodically by MMS agents for example) is slow, it will log that warning - it is a symptom rather than a cause.
It is quite lightweight as a command, so if it is returning slowly enough to warn you about it then the host is likely very busy at that time.
The usual places to look for load apply (high inserts/updates, table scans, poorly indexed queries, disk issues, RAM/CPU contention, page faults etc.).
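To see from the client side whether serverStatus really is returning slowly, you can time the call yourself. Below is a minimal sketch in Python. The helper name time_call and the 1-second threshold are illustrative assumptions, not values mongod uses, and the pymongo call mentioned in the comment assumes pymongo is installed and a server is reachable.

```python
import time


def time_call(fn, slow_threshold_s=1.0):
    """Run fn() once and return (result, elapsed_seconds, is_slow).

    slow_threshold_s is an arbitrary illustrative cutoff, not the value
    mongod uses for its own warning.
    """
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed >= slow_threshold_s


def fake_server_status():
    """Stand-in for a real call such as
    pymongo.MongoClient(...).admin.command("serverStatus")."""
    time.sleep(0.05)  # simulate a command that takes about 50 ms
    return {"ok": 1}


if __name__ == "__main__":
    result, elapsed, is_slow = time_call(fake_server_status)
    print(f"serverStatus took {elapsed:.3f}s, slow={is_slow}")
```

If the timing is high only while inserts are running, that points at host load rather than at the command itself.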

Related

configuring mongodb - WTCheck.tThread writes on disk into WiredTiger.wt

I've installed MongoDB 4.4.3 on my Raspberry Pi and noticed some I/O operations even though no client is connected and I am not running any queries.
First I noticed FTDC writing every ~8 seconds, so I set diagnosticDataCollectionEnabled: false.
But writes to WiredTiger.wt (.turtle and index) still happen every minute.
What is it doing? Is it some kind of journal? Can I disable it?
It is my personal dev web server and there won't be much reading or writing from my side, so I don't see the point of MongoDB doing unnecessary writes that will just slowly wear out my SSD.
(By the way, I'm a newbie and have never really worked with MongoDB.)
storage.journal.commitIntervalMs (default: 100 ms; you can increase it up to 500 ms)

MongoDB secondary setup

For the past week I have been trying to set up a replica set for my single-node MongoDB (version 3.4.2), but I keep running into issues. My primary node currently holds around 650 GB of data and grows by about 90 GB every day. The first time, I added a new secondary node with an empty data directory; after almost a day it failed because it had fallen too far behind the oplog. Next I tried copying the data manually. When I restarted the secondary after the copy, it reported that it could not sync from the primary (there was no connection problem; I was able to ping it). I retried the manual copy procedure, but this time it failed with the error below. Since the WiredTiger error points at a specific collection file, I copied that file again and retried, but it failed with the same error. Can someone please help me set up the secondary? It is getting harder every day as the data grows, and I cannot keep the primary down for long (during the manual copy I stop all writes on the primary).
2017-03-02T16:08:16.315+0000 E STORAGE [initandlisten] WiredTiger error (-31802) [1488470896:315136][17051:0x7ffdbd3d7dc0], file:mcse.45trace/collection-16-7756455024301269277.wt, WT_SESSION.open_cursor: /app/data/mcse.45trace/collection-16-7756455024301269277.wt: handle-read: pread: failed to read 4096 bytes at offset 86474874880: WT_ERROR: non-specific WiredTiger error
2017-03-02T16:08:16.315+0000 I - [initandlisten] Invariant failure: ret resulted in status UnknownError: -31802: WT_ERROR: non-specific WiredTiger error at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 95
If you can solve that first problem with the replication lag, then you will probably get everything running OK. Take a look at the Troubleshooting Replica Sets guide; it has some useful suggestions:
Possible causes of replication lag include:
Network Latency
Check the network routes between the members of your set to ensure that there is no packet loss or network routing issue.
Use tools including ping to test latency between set members and traceroute to expose the routing of packets between network endpoints.
Disk Throughput
If the file system and disk device on the secondary is unable to flush data to disk as quickly as the primary, then the secondary will have difficulty keeping state. Disk-related issues are incredibly prevalent on multi-tenant systems, including virtualized instances, and can be transient if the system accesses disk devices over an IP network (as is the case with Amazon’s EBS system.)
Use system-level tools to assess disk status, including iostat or vmstat.
Concurrency
In some cases, long-running operations on the primary can block replication on secondaries. For best results, configure write concern to require confirmation of replication to secondaries. This prevents write operations from returning if replication cannot keep up with the write load.
Use the database profiler to see if there are slow queries or long-running operations that correspond to the incidences of lag.
Appropriate Write Concern
If you are performing a large data ingestion or bulk load operation that requires a large number of writes to the primary, particularly with unacknowledged write concern, the secondaries will not be able to read the oplog fast enough to keep up with changes.
To prevent this, request an acknowledged write concern after every 100, 1,000, or some other number of writes to give the secondaries an opportunity to catch up with the primary.
For more information see:
• Write Concern
• Replica Set Write Concern
• Oplog Size
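The "acknowledge every N writes" advice above can be sketched roughly as follows in Python. The function names, the batch size and the ack_every interval are illustrative assumptions; insert_batch is a hypothetical callback you would supply yourself. With pymongo it could, for example, pick between collection.with_options(write_concern=WriteConcern(w=0)) and WriteConcern(w="majority") before calling insert_many.

```python
from itertools import islice


def batches(docs, batch_size):
    """Yield successive lists of at most batch_size documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def bulk_load(docs, insert_batch, batch_size=1000, ack_every=10):
    """Insert docs in batches; every ack_every-th batch asks for acknowledgement.

    insert_batch(batch, acknowledged=...) is a caller-supplied (hypothetical)
    function that performs the actual write with the matching write concern.
    Returns the number of documents handed to insert_batch.
    """
    sent = 0
    for i, batch in enumerate(batches(docs, batch_size), start=1):
        # Waiting for acknowledgement on a regular interval gives the
        # secondaries a chance to catch up during a large load.
        insert_batch(batch, acknowledged=(i % ack_every == 0))
        sent += len(batch)
    return sent
```

The periodic acknowledged write acts as a throttle: the loader cannot run arbitrarily far ahead of replication.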
WiredTiger error (-31802) file:xxx.wt
This could be related to corrupted .wt files (e.g. WiredTiger.wt/WiredTiger.turtle) as per SERVER-31076 bug report.
Try running:
mongod --repair --dbpath /path/to/data/db
Also make sure all files under data/db have the right read and write permissions.

MongoDB can't access documents beyond a specific skip value

I have a MongoDB instance in the cloud on an AWS EC2 t2.micro (30 GB storage, 1 GB RAM) running in Docker. In that database I have a single collection that stores 411 thousand documents, and this takes ~700 MB of disk space.
On my local computer, if I run this in mongo shell:
db.my_collection.find().skip(200000).limit(1)
then I get the correct results, but if I run this
db.my_collection.find().skip(220000).limit(1)
then MongoDB shuts down. Why? What should I do, to access these data?
It appears that your system doesn't have enough RAM to meet MongoDB's demand. When a Linux system is critically low on memory, the kernel starts killing processes to keep the system itself from crashing.
I believe this is what is happening in your case too. MongoDB is not even getting a chance to write a log entry. I'd recommend increasing the RAM or, if that is not feasible, adding more swap space. This will prevent the crash, and MongoDB will keep working, though very slowly.
Please see these excellent resources on Linux and its behavior:
https://unix.stackexchange.com/questions/136291/will-linux-start-killing-my-processes-without-asking-me-if-memory-gets-short
https://serverfault.com/questions/480266/how-to-know-if-the-server-runs-out-of-ram-before-crashing-down
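To confirm that the kernel's OOM killer stopped mongod, you can scan the kernel log (for example the output of dmesg) for its messages. This is a rough sketch in Python; the function name is illustrative, and the exact message wording varies between kernel versions, so the regular expression is an assumption that covers the two common "Kill process" / "Killed process" forms only.

```python
import re

# Matches e.g. "Out of memory: Kill process 1234 (mongod) ..." and
# "Out of memory: Killed process 1234 (mongod) ...".
# Wording differs across kernel versions; adjust if yours logs differently.
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log_lines):
    """Return a list of (pid, process_name) for OOM-killer messages found."""
    kills = []
    for line in kernel_log_lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

Feeding it the lines of the kernel log and finding a mongod entry at the time of the shutdown would support the explanation above.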

Memory issues with new version of Mongodb

I've been using MongoDB on my Windows Server 2012 machine for more than two years. Since the last update, some weird issues have started to happen, which in the end lead to all of the RAM being used.
The service I've configured for MongoDB is as follows:
logpath=d:\data\log\mongod.log
dbpath=d:\data\db
storageEngine=wiredTiger
rest=true
#override port
port=27017
#configsvr = true
shardsvr = true
And in order to limit the cache memory usage, I've added the following line:
wiredTigerCacheSizeGB=10
And this is where the weird stuff started happening. When I check Task Manager, it says that MongoDB is now limited to 10 GB, as I defined in the service, but the machine is actually using a lot more than 10 GB.
In the first image you can see the memory consumption sorted by RAM consumption
While in fact the machine I'm using has 28GB in total
This crazy consumption leads to failures in the scripts I'm running, even the most basic ones that only run simple queries like 'count' or 'distinct'. I believe this is a direct result of the memory consumption.
When I checked the log files, I saw that there are many open connections, and even when a session ends the log still indicates the same number of open connections:
So in the end I have two major questions:
1. Is there a way of solving this issue without downgrading the MongoDB version?
2. Does the config file look right? Is everything in it necessary?
Memory usage in WiredTiger is a two-level cache:
First is the WiredTiger cache as controlled by --wiredTigerCacheSizeGB
Second is the Operating System filesystem cache. MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes
See also WiredTiger memory usage
For OS filesystem cache, MongoDB doesn't manage the memory it uses directly - it lets the OS manage it. Windows will try to use every last scrap of physical memory if it can - but lots of it should and will be thrown out if other processes request memory.
An alternative is to run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system.
Having said the above:
You are also running another database on the server, i.e. mysqld. MongoDB, like some other databases, will perform better on a dedicated server to reduce memory contention.
Task Manager shows mongod is using 10GB, although the machine is using up to ~28GB. This may or may not be mongod as you have other processes as well.
Useful resources:
FAQ: Memory diagnostics for WiredTiger
FAQ: MongoDB Cache Handling
MongoDB Production Notes
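As a back-of-the-envelope illustration of the two-level picture above, here is a small Python sketch. The function names are illustrative. The default-cache formula is the one documented for MongoDB 3.4 and later (the larger of 50% of (RAM - 1 GB) and 256 MB); older releases use a different default, so treat it as an assumption if you run an earlier version.

```python
def default_wiredtiger_cache_gb(total_ram_gb):
    """Default WiredTiger internal cache size as documented for MongoDB 3.4+:
    the larger of 50% of (RAM - 1 GB) and 0.25 GB."""
    return max(0.5 * (total_ram_gb - 1), 0.25)


def memory_outside_cache_gb(total_ram_gb, cache_gb):
    """RAM left for the OS filesystem cache and every other process."""
    return total_ram_gb - cache_gb


if __name__ == "__main__":
    total = 28  # GB, from the question
    print(default_wiredtiger_cache_gb(total))   # default if no limit were set
    print(memory_outside_cache_gb(total, 10))   # with wiredTigerCacheSizeGB=10
```

With the 10 GB limit on a 28 GB machine, 18 GB remains that the OS is free to fill with filesystem cache, which is why total usage can sit far above 10 GB without mongod's own cache exceeding its limit.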

mongodb flushing mmap takes around 20 secs with no updates being required

Hi. One of our customers is running MongoDB v2.2.3 on a 64-bit Windows Server 2008 R2 Enterprise machine.
We're currently seeing mmap flush times of over 20 seconds every minute.
What confuses me is that it isn't doing any writes to the disk (disk write bytes is next to 0).
Our program that accesses the data has been temporarily turned off,
so all that is connected is a mongo shell.
mongostat and mongotop aren't showing anything.
The database has 130 million records. There are 356 files for mmap.
Any suggestions on what could be causing this?
Thanks
If your working set is significantly larger than memory, and MongoDB is constantly going to disk for reads (and not just the normal spikes when syncing writes to disk), then you really should be sharding to spread the data across multiple machines/instances.
Given the behaviour you have described and that you have a large number of files for mmap, I suspect the underlying performance issue is SERVER-12401 in the MongoDB Jira issue tracker:
On Windows, Memory Mapped File flushes are synchronous operations. When the OS Virtual Memory Manager is asked to flush a memory mapped file, it makes a synchronous write request to the file cache manager in the OS. This causes large I/O stalls on Windows systems with high Disk IO latency, while on Linux the same writes are asynchronous.
There are a few possible ways to improve the flush performance on Windows, including code changes in both the MongoDB server and the Windows O/S. There is some ongoing work to address these issues, now that the synchronous flushing behaviour on Windows has been confirmed.
If you are using higher latency local storage (for example, spinning disks) you may be able to mitigate the issue by upgrading to SSD or better spec'd drives.
I would suggest upvoting/watching SERVER-12401 and the related Jira issues for updates.
It would also be worth upgrading from MongoDB 2.2 to a newer version as 2.2 is now past end-of-life for updates. There have been two major production release branches since then, including significant improvements in general performance/features as well as Windows support.
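To keep an eye on flush times while you wait for a fix, you can read the backgroundFlushing section that serverStatus reports for the memory-mapped storage engine. The Python sketch below works on a plain dict shaped like that section; the function name and the 10-second "slow" cutoff are illustrative assumptions, and the 60-second interval is the default flush interval (syncdelay).

```python
def flush_report(background_flushing, interval_ms=60_000, slow_ms=10_000):
    """Summarise a serverStatus 'backgroundFlushing' section.

    interval_ms defaults to the 60 s flush interval; slow_ms is an
    arbitrary illustrative cutoff for flagging a flush as slow.
    """
    last_ms = background_flushing["last_ms"]
    return {
        "last_ms": last_ms,
        "average_ms": background_flushing["average_ms"],
        # Fraction of each flush interval spent flushing.
        "share_of_interval": last_ms / interval_ms,
        "slow": last_ms >= slow_ms,
    }


if __name__ == "__main__":
    # Hypothetical numbers resembling the 20-second flushes in the question.
    sample = {"flushes": 10, "total_ms": 200000, "average_ms": 20000.0, "last_ms": 20000}
    print(flush_report(sample))
```

A 20-second flush every 60 seconds means roughly a third of the time is spent flushing, which is a useful number to track before and after a storage or version upgrade.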