How should I set up a MongoDB cluster to handle 20K+ simultaneous connections? - mongodb

My application uses MongoDB as its database. We are expecting 20K+ simultaneous connections to the MongoDB cluster. How should I configure the servers if I want to run MongoDB on 20 servers and shard the cluster 20 ways?
Here is what I've done so far:
On each of my 20 servers, I have one mongos (router) running on port 30000, and on 3 servers I run mongo config servers on port 20000. Then on each server, I run 3 instances of mongod. One of them is the primary. In other words, I have 20 mongos, 3 mongo-config, 60 mongod servers (20 primary and 40 replica).
Then in my application (which also runs on each server and connects to the mongos at localhost:30000), I set the mongoOptions such that connectionsPerHost = 1000.
10-15 minutes after all services started, some of the servers were no longer ssh-able. These servers were still ping-able. I suspect there were too many connections, and that caused the servers to die.
My own analysis is as follows:
1K connections per connection pool means that each shard's primary will have 1K * 20 (shards) = 20K simultaneous connections open. A few servers will probably have more than one primary running on them, which will double or triple the number of connections, up to 60K. Somehow mongod cannot handle this many connections, although I changed my system settings to allow each process to open far more files.
Here is what 'ulimit -a' shows:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64000000
max memory size (kbytes, -m) unlimited
open files (-n) 320000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
BTW, I didn't specify --maxConns when starting mongod/mongos, and I didn't change MONGO.POOLSIZE either.
A side question: if my reasoning is correct, the full simultaneous-connection requirement is imposed on every primary, which doesn't seem right to me; it almost means a MongoDB cluster is not scalable at all. Could someone please tell me I'm wrong?
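The worst-case arithmetic in the question can be sketched as below. It takes the question's own assumption at face value (every mongos opens up to connectionsPerHost connections to each shard primary); it is not a measured figure, and the function and parameter names are hypothetical.

```python
def worst_case_connections(mongos_count, pool_size, primaries_on_host=1):
    """Upper bound on connections landing on one host, per the question's reasoning."""
    per_primary = mongos_count * pool_size
    return per_primary * primaries_on_host


# 20 mongos routers, connectionsPerHost = 1000
print(worst_case_connections(20, 1000))     # one primary on the host
print(worst_case_connections(20, 1000, 3))  # three primaries on the same host
```

With the numbers from the question this gives 20000 and 60000, matching the 20K and 60K estimates above.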

About your cluster architecture:
Running several instances of mongod on the same server is usually not a good idea. Do you have any particular reason to do this? The primary of each shard will put heavy pressure on its server, and replication adds I/O pressure as well, so mixing them won't be good for performance. IMO, you should rather have 6 shards (1 master - 2 secondaries) and give each instance its own server. (Config and arbiter instances do not consume many resources, so it's OK to leave them on the same servers.)

Sometimes the limits don't apply to the process itself. As a test go onto one of the servers and get the pid for the mongo service you want to check on by doing
ps axu | grep mongodb
and then do
cat /proc/{pid}/limits
That will tell you if the limits have taken effect. If the limit isn't in effect, then you need to specify the limit in the startup file, stop and start the mongo service, and test again.
A sure-fire way to know if this is happening is to tail -f the mongo log on a dying server and watch for those "too many open files" messages.
We set our limit to 20000 per server and do the same on all mongod and mongos instances and this seems to work.
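The /proc check described above can also be scripted. This is a minimal sketch, assuming a Linux host where /proc/<pid>/limits is readable; the helper names are hypothetical and the sample line uses illustrative values.

```python
from pathlib import Path


def parse_open_files_limit(limits_text):
    """Return (soft, hard) for 'Max open files' from /proc/<pid>/limits text."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns: soft limit, hard limit, units
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    raise ValueError("no 'Max open files' line found")


def open_files_limit(pid):
    """Read the limits the kernel is actually enforcing for a running process."""
    return parse_open_files_limit(Path(f"/proc/{pid}/limits").read_text())


# Sample line in the layout Linux uses (values are illustrative only).
sample = "Max open files            1024                 4096                 files\n"
print(parse_open_files_limit(sample))
```

The values are returned as strings because a limit may also be reported as "unlimited". Calling open_files_limit with the pid of a running mongod or mongos shows whether the raised limit reached that process.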

We're running a 4-shard replica set cluster on 4 machines. We have 2 shard primaries on 2 hosts and 2 shard replicas on the other 2 boxes, with arbiters and config servers spread out.
We're getting messages:
./checkMongo.bash: fork: retry: Resource temporarily unavailable
./checkMongo.bash: fork: retry: Resource temporarily unavailable
./checkMongo.bash: fork: retry: Resource temporarily unavailable
Write failed: Broken pipe
Checking ulimit -a:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 773713
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Okay, so we're possibly hitting a process limit because of the fork message. Here's how to check that:
$ ps axo pid,ppid,rss,vsz,nlwp,cmd | egrep mongo
27442 1 36572 59735772 275 /path/mongod --shardsvr --replSet shard-00 --dbpath /path/rs-00-p --port 30000 --logpath /path/rs-00-p.log --fork
27534 1 4100020 59587548 295 /path/mongod --shardsvr --replSet shard-02 --dbpath /path/rs-02-p --port 30200 --logpath /path/rs-02-p.log --fork
27769 1 57948 13242560 401 /path/mongod --configsvr --dbpath /path/configServer_1 --port 35000 --logpath /path/configServer_1.log --fork
So, you can see the mongods have 275, 295, and 401 subprocesses/threads each, a combined 971 against a limit of 1024. Though I'm not hitting the limit now, I probably was earlier. So, the solution: raise the max user processes limit for the user we're running under from 1024 to 2048 (or even unlimited). You can't change it via
ulimit -u unlimited
unless you sudo first or something; I don't have privs to do that.
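To make the comparison against the limit explicit, the thread counts can be summed from the ps output. This is a rough sketch that assumes all three mongod processes run under the same user; on Linux the max user processes limit counts threads as well as processes. The function name is hypothetical and the command lines are shortened.

```python
def total_threads(ps_output):
    """Sum the nlwp column from `ps axo pid,ppid,rss,vsz,nlwp,cmd` output."""
    total = 0
    for line in ps_output.strip().splitlines():
        fields = line.split()
        total += int(fields[4])  # nlwp is the fifth column
    return total


ps_output = """
27442 1 36572 59735772 275 /path/mongod --shardsvr --replSet shard-00
27534 1 4100020 59587548 295 /path/mongod --shardsvr --replSet shard-02
27769 1 57948 13242560 401 /path/mongod --configsvr
"""

limit = 1024  # max user processes (-u) from the ulimit output above
used = total_threads(ps_output)
print(used, limit - used)  # 971 53
```

That leaves only 53 slots for everything else the user runs, which is consistent with the fork failures.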

Related

how to increase the value of ulimit -n in linode ubuntu server?

I have created a MongoDB database server on a Linode dedicated host, and while tuning the database server for optimal performance I could not increase the value of ulimit -n.
I have asked ChatGPT and searched a lot of forums and Google for help, but none of the suggestions worked for me.
I also want to know what else I should do to optimize the performance of my servers (one main server and one replica set are configured, each with a 2-core CPU, 20 GB of storage and 24 GB of RAM).
I have tried the following configuration:
file name: /etc/security/limits.conf
* soft nofile 64000
* hard nofile 64000
file name:/etc/sysctl.conf
fs.file-max = 64000
file name: /etc/systemd/system.conf
DefaultLimitNOFILE=64000
and restarted my server.
I still got the following output:
root#host:# sysctl fs.file-max
fs.file-max = 64000
root#host:# cat /etc/security/limits.conf
* soft nofile 64000
* hard nofile 64000
root#host:# ulimit -n
1024
The only command that works is
ulimit -n 64000, but once I log out and log in again the limit resets to 1024.
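One way to see both the soft and the hard limit that a process actually inherited is Python's standard resource module. This is a small sketch, assuming it is run from the same login session or service context you are trying to tune; the show helper is hypothetical.

```python
import resource

# Limits inherited by this process (the soft limit is what `ulimit -n` reports).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)


def show(value):
    """Render a limit value the way ulimit does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


print(f"soft nofile = {show(soft)}, hard nofile = {show(hard)}")
```

Comparing this output between an interactive shell and the database service shows which context the configuration files actually reached.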

PGbouncer ERROR accept() failed: Too many open files

I am running a server with 20 cpu cores and 96 GB of ram. I have configured Postgresql and Pgbouncer to handle 1000 connections at a time.
However when the connections increase (even though they are well below the 1000 limit I have set) I start getting failed connections. I checked the pgbouncer log and I noticed the following
ERROR accept() failed: Too many open files
What limit do I need to increase to solve this issue? I am running Debian 8
Increase the operating system limit on the maximum number of open files for the user under which pgBouncer is running.
I added the parameters below to the pgBouncer service file. After that I ran pgbench again, and the problem was solved.
The file limit depends on your Linux system's limits. You can check them with these commands:
cat /proc/sys/fs/file-max
ulimit -n
ulimit -Sn
ulimit -Hn
vim /lib/systemd/system/pgbouncer.service
[Service]
# A single value sets both the soft and the hard limit; LimitNOFILESoft= is not a unit file setting
LimitNOFILE=64000

how could I relaunch mongodb service after killed by OOM

I couldn't find the reason why the MongoDB service is killed by the OS about every 2 days.
Is there any workaround to relaunch the MongoDB service when it gets killed?
I've already set some system configs to avoid the MongoDB service being killed.
Thanks so much.
/etc/security/limits.conf
* hard nofile unlimited
* soft nofile unlimited
root hard nofile unlimited
root soft nofile unlimited
/etc/sysctl.conf
vm.oom-kill = 0
vm.overcommit_memory = 1
Try adding these lines to /etc/init/mongod.conf:
respawn
respawn limit 10 100
Upstart will then restart the service whenever it dies, and will stop retrying only if it respawns more than 10 times within 100 seconds.

After tuning Postgresql, PgBench results are worse

I am testing PostgreSQL on an 8gb Ram/4 CPUs/ 80gb SSD cloud server from Digital Ocean. I originally ran PgBench with default settings in the postgresql.conf, and then altered some common settings--shared_buffers, work_mem, maintenance_work_mem, effective_cache_size--to reflect the 8gb of RAM. After running the 2nd set of tests, I noticed that some of my results were actually worse. Any suggestions on why this might be? I am rather new to PgBench and tuning PostgreSQL in general.
Settings:
shared_buffers = 2048MB
work_mem = 68MB
maintenance_work_mem = 1024MB
effective_cache_size = 4096MB
Tests:
pgbench -i -s 100
pgbench -c 16 -j 2 -T 60 -U postgres postgres
pgbench -S -c 16 -j 2 -T 60 -U postgres postgres
pgbench -c 16 -j 4 -T 60 -U postgres postgres
pgbench -S -c 16 -j 4 -T 60 -U postgres postgres
pgbench -c 16 -j 8 -T 60 -U postgres postgres
pgbench -S -c 16 -j 8 -T 60 -U postgres postgres
How effective are these tests? Is this an effective way to employ PgBench? How should I customize tests to properly reflect my data and server instance?
What do you mean by "worse"? How long did you run pgbench? The test should run for at least about 2 hours to give realistic values. Which version of PostgreSQL do you have?
Attention: you should be very careful when interpreting pgbench results. You should probably optimize the execution of your application, not pgbench. pgbench is good for checking hardware or software, but it is a bad tool for optimizing PostgreSQL configuration.
The configuration variables you mention are the basic ones, and you probably cannot go far wrong with them (the server must never actively use swap, and these variables ensure that).
A formula that I use:
-- Dedicated server 8GB RAM
shared_buffers = 1/3 .. 1/4 dedicated RAM
effective_cache_size = 2/3 dedicated RAM
maintenance_work_mem > larger than the biggest table (if possible)
else 1/10 RAM
else max_connections * 1/4 * work_mem
work_mem = precise setting is based on slow query analysis
(first setting about 100MB)
--must be true
max_connections * work_mem * 2 + shared_buffers
+ 1GB (O.S.) + 1GB (filesystem cache) <= RAM size
The default values for the WAL buffer size and checkpoint segments are usually too low as well, and you can increase them.
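The "must be true" inequality above can be checked numerically. This is a minimal sketch using the asker's settings (8GB taken as 8192MB, shared_buffers 2048MB, work_mem 68MB). The value max_connections = 100 is an assumption, since the question does not state it, and the function names are hypothetical.

```python
def memory_needed_mb(max_connections, work_mem_mb, shared_buffers_mb,
                     os_mb=1024, fs_cache_mb=1024):
    """Left-hand side of the rule of thumb, in MB."""
    return max_connections * work_mem_mb * 2 + shared_buffers_mb + os_mb + fs_cache_mb


def max_safe_connections(ram_mb, work_mem_mb, shared_buffers_mb,
                         os_mb=1024, fs_cache_mb=1024):
    """Largest max_connections for which the inequality still holds."""
    available = ram_mb - shared_buffers_mb - os_mb - fs_cache_mb
    return max(available // (2 * work_mem_mb), 0)


ram_mb = 8192
print(memory_needed_mb(100, 68, 2048) <= ram_mb)  # assumed max_connections = 100
print(max_safe_connections(ram_mb, 68, 2048))
```

Under these assumptions the rule is violated at 100 connections (17696MB needed against 8192MB), and it holds only up to 30 connections, which suggests work_mem = 68MB is generous for an 8GB server.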

increase item max size in memcached?

I am using memcached on my CentOS server. My project is large and has objects bigger than 1MB that I need to save to memcached, but I can't, because the max_item_size is 1MB. Is there any way to change that?
Thank you
You can change the limit quickly by editing the configuration file [/etc/memcached.conf] and adding:
# Increase limit
-I 128M
Or, if you have trouble with the OS config, run it from the command line directly:
memcached -I 128M
If you are using Memcache >= 1.4.2, this is now configurable. Here is an example of how to set this in your init script for starting Memcache on CentOS: http://www.alphadevx.com/a/387-Changing-the-maximum-item-size-allowed-by-Memcache
You can compile memcached and change the memory allocation setting to use POWER_BLOCKs, in the slabs.c file (or you can recompile and use malloc/free, but that is the greater of the evils).
http://code.google.com/p/memcached/wiki/FAQ#Why_are_items_limited_to_1_megabyte_in_size?
I would seriously consider what you are caching and how it can be more modular, > 1MB in active memory is large.
Spent tons of time to figure this out:
in /etc/sysconfig/memcached edit options
OPTIONS="-l 127.0.0.1 -I 3m"
then run systemctl restart memcached for it to take effect.
I would recommend the option -l 127.0.0.1, which restricts memcached to localhost use only; -I 3m increases the limit as described above.
With CentOS 7 I had no luck with these paths: /etc/memcached.conf, /etc/default/memcached
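Before raising -I, it can help to estimate how large the cached objects really are. This is a rough sketch that assumes the client serializes values with pickle and ignores the key and per-item overhead, so treat the result as approximate. Both helper names are hypothetical.

```python
import pickle


def parse_item_size(option):
    """Convert a memcached -I value such as '3m' or '512k' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2}
    suffix = option[-1].lower()
    if suffix in units:
        return int(option[:-1]) * units[suffix]
    return int(option)


def fits_in_item(obj, item_size_option="1m"):
    """Approximate check: does the pickled object fit under the item size limit?"""
    return len(pickle.dumps(obj)) <= parse_item_size(item_size_option)


payload = b"x" * (2 * 1024 * 1024)  # a 2MB value
print(fits_in_item(payload, "1m"))  # False: over the default 1MB limit
print(fits_in_item(payload, "3m"))  # True: fits once the limit is 3MB
```

If most objects are only slightly over 1MB, a modest value such as the 3m used above may be enough, in line with the advice to reconsider caching very large items.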