MongoDB connection pools with cross-region replicas

We have a cluster hosted on MongoDB Atlas (M50, AWS) with cross-region replicas in 5 other regions. This allows our application servers in those regions to read from a local replica using readPreference=nearest, which is much faster.
The M50 instance size has a maximum of 16000 connections.
The issue I face is that the connection pool creates a connection to each node in the cluster. With 5 other regions, each having 10 application servers and each server having a pool of 100 connections (the default), that's 5000 connections to every single node, even though those servers will only ever read from the replica in their local region. These connections take away from the connections available to the application servers in the primary region, which is doing all the writes (5000 writes per second). The primary region has 20 application servers. These servers are set to have a minimum of 500 connections each, which creates a minimum of 10000 connections. That's a total of 15000 connections.
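The arithmetic above can be sketched as a quick back-of-the-envelope calculation (region, server, and pool counts taken from the description above):

```java
public class ConnectionMath {
    public static void main(String[] args) {
        // Read-only regions: the driver pool on every app server opens
        // connections to every node, so each node sees the full remote load.
        int readRegions = 5, serversPerReadRegion = 10, defaultPoolSize = 100;
        int readRegionConnectionsPerNode = readRegions * serversPerReadRegion * defaultPoolSize; // 5000

        // Primary region: 20 servers with a minimum pool of 500 each.
        int primaryServers = 20, minPoolSize = 500;
        int primaryConnectionsPerNode = primaryServers * minPoolSize; // 10000

        int total = readRegionConnectionsPerNode + primaryConnectionsPerNode;
        System.out.println(total + " of 16000 connections"); // prints: 15000 of 16000 connections
    }
}
```

With only 1000 connections of headroom on the M50, even a modest pool expansion during a spike exhausts the limit.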
This creates a problem: we sometimes run out of connections when traffic spikes, as the pools grow to cope with the additional load. The minimum pool size is required to keep the application responsive through those spikes (without it we see a lot of MongoWaitQueueFullExceptions and unacceptably high response times). We could lower the maximum pool size, but that limits throughput and we see timeouts.
Is there any way that the application pools in the replica regions can be prevented from creating connections to every single node in the cluster?
We don't want to increase the instance size as it doubles the cost of the cluster.
The application is written in .NET 6 (minimal API) with MongoDB driver version 2.16.
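For reference, the pool bounds and read preference in question live on the MongoDB connection string; a sketch with a placeholder host (cluster0.example.mongodb.net is not a real cluster):

```
mongodb+srv://user:pass@cluster0.example.mongodb.net/?readPreference=nearest&maxPoolSize=100&minPoolSize=0
```

Note that maxPoolSize and minPoolSize apply per server, which is exactly why every node in the cluster ends up carrying each application server's pool.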


How to close SQL connections of old Cloud Run revisions?

Context
I am running a Spring Boot application on Cloud Run which connects to a Postgres 11 Cloud SQL database using a Hikari connection pool. I am using the smallest PSQL instance (1 vCPU / 614 MB / 25-connection limit). For the setup, I have followed these resources:
Connecting to Cloud SQL from Cloud Run
Managing database connections
Problem
After deploying the third revision, I get the following error:
FATAL: remaining connection slots are reserved for non-replication superuser connections
What I found out
The default connection pool size is 10, which is why it fails on the third deployment (3 × 10 = 30 > 25).
When deleting an old revision, active connections shown in the Cloud SQL admin panel drop by 10, and the next deployment succeeds.
Question
It seems that old Cloud Run revisions are being kept in a "cold" state, maintaining their connection pools. Is there a way to close these connections without deleting the revisions?
In the best practices section it says:
"...we recommend that you use a client library that supports connection pools that automatically reconnect broken client connections."
What is the recommended way of managing connection pools in Cloud Run, given that it seems old revisions somehow manage to maintain their connections?
Thanks!
Currently, Cloud Run doesn't provide any guarantees on how long an instance will remain warm after it starts up. When not in use, the instance is severely throttled but not necessarily shut down. Thus, you have some revisions that are holding on to connections even when no traffic is being directed to them.
Even in this situation, I disagree with the idea that you should avoid connection pooling. Connection pooling can lower latency, improve stability, and help put an upper limit on the number of open connections. Instead, you can use some of the following configuration options to help keep your pool in check:
minimumIdle - This property controls the minimum number of idle connections that HikariCP tries to maintain in the pool. If the idle connections dip below this value and total connections in the pool are less than maximumPoolSize, HikariCP will make a best effort to add additional connections quickly and efficiently.
maximumPoolSize - This property controls the maximum size that the pool is allowed to reach, including both idle and in-use connections.
idleTimeout - This property controls the maximum amount of time that a connection is allowed to sit idle in the pool. This setting only applies when minimumIdle is defined to be less than maximumPoolSize. Idle connections will not be retired once the pool reaches minimumIdle connections.
If you set minimumIdle to 0, your application will still be able to use up to maximumPoolSize connections at once. However, once a connection has been idle in the pool for the idleTimeout period, it will be closed. If you set idleTimeout to something small like 1 minute, the number of connections in your pool can scale down to 0 when not in use.
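As a sketch, those settings could go in a Spring Boot application.properties (the values here are illustrative, not recommendations):

```
# let the pool drain to zero while the revision is idle
spring.datasource.hikari.minimum-idle=0
spring.datasource.hikari.maximum-pool-size=10
# close connections idle for more than 60 seconds (value is in milliseconds)
spring.datasource.hikari.idle-timeout=60000
```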
Hope this helps!
The issue here is that the connections don't get closed by HikariCP once they are opened. I don't know much about Hikari, but I found this, which explains how connections should be handled through Hikari. I hope that helps!

Number of concurrent database connections

We are using an Amazon r3.8xlarge Postgres RDS instance for our production server. I checked the max connections limit of the RDS; it happens to be 8192.
I have a service deployed in ECS, and each ECS task takes one database connection. The tasks go up to 2000 during peak load, which means we will have 2000 concurrent connections to the database.
I want to check whether it is OK to have 2000 concurrent connections to the database, and secondly, whether it will impact the performance of Amazon Postgres RDS.
Having 2000 connections at a time should not cause any performance issues, since AWS manages the performance part. There are many DB load-testing tools available if you want to be more sure about this.
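Before relying on that, it may be worth verifying the headroom directly during peak load; a sketch in plain SQL against Postgres's built-in views:

```sql
-- compare the current connection count against the server's configured limit
SELECT count(*) AS current_connections,
       current_setting('max_connections') AS max_connections
FROM pg_stat_activity;
```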

How to scale the total number of connection with pgpool load balancing?

I have 3 PostgreSQL databases (one master and two slaves) behind pgpool. Each database can handle 200 connections, and I want to be able to get 600 active connections through pgpool.
My problem is that if I set pgpool to 600 child processes, it can open all 600 connections against a single database (the master, for example, if every connection makes a write query), but with 200 child processes I only see roughly 70 connections on each database.
So is there a way to configure pgpool so that load balancing scales with the number of databases?
Thanks.
Having 600 connections open against each database is not an ideal solution. I would really look into the application before setting such a high connection count.
pgpool's load-balancing scalability can be improved by setting equal backend_weight parameters, so that read queries get distributed equally among the PostgreSQL nodes.
pgpool also manages its database connection pool using the num_init_children and max_pool parameters.
The num_init_children parameter is used to spawn the pgpool processes that will connect to each PostgreSQL backend.
The num_init_children value is also the allowed number of concurrent clients that can connect to pgpool.
pgpool roughly tries to make max_pool * num_init_children connections to each PostgreSQL backend.
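Putting that together, a pgpool.conf sketch for three equally weighted backends (the values are illustrative, not a recommendation):

```
# each backend sees roughly num_init_children * max_pool connections,
# so 200 children with max_pool = 1 keep each 200-connection backend at its limit
num_init_children = 200
max_pool = 1

# equal weights so load-balanced read queries are distributed evenly
backend_weight0 = 1
backend_weight1 = 1
backend_weight2 = 1
```

Note that each pgpool child holds connections to every backend, which is why the per-backend limit, not the sum across backends, caps the number of concurrent clients.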

How to shift internal communication of nodes in a MongoDB cluster to another network to decrease the load on the main network

I have created an 8-node MongoDB cluster with 2 shards + 2 replicas (1 for each shard) + 3 config servers + 1 mongos.
All of these are on network 192.168.1.x (eth0) together with the application server, so this network is handling all the traffic.
I have therefore created another network, 192.168.10.x (eth1), which contains only these 8 MongoDB nodes.
Now all eight nodes are part of both networks, with dual IPs.
I want to shift the internal traffic between these MongoDB nodes to network 192.168.10.x (eth1) to reduce the load on the main network 192.168.1.x (eth0).
So how do I bind the ports/nodes for this purpose?
You can use bind_ip as a startup or configuration option. Keep in mind that the various nodes need to remain accessible to each other in the event of a failover.
Notable here is your single mongos: it would be advisable either to co-locate a mongos with each app server or, depending on requirements, to make a pool of them available to your driver connection. Preferably both, with a larger instance for each mongos where aggregate operations are used.
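A minimal sketch of the corresponding mongod.conf (the eth1 address here is a placeholder for each node's own internal IP):

```yaml
# mongod.conf: listen on localhost and the internal eth1 network
net:
  port: 27017
  bindIp: 127.0.0.1,192.168.10.21
```

Note that bindIp only controls which interfaces the node listens on; for member-to-member traffic to actually use eth1, the replica-set and shard member addresses themselves must resolve to the internal-network IPs.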
I found the solution to the problem I was looking for: I configured the cluster according to the IPs of network 192.168.11._
Now the internal data traffic is going through this network.

ElasticSearch architecture and hosting

I am looking to use Elasticsearch with MongoDB to support our full-text search requirements. I am struggling to find information on the architecture and hosting and would like some help. I am planning to host ES on premise rather than in the cloud. We currently have MongoDB running in a replica set with three nodes.
How many servers are required to run ElasticSearch for high availability?
What is the recommended server specification? Currently my thoughts are 2 x CPU, 4 GB RAM, C drive: 40 GB, D drive: 40 GB.
How does ES support failover?
Thanks
Tariq
How many servers are required to run ElasticSearch for high availability?
At least 2
What is the recommended server specification. Currently my thoughts are 2 x CPU, 4GB RAM, C drive: 40GB , D drive: 40GB
It really depends on the amount of data you're indexing, but that amount of RAM (and, I'm assuming, a decent dual-core CPU) should be enough to get you started.
How does ES support failover
You set up a cluster with multiple nodes in such a way that each node holds a replica of another's shards.
So in a simple example, your cluster would consist of two servers, each with one node on them.
You'd set replicas to 1 so that the shards in one node have a backup copy stored on the other node, and vice versa.
So if a node goes down, Elasticsearch will detect the failure and route the requests for that node to its replica on another node until you fix the problem.
Of course you could make this even more robust by having 4 servers with one node each and 2 replicas, as an example. What you must understand is that Elasticsearch will optimize the distribution of replicas and primary shards based on the number of shards you have.
So with the 2-node, 1-replica example above, say you added 2 extra servers/nodes (1 node per server is recommended): Elasticsearch would move the replicas onto their own nodes, so that you'd have 2 nodes holding 1 primary shard each and nothing else, and 2 other nodes each holding 1 copy of those shards (the replicas).
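The two-server example above corresponds to index settings like these, set via the Elasticsearch REST API (the index name is illustrative):

```
PUT /my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
```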
How many servers are required to run ElasticSearch for high availability?
I recommend 3 servers with a replication factor of 3 on the index. It will be more stable in case one server goes down, plus it's better for high load, because queries can be distributed across the cluster.
What is the recommended server specification? Currently my thoughts are 2 x CPU, 4 GB RAM, C drive: 40 GB, D drive: 40 GB.
I strongly recommend more RAM. We have 72 GB on each machine in the cluster and ES runs perfectly smoothly (and we have never run into garbage-collector issues).
How does ES support failover?
In our case at http://indexisto.com we have had a lot of test and some production cluster server failures. Starting from 3 servers, there were no issues when a server went down. The more servers in the cluster, the smaller the impact of a single server failing.