Too many clients in PostgreSQL in Docker Swarm with Spring Boot

I set up a development environment in a Docker Swarm cluster consisting of 2 nodes, a few networks, and a few microservices. The following gives an example of how it looks in the cluster.
Service               | Network           | Node   | Image Version
----------------------|-------------------|--------|---------------------
nginx reverse proxy   | backend, frontend | node 1 | latest stable-alpine
Service A             | backend, database | node 2 | 8-jre-alpine
Service B             | backend, database | node 2 | 8-jre-alpine
Service C             | backend, database | node 1 | 8-jre-alpine
Database (postgresql) | database          | node 1 | latest alpine
The services are Spring Boot 2.1.7 applications built with spring-boot-starter-data-jpa. All of the services above hold a database connection to the PostgreSQL instance. For the database I configured only the following properties in application.properties:
spring.datasource.url
spring.datasource.username
spring.datasource.password
spring.datasource.driver-class-name
spring.jpa.hibernate.ddl-auto=
spring.jpa.properties.hibernate.jdbc.lob.non_contextual_creation=true
After some time I see that the connection limit in PostgreSQL is exceeded, which prevents any new connection from being created:
2019-09-21 13:01:07.031 1 --- [onnection adder] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Cannot acquire connection from data source org.postgresql.util.PSQLException: FATAL: sorry, too many clients already
A similar error is also shown when I try to connect to the database over SSH:
psql: FATAL: sorry, too many clients already
What I have tried so far:
spring.datasource.hikari.leak-detection-threshold=20000
which didn't help.
I found several answers to this problem, such as:
- Increase the connection limit in PostgreSQL. I don't want to do this; it is only a temporary solution, and the connections would pile up again, just a bit later.
- Add an idle timeout to the HikariCP configuration. HikariCP already has a default idle timeout of 10 minutes, which doesn't help.
- Add a max lifetime to the HikariCP configuration. HikariCP already has a default max lifetime of 30 minutes, which doesn't help.
- Reduce the number of idle connections in the HikariCP configuration. The HikariCP default is already 10, which doesn't help.
- Set minimum idle in the HikariCP configuration. The default is 10 and I am fine with it.
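For reference, those HikariCP knobs can also be pinned explicitly per service in application.properties. A minimal sketch using Spring Boot 2.x's standard spring.datasource.hikari.* keys; the values are illustrative, not recommendations:
# explicit pool sizing and lifetimes, all values illustrative
spring.datasource.hikari.maximum-pool-size=10
spring.datasource.hikari.minimum-idle=2
spring.datasource.hikari.idle-timeout=600000
spring.datasource.hikari.max-lifetime=1800000
spring.datasource.hikari.leak-detection-threshold=20000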
I am expecting around 30 connections for the services (3 services × HikariCP's default pool size of 10), but I find nearly 100 connections. Restarting or stopping the services does not close the idle connections either. What are your suggestions? Is this a Docker-specific problem? Has anyone experienced the same problem?
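For context, the per-client connection count can be checked directly from PostgreSQL's pg_stat_activity view (standard columns), for example:
-- group server connections by client application and state
SELECT application_name, client_addr, state, count(*)
FROM pg_stat_activity
GROUP BY application_name, client_addr, state
ORDER BY count(*) DESC;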

Related

SQLAlchemy with Aurora Serverless V2 PostgreSQL - many connections

I have an AWS Aurora Serverless V2 (PostgreSQL) database that is being accessed from a compute cluster. The cluster launches a large number of jobs (>1000), and each job independently puts/pulls some data from the database. The Serverless cluster is set up to autoscale from 2 to 32 units as needed.
The code being run by each cluster job uses SQLAlchemy (either the ORM or the core). I am setting up each database connection with a null pool and pessimistic disconnect handling (i.e., pool_pre_ping=True). From my reading of the docs, this should handle disconnects caused by connections going stale while idle.
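A minimal sketch of this setup (the URL and credentials are placeholders):
# no client-side pooling, plus pessimistic disconnect detection
from sqlalchemy import create_engine, text
from sqlalchemy.pool import NullPool

engine = create_engine(
    "postgresql+psycopg2://user:password@aurora-host:5432/jobs_db",
    poolclass=NullPool,    # each connect opens a fresh database connection
    pool_pre_ping=True,    # verify the connection is alive before using it
)

with engine.connect() as conn:
    value = conn.execute(text("SELECT 1")).scalar()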
The code is also written to access the DB, get the results, close the connection (to avoid idle connections), and then reopen a connection after processing (5-30 minutes). This works well: once processing is completed, the new connections are staggered and the DB has scaled up.
My logs show the standard all-connection-slots-taken error until the DB scales the available units high enough: psycopg2.OperationalError: FATAL: remaining connection slots are reserved for non-replication superuser and rds_superuser connections
Questions:
Should I be configuring the SQLAlchemy connection differently? It feels like an anti-pattern to add a custom retry that grabs a connection while waiting for the DB to scale the number of available units, as this kind of capability usually seems to be built into SQLAlchemy.
Should I be using an RDS Proxy in front of the database? This also seems like an anti-pattern: adding a proxy in front of an autoscaling DB.
PG version is 10.

MongoDB clients stopped connecting to secondary nodes of the replicaset

Our application stopped connecting to the MongoDB secondary replica set members. We have the read preference set to secondary:
mongodb://db0.example.com,db1.example.com,db2.example.com/?replicaSet=myRepl&readPreference=secondary&maxStalenessSeconds=120
Connections always go to the primary, overloading the primary node. This issue started after patching and a restart of the servers.
I tried connecting with the mongo shell using the URI above, and the command was abruptly terminated, although I can see the process for that connection on the server in ps -ef | grep mongo.
Has anyone faced this issue? Any troubleshooting tips are appreciated. The logs aren't showing anything related to the terminated/stopped connection process.
We were able to fix the issue. It was a problem on the Spring Boot side: we have two beans, one for primary and one for secondary connections, and once the right bean was injected, connections were established to the secondary node for heavy reading and reporting purposes.
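For reference, a minimal sketch of that kind of two-bean arrangement (bean names and the database name are hypothetical; the URIs reuse the hosts from the question):
// two MongoTemplate beans: the default routes to the primary, the second
// routes heavy reads to secondaries via readPreference=secondary
import com.mongodb.client.MongoClients;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.data.mongodb.core.MongoTemplate;

@Configuration
public class MongoConfig {

    @Bean
    @Primary
    public MongoTemplate primaryMongoTemplate() {
        return new MongoTemplate(MongoClients.create(
            "mongodb://db0.example.com,db1.example.com,db2.example.com/?replicaSet=myRepl"),
            "mydb");
    }

    @Bean
    public MongoTemplate secondaryMongoTemplate() {
        return new MongoTemplate(MongoClients.create(
            "mongodb://db0.example.com,db1.example.com,db2.example.com/?replicaSet=myRepl&readPreference=secondary&maxStalenessSeconds=120"),
            "mydb");
    }
}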

High number of connections to Airflow Metadata DB

I tried to find information about the number of connections that Airflow establishes with the metadata database instance (Postgres in my case).
By running select * from pg_stat_activity I realized it creates at least 7 connections whose states change between idle and idle in transaction. The queries are registered as COMMIT or SELECT 1 (mostly). This was using the LocalExecutor on Airflow 2.1, but I tested with an installation of Airflow 1.10 and got the same results.
Is anyone aware of where these connections come from? And, is there a way (and a reason) to change this?
Yes. Airflow opens a large number of connections: basically every process it creates will almost certainly open at least one connection. This is a "known" characteristic of Apache Airflow.
If you are using MySQL, this is not a big issue, as MySQL is good at handling multiple connections (it multiplexes incoming connections via threads). Postgres uses a process-per-connection approach, which is much more resource-hungry.
The recommended way to handle that (Postgres is the most stable backend for Airflow) is to use PGBouncer to proxy such connections to Postgres.
In our official Helm chart, PGBouncer is used by default when Postgres is used: https://airflow.apache.org/docs/helm-chart/stable/index.html
I highly recommend this approach.
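For example, PGBouncer can be enabled and sized through the chart's values. A sketch; verify the key names against your chart version:
# values.yaml — put PGBouncer in front of the Postgres metadata DB
pgbouncer:
  enabled: true
  maxClientConn: 100        # cap on client connections PGBouncer accepts
  metadataPoolSize: 10      # server-side pool toward the metadata DB
  resultBackendPoolSize: 5  # server-side pool toward the result backend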

Releasing JDBC connections after kubernetes pod is killed

I am running a couple of Spring Boot apps in Kubernetes. Each app uses Spring Data JPA to connect to a PostgreSQL 10.6 database. What I have noticed is that when pods are killed unexpectedly, their connections to the database are not released.
Running SELECT sum(numbackends) FROM pg_stat_database; on the database returns, let's say, 50. After killing a couple of pods running the Spring app and rerunning the query, the number jumps to 60. This eventually causes the number of connections to PostgreSQL to exceed the maximum and prevents the restarted pods' applications from connecting to the database.
I have experimented with the PostgreSQL option idle_in_transaction_session_timeout, setting it to 15s, but this does not drop the old connections, and the number keeps increasing.
I am using the com.zaxxer.hikari.HikariDataSource data source in my Spring apps and was wondering whether there is a way to prevent this from happening, on either the PostgreSQL or Spring Boot side.
Any advice or suggestions are welcome.
Spring Boot version: 2.0.3.RELEASE
Java version: 1.8
PostgreSQL version: 10.6
This issue can arise not only with Kubernetes pods but also with a simple application running on a server that is killed forcibly (like kill -9 pid on Linux) and not given the opportunity to clean up via Spring Boot's shutdown hook. I think in this case nothing on the application side can help. You can, however, try cleaning up inactive connections through several methods on the database side, as mentioned here.
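For example, one such database-side cleanup is to terminate backends that have sat idle for too long (the 15-minute threshold here is arbitrary, and the query requires superuser or pg_signal_backend rights):
-- terminate client backends idle for more than 15 minutes
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND (now() - state_change) > interval '15 minutes'
  AND pid <> pg_backend_pid();  -- never kill the current session
Setting the server's TCP keepalive parameters (tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count in postgresql.conf) can also help PostgreSQL notice peers that vanished without closing the socket.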

pgbouncer 1.7 with master and slave

I am new to pgbouncer 1.7 and I want to configure it in a master-slave setup.
I have configured Postgres 9.3 streaming replication using repmgr, and I want to use pgbouncer for load balancing and connection pooling so that it automatically switches to the slave if the master goes down. How should I configure it for this? I have the master and slave on different servers, and pgbouncer on yet another server. Do I need to install pgbouncer on both the master and slave servers for it to work, or is installing it on a separate server enough?
I have tried many online tutorials but sadly didn't find any suggestions. Please help if anyone can.
Thanks in advance,
Mohit
PgBouncer does not handle automatic failover, propagation, or rebuilding the ex-master. You can, however, fail over by changing the IP behind the same hostname:
https://pgbouncer.github.io/faq.html
How to failover
PgBouncer does not have internal failover-host configuration nor detection. It is possible via some external tools:
DNS reconfiguration - when the IP behind a DNS name is reconfigured, pgbouncer will reconnect to the new server. This behaviour can be tuned via 2 config parameters: dns_max_ttl tunes the lifetime for one hostname, and dns_zone_check_period tunes how often the zone SOA will be queried for changes. If the zone SOA record has changed, pgbouncer will re-query all hostnames under that zone.
Write the new host to the config and let PgBouncer reload it - send SIGHUP or use the RELOAD; command on the console. PgBouncer will detect the changed host config and reconnect to the new server.
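For illustration, a pgbouncer.ini sketch wired for that DNS-based approach (hosts, credentials, and timings are placeholders):
; route the logical database through a DNS name that failover re-points
[databases]
appdb = host=db-master.example.com port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = session
; re-resolve the hostname quickly so a DNS failover is picked up
dns_max_ttl = 15
dns_zone_check_period = 30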
Pgpool has automatic failover, if you want to try it.