SQLAlchemy with Aurora Serverless V2 PostgreSQL - many connections

I have an AWS Aurora Serverless v2 database (PostgreSQL) that is being accessed from a compute cluster. The cluster launches a large number of jobs (>1000), and each job independently puts/pulls some data from the database. The Serverless cluster is set up to autoscale from 2 to 32 capacity units as needed.
The code run by each cluster job uses SQLAlchemy (either the ORM or the Core). I am setting up each engine with a NullPool and pessimistic disconnect handling (i.e., pool_pre_ping=True). From my reading of the docs, this should handle disconnects caused by connections going stale while idle.
The code is also written to access the DB, get the results, close the connection (to avoid idle connections), and then reopen a connection after processing (5-30 minutes). This works well because, by the time processing completes, the new connections are staggered and the DB has scaled up.
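For reference, a minimal sketch of that setup, assuming psycopg2 as the driver (the DSN, table, and column names below are placeholders, not the actual schema):

```python
import sqlalchemy as sa
from sqlalchemy.pool import NullPool

# Placeholder DSN -- host/credentials are illustrative only.
DB_URL = "postgresql+psycopg2://user:password@my-cluster.cluster-xyz.us-east-1.rds.amazonaws.com/mydb"

# NullPool: no connections are held between checkouts.
# pool_pre_ping: each checkout issues a lightweight ping first and
# transparently reconnects if the connection has gone stale.
engine = sa.create_engine(DB_URL, poolclass=NullPool, pool_pre_ping=True)

def fetch_work_item(job_id: int):
    # Open, query, and close around each round trip so no connection
    # sits idle during the 5-30 minutes of processing.
    with engine.connect() as conn:
        row = conn.execute(
            sa.text("SELECT payload FROM jobs WHERE id = :id"), {"id": job_id}
        ).fetchone()
    return row
```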
My logs show the standard all-connection-slots-taken error until the DB scales up enough capacity units: psycopg2.OperationalError: FATAL: remaining connection slots are reserved for non-replication superuser and rds_superuser connections
Questions:
Should I be configuring the SQLAlchemy connection differently? It feels like an anti-pattern to add a custom retry that keeps trying to grab a connection while waiting for the DB to scale up (a sketch of what that would look like follows this question), since this kind of capability usually seems to be built into SQLAlchemy already.
Should I be using an RDS Proxy in front of the database? This also seems like an anti-pattern: adding a proxy in front of an autoscaling DB.
PG version is 10.
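For concreteness, the kind of custom retry under consideration would look roughly like the sketch below; the attempt count and backoff values are arbitrary, and the open question is whether this belongs in application code at all.

```python
import random
import time

from sqlalchemy.exc import OperationalError

def connect_with_retry(engine, attempts=8, base_delay=2.0):
    """Keep trying to check out a connection while the cluster scales up.

    Attempt count and backoff values are illustrative, not recommendations.
    """
    for attempt in range(attempts):
        try:
            return engine.connect()
        except OperationalError:
            # "remaining connection slots are reserved..." surfaces as an
            # OperationalError; back off with jitter and try again.
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
```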

Related

High number of connections to Airflow Metadata DB

I tried to find information about the number of connections that Airflow establishes with the metadata database instance (Postgres in my case).
By running select * from pg_stat_activity I realized it creates at least 7 connections whose states alternate between idle and idle in transaction. The queries are registered as COMMIT or SELECT 1 (mostly). This was using the LocalExecutor on Airflow 2.1, but I tested with an installation of Airflow 1.10 with the same results.
Is anyone aware of where these connections come from? And, is there a way (and a reason) to change this?
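As a rough way to see where the connections come from, the same pg_stat_activity view can be grouped by application_name and state; a sketch assuming direct psycopg2 access to the metadata DB (the DSN is a placeholder):

```python
import psycopg2

# Placeholder DSN for the Airflow metadata database.
conn = psycopg2.connect("dbname=airflow user=airflow host=localhost")
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT application_name, state, count(*) AS n
        FROM pg_stat_activity
        WHERE datname = current_database()
        GROUP BY application_name, state
        ORDER BY n DESC
        """
    )
    for app, state, n in cur.fetchall():
        print(app, state, n)
conn.close()
```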
Yes. Airflow will open a big number of connections - basically every process it creates will almost certainly open at least one connection. This is a "known" characteristic of Apache Airflow.
If you are using MySQL, this is not a big issue, as MySQL is good at handling multiple connections (it multiplexes incoming connections via threads). Postgres uses a process-per-connection approach, which is much more resource-hungry.
The recommended way to handle that (Postgres is the most stable backend for Airflow) is to use PgBouncer to proxy such connections to Postgres.
In our official Helm Chart, PgBouncer is used by default when Postgres is used: https://airflow.apache.org/docs/helm-chart/stable/index.html
I highly recommend this approach.
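Outside the Helm chart, the same idea can be applied manually by pointing Airflow's metadata connection string at a PgBouncer instance instead of at Postgres directly. A sketch assuming Airflow 2.1's [core] sql_alchemy_conn option; the hostname and credentials are placeholders, and 6432 is PgBouncer's default listen port:

```python
import os

# Must be set in the environment of every Airflow process (scheduler,
# webserver, workers) before Airflow starts; placeholder host/credentials.
os.environ["AIRFLOW__CORE__SQL_ALCHEMY_CONN"] = (
    "postgresql+psycopg2://airflow:CHANGE_ME@pgbouncer.internal:6432/airflow"
)
```

With this in place, the many short-lived Airflow processes share PgBouncer's small pool of server-side connections to Postgres.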

Why does RDS proxy make performance worse?

I deployed an RDS Aurora cluster for PostgreSQL 11 in AWS. My Lambda talks to this cluster via IAM authentication. Since Lambda is serverless, I have to create a connection to the database every time my Lambda is triggered and close it when the invocation finishes. This is not great, since creating a DB connection is heavy and takes time. I have used X-Ray to observe the connection performance: it takes 150 ms to create a new connection. It also puts a lot of load on the DB cluster, since there are many short-lived connections.
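For illustration, the per-invocation pattern being described looks roughly like this (region, endpoint, user, and database names are placeholders); the ~150 ms connection cost is paid inside handler() on every invocation:

```python
import boto3
import psycopg2

REGION = "us-east-1"                                            # placeholder
DB_HOST = "mycluster.cluster-xyz.us-east-1.rds.amazonaws.com"   # placeholder
DB_USER = "lambda_user"                                         # placeholder
DB_NAME = "mydb"                                                # placeholder

rds = boto3.client("rds", region_name=REGION)

def handler(event, context):
    # Generate a short-lived IAM auth token and open a brand-new
    # connection for this invocation -- this is the per-call cost.
    token = rds.generate_db_auth_token(
        DBHostname=DB_HOST, Port=5432, DBUsername=DB_USER, Region=REGION
    )
    conn = psycopg2.connect(
        host=DB_HOST, port=5432, dbname=DB_NAME, user=DB_USER,
        password=token, sslmode="require",
    )
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            return cur.fetchone()[0]
    finally:
        conn.close()  # connection is torn down at the end of every invocation
```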
After some searching I found that RDS Proxy is designed to solve this problem. So I deployed an RDS Proxy that uses username/password to connect to my Aurora cluster, and my Lambda connects to the RDS Proxy via IAM authentication.
When I observe the connection-creation performance, it is actually worse: it takes more than 500 ms to create a connection, and sometimes more than 1 second.
How come it is worse when using RDS proxy? Is there anything I didn't configure in the proxy?

Airflow database performance impact due to multiple connection to database

I have set up Airflow 1.10 to schedule Python DAGs. I am also working on another project, which needs data from Airflow's backend external PostgreSQL database (queried at a regular 5-minute interval).
Now I am trying to understand the impact on Airflow database performance due to the additional connections. Accordingly, I will plan my approach to getting the data from the Airflow database for the other purpose.

create a connection pool for many DBs on the same DB server (Spring Boot)

I'm looking for a way to create a connection pool for many DBs on the same DB server (PostgreSQL Aurora).
This means that I need the ability to change the target DB of a connection at run time.
Currently I'm using HikariCP for connection pooling, in a stack of Spring Boot and JHipster.
Background:
We need to deploy a multi-tenant micro-service system with a single DB server (to be specific, a single AWS Aurora PostgreSQL instance).
Our multi-tenancy solution is that each tenant has its own DB, and within that DB there is a schema for each service. All the DBs are in the same AWS Aurora instance.
Our problem:
With this deployment, we have a connection pool for each (tenant × micro-service instance) pair.
This leads to a huge number of connections.
For example, with a pool size of 50 connections/pool, we need 500 tenants × 20 micro-service instances × 50 connections/pool = 500,000 connections.
The maximum number of connections allowed on any Aurora DB is 16,000, and by default the "max_connections" parameter is typically set to something lower.
So now I'm looking for a way to make our pooling scope larger, so that many tenants can share the same pool. Since we use only one Aurora server instance, I think it should be possible to create a connection pool that can be shared among many tenants.
Is there any way to have a connection pool that can switch the DB at run time?
Unless Aurora has done some customization on this, you cannot change the database of a connection once it is established in PostgreSQL. You can still use a pooler, but it will effectively be a separate pool for each database. This is pretty fundamental; there is nothing you can do about it.

AWS RDS PostgreSQL Read replica with JDBC connection string

Will a read replica only be utilised if the application specifically creates a second connection pool explicitly for writes...
...or can I direct read requests to an [Aurora] RDS PostgreSQL read replica just by using connection string parameters?
https://jdbc.postgresql.org/documentation/head/connect.html
I have tested like this:
jdbc:postgresql://node1,node2/database?targetServerType=master&ssl=true&sslmode=verify-full
jdbc:postgresql://node1,node2/database?targetServerType=preferSlave&loadBalanceHosts=true&ssl=true&sslmode=verify-full
But the RDS console continues to show 0 connections for the read replica.
I'm paying for the read replica, so I'd really like to utilise it as more than just a redundant host in case of a failover.