Which NoSQL database is recommended for a web service with many simultaneous connections?
Related
Given a PostgreSQL database that is reasonably configured for its intended load, what factors would contribute to selecting an external/middleware connection pool (e.g. pgBouncer, pgPool) over a client-side connection pool (HikariCP, c3p0)? Lastly, in what instances would you apply both client-side and external connection pooling?
From my experience and understanding, the disadvantages of an external pool are:
additional failure point (including from a security standpoint)
additional latency
additional complexity in deployment
security complications with user credentials
In researching the question, I have come across instances where both client-side and external pooling are used. What is the motivation for such a deployment? In my mind, that compounds most of the disadvantages for a gain that I appear to be missing.
Usually, a connection pool on the application side is a good thing, for the reasons you detail (a minimal configuration is sketched after this list). An external connection pool only makes sense if
your application server does not have a connection pool
you have several (many) instances of the application server, so that you cannot effectively limit the total number of database connections with a connection pool in each application server
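For reference, a minimal client-side setup with HikariCP might look like the sketch below; the JDBC URL, credentials, and pool size are placeholder assumptions, not a recommendation:

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class ClientSidePool {
    public static void main(String[] args) {
        HikariConfig config = new HikariConfig();
        // Placeholder connection details
        config.setJdbcUrl("jdbc:postgresql://db.example.com:5432/app");
        config.setUsername("app_user");
        config.setPassword(System.getenv("DB_PASSWORD"));
        // Cap the pool well below the server's max_connections; with one
        // pool per app instance, the server sees up to
        // instances x maximumPoolSize connections in total.
        config.setMaximumPoolSize(10);
        try (HikariDataSource ds = new HikariDataSource(config)) {
            // Hand ds to the application/framework as a javax.sql.DataSource
        }
    }
}
```

The key point is the multiplication in the comment: a client-side pool caps connections per instance, so once you run many instances, only an external pooler (or a much smaller per-instance cap) can bound the server-side total.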
In https://cloud.google.com/sql/docs/quotas, it mentions that "Cloud Run services are limited to 100 connections to a Cloud SQL database." Assuming I deploy my service on Cloud Run, what is the right way to handle 1 million concurrent connections? Can Cloud Spanner enable this? I can't find documentation discussing the maximum number of concurrent connections on Cloud Spanner when used with Cloud Run.
Do you want Cloud Run to handle a million concurrent connections, or do you want Cloud SQL to handle a million concurrent connections?
If you want Cloud SQL to handle a million concurrent connections, you are probably doing something wrong. Check out this article about pool sizing (it's in a Java repo, but it is general enough to apply to all connection pooling). If you are at the point where you need a million concurrent database connections, you need to invest in a more advanced architecture (such as sharding).
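For context, that pool-sizing article argues that optimal pools are far smaller than people expect. A paraphrased sketch of its rule of thumb (the numbers are assumptions; effective_spindle_count is roughly the number of independent disks, often 1 for an SSD):

```java
public class PoolSizing {
    public static void main(String[] args) {
        // Rule of thumb from the HikariCP "About Pool Sizing" wiki (paraphrased):
        //   pool_size = (core_count * 2) + effective_spindle_count
        // where core_count is the DATABASE server's core count.
        int coreCount = 8;             // assumption: an 8-core database server
        int effectiveSpindleCount = 1; // assumption: a single SSD volume
        int poolSize = (coreCount * 2) + effectiveSpindleCount;
        System.out.println("Suggested pool size: " + poolSize); // prints 17
    }
}
```

A pool of a few dozen connections can typically serve thousands of concurrent application requests, which is why a million server-side connections is almost never the right target.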
I'm looking for a way to create a connection pool for many DBs on the same DB server (Aurora PostgreSQL).
This means that I need the ability to change the target DB of a connection at run time.
Currently I'm using HikariCP for connection pooling, in a stack of Spring Boot and JHipster.
Background:
we need to deploy a multi-tenant micro-service system with a single DB server (to be specific, a single AWS Aurora PostgreSQL instance)
our multi-tenancy solution gives each tenant its own DB, and within that DB there is a schema for each service. All the DBs are in the same AWS Aurora instance.
Our problem:
with this deployment, we have a connection pool for each (tenant x micro-service instance).
This leads to a huge number of connections.
For example, with a pool size of 50 connections/pool, we need: 500 tenants × 20 micro-service instances × 50 connections/pool = 500,000 connections.
The maximum number of connections allowed on any Aurora DB is 16,000, and by default the max_connections parameter is typically set even lower.
So now I'm looking for a way to widen the pooling scope, so that many tenants can share the same pool. Since we use only one Aurora server instance, I think it should be possible to create a connection pool that is shared between many tenants.
Is there any way to have a connection pool that can switch the DB at run time?
Unless Aurora has customized this behavior, you cannot change the database of a connection once it is established in PostgreSQL. You can still use a pooler, but it will effectively maintain a separate pool for each database. This is fundamental; there is nothing you can do about it.
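Given that constraint, one workaround (a sketch under assumptions, not part of the answer above: the JDBC URL pattern, credentials, and sizes are all hypothetical) is to keep a lazily created, very small HikariCP pool per tenant database, so that idle tenants release their connections:

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.util.concurrent.ConcurrentHashMap;
import javax.sql.DataSource;

// One tiny pool per tenant database, created on first use.
public class TenantPools {
    private final ConcurrentHashMap<String, HikariDataSource> pools = new ConcurrentHashMap<>();

    public DataSource forTenant(String tenantDb) {
        return pools.computeIfAbsent(tenantDb, db -> {
            HikariConfig cfg = new HikariConfig();
            cfg.setJdbcUrl("jdbc:postgresql://aurora.example.com:5432/" + db); // hypothetical host
            cfg.setUsername("app_user");                                       // hypothetical user
            cfg.setPassword(System.getenv("DB_PASSWORD"));
            cfg.setMaximumPoolSize(2);  // keep per-tenant pools tiny
            cfg.setMinimumIdle(0);      // idle tenants hold no connections
            cfg.setIdleTimeout(60_000); // retire idle connections after 60 s
            return new HikariDataSource(cfg);
        });
    }
}
```

Even with a cap of 2, 500 tenants × 20 instances can still open up to 20,000 connections in the worst case, so a server-side pooler in front of Aurora (with per-database limits) is usually still needed on top of this.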
I have a Dataflow job that reads from Pub/Sub and writes into Elasticsearch and PostgreSQL.
PostgreSQL is located behind the company firewall. For Dataflow to access the PostgreSQL DB, we have to set permissions on the firewall. But the problem is that we don't have the Dataflow workers' IP addresses, and we don't want to permit all incoming traffic to PostgreSQL.
What is the best way to secure writes from Dataflow to PostgreSQL?
We have a multi-tenant setup with one database for each tenant on a single server. Is it possible to maintain a common pool for all databases with pgbouncer? The number of databases on one server can range into the hundreds. While I can have a large number of connections from the application to pgbouncer, I am limited by the number of connections I can have with the PostgreSQL server. What is the best approach in this scenario?
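For illustration, a minimal pgbouncer.ini along these lines (host and sizes are assumptions) uses a wildcard entry so one pgbouncer serves every tenant database, while capping the server connections each database may consume; note that, as in the Aurora answer above, pgbouncer still keeps a separate server-side pool per database/user pair:

```ini
[databases]
; "*" is a fallback entry matching any database name the client asks for
* = host=db.example.com port=5432

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction    ; server connections are shared between transactions
max_client_conn = 10000    ; client connections pgbouncer will accept
default_pool_size = 5      ; server connections per database/user pair
max_db_connections = 20    ; hard cap on server connections per database
```

With hundreds of databases, default_pool_size × number of databases still has to fit under the server's max_connections, so transaction pooling plus small per-database pools is what keeps the total bounded.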