I have connection pools setup for my system to handle concurrent connections for my managed database clusters in DigitalOcean.
Overall, each client I have, has their own DB, then I create a pool for that connection to avoid the error:
FATAL: remaining connection slots are reserved for non-replication superuser connections
Yesterday I ran into connection issues with a default database that my system also uses, I hadn't thought the connection pooling was needed for whatever dumb reason or another. No worries, I started getting flooded with error emails and then fixed the system to use the correct pooling mechanism.
This is where my question comes in, with the pooling on DigitalOcean they give you a specific "size" depending on your subscription, my subscription has an available "size" for the clusters of 97. As my clients grow I will be creating new pools and databases for them, so eventually I will run out of slots to assign a pool...what does this "size" dictate?
For example 1 client I have has an allotted size of 10 to their connection pool. Speaking to support:
The connection pool with a size of 1 will only allow 1 connection at a time. As for how you can estimate the number of simultaneous users, this is something you'll need to look over as your user and application grow. We don't have a way to give you that estimate from our back end.
So with that client that has a size of 10 alloted to their pool, they have 88 staff users that use the system simultaneously throughout the day, then they have about 4,000 users that they manage that can all sign in theoretically at once.
This is a lot more than 10 connections, and I get no errors on connection size at least that I've seen so far.
Given that I have a limited amount, how do I determine the appropriate size to use, does anybody have experience with this in production?
For example, with the connections listed above, is 10 too much, too little, just right?
Update 2/14/23
I have tested the capabilities bit because I was curious and can't get any semi-logical answer. When I use 1 connection pool for my 4,000 user client (although all users would not hit their DB/pool at the same time), I get connection errors (specifically when running background tasks from django-celery and Celery in the middle of the night).
Here are those errors, overall just connection already closed from here:
File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 269, in create_cursor cursor = self.connection.cursor()
This issue happened concurrently on 2 nights, but never during the day during normal user activity.
Once I upped the connection pool for said 4,000 user client to 2 instead of 1 the connection already closed error never occurred again.
Related
We have a setup where multiple Node processes write into the same database (different tables), and as a result, when using Knex, we end up with more connections to the database than desirable. So, I was thinking of using PgBouncer as a middleware for the Knex processes to connect to, but I'm unsure of how Knex's attempts at connection pooling will work with PgBouncer, which will setup its own pool of connections.
Please assume the following:
A 2vCPU database server
10+ Node processes interacting with the database
PgBouncer running with a pool size of 5
Questions:
If I set min/max size as 1/5 in each Knex setup, will I run out of connections or will PgBouncer somehow be able to "fool" each Knex setup into believing that it has its own pool?
It doesn't feel like I can use a Knex pool in this scenario. Even using min/max pool sizes as 1/1 will leave me out of options if the first five Knex steups I launch claim a connection each.
Is there a way to make Knex drop pooling and open/close connections as needed? This is the ideal setup for me because now PgBouncer won't actually be opening/closing connections but returning them to the pool (unless I'm mistaken about this?).
What strategy should I use? What should my knexfile look like? And would I need to code differently for this? Any help or ideas are welcome!
While it would be ridiculous to allow 32000 connections, it is also ridiculous to allow only 5. I think the lesson from your link should be not that there is a precisely defined magic number of connections, but that you need to look at the waitevents of your performing database, or just do experiments, to see what is going on and whether you have too many connections.
While repeatedly connecting to pgbouncer (which reuses its internal connection to PostgreSQL) might be less expensive than repeatedly connecting all the way through to PostgreSQL, it will still be far more expensive than just re-using an existing connection from knex's internal connection pool. If your connection load is high enough to matter, then bypassing the internal connection pool to just use pgbouncer would be a mistake. Most likely using pgbouncer at all is a mistake, as it just introduces yet another moving piece for no good reason.
Using knex pooler with min:1 and max:5 with 10 different knex app servers and a limit of 5 connections in pgbouncer would mean that only 5 of your app servers could have a connection. The rest would be forced to wait, but it isn't clear what they would be waiting for. Presumably they would wait forever, or until they caught a timeout error, or until one of other app servers exited or shutdown its pool. Pgbouncer would fool them all right, but not in a helpful way. It might make more sense to use this a min:0 (which is now the recommended setting, but still not the default), as that way an app server would at least release its final connection after idleTimeoutMillis, allowing another app to use it.
Using min:1 max:1 could be useful if pgbouncer were not used or used with a large enough pool size, but it could also break entirely. For example, if an app needs at least 2 simultaneous connections to work correctly. That would probably be a poorly written app, but poorly written apps are the rule, not the exception.
As written in HikariCP docs the formula for counting connection pool size is connections = ((core_count * 2) + effective_spindle_count). But which core count is this: my app server or database server?
For example: my is app running on 2 CPUs, but database is running on 16 CPUs.
This is Kevin's formula for connection pool size, where cores and spindles (you can tell is is an old formula) are the database server's.
This assumes that the connections are kept fairly busy. If you have transactions with longer idle times, you might need to make the pool bigger.
In the end, only trial and error can find the ideal pool size.
The quote is from PostgreSQL wiki which is related to database cores/server
database server only has so many resources, and if you don't have enough connections active to use all of them, your throughput will generally improve by using more connections.
Notice that this formula may be outdated (comment by #mustaccio)
That wiki page was last updated nearly 5 years ago, and the advice in question is even older. I/O queue depth might be more relevant today than the number of spindles, even if the latter are actually present
I have a local app that connects to a local MongoDB. It has 2 databases and about 60 collections in total.
I open one connection and then get an object to access each collection.
I let the system run the whole afternoon and checking stats, I found this:
I don't understand why I have over 750k connections; but also, I don't really understand the metrics; for example the number blow: total connections created, hovering at 1770...
Can someone explain what is going on?
Total Connections Created refers to the number of times the server has accepted a connection since it started running, so if it has been running for many months, it is likely to have many. It doesn't mean they are still active (and most won't be).
You can choose to only show active connections under Current Connections by clicking on the menu icon and choosing In Use:
Here's more info on why you might be seeing a large number of available connections: MongoDB available connections
I am building an app using Spring-Boot/Hibernate with Postgres as the database. I am using Spring 2.0, so Hikari is the default connection pool provider.
Currently, I am trying to load-test the application with a REST end-point that does an 'update-if-exists and insert if new' to an entity in the database. Its a fairly small entity with 'BIGSERIAL' primary key and no constraints on any other field.
The default connection pool size is 10 and I haven't really tweaked any other parameters - either of the HikariCP or for Postgres.
The point at which I am stuck at this moment is to debug connections in 'active' state and what they are doing or why they stuck currently.
When I run '10 simultaneous users', it basically translates into 2 or 3 times that many queries and thus, when I turn on the HikariCP debug logs, it hangs at something like this -
(total=10, active=10, idle=0, waiting=2) and the 'active' connections do not really release the connections, which is what I am trying to find out because the queries are fairly simple and the table itself is just 4 fields (including the primary key).
The best practices from HikariCP folks as well generally is that increasing the connection pool is not the right first step towards scaling.
If I do increase the connection pool size to 20, things start working for 10 simultaneous/concurrent users but then again, its not the root cause/solution for the problem I believe.
Is there any way I can log either Hibernate or Postgres messages that might help in knowing what these 'active' connections are waiting on and why the connection doesn't get released even after I increase the wait-time to a long time?
If it is a connection-leak ( as is reported when the leak-detection-threshold is reduced to a lower value (e.g. 30 seconds) ), then how can I tell if Hibernate is responsible for this connection leak or if it is something else?
If it is a lock/wait at the database level, how can I get the root of this?
UPDATE
After help from #brettw, I took a thread-dump when the connections were exhausted and it pointed in the direction of a connection-leak. The threads on HikariCP issues board - https://github.com/brettwooldridge/HikariCP/issues/1030#issuecomment-347632771 - which points to the Hibernate not closing connections which then pointed me to https://jira.spring.io/browse/SPR-14548, which talks about setting Hibernate's connection closing mode since the default mode holds the connection for too long. After setting spring.jpa.properties.hibernate.connection.handling_mode=DELAYED_ACQUISITION_AND_RELEASE_AFTER_TRANSACTION, the connection pool worked perfectly.
Also, the point made here - https://github.com/brettwooldridge/HikariCP/issues/612#issuecomment-209839908 is right - a connection leak should not be covered up by the pool.
It sounds like you could be hitting a true deadlock in the database. There should be a way to query PostgreSQL for current active queries, and current lock states. You'll have to google it.
Also, I would try a simple thread dump to see where all the threads are blocked. It could be a code-level synchronization deadlock.
If all of the threads are blocked on getConnection(), it is a leak.
If all of the threads are down in the driver, according to the stacktrace for each thread, it is a database deadlock.
If all of the threads are blocked waiting for a lock in your application code, then you have a synchronization deadlock -- likely two locks with inverted acquisition order in different parts of the code.
The HikariCP leakDetectionThreshold could be useful, but it will only show where the connection was acquired, not where the thread is currently stuck. Still, it could provide a clue.
Say I have set max_connections=10 in my postgresql.conf and make 11 concurrent connections. What happens with the 11-nth connection?
Will it be refused with an error or will it wait until some connection slots free up?
If exceeding connections wait in queue, is there a timeout limit for them? Where can that be set? Is it a fair queue?
Is there any documentation on this? I can't find this in the official docs
--------- edit ----------
found a source on this (no connection pooling): https://wiki.postgresql.org/wiki/Number_Of_Database_Connections
The decision not to include a connection pooler inside the PostgreSQL server itself has been taken deliberately and with good reason:
From documentation
max_connections (integer)
Determines the maximum number of concurrent
connections to the database server. The default is typically 100
connections, but might be less if your kernel settings will not
support it (as determined during initdb). This parameter can only be
set at server start.
In combination with the name of the variable it should be clear what happens: If you try to open up more than the number of set connections, you will get an error. In this case you will get a very prominent error that the number of connections is exhausted.
Postgres itself is not including any connection pooling or similar so it's either "Yes, you got in" or "No, you are not".