How to know the exact total of current connections to PostgreSQL? - postgresql

My Azure PostgreSQL server has a connection limit of 480.
I want to check the total number of current connections accessing the database by running the SQL below:
select * from pg_stat_activity;
I can see that the output lists all users (superuser, ...), with both idle and active statuses. So is this the correct way to check the total of current connections, or should I exclude "idle" connections to get the exact result?
Thank you so much,

"idle" connection is real connection. Because Postgres has not any internal executor pool (like thread pool of MySQL), any "idle" connection can process any commands. At this moment, the "idle" connection doesn't require too much sources, but when you calculate save memory limits (against using swap), you should to calculate with "idle" connections too - because any connection can be active connection sometimes.
480 connections is usually a lot; a good number for max_connections is 10-20 x CPU cores. If max_connections is too high, you have to keep work_mem low, which can have a negative impact on performance, or your configuration will not be safe against overloading.
shared_buffers + (max_connections * work_mem * 2) + RAM for the operating system + RAM for the filesystem cache < total RAM
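To answer the original question directly: counting the rows in pg_stat_activity gives the real number of current connections, and grouping by state shows how many of them are idle. A minimal sketch (filtering on backend_type assumes PostgreSQL 10 or later, where background workers also appear in the view):
-- count client connections by state
select state, count(*)
from pg_stat_activity
where backend_type = 'client backend'
group by state;
The total across all states, idle included, is what counts against max_connections.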

Related

Why WSO2 APIm needs 50+ DB connections at startup?

In our WSO2 setup, whenever the APIm comes up, it creates close to 50+ DB connections to the Postgres DB. In the stable phase, each APIm instance has only 4 DB connections. I would like to understand why it needs 50+ connections at startup. Is it a bug, or by design?
We run WSO2 in a Kubernetes setup; Postgres has a max connection limit set to 100, and two instances of APIm are not able to come up due to this issue.
Within the WSO2 platform, the Tomcat JDBC pooling is used as the default pooling framework due to its production-ready stability and high performance. The goal of tuning the pool properties is to maintain a pool that is large enough to handle peak load without unnecessarily utilizing resources. These pooling configurations can be tuned for your production server in general in the <PRODUCT_HOME>/repository/conf/datasources/master-datasources.xml file. This is applicable if you are using an APIM version less than or equal to 2.6. If you are using APIM-3.X.X then these configurations can be found in <PRODUCT_HOME>/repository/conf/deployment.toml file.
The following parameters should be considered when tuning the connection pool:
The application's concurrency requirement.
The average time used for running a database query.
The maximum number of connections the database server can support.
The maxActive value is the maximum number of active connections that can be allocated from the connection pool at the same time. The default value is 100. The maximum latency (approximately) = (P / M) * T, where,
M = maxActive value
P = Peak concurrency value
T = Time (average) taken to process a query
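For example, with purely illustrative numbers: at a peak concurrency of P = 200, maxActive M = 100 and an average query time T = 50 ms, the approximate maximum latency is (200 / 100) * 50 ms = 100 ms.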
Therefore, by increasing the maxActive value (up to the expected highest concurrency), the time that requests wait in the queue for a connection to be released will decrease. But before increasing the maxActive value, consult the database administrator, as the pool will create up to maxActive connections from a single node during peak times, and it may not be possible for the DBMS to handle the accumulated count of these active connections.
Note that this value should not exceed the maximum number of requests allowed for your database.
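To see, from the Postgres side, how many connections each node or component actually holds (for example the 50+ startup connections mentioned in the question), a query along these lines against pg_stat_activity can help; note that application_name is only filled in when the client sets it:
-- connections per role, application and client host, busiest first
select usename, application_name, client_addr, count(*)
from pg_stat_activity
group by usename, application_name, client_addr
order by count(*) desc;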
For more details on this topic please refer to the official documents[1, 2].
[1] https://docs.wso2.com/display/ADMIN44x/Performance+Tuning#PerformanceTuning-JDBCpoolconfiguration
[2] http://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html

Recommended connection pool size for HikariCP

As written in the HikariCP docs, the formula for calculating the connection pool size is connections = ((core_count * 2) + effective_spindle_count). But which core count is this: my app server's or the database server's?
For example: my app is running on 2 CPUs, but the database is running on 16 CPUs.
This is Kevin's formula for connection pool size, where the cores and spindles (you can tell it is an old formula) are the database server's.
This assumes that the connections are kept fairly busy. If you have transactions with longer idle times, you might need to make the pool bigger.
In the end, only trial and error can find the ideal pool size.
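Applied to the example in the question, it is the database server's 16 cores that matter: assuming a single effective spindle, the formula gives (16 * 2) + 1 = 33 connections as a starting point for benchmarking.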
The quote is from the PostgreSQL wiki, and it refers to the database server's cores:
A database server only has so many resources, and if you don't have enough connections active to use all of them, your throughput will generally improve by using more connections.
Notice that this formula may be outdated (comment by mustaccio):
That wiki page was last updated nearly 5 years ago, and the advice in question is even older. I/O queue depth might be more relevant today than the number of spindles, even if the latter are actually present.

How to increase the connection limit for the Google Cloud SQL Postgres database?

The number of connections for Google Cloud SQL PostgreSQL databases is relatively low. Depending on the plan this is somewhere between 25 and 500, while the limit for MySQL in Google Cloud SQL is between 250 and 4000, reaching 4000 very quickly.
We currently have a number of trial instances for different customers running on Kubernetes and backed by the same Google Cloud SQL Postgres server. Each instance uses a separate set of databases, roles and connections (one per service). We've already reached the limit of connections for our plan (50) and we're not even close to hitting the memory or CPU limits. Connection pooling seems not to be an option, because the connections are with different users.
I'm wondering now why the limit is so low and if there's a way to increase the limit without having to upgrade to a more expensive plan.
It looks like Google just released this as a beta feature.
When creating or editing a database instance, you can add a flag called max_connections, where you can enter a new limit between 14 and 262143 connections.
There is a feature request in the Public Issue Tracker to expose, and hence control, max_connections in PostgreSQL. This comment (reproduced here) explains the reasons for setting the number of connections the way it is now:
Per-tier max_connections is now fully rolled out. As shown on https://cloud.google.com/sql/faq#sizeqps, the limits are now:
Memory size, in GiB | Maximum concurrent connections
--------------------+-------------------------------
0.6 (db-f1-micro)   | 25
1.7 (db-g1-small)   | 50
3.75 up to 6        | 100
6 up to 7.5         | 150
7.5 up to 15        | 200
15 up to 30         | 250
30 up to 60         | 300
60 up to 120        | 400
120 and above       | 500
I understand your frustration about the micro/small instances having fewer than 100 concurrent connections and the lack of control of this flag. We arrived at these values by taking the available RAM, reducing it by overhead, shared buffers and autovacuum memory, and then dividing the remaining RAM by typical per-connection memory and rounding off. This gives us the number of connections that can be used without risk of hitting an out-of-memory condition.
The basic premise of a fully managed service with an attached SLA is that we provide safe
hosting. This is what motivates us using a max_connections that is safe against OOM.
As you have discarded connection pooling, your option is to use an instance with higher memory.
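To illustrate the arithmetic from the quoted comment with purely made-up numbers: if roughly 1 GiB of RAM remains after overhead, shared buffers and autovacuum memory, and each connection is budgeted at around 20 MB, that yields about 50 safe connections, which is the order of magnitude of the db-g1-small limit.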
UPDATE:
As mentioned in a comment on that thread, the max connections settings have since changed. Furthermore, the defaults can now be overridden with flags, up to 260K connections.
For the Terraform gang, you can update the parameter using database_flags:
resource "google_sql_database_instance" "main" {
name = "main-instance"
database_version = "POSTGRES_14"
region = "us-central1"
settings {
tier = "db-f1-micro"
database_flags {
name = "max_connections"
value = 100
}
}
}
Note that at the time of writing the db-f1-micro default max_connections is 25; see https://cloud.google.com/sql/docs/postgres/flags#postgres-m
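Once the flag has been applied (changing max_connections restarts the instance), the effective value can be verified from any client session:
show max_connections;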

Optimal Database Connection Pool Size

I have read that you should keep the number of connections in your database connection pool lower than the number of threads running in the application server that might use that pool. Is that correct?
I have also read that having a high number of connections is not good, but I don't really know why. Would it use more memory?
Right now, during peak times, my server is running out of connections, and I don't know whether it would be good just to increase the number of connections.
Thank you
With a small connection pool you have faster access to the connection table, but you may not have enough connections to satisfy requests.
On the other hand, with a large connection pool there are more connections to fulfill requests and requests will spend less (or no) time in the queue, but access to the connection table is slower.
http://docs.oracle.com/cd/E19316-01/820-4343/abehs/index.html
core_count = 4
effective_spindle_count = 1
connections = ((core_count * 2) + effective_spindle_count) = (4 * 2) + 1 = 9
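To judge whether you are actually near the limit at peak times, you can compare the live connection count with the configured maximum; a simple sketch (keep in mind that superuser_reserved_connections are not available to ordinary users):
select count(*) as in_use,
       current_setting('max_connections')::int as max_allowed
from pg_stat_activity;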

A good PgPool II configuration

I have been trying to configure PgPool to accept requests of about 150 clients. The Postgres server is configured to accept only 100 connections; anything beyond 100 needs to be pooled by PgPool. I don't seem to get that: I only require PgPool to queue the requests, but my current configuration does not do that. From my JMeter test, when I try to open connections beyond 100, Postgres gives me an error: sorry, too many clients.
I have only configured PgPool with the following parameters:
listen_addresses = 'localhost'
port = 9999
backend_hostname0 = 'localhost'
backend_port0 = 5432
num_init_children = 100
max_pool = 4
child_life_time = 120
child_max_connections = 0
connection_life_time = 120
client_idle_limit = 0
Since I only require PgPool to queue the extra connection requests, is the above configuration correct?
Please advise on the proper configuration.
The 'child_max_connections' in pgpool is NOT the maximum allowed connections to the DB. It is the number of times a pooled connection can be used before it terminates and restarts. It is there to recycle connection threads and stop memory leaks.
The formula max_pool x num_init_children gives the maximum number of connections that pgpool will make to PostgreSQL. Obviously, this needs to be less than the 'max_connections' set in postgresql.conf, otherwise pgpool marks the DB as an unavailable backend. And if you have some DB connections reserved for admin use, you need to reduce the number of pgpool connections further.
So, what I am saying is that the 'max_connections' in the formula is the parameter set in postgresql.conf. Setting 'child_max_connections' to 100 in the comment above just means that the pgpool connection is closed and reopened every 100 times it is used.
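Applying that formula to the configuration in the question: num_init_children = 100 and max_pool = 4 allow pgpool to open up to 100 * 4 = 400 backend connections, four times the Postgres max_connections of 100, which is exactly why the "sorry, too many clients" error appears.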
The first thing is to figure out what you want as your maximum pool size. PostgreSQL performance (both in terms of throughput and latency) is usually best when the maximum number of active connections is somewhere around ((2 * number-of-cores) + effective-spindle-count). The effective spindle count can be tricky to figure -- if your active data set is fully cached, count it as zero, for example. Don't count any extra threads from hyperthreading as cores for this calculation. Also note that due to network latency issues, you may need a pool slightly larger than the calculated number to keep that number of connections active. You may need to do some benchmarks to find the sweet spot for your hardware and workload.
The setting you need to adjust is child_max_connections, with num_init_children kept less than or equal to that.