I have a WildFly server hosting an app that runs some long-running SQL queries (queries or stored-procedure calls that take 10-20 minutes or more).
Previously this WildFly was pointing to SQL Server 2008, now to Postgres 11.
Previously when I killed/rebooted WildFly, I had noticed that pretty quickly (if not instantaneously) the long-running SP/query calls that were triggered from the Java code (running in WildFly) were being killed too.
Now... with Postgres I noticed that these long running queries remain there and keep running in Postgres even after the app server has been shut down.
What is causing this? Who was killing them in SQL Server (the server side itself, the JDBC driver, or ...)? Is there some setting/parameter in Postgres which controls this behavior, i.e. what the DB server does with a query once the client that triggered it has been shut down?
EDIT: We do a graceful WildFly shutdown by sending a command to WF to shut itself down. Still, the behavior seems different between SQL Server and Postgres.
If shutting down the application server closes the JDBC connections, which terminates the database sessions, this should not happen. If it doesn't close the JDBC connections properly, I'd call that a bug in the application server. If it does, but the queries on the backend are not canceled, I'd call that a bug in the JDBC driver.
Anyway, a workaround is to set tcp_keepalives_idle to a low value so that the server detects dead TCP connections quickly and terminates the query.
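For illustration, here is a minimal sketch of applying that workaround over JDBC (the host, credentials and the 60-second value are placeholders; ALTER SYSTEM needs superuser rights and must run outside an explicit transaction, which the default autocommit mode satisfies):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class KeepaliveSetup {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/postgres", "postgres", "secret");
                 Statement st = conn.createStatement()) {
                // Start probing idle TCP connections after 60 seconds of silence
                st.execute("ALTER SYSTEM SET tcp_keepalives_idle = 60");
                // Reload so newly started sessions pick up the changed setting
                st.execute("SELECT pg_reload_conf()");
            }
        }
    }

The same two statements can of course be run from psql instead; the point is simply to lower tcp_keepalives_idle so the server notices a vanished client within a minute or so.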
Related
So I've spent the better part of my day (and several searches before) looking for a workable solution to prevent data loss when the host of a PostgreSQL server installation gets rebooted or shut down. We maintain a number of Azure and on-prem servers, and someone inadvertently shuts down a server without first ensuring Postgres is no longer flushing data to disk far more often than should ever happen. Of note, we are a Windows Server shop.
Our current best practice (which works if followed appropriately) is to stop the Postgres service, then watch disk writes to the Postgres data directory in Resource Monitor. Once nothing is writing to that directory, shut down the host. I have to think there's a better way to ensure the host doesn't get shut down in a manner that leads to data corruption, regardless of adherence to the best practice (or, in some cases, because Windows Update mandates a reboot regardless of configured settings telling it not to).
Some things I've considered, but have been unable to find solid answers for:
Create a scheduled task that uses the "On an event" trigger to monitor the System log for event 1074. It would have to be configured to "run whether the user is logged in or not". The script would cancel the shutdown command with shutdown /a, then run a script to gracefully shut down Postgres. I've seen mixed results on whether the scheduled job would reliably trigger before Task Scheduler is terminated in the shutdown sequence.
Create a shutdown script using Group Policy. My question there is: will it wait for the script to complete before executing the shutdown?
How do you deal with data loss in your Postgres server Windows hosts?
First, if you register PostgreSQL as a Windows service, a shutdown of the machine will automatically shut down PostgreSQL first.
But even without that, a properly configured PostgreSQL server on proper hardware will never suffer data loss (unless you hit a rare PostgreSQL software bug). It is one of the basic requirements for a relational database to survive crashes without data loss.
To enumerate a few things that come to mind:
make sure that the PostgreSQL parameters fsync and synchronous_commit are set to on (a quick way to check this is sketched after this list)
make sure that you are using a reliable file system for the data files and the WAL (a Windows network share is not a reliable file system)
make sure you are using storage that has no caches that are not battery-backed
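As a quick way to check the first point, here is a minimal JDBC sketch (connection details are placeholders) that reads the two settings from pg_settings; both should come back as on:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DurabilityCheck {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/postgres", "postgres", "secret");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                         "SELECT name, setting FROM pg_settings"
                         + " WHERE name IN ('fsync', 'synchronous_commit')")) {
                while (rs.next()) {
                    // Anything other than 'on' here puts committed data at risk on a crash
                    System.out.println(rs.getString("name") + " = " + rs.getString("setting"));
                }
            }
        }
    }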
I have a Spring Boot application (v2.1.5) that uses JPA (Hibernate) to connect to a Postgres DB. Spring Boot uses HikariCP for connection pooling.
In my production environment, I see the following query executed every few seconds, regardless of DB activity (almost as if it were some kind of health check?):
SET application_name = 'PostgreSQL JDBC Driver'
I am struggling to get to the bottom of why these queries are performed so often and whether they can be avoided, because in my local environment the above statement is only executed when a query to the DB is performed. I still do not understand why, but it is less frequent there, and the behaviour differs from production.
Are these queries necessary? Can they be avoided?
Thanks.
UPDATE:
Here is a screenshot of the queries received by the DB to which the Spring Boot app is connecting using HikariCP. The time is shown as "Just now" because all of the queries shown are only ~0.5 seconds apart and all within the "current minute".
This seems to be performed by the Hikari connection pool. See Default HikariCP connection pool starting Spring Boot application and its answers.
I wouldn't worry about it, since it is not performing an "extremely high number" of these operations, only one every few seconds, possibly whenever a connection is handed out by or returned to the pool.
If it really bothers you, you could look into various places for disabling it.
The last comment here suggests that setting the connection property assumeMinServerVersion to 9.0 or higher might help.
Since this is probably triggered by the Hikari connection pool, it might be configurable there, by adjusting the behaviour when creating, lending and returning a connection.
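If you do want to try the assumeMinServerVersion route, here is a minimal sketch of setting it directly on the Hikari pool (URL, credentials and the chosen application name are placeholders; the idea is that, from 9.0 on, the driver can send application_name in the startup packet rather than issuing SET application_name on each new session - verify this against your driver version):

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    public class PoolSetup {
        public static HikariDataSource dataSource() {
            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb"); // placeholder
            config.setUsername("app");                                  // placeholder
            config.setPassword("secret");                               // placeholder
            // Tell pgjdbc it may assume >= 9.0 features, so application_name
            // can go into the startup packet instead of a separate SET statement
            config.addDataSourceProperty("assumeMinServerVersion", "9.6");
            // Optional: a more descriptive name than the driver default
            config.addDataSourceProperty("ApplicationName", "my-spring-app");
            return new HikariDataSource(config);
        }
    }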
I am running a couple of spring boot apps in Kubernetes. Each app is using spring JPA to connect to a Postgresql 10.6 database. What I have noticed is when the pods are killed unexpectedly the connections to the database are not released.
Running SELECT sum(numbackends) FROM pg_stat_database; on the database returns, let's say, 50. After killing a couple of pods running the Spring app and rerunning the query, this number jumps to 60. This eventually causes the number of connections to PostgreSQL to exceed the maximum and prevents the restarted pods' applications from connecting to the database.
I have experimented with the PostgreSQL option idle_in_transaction_session_timeout, setting it to 15s, but this does not drop the old connections, and the number keeps increasing.
I am making use of the com.zaxxer.hikari.HikariDataSource datasource for my Spring apps and was wondering if there is a way to prevent this from happening, either on the PostgreSQL side or on Spring Boot's side.
Any advice or suggestions are welcome.
Spring boot version: 2.0.3.RELEASE
Java version: 1.8
Postgresql version: 10.6
This issue can arise not only with Kubernetes pods but also with a simple application running on a server that is killed forcibly (like kill -9 pid on Linux) and not given the opportunity to clean up via Spring Boot's shutdown hooks. I think in this case nothing on the application side can help. You can, however, try cleaning up inactive connections by several methods on the database side, as mentioned here.
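For example, a periodic clean-up job along these lines could reap sessions whose clients disappeared without disconnecting (the connection details and the 15-minute threshold are assumptions to adjust, and the connecting role must be allowed to call pg_terminate_backend):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class StaleConnectionReaper {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "admin", "secret");
                 Statement st = conn.createStatement()) {
                // Terminate backends that have been idle for a long time,
                // e.g. connections left behind by pods that were killed abruptly
                st.execute(
                        "SELECT pg_terminate_backend(pid) FROM pg_stat_activity"
                        + " WHERE state = 'idle'"
                        + " AND pid <> pg_backend_pid()"
                        + " AND state_change < now() - interval '15 minutes'");
            }
        }
    }

Another option worth checking is lowering tcp_keepalives_idle on the server, as discussed earlier, so that dead client connections are detected sooner.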
We have MongoDB with the mgo driver for Golang. There are two app servers connecting to MongoDB, which runs alongside the apps (Golang binaries). MongoDB runs as a replica set, and each server connects to the primary or a secondary depending on the replica set's current state.
We experienced the SocketException handling request, closing client connection: 9001 socket exception on one of the mongo servers, which resulted in the connections from our apps to MongoDB dying. After that, the replica set continued to be functional, but the connection died on our second server (on which the error didn't happen) as well.
In the golang logs it was manifested as:
read tcp 10.10.0.5:37698->10.10.0.7:27017: i/o timeout
Why did this happen? How can this be prevented?
As I understand it, mgo connects to the whole replica set from the URL (it detects the whole topology from a single instance's URL), but why did the connection dying on one of the servers kill it on the second one?
Edit:
Full package path that is used "gopkg.in/mgo.v2"
Unfortunately I can't share the mongo log files here. But besides the SocketException, the mongo logs don't contain anything useful. There is an indication of some degree of lock contention, where the lock acquisition time is quite high at times, but nothing beyond that.
MongoDB does some heavy indexing at times, but there weren't any unusual spikes recently, so it's nothing beyond normal.
First, the mgo driver you are using: gopkg.in/mgo.v2 developed by Gustavo Niemeyer (hosted at https://github.com/go-mgo/mgo) is not maintained anymore.
Instead use the community supported fork github.com/globalsign/mgo. This one continues to get patched and evolve.
Its changelog includes: "Improved connection handling" which seems to be directly relating to your issue.
Its details can be read here https://github.com/globalsign/mgo/pull/5 which points to the original pull request https://github.com/go-mgo/mgo/pull/437:
If mongoServer fail to dial server, it will close all sockets that are alive, whether they're currently use or not.
There are two cons:
Inflight requests will be interrupt rudely.
All sockets closed at the same time, and likely to dial server at the same time. Any occasional fail in the massive dial requests (high concurrency scenario) will make all sockets closed again, and repeat...(It happened in our production environment)
So I think sockets currently in use should closed after idle.
Note that github.com/globalsign/mgo has a backward-compatible API; it basically just added a few new things/features (besides the fixes and patches), which means you should be able to just change the import paths and everything should keep working without further changes.
I run 3 processes at the same time, all of them using the same DB (RDS Postgres).
All of them are Java applications that use JDBC to connect to the DB.
I am using PGPoolingDataSource in every process as a connection pool for the DB
every request is handled by the book, ending with:
} finally {
    connection.close(); // returns the connection to the pool
}
main problems:
1. I ran out of connections really fast because I do massive work with the DB at the beginning (which is OK), but the pool never shrinks.
2. I get some exceptions in the code because there are not enough connections, and I wish I could extend the timeout when requesting a connection.
My insights:
the PGPoolingDataSource never shrinks by definition! I couldn't find any documentation about it, but I assume this is the case. So I tried the Apache DBCP pool, and again I am having the same problem.
I think there must be a timeout when waiting for a connection - I would guess that this timeout can be configured, but I couldn't find this configuration in either pool.
My Questions:
why does the pool never shrink?!
how do I determine how many connections to allocate for each pool/process (here every process has one pool)?
what happens if I don't close the pool (not the connections) and the app dies - are the connections in the pool still alive? This happens a lot when I update the application: I stop and start it, so I never close the pool.
what would be a good JDBC connection pool that works well with Postgres and has an option to set a timeout for getConnection? (a configuration sketch follows after this post)
Thanks
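For what it's worth, a pool such as HikariCP exposes both of the knobs asked about above; here is a minimal configuration sketch, with all values being examples rather than recommendations:

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    public class ShrinkablePool {
        public static HikariDataSource build() {
            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb"); // placeholder
            config.setUsername("app");                                  // placeholder
            config.setPassword("secret");                               // placeholder
            config.setMaximumPoolSize(20);       // hard upper bound per process
            config.setMinimumIdle(2);            // the pool shrinks back to this when idle
            config.setIdleTimeout(60_000);       // retire idle connections after 60 seconds
            config.setConnectionTimeout(10_000); // getConnection() throws after 10 seconds instead of waiting forever
            return new HikariDataSource(config);
        }
    }

Here minimumIdle and idleTimeout give the shrinking behavior asked about in question 1, and connectionTimeout bounds how long getConnection will wait, as asked in question 4.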