HikariPool-1 - Unusual system clock change detected, soft-evicting connections from pool - hikaricp

My application uses Spring Boot and HikariCP.
It produces this error:
HikariPool-1 - Unusual system clock change detected, soft-evicting connections from pool
Please help me fix it!

Two recommendations. One, make sure you are using the latest version of HikariCP. Two, configure the computer to sync time from an NTP server.
Newer versions of HikariCP will only evict connections when backward time motion is detected, but will still log a warning for large forward leaps. Large forward leaps often occur on laptops that go into sleep mode or VMs that are suspended and resumed.
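To make the distinction between backward motion and forward leaps concrete, here is a minimal sketch of how a periodic housekeeping task could classify the time elapsed since its previous run. This is not HikariCP's actual implementation: the class name, the 30-second period, the 128 ms tolerance, and the 1.5x leap threshold are all assumptions chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

class ClockLeapCheck {
    enum Verdict { NORMAL, FORWARD_LEAP, BACKWARD_MOTION }

    // Hypothetical values: how often the check runs and how much jitter is tolerated.
    static final long PERIOD_MS = TimeUnit.SECONDS.toMillis(30);
    static final long TOLERANCE_MS = 128;

    // Compares the observed elapsed time with the expected housekeeping period.
    static Verdict classify(long previousRunMs, long nowMs) {
        long elapsed = nowMs - previousRunMs;
        if (elapsed < PERIOD_MS - TOLERANCE_MS) {
            // Less time passed than the scheduler delay allows: the clock moved backward.
            // This is the case where a pool would evict its connections.
            return Verdict.BACKWARD_MOTION;
        }
        if (elapsed > PERIOD_MS * 3 / 2) {
            // Far more time passed than expected: sleep, VM suspend/resume, or starvation.
            // This is the case where a pool would only log a warning.
            return Verdict.FORWARD_LEAP;
        }
        return Verdict.NORMAL;
    }

    public static void main(String[] args) {
        System.out.println(classify(0, 30_000));       // on schedule
        System.out.println(classify(100_000, 90_000)); // clock set back by 10 s
        System.out.println(classify(0, 600_000));      // resumed after 10 minutes
    }
}
```

If the warning appears only after a laptop wakes up or a VM is resumed, it corresponds to the forward-leap case and is usually harmless.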

Related

Jobrunr background server stopped polling

I had a number of jobs scheduled but seems none of the jobs were running. On further debugging, I found that there are no available servers, and in the jobrunr_backgroundjobservers table, it seems that there has not been a heart beat for any of the servers. What would cause this issue? How would I restart a heartbeat? And how would I know when such an issue occurs and the servers go down again, given that schedules are time sensitive?
It will stop polling if the connection to the database was lost or the database goes down for a while.
The JobRunr Pro version adds extra features, and one of them is database fault tolerance: if such an issue occurs, JobRunr Pro will go into standby and will start processing again once the connection to the database is stable.
See https://www.jobrunr.io/en/documentation/pro/database-fault-tolerance/ for more info.

SIGTERM signal arrives first to kuma and stops all active application connections immediately

We have applications that work with Kafka (MSK). We noticed that once a pod starts to shut down (during autoscaling or deployment), the app container loses all active connections: the SIGTERM signal causes Kuma to close all connections immediately. This causes data loss on the app side because unfinished sessions are not closed gracefully, and after that we receive connection errors to the Kafka brokers.
Does anyone have an idea how to make Kuma wait some time after it gets the SIGTERM signal, to let the sessions close gracefully?
Or is there a way to let the app know about the shutdown before Kuma does?
Or any other idea?
This is a known issue that is being fixed in the upcoming 1.7 release: https://github.com/kumahq/kuma/pull/4229

How to close SQL connections of old Cloud Run revisions?

Context
I am running a SpringBoot application on Cloud Run which connects to a postgres11 CloudSQL database using a Hikari connection pool. I am using the smallest PSQL instance (1vcpu/614mb/25connection limit). For the setup, I have followed these resources:
Connecting to Cloud SQL from Cloud Run
Managing database connections
Problem
After deploying the third revision, I get the following error:
FATAL: remaining connection slots are reserved for non-replication superuser connections
What I found out
Default connection pool size is 10, hence why it fails on the third deployment (30 > 25).
When deleting an old revision, active connections shown in the Cloud SQL admin panel drop by 10, and the next deployment succeeds.
Question
It seems, that old Cloud Run revisions are being kept in a "cold" state, maintaining their connection pools. Is there a way to close these connections without deleting the revisions?
In the best practices section it says:
"...we recommend that you use a client library that supports connection pools that automatically reconnect broken client connections."
What is the recommended way of managing connection pools in Cloud Run, given that it seems old revisions somehow manage to maintain their connections?
Thanks!
Currently, Cloud Run doesn't provide any guarantees on how long an instance will remain warm after it has started up. When not in use, the instance is severely throttled but not necessarily shut down. Thus, you have some revisions that are holding on to connections even when no traffic is being directed to them.
Even in this situation, I disagree with the idea that you should avoid using connection pooling. Connection pooling can lower latency, improve stability, and help put an upper limit on the number of open connections. Instead, you can use some of the following configuration options to help keep your pool in check:
minimumIdle - This property controls the minimum number of idle connections that HikariCP tries to maintain in the pool. If the idle connections dip below this value and total connections in the pool are less than maximumPoolSize, HikariCP will make a best effort to add additional connections quickly and efficiently.
maximumPoolSize - This property controls the maximum size that the pool is allowed to reach, including both idle and in-use connections.
idleTimeout - This property controls the maximum amount of time that a connection is allowed to sit idle in the pool. This setting only applies when minimumIdle is defined to be less than maximumPoolSize. Idle connections will not be retired once the pool reaches minimumIdle connections.
If you set minimumIdle to 0, your application will still be able to use up to maximumPoolSize connections at once. However, once a connection has been idle in the pool for idleTimeout (a value in milliseconds), it will be closed. If you set idleTimeout to something small like 1 minute, it will allow the number of connections your pool is using to scale down to 0 when not in use.
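As a rough illustration of the settings described above, the sketch below builds the three properties with plain java.util.Properties and computes the worst-case connection count from the question (three revisions with the default pool size of 10 against a limit of 25). The class and method names are hypothetical. The property keys are HikariCP's own names; with HikariCP on the classpath, the resulting object could be passed to the HikariConfig(Properties) constructor, and in Spring Boot the same values are typically set through the spring.datasource.hikari.* properties.

```java
import java.util.Properties;

class IdlePoolSettings {
    // Builds HikariCP-style settings that let an unused pool shrink to zero connections.
    static Properties scaleToZero(int maxPoolSize, long idleTimeoutMs) {
        Properties props = new Properties();
        props.setProperty("minimumIdle", "0");                               // keep no idle connections
        props.setProperty("maximumPoolSize", Integer.toString(maxPoolSize)); // hard upper bound
        props.setProperty("idleTimeout", Long.toString(idleTimeoutMs));      // milliseconds
        return props;
    }

    // Worst case: every revision that is still warm fills its whole pool.
    static int worstCaseConnections(int warmRevisions, int maxPoolSize) {
        return warmRevisions * maxPoolSize;
    }

    public static void main(String[] args) {
        Properties props = scaleToZero(5, 60_000); // pool size 5 is an illustrative choice
        System.out.println(props.getProperty("idleTimeout"));
        // Three revisions with the default pool size of 10 exceed the 25-connection limit.
        System.out.println(worstCaseConnections(3, 10) > 25);
    }
}
```

Lowering maximumPoolSize as well as minimumIdle helps here, because idle connections are only released after the timeout, while the maximum bounds what each warm revision can hold at any moment.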
Hope this helps!
The issue here is that the connections don't get closed by HikariCP when they are opened. I don't know much about Hikari but I found this which explains how connections should be handled through Hikari. I hope that helps!

HikariCP warning message in Play Framework 2.5.x

I'm getting the message below in Play for Scala, what does this mean and what could be the reason? Is this related to Slick or to JDBC (I'm using both)?
[warn] c.z.h.p.HikariPool - HikariPool-7 - Unusual system clock change
detected, soft-evicting connections from pool.
Possible bug in HikariCP
There were some issues in HikariCP that caused this warning:
https://github.com/brettwooldridge/HikariCP/issues/559
So be sure to use version 2.4.4 or newer.
Possible time shifting
HikariCP will log a warning for large forward leaps. Large forward leaps often occur on laptops that go into sleep mode or VMs that are suspended and resumed.
There is a similar question:
HikariPool-1 - Unusual system clock change detected, soft-evicting connections from pool
The only thing I would add is that NTP synchronization could also have bugs: Clock drift even though NTPD running

Why is my Netty based TCP server hanging up with 100% CPU usage?

I've developed a Netty-based TCP server to maintain connections with GSM/GPRS-based devices and to persist their data in a MySQL database. Currently 5K connections are handled. Devices send periodic messages at intervals of 30-60 seconds, but connections are kept alive to maintain duplex communication.
The server application consumes 1-2% CPU in normal operation with peaks up to 10%; average load is very low. However, after 6 to 48 hours of normal operation, the server application hangs with constant 100% CPU consumption, and a thread dump indicates that the epoll selector is the reason for the high CPU usage. The application still keeps connections for a few hours, then CPU consumption increases to 200% and most of the connections are released.
At the beginning of the project we used MINA and had the same issue with 1K active connections, which is why we switched to Netty. Up to 5K connections Netty was much more stable, and the time between hangs was 1-2 weeks.
Our server configuration:
i7-2600 quad-core CPU,
8 GB RAM, CentOS 5.0,
OpenJDK 6.0,
Netty 3.2.4 (Netty was updated to 3.5.2 a few hours ago)
In order to overcome this problem we will update JDK to 7.0 (JDK has a new I/O implementation optimized for asynchronous operations) and try different OS including FreeBSD, Windows Server since each operating system has different strategies for handling I/O.
Any help will be appreciated, thanks..
This sounds like the Epoll bug.
The app is proxying connections to backend systems. The proxy has a pool of channels that it can use to send requests to the backend systems. If the pool is low on channels, new channels are spawned and put into the pool so that requests sent to the proxy can be serviced. The pools get populated on app startup, so that is why it doesn't take long at all for the CPU to spike through the roof (22 seconds into the app lifecycle). Source
Netty has a workaround built-in. Not sure from which version though, will have to update later.
System.setProperty("org.jboss.netty.epollBugWorkaround", "true");
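A minimal sketch of where that call could go, assuming the flag is read when Netty's NIO classes are first initialized, so it has to be set before any Netty bootstrap code runs. The class and method names are hypothetical, and the actual server startup is omitted.

```java
class EpollWorkaroundBootstrap {
    static final String KEY = "org.jboss.netty.epollBugWorkaround";

    // Enables the workaround unless the flag was already given on the command line.
    static void enableWorkaround() {
        if (System.getProperty(KEY) == null) {
            System.setProperty(KEY, "true");
        }
    }

    public static void main(String[] args) {
        // Must run first, before any Netty class is touched (assumption stated above).
        enableWorkaround();

        // The Netty ServerBootstrap setup would follow here (omitted in this sketch).
        System.out.println(KEY + "=" + System.getProperty(KEY));
    }
}
```

The same system property can also be supplied at launch with -Dorg.jboss.netty.epollBugWorkaround=true, which avoids any ordering concerns inside the application.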