Google Cloud SQL + Hikari CP + Communications link failure

I'm experiencing intermittent connectivity errors from a Spring Boot application communicating with a D1 Google Cloud SQL server, using the configuration settings described here: HikariCP MySQL settings.
I was wondering if anyone has encountered this before.
I've read the FAQ posted here (Hikari FAQ) and I'm wondering if my default idleTimeout and maxLifetime (30 mins) settings might be at fault; wait_timeout and interactive_timeout on the server are both set to the default 28800s (8 hours).
The FAQ says that these two settings should be about a minute less than the server settings, but if I'm losing connections after 30 minutes I can't quite see how raising maxLifetime to 7 hrs 59 mins is going to improve the situation.
Does anyone have any recommendations?
Redacted stack trace(s):
I get these from time to time:
org.springframework.security.authentication.InternalAuthenticationServiceException: Could not get JDBC Connection; nested exception is java.sql.SQLException: Timeout after 30018ms of waiting for a connection.
at org.springframework.security.authentication.dao.DaoAuthenticationProvider.retrieveUser(DaoAuthenticationProvider.java:110)
at org.springframework.security.authentication.dao.AbstractUserDetailsAuthenticationProvider.authenticate(AbstractUserDetailsAuthenticationProvider.java:132)
at org.springframework.security.authentication.ProviderManager.authenticate(ProviderManager.java:156)
at org.springframework.security.authentication.ProviderManager.authenticate(ProviderManager.java:177)
...
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: Timeout after 30023ms of waiting for a connection.
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
....
Caused by: java.sql.SQLException: Timeout after 30023ms of waiting for a connection.
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:208)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:108)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)
... 59 common frames omitted
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:630)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:695)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:727)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:737)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:787)
Hibernate search:
2015-02-17 10:34:17.090 INFO 1 --- [ entityloader-2] o.h.s.i.SimpleIndexingProgressMonitor : HSEARCH000030: 31050 documents indexed in 1147865 ms
2015-02-17 10:34:17.090 INFO 1 --- [ entityloader-2] o.h.s.i.SimpleIndexingProgressMonitor : HSEARCH000031: Indexing speed: 27.050219 documents/second; progress: 99.89%
2015-02-17 10:41:59.917 WARN 1 --- [ntifierloader-1] com.zaxxer.hikari.proxy.ConnectionProxy : Connection com.mysql.jdbc.JDBC4Connection#372f2018 (HikariPool-0) marked as broken because of SQLSTATE(08S01), ErrorCode(0).
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 1,611,087 milliseconds ago. The last packet sent successfully to the server was 927,899 milliseconds ago.
The indexing isn't particularly quick at the moment, I think because I'm not using projections. The process takes about 30 minutes to execute.
Thanks

It could be a couple of things here. First, the network infrastructure (firewalls, load balancers, etc.) between the application tier and the database tier can impose its own connection timeouts, regardless of MySQL settings.
The indexing failure indicates that the connection was out of the pool for ~27 minutes with no SQL activity when that failure occurred.
Second, specifically regarding the "Could not get JDBC Connection" error, you may be running into Cloud SQL connection limits.
I recommend three things. One, make sure you are on the latest HikariCP (2.3.2) and the latest MySQL Connector/J driver (5.1.34). Two, enable DEBUG-level logging for the com.zaxxer.hikari package. HikariCP debug logging is not "chatty", but will log pool statistics every 30 seconds (and sometimes more detail in failure conditions). Lastly, try setting maximumPoolSize to something smaller (unless already at the default), and setting maxLifetime to 15 or 20 minutes (1200000ms).
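For reference, a minimal sketch of a setup along those lines (plain Java, no Spring; the JDBC URL and credentials are placeholders, not from the question), with a modest pool size and a 20-minute maxLifetime:
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    public static HikariDataSource cloudSqlDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://<cloud-sql-ip>:3306/appdb"); // placeholder host and schema
        config.setUsername("appuser");                               // placeholder credentials
        config.setPassword("secret");
        config.setMaximumPoolSize(10);    // keep the pool modest; Cloud SQL limits concurrent connections
        config.setMaxLifetime(1200000);   // 20 minutes, well below any intermediate network timeout
        config.setIdleTimeout(600000);    // retire idle connections after 10 minutes
        // DEBUG logging for com.zaxxer.hikari is enabled in the logging framework's config, not here.
        return new HikariDataSource(config);
    }
}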
If the error occurs again, post updated logs containing the HikariCP debug output around the time of failure. Also, feel free to open a tracking issue over on GitHub, as larger logs etc. are easier to handle there.

Related

Hikari CP connections are suddenly invalidated

Hi Stack Overflow family,
We have a Kotlin & Spring Boot application that uses a single PostgreSQL DB instance (1 GB memory, instance class db.t3.micro) hosted in AWS. For the last couple of days, connections in my pool are suddenly invalidated (2-3 times a day) and the pool size drops drastically. In summary:
Let's say everything is normal in Hikari: connections are closed and added according to the maxLifetime, which is 30 minutes by default, and the logs are like below:
HikariPool-1 - Pool stats (total=40, active=0, idle=40, waiting=0)
HikariPool-1 - Fill pool skipped, pool is at sufficient level.
Suddenly most of the connections become invalidated. Let's say 30 out of 40. The connections are closed before they pass their max lifetime and the logs are like below for all closed connections:
HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection#5257d7b2 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
HikariPool-1 - Closing connection org.postgresql.jdbc.PgConnection#7b673105: (connection is dead)
Additionally, these messages are followed by multiple log entries like the one below:
Add connection elided, waiting 6, queue 13
And the timeout failure stats like below:
HikariPool-1 - Timeout failure stats (total=12, active=12, idle=0, waiting=51)
Finally, I am left with lots of connection timeouts because no connection was available for most of the requests:
java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms
I have added leak-detection-threshold, and it also logs the following while the problem is happening:
Connection leak detection triggered for org.postgresql.jdbc.PgConnection#3bb5f155 on thread http-nio-8080-exec-482, stack trace follows
java.lang.Exception: Apparent connection leak detected
The hikari config is like below:
hikari:
  data-source-properties: stringtype=unspecified
  maximum-pool-size: 40
  leak-detection-threshold: 30000
When this problem happens, queries in PostgreSQL also take a lot of time: 8-9 seconds, increasing up to 15-35 seconds, and some queries even 55-65 seconds (queries that usually take 1-3 seconds at most). That is why we think it is not a query issue.
In addition, some sources suggest using try-with-resources; however, that does not apply to us, as we do not obtain connections manually (the pattern is sketched below for reference). Increasing the max pool size from 20 to 40 also did not help. I would really appreciate any comment or hint, as we have been dealing with this issue for almost a week.
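For context, a minimal sketch (plain Java for illustration; the table name is hypothetical) of the try-with-resources pattern those sources refer to, which only applies when a connection is checked out from the DataSource by hand:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ManualQueryExample {
    public int countRows(DataSource dataSource) throws SQLException {
        // All three resources are closed when the block exits, even on exception,
        // so the connection goes back to the Hikari pool instead of leaking.
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement("select count(*) from some_table");
             ResultSet resultSet = statement.executeQuery()) {
            resultSet.next();
            return resultSet.getInt(1);
        }
    }
}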

Spring Boot 1.5.3 Creating more connections than specified in application.properties

I am working on a project where I have dual datasources configured. In testing I limited the number of max-active connections to five, but when I checked the database I found that the application creates around 25+ connections.
Code Sample
# Number of ms to wait before throwing an exception if no connection is available.
spring.datasource.tomcat.max-wait=1000
# Maximum number of active connections that can be allocated from this pool at the same time.
spring.datasource.tomcat.max-active=1
spring.datasource.tomcat.max-idle=1
spring.datasource.tomcat.min-idle=1
spring.datasource.tomcat.initial-size=1
# Validate the connection before borrowing it from the pool.
spring.datasource.tomcat.test-on-borrow=true
spring.datasource.tomcat.test-while-idle = true
spring.datasource.tomcat.validation-query = true
spring.datasource.tomcat.time-between-eviction-runs-millis = 360000
spring.rdatasource.tomcat.max-wait=1000
# Maximum number of active connections that can be allocated from this pool at the same time.
spring.rdatasource.tomcat.max-active=1
spring.rdatasource.tomcat.max-idle=1
spring.rdatasource.tomcat.min-idle=1
spring.rdatasource.tomcat.initial-size=1
# Validate the connection before borrowing it from the pool.
spring.rdatasource.tomcat.test-on-borrow=true
spring.rdatasource.tomcat.test-while-idle= true
spring.rdatasource.tomcat.validation-query = true
spring.rdatasource.tomcat.time-between-eviction-runs-millis = 360000
The above configuration is working, but the number of connections to the database is being exceeded. The database user I am using is limited to 10 connections.
When I hit the application with a request, I get a query wait timeout error along with "unable to create initial pool size".
I am using Tomcat connection pooling.
Please provide me a solution so the application will run within the 10-connection limit set at the database.
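For reference, a minimal sketch (Spring Boot 1.5-style Java configuration; bean names are illustrative and not from the question) of how two Tomcat-pool datasources are typically bound to separate property prefixes, so that each pool's limits are applied to the pool that actually gets built:
import javax.sql.DataSource;
import org.springframework.boot.autoconfigure.jdbc.DataSourceBuilder;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DualDataSourceConfig {

    // Bound from spring.datasource.tomcat.*; the url/username/password must also
    // be resolvable under the bound prefix (or set on the builder directly).
    @Bean
    @Primary
    @ConfigurationProperties("spring.datasource.tomcat")
    public DataSource primaryDataSource() {
        return DataSourceBuilder.create()
                .type(org.apache.tomcat.jdbc.pool.DataSource.class)
                .build();
    }

    // Bound from spring.rdatasource.tomcat.*; pool settings placed under a prefix
    // that is never bound to a pool are silently ignored.
    @Bean
    @ConfigurationProperties("spring.rdatasource.tomcat")
    public DataSource readDataSource() {
        return DataSourceBuilder.create()
                .type(org.apache.tomcat.jdbc.pool.DataSource.class)
                .build();
    }
}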

Randomly Login Timeout Expired errors from SQL 2000 DTS against SQL2008R2 databases

I have some JOBs running on SQL Server 2000, which are calling stored procedures or queries against remote SQL Servers (different editions).
The job calls a DTS package, and it is the DTS that makes the remote connection and executes the stored procedure or retrieves the query results from the remote server.
This has been working without errors for years. I don't know why, during the last month, I'm randomly getting errors on these kinds of jobs... I've read some other posts and it seems to be related to a security issue, but I repeat, most of the time the jobs work; only some runs fail with the error.
Executed as user: SERVER\user. DTSRun: Loading... DTSRun: Executing...
DTSRun OnStart: DTSStep_DTSDynamicPropertiesTask_2 DTSRun OnError:
DTSStep_DTSDynamicPropertiesTask_2, Error = -2147467259 (80004005)
Error string: Login timeout expired Error source: Microsoft OLE DB
Provider for SQL Server Help file: Help context: 0
Error Detail Records: Error: -2147467259 (80004005); Provider Error: 0 (0)
Error string: Login timeout expired Error source: Microsoft OLE DB Provider for SQL Server
Help file: Help context: 0 DTSRun OnFinish:
DTSStep_DTSDynamicPropertiesTask_2 DTSRun: Package execution complete.
Process Exit Code 1. The step failed.
I really don't know what to check. After rebooting the server the problems are still there. Any help from you guys would be appreciated.
EDIT 2019-02-14 16:15 -------------------------------------------------------------------------------------------
One of the solutions I found was to change the Remote Login Timeout property from the default 20 seconds to 30 seconds, or to 0 (zero means no timeout), by executing the following code:
sp_configure 'remote login timeout', 30 --Or 0 seconds for infinite
go
reconfigure with override
go
From: https://support.microsoft.com/es-es/help/314530/error-message-when-you-execute-a-linked-server-query-in-sql-server-tim
I've tried this solution, changing it to 30 seconds, but with the same result. Of course I didn't set it to 0, for obvious reasons; the timeouts are there for a reason. I also tried 300 seconds (5 minutes to make a login!) and still the same.
EDIT 2019-02-25 11:25 -------------------------------------------------------------------------------------------
Very similar to my problem, still not solved...
https://www.sqlservercentral.com/Forums/Topic1727739-391-1.aspx
For the moment I have a temporary solution, which is to increase the Connect Timeout on the Connection object.
It was blank (probably using its default value).
Since I changed this property (Connection Object > Advanced... > Connect Timeout) to 300, I'm not having the problems on these DTS packages. I left 2 DTS packages without the change to confirm that the problem persisted, and those are the only ones that still have it. The changed ones are working fine now.

Connection failure has been detected: HQ119014: Did not receive data from invm:0

I have a TimerWS running correctly, but after a few hours I get this error:
WARN [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from invm:0. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
WARN [org.hornetq.jms.server] (Thread-3 (HornetQ-client-global-threads-451033388)) Notified of connection failure in xa discovery, we will retry on the next recovery: HornetQException[errorType=NOT_CONNECTED message=HQ119006: Channel disconnected]
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
Any idea?
Something is going wrong with a client's connection to the server.
These warnings are generated by the server when it determines that a client is no longer responding. When that happens, the server cleans up the server-side resources related to the client's connections.
I have seen this many times with clients that are not properly coded to close their resources when they exit (in a finally block, as sketched below).
Also look for network problems which could break the connection.
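For illustration, a minimal sketch (plain JMS; the connection factory, queue and message are assumptions) of closing the connection in a finally block so the server-side resources are released even if the client fails part-way:
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

public class QueueSender {
    public void send(ConnectionFactory factory, Queue queue, String text) throws JMSException {
        Connection connection = null;
        try {
            connection = factory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            producer.send(session.createTextMessage(text));
        } finally {
            // Closing the connection also closes its sessions and producers, and frees
            // the server-side resources the failure-check thread would otherwise have
            // to clean up after connection-ttl expires.
            if (connection != null) {
                connection.close();
            }
        }
    }
}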

High tomcat thread count caused by threads waiting on MultiThreadedHttpConnectionManager ConnectionPool

We've recently started seeing spikes in the thread counts on our Tomcat servers (peaking at over 1000 when normally at around 100). We performed a thread dump on one of the Tomcat servers whilst its thread count was high and found that a large number of the threads were waiting on MultiThreadedHttpConnectionManager$ConnectionPool, stack trace as follows:
"TP-Processor21700" daemon prio=10 tid=0x4a0b3400 nid=0x2091 in Object.wait() [0x399f3000..0x399f4004]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x58ee5030> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518)
- locked <0x58ee5030> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
...
There are 3 points in our code where httpClient.executeMethod() is called (to obtain info via an HTTP request to another Tomcat server). In each case the GetMethod object passed to it has had its socket timeout value set beforehand (i.e. via getMethod.getParams().setSoTimeout()), and the MultiThreadedHttpConnectionManager is configured in Spring to have a connectionTimeout value of 10 seconds. One thing I have noticed is that only 2 of the 3 httpClient.executeMethod() invocations are followed by a call to getMethod.releaseConnection(), so I'm wondering if this may be the cause of the problem (i.e. connections not being explicitly released).
However, what's strange is that the problem has only started occurring in the last few days, the source code has not been modified for over a year, and there has been no recent surge in requests coming through to the Tomcat servers. One change that did occur a couple of days before the problem started was that we upgraded the JVM used by the Tomcat server from Java 5 (1.5 update 14) to Java 6 (1.6 update 25). We tried temporarily reverting the JVM version to Java 5 to see if the problem stopped occurring, but it did not.
Another point to note is that in most cases the Tomcat server eventually recovers and the thread count drops back to normal; we've only had one instance where a Tomcat process appears to have crashed because of the thread count increase.
We are running Tomcat 5.5 with commons-httpclient-3.1.jar running against a Java 1.6 update 25 on a Red Hat linux environment.
Please let me know if you can suggest any ideas as to what may be the cause of this issue.
Thanks.
The problem was indeed caused by the fact that only 2 of the 3 httpClient.executeMethod(getMethod) invocations were followed by a call to getMethod.releaseConnection(). Ensuring all 3 invocations were inside a try block followed by a finally block containing a call to getMethod.releaseConnection() prevented the high thread counts from occurring (see the sketch below). Although this code had been in our live system for over a year, the high thread count issue only recently started occurring because various search engine crawlers had started hitting the site with lots of URL requests that exercised the code path where the connection was used but not subsequently released. Problem solved.
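For reference, a minimal sketch of that fix (Commons HttpClient 3.x API; the URL and timeout value are illustrative):
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;

public class RemoteInfoClient {
    public String fetch(HttpClient httpClient, String url) throws Exception {
        GetMethod getMethod = new GetMethod(url);
        getMethod.getParams().setSoTimeout(10000); // socket read timeout, as in the original code
        try {
            httpClient.executeMethod(getMethod);
            return getMethod.getResponseBodyAsString();
        } finally {
            // Always returns the underlying connection to the MultiThreadedHttpConnectionManager;
            // without this, other threads eventually block in doGetConnection().
            getMethod.releaseConnection();
        }
    }
}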