Akka HTTP server dispatcher number constantly increasing - Scala

I'm testing an Akka HTTP service on AWS ECS. Each instance is added to a load balancer, which regularly makes requests to a health-check route. Since this is a test environment, I can ensure that no other traffic reaches the server. I notice debug log entries indicating that the "default-dispatcher" number is consistently increasing:
[DEBUG] [01/03/2017 22:33:03.007] [default-akka.actor.default-dispatcher-41200] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
[DEBUG] [01/03/2017 22:33:29.142] [default-akka.actor.default-dispatcher-41196] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
[DEBUG] [01/03/2017 22:33:33.035] [default-akka.actor.default-dispatcher-41204] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
[DEBUG] [01/03/2017 22:33:59.174] [default-akka.actor.default-dispatcher-41187] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
[DEBUG] [01/03/2017 22:34:03.066] [default-akka.actor.default-dispatcher-41186] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
[DEBUG] [01/03/2017 22:34:29.204] [default-akka.actor.default-dispatcher-41179] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
[DEBUG] [01/03/2017 22:34:33.097] [default-akka.actor.default-dispatcher-41210] [akka://default/system/IO-TCP/selectors/$a/0] New connection accepted
This trend is never reversed and will get up into the tens of thousands pretty soon. Is this normal behavior or indicative of an issue?
Edit: I've updated the log snippet to show that the dispatcher thread number goes way beyond what I would expect.
Edit #2: Here is the health check route code:
import akka.http.scaladsl.model.StatusCodes.OK
import akka.http.scaladsl.server.Directives._

import scala.concurrent.ExecutionContext

class HealthCheckRoutes()(implicit executionContext: ExecutionContext)
    extends LogHelper {

  val routes = pathPrefix("health-check") {
    pathEndOrSingleSlash {
      complete(OK -> "Ok")
    }
  }
}

Probably, yes. I think that's the thread name.
If you do a thread dump on the server, does it have a great many open threads?
It looks like your server is leaking a thread per connection.
(It will probably be much easier to debug and diagnose this on your development machine, rather than on the EC2 VM. Try to reproduce it locally.)
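To check without guessing, here is a small sketch (plain JMX plus Thread.getAllStackTraces, nothing Akka-specific; the object and method names are arbitrary) that could be called periodically or exposed from a debug route:
object ThreadCheck {
  import java.lang.management.ManagementFactory

  def logThreadStats(): Unit = {
    val bean = ManagementFactory.getThreadMXBean
    // If the live count climbs without bound, threads really are leaking;
    // if it stays flat while the numeric suffix in thread names grows, they are not.
    println(s"live threads: ${bean.getThreadCount}, peak: ${bean.getPeakThreadCount}")

    // List the current dispatcher threads by name.
    val it = Thread.getAllStackTraces.keySet.iterator()
    while (it.hasNext) {
      val t = it.next()
      if (t.getName.contains("default-dispatcher")) println(t.getName)
    }
  }
}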

For your question, check this comment:
Akka HTTP server dispatcher number constantly increasing
About the dispatcher:
It is not a problem to use the default dispatcher for operations like a health check.
Threads are controlled by the dispatcher you specify, or by the default-dispatcher if none is specified.
The default-dispatcher is configured as follows, which means the thread pool size is ceil(number of processors * 3), bounded between 8 and 64 (see the small sketch after the config block below):
default-dispatcher {
  type = "Dispatcher"
  executor = "default-executor"
  default-executor {
    fallback = "fork-join-executor"
  }
  fork-join-executor {
    # Min number of threads to cap factor-based parallelism number to
    parallelism-min = 8
    # The parallelism factor is used to determine thread pool size using the
    # following formula: ceil(available processors * factor). Resulting size
    # is then bounded by the parallelism-min and parallelism-max values.
    parallelism-factor = 3.0
    # Max number of threads to cap factor-based parallelism number to
    parallelism-max = 64
    # Setting to "FIFO" to use queue like peeking mode which "poll" or "LIFO" to use stack
    # like peeking mode which "pop".
    task-peeking-mode = "FIFO"
  }
}
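As a quick, illustrative sketch of that formula in plain Scala (the core counts in the comments are just examples):
object DispatcherSize {
  // ceil(available processors * parallelism-factor), clamped to [parallelism-min, parallelism-max]
  def poolSize(cores: Int, factor: Double = 3.0, min: Int = 8, max: Int = 64): Int =
    math.min(max, math.max(min, math.ceil(cores * factor).toInt))

  def main(args: Array[String]): Unit = {
    println(poolSize(Runtime.getRuntime.availableProcessors)) // size for this machine
    println(poolSize(cores = 2)) // 2 cores: 6 rounds up to the minimum of 8
    println(poolSize(cores = 8)) // 8 cores: 24 threads
  }
}
Note that even with a bounded pool, the numeric suffix in thread names (default-dispatcher-41200 and so on) can keep growing, because the fork-join pool retires idle workers and numbers their replacements sequentially; the count of live threads stays bounded even though the labels do not, which is why the numbering in the question's log can reach the tens of thousands while a thread dump shows only a few dozen live threads.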
Dispatcher documentation:
http://doc.akka.io/docs/akka/2.4.16/scala/dispatchers.html
Configuration reference:
http://doc.akka.io/docs/akka/2.4.16/general/configuration.html#akka-actor
BTW, for operations that take a long time and block other operations, here is how to specify a custom dispatcher for them in Akka HTTP:
http://doc.akka.io/docs/akka-http/current/scala/http/handling-blocking-operations-in-akka-http-routes.html
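A rough sketch of the pattern described on that page (the dispatcher name, the route path, and the sleep standing in for real blocking work are all placeholders; the dispatcher itself has to be declared in application.conf as shown in the comment):
import akka.actor.ActorSystem
import akka.http.scaladsl.server.Directives._

import scala.concurrent.Future

class BlockingRoutes(implicit system: ActorSystem) {
  // Assumed entry in application.conf:
  //   my-blocking-dispatcher {
  //     type = Dispatcher
  //     executor = "thread-pool-executor"
  //     thread-pool-executor { fixed-pool-size = 16 }
  //   }
  private val blockingDispatcher = system.dispatchers.lookup("my-blocking-dispatcher")

  val routes = path("heavy") {
    complete {
      // Run the blocking work on the dedicated pool so the default dispatcher
      // (which also serves connections and the health check) is never starved.
      Future {
        Thread.sleep(5000) // stand-in for a real blocking call
        "done"
      }(blockingDispatcher)
    }
  }
}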

According to this akka-http GitHub issue, there doesn't seem to be a problem: https://github.com/akka/akka-http/issues/722

Related

Hikari CP connections are suddenly invalidated

Hi Stackoverflow family,
So we have an application with Kotlin & Spring Boot that uses a single PostgreSQL DB instance (1 GB memory, instance class db.t3.micro) hosted in AWS. What has been happening for the last couple of days is that connections in my pool are suddenly invalidated (2-3 times a day) and the pool size drops drastically. In summary:
Let's say everything is normal in Hikari, and connections are closed and added according to the maxLifetime, which is 30 minutes by default, and the logs look like below:
HikariPool-1 - Pool stats (total=40, active=0, idle=40, waiting=0)
HikariPool-1 - Fill pool skipped, pool is at sufficient level.
Suddenly most of the connections become invalidated. Let's say 30 out of 40. The connections are closed before they pass their max lifetime and the logs are like below for all closed connections:
HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection#5257d7b2 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
HikariPool-1 - Closing connection org.postgresql.jdbc.PgConnection#7b673105: (connection is dead)
Additionally, these messages are followed by multiple logs like the one below:
Add connection elided, waiting 6, queue 13
And the timeout failure stats like below:
HikariPool-1 - Timeout failure stats (total=12, active=12, idle=0, waiting=51)
Finally, I am left with lots of request connection timeouts, because no connection was available for most of the requests:
java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms
I have added leak-detection-threshold, and it also produces logs like the ones below while the problem is happening:
Connection leak detection triggered for org.postgresql.jdbc.PgConnection#3bb5f155 on thread http-nio-8080-exec-482, stack trace follows
java.lang.Exception: Apparent connection leak detected
The hikari config is like below:
hikari:
  data-source-properties: stringtype=unspecified
  maximum-pool-size: 40
  leak-detection-threshold: 30000
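Since the "Failed to validate connection ... consider using a shorter maxLifetime value" warnings suggest something between the application and PostgreSQL is dropping connections before Hikari retires them, one thing worth trying is pinning maxLifetime below that external timeout. Purely as a sketch of the relationship (the 15-minute value and the JDBC URL are illustrative; in Spring Boot the equivalent property would live under the hikari block above):
import com.zaxxer.hikari.{HikariConfig, HikariDataSource}
import java.util.concurrent.TimeUnit

object PoolSetup {
  def dataSource(): HikariDataSource = {
    val config = new HikariConfig()
    config.setJdbcUrl("jdbc:postgresql://db-host:5432/appdb") // hypothetical URL
    config.setMaximumPoolSize(40)
    config.setLeakDetectionThreshold(30000)
    // Keep maxLifetime comfortably below any timeout enforced by the database,
    // a proxy, or the network in between, so Hikari retires connections first.
    config.setMaxLifetime(TimeUnit.MINUTES.toMillis(15))
    new HikariDataSource(config)
  }
}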
When this problem happens, queries in PostgreSQL also take a lot of time: 8-9 seconds, increasing up to 15-35 seconds, and some queries even 55-65 seconds (queries that usually take 1-3 seconds at most). That is why we think it is not a query issue.
Some sources suggest using try-with-resources; however, that does not apply to us, as we do not obtain connections manually. Increasing the max pool size from 20 to 40 also did not help. I would really appreciate any comment or hint, as we have been dealing with this issue for almost a week.

Spring Boot 1.5.3 creating more connections than specified in application.properties

I am working on a project where I have dual data sources configured. In testing I have limited the number of max-active connections to five, but when I checked the database, I found that the application creates around 25+ connections.
Code Sample
# Number of ms to wait before throwing an exception if no connection is available.
spring.datasource.tomcat.max-wait=1000
# Maximum number of active connections that can be allocated from this pool at the same time.
spring.datasource.tomcat.max-active=1
spring.datasource.tomcat.max-idle=1
spring.datasource.tomcat.min-idle=1
spring.datasource.tomcat.initial-size=1
# Validate the connection before borrowing it from the pool.
spring.datasource.tomcat.test-on-borrow=true
spring.datasource.tomcat.test-while-idle = true
spring.datasource.tomcat.validation-query = true
spring.datasource.tomcat.time-between-eviction-runs-millis = 360000
spring.rdatasource.tomcat.max-wait=1000
# Maximum number of active connections that can be allocated from this pool at the same time.
spring.rdatasource.tomcat.max-active=1
spring.rdatasource.tomcat.max-idle=1
spring.rdatasource.tomcat.min-idle=1
spring.rdatasource.tomcat.initial-size=1
# Validate the connection before borrowing it from the pool.
spring.rdatasource.tomcat.test-on-borrow=true
spring.rdatasource.tomcat.test-while-idle= true
spring.rdatasource.tomcat.validation-query = true
spring.rdatasource.tomcat.time-between-eviction-runs-millis = 360000
The above configuration is working fine, but the number of connections to the database is exceeded. The database user I am using is limited to 10 connections.
When I send a request to the application, I get a query wait timeout error along with "unable to create initial pool size".
I am using Tomcat connection pooling.
Please provide a solution so the application runs within the 10-connection limit set at the database.

scala, spray, akka - java.lang.OutOfMemoryError: unable to create new native thread

While checking the throughput of a Spray API:
Scenario: 25 concurrent users
OS: FreeBSD
Memory: 2 GB
Number of cores: 2
At around 13 concurrent users I was getting the following error.
[ERROR] [06/29/2015 05:01:56.407] [default-akka.actor.default-dispatcher-2] [ActorSystem(default)] Uncaught error from thread [default-akka.actor.default-dispatcher-2] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at scala.concurrent.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
at scala.concurrent.forkjoin.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1795)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:117)
Akka and Spray config changes from the defaults:
akka {
  tcp {
    register-timeout = 20s
  }
}

spray.can {
  request-timeout = 30 s
  bind-timeout = 30s
  unbind-timeout = 5s
  registration-timeout = 30s
}

http.spray.can {
  server {
    pipelining-limit = 50
  }
}
What is causing the OutOfMemoryError? The exception is thrown from the router actor.
I can't read minds, but you are probably doing blocking (Await.result or similar) inside actors. The ForkJoinPool automatically creates a new thread for every blocked one. So with long-lasting blocks, count_of_threads == count_of_requests (and every thread also holds references in its call stack), which eventually causes the OutOfMemoryError.
See: Blocking Needs Careful Management
P.S. Here you may find out why Await.result (which uses scala.concurrent.blocking inside) leads to unmanaged creation of threads in the ForkJoinPool (even regardless of maxParallelism); see the sketch just below.
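A minimal sketch of the anti-pattern and of one common remedy (the pool size and the simulated slow call are placeholders):
import java.util.concurrent.Executors

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object BlockingSketch {
  // Anti-pattern: awaiting inside code that runs on the default fork-join pool.
  // Await.result goes through scala.concurrent.blocking, so the pool keeps
  // spawning compensating threads, roughly one per blocked request.
  def waitInline(slowCall: Future[String]): String =
    Await.result(slowCall, 30.seconds)

  // One remedy: run blocking work on a dedicated, fixed-size pool so the
  // fork-join pool serving actors and routes never needs to compensate.
  private val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(16))

  def runBlocking(doSlowCall: () => String): Future[String] =
    Future(doSlowCall())(blockingEc)
}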
Or you are creating a lot of ActorSystems; the same page of the Akka documentation states:
An ActorSystem is a heavyweight structure that will allocate 1…N
Threads, so create one per logical application.
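For illustration only (the names are arbitrary), keeping a single ActorSystem in one place and sharing it avoids that cost:
import akka.actor.ActorSystem

// One ActorSystem per logical application, created once and shared everywhere.
object AppActorSystem {
  val system: ActorSystem = ActorSystem("app")
}

// Anti-pattern: calling ActorSystem("per-request") inside a request handler,
// which would allocate a fresh set of dispatcher threads on every call.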

Rest server (Play Framework) gets "Read Timed out" exception during load test

We are running a heavy load test (JMeter: 350 threads, 35M total requests) on a REST server using Play Framework and run into the following error after ~2 hours. We removed other components so that the server simply accepts requests and does nothing. Does anyone have any idea, or can Play Framework simply not handle a heavy load like this?
2014/07/05 11:59:38 WARN - com.company.test.RestTest2: Run TestSQL throw error java.lang.Exception: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.company.dispatcher.RexsterRESTTaskDispatcher.dispatchTask(RexsterRESTTaskDispatcher.java:76)
at com.company.test.RestTest2.runTest(RestTest2.java:375)
at org.apache.jmeter.protocol.java.sampler.JavaSampler.sample(JavaSampler.java:191)
at org.apache.jmeter.threads.JMeterThread.process_sampler(JMeterThread.java:429)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:257)
at java.lang.Thread.run(Thread.java:744)
Part of the application.conf:
....
db.pool.timeout=100000

play {
  akka {
    akka.loggers = ["akka.event.Logging$DefaultLogger", "akka.event.slf4j.Slf4jLogger"]
    loglevel = WARNING
    actor {
      default-dispatcher = {
        fork-join-executor {
          parallelism-factor = 64
          parallelism-max = 1000
        }
      }
    }
  }
}
I had this error today. It took me a while to find out that one of the Windows (svchost) processes was occupying port 1099, which the JMeter server was trying to use.
I got a hint about this when trying to start the Jmeter-Server.bat file manually. The following PowerShell command then provided the details of that process. After closing that process, JMeter clients started to connect again.
Get-Process -Id (Get-NetTCPConnection -LocalPort 1099).OwningProcess
There are many things to check:
Are you running the test from the same machine? If yes, that is a problem.
Is your machine's TCP stack tuned?
What is your JVM configuration regarding Xmx, as well as your machine's memory, CPU, etc.?
What does your test look like? Could you show a screenshot with all elements unfolded?
I think Play/Akka can handle this load without problems, so I would look into configuration issues.

Handling connection failures in apache-camel

I am writing an apache-camel RabbitMQ consumer. I would like to react somehow to connection problems (i.e. try to reconnect). Is it possible to configure apache-camel to automatically reconnect?
If not, how can I find out that a connection to the queue was interrupted? I've done the following test:
start the queue (and some producer)
start my consumer (it was getting messages as expected)
stop the queue (the messages stopped arriving, as expected, but no exception was thrown)
start the queue (no new messages were received)
I am using Camel from Scala (via akka-camel), but a Java solution would probably also be OK.
You can pass the flag automaticRecoveryEnabled=true in the endpoint URI; Camel will then reconnect if the connection is lost. A sketch of what that looks like with akka-camel follows.
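For illustration only (the host, exchange, and queue names are made up, and the URI follows the camel-rabbitmq format of rabbitmq://host:port/exchange?options), an akka-camel consumer carrying that option might look like this:
import akka.camel.{CamelMessage, Consumer}

// Hypothetical consumer; host, exchange and queue names are placeholders.
class OrdersConsumer extends Consumer {
  // automaticRecoveryEnabled=true asks the underlying RabbitMQ client to
  // re-establish the connection if it is lost.
  def endpointUri =
    "rabbitmq://localhost:5672/orders-exchange?queue=orders&automaticRecoveryEnabled=true"

  def receive = {
    case msg: CamelMessage =>
      println(s"received: ${msg.bodyAs[String]}")
  }
}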
For automatic RabbitMQ resource recovery (Connections/Channels/Consumers/Queues/Exchanages/Bindings) when failures occur, check out Lyra (which I authored). Example usage:
Config config = new Config()
    .withRecoveryPolicy(new RecoveryPolicy()
        .withMaxAttempts(20)
        .withInterval(Duration.seconds(1))
        .withMaxDuration(Duration.minutes(5)));
ConnectionOptions options = new ConnectionOptions().withHost("localhost");
Connection connection = Connections.create(options, config);
The rest of the API is just the amqp-client API, except your resources are automatically recovered when failures occur.
I'm not sure about camel-rabbitmq specifically, but hopefully there's a way you can swap in your own resource creation via Lyra.
The current camel-rabbitmq just creates the connection and the channel when the consumer or producer is started, so it doesn't have a chance to catch the connection exception :(