Why is Akka creating so many dispatchers? - scala

I'm using Akka for several different Actors. The work done by these Actors is non-blocking. I noticed something odd - the number of dispatchers scales with the number of Actors I'm creating. If I create hundreds of actors, I find myself with hundreds of dispatchers, sometimes over 1000.
This happens even though most of the dispatchers look like this:
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000003d503de50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
(basically, doing nothing most of the time)
I initialize dispatchers with calls like this:
ActorMaterializer(ActorMaterializerSettings(system).withDispatcher(s"akka.dispatchers.$dispatcherName"))(system)
My dispatcher configuration is below (we have different dispatchers for different actors):
akka {
  dispatchers {
    connector-actor-dispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        fixed-pool-size = 200
      }
      throughput = 1
    }
    http-actor-dispatcher {
      type = Dispatcher
      executor = "fork-join-executor"
      fork-join-executor {
        parallelism-min = 1
        parallelism-factor = 1.0
        parallelism-max = 64
        task-peeking-mode = "FIFO"
      }
      throughput = 1
    }
    commands-dispatcher {
      type = Dispatcher
      executor = "fork-join-executor"
      fork-join-executor {
        parallelism-min = 1
        parallelism-factor = 1.0
        parallelism-max = 64
        task-peeking-mode = "FIFO"
      }
      throughput = 1
    }
    http-server-dispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        core-pool-size-factor = 1
      }
      throughput = 1
    }
    http-client-dispatcher-low {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        core-pool-size-factor = 1
      }
      throughput = 1
    }
    http-client-dispatcher-high {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        core-pool-size-factor = 1
      }
      throughput = 1
    }
    http-client-dispatcher-parser {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        fixed-pool-size = 200
      }
      throughput = 1
    }
  }
}
What am I missing?

It seems that this was mostly answered in the comments above, but to collate them into an "Answer": it appears that "so many dispatchers" are getting created because you are creating them explicitly in your config.
Also, when you give an example of a "dispatcher" you are actually showing a thread stack trace, so you might be confusing threads with dispatchers. And many of the dispatchers you are creating have large thread counts. As @Tim says, "If you are creating hundreds of actors and each actor has its own dispatcher with many threads, you are going to get a lot of threads!"
But that's really a tuning question. The answer to the direct question is that, with the exception of the system dispatcher, dispatchers are only created when you specifically ask for them to be created. And it appears that you are creating many of them, each with an enormous thread count.
As discussed, the general best practice is to have one dispatcher for non-blocking actors and another dispatcher for blocking actors. Each dispatcher will also generally only need a small number of threads. There are some edge cases where you might want additional dedicated dispatchers for particularly sensitive or badly behaving actors, but it depends on your actors and your application.
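For concreteness, here is a minimal sketch of that layout (the dispatcher name "blocking-io-dispatcher" and the actor class are illustrative, not taken from the question):

import akka.actor.{Actor, ActorSystem, Props}

// Hypothetical actor that performs blocking disk or network I/O
class BlockingWorker extends Actor {
  def receive = {
    case path: String =>
      // a blocking call would go here, e.g. reading a file from disk
      sender() ! s"processed $path"
  }
}

object DispatcherLayout extends App {
  val system = ActorSystem("example")

  // Non-blocking actors stay on the shared default dispatcher;
  // blocking actors are all pinned to one dedicated dispatcher,
  // defined in application.conf (its name here is assumed)
  val worker = system.actorOf(
    Props[BlockingWorker]().withDispatcher("blocking-io-dispatcher"),
    "blockingWorker")
}

The point is that the whole application ends up with two small thread pools rather than one pool per actor.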

Related

Why is IoScheduler using a ScheduledExecutorService whose poolCoreSize is 1?

I found that IoScheduler.createWorker() will create a NewThreadWorker immediately if there is no cached NewThreadWorker. This may result in an OutOfMemoryError.
If I submit 1000 pieces of work to the IoScheduler at once, it will create 1000 NewThreadWorker and ScheduledExecutorService instances.
private void submitWorkers(int workerCount) {
    for (int i = 0; i < workerCount; i++) {
        Single.fromCallable(new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(1000);
                return "String-call(): " + Thread.currentThread().hashCode();
            }
        })
        .subscribeOn(Schedulers.io())
        .subscribe(new Consumer<String>() {
            @Override
            public void accept(String s) throws Exception {
                // TODO
            }
        });
    }
}
If I set workerCount to 1000, I get an OutOfMemoryError. I want to know why IoScheduler uses a NewThreadWorker with a ScheduledExecutorService just to execute a single piece of work.
Every time new work arrives, it creates a NewThreadWorker and ScheduledExecutorService if there is no cached NewThreadWorker. Why is it designed this way?
The standard workers of RxJava each use a dedicated thread to avoid excessive thread hopping and work migration in flows.
The standard IO scheduler uses an unbounded number of worker threads because its main use is to allow blocking operations to block a worker thread while other operations can commence on other worker threads. The difference from newThread is that thread reuse is allowed once a worker is returned to the internal pool.
If there was a limit on the number of threads, it would drastically increase the likelihood of deadlocks due to resource exhaustion. Also, unlike the computation scheduler, there is no good default number for this limit: 1, 10, 100, 1000?
There are several ways to work around this problem, such as:
use Schedulers.from() with an arbitrary ExecutorService which you can limit and configure as you wish (see the sketch after this list),
use ParallelScheduler from the Extensions project and define an arbitrarily large but fixed pool of workers.
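A minimal sketch of the first workaround, assuming RxJava 2 (the pool size of 10 and all names are arbitrary choices, not from the answer):

import java.util.concurrent.{Callable, Executors}

import io.reactivex.Single
import io.reactivex.schedulers.Schedulers

object BoundedIoExample extends App {
  // A user-managed pool with an explicit upper bound of 10 threads
  val boundedPool = Executors.newFixedThreadPool(10)
  val boundedScheduler = Schedulers.from(boundedPool)

  val work = new Callable[String] {
    override def call(): String = {
      Thread.sleep(1000)
      "done on " + Thread.currentThread().getName
    }
  }

  // However many of these are submitted, at most 10 threads are ever created
  val result = Single.fromCallable(work)
    .subscribeOn(boundedScheduler)
    .blockingGet()

  println(result)
  boundedPool.shutdown()
}

The trade-off described above still applies: with a hard limit, blocking work that waits on other work scheduled on the same bounded pool can deadlock.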

Akka dispatcher not configured exception in Play/Scala application

I am doing a disk-intensive operation and I want to use my own thread pool for it, not the default one.
I read the following link, and I am facing the exact same problem
Akka :: dispatcher [%name%] not configured, using default-dispatcher
But my config file is slightly different; I have tried the suggestion but it is not working.
My application.conf in play has the following
jpa-execution-context {
  thread-pool-executor {
    core-pool-size-factor = 10.0
    core-pool-size-max = 10
  }
}
And then in my test code I do the following, but I get an exception. Here is the test method
private def testContext(): Future[Int] = {
  val system = ActorSystem.create()
  val a = ActorSystem.create()
  implicit val executionContext1 = system.dispatchers.lookup("jpa-execution-context")
  Future { logger.error("inside my new thread pool wonderland"); 10 }{ executionContext1 }
}
Here is the exception:
akka.ConfigurationException: Dispatcher [jpa-execution-context] not configured
I think you forgot a few elements in your configuration:
jpa-execution-context {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    core-pool-size-factor = 10.0
    core-pool-size-max = 10
  }
}
Doc link: http://doc.akka.io/docs/akka/current/scala/dispatchers.html#types-of-dispatchers
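For completeness, a sketch of using the fixed configuration (assuming the block sits at the top level of application.conf, as in the question):

import akka.actor.ActorSystem
import scala.concurrent.{ExecutionContext, Future}

object JpaContextExample extends App {
  val system = ActorSystem("app")  // loads application.conf, including the block above

  // With `type` and `executor` present, this lookup no longer throws ConfigurationException
  implicit val jpaContext: ExecutionContext =
    system.dispatchers.lookup("jpa-execution-context")

  val result: Future[Int] = Future {
    // disk-intensive work would go here
    10
  }
}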

How many threads are being used by this dispatcher?

I have a dummy application.conf as follows:
configuration {
  default-dispatcher {
    type = Dispatcher
    executor = "thread-pool-executor"
    thread-pool-executor {
      core-pool-size-min = 4
      core-pool-size-factor = 2.0
      core-pool-size-max = 8
    }
    throughput = 10
    mailbox-capacity = -1
    mailbox-type = ""
  }
}
So, there'll be 4 x 2 = 8 threads in the pool, always, and a maximum of 8 x 2 = 16 threads.
Now, if I understand correctly, the dispatcher is responsible for picking an actor and a batch of messages from its mailbox and processing them.
Next, I spawn just one child actor for a supervisor as follows:
val greeter: ActorRef = context.actorOf(GreetingsActor.propsWithDispatcher)
What I'd like to know is: since there's only one instance of the child actor, only one thread from the pool would ever be in use.
Is my understanding correct?
Yes, a single actor can process one message at a time on a single thread. When the message has been processed, the thread is returned to the thread pool and is thus available for scheduling other messages and actors.
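The question's propsWithDispatcher is not shown; a typical definition (a sketch under that assumption, binding the actor to the dispatcher from the config above) might look like this:

import akka.actor.{Actor, Props}

class GreetingsActor extends Actor {
  def receive = {
    case name: String => println(s"Hello, $name")
  }
}

object GreetingsActor {
  // Binds every instance created from these Props to the dispatcher
  // configured under "configuration.default-dispatcher" in application.conf
  val propsWithDispatcher: Props =
    Props[GreetingsActor]().withDispatcher("configuration.default-dispatcher")
}

However many threads the pool is configured with, this single greeter will only ever occupy one of them at a time.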

Akka Remoting Failures with Amazon EC2

I'm building a library with Akka actors in Scala to do some large-scale data crunching.
I'm running my code on Amazon EC2 spot instances using StarCluster. The program is unstable because the actor remoting sometimes drops:
While the code is running, nodes disconnect one by one within a few minutes. Each node logs something like:
[ERROR] [07/16/2014 17:40:06.837] [slave-akka.actor.default-dispatcher-4] [akka://slave/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fslave%40master%3A2552-0/endpointWriter] AssociationError [akka.tcp://slave@node005:2552] -> [akka.tcp://slave@master:2552]: Error [Association failed with [akka.tcp://slave@master:2552]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://slave@master:2552]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: master
and
[WARN] [07/16/2014 17:30:05.548] [slave-akka.actor.default-dispatcher-12] [Remoting] Tried to associate with unreachable remote address [akka.tcp://slave@master:2552]. Address is now quarantined, all messages to this address will be delivered to dead letters.
Even though I can ping between the nodes just fine.
I've been trying to fix this; I've figured it's some configuration setting. The Akka remoting documentation even says,
However in cloud environments, such as Amazon EC2, the value could be increased to 12 in order to account for network issues that sometimes occur on such platforms.
However, I've set that and beyond and still no luck in fixing the issue. Here are my current remoting configurations:
akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      port = 2552
      # for modelling
      #send-buffer-size = 50000000b
      #receive-buffer-size = 50000000b
      #maximum-frame-size = 25000000b
      send-buffer-size = 5000000b
      receive-buffer-size = 5000000b
      maximum-frame-size = 2500000b
    }
    watch-failure-detector.threshold = 100
    acceptable-heartbeat-pause = 20s
    transport-failure-detector {
      heartbeat-interval = 4 s
      acceptable-heartbeat-pause = 20 s
    }
  }
  log-dead-letters = off
}
and I deploy my actors like so all from the master node:
val o2m = system.actorOf(Props(classOf[IntOneToMany], p), name = "o2m")
val remote = Deploy(scope = RemoteScope(Address("akka.tcp", "slave", args(i), 2552)))
val b = system.actorOf(Props(classOf[IntBoss], o2m).withDeploy(remote), name = "boss_" + i)
etc.
Can anyone point me to a mistake I'm making, or to how I can fix this problem and stop nodes from disconnecting? Alternatively, a solution that just re-launches the actors if they are disconnected would also work; I don't care much about dropped messages. In fact, I thought this was supposed to be easily configurable behavior, but I'm finding it difficult to find the right place to look for that.
Thank you
At least the configuration syntax is wrong: acceptable-heartbeat-pause should be nested under watch-failure-detector (yours is at the same level). They should look like this:
watch-failure-detector {
  threshold = 100
  acceptable-heartbeat-pause = 20 s
}
transport-failure-detector {
  heartbeat-interval = 4 s
  acceptable-heartbeat-pause = 20 s
}

Scala program exits before all the Scala Actor messages being sent are executed and completed. How do I stop this?

I am sending my Scala Actor its messages from a for loop. The Scala actor is receiving the messages and getting on with the job of processing them. The actors are processing CPU- and disk-intensive tasks such as unzipping and storing files. I deduced that the Actor part is working fine by putting a delay Thread.sleep(200) in my message-passing code in the for loop.
for (e <- entries) {
  MyActor ! new MyJob(e)
  Thread.sleep(100)
}
Now, my problem is that the program exits with code 0 as soon as the for loop finishes execution, thus preventing my Actors from finishing their jobs. How do I get over this? This may really be a n00b question. Any help is highly appreciated!
Edit 1:
This solved my problem for now:
while (MyActor.getState != Actor.State.Terminated)
  Thread.sleep(3000)
Is this the best I can do?
Assume you have one actor whose work you want to wait for. To avoid sleeping, you can create a SyncVar and wait in the main thread for it to be initialized:
val sv = new SyncVar[Boolean]

// start the actor
actor {
  // do something
  sv.set(true)
}

sv.take
The main thread will wait until some value is assigned to sv, and then be woken up.
If there are multiple actors, then you can either have multiple SyncVars, or do something like this:
class Ref(var count: Int)

val numactors = 50
val cond = new Ref(numactors)

// start your actors
for (i <- 0 until numactors) actor {
  // do something
  cond.synchronized {
    cond.count -= 1
    cond.notify()
  }
}

cond.synchronized {
  while (cond.count != 0) cond.wait()
}