Akka control threadpool threads - scala

Potentially a very silly question--
Is it possible to customize Akka/Scala actors such that you control the threads that are used by the actors? e.g. can you initialize your own set of threads to be used in the thread pool, or otherwise control/modify the threads?

In Akka, the thread pool is managed via a MessageDispatcher instance. You can easily assign the dispatcher you want to an actor:
    class MyActor( dispatcher: MessageDispatcher ) extends Actor {
      self.dispatcher = dispatcher
      ...
    }
To provide your own dispatcher, you can extend akka.dispatch.MessageDispatcher (see the existing dispatcher implementations for examples). There you can work directly with the threads.
Of course, it's dangerous to put business logic inside a dispatcher because it may break the actor model and increase the number of concurrency bugs...

I tried to figure this out myself, but it seems the Akka developers don't want thread management to be exposed publicly.
ThreadPoolConfig, the class responsible for creating ExecutorService instances, is a case class whose createExecutorService() method is declared final:
    final def createExecutorService(threadFactory: ThreadFactory): ExecutorService = {
      flowHandler match {
        case Left(rejectHandler) ⇒
          val service = new ThreadPoolExecutor(...)
          service
        case Right(bounds) ⇒
          val service = new ThreadPoolExecutor(...)
          new BoundedExecutorDecorator(service, bounds)
      }
    }
So, I don't see an easy way to provide your own ExecutorService.
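That said, createExecutorService above still takes a ThreadFactory, so thread creation itself looks like one place where customization could happen. Below is a minimal sketch of what such a factory might look like, using only java.util.concurrent and no Akka at all. The names NamedDaemonThreadFactory and ThreadFactoryDemo are hypothetical, and how a factory like this would be handed to a dispatcher depends on the Akka version and is not shown here.

```scala
import java.util.concurrent.{Callable, Executors, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical factory (not part of Akka): names each thread with a prefix
// and a running number, and marks it as a daemon thread.
final class NamedDaemonThreadFactory(prefix: String) extends ThreadFactory {
  private val counter = new AtomicInteger(0)

  override def newThread(r: Runnable): Thread = {
    val t = new Thread(r, s"$prefix-${counter.incrementAndGet()}")
    t.setDaemon(true)
    t
  }
}

object ThreadFactoryDemo {
  // Runs one task on a pool built from the factory and returns the name
  // of the thread that executed it.
  def workerThreadName(): String = {
    val pool = Executors.newFixedThreadPool(2, new NamedDaemonThreadFactory("my-actors"))
    try {
      pool
        .submit(new Callable[String] {
          override def call(): String = Thread.currentThread().getName
        })
        .get(5, TimeUnit.SECONDS)
    } finally pool.shutdown()
  }
}
```

The factory controls how threads are created (name, daemon flag, and by extension priority or an uncaught-exception handler), while the pool still decides when to create them.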

Related

Using custom dispatcher for all user actors in Akka to avoid thread starvation for system actors

This question is about dispatcher configuration in Akka 2.5.x.
I would like to avoid thread starvation, which may cause nodes to be quarantined because system messages are not delivered. To achieve that, I would like to create a separate dispatcher that is configured exactly like the default dispatcher.
I have defined a custom dispatcher configuration with the name my-dispatcher {...}. Can I use the following to make Akka use this dispatcher for all user actors?
    akka.actor.deployment {
      "/**" { # <- should this work?
        dispatcher = my-dispatcher
        ...
      }
    }
The idea came from the following example in the documentation:
    # all direct children of '/user/actorC' have a dedicated dispatcher
    "/actorC/*" {
      dispatcher = my-dispatcher
    }
So if I replace actorC with ** it should target all actors under /user, and I would expect it to work. Does anyone do it this way? Or do I need to find another solution?
As far as I have checked, your approach works: it overrides the dispatcher for all actors under the /user path.
A more conventional way to configure the default dispatcher, though, is to override entries under akka.actor.default-dispatcher in the config (you can override only some of them), for example:
    akka {
      actor {
        default-dispatcher {
          fork-join-executor {
            parallelism-min = 2
          }
          throughput = 100
        }
      }
    }
If you insist on using a dispatcher defined at a different path in the config as the default, you could override the default-dispatcher section with my-dispatcher:
akka.actor.default-dispatcher = ${path.to.my-dispatcher}

Declaring Actor state variables as mutable ones

I am fairly new to the Akka framework and to concurrency concepts. From the Akka docs, I understand that only one message in an actor's mailbox is processed at a time, so only a single thread works on the actor's state at any given moment. My question is whether declaring an actor's state variable as mutable, i.e. as a 'var' (only where a 'val' doesn't fit), can lead to inconsistent actor state under concurrency.
I am using Scala for development. In the following Master actor, the details of the workers are stored in a mutable variable 'workers'. Will that be a problem with concurrency?
    class Master extends PersistentActor with ActorLogging {
      ...
      private var workers = Map[String, WorkerState]()
      ...
    }
I think what you are doing is fine. As you said, one of the fundamental guarantees of Akka actors is that a single actor will be handling one message at a time, so there will not be inconsistent Actor states.
Akka actors conceptually each have their own light-weight thread,
which is completely shielded from the rest of the system. This means
that instead of having to synchronize access using locks you can just
write your actor code without worrying about concurrency at all.
http://doc.akka.io/docs/akka/snapshot/general/actors.html
Also, it is a good thing that you're using a var instead of a val with a mutable map :)
Another way to approach situations like this is to replace the actor's "state" after each message is handled, e.g.:
    class Master extends PersistentActor with ActorLogging {
      type MyStateType = ... // e.g. Map[String, WorkerState], or an immutable case class - of course, feel free to just inline the type...
      def receive = handle(initState) // e.g. just inline a call to Map.empty
      def handle(state: MyStateType): Actor.Receive = LoggingReceive {
        case MyMessageType(data) =>
          ... // processing data - build new state
          context.become(handle(newState)) // swap in a behavior that closes over the new state
        case ... // any other message types to be handled, etc.
      }
      ... // rest of class implementation
    }
While it is true that there is still mutable state happening here (in this case, it is the state of the actor as a whole - it becomes effectively a "non-finite state machine"), it feels better contained/hidden (to me, at least), and the "state" (or "workers") available to the actor for any given message is treated as entirely immutable.
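One way to make that immutable "state" easy to test is to pull the transition out into a pure function that the handle(state) behavior would call before switching behavior. Here is a minimal sketch with no Akka dependency. The message types (RegisterWorker, WorkStarted, WorkerGone) and the Idle/Busy states are hypothetical, since the original WorkerState and messages are not shown.

```scala
// Hypothetical worker states; the original WorkerState is not shown.
sealed trait WorkerState
case object Idle extends WorkerState
case object Busy extends WorkerState

// Hypothetical messages the Master might receive.
sealed trait MasterMsg
final case class RegisterWorker(id: String) extends MasterMsg
final case class WorkStarted(id: String) extends MasterMsg
final case class WorkerGone(id: String) extends MasterMsg

object MasterLogic {
  type Workers = Map[String, WorkerState]

  val empty: Workers = Map.empty

  // Pure transition: old state + message => new state. Nothing is mutated.
  def next(workers: Workers, msg: MasterMsg): Workers = msg match {
    case RegisterWorker(id)                      => workers + (id -> Idle)
    case WorkStarted(id) if workers.contains(id) => workers.updated(id, Busy)
    case WorkStarted(_)                          => workers // unknown worker: state unchanged
    case WorkerGone(id)                          => workers - id
  }
}
```

Inside the actor, each case would then compute MasterLogic.next(state, msg) and pass the result to handle, so the only thing that ever changes is which behavior is installed.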

Akka Circuit Breaker sharing between actors

I have a shared external resource (say a file store) which a pool of actors is using. Each time a new request is made to the file store a new actor is created to fill the request with a reference to the external system passed in.
The current approach where I create a circuit breaker per actor defeats the purpose as a new actor is created for each 'request' which performs a sequence of operations on this external resource.
Not ideal (too many CB instances):
    class MySharedResourceActor(externalResourceRef: ExtSystem) extends Actor with ActorLogging {
      import context.dispatcher // implicit ExecutionContext required by the CircuitBreaker constructor

      val breaker = new CircuitBreaker(context.system.scheduler,
        maxFailures = 5,
        callTimeout = 10.seconds,
        resetTimeout = 1.minute).onOpen(notifyMeOnOpen())

      def receive = {
        case SomeExternalOp =>
          // withSyncCircuitBreaker returns a plain value, not a Future, so reply directly instead of using pipeTo
          sender() ! breaker.withSyncCircuitBreaker(dangerousCallToExternalSystem())
      }
    }
Better approach (pass in a CB ref):
    class MySharedResourceActor(externalResourceRef: ExtSystem, val breaker: CircuitBreaker) extends Actor with ActorLogging {
      def receive = {
        case SomeExternalOp =>
          // withSyncCircuitBreaker returns a plain value, not a Future, so reply directly instead of using pipeTo
          sender() ! breaker.withSyncCircuitBreaker(dangerousCallToExternalSystem())
      }
    }
Is it safe to pass in a Circuit-breaker reference from the parent actor which also maintains a reference to the external system and share this circuit breaker between multiple actors in a router pool, dynamically created or otherwise?
Yes, it's safe to follow this approach. We share circuit breakers across related actors (pooled or otherwise) that make HTTP calls to the same host. If you didn't do this and let each instance have its own breaker, then even with long-lived instances each one would need to hit the failure threshold separately before its breaker opened, and I doubt that is the behavior you want. Sharing lets multiple actors contribute stats (failures, successes) to the breaker, so that the breaker is representative of all calls that have gone to the resource.
In looking at Akka's code, they are using atomics inside of the circuit breaker to represent state and handle state transitions, so they should be safe to use in multiple actors.
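To illustrate why atomic state makes this kind of sharing workable, here is a toy sketch. It is not Akka's CircuitBreaker and leaves out timeouts, the half-open state and callbacks. ToyBreaker and SharedBreakerDemo are hypothetical names, and plain threads stand in for the actors.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Toy stand-in for a breaker (NOT Akka's implementation): it only counts
// consecutive failures in an AtomicInteger and reports "open" at a threshold.
final class ToyBreaker(maxFailures: Int) {
  private val failures = new AtomicInteger(0)

  def recordFailure(): Unit = { failures.incrementAndGet(); () }
  def recordSuccess(): Unit = failures.set(0)
  def isOpen: Boolean = failures.get() >= maxFailures
}

object SharedBreakerDemo {
  // Four callers share one breaker and each reports two failures.
  // No single caller reaches the threshold of 5, but together they do.
  def opensAfterSharedFailures(): Boolean = {
    val shared = new ToyBreaker(maxFailures = 5)
    val callers = (1 to 4).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit = {
          shared.recordFailure()
          shared.recordFailure()
        }
      })
    }
    callers.foreach(_.start())
    callers.foreach(_.join())
    shared.isOpen
  }
}
```

Because every update goes through the atomic counter, no failure is lost when callers report concurrently, and the shared count reflects all calls to the resource rather than one caller's view.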

How to start Akka Actors since 2.0?

I'm using Akka Actors and I'm trying to update my code to use the latest 2.0 milestone. The API changed somewhat, for example the creation of Actors now works via something called an ActorSystem.
Starting and stopping actors changed as well - the latter is available via the ActorSystem's methods .stop(..) and .shutdown(). But I cannot for the life of me figure out how to start them...
The documentation is good, but seems to be missing some important points. I feel kinda stupid to ask, but how do you start actors in your Akka 2.0 environment? If I understood correctly actors who have a 'parent' are started when this parent is started - but what about the top level actor(s)?
In Akka 2.0, there is no need for a start() method because Actors are started as soon as you instantiate them in the context of an ActorSystem (or another Actor) -- but you need to instantiate them with one of the provided methods of ActorSystem or an Actor's context.
So, for example, if you have an Actor subclass called MyActor, you could start it with:
    val system = ActorSystem()
    val myActor = system.actorOf(Props[MyActor])
or, if your actor took constructor arguments:
    val myActor = system.actorOf(Props(new MyActor("arg1")))
or, if you were in the body of another Actor,
    val myActor = context.actorOf(Props(new MyActor("arg1")))
and then your actor could immediately receive messages, e.g.
    myActor ! MyMessage
Even your top level actors are started immediately, as all Actors in 2.0 are automatically in a supervision hierarchy. As soon as the actor is instantiated with the ActorSystem, it's ready to receive messages.

garbage collecting scala actors

Scenario: I have this code:
    class MyActor extends Actor {
      def act() {
        react {
          case Message() => println("hi")
        }
      }
    }
    def meth() {
      val a = new MyActor
      a.start
      a ! Message()
    }
Is the MyActor instance garbage collected? If not, how do I make sure it is? If I create an ad-hoc actor (with the 'actor' method), is that actor GCed?
This thread on the scala-user mailing list is relevant.
There, Philipp Haller mentions using a particular scheduler (available in Scala 2.8) to enable termination of an Actor before garbage collection, either on a global or per-actor basis.
Memory leaks in the standard Actor library have led to other Actor implementations. This was the reason for David Pollak and Jonas Bonér's Actor library for Lift, which you can read much more about here: http://blog.lostlake.org/index.php?/archives/96-Migrating-from-Scala-Actors-to-Lift-Actors.html
Have you tried adding a finalize method to see whether it is collected? I think the answer here is that the actors subsystem behaves no differently from how you would expect: it does not cache any reference to your actor except in a thread-local for the duration of processing.
I would therefore expect that your actor is a candidate for collection (assuming the subsystem correctly clears out the ThreadLocal reference after the actor has processed the message which it does indeed appear to do in the Reaction.run method).