Avoid concurrent execution of Akka actor - scala

I'm currently sending a message to an Akka actor every second for it to perform a task.
If that task (function) is still running when a new message is received by the actor, I would like the actor to do nothing. Basically, I want the actor function to be run only if it's not already running.
What's a good way to do this? Should I use Akka actors or do it another way?
Cheers

Actors process a message at a time. The situation you describe cannot happen.

Akka actors process their messages asynchronously one-by-one, so you can only drop/ignore "expired" messages to avoid extra processing and OutOfMemoryException-s due to actor's Mailbox overflow.
You can ignore expired (more than 1 second old in your case) messages inside your actor:
case class DoWork(createdTime: Long = System.currentTimeMillis)
final val messageTimeout = 1000L // one second
def receive = {
case DoWork(createdTime) =>
if((System.currentTimeMillis - createdTime) < messageTimeout) { doWork() }
}
Or you can create a custom Mailbox which can drop expired messages internally.
Of course, as Robin Green already mentioned, actors in general are not supposed to run long-running operations internally so this approach can be suitable only if your actor doesn't need to process other kind of messages (they will not be processed in time). And in case of high CPU demands you may consider to move your actor on a separate dispatcher.

Related

Restore previous actor state after exception

I have an actor which receive Int and accumulate result to the buffer. Some times actor can throw exception. What is the best way to restore previous state of buffer into actor after exception ?
BR.
In order to control what happens you need to set the supervisor strategy on the parent actor.
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
case e: Exception => {
println(s"got exception ~ ${e.getMessage()}")
Resume // Or Continue, Escalate, Restart, Stop
}
}
If you would like to handle the exception then resume the actor then your handler would return Resume
When resuming, the actor state remains unchanged.
In addition to manub's answer, you should also consider separating
the part that accumulates the value, and
the part that may fail
into two actors as follows:
The accumulating actor should create a child actor. This child actor does the job that can potentially fail, and then sends the result of its job as a message back to its parent. The parent actor receives the result of the job and does the accumulating, or a notification that the child actor has failed. This way, you can, for example, restart failed jobs using various supervision strategies.
For more information on such an approach, see for example this answer Handling Faults in Akka actors
In order to be able to recover the state of a stateful actor you should consider Akka Persistence. You will require a durable store (Cassandra or LevelDB for example) to persist state, and the framework will handle restoring the state for you (provided that you have a PersistentActor).
http://doc.akka.io/docs/akka/current/scala/persistence.html explains in detail how to achieve this.
You can set the Supervisor strategy to Resume. When your Actor throws an exception, the Supervisor will handle the exception and your Actor will throw away the current message and move on to the next one. (Newer versions of Akka let you set a maximum number of retires on the same message)
If instead, you would like your Actor to continue processing the message that caused the exception, you can use a try-catch-finally block.

How much processing should be done in an Akka actor ?

I've to do some processing (run some jobs) on a regular interval (10min, 30min, 60min ...). I've different actors for each. I'm sending messages to self inside an actor to trigger the processing. The processing done inside an actor could take anywhere from a 10ms to 30s or more in some cases. My question from an Actor design perspective is how "heavy" can the processing inside an Actor receive be ? Or it doesn't really matter ?
It's fine to have actors that take arbitrarily long to process a message. But you wouldn't want to do that in an actor that is responsible for the timely processing of further messages.
A common pattern is to have a manager actor that receives messages and farms out work to worker actors. It's no problem if the worker actors take a long time, because the manager actor can simply create another worker actor if it needs one.
Think about what needs to be responsive and what will take a long time, and make sure a given actor isn't doing both.
But also, if you have something to do that is going to take 30 seconds, you might want to break it up into multiple actors somehow so that multiple cores can work on it in parallel. Another common pattern is for an actor given a task to look at how big it is and if it is over some threshold, spawn multiple actors and give each part of the job. Each of those actors then does the same thing, deciding whether to do the job itself or split up the work among some child actors. In this way trees of actors are formed on the fly based on the size of the job, and when the job is done the actors disappear.

Using Futures in Akka Actors

I'm just starting to learn Akka Actors in Scala. My understanding is that messages received by an Actor are queued in an Actor's mailbox, and processed one at a time. By processing messages one at a time, concurrency issues (race conditions, deadlocks) are mitigated.
But what happens if the Actor creates a future to do the work associated with a message? Since the future is async, the Actor could begin processing the next several messages while the future associated with the prior message is still running. Wouldn't this potentially create race conditions? How can one safely use futures within an Actor's receive() method to do long running tasks?
The safest way to use futures within an actor is to only ever use pipeTo on the future and send its result as a message to an actor (possibly the same actor).
import akka.pattern.pipe
object MyActor {
def doItAsynchronously(implicit ec: ExecutionContext): Future[DoItResult] = {
/* ... */
}
}
class MyActor extends Actor {
import MyActor._
import context.dispatcher
def receive = {
case DoIt =>
doItAsynchronously.pipeTo(self)
case DoItResult =>
// Got a result from doing it
}
}
This ensures that you won't mutate any state within the actor.
Remember two things
The notion behind the term Future(a special actor) is that we can create an actor for any result while it(the result) is still being computed, started or finished but we can't have an address for that special actor or future.
Suppose I want to buy something (my result is buying something, and the process it to initiate steps to start buying procedure) we can create an actor for the result (buy) but if there is any problem and we can't buy the thing we will have an exception instead of the result.
Now how the above two things fit is explained below-
Say we need to compute the factorial of 1 billion we may think that it will take a lot of time to compute but we get the Future immediately it will not take time to produce the Future (" Since the future is async, the Actor could begin processing the next several messages while the future associated with the prior message is still running."). Next, we can pass this future, store it and assign it.
Hope so you understand what I'm trying to say.
Src : https://www.brianstorti.com/the-actor-model/
If you need to mutate state in futures without blocking incoming messages you might want to reconsider redesigning your actor model. I would introduce separate actors for each task you would use those futures on. After all, an actor's main task is to maintain its state without letting it escape thus providing safe concurrency. Define an actor for those long running task whose responsibility is only to take care of that.
Instead of taking care of the state manually you might want to consider using akka's FSM so you get a much cleaner picture of what changes when. I presonally prefer this approach to ugly variables when I'm dealing with more complex systems.

Akka Kill vs. Stop vs. Poison Pill?

Newbie question of Akka - I'm reading over Akka Essentials, could someone please explain the difference between Akka Stop/Poison Pill vs. Kill ? The book offers just a small explaination "Kill is synchronous vs. Poison pill is asynchronous." But in what way? Does the calling actor thread lock during this time? Are the children actors notified during kill, post-stop envoked, etc? Example uses of one concept vs. the other?
Many thanks!
Both stop and PoisonPill will terminate the actor and stop the message queue. They will cause the actor to cease processing messages, send a stop call to all its children, wait for them to terminate, then call its postStop hook. All further messages are sent to the dead letters mailbox.
The difference is in which messages get processed before this sequence starts. In the case of the stop call, the message currently being processed is completed first, with all others discarded. When sending a PoisonPill, this is simply another message in the queue, so the sequence will start when the PoisonPill is received. All messages that are ahead of it in the queue will be processed first.
By contrast, the Kill message causes the actor to throw an ActorKilledException which gets handled using the normal supervisor mechanism. So the behaviour here depends on what you've defined in your supervisor strategy. The default is to stop the actor. But the mailbox persists, so when the actor restarts it will still have the old messages except for the one that caused the failure.
Also see the 'Stopping an Actor', 'Killing an Actor' section in the docs:
http://doc.akka.io/docs/akka/snapshot/scala/actors.html
And more on supervision strategies:
http://doc.akka.io/docs/akka/snapshot/scala/fault-tolerance.html
Use PoisonPill whenever you can. It is put on the mailbox and is consumed like any other message. You can also use "context.stop(self)" from within an actor.
PoisonPill asynchronously stops the actor after it’s done with all messages that were received into mailbox, prior to PoisonPill.
You can use both actor stop and poison pill to stop processing of actors, and kill to terminate the actor in its entirety.
x.stop is a call you make in akka receive method, will only replace actor state with new actor after calling postStop.
x ! PoisonPill is a method that you pass to actor to stop processing when the actor is running(Recommended). will also replace actor state after calling postStop.
x.kill will terminate the actor and will remove the actor in the actor Path and replace the entire actor with a new actor.

An Actor "queue"?

In Java, to write a library that makes requests to a server, I usually implement some sort of dispatcher (not unlike the one found here in the Twitter4J library: http://github.com/yusuke/twitter4j/blob/master/twitter4j-core/src/main/java/twitter4j/internal/async/DispatcherImpl.java) to limit the number of connections, to perform asynchronous tasks, etc.
The idea is that N number of threads are created. A "Task" is queued and all threads are notified, and one of the threads, when it's ready, will pop an item from the queue, do the work, and then return to a waiting state. If all the threads are busy working on a Task, then the Task is just queued, and the next available thread will take it.
This keeps the max number of connections to N, and allows at most N Tasks to be operating at the same time.
I'm wondering what kind of system I can create with Actors that will accomplish the same thing? Is there a way to have N number of Actors, and when a new message is ready, pass it off to an Actor to handle it - and if all Actors are busy, just queue the message?
Akka Framework is designed to solve this kind of problems, and is exactly what you're looking for.
Look thru this docu - there're lots of highly configurable dispathers (event-based, thread-based, load-balanced, work-stealing, etc.) that manage actors mailboxes, and allow them to work in conjunction. You may also find interesting this blog post.
E.g. this code instantiates new Work Stealing Dispatcher based on the fixed thread pool, that fulfils load balancing among the actors it supervises:
val workStealingDispatcher = Dispatchers.newExecutorBasedEventDrivenWorkStealingDispatcher("pooled-dispatcher")
workStealingDispatcher
.withNewThreadPoolWithLinkedBlockingQueueWithUnboundedCapacity
.setCorePoolSize(16)
.buildThreadPool
Actor that uses the dispatcher:
class MyActor extends Actor {
messageDispatcher = workStealingDispatcher
def receive = {
case _ =>
}
}
Now, if you start 2+ instances of the actor, dispatcher will balance the load between the mailboxes (queues) of the actors (actor that has too much messages in the mailbox will "donate" some to the actors that has nothing to do).
Well, you have to see about the actors scheduler, as actors are not usually 1-to-1 with threads. The idea behind actors is that you may have many of them, but the actual number of threads will be limited to something reasonable. They are not supposed to be long running either, but rather quickly answering to messages they receive. In short, the architecture of that code seems to be wholly at odds with how one would design an actor system.
Still, each working actor may send a message to a Queue actor asking for the next task, and then loop back to react. This Queue actor would receive either queueing messages, or dequeuing messages. It could be designed like this:
val q: Queue[AnyRef] = new Queue[AnyRef]
loop {
react {
case Enqueue(d) => q enqueue d
case Dequeue(a) if q.nonEmpty => a ! (q dequeue)
}
}