Akka + WithinTimeRange - scala

I've testing the fault tolerant system of akka and so far it's been good when talking about retrying to send a msg according the maxNrOfRetries specified.
However, it does not restart the actor within the given time range, it restarts all at once, ignoring the within time range.
I tried with AllForOneStrategy and OneForOneStrategy but does not change anything.
Trying to follow this blog post: http://letitcrash.com/post/23532935686/watch-the-routees, this is the code I've been working.
class Supervisor extends Actor with ActorLogging {
var replyTo: ActorRef = _
val child = context.actorOf(
Props(new Child)
.withRouter(
RoundRobinPool(
nrOfInstances = 5,
supervisorStrategy =
AllForOneStrategy(maxNrOfRetries = 3, withinTimeRange = 10.second) {
case _: NullPointerException => Restart
case _: Exception => Escalate
})), name = "child-router")
child ! GetRoutees
def receive = {
case RouterRoutees(routees) =>
routees foreach context.watch
case "start" =>
replyTo = sender()
child ! "error"
case Terminated(actor) =>
replyTo ! -1
context.stop(self)
}
}
class Child extends Actor with ActorLogging {
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info("***** RESTARTING *****")
message foreach{ self forward }
}
def receive = LoggingReceive {
case "error" =>
log.info("***** GOT ERROR *****")
throw new NullPointerException
}
}
object Boot extends App {
val system = ActorSystem()
val supervisor = system.actorOf(Props[Supervisor], "supervisor")
supervisor ! "start"
}
Am I doing anything wrong to accomplish that?
EDIT
Actually, I misunderstood the purpose of the withinTimeRange.
To schedule my retries in a time range, I'm doing the following:
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info("***** RESTARTING *****")
message foreach { msg =>
context.system.scheduler.scheduleOnce(30.seconds, self, msg)
}
}
It seems to work ok.

I think you have misunderstood the purpose of the withinTimeRange arg. That value is supposed to be used in conjunction with maxNrOfRetries to provide a window in which to support the limiting of the number of retries. For example, as you have specified, the implication is that the supervisor will no longer restart an individual child if that child needs to be restarted more than 3 times in 10 seconds.

From docs:
maxNrOfRetries - the number of times a child actor is allowed to be
restarted, negative value means no limit, if the limit is exceeded the
child actor is stopped
withinTimeRange - duration of the time window
for maxNrOfRetries, Duration.Inf means no window
Your code means that when any child fails with NullPointerException more than 3 times within 10 seconds it will not be restarted again. Because of AllForOneStrategy after first Routee fails all routees are restarted. And because you've overridden preRestart to resend failed message this situation repeats again until reaches 3 failures within 10 seconds(which is achieved in less than a second).

Related

Send message to actor after restart from Supervisor

I am using BackoffSupervisor strategy to create a child actor that has to process some message. I want to implement a very simple restart strategy, in which in case of exception:
Child propagates failing message to supervisor
Supervisor restarts child and sends the failing message again.
Supervisor gives up after 3 retries
Akka persistence is not an option
So far what I have is this:
Supervisor definition:
val childProps = Props(new SenderActor())
val supervisor = BackoffSupervisor.props(
Backoff.onFailure(
childProps,
childName = cmd.hashCode.toString,
minBackoff = 1.seconds,
maxBackoff = 2.seconds,
randomFactor = 0.2
)
.withSupervisorStrategy(
OneForOneStrategy(maxNrOfRetries = 3, loggingEnabled = true) {
case msg: MessageException => {
println("caught specific message!")
SupervisorStrategy.Restart
}
case _: Exception => SupervisorStrategy.Restart
case _ ⇒ SupervisorStrategy.Escalate
})
)
val sup = context.actorOf(supervisor)
sup ! cmd
Child actor that is supposed to send the e-mail, but fails (throws some Exception) and propagates Exception back to supervisor:
class SenderActor() extends Actor {
def fakeSendMail():Unit = {
Thread.sleep(1000)
throw new Exception("surprising exception")
}
override def receive: Receive = {
case cmd: NewMail =>
println("new mail received routee")
try {
fakeSendMail()
} catch {
case t => throw MessageException(cmd, t)
}
}
}
In the above code I wrap any exception into custom class MessageException that gets propagated to SupervisorStrategy, but how to propagate it further to the new child to force reprocessing? Is this the right approach?
Edit. I attempted to resent the message to the Actor on preRestart hook, but somehow the hook is not being triggered:
class SenderActor() extends Actor {
def fakeSendMail():Unit = {
Thread.sleep(1000)
// println("mail sent!")
throw new Exception("surprising exception")
}
override def preStart(): Unit = {
println("child starting")
}
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
reason match {
case m: MessageException => {
println("aaaaa")
message.foreach(self ! _)
}
case _ => println("bbbb")
}
}
override def postStop(): Unit = {
println("child stopping")
}
override def receive: Receive = {
case cmd: NewMail =>
println("new mail received routee")
try {
fakeSendMail()
} catch {
case t => throw MessageException(cmd, t)
}
}
}
This gives me something similar to following output:
new mail received routee
caught specific message!
child stopping
[ERROR] [01/26/2018 10:15:35.690]
[example-akka.actor.default-dispatcher-2]
[akka://example/user/persistentActor-4-scala/$a/1962829645] Could not
process message sample.persistence.MessageException:
Could not process message <stacktrace>
child starting
But no logs from preRestart hook
The reason that the child's preRestart hook is not invoked is because Backoff.onFailure uses BackoffOnRestartSupervisor underneath the covers, which replaces the default restart behavior with a stop-and-delayed-start behavior that is consistent with the backoff policy. In other words, when using Backoff.onFailure, when a child is restarted, the child's preRestart method is not called because the underlying supervisor actually stops the child, then starts it again later. (Using Backoff.onStop can trigger the child's preRestart hook, but that's tangential to the present discussion.)
The BackoffSupervisor API doesn't support the automatic resending of a message when the supervisor's child restarts: you have to implement this behavior yourself. An idea for retrying messages is to let the BackoffSupervisor's supervisor handle it. For example:
val supervisor = BackoffSupervisor.props(
Backoff.onFailure(
...
).withReplyWhileStopped(ChildIsStopped)
).withSupervisorStrategy(
OneForOneStrategy(maxNrOfRetries = 3, loggingEnabled = true) {
case msg: MessageException =>
println("caught specific message!")
self ! Error(msg.cmd) // replace cmd with whatever the property name is
SupervisorStrategy.Restart
case ...
})
)
val sup = context.actorOf(supervisor)
def receive = {
case cmd: NewMail =>
sup ! cmd
case Error(cmd) =>
timers.startSingleTimer(cmd.id, Replay(cmd), 10.seconds)
// We assume that NewMail has an id field. Also, adjust the time as needed.
case Replay(cmd) =>
sup ! cmd
case ChildIsStopped =>
println("child is stopped")
}
In the above code, the NewMail message embedded in the MessageException is wrapped in a custom case class (in order to easily distinguish it from a "normal"/new NewMail message) and sent to self. In this context, self is the actor that created the BackoffSupervisor. This enclosing actor then uses a single timer to replay the original message at some point. This point in time should be far enough in the future such that the BackoffSupervisor can potentially exhaust SenderActor's restart attempts, so that the child can have ample opportunity to get in a "good" state before it receives the resent message. Obviously this example involves only one message resend regardless of the number of child restarts.
Another idea is to create a BackoffSupervisor-SenderActor pair for every NewMail message, and have the SenderActor send the NewMail message to itself in the preStart hook. One concern with this approach is the cleaning up of resources; i.e., shutting down the BackoffSupervisors (which will, in turn, shut down their respective SenderActor children) when the processing is successful or when the child restarts are exhausted. A map of NewMail ids to (ActorRef, Int) tuples (in which the ActorRef is a reference to a BackoffSupervisor actor, and the Int is the number of restart attempts) would be helpful in this case:
class Overlord extends Actor {
var state = Map[Long, (ActorRef, Int)]() // assuming the mail id is a Long
def receive = {
case cmd: NewMail =>
val childProps = Props(new SenderActor(cmd, self))
val supervisor = BackoffSupervisor.props(
Backoff.onFailure(
...
).withSupervisorStrategy(
OneForOneStrategy(maxNrOfRetries = 3, loggingEnabled = true) {
case msg: MessageException =>
println("caught specific message!")
self ! Error(msg.cmd)
SupervisorStrategy.Restart
case ...
})
)
val sup = context.actorOf(supervisor)
state += (cmd.id -> (sup, 0))
case ProcessingDone(cmdId) =>
state.get(cmdId) match {
case Some((backoffSup, _)) =>
context.stop(backoffSup)
state -= cmdId
case None =>
println(s"${cmdId} not found")
}
case Error(cmd) =>
val cmdId = cmd.id
state.get(cmdId) match {
case Some((backoffSup, numRetries)) =>
if (numRetries == 3) {
println(s"${cmdId} has already been retried 3 times. Giving up.")
context.stop(backoffSup)
state -= cmdId
} else
state += (cmdId -> (backoffSup, numRetries + 1))
case None =>
println(s"${cmdId} not found")
}
case ...
}
}
Note that SenderActor in the above example takes a NewMail and an ActorRef as constructor arguments. The latter argument allows the SenderActor to send a custom ProcessingDone message to the enclosing actor:
class SenderActor(cmd: NewMail, target: ActorRef) extends Actor {
override def preStart(): Unit = {
println(s"child starting, sending ${cmd} to self")
self ! cmd
}
def fakeSendMail(): Unit = ...
def receive = {
case cmd: NewMail => ...
}
}
Obviously the SenderActor is set up to fail every time with the current implementation of fakeSendMail. I'll leave the additional changes needed in SenderActor to implement the happy path, in which SenderActor sends a ProcessingDone message to target, to you.
In the good solution that #chunjef provides, he alert about the risk of schedule a job resend before the backoff supervisor has started the worker
This enclosing actor then uses a single timer to replay the original message at some point. This point in time should be far enough in the future such that the BackoffSupervisor can potentially exhaust SenderActor's restart attempts, so that the child can have ample opportunity to get in a "good" state before it receives the resent message.
If this happens, the scenario will be jobs going to dead letters and no further progress will be done.
I've made a simplified fiddle with this scenario.
So, the schedule delay should be larger than the maxBackoff, and this could represent an impact in job completion time.
A possible solution to avoid this scenario is making the worker actor to send a message to his father when is ready to work, like here.
The failed child actor is available as the sender in your supervisor strategy. Quoting https://doc.akka.io/docs/akka/current/fault-tolerance.html#creating-a-supervisor-strategy:
If the strategy is declared inside the supervising actor (as opposed
to within a companion object) its decider has access to all internal
state of the actor in a thread-safe fashion, including obtaining a
reference to the currently failed child (available as the sender of
the failure message).
Sending emails is a dangerous operation with some third party software in your case. Why not to apply Circuit Breaker pattern and skip the sender actor entirely? Also, you can still have an actor (with some Backoff Supervisor) and Circuit Breaker inside it (if that makes sense for you).

Retry on minor Exceptions for a long-living akka actor

I have an actor that is created at application startup as a child of another actor and receives a message once per day from the parent to perform operation to fetch some files from some SFTP server.
Now, there might be some minor temporary connection exceptions that cause the operation to fail. In this case, a retry is needed.
But there might be a case in which exception is thrown and is not going to be resolved on a retry (ex: file not found, some configuration is improper etc.)
So, in this case what could be an appropriate retry mechanism and supervision strategy considering that the actor will receive messages after a long interval (once a day).
In this case, the message sent to the actor is not bad input - it is just a trigger. Example:
case object FileFetch
If I have a supervision strategy in the parent like this, it is going to restart the failing child on every minor/major exception without retries.
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = -1, withinTimeRange = Duration.inf) {
case _: Exception => Restart
}
What I want to have is something like this:
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = -1, withinTimeRange = Duration.inf) {
case _: MinorException => Retry same message 2, 3 times and then Restart
case _: Exception => Restart
}
"Retrying" or resending a message in the event of an exception is something that you have to implement yourself. From the documentation:
If an exception is thrown while a message is being processed (i.e. taken out of its mailbox and handed over to the current behavior), then this message will be lost. It is important to understand that it is not put back on the mailbox. So if you want to retry processing of a message, you need to deal with it yourself by catching the exception and retry[ing] your flow. Make sure that you put a bound on the number of retries since you don’t want a system to livelock (so consuming a lot of cpu cycles without making progress).
If you want to resend the FileFetch message to the child in the event of a MinorException without restarting the child, then you could catch the exception in the child to avoid triggering the supervision strategy. In the try-catch block, you could send a message to the parent and have the parent track the number of retries (and perhaps include a timestamp in this message, if you want the parent to enact some kind of backoff policy, for example). In the child:
def receive = {
case FileFetch =>
try {
...
} catch {
case m: MinorException =>
val now = System.nanoTime
context.parent ! MinorIncident(self, now)
}
case ...
}
In the parent:
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = -1, withinTimeRange = Duration.Inf) {
case _: Exception => Restart
}
var numFetchRetries = 0
def receive = {
case MinorIncident(fetcherRef, time) =>
log.error(s"${fetcherRef} threw a MinorException at ${time}")
if (numFetchRetries < 3) { // possibly use the time in the retry logic; e.g., a backoff
numFetchRetries = numFetchRetries + 1
fetcherRef ! FileFetch
} else {
numFetchRetries = 0
context.stop(fetcherRef)
... // recreate the child
}
case SomeMsgFromChildThatFetchSucceeded =>
numFetchRetries = 0
case ...
}
Alternatively, instead of catching the exception in the child, you could set the supervisor strategy to Resume the child in the event of a MinorException, while still having the parent handle the message retry logic:
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = -1, withinTimeRange = Duration.Inf) {
case m: MinorException =>
val child = sender()
val now = System.nanoTime
self ! MinorIncident(child, now)
Resume
case _: Exception => Restart
}

Akka Router increment counter on message arrival from routees

I'm trying to keep counting on each successful import. But here is a problem - Counter works if the router receives a message from its parent but if I'm trying to send a message from its children it receives it but doesn't update the global variable that is out of the scope.
I know it sounds complicated. Let me show you the code.
Here is the router
class Watcher(size: Int) extends Actor {
var router = {
val routees = Vector.fill(size) {
val w = context.actorOf(
Props[Worker]
)
context.watch(w)
ActorRefRoutee(w)
}
Router(RoundRobinRoutingLogic(), routees)
}
var sent = 0
override def supervisorStrategy(): SupervisorStrategy = OneForOneStrategy(maxNrOfRetries = 100) {
case _: DocumentNotFoundException => {
Resume
}
case _: Exception => Escalate
}
override def receive: Receive = {
case container: MessageContainer =>
router.route(container, sender)
case Success =>
sent += 1
case GetValue =>
sender ! sent
case Terminated(a) =>
router.removeRoutee(a)
val w = context.actorOf(Props[Worker])
context.watch(w)
router = router.addRoutee(w)
case undef =>
println(s"${this.getClass} received undefinable message: $undef")
}
}
Here is the worker
class Worker() extends Actor with ActorLogging {
var messages = Seq[MessageContainer]()
var received = 0
override def receive: Receive = {
case container: MessageContainer =>
try {
importMessage(container.message, container.repo)
context.parent ! Success
} catch {
case e: Exception =>
throw e
}
case e: Error =>
log.info(s"Error occurred $e")
sender ! e
case undef => println(s"${this.getClass} received undefinable message: $undef")
}
}
So on supervisor ? GetValue I get 0 but suppose to have 1000.The strangest thing is that when I debug it with the breakpoint right on the case Success => ... the value is incremented every time the new message arrives. But supervisor ? GetValue still returns 0.
Let's assume I want to count on case container: MessageContainer => ... and it will magically work; I'll get desirable number, but it doesn't show if I actually imported anything. What's going on?
Here is the test case.
#Test
def testRouter(): Unit = {
val system = ActorSystem("RouterTestSystem")
// val serv = AddressFromURIString("akka.tcp://master#host:1334")
val supervisor = system.actorOf(Props(new Watcher(20)))//.withDeploy(akka.actor.Deploy(scope = RemoteScope(serv))))
val repo = coreSession.getRepositoryName
val containers = (0 until num)
.map(_ => MessageContainer(MessageFactory.generate("/"), repo))
val watch = Stopwatch.createStarted()
(0 until num).par
.foreach( i => {
supervisor ! containers.apply(i)
})
implicit val timeout = Timeout(60 seconds)
val future = supervisor ? GetValue
val result = Await.result(future, timeout.duration).asInstanceOf[Int]
val speed = result / (watch.elapsed(TimeUnit.MILLISECONDS) / 1000.0)
println(f"Import speed: $speed%.2f")
assertEquals(num, result)
}
Can you please explained it in details. Why is it happening? Why only on message received from the children? Another approach?
Well... there can be many potential problems hidden in the parts of code that you have not shared. But, for the sake of this discussion I will assume that everything else is fine and we will just discuss problems with your shared code.
Now, let me explain a bit about Actors. To put things simply, every actor has a mailbox (where it keeps messages in the sequence they were received) and processes them one by one in the order they were received. Since the mailbox is used like a Queue we will refer to it as a Queue in this discussion.
Also... I don't know what this container.apply(i) is going to return... so I will refer to the return value of that container.apply(1) as MessageContainer__1
In your test runner you are first creating an instance of Watcher,
val supervisor = system.actorOf(Props(new Watcher(20)))
Now, lets say that you are sending these 2 messages (num = 2) to supervisor,
So supervisor's mailbox will look something like,
Queue(MessageContainer__0, MessageContainer__1)
Then you send it another message GetValue so the mailbox will look like,
Queue(MessageContainer__0, MessageContainer__1, GetValue)
Now the actor will process the first message and pass it to the workers, the mail-box will look like,
Queue(MessageContainer__1, GetValue)
Now even if your worker is ultra-fast and instantaneous in sending the reply the mailbox will look like,
Queue(MessageContainer__1, GetValue, Success)
And now since your worker super-ultra-fast and instantaneously replies with a Success, the state after passing the second MessageContainer will look like,
Queue(GetValue, Success, Success)
And... here is the root of your problem. The Supervisor sees the GetValue massage before any Success messages, no matter how fast your workers are.
And thus it will process GetValue and reply with current value of sent which is 0.

Akka Supervisor Strategy - Correct Use Case

I have been using Akka Supervisor Strategy to handle business logic exceptions.
Reading one of the most famous Scala blog series Neophyte, I found him giving a different purpose for what I have always been doing.
Example:
Let's say I have an HttpActor that should contact an external resource and in case it's down, I will throw an Exception, for now a ResourceUnavailableException.
In case my Supervisor catches that, I will call a Restart on my HttpActor, and in my HttpActor preRestart method, I will call do a schedulerOnce to retry that.
The actor:
class HttpActor extends Actor with ActorLogging {
implicit val system = context.system
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Restarting Actor due: ${reason.getCause}")
message foreach { msg =>
context.system.scheduler.scheduleOnce(10.seconds, self, msg)
}
}
def receive = LoggingReceive {
case g: GetRequest =>
doRequest(http.doGet(g), g.httpManager.url, sender())
}
A Supervisor:
class HttpSupervisor extends Actor with ActorLogging with RouterHelper {
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = 5) {
case _: ResourceUnavailableException => Restart
case _: Exception => Escalate
}
var router = makeRouter[HttpActor](5)
def receive = LoggingReceive {
case g: GetRequest =>
router.route(g, sender())
case Terminated(a) =>
router = router.removeRoutee(a)
val r = context.actorOf(Props[HttpActor])
context watch r
router = router.addRoutee(r)
}
}
What's the point here?
In case my doRequest method throws the ResourceUnavailableException, the supervisor will get that and restart the actor, forcing it to resend the message after some time, according to the scheduler. The advantages I see is the fact I get for free the number of retries and a nice way to handle the exception itself.
Now looking at the blog, he shows a different approach in case you need a retry stuff, just sending messages like this:
def receive = {
case EspressoRequest =>
val receipt = register ? Transaction(Espresso)
receipt.map((EspressoCup(Filled), _)).recover {
case _: AskTimeoutException => ComebackLater
} pipeTo(sender)
case ClosingTime => context.system.shutdown()
}
Here in case of AskTimeoutException of the Future, he pipes the result as a ComebackLater object, which he will handle doing this:
case ComebackLater =>
log.info("grumble, grumble")
context.system.scheduler.scheduleOnce(300.millis) {
coffeeSource ! EspressoRequest
}
For me this is pretty much what you can do with the strategy supervisor, but in a manually way, with no built in number of retries logic.
So what is the best approach here and why? Is my concept of using akka supervisor strategy completely wrong?
You can use BackoffSupervisor:
Provided as a built-in pattern the akka.pattern.BackoffSupervisor implements the so-called exponential backoff supervision strategy, starting a child actor again when it fails, each time with a growing time delay between restarts.
val supervisor = BackoffSupervisor.props(
Backoff.onFailure(
childProps,
childName = "myEcho",
minBackoff = 3.seconds,
maxBackoff = 30.seconds,
randomFactor = 0.2 // adds 20% "noise" to vary the intervals slightly
).withAutoReset(10.seconds) // the child must send BackoffSupervisor.Reset to its parent
.withSupervisorStrategy(
OneForOneStrategy() {
case _: MyException => SupervisorStrategy.Restart
case _ => SupervisorStrategy.Escalate
}))

Handling Faults in Akka actors

I've a very simple example where I've an Actor (SimpleActor) that perform a periodic task by sending a message to itself. The message is scheduled in the constructor for the actor. In the normal case (i.e., without faults) everything works fine.
But what if the Actor has to deal with faults. I've another Actor (SimpleActorWithFault). This actor could have faults. In this case, I'm generating one myself by throwing an exception. When a fault happens (i.e., SimpleActorWithFault throws an exception) it is automatically restarted. However, this restart messes up the scheduler inside the Actor which no longer functions as excepted. And if the faults happens rapidly enough it generates more unexpected behavior.
My question is what's the preferred way to dealing with faults in such cases? I know I can use Try blocks to handle exceptions. But what if I'm extending another actor where I cannot put a Try in the super class or some case when I'm an excepted fault happens in the actor.
import akka.actor.{Props, ActorLogging}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import akka.actor.Actor
case object MessageA
case object MessageToSelf
class SimpleActor extends Actor with ActorLogging {
//schedule a message to self every second
context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf)
//keeps track of some internal state
var count: Int = 0
def receive: Receive = {
case MessageA => {
log.info("[SimpleActor] Got MessageA at %d".format(count))
}
case MessageToSelf => {
//update state and tell the world about its current state
count = count + 1
log.info("[SimpleActor] Got scheduled message at %d".format(count))
}
}
}
class SimpleActorWithFault extends Actor with ActorLogging {
//schedule a message to self every second
context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf)
var count: Int = 0
def receive: Receive = {
case MessageA => {
log.info("[SimpleActorWithFault] Got MessageA at %d".format(count))
}
case MessageToSelf => {
count = count + 1
log.info("[SimpleActorWithFault] Got scheduled message at %d".format(count))
//at some point generate a fault
if (count > 5) {
log.info("[SimpleActorWithFault] Going to throw an exception now %d".format(count))
throw new Exception("Excepttttttiooooooon")
}
}
}
}
object MainApp extends App {
implicit val akkaSystem = akka.actor.ActorSystem()
//Run the Actor without any faults or exceptions
akkaSystem.actorOf(Props(classOf[SimpleActor]))
//comment the above line and uncomment the following to run the actor with faults
//akkaSystem.actorOf(Props(classOf[SimpleActorWithFault]))
}
The correct way is to push down the risky behavior into it's own actor. This pattern is called the Error Kernel pattern (see Akka Concurrency, Section 8.5):
This pattern describes a very common-sense approach to supervision
that differentiates actors from one another based on any volatile
state that they may hold.
In a nutshell, it means that actors whose state is precious should not
be allowed to fail or restart. Any actor that holds precious data is
protected such that any risky operations are relegated to a slave
actor who, if restarted, only causes good things to happen.
The error kernel pattern implies pushing levels of risk further down
the tree.
See also another tutorial here.
So in your case it would be something like this:
SimpleActor
|- ActorWithFault
Here SimpleActor acts as a supervisor for ActorWithFault. The default supervising strategy of any actor is to restart a child on Exception and escalate on anything else:
http://doc.akka.io/docs/akka/snapshot/scala/fault-tolerance.html
Escalating means that the actor itself may get restarted. Since you really don't want to restart SimpleActor you could make it always restart the ActorWithFault and never escalate by overriding the supervisor strategy:
class SimpleActor {
override def preStart(){
// our faulty actor --- we will supervise it from now on
context.actorOf(Props[ActorWithFault], "FaultyActor")
...
override val supervisorStrategy = OneForOneStrategy () {
case _: ActorKilledException => Escalate
case _: ActorInitializationException => Escalate
case _ => Restart // keep restarting faulty actor
}
}
To avoid messing up the scheduler:
class SimpleActor extends Actor with ActorLogging {
private var cancellable: Option[Cancellable] = None
override def preStart() = {
//schedule a message to self every second
cancellable = Option(context.system.scheduler.schedule(0 seconds, 1 seconds, self, MessageToSelf))
}
override def postStop() = {
cancellable.foreach(_.cancel())
cancellable = None
}
...
To correctly handle exceptions (akka.actor.Status.Failure is for correct answer to an ask in case of Ask pattern usage by sender):
...
def receive: Receive = {
case MessageA => {
try {
log.info("[SimpleActor] Got MessageA at %d".format(count))
} catch {
case e: Exception =>
sender ! akka.actor.Status.Failure(e)
log.error(e.getMessage, e)
}
}
...