Akka 2.1 exception handling (Scala)

I have a supervising Akka actor which uses a router to forward messages to worker actors.
I have a class that wraps the supervisor. When I call a method on that class, it "asks" the supervisor to do something, and I then use Await.result(theFuture) to wait for the result (I cannot continue without it).
If the workers throw an exception, I want to restart the worker which threw the exception, and I want the exception to be caught by the code which calls the wrapper class.
I passed a OneForOneStrategy to the router constructor, which returns RESTART in the case of an Exception. In the postRestart method of the worker, I log the restart, so I can validate that the worker is actually restarted.
When the worker throws an exception, it gets restarted, but the exception disappears. The Future that results from asking the supervisor contains an exception, but it is an akka.pattern.AskTimeoutException, which is thrown after just 5 seconds rather than the 20 seconds of the implicit timeout that I have in scope. The exception actually occurs less than a second after the worker starts.
Question 1: how can I get the exception from the worker in the code which calls my wrapper class?
Also, the receive method of the worker is like this:
def receive = {
  case r: Request =>
    val response = ??? //throws an exception sometimes
    sender ! response
}
Something is logging the exception to the console, but it isn't my code. The stack trace is:
[ERROR] [02/11/2013 21:34:20.093] [MySystem-akka.actor.default-dispatcher-9]
[akka://MySystem/user/MySupervisor/MyRouter/$a] Something went wrong!
at myApp.Worker.$$anonfun$receive$1.applyOrElse(Source.scala:169)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:425)
at akka.actor.ActorCell.invoke(ActorCell.scala:386)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:230)
at akka.dispatch.Mailbox.run(Mailbox.scala:212)
at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:502)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:262)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1478)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Line 169 of Source.scala is the line val response = ??? shown in the listing of the receive method above.
Question 2: who is logging that exception to the console, and how can I stop it?

1)
try somethingThatCanFail() catch {
  case e: Exception => sender ! Status.Failure(e); throw e
}
The "tell failure" causes the caller to get a Failure containing the exception. Throwing e causes the OneForOneStrategy to be invoked, which restarts the worker.
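A fuller version of the same idea might look like the sketch below. It assumes the classic Akka actor API in a version where sender() is written with parentheses (2.3 or later); the Request payload and the compute helper are hypothetical stand-ins for the real work.

```scala
import akka.actor.{Actor, Status}

// Hypothetical message type, named after the Request in the question.
final case class Request(payload: String)

class Worker extends Actor {
  // Hypothetical stand-in for the work that sometimes throws.
  private def compute(r: Request): String =
    if (r.payload.isEmpty) throw new IllegalArgumentException("empty payload")
    else r.payload.toUpperCase

  def receive: Receive = {
    case r: Request =>
      try sender() ! compute(r)
      catch {
        case e: Exception =>
          // Fails the asker's Future with e instead of letting it time out.
          sender() ! Status.Failure(e)
          // Rethrow so that the supervisor strategy still sees the failure.
          throw e
      }
  }
}
```

With this in place, Await.result on the asking side should rethrow the worker's exception rather than an AskTimeoutException, while the supervisor still decides whether to restart the worker.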
2)
It is the actor system itself that logs the failure, and the only way to quiet it down is to filter things out by creating and configuring your own LoggingAdapter, as described here:
http://doc.akka.io/docs/akka/2.1.0/scala/logging.html
There is a ticket for changing this, but it is targeted for Akka 2.2:
https://www.assembla.com/spaces/akka/tickets/2824
Answered by https://groups.google.com/forum/#!topic/akka-user/fenCvYu3HYE

In order to be notified of one of your children failing, you need to first watch the child. You will then be sent a Terminated() message, with a reference to the actor, when it dies.
Something like:
class ParentActor extends Actor {
  // Sample of how to watch for the death of one of your children.
  val childActor = context.actorOf(Props[SomeService], "SomeService")
  // context.watch returns the ActorRef it was given, so it can be matched on below.
  val dyingChild = context.watch(childActor)
  def receive = {
    case Terminated(`dyingChild`) =>
      println("dyingChild died")
    case Terminated(terminatedActor) =>
      println(s"This child just died $terminatedActor")
  }
}
Hope this helps.

Related

Don't immediately stop child Akka actors when parent actor stopped

I have an actor p: ParentActor which I want to do some cleanup jobs when it is stopped by making use of the postStop method. One of the jobs involves sending a message to a child actor c: ChildActor. After that, c and then p should be stopped. However, once context.stop(p) has been called c seems to be immediately stopped and is unable to receive messages.
Below is an example of what I'd like to do:
class ParentActor extends Actor {
  ...
  override def postStop {
    logger.info("Shutdown message received. Sending message to child...")
    val future = c ? "doThing"
    logger.info("Waiting for child to complete task...")
    Await.result(future, Duration.Inf)
    context.stop(c)
    logger.info("Child stopped. Stopping self. Bye!")
    context.stop(self)
  }
}
Results in error message:
[ERROR] [SetSystem-akka.actor.default-dispatcher-5] [akka://SetSystem/user/ParentActor] Recipient[Actor[akka://SetSystem/user/ParentActor/ChildActor#70868360]] had already been terminated. Sender[Actor[akka://SetSystem/user/ParentActor#662624985]] sent the message of type "java.lang.String".
An alternative would be to send p a message telling it to shut down, and have the above actions take place as a result of that, but using the built-in stopping functionality seems better.
PS. This is a new application so design alternatives are welcome.

When an actor A is stopped, its children are indeed stopped before A's postStop hook is called. The sequence of events when an actor is stopped is as follows (from the official documentation):
Termination of an actor proceeds in two steps: first the actor suspends its mailbox processing and sends a stop command to all its children, then it keeps processing the internal termination notifications from its children until the last one is gone, finally terminating itself (invoking postStop, dumping mailbox, publishing Terminated on the DeathWatch, telling its supervisor).
Overriding the parent's postStop won't help you because your desired shutdown procedure includes sending a message to a child and waiting for a reply. When the parent is terminated, the child is stopped before the parent's postStop is run.
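The ordering can be observed with a small experiment. The sketch below assumes the classic Akka actor API; StopOrder, LoggingChild and LoggingParent are hypothetical names introduced only for this illustration.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import akka.actor.{Actor, Props}

object StopOrder {
  // Records lifecycle events in the order they happen (thread-safe).
  val events = new ConcurrentLinkedQueue[String]()
}

class LoggingChild extends Actor {
  override def postStop(): Unit = StopOrder.events.add("child postStop")
  def receive: Receive = Actor.emptyBehavior
}

class LoggingParent(childProps: Props) extends Actor {
  // The child is created when the parent starts.
  context.actorOf(childProps, "child")
  // By the time this hook runs, the child has already terminated.
  override def postStop(): Unit = StopOrder.events.add("parent postStop")
  def receive: Receive = Actor.emptyBehavior
}
```

Stopping a LoggingParent should record the child's postStop before the parent's, which is why a message sent to the child from the parent's postStop can no longer be delivered.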
As you mentioned, sending a specific message to the ParentActor to initiate the shutdown is another approach. Something like the following would work:
import akka.pattern.{ask, pipe}
class ParentActor extends Actor {
  ...
  def receive = {
    case Shutdown =>
      // Needs an implicit Timeout and ExecutionContext in scope.
      (c ? DoYourThing).mapTo[ThingIsDone].pipeTo(self)
    case ThingIsDone =>
      logger.info("Child has finished its task.")
      context.stop(self)
    case ...
  }
}
Note that one should avoid using Await in an actor. Instead, pipe the result of a Future to an actor.

How to get the actorRef of the failing actor in a OneForOneStrategy setup?

How can I get the ActorRef or name of the failing node? I need to restart the node if the exception happens once. If the exception happens more than once, I need to resume.
My thought was to have the supervisor store how many times a given node has had the exception - but I can't figure out which node failed. Maybe the approach is bad.
x would be the count of times the failing node had a given exception.
OneForOneStrategy() {
  case _: FileNotFoundException =>
    // Need to know how many times node n has had this exception and restart/resume as required.
    if(x == 1)
      Restart
    else
      Resume
  case _: Exception => Stop
}

You could catch the FileNotFoundException and throw a CustomException that has an ActorRef field set to the failing actor (self). Then, in your OneForOneStrategy, you catch the CustomException and read the ActorRef field to get a reference to the failing actor. With that reference, you can check the exception count inside your supervisor as you described and decide whether to restart or resume.

Actually, inside the supervisor's Decider you can obtain the failing child's ActorRef by calling sender().
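Combining sender() with a per-child counter could look like the following sketch. It assumes the classic Akka actor API and a strategy declared inside the supervising actor, so that the Decider can safely touch the actor's state; CountingSupervisor, childProps and the "node" child name are hypothetical.

```scala
import java.io.FileNotFoundException
import akka.actor.{Actor, ActorRef, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Restart, Resume, Stop}

class CountingSupervisor(childProps: Props) extends Actor {
  // Number of FileNotFoundExceptions seen so far, keyed by the failing child.
  private var failures = Map.empty[ActorRef, Int]

  private val child = context.actorOf(childProps, "node")

  override val supervisorStrategy: SupervisorStrategy = OneForOneStrategy() {
    case _: FileNotFoundException =>
      val failed = sender() // the child that reported the failure
      val count = failures.getOrElse(failed, 0) + 1
      failures = failures.updated(failed, count)
      // First failure of this child: restart it. Later failures: resume it.
      if (count == 1) Restart else Resume
    case _: Exception => Stop
  }

  // Everything else is simply passed on to the child.
  def receive: Receive = { case msg => child.forward(msg) }
}
```

A restarted child keeps its ActorRef, so the count survives the restart and the second failure of the same child is resumed.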

Sending back to sender, from supervisor, in case of failure

I have an Actor which acts as a Supervisor, but also needs to "return" data to the caller; whether the caller is an actor or not shouldn't matter.
I am asking my Supervisor, let's call him SV.
SV Processes the message I send to him, and sends back a response.
val system = ActorSystem("ActorSystem")
val sv = system.actorOf(Props[SV], name = "SV")
sv ? msg
And SV's receive method looks like this:
def receive = {
  case msg => (someChild ? msg).pipeTo(sender)
  ...
}
This all works jolly fine.
The problem is when the child throws an exception, this exception is caught by the supervisor strategy.
override def supervisorStrategy = OneForOneStrategy() {
  case e : Throwable => {
    val newResponse = someNewResponse
    sender ! newResponse
    ...
  }
}
sender is no longer a reference to whoever called SV in the first place, and I can't work out how to send a message back to the asker and get back to my original flow.

One of the three Actor rules is: “An Actor can send a finite number of messages to other Actors it knows.” The last two words are critical here: if the supervisor has not been introduced to the original sender somehow and the failure (exception) itself does not contain the sender’s reference either, then the supervisor cannot possibly send a message to the sender. Either you catch all exceptions in the child actor, wrap them together with the sender in your own exception type and then rethrow, or the original message needs to travel via the supervisor to the child and back, so that the supervisor can see that a reply is outstanding when a failure occurs.

Are you using the supervisor strategy and exceptions to control the flow of your data? Consider using the type system (an Option type or a Failure on the response Future) for "exception" cases in your child actors, and handle the response flow without exceptions.
The supervisor strategy is for unhandled exceptions.
When an unhandled exception occurs, you lose the ability to respond with a message to the sender, unless you wrap the sender in the unhandled exception as Roland Kuhn suggests.
Instead, let the supervisor strategy decide how the actor system should respond to unhandled exceptions by mapping them to a Directive.
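As an illustration of modelling expected failures as values, here is a minimal sketch in plain Scala, independent of Akka. It uses a small sealed reply type rather than Option or a failed Future; Reply, Found, NotFound, lookup and describe are all hypothetical names.

```scala
object ReplyProtocol {
  // Hypothetical reply protocol: the failure case is an ordinary message,
  // so a child can always answer its sender without throwing.
  sealed trait Reply
  final case class Found(value: String) extends Reply
  final case class NotFound(key: String) extends Reply

  // Hypothetical lookup standing in for the child's work.
  def lookup(store: Map[String, String], key: String): Reply =
    store.get(key) match {
      case Some(v) => Found(v)
      case None    => NotFound(key)
    }

  // The caller handles both outcomes by pattern matching on the reply.
  def describe(reply: Reply): String = reply match {
    case Found(v)    => s"value: $v"
    case NotFound(k) => s"no entry for $k"
  }
}
```

A child actor would send the result of lookup back to its sender, and only truly unexpected exceptions would reach the supervisor strategy.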

Capturing the message that killed an actor

I'm trying to respond to the sender of a message when the receiving actor dies because of that message. If I Restart the actor on failure, I get
preRestart(reason: Throwable, message: Option[Any])
but now I'm committed to restarting.
If I Stop the actor, I only get
postStop()
with no knowledge of what stopped it.
Meanwhile in the supervisor, I only get the Throwable and no indication of what caused it.
I suppose, I can dig through the DeadLetters post actor termination, but that seems like a noisy approach, since I'd have to listen to all of dead letters and somewhere correlate the termination with the deadletter event stream.
UPDATE: DeadLetter really doesn't seem to be an option. The message that caused the death doesn't even go to DeadLetters, it just disappears.
Is there a mechanism I am overlooking?

According to this thread on the Akka Users List, there isn't a mechanism in the actor supervision death cycle to accomplish this. Moreover, the documentation explicitly states that the message is dropped:
What happens to the Message
If an exception is thrown while a message is being processed (i.e. taken out of its mailbox and handed over to the current behavior), then this message will be lost. It is important to understand that it is not put back on the mailbox. So if you want to retry processing of a message, you need to deal with it yourself by catching the exception and retry your flow.
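The documentation's advice to catch the exception and retry yourself can be sketched in plain Scala. The Retry object and its retry helper are hypothetical, not part of Akka; inside an actor the wrapped operation would be the risky part of handling one message.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object Retry {
  // Runs op up to `attempts` times and returns the first success,
  // or the last failure once the attempts are used up.
  @tailrec
  def retry[A](attempts: Int)(op: () => A): Try[A] =
    Try(op()) match {
      case s @ Success(_)             => s
      case Failure(_) if attempts > 1 => retry(attempts - 1)(op)
      case f @ Failure(_)             => f
    }
}
```

Because the exception is caught inside the message handler, the actor does not fail and the message is not lost while the retries run.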
The ideal solution would be to use a dedicated actor for dangerous operations and have the initiator monitor the death of that actor to determine failure.
As my scenario arose from something considered safe but that had a bug in it, the separate actor option would have been after the fact. To avoid wrapping all code paths in try/catch but still be able to guard more complicated and critical flows, I ended up creating a wrapper for receive that lets me intercept exceptions:
object SafeReceive {
  def apply(receive: Receive)(recover: Any => PartialFunction[Throwable, Unit]): Receive =
    new Receive {
      override def isDefinedAt(x: Any): Boolean = receive.isDefinedAt(x)
      override def apply(v1: Any): Unit = try {
        receive(v1)
      } catch recover(v1)
    }
}
which I can use for select actors like this:
def receive = SafeReceive {
  case ... => ...
} {
  msg => {
    case e: Exception =>
      sender ! OperationFailed(msg, e)
      throw e
  }
}

Akka - Actor preStart has no previous message

For some of my actors, the "message" argument is None when their preRestart() is invoked. I don't understand why this is the case.
Context
My application is a Play web application that uses Akka. My error handling strategy is to send an error notification to the origin of the request, so details can be sent down to the user. I do this by hooking into preRestart() as follows:
trait NotifyRequesterErrorHandling { this: Actor with ActorLogging =>
  override def preRestart(reason: Throwable, message: Option[Any]) {
    notifyRequester(message, reason)
  }
  protected def notifyRequester(message: Option[Any], reason: Throwable) {
    message match {
      case Some(m) =>
        val who = getRequester(m)
        who ! SearchError(reason, m)
      case None =>
        val error = if (reason == null) "No exception" else reason toString()
        ReportError("NotificationError", Some(reason), "", s"${this.context.self.path}: Attempted to notify requester but no previous message")
    }
  }
  def getRequester(message: Any): ActorRef
}
Problem
In my logs, I am seeing a lot of the "Attempted to notify requester but no previous message" error logs. Usually it occurs for all the actors in my system. This is likely because I have one top-level actor, which is responsible for all the other actors (they are all children).
In the logs, the reason parameter does contain a throwable, though.
I am also using a one-for-all strategy. So, basically, whenever all the actors are restarted, I get lots of these errors.
Possible explanations (guesses)
1) After all actors are restarted, a new instance of each actor is created, and thus there is no previous message.
2) When actors are restarted as one-for-all, all of their messages have been processed and their queue is empty, so there is no previous message.

The documentation for preRestart states: "message - optionally the current message the actor processed when failing, if applicable"
That is, it only applies to the failing actor, not to the other actors that are restarted without having failed.
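Given that, one option is to treat None as the normal case for siblings restarted by a one-for-all strategy and to notify only when a message is present. The sketch below assumes the classic Akka actor API; NotifyOnOwnFailure is a hypothetical trait, and replying with Status.Failure is an assumption standing in for the SearchError message from the question.

```scala
import akka.actor.{Actor, ActorRef, Status}

trait NotifyOnOwnFailure extends Actor {
  // Extracts the requester from the message that was being processed.
  def getRequester(message: Any): ActorRef

  override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    // message is defined only in the actor that was processing a message when it failed.
    // Siblings restarted alongside it receive None and are skipped here.
    message.foreach(m => getRequester(m) ! Status.Failure(reason))
    super.preRestart(reason, message)
  }
}
```

An actor mixes this in with `extends Actor with NotifyOnOwnFailure` and implements getRequester for its own message types.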