Why does `TestFSMRef.receive must throw[Exception]` fail intermittently - scala

Hi fellow coders and admired gurus,
I have an actor that implements FSM that is required to throw an IOException on certain messages while in a specific state (Busy) to be restarted by its Supervisor.
excerpt:
case class ExceptionResponse(errorCode: Int)
when(Busy) {
  case ExceptionResponse(errorCode) =>
    throw new IOException(s"Request failed with error code $errorCode")
}
I am trying to test that behavior by using a TestActorRef and calling receive directly on that expecting receive to throw an IOException.
case class WhenInStateBusy() extends TestKit(ActorSystem()) with After {
  val myTestFSMRef = TestFSMRef(MyFSM.props)
  ...
  def prepare: Result = {
    // prepares tested actor by going through an initialization sequence
    // including 'expectMsgPfs' for several messages sent from the tested FSM
    // most of my test cases depend on the correctness of that initialization sequence
    // finishing with state busy
    myTestFSMRef.setState(Busy)
    awaitCond(
      myTestFSMRef.stateName == Busy,
      maxDelay,
      interval,
      s"Actor must be in State 'Busy' to proceed, but is ${myTestFSMRef.stateName}"
    )
    success
  }
  def testCase = this {
    prepare and {
      myTestFSMRef.receive(ExceptionResponse(testedCode)) must throwAn[IOException]
    }
  }
}
Note: The initialization sequence makes sure that the tested FSM is fully initialized and has set up its internal mutable state. State Busy can only be left when the actor receives a certain kind of message, which in my test setup has to be provided by the test case, so I am pretty sure the FSM is in the right state.
Now, on my Jenkins server (Ubuntu 14.10) this test case fails in about 1 out of 20 attempts (-> No exception is thrown). However, on my development machine (Mac OS X 10.10.4) I am not able to reproduce the bug, so the debugger does not help me.
The tests are run sequentially and after each example the test system is shut down.
Java version 1.7.0_71
Scala version 2.11.4
Akka version 2.3.6
Specs2 version 2.3.13
Can anyone explain why calling myTestFSMRef.receive(ExceptionResponse(testedCode)) sometimes does not result in an Exception?

This is a tricky question indeed: my prime suspect is that the Actor is not yet initialized. Why is this? When implementing system.actorOf (which is used by TestFSMRef.apply()) it became clear that there can only be one entity that is responsible for actually starting an Actor, and that is its parent. I tried many different things and all of them were flawed in some way.
But how does that make this test fail?
The basic answer is that with the code you show it is not guaranteed that at the time you execute setState the FSM has already been initialized. Especially on (low-powered) Jenkins boxes it may be that the guardian actor does not get scheduled to run for a measurable amount of time. If that is the case then the startWith statement in your FSM will override the setState because it runs afterwards.
The solution to this would be to send another message to the FSM and expect back the proper response before calling setState.
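The ordering problem can be illustrated without Akka. The following is a toy model only: ToyFsm, start and awaitStarted are made-up names, and start() merely stands in for the parent running the FSM's startWith at some later point. It sketches why a state set before initialization gets overwritten; it does not show how TestFSMRef works internally.

```scala
import java.util.concurrent.CountDownLatch

sealed trait State
case object Idle extends State
case object Busy extends State

// Toy stand-in for an FSM whose initialization is triggered by someone else.
final class ToyFsm {
  @volatile private var current: Option[State] = None
  private val started = new CountDownLatch(1)

  // Plays the role of startWith(Idle), run whenever the "parent" gets scheduled.
  def start(): Unit = { current = Some(Idle); started.countDown() }
  def setState(s: State): Unit = { current = Some(s) }
  def stateName: Option[State] = current
  // Plays the role of "send a message and wait for the reply".
  def awaitStarted(): Unit = started.await()
}

object RaceDemo {
  // setState runs before initialization, so the later start() overrides it.
  def racy(): Option[State] = {
    val fsm = new ToyFsm
    fsm.setState(Busy)
    fsm.start()
    fsm.stateName
  }

  // Waiting for initialization first makes setState stick.
  def safe(): Option[State] = {
    val fsm = new ToyFsm
    val parent = new Thread(new Runnable { def run(): Unit = fsm.start() })
    parent.start()
    fsm.awaitStarted()
    fsm.setState(Busy)
    parent.join()
    fsm.stateName
  }
}
```

In the real test, the equivalent of awaitStarted() would be the extra message round trip suggested above, done before calling setState.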

Related

Scala - difference between eventually timeout and Thread.sleep()

I have some async (ZIO) code which I need to test. If I write the test using Thread.sleep(), it works fine and I always get a response:
for {
  saved <- database.save(smth)
  result <- eventually {
    Thread.sleep(20000)
    database.search(...)
  }
} yield result
But if I implement the same logic using timeout and interval from eventually, it never works correctly (I get timeouts):
for {
  saved <- database.save(smth)
  result <- eventually(timeout(Span(20, Seconds)), interval(Span(20, Seconds))) {
    database.search(...)
  }
} yield result
I do not understand why timeout and interval work differently from Thread.sleep. They should be doing exactly the same thing. Can someone explain this to me and tell me how I should change this code so that it does not need Thread.sleep()?
Assuming database.search(...) returns a ZIO object:
eventually{database.search(...)} most probably succeeds immediately after the first try.
It successfully created a task to query the database.
The database is then queried without any retry logic.
Regarding how to make it work:
val search: ZIO[Any, Throwable, String] = ???
val retried: ZIO[Any with Clock, Throwable, Option[String]] =
  search
    .retry(Schedule.spaced(Duration.fromMillis(1000)))
    .timeout(Duration.fromMillis(20000))
Something like that should work. But I believe that more elegant solutions exist.
The other answer from @simpadjo addresses the "what" quite succinctly. I'll add some additional context as to why you might see this behavior.
for {
  saved <- database.save(smth)
  result <- eventually {
    Thread.sleep(20000)
    database.search(...)
  }
} yield result
There are three different technologies being mixed here which is causing some confusion.
First is ZIO, which is an asynchronous programming library that uses its own custom runtime and execution model to perform tasks. The second is eventually, which comes from ScalaTest and is useful for checking asynchronous computations by effectively polling the state of a value. And thirdly, there is Thread.sleep, which is a Java API that literally suspends the current thread and prevents task progression until the timer expires.
eventually uses a simple retry mechanism that differs based on whether you are using a normal value or a Future from the scala standard library. Basically it runs the code in the block and if it throws then it sleeps the current thread and then retries it based on some interval configuration, eventually timing out. Notably in this case the behavior is entirely synchronous, meaning that as long as the value in the {} doesn't throw an exception it won't keep retrying.
Thread.sleep is a heavyweight operation, and in this case it effectively blocks the function being passed to eventually from progressing for 20 seconds. This means that by the time database.search is called, the save operation has likely completed.
The second variant is different: it executes the code in the eventually block immediately, and if that throws an exception it will attempt it again based on the interval/timeout logic that you provide. In this scenario the save may not have completed (or propagated, if the store is eventually consistent). Because you are returning a ZIO, which is designed not to throw, and eventually doesn't understand ZIO, it will simply return the search attempt with no retry logic.
The accepted answer:
val retried: ZIO[Any with Clock, Throwable, Option[String]] =
  search
    .retry(Schedule.spaced(Duration.fromMillis(1000)))
    .timeout(Duration.fromMillis(20000))
works because the retry and timeout are using the built-in ZIO operators which do understand how to actually retry and timeout a ZIO. Meaning that if search fails the retry will handle it until it succeeds.
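The "retries only on a thrown exception" point can be shown with the standard library alone. This is a sketch under loud assumptions: retryOnThrow is a simplified stand-in for eventually (no sleeping, no timeout), and Task is a hypothetical stand-in for a lazy effect such as a ZIO value. Neither is the real ScalaTest or ZIO API.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for a lazy effect: constructing it runs nothing,
// only calling `run()` executes the work.
final case class Task[A](run: () => A)

object EventuallyDemo {
  // Simplified stand-in for `eventually`: retries only when the block throws.
  // Returns the block's value together with the number of attempts it took.
  def retryOnThrow[A](maxAttempts: Int, attempt: Int = 1)(block: => A): (A, Int) =
    Try(block) match {
      case Success(value)                       => (value, attempt)
      case Failure(e) if attempt >= maxAttempts => throw e
      case Failure(_)                           => retryOnThrow(maxAttempts, attempt + 1)(block)
    }

  // Returns (attempts for the eager call, attempts for the lazy call,
  // number of real searches the lazy call triggered).
  def demo(): (Int, Int, Int) = {
    var calls = 0
    // Fails twice before the saved row becomes visible.
    def search(): String = {
      calls += 1
      if (calls < 3) throw new RuntimeException("not visible yet") else "row"
    }

    // The block itself throws, so it is retried until it succeeds.
    val (_, eagerAttempts) = retryOnThrow(5)(search())
    calls = 0
    // The block only builds a Task and never throws, so there is one attempt
    // and the search has not even run yet.
    val (_, lazyAttempts) = retryOnThrow(5)(Task(() => search()))
    (eagerAttempts, lazyAttempts, calls)
  }
}
```

With the lazy value, the failure only surfaces later when the task is actually run, which is outside the retry loop. That is why the retry has to be expressed with the effect's own operators.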

How to stop all actors and wait for them to terminate?

I am implementing unit tests for my Akka project. To avoid InvalidActorNameExceptions and the like, I want all actors that were created within one unit test to be stopped before the next unit test is run. So, for each actor created within a unit test, I call _system.stop(someActorRef) at the end of it. However, it takes some time for an actor to actually be stopped, and unfortunately, the next unit test usually starts running before the actors that were created within the previous one are actually gone. And since the stop method does not return a Future, and there is no awaitStop method available either, I really don't know how to solve this. Currently I call Thread.sleep(1000) at the end of each unit test and hope all actors are dead by then, but, obviously, it cannot stay this way. :D
I'd appreciate any hint!
You could try this at the end of your test:
val probe = TestProbe()
probe.watch(someActorRef)
system.stop(someActorRef)
probe.expectMsgType[Terminated]
//another way
//probe.expectMsgPF() {
// case Terminated(someActorRef) =>
//}

Retry / replay of failed messages in AKKA

I'm using AKKA.NET in my current .NET project.
My question is this: How are experienced AKKA-developers implementing the replay-message-on-failure pattern using the latest AKKA libraries for either Java or .NET?
Here are some more details.
I want to ensure that a failed message (i.e. a message received by an actor leading to an exception) is replayed / retried a number of times with a time interval between each attempt. Normally the actor is restarted but the failed message is thrown away.
I have written my own small helper method like this to solve it:
    public void WithRetries(int noSeconds, IUntypedActorContext context, IActorRef receiver, IActorRef sender, Object message, Action action)
    {
        try
        {
            action();
        }
        catch (Exception e)
        {
            context.System.Scheduler.ScheduleTellOnce(new TimeSpan(0, 0, noSeconds), receiver, message, sender);
            throw;
        }
    }
}
Now my actors typically look like this:
Receive<SomeMessage>(msg =>
{
    ActorHelper.Instance.WithRetries(-1, Context, Self, Sender, msg, () => {
        ...here comes the actual message processing
    });
});
I like the above solution because it is straightforward. However, I don't like that it adds yet another layer of indirection in my code, and the code gets a bit more messy if I use this helper method in many places. Furthermore it has some limitations. First of all, the number of retries is not governed by the helper method. It is governed by the supervision strategy of the supervisor, which I believe is messy. Furthermore, the time interval is fixed whereas I would in some cases like a time interval that increases for each retry.
I would prefer something that can be configured using HOCON. Or something that can be applied as a cross-concern.
I can see various suggestions for either AKKA for Scala, AKKA for Java and AKKA.NET. I have seen examples with routers, examples with Circuit Breaker (e.g. http://getakka.net/docs/CircuitBreaker#examples) and so forth. I have also seen some examples using the same idea as above. But I have a feeling that it should be even simpler. Perhaps it involves some usage of AKKA Persistence and events.
So to repeat my question: How are experienced AKKA-developers implementing the replay-message-on-failure pattern using the latest AKKA libraries for either Java or .NET?
I looked into this sometime last year. I'm away from my dev machine so cannot check, so this is all coming from memory:
I seem to remember the solution to this was a combination of stashing and supervision strategies and lifecycle hooks :)
I think you can wrap your child actor code in a try-catch, then in the case of error, stash the message and re-throw the exception so it is handled by the supervisor and all the usual supervision strategies come into play. I think you would resume rather than restart. Then in the appropriate lifecycle message (onresume?!) unstash messages which should mean the failed message is processed again.
Now this isn't all that different from what you've already posted above, so hopefully someone has a better solution :)
This may be late, but another solution is to pass the command (or its essential params) to the actor's constructor, have the actor send the command to itself when it is created, and use the Restart directive.
// Scala code
class ResilientActor(cmd: Command) extends Actor {
  def receive = {
    ...
  }
  self ! cmd
}
...
override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 3) {
  case _: SomeRetryableException => Restart
  case t => super.supervisorStrategy.decider.applyOrElse(t, (_: Any) => Escalate)
}
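The question also asks for an interval that grows with each retry. Whatever mechanism does the rescheduling, the delay itself can be computed by a small pure function. The sketch below is only an illustration: Backoff and delay are made-up names, not part of Akka or Akka.NET, and the doubling factor and cap are arbitrary choices.

```scala
import scala.concurrent.duration._

object Backoff {
  // Delay before retry number `attempt` (1-based): base * 2^(attempt - 1),
  // capped at `max`. The exponent is limited to keep the multiplication
  // from overflowing for reasonably small base durations.
  def delay(attempt: Int, base: FiniteDuration, max: FiniteDuration): FiniteDuration = {
    require(attempt >= 1, "attempt is 1-based")
    val factor = 1L << math.min(attempt - 1, 20)
    val candidate = base * factor
    if (candidate > max) max else candidate
  }
}
```

For example, with a base of one second and a cap of thirty seconds, the first retries would wait 1, 2, 4, 8 and 16 seconds, and every later one 30 seconds. The result could be handed to whatever scheduler call re-sends the message.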

Scala how to use akka actors to handle a timing out operation efficiently

I am currently evaluating javascript scripts using Rhino in a restful service. I wish for there to be an evaluation time out.
I have created a mock example actor (using scala 2.10 akka actors).
case class Evaluate(expression: String)
class RhinoActor extends Actor {
  override def preStart() = { println("Start context'"); super.preStart() }
  def receive = {
    case Evaluate(expression) ⇒ {
      Thread.sleep(100)
      sender ! "complete"
    }
  }
  override def postStop() = { println("Stop context'"); super.postStop() }
}
Now I use this actor as follows:
def run {
  val t = System.currentTimeMillis()
  val system = ActorSystem("MySystem")
  val actor = system.actorOf(Props[RhinoActor])
  implicit val timeout = Timeout(50 milliseconds)
  val future = (actor ? Evaluate("10 + 50")).mapTo[String]
  val result = Try(Await.result(future, Duration.Inf))
  println(System.currentTimeMillis() - t)
  println(result)
  actor ! PoisonPill
  system.shutdown()
}
Is it wise to use the ActorSystem in a closure like this which may have simultaneous requests on it?
Should I make the ActorSystem global, and will that be ok in this context?
Is there a more appropriate alternative approach?
EDIT: I think I need to use futures directly, but I will need the preStart and postStop. Currently investigating.
EDIT: Seems you don't get those hooks with futures.
I'll try and answer some of your questions for you.
First, an ActorSystem is a very heavyweight construct. You should not create one per request that needs an actor. You should create one globally and then use that single instance to spawn your actors (and you won't need system.shutdown() anymore in run). I believe this covers your first two questions.
Your approach of using an actor to execute javascript here seems sound to me. But instead of spinning up an actor per request, you might want to pool a bunch of the RhinoActors behind a Router, with each instance having its own rhino engine that will be set up during preStart. Doing this will eliminate per-request rhino initialization costs, speeding up your js evaluations. Just make sure you size your pool appropriately. Also, you won't need to send PoisonPill messages per request if you adopt this approach.
You also might want to look into the non-blocking callbacks onComplete, onSuccess and onFailure as opposed to using the blocking Await. These callbacks also respect timeouts and are preferable to blocking for higher throughput. As long as whatever is way way upstream waiting for this response can handle the asynchronicity (i.e. an async capable web request), then I suggest going this route.
The last thing to keep in mind is that even though code will return to the caller after the timeout if the actor has yet to respond, the actor still goes on processing that message (performing the evaluation). It does not stop and move onto the next message just because a caller timed out. Just wanted to make that clear in case it wasn't.
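The same effect is easy to see with plain futures from the standard library. The following is a minimal sketch, not Akka code: TimeoutDemo is a made-up name, and the 200 ms sleep merely stands in for a slow script evaluation.

```scala
import java.util.concurrent.TimeoutException
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object TimeoutDemo {
  // Returns (did the caller time out?, did the work still run to completion?).
  def run(): (Boolean, Boolean) = {
    val finished = new AtomicBoolean(false)

    // Stand-in for the slow evaluation: takes about 200 ms.
    val work = Future {
      Thread.sleep(200)
      finished.set(true)
      "complete"
    }

    // The caller gives up after 50 ms and gets a TimeoutException.
    val callerTimedOut =
      Try(Await.result(work, 50.millis)).failed.toOption
        .exists(_.isInstanceOf[TimeoutException])

    // The work was not cancelled; it finishes on its own a little later.
    Await.ready(work, 2.seconds)
    (callerTimedOut, finished.get)
  }
}
```

The timeout only releases the caller. Anything that must actually stop the work has to be built into the work itself.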
EDIT
In response to your comment about stopping a long execution, there are some things related to Akka to consider first. You can stop the actor, or send it a Kill or a PoisonPill, but none of these will stop it from processing the message that it's currently processing. They just prevent it from receiving new messages. In your case, with Rhino, if infinite script execution is a possibility, then I suggest handling this within Rhino itself. I would dig into the answers on this post (Stopping the Rhino Engine in middle of execution) and set up your Rhino engine in the actor in such a way that it will stop itself if it has been executing for too long. That failure will kick out to the supervisor (if pooled) and cause that pooled instance to be restarted, which will init a new Rhino in preStart. This might be the best approach for dealing with the possibility of long-running scripts.

Easiest way to do idle processing in a Scala Actor?

I have a scala actor that does some work whenever a client requests it. When, and only when no client is active, I would like the Actor to do some background processing.
What is the easiest way to do this? I can think of two approaches:
Spawn a new thread that times out and wakes up the actor periodically. A straight forward approach, but I would like to avoid creating another thread (to avoid the extra code, complexity and overhead).
The Actor class has a reactWithin method, which could be used to time out from the actor itself. But the documentation says the method doesn't return. So, I am not sure how to use it.
Edit; a clarification:
Assume that the background task can be broken down into smaller units that can be independently processed.
Ok, I see I need to put my 2 cents. From the author's answer I guess the "priority receive" technique is exactly what is needed here. It is possible to find discussion in "Erlang: priority receive question here at SO". The idea is to accept high priority messages first and to accept other messages only in absence of high-priority ones.
As Scala actors are very similar to Erlang, a trivial code to implement this would look like this:
def act = loop {
  reactWithin(0) {
    case msg: HighPriorityMessage => // process msg
    case TIMEOUT =>
      react {
        case msg: HighPriorityMessage => // process msg
        case msg: LowPriorityMessage => // process msg
      }
  }
}
This works as follows. An actor has a mailbox (queue) with messages. The receive (or receiveWithin) argument is a partial function, and the Actor library looks for a message in the mailbox to which this partial function can be applied. In our case that would be an object of HighPriorityMessage only. So, if the Actor library finds such a message, it applies our partial function and we process a high-priority message. Otherwise, reactWithin with timeout 0 calls our partial function with the argument TIMEOUT, and we immediately try to process any possible message from the queue (as it waits for a message, we cannot exclude the possibility of getting a HighPriorityMessage).
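The selection rule can be modelled outside the actor library with an ordinary buffer. This is only a toy sketch of the idea: High, Low and PriorityReceive are made-up names, and a real mailbox would block while it is empty instead of returning None.

```scala
import scala.collection.mutable

sealed trait Msg
final case class High(id: Int) extends Msg
final case class Low(id: Int) extends Msg

object PriorityReceive {
  // Remove and return the oldest high-priority message if one is queued,
  // otherwise the oldest message of any kind; None when the mailbox is empty.
  def next(mailbox: mutable.ListBuffer[Msg]): Option[Msg] = {
    val idx = mailbox.indexWhere(_.isInstanceOf[High])
    if (idx >= 0) Some(mailbox.remove(idx))
    else if (mailbox.nonEmpty) Some(mailbox.remove(0))
    else None
  }

  // Process the whole mailbox in the order the rule above dictates.
  def drain(mailbox: mutable.ListBuffer[Msg]): List[Msg] =
    Iterator.continually(next(mailbox)).takeWhile(_.isDefined).collect { case Some(m) => m }.toList
}
```

Draining a mailbox that holds Low(1), High(2), Low(3), High(4) in arrival order yields High(2), High(4), Low(1), Low(3): the low-priority work only runs once no high-priority message is waiting.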
It sounds like the problem you describe is not well suited to the actor sub-system. An Actor is designed to sequentially process its message queue:
What should happen if the actor is performing the background work and a new task arrives?
An actor can only find out about this if it is continuously checking its mailbox as it performs the background task. How would you implement this (i.e. how would you code the background tasks as a unit of work so that the actor could keep interrupting and checking the mailbox)?
What should happen if the actor has many background tasks in its mailbox in front of the main task?
Do these background tasks get thrown away, or sent to another actor? If the latter, how can you prevent CPU time being given to that actor to perform the tasks?
All in all, it sounds much more like you need to explore some grid-style software that can run in the background (like Data Synapse)!
Just after asking this question I tried out some completely whacky code and it seems to work fine. I am not sure though if there is a gotcha in it.
import scala.actors._
object Idling
object Processor extends Actor {
  start
  import Actor._
  def act() = {
    loop {
      // here lie dragons >>>>>
      if (mailboxSize == 0) this ! Idling
      // <<<<<<
      react {
        case msg: NormalMsg => {
          // do the normal work
          reply(answer)
        }
        case Idling => {
          // do the idle work in chunks
        }
        case msg => println("Rcvd unknown message:" + msg)
      }
    }
  }
}
Explanation
Any code inside the argument of loop but before the call to react seems to get called when the Actor is about to wait for a message. I am sending an Idling message to self here. In the handler for this message I ensure that the mailbox size is 0 before doing the processing.