Non-blocking Scala loops

Is there a way in Scala to execute something in a loop without blocking the entire flow?
I have the following code to transmit something in Actor model
All actors send something to other actors:
def some_method
loop {
// Transmit something
Thread.sleep(100)
}
I also have some code to receive what other actors send. But the flow is not coming out of the loop. It sleeps and continues without coming out of the loop. Thus, all actors keep sending but nobody receives. How can I fix this?

If I understand you correctly, you want the transmission to occur every 100ms, but you don't want to create another Thread for that (and a Thread.sleep inside an actor may indeed block the flow).
You can use reactWithin:
import java.util.Date
import math.max
def some_method = {
var last_transmission_time = 0L // Long, to match the type of Date.getTime
loop {
val current_time = (new Date).getTime
reactWithin(max(0, last_transmission_time + 100 - current_time)) {
// actor reaction cases
case TIMEOUT => {
// Transmit something
last_transmission_time = (new Date).getTime
}
}
}
}
The last_transmission_time saves the last time a transmission was done.
The reaction timeout is calculated so that a TIMEOUT will occur when the current time is the last-transmission-time + 100ms.
If a timeout occurred, at least 100 ms have passed since the last transmission, so another transmission is due.
If the reaction cases themselves may take a lot of time, then I don't see any simple solution but creating another thread.
I didn't try the code because I'm not sure that I fully understand your problem.
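The timeout arithmetic can be checked on its own. This is a minimal sketch, not part of the original answer: TransmitTimer and remainingDelay are hypothetical names, and the 100 ms period is taken from the question.

```scala
object TransmitTimer {
  // Period between transmissions, in milliseconds (100 ms as in the question).
  val PeriodMs: Long = 100L

  // Milliseconds left until the next transmission is due; never negative,
  // so it can be passed to reactWithin as the timeout.
  def remainingDelay(lastTransmissionMs: Long, nowMs: Long): Long =
    math.max(0L, lastTransmissionMs + PeriodMs - nowMs)
}
```

For example, 30 ms after a transmission the delay is 70 ms, and once more than 100 ms have passed it is 0, so TIMEOUT fires immediately. With the initial value 0 for the last transmission time, the first transmission also happens right away.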

Related

How should an ask timeout be handled by the askee? Is it even possible?

Suppose I have some client code that uses akka's ask pattern on an actor:
implicit val timeout = Timeout(1.minute)
val result: Future[Any] = actor ? Question
And the actor handles it like this:
def receive = {
case Question =>
// work work work
// 3 minutes later...
sender ! Answer
}
The result Future is expected to time out in this scenario, since the reply would be sent after three minutes but the given timeout was only one minute.
Does akka's ask pattern do anything to notify the "askee" that there was a timeout? Is there a way to handle this, e.g. to cancel any remaining work that the actor might have done if there was not a timeout?
The problem
So you need a mechanism to stop a long-running computation taking place in the askee, whether or not the asker timed out.
A solution
First of all, only the askee knows how to deal with its own computation. Therefore, it is the only one who can gracefully stop it.
A common way of dealing with this is to pass a maximumTime to the askee in the message indicating the maximum amount of time for it to send a completed answer.
Then, while computing its result, the askee can periodically check whether the maximum time has been reached and either throw a TimeoutException or send a Failure to the asker:
def receive = {
case MessageWithTimeout(msg, maximumTime) => compute(msg, maximumTime)
}
def compute(msg: Message, maximumTime: Long): T = {
val startTime = System.nanoTime()
// ...
// somewhere during the computation:
if(System.nanoTime() - startTime > maximumTime) {
throw new TimeoutException(maximumTime + " ns exceeded") // maximumTime is in nanoseconds
}
// ...
}
Doing so, the askee will stop the computation after maximumTime.
If you send the same timeout as the asker's, the asker will likely time out while waiting, and only then will the askee stop its computation and return.
Note that if you throw an exception, deciding what happens to the actor is delegated to its supervisor.
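A self-contained sketch of the periodic check, without Akka: DeadlineCompute and sumWithDeadline are hypothetical names, and summing a sequence stands in for the real computation. The budget is in nanoseconds because it is compared against System.nanoTime().

```scala
import java.util.concurrent.TimeoutException

object DeadlineCompute {
  // Sums the items one at a time, checking the time budget before each step.
  // maximumNanos is an assumed parameter: the budget in nanoseconds.
  def sumWithDeadline(items: Seq[Int], maximumNanos: Long): Long = {
    val startTime = System.nanoTime()
    var total = 0L
    for (item <- items) {
      if (System.nanoTime() - startTime > maximumNanos)
        throw new TimeoutException(s"$maximumNanos ns exceeded")
      total += item
    }
    total
  }
}
```

The check only runs between steps, so a single step that runs long cannot be interrupted this way; the granularity of the steps decides how quickly the computation stops.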
This is not built in, but you could accomplish something close to this if the actor is prepared for a cancellation.
In your sender, you could do something like:
...
val actorForClosure = actor
future onFailure { case _ : AskTimeoutException => actorForClosure ! Cancel }
However, the actor would have to be able to handle a cancellation. If it blocks for 3 minutes, then the cancel request would not get in until after the computation was completed, which would be pointless. But if you can break your computation up into chunks, with the actor sending itself a message to continue with the next chunk, then you leave a gap for the Cancel to come in between chunks. So cancellation has to be baked in from the start.
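The chunking idea can be modelled without Akka. The sketch below is only an illustration: ChunkedWork, Continue and the cancelArrivesAfter parameter are hypothetical, and a plain queue plays the role of the actor's mailbox. Each chunk re-enqueues a Continue message, so a Cancel that arrives in the meantime is handled before the next chunk.

```scala
import scala.collection.mutable

object ChunkedWork {
  sealed trait Msg
  case object Cancel extends Msg
  final case class Continue(next: Int) extends Msg

  // Runs up to `chunks` units of work, one mailbox message per chunk.
  // cancelArrivesAfter simulates the asker's Cancel landing in the mailbox
  // while that chunk is being processed. Returns the number of chunks completed.
  def run(chunks: Int, cancelArrivesAfter: Option[Int]): Int = {
    val mailbox = mutable.Queue[Msg](Continue(0))
    var done = 0
    var cancelled = false
    while (!cancelled && mailbox.nonEmpty) {
      mailbox.dequeue() match {
        case Cancel =>
          cancelled = true
        case Continue(i) if i < chunks =>
          done += 1                                   // one chunk of work
          if (cancelArrivesAfter.contains(done)) mailbox.enqueue(Cancel)
          mailbox.enqueue(Continue(i + 1))            // message to self for the next chunk
        case Continue(_) =>
          ()                                          // all chunks finished
      }
    }
    done
  }
}
```

Without a cancellation all chunks run. If Cancel arrives during the third of ten chunks, it is queued ahead of the next Continue and the work stops after three chunks.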

Scala how to use akka actors to handle a timing out operation efficiently

I am currently evaluating javascript scripts using Rhino in a restful service. I wish for there to be an evaluation time out.
I have created a mock example actor (using scala 2.10 akka actors).
case class Evaluate(expression: String)
class RhinoActor extends Actor {
override def preStart() = { println("Start context'"); super.preStart()}
def receive = {
case Evaluate(expression) ⇒ {
Thread.sleep(100)
sender ! "complete"
}
}
override def postStop() = { println("Stop context'"); super.postStop()}
}
Now I use this actor as follows:
def run {
val t = System.currentTimeMillis()
val system = ActorSystem("MySystem")
val actor = system.actorOf(Props[RhinoActor])
implicit val timeout = Timeout(50 milliseconds)
val future = (actor ? Evaluate("10 + 50")).mapTo[String]
val result = Try(Await.result(future, Duration.Inf))
println(System.currentTimeMillis() - t)
println(result)
actor ! PoisonPill
system.shutdown()
}
Is it wise to use the ActorSystem in a closure like this which may have simultaneous requests on it?
Should I make the ActorSystem global, and will that be ok in this context?
Is there a more appropriate alternative approach?
EDIT: I think I need to use futures directly, but I will need the preStart and postStop. Currently investigating.
EDIT: Seems you don't get those hooks with futures.
I'll try and answer some of your questions for you.
First, an ActorSystem is a very heavy weight construct. You should not create one per request that needs an actor. You should create one globally and then use that single instance to spawn your actors (and you won't need system.shutdown() anymore in run). I believe this covers your first two questions.
Your approach of using an actor to execute JavaScript here seems sound to me. But instead of spinning up an actor per request, you might want to pool a bunch of the RhinoActors behind a Router, with each instance having its own Rhino engine that is set up during preStart. Doing this will eliminate per-request Rhino initialization costs, speeding up your JS evaluations. Just make sure you size your pool appropriately. Also, you won't need to send PoisonPill messages per request if you adopt this approach.
You also might want to look into the non-blocking callbacks onComplete, onSuccess and onFailure as opposed to using the blocking Await. These callbacks also respect timeouts and are preferable to blocking for higher throughput. As long as whatever is waiting far upstream for this response can handle the asynchronicity (e.g. an async-capable web request), I suggest going this route.
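A minimal sketch of the callback style using only the standard library: the evaluate method here is a hypothetical stand-in for `actor ? Evaluate(...)` and simply returns a Future, and CallbackDemo and describe are illustrative names. With the ask pattern, a timeout would arrive in the Failure case.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success, Try}

object CallbackDemo {
  // Hypothetical stand-in for the ask call: completes asynchronously with a String.
  def evaluate(expression: String): Future[String] =
    Future { s"result of $expression" }

  // Turns either outcome into a message; a timed-out ask would arrive as a Failure.
  def describe(outcome: Try[String]): String = outcome match {
    case Success(value) => s"ok: $value"
    case Failure(error) => s"failed: ${error.getMessage}"
  }

  // Registers a callback and returns immediately instead of blocking in Await.
  def evaluateAndLog(expression: String): Unit =
    evaluate(expression).onComplete(outcome => println(describe(outcome)))
}
```

Calling CallbackDemo.evaluateAndLog("10 + 50") returns at once; the line is printed later on a thread from the execution context.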
The last thing to keep in mind is that even though code will return to the caller after the timeout if the actor has yet to respond, the actor still goes on processing that message (performing the evaluation). It does not stop and move onto the next message just because a caller timed out. Just wanted to make that clear in case it wasn't.
EDIT
In response to your comment about stopping a long execution, there are some things related to Akka to consider first. You can stop the actor, or send it a Kill or a PoisonPill, but none of these will stop it from processing the message that it is currently processing. They just prevent it from receiving new messages. In your case, with Rhino, if infinite script execution is a possibility, then I suggest handling this within Rhino itself. I would dig into the answers on this post (Stopping the Rhino Engine in middle of execution) and set up your Rhino engine in the actor in such a way that it will stop itself if it has been executing for too long. That failure will kick out to the supervisor (if pooled) and cause that pooled instance to be restarted, which will init a new Rhino in preStart. This might be the best approach for dealing with the possibility of long-running scripts.

Trouble using scala actors

I have read that, when using react, all actors can execute in a single thread. I often process a collection in parallel and need to output the result. I do not believe System.out.println is threadsafe so I need some protection. One way (a traditional way) I could do this:
val lock = new Object
def printer(msg: Any) {
lock.synchronized {
println(msg)
}
}
(1 until 1000).par.foreach { i =>
printer(i)
}
println("done.")
How does this first solution compare to using actors in terms of efficiency? Is it true that I'm not creating a new thread?
val printer = actor {
loop {
react {
case msg => println(msg)
}
}
}
(1 until 10000).par.foreach { i =>
printer ! i
}
println("done.")
It doesn't seem to be a good alternative however, because the actor code never completes. If I put a println at the bottom it is never hit, even though it looks like it goes through every iteration for i. What am I doing wrong?
As you have it now with your Actor code, you only have one actor doing all the printing. As you can see from running the code, the values are all printed out sequentially by the Actor whereas in the parallel collection code, they're out of order. I'm not too familiar with parallel collections, so I don't know the performance gains between the two.
However, if your code is doing a lot of work in parallel, you probably would want to go with multiple actors. You could do something like this:
def printer = actor {
loop {
react {
case msg => println(msg)
}
}
}
val num_workers = 10
val worker_bees = Vector.fill(num_workers)(printer)
(1 until 1000).foreach { i =>
worker_bees(i % num_workers) ! i
}
The def is important. This way you're actually creating multiple actors and not just flooding one.
One actor instance will never process more than one message at a time. Whatever thread pool is allocated for the actors, each actor instance will only occupy one thread at a time, so you are guaranteed that all the printing will be processed serially.
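The same serial guarantee can be illustrated without the actors library. This sketch is an analogy, not the scala.actors implementation: a single-threaded executor stands in for the one printer actor, and SerialPrinter and handleAll are hypothetical names.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

object SerialPrinter {
  // Runs every message on one worker thread, in submission order, and
  // returns the messages in the order in which they were handled.
  def handleAll(messages: Seq[Int]): List[Int] = {
    val worker = Executors.newSingleThreadExecutor()
    val handled = ArrayBuffer.empty[Int] // modified only by the worker thread
    messages.foreach { m =>
      worker.execute(new Runnable {
        def run(): Unit = { handled += m; () }
      })
    }
    worker.shutdown()                              // no new work; queued work still runs
    worker.awaitTermination(10, TimeUnit.SECONDS)  // wait for the queue to drain
    handled.toList
  }
}
```

Because only one thread ever handles messages, no lock is needed around the handler, which is the property the single printer actor relies on.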
As for not finishing, the execution of an actor never returns from a react or a loop, so:
val printer = actor {
loop {
react {
case msg => println(msg)
}
// This line is never reached because of react
}
// This line is never reached because of loop
}
If you replace loop and react with a while loop and receive, you'll see that everything inside the while loop executes as expected.
To fix your actor implementation, you need to tell the actor to exit so that the program can exit as well.
val printer = actor {
loop {
react {
case "stop" => exit()
case msg => println(msg)
}
}
}
(1 until 1000).par.foreach { printer ! _ }
printer ! "stop"
In both your examples there are thread pools involved backing both the parallels library and the actor library but they are created as needed.
However, println is thread safe, as it does indeed have a lock in its internals.
(1 until 1000).par.foreach { println(_) } // is threadsafe
As for performance, there are many factors. The first is that moving from a lock that multiple threads are contending for to a lock used by only one thread (one actor) will increase performance. Second, if you are going to use actors and want performance, use Akka. Akka actors are blazingly fast compared to Scala actors. Also, I hope that the stdout that println is writing to is going to a file and not the screen, since involving the display drivers is going to kill your performance.
Using the parallels library is wonderful for performance, since you can take advantage of multiple cores for your computation. If each computation is very small, then try the actor route for centralized reporting. However, if each computation is significant and takes a decent amount of CPU time, then stick with just using println by itself. You really are not in a contended lock situation.
I'm not sure I can understand your problem correctly. For me your actor code works fine and terminates.
Nevertheless, you can safely use println for parallel collections, so all you really need is something like this:
(1 until 1000).par.foreach { println(_) }
Works like a charm here. I assume you already know that the output order will vary, but I just want to stress it again, because the question comes up ever so often. So don't expect the numbers to scroll down your screen in a successive fashion.

What happens when we use loop instead of while(true) with scala actors?

What's the difference between using loop and while(true) when using receive with actors? Loop seems to work much faster, but why, and what's going on under the bonnet?
Is there any downside to using loop instead of while(true)?
More about context. I'm doing performance tests within simple ping/pong code. And I'm using receive.
This is the Ping class:
class ReceivePing(
count : Int,
pong : Actor
) extends Actor {def act() {
var pingsLeft = count - 1
pong ! Start
pong ! ReceivePing
while(true) {
receive {
case ReceivePong =>
if (pingsLeft % 10000 == 0)
Console.println("ReceivePing: pong")
if (pingsLeft > 0) {
pong ! ReceivePing
pingsLeft -= 1
} else {
Console.println("ReceivePing: stop")
pong ! Stop
exit()
}
}
}}}
instead of while(true) it performs better with loop.
Thanks
The while/receive loop blocks a thread, whereas the loop/react construct doesn't. This means the first construct needs one thread per actor, which quickly becomes slow.
According to Haller and Odersky 2006,
An actor that waits in a receive statement is not represented by a blocked thread but by a closure that captures the rest of the actor's computation. The closure is executed once a message is sent to the actor that matches one of the message patterns specified in the receive. The execution of the closure is "piggy-backed" on the thread of the sender. If the receiving closure terminates, control is returned to the sender as if a procedure returns. If the receiving closure blocks in a second receive, control is returned to the sender by throwing a special exception that unwinds the receiver's call stack.
(Apparently they later changed the behavior of receive and renamed the old receive to react.)
Using loop releases the thread to other tasks, while while doesn't. So, if you are using many actors, the use of loop makes them more efficient. On the other hand, a single actor using while and receive is much faster than one using loop and react (or, for that matter, loop and receive).

Easiest way to do idle processing in a Scala Actor?

I have a scala actor that does some work whenever a client requests it. When, and only when no client is active, I would like the Actor to do some background processing.
What is the easiest way to do this? I can think of two approaches:
Spawn a new thread that times out and wakes up the actor periodically. A straight forward approach, but I would like to avoid creating another thread (to avoid the extra code, complexity and overhead).
The Actor class has a reactWithin method, which could be used to time out from the actor itself. But the documentation says the method doesn't return. So, I am not sure how to use it.
Edit; a clarification:
Assume that the background task can be broken down into smaller units that can be independently processed.
Ok, I see I need to put my 2 cents. From the author's answer I guess the "priority receive" technique is exactly what is needed here. It is possible to find discussion in "Erlang: priority receive question here at SO". The idea is to accept high priority messages first and to accept other messages only in absence of high-priority ones.
As Scala actors are very similar to Erlang, a trivial code to implement this would look like this:
def act = loop {
reactWithin(0) {
case msg: HighPriorityMessage => // process msg
case TIMEOUT =>
react {
case msg: HighPriorityMessage => // process msg
case msg: LowPriorityMessage => // process msg
}
}
}
This works as follows. An actor has a mailbox (queue) with messages. The receive (or receiveWithin) argument is a partial function, and the Actor library looks for a message in the mailbox to which this partial function can be applied. In our case it would be an object of HighPriorityMessage only. So, if the Actor library finds such a message, it applies our partial function and we process a high-priority message. Otherwise, reactWithin with timeout 0 calls our partial function with the argument TIMEOUT and we immediately try to process any possible message from the queue (as it waits for a message, we cannot exclude the possibility of getting a HighPriorityMessage).
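The selection rule can be modelled as a pure function over the mailbox contents. This is a sketch under assumptions: PriorityMailbox and next are hypothetical names, the two message classes are given an id field only for illustration, and a List stands in for the actor's queue.

```scala
object PriorityMailbox {
  sealed trait Msg
  final case class HighPriorityMessage(id: Int) extends Msg
  final case class LowPriorityMessage(id: Int) extends Msg

  // Picks the message that would be handled next: the oldest high-priority
  // message if one is queued, otherwise the oldest message of any kind.
  def next(mailbox: List[Msg]): Option[Msg] =
    mailbox.collectFirst { case m: HighPriorityMessage => m }
      .orElse(mailbox.headOption)
}
```

The collectFirst step corresponds to reactWithin(0) scanning for a match, and the fallback corresponds to the inner react after TIMEOUT. One difference from the real actor: with an empty mailbox this function returns None, whereas the inner react would wait for the next message.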
It sounds like the problem you describe is not well suited to the actor sub-system. An Actor is designed to sequentially process its message queue:
What should happen if the actor is performing the background work and a new task arrives?
An actor can only find out about this if it is continuously checking its mailbox as it performs the background task. How would you implement this (i.e. how would you code the background tasks as a unit of work so that the actor could keep interrupting and checking the mailbox)?
What should happen if the actor has many background tasks in its mailbox in front of the main task?
Do these background tasks get thrown away, or sent to another actor? If the latter, how can you prevent CPU time being given to that actor to perform the tasks?
All in all, it sounds much more like you need to explore some grid-style software that can run in the background (like Data Synapse)!
Just after asking this question I tried out some completely whacky code and it seems to work fine. I am not sure though if there is a gotcha in it.
import scala.actors._
object Idling
object Processor extends Actor {
start
import Actor._
def act() = {
loop {
// here lie dragons >>>>>
if (mailboxSize == 0) this ! Idling
// <<<<<<
react {
case msg:NormalMsg => {
// do the normal work
reply(answer)
}
case Idling=> {
// do the idle work in chunks
}
case msg => println("Rcvd unknown message:" + msg)
}
}
}
}
Explanation
Any code inside the argument of loop but before the call to react seems to get called when the Actor is about to wait for a message. I am sending an Idling message to self here. In the handler for this message I ensure that the mailbox size is 0 before doing the processing.