Spray's `detach` Directive - scala

Given the following Spray code:
object Main extends App with SimpleRoutingApp {
  implicit val system = ActorSystem("my-system")
  val pipeline: HttpRequest => Future[String] = sendReceive ~> unmarshal[String]
  startServer(interface = "localhost", port = 8080) {
    path("go") {
      get {
        detach() {
          complete {
            val req = Post("http://www.google.com") ~> addHeader("Foo", "bar")
            pipeline(req).recoverWith[String]{ case _ => Future { "error!" } }
          }
        }
      }
    }
  }
}
I put the complete function within the detach directive.
The docs explain that detach will: execute the inner route inside a future.
What's the significance of using (or not) detach - from a performance perspective?
I looked at this related answer, but it focuses on how to use detach.

detach is usually needed because routing runs synchronously in an actor. This means that while an HttpRequest is routed, the actor cannot process any other messages at the same time.
However, routing bits that are asynchronous like completing with a Future or using one of the FutureDirectives will also free the original routing actor for new requests.
So, in cases where routing itself is the bottleneck or you complete a request synchronously, adding detach may help. In your case above, you already complete with a Future and have a relatively simple routing structure in which case adding detach won't help much (or may even introduce a tiny bit of latency).
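To make the difference concrete, here is a minimal sketch that models the situation with plain Scala Futures rather than with Spray itself. Everything in it is an assumption made for illustration: the single-threaded `routingPool` stands in for the routing actor, `workerPool` for the execution context that detach would hand work to, and `slowHandler` for a route that completes synchronously. It is not how Spray is implemented.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object DetachModel {
  // Stand-in for a route that completes synchronously (blocks for 100 ms).
  def slowHandler(): String = { Thread.sleep(100); "done" }

  // Pushes `n` requests through one "routing" thread and returns the elapsed
  // time in milliseconds. With `detached = true` the routing thread only
  // hands the handler off to a worker pool, as a model of what detach does.
  def elapsedMillis(n: Int, detached: Boolean): Long = {
    val routingPool = Executors.newSingleThreadExecutor()
    val workerPool = Executors.newFixedThreadPool(n)
    val routing = ExecutionContext.fromExecutor(routingPool)
    val workers = ExecutionContext.fromExecutor(workerPool)
    try {
      val start = System.nanoTime()
      val responses = (1 to n).map { _ =>
        if (detached)
          // The routing thread only schedules the work, then is free again.
          Future(Future(slowHandler())(workers))(routing).flatMap(f => f)(routing)
        else
          // The routing thread itself runs the blocking handler.
          Future(slowHandler())(routing)
      }
      responses.foreach(r => Await.result(r, 10.seconds))
      (System.nanoTime() - start) / 1000000
    } finally {
      routingPool.shutdown()
      workerPool.shutdown()
    }
  }
}
```

With these assumed numbers (four requests of 100 ms each), `DetachModel.elapsedMillis(4, detached = false)` takes roughly the sum of the handler times, while the detached run takes roughly the time of a single handler. If the handler already returned a Future, as in the question, the routing thread would be free in both cases.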
Also, detach comes with some inconsistencies you can read about here:
https://github.com/spray/spray/issues/717
https://github.com/spray/spray/issues/872
An alternative to using detach is using per-request-actors.
In akka-http, routing is implemented on top of Futures to be as asynchronous as possible and is no longer confined to an actor, so detach isn't needed and was therefore removed.

Without detach, Spray will process all requests one by one, while with detach it will process them in parallel. If you can process these requests in parallel, you'd better use detach for better performance.

Related

Asynchronous IO (socket) in Scala

import java.nio.channels.{AsynchronousServerSocketChannel, AsynchronousSocketChannel}
import java.net.InetSocketAddress
import scala.concurrent.{Future, blocking}
class Master {
  val server: AsynchronousServerSocketChannel = AsynchronousServerSocketChannel.open()
  server.bind(new InetSocketAddress("localhost", port))
  val client: Future[AsynchronousSocketChannel] = Future { blocking { server.accept().get() } }
}
This is pseudo code for what I'm trying to do.
Before asking this question, I searched about it and found a related answer.
In her answer, I wonder what this means: "If you want to absolutely prevent additional threads from ever being created, then you ought to use an AsyncIO library, such as Java's NIO library."
Since I don't want to suffer from either running out of memory (the case when using blocking) or thread pool hell (the opposite case), her answer was exactly what I have been looking for. However, as you can see in my pseudo code, a new thread will be created for each client (I made just one client for the sake of simplicity) due to blocking, even though I used NIO as she said.
Please explain her suggestion with a simple example.
Is my pseudo code an appropriate approach to asynchronous IO in Scala, or is there a better alternative?
Answer to Question 1
She suggests two things.
a) If you are using a Future with a blocking call, then use scala.concurrent.blocking. blocking tells the default execution context to spawn temporary threads to prevent starvation.
Let's say blockingServe() blocks. To run multiple blockingServe calls, we use Futures.
Future {
  blockingServe() //blockingServe() could be serverSocket.accept()
}
But the above code leads to starvation in an evented model. To deal with starvation, we have to ask the execution context to create new temporary threads to serve the extra requests. This is communicated to the execution context using scala.concurrent.blocking:
Future {
  blocking {
    blockingServe() //blockingServe() could be serverSocket.accept()
  }
}
b)
We still have not achieved non-blocking. The code is still blocking, just in a different thread (asynchronous).
How can we achieve true non-blocking?
We can achieve true non-blocking by using a non-blocking API.
So, in your case you have to use java.nio.channels.ServerSocketChannel to get a truly non-blocking model.
Notice that in your code snippet you have mixed a) and b), which is not required.
Answer to Question 2
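import java.net.InetSocketAddress
import java.nio.channels.{SelectionKey, Selector, ServerSocketChannel}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val selector = Selector.open()
val serverChannel = ServerSocketChannel.open()
serverChannel.configureBlocking(false) // non-blocking mode: accept() returns immediately
serverChannel.socket().bind(new InetSocketAddress("192.168.2.1", 5000))
serverChannel.register(selector, SelectionKey.OP_ACCEPT)
def assignWork[A](serverChannel: ServerSocketChannel, selector: Selector, work: => Future[A]) = {
  work
  //recurse
}
assignWork[Unit](serverChannel, selector, Future(()))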
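For comparison, the AsynchronousServerSocketChannel from the question can also be used without blocking, by passing a CompletionHandler to accept instead of calling get() on the returned java.util.concurrent.Future. A minimal sketch, assuming a hypothetical helper named acceptAsync (it is not from the question or the linked answer) that bridges the callback to a Scala Promise:

```scala
import java.net.InetSocketAddress
import java.nio.channels.{AsynchronousServerSocketChannel, AsynchronousSocketChannel, CompletionHandler}
import scala.concurrent.{Future, Promise}

object AsyncAccept {
  // Hypothetical helper: adapts the callback-style accept to a Scala Future.
  // No thread is parked while waiting; the handler runs when a client connects.
  def acceptAsync(server: AsynchronousServerSocketChannel): Future[AsynchronousSocketChannel] = {
    val promise = Promise[AsynchronousSocketChannel]()
    server.accept[Void](null, new CompletionHandler[AsynchronousSocketChannel, Void] {
      def completed(client: AsynchronousSocketChannel, attachment: Void): Unit =
        promise.success(client)
      def failed(cause: Throwable, attachment: Void): Unit =
        promise.failure(cause)
    })
    promise.future
  }
}
```

To keep accepting clients you would call acceptAsync again from a callback on the returned Future, since each call to accept handles only one connection.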

Is it correct to use `Future` to run some loop task which is never finished?

In our project, we need to run a task which listens to a queue and processes the incoming messages in a loop that never finishes. The code looks like this:
def processQueue = {
  while(true) {
    val message = queue.next();
    processMessage(message) match {
      case Success(_) => ...
      case _ => ...
    }
  }
}
So we want to run it in a separate thread.
I can imagine two ways to do it, one is to use Thread as what we do in Java:
new Thread(new Runnable { def run(): Unit = processQueue }).start()
Another way is use Future (as we did now):
Future { processQueue }
I just wonder whether it is correct to use Future in this case since, as far as I know (I might be wrong), a Future is meant to run some task which will finish or return a result at some time in the future. But our task never finishes.
I also wonder what's the best solution for this in scala.
A Future is supposed to represent a value that will eventually exist, so I don't think it makes much sense to create one that will never be fulfilled. They're also immutable, so passing information to them is a no-no. And using some externally referenced queue within the Future sounds like a dark road to go down.
What you're describing is basically an Akka Actor, which has its own FIFO queue, with a receive method to process messages. It would look something like this:
import akka.actor._
class Processor extends Actor {
  def receive = {
    case msg: String => processMessage(msg) match {
      case Success(x) => ...
      case _ => ...
    }
    case otherMsg @ Message(_, _) => {
      // process this other type of message..
    }
  }
}
Your application could create a single instance of this Processor actor with an ActorSystem (or some other elaborate group of these actors):
val akkaSystem = ActorSystem("myActorSystem")
val processor: ActorRef = akkaSystem.actorOf(Props[Processor], "Processor")
And send it messages:
processor ! "Do some work!"
In short, it's a better idea to use a concurrency framework like Akka than to create your own for processing queues on separate threads. The Future API is most definitely not the way to go.
I suggest perusing the Akka Documentation for more information.
If you are just running one thread (aside from the main thread), it won't matter. If you do this repeatedly, and you really want lots of separate threads, you should use Thread since that is what it is for. Futures are built with the assumption that they'll terminate, so you might run out of pool threads.
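If you go the plain Thread route, a minimal sketch could look like the following. It assumes the queue is a java.util.concurrent.LinkedBlockingQueue (the question does not say what queue is), and the names QueueWorker, start and handle are made up for illustration. Interrupting the thread is used as the shutdown signal.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, LinkedBlockingQueue, TimeUnit}

object QueueWorker {
  // Starts a dedicated daemon thread that processes messages from `queue`
  // one at a time until the thread is interrupted.
  def start[A](queue: LinkedBlockingQueue[A])(handle: A => Unit): Thread = {
    val worker = new Thread(new Runnable {
      def run(): Unit =
        try {
          // take() blocks this thread only, not a shared Future pool thread.
          while (!Thread.currentThread().isInterrupted) handle(queue.take())
        } catch {
          case _: InterruptedException => () // interrupt means "stop"
        }
    }, "queue-worker")
    worker.setDaemon(true) // do not keep the JVM alive just for this loop
    worker.start()
    worker
  }
}
```

A caller would keep the returned Thread and call interrupt() on it when the application shuts down.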

What effect does using Action.async have, since Play uses Netty which is non-blocking

Since Netty is a non-blocking server, what effect does changing an action to use .async have?
def index = Action { ... }
versus
def index = Action.async { ... }
I understand that with .async you will get a Future[SimpleResult]. But since Netty is non-blocking, will Play do something similar under the covers anyway?
What effect will this have on throughput/scalability? Is this a hard question to answer where it depends on other factors?
The reason I am asking is that I have my own custom Action and I wanted to reset the cookie timeout for every page request, so I am doing this, which is an async call:
object MyAction extends ActionBuilder[abc123] {
  def invokeBlock[A](request: Request[A], block: (abc123[A]) => Future[SimpleResult]) = {
    ...
    val result: Future[SimpleResult] = block(new abc123(..., result))
    result.map(_.withCookies(...))
  }
}
The takeaway from the above snippet is that I am using a Future[SimpleResult]. Is this similar to calling Action.async, but inside of my Action itself?
I want to understand what effect this will have on my application design. It seems like just for the ability to set my cookie on a per request basis I have changed from blocking to non-blocking. But I am confused since Netty is non-blocking, maybe I haven't really changed anything in reality as it was already async?
Or have I simply created another async call embedded in another one?
Hoping someone can clarify this with some details and how or what effect this will have in performance/throughput.
You are right, def index = Action { ... } is non-blocking.
The purpose of Action.async is simply to make it easier to work with Futures in your actions.
For example:
def index = Action.async {
  val allOptionsFuture: Future[List[UserOption]] = optionService.findAll()
  allOptionsFuture map {
    options =>
      Ok(views.html.main(options))
  }
}
Here my service returns a Future, and to avoid dealing with extracting the result I just map it to a Future[SimpleResult] and Action.async takes care of the rest.
If my service was returning List[UserOption] directly I could just use Action.apply, but under the hood it would still be non-blocking.
If you look at Action source code, you can even see that apply eventually calls async:
https://github.com/playframework/playframework/blob/2.3.x/framework/src/play/src/main/scala/play/api/mvc/Action.scala#L432
I happened to come across this question. I like the answer from @vptheron, and I also want to share something I read in the book "Reactive Web Applications", which I think is also great.
The Action.async builder expects to be given a function of type Request => Future[Result]. Actions declared in this fashion are not much different from plain Action { request => ... } calls, the only difference is that Play knows that Action.async actions are already asynchronous, so it doesn’t wrap their contents in a future block.
That’s right — Play will by default schedule any Action body to be executed asynchronously against its default web worker pool by wrapping the execution in a future. The only difference between Action and Action.async is that in the second case, we’re taking care of providing an asynchronous computation.
It also presented one sample:
def listFiles = Action { implicit request =>
  val files = new java.io.File(".").listFiles
  Ok(files.map(_.getName).mkString(", "))
}
which is problematic, given its use of the blocking java.io.File API.
Here the java.io.File API is performing a blocking I/O operation, which means that one of the few threads of Play's web worker pool will be hijacked while the OS figures out the list of files in the execution directory. This is the kind of situation you should avoid at all costs, because it means that the worker pool may run out of threads.
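One common way to keep such a call off the web worker pool is to run it on a separate pool reserved for blocking work. The sketch below uses only the Scala standard library; BlockingIo, blockingEc and listFileNames are hypothetical names, and the pool size of 4 is an arbitrary assumption, not a recommendation from the book.

```scala
import java.io.File
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{ExecutionContext, Future}

object BlockingIo {
  // Assumed dedicated pool for blocking calls; daemon threads so it never
  // prevents the JVM from exiting.
  private val pool = Executors.newFixedThreadPool(4, new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "blocking-io")
      t.setDaemon(true)
      t
    }
  })
  val blockingEc: ExecutionContext = ExecutionContext.fromExecutor(pool)

  // Lists the file names in `dir` on the blocking pool and returns a Future,
  // so the calling thread is not held while the OS does the work.
  def listFileNames(dir: String): Future[String] =
    Future {
      // listFiles returns null if `dir` is not a readable directory.
      val files = Option(new File(dir).listFiles).getOrElse(Array.empty[File])
      files.map(_.getName).sorted.mkString(", ")
    }(blockingEc)
}
```

In a Play controller, the returned Future could then be mapped to a result inside Action.async, in the same way as the Action.async example earlier in this thread.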
The reactive audit tool, available at https://github.com/octo-online/reactive-audit, aims to point out blocking calls in a project.
Hope it helps, too.

Scala how to use akka actors to handle a timing out operation efficiently

I am currently evaluating javascript scripts using Rhino in a restful service. I wish for there to be an evaluation time out.
I have created a mock example actor (using scala 2.10 akka actors).
case class Evaluate(expression: String)
class RhinoActor extends Actor {
  override def preStart() = { println("Start context'"); super.preStart()}
  def receive = {
    case Evaluate(expression) ⇒ {
      Thread.sleep(100)
      sender ! "complete"
    }
  }
  override def postStop() = { println("Stop context'"); super.postStop()}
}
Now I use this actor as follows:
def run {
  val t = System.currentTimeMillis()
  val system = ActorSystem("MySystem")
  val actor = system.actorOf(Props[RhinoActor])
  implicit val timeout = Timeout(50 milliseconds)
  val future = (actor ? Evaluate("10 + 50")).mapTo[String]
  val result = Try(Await.result(future, Duration.Inf))
  println(System.currentTimeMillis() - t)
  println(result)
  actor ! PoisonPill
  system.shutdown()
}
Is it wise to use the ActorSystem in a closure like this which may have simultaneous requests on it?
Should I make the ActorSystem global, and will that be ok in this context?
Is there a more appropriate alternative approach?
EDIT: I think I need to use futures directly, but I will need the preStart and postStop. Currently investigating.
EDIT: Seems you don't get those hooks with futures.
I'll try and answer some of your questions for you.
First, an ActorSystem is a very heavy weight construct. You should not create one per request that needs an actor. You should create one globally and then use that single instance to spawn your actors (and you won't need system.shutdown() anymore in run). I believe this covers your first two questions.
Your approach of using an actor to execute javascript here seems sound to me. But instead of spinning up an actor per request, you might want to pool a bunch of the RhinoActors behind a Router, with each instance having its own rhino engine that will be set up during preStart. Doing this will eliminate per-request rhino initialization costs, speeding up your js evaluations. Just make sure you size your pool appropriately. Also, you won't need to be sending PoisonPill messages per request if you adopt this approach.
You also might want to look into the non-blocking callbacks onComplete, onSuccess and onFailure as opposed to using the blocking Await. These callbacks also respect timeouts and are preferable to blocking for higher throughput. As long as whatever is way way upstream waiting for this response can handle the asynchronicity (i.e. an async capable web request), then I suggest going this route.
The last thing to keep in mind is that even though code will return to the caller after the timeout if the actor has yet to respond, the actor still goes on processing that message (performing the evaluation). It does not stop and move onto the next message just because a caller timed out. Just wanted to make that clear in case it wasn't.
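That last point can be illustrated without Akka. Here is a small sketch using a plain Future as a stand-in for the actor (the object name TimeoutDemo and the 200 ms and 50 ms durations are assumptions chosen for the demo): the caller gives up after 50 ms, yet the work still runs to completion.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object TimeoutDemo {
  // Returns (callerTimedOut, workFinishedAnyway).
  def run(): (Boolean, Boolean) = {
    val finished = new AtomicBoolean(false)
    // Stand-in for the actor's evaluation: takes 200 ms.
    val work = Future { Thread.sleep(200); finished.set(true); "complete" }
    // The caller only waits 50 ms, so this wait fails with a timeout.
    val timedOut = Try(Await.result(work, 50.millis)).isFailure
    // The timeout did not cancel anything; the work is still running.
    Await.ready(work, 5.seconds)
    (timedOut, finished.get)
  }
}
```

TimeoutDemo.run() returns (true, true): the wait timed out, and the work finished anyway.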
EDIT
In response to your comment about stopping a long execution, there are some things related to Akka to consider first. You can stop the actor, or send it a Kill or a PoisonPill, but none of these will stop it from processing the message that it's currently processing. They just prevent it from receiving new messages. In your case, with Rhino, if infinite script execution is a possibility, then I suggest handling this within Rhino itself. I would dig into the answers on this post (Stopping the Rhino Engine in middle of execution) and set up your Rhino engine in the actor in such a way that it will stop itself if it has been executing for too long. That failure will kick out to the supervisor (if pooled) and cause that pooled instance to be restarted, which will init a new Rhino in preStart. This might be the best approach for dealing with the possibility of long running scripts.

Does this Scala actor block when creating new actor in a handler?

I have the following piece of code:
actor {
  loop {
    react {
      case SomeEvent =>
        //I want to submit a piece of work to a queue and then send a response
        //when that is finished. However, I don't want *this* actor to block
        val params = "Some args"
        val f: Future[Any] = myQueue.submitWork( params );
        actor {
          //await here
          val response = f.get
          publisher ! response
        }
    }
  }
}
As I understood it, the outer actor would not block on f.get because that is actually being performed by a separate actor (the one created inside the SomeEvent handler).
Is this correct?
Yes, that is correct. Your outer actor will simply create an actor and suspend (wait for its next message). However, be very careful of this kind of thing. The inner actor is started automatically on the scheduler, to be handled by a thread. That thread will block on that Future (that looks like a java.util.concurrent.Future to me). If you do this enough times, you can run into starvation problems where all available threads are blocking on Futures. An actor is basically a work queue, so you should use those semantics instead.
Here's a version of your code, using the Scalaz actors library. This library is much simpler and easier to understand than the standard Scala actors (the source is literally a page and a half). It also leads to much terser code:
actor {(e: SomeEvent) => promise { ... } to publisher }
This version is completely non-blocking.