How does spray.routing.HttpService dispatch requests? - scala

Disclaimer: I have no Scala experience yet, so my question concerns the very basics.
Consider the following example (it may be incomplete):
import akka.actor.{ActorSystem, Props}
import akka.io.IO
import spray.can.Http
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import akka.actor.Actor
import spray.routing._
import spray.http._

object Boot extends App {
  implicit val system = ActorSystem("my-actor-system")
  val service = system.actorOf(Props[MyActor], "my")
  implicit val timeout = Timeout(5.seconds)
  IO(Http) ? Http.Bind(service, interface = "localhost", port = 8080)
}

class MyActor extends Actor with MyService {
  def actorRefFactory = context
  def receive = runRoute(myRoute)
}

trait MyService extends HttpService {
  val myRoute =
    path("my") {
      post {
        complete {
          "PONG"
        }
      }
    }
}
My question is: what actually happens when control reaches the complete block? That question is probably too general, so let me split it up.
1. I see only a single actor being created in the example. Does that mean the application is single-threaded and uses only one CPU core?
2. What happens if I make a blocking call inside complete?
3. If point 1 is true and point 2 blocks, how do I dispatch requests so that all CPUs are used? I see two ways: an actor per request and an actor per connection. The second seems reasonable, but I cannot find a way to do it with the spray library.
4. If the previous question is irrelevant, will the detach directive do? And what about passing a function that returns a Future to the complete directive? What is the difference between detach and passing a function that returns a Future?
5. What is the proper way to configure the number of worker threads and balance requests/connections?
It would be great if you could point me to explanations in the official documentation. It is very extensive, and I believe I am missing something.
Thank you.

It's answered here by Mathias - one of the Spray authors. Copying his reply for the reference:
In the end the only thing that really completes the request is a call
to requestContext.complete. Thereby it doesn't matter which thread
or Actor context this call is made from. All that matters is that it
does happen within the configured "request-timeout" period. You can of
course issue this call yourself in some way or another, but spray
gives you a number of pre-defined constructs that maybe fit your
architecture better than passing the actual RequestContext around.
Mainly these are:
- The complete directive, which simply provides some sugar on top of the "raw" ctx => ctx.complete(…) function literal.
- The Future Marshaller, which calls ctx.complete from a future.onComplete handler.
- The produce directive, which extracts a function T => Unit that can later be used to complete the request with an instance of a custom type.
Architecturally, in most cases, it's a good idea to not have the API
layer "leak into" the core of your application. I.e. the application
should not know anything about the API layer or HTTP. It should only
deal with objects of its own domain model. Therefore passing the
RequestContext directly to the application core is mostly not the best
solution.
Resorting to the "ask" and relying on the Future Marshaller is an
obvious, well understood and rather easy alternative. It comes with
the (small) drawback that an ask comes with a mandatory timeout check
itself which logically isn't required (since the spray-can layer
already takes care of request timeouts). The timeout on the ask is
required for technical reasons (so the underlying PromiseActorRef can
be cleaned up if the expected reply never comes).
Another alternative to passing the RequestContext around is the
produce directive (e.g. produce(instanceOf[Foo]) { completer =>
…). It extracts a function that you can pass on to the application
core. When your core logic calls complete(foo) the completion logic
is run and the request completed. Thereby the application core remains
decoupled from the API layer and the overhead is minimal. The
drawbacks of this approach are twofold: first the completer function
is not serializable, so you cannot use this approach across JVM
boundaries. And secondly the completion logic is now running directly
in an actor context of the application core, which might change
runtime behavior in unwanted ways if the Marshaller[Foo] has to do
non-trivial tasks.
A third alternative is to spawn a per-request actor in the API layer
and have it handle the response coming back from the application core.
Then you do not have to use an ask. Still, you end up with the same
problem that the PromiseActorRef underlying an ask has: how to clean
up if no response ever comes back from the application core? With a
per-request actor you have full freedom to implement a solution for
this question. However, if you decide to rely on a timeout (e.g. via
context.setReceiveTimeout) the benefits over an "ask" might be
non-existent.
Which of the described solutions best fits your architecture you need
to decide yourself. However, as I hopefully was able to show, you do
have a couple of alternatives to choose from.
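The "completer function" idea from the quote can be sketched without any HTTP library. This is not spray's produce directive; it only shows the shape of handing the core a T => Unit so that it stays unaware of the API layer. Foo's fields, ApplicationCore, ApiLayer, and render are hypothetical names, and the Promise stands in for the pending HTTP response.

```scala
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

// Hypothetical domain type; the core deals only with objects like this.
final case class Foo(id: Int, name: String)

object ApplicationCore {
  // The core sees a plain function to call once the result is ready.
  // It knows nothing about requests, responses, or marshalling.
  def loadFoo(id: Int, complete: Foo => Unit): Unit =
    complete(Foo(id, s"foo-$id"))
}

object ApiLayer {
  // Stand-in for marshalling a Foo into a response body.
  def render(foo: Foo): String = s"${foo.id}:${foo.name}"

  def handle(id: Int): String = {
    val response = Promise[String]() // stand-in for the pending response
    val completer: Foo => Unit = foo => response.success(render(foo))
    ApplicationCore.loadFoo(id, completer)
    // Blocking here is only for the demo; a real API layer would not wait.
    Await.result(response.future, 1.second)
  }
}
```

Note that render runs on whatever thread the core uses to call the completer, which mirrors the second drawback mentioned in the quote.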
To answer some of your specific questions: only a single actor/handler services the route, so if you make it block, Spray will block. You therefore want to either complete the route immediately or dispatch the work using one of the three options above.
There are many examples on the web for these three options. The easiest is to wrap your code in a Future. Also check the "actor per request" option/example. In the end, your architecture will determine the most appropriate approach.
Finally, Spray runs on top of Akka, so all Akka configuration still applies. See the HOCON reference.conf and application.conf for the actor threading settings.
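As a rough illustration of "wrap your code in a Future", the sketch below moves a blocking call onto a dedicated thread pool using only the standard library. The pool size of 4, slowLookup, and lookupAsync are assumptions made for the example. In a spray route, the resulting Future would be what gets handed to complete, relying on the Future Marshaller described in the quote.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingOffload {
  // Dedicated pool for blocking work, so the thread that handles
  // requests is never tied up. The size 4 is an arbitrary assumption.
  val blockingEc = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

  // Hypothetical stand-in for a blocking call (JDBC, file I/O, ...).
  def slowLookup(key: String): String = {
    Thread.sleep(100)
    s"PONG:$key"
  }

  // Returns immediately; the blocking call runs on the dedicated pool.
  def lookupAsync(key: String): Future[String] =
    Future(slowLookup(key))(blockingEc)

  def shutdown(): Unit = blockingEc.shutdown()
}
```

Usage: BlockingOffload.lookupAsync("my") returns a Future[String] right away. Remember to call shutdown() when the application stops, since the pool's threads are not daemon threads.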

Related

Akka(Actor Model) , with Play being built on top of Akka itself, how are the actors used?

I'm fairly new to Akka, but with the tutorials available online I have managed to understand how it works: creating actors and their children, creating a supervisor for fault tolerance, and how messages go into the mailbox. So in a fair sense I understand how Akka works. But when it comes to using Akka with Play, I have been stuck for a few days now. I understand Akka works like threads, but with Play controllers receiving requests and sending responses, where does Akka come in?
For a sample project, I have an HTML page that sends data to the controller (via POST). The controller receives it, runs a Cassandra DB query to extract data, and displays the data on a new page. This works fairly easily, but how do I implement it using the Akka actor model? Where does the Akka code go? Do I take the HTTP request inside the actor and query accordingly? Also, do I write the actor inside the controller itself?
Any kind of suggestions/books/sample projects/code snippets are really welcomed, I can share the code if required, sorry if my question might have seen vague, but in need of a little push, to explore the world of Akka.
Thank you in advance.
Note:
I'm building the sample project on Play Framework using Scala.
I guess you should check the Play documentation; it has a nice example of how to use actors. To answer your question: preferably define the actor in its own file, as you would any other class. You can instantiate it as described in the Play documentation or through a supervisor, and then send it a message (or ask it) from your Action, like so:
import scala.concurrent.duration._
import akka.pattern.ask
import akka.util.Timeout // needed for the Timeout type below

implicit val timeout: Timeout = 5.seconds

def sayHello(name: String) = Action.async {
  (helloActor ? SayHello(name)).mapTo[String].map { message =>
    Ok(message)
  }
}
With helloActor being your actor and SayHello being a message (case class).
In your case, the actor would query the database when it receives SayHello and reply with the data.
Hope it helps a bit.
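For completeness, here is a minimal sketch of what the actor side of that snippet might look like, written against classic Akka only (no Play), assuming akka-actor 2.4 or later on the classpath. The greeting text, the actor name "hello-actor", and HelloDemo are assumptions for illustration; a real actor would run the database query where the comment indicates.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

object HelloActor {
  def props: Props = Props(new HelloActor)
  final case class SayHello(name: String)
}

// Lives in its own file, outside any controller.
class HelloActor extends Actor {
  import HelloActor._

  def receive: Receive = {
    case SayHello(name) =>
      // A real actor would query the database here and reply with the data.
      sender() ! s"Hello, $name"
  }
}

object HelloDemo {
  // Plays the role of the controller: ask the actor, get a Future back.
  def run(): String = {
    val system = ActorSystem("hello-demo")
    implicit val timeout: Timeout = Timeout(5.seconds)
    try {
      val helloActor = system.actorOf(HelloActor.props, "hello-actor")
      val reply = (helloActor ? HelloActor.SayHello("Play")).mapTo[String]
      Await.result(reply, 5.seconds) // blocking only for the demo
    } finally system.terminate()
  }
}
```

In a Play controller you would not block with Await; the Future returned by the ask is mapped into a result inside Action.async, as in the snippet above.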

How do you know what messages you can send to actor?

Is there any standard way of formalizing my Scala/Akka actor API? IMHO, having to look into the implementation to know what to send is not a good option. Also, if the implementation changes and my message is no longer valid (it does not invoke the action I think it invokes), I get no warning or error.
This is a question that is discussed very much in the community. I heard that maybe Akka 3 will have better support for typesafe actors, but that is some time down the road.
In the meantime you could use TypedActors, though the general suggestion is to use them only at the boundaries of your application.
A nice approach that does not give you any type safety, but makes the contract of an actor more visible, is to define the messages an actor can receive in its companion object. This way, each time you want to send a message to an actor, you choose from the messages its companion object defines. This of course works best if you have specific messages for each actor. If you change the implementation, you can remove the old message and add a new one, so that everyone who still uses the old message gets compiler errors.
Lastly, there was a nice pattern on the mailing list last week. Its author creates traits that define the contracts for the actors and their consumers, but you still need to make sure that the consumer mixes in the correct trait.
In my experience, the best way to make sure everything works is an extensive test suite which will test each actor for itself, but also the communication between specific actors.
The approach generally taken in Erlang is to avoid sending messages to a process directly, and to provide additional API in the same module which defines the behavior of the process. In Akka it would look like
class Foo extends Actor {
  // handles messages Bar(x: Int) and Baz
}

object Foo {
  def bar(foo: ActorRef, x: Int) = foo ! Bar(x)
  def baz(foo: ActorRef) = (foo ? Baz).mapTo[TypeOfResponseToBaz]
}
One problem is handling return messages, since Erlang generally promotes a more synchronous style than Akka does. This can be handled by a naming convention (e.g. BarResponse, or FooBarResponse if different actors handle the same message with different responses).
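The two suggestions above (messages declared in the companion object, plus Erlang-style wrapper functions with a "Response" naming convention) can be combined roughly as follows. This is a sketch assuming classic akka-actor 2.4 or later; Counter, Add, GetTotal, and TotalResponse are hypothetical names chosen for the example.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object Counter {
  // Everything the actor accepts or replies with is listed in one place.
  sealed trait Command
  final case class Add(x: Int) extends Command
  case object GetTotal extends Command
  final case class TotalResponse(total: Int) // reply to GetTotal

  def props: Props = Props(new Counter)

  // Wrapper functions: callers never build messages by hand, so removing
  // or changing a message breaks callers at compile time.
  def add(counter: ActorRef, x: Int): Unit = counter ! Add(x)

  def total(counter: ActorRef)(implicit timeout: Timeout): Future[TotalResponse] =
    (counter ? GetTotal).mapTo[TotalResponse]
}

class Counter extends Actor {
  import Counter._
  private var sum = 0

  def receive: Receive = {
    case Add(x)   => sum += x
    case GetTotal => sender() ! TotalResponse(sum)
  }
}
```

The ActorRef itself is still untyped, so nothing stops a caller from bypassing the wrappers and sending an arbitrary message. The pattern only makes the intended contract visible and easier to keep in sync.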

What is the best way to get a reference to an Akka Actor

I am a little confused by how I am supposed to get a reference to my Actor once it has been created in the Akka system. Say I have the following:
class MyActor(val x: Int) extends Actor {
  def receive = {
    case Msg => doSth()
  }
}
And at some stage I would create the actor as follows:
val actorRef = system.actorOf(Props(new MyActor(2)), name = "some-actor")
But when I need to refer to the MyActor object I cannot see any way to get it from the actor ref?
Thanks
Des
I'm adding my comment as an answer since it seems to be viewed as something worth putting in as an answer.
Not being able to directly access the actor is the whole idea behind actor systems. You are not supposed to be able to get at the underlying actor class instance. The actor ref is a lightweight proxy to your actor class instance. Allowing people to directly access the actor instance could lead down the path of mutable data issues/concurrent state update issues and that's a bad path to go down. By forcing you to go through the ref (and thus the mailbox), state and data will always be safe as only one message is processed at a time.
I think cmbaxter has a good answer, but I want to make it just a bit more clear. The ActorRef is your friend for the following reasons:
You cannot ever access the underlying actor. Actors receive their thread of execution from the Dispatcher given to them when they are created. They operate on one mailbox message at a time, so you never have to worry about concurrency inside of the actor unless YOU introduce it (by handling a message asynchronously with a Future or another Actor instance via delegation, for example). If someone had access to the underlying class behind the ActorRef, they could easily call into the actor via a method using another thread, thus negating the point of using the Actor to avoid concurrency in the first place.
ActorRefs provide Location Transparency. By this, I mean that the Actor instance could exist locally on the same JVM and machine as the actor from which you would like to send it a message, or it could exist on a different JVM, on a different box, in a different data center. You don't know, you don't care, and your code is not littered with the details of HOW the message will be sent to that actor, thus making it more declarative (focused on what you want to do business-wise, not how it will be done). When you start using Remoting or Clusters, this will prove very valuable.
ActorRefs mask the instance of the actor behind it when failure occurs, as well. With the old Scala Actors, you had no such proxy and when an actor "died" (threw an Exception) that resulted in a new instance of the Actor type being created in its place. You then had to propagate that new instance reference to anyone who needed it. With ActorRef, you don't care that the Actor has been reincarnated - it's all transparent.
There is one way to get access to the underlying actor when you want to do testing, using TestActorRef.underlyingActor. This way, you can write unit tests against functionality written in methods inside the actor. Useful for checking basic business logic without worrying about Akka dynamics.
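A small sketch of that testing approach, assuming akka-actor and akka-testkit (2.4 or later) are on the classpath. TallyActor and its members are hypothetical; only TestActorRef and underlyingActor come from the testkit.

```scala
import akka.actor.{Actor, ActorSystem}
import akka.testkit.TestActorRef

// Hypothetical actor whose business logic lives in a plain method.
class TallyActor(val step: Int) extends Actor {
  var total: Int = 0
  def next(current: Int): Int = current + step
  def receive: Receive = { case "tick" => total = next(total) }
}

object TallyActorCheck {
  def run(): (Int, Int) = {
    implicit val system: ActorSystem = ActorSystem("underlying-actor-demo")
    try {
      val ref = TestActorRef(new TallyActor(2))
      // Plain method call on the actor instance, no message involved.
      val direct = ref.underlyingActor.next(40)
      // TestActorRef handles the message on the calling thread,
      // so the state change is visible on the next line.
      ref ! "tick"
      (direct, ref.underlyingActor.total)
    } finally system.terminate()
  }
}
```

This kind of direct access is meant for tests only; production code should keep going through the ActorRef for the reasons listed above.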
Hope that helps.

Synchronous Scala Future without separate thread

I'm building a library that, as part of its functionality, makes HTTP requests. To get it to work in the multiple environments it'll be deployed in I'd like it to be able to work with or without Futures.
One option is to have the library parametrise the type of its response so you can create an instance of the library with type Future, or an instance with type Id, depending on whether you are using an asynchronous HTTP implementation. (Id might be an Identity monad - enough to expose a consistent interface to users)
I've started with that approach but it has got complicated. What I'd really like to do instead is use the Future type everywhere, boxing synchronous responses in a Future where necessary. However, I understand that using Futures will always entail some kind of threadpool. This won't fly in e.g. AppEngine (a required environment).
Is there a way to create a Future from a value that will be executed on the current thread and thus not cause problems in environments where it isn't possible to spawn threads?
(p.s. as an additional requirement, I need to be able to cross build the library back to Scala v2.9.1 which might limit the features available in scala.concurrent)
From what I understand, you want to compute something and then wrap the result in a Future. In that case, you can always use a Promise:
val p = Promise[Int]()
p success 42
val f = p.future
You now have a Future that already holds the final value 42.
Promise is very well explained here.
Take a look at the Scalaz version of Future. It is built on a trampoline mechanism and runs on the current thread unless fork or apply is called, and it removes the need for ExecutionContext imports entirely =)
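To make the standard-library route concrete, here is a sketch assuming Scala 2.10 or later (the scala.concurrent API used here is not available on 2.9.1, so the cross-build requirement would need a separate solution). SyncFutures and CallingThread are hypothetical names. The custom ExecutionContext simply runs each task on whichever thread submits it, so no thread pool is created.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object SyncFutures {
  // Runs every task on the thread that submits it; no pool is created.
  object CallingThread extends ExecutionContext {
    def execute(runnable: Runnable): Unit = runnable.run()
    def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  }

  // Both futures are already completed when they are returned.
  val viaPromise: Future[Int] = Promise[Int]().success(42).future
  val direct: Future[Int] = Future.successful(42)

  // Transformations still require an ExecutionContext, so pass the
  // same-thread one explicitly instead of importing the global pool.
  val mapped: Future[Int] = direct.map(_ + 1)(CallingThread)
}
```

One caveat of a same-thread context is that long callback chains run on the caller's stack, so deeply nested transformations can overflow it. Treat this as a workaround for restricted environments rather than a general default.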

Scala and akka lifecycle.. again

So this question is related to an old one of mine: Do I need to re-use the same Akka ActorSystem or can I just create one every time I need one?
I asked a question about the lifecycle of actors, and I knew something was wrong in my mind, but couldn't phrase it correctly. Hopefully I can now :-).
Here's the situation. I want to test actors that have dependencies on other components and actors, so I went about composing my actors at bootstrap time (I'm using Scalatra, but this applies however you bootstrap your app). I therefore have something like this:
trait DependencyComponent {
  val dependency: Dependency
}

trait ActorComponentA extends Actor with DependencyComponent {
  val actorB: ActorRef
}

trait ActorComponentB extends Actor with DependencyComponent
Ok, so now I can test my actors by extending the traits and providing mock dependencies, all good. And I can bootstrap my app like so:
Bootstrap
val system = ActorSystem()
val actorA = system.actorOf(Props[DefaultActorA])

class DefaultActorB extends ActorComponentB {
  val dependency = new RealDependency()
}

class DefaultActorA extends ActorComponentA {
  val dependency = new RealDependency()
  // 100 DefaultActorB routees behind a round-robin router
  val actorB = context.actorOf(Props[DefaultActorB].withRouter(RoundRobinRouter(nrOfInstances = 100)))
}
Cool, I'm happy :-). Now I can use the actorSystem and actorA within my app, and it has 100 actorB instances routed to pass work to. So when actorA decides that the work is done, it's my understanding that it should broadcast to the routed actors to shut down. At this point, when another request comes in, actorA can no longer send messages to the router because all its actors are dead.
If I wasn't setting this up at boot time, then actorA and its dependencies could be created when needed in my app. But that is very much like "newing up an object" in the DI world. In order to test, I would end up overriding the places where the actors were created.
The Scalatra docs suggest creating my actors at boot time, so I feel that I am missing something here. Any help appreciated.
Cheers, Chris.
EDIT
I've +1'd both #futurechimp and #cmbaxter, as both answers seem valid but slightly conflicting. So this is an open comment to both of you.
So #cmbaxter, am I right in thinking that you're suggesting never calling 'stop' on the routed actors and just maintaining a pool of them for use by ALL requests? And #futurechimp, you're suggesting having the servlet instantiate the actors per request and kill them at the end of their lifecycle. Right?
It seems like per-request will spawn more actors (but dispose of them), whereas the pool will have only a limited set for all requests. In that case, is there a potential bottleneck in the pool approach?
I guess basically I'm asking whether my assumptions are correct and, if so, what the advantages and disadvantages of both approaches are.
Instantiating an ActorSystem is expensive - however instantiating an Actor isn't. If you only want to instantiate your ActorSystem in ScalatraBootstrap, and your Actors elsewhere, that should work fine if that's what you need to do. I'll talk to some other people to confirm this, and then change the docs in Scalatra's Akka Guide to avoid confusion in future.
One of the questions you have to ask yourself here is: are my actors going to be stateful or stateless? If stateless (and I would prefer this approach personally when possible), then they can be "long-lived": you can start them when the server boots up and leave them running for the duration of the server's life. When you need to talk to them from elsewhere in the code, use system.actorFor(String) or system.actorSelection(String) (depending on what version of Akka you are using) to look up the actor and send it a message. If the actors are going to be stateful, then they probably should be "short-lived" and started up in response to an individual request. In this case, you will not start them up when the server boots up; you will only start up the ActorSystem itself. Then, when the request comes in, you will instantiate via system.actorOf instead and make sure that when the work is done you stop ActorA, as it's the supervisor of all the ActorBs, and stopping A will stop all of the Bs started by A.
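The "short-lived" variant might look roughly like the sketch below, assuming classic akka-actor 2.4 or later. PerRequest, Job, Result, Worker, and RequestHandler are hypothetical stand-ins (RequestHandler for ActorA, Worker for a single ActorB); the router and the injected dependencies from the question are left out to keep the example small.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

object PerRequest {
  final case class Job(n: Int)
  final case class Result(value: Int)

  // Stands in for one ActorB.
  class Worker extends Actor {
    def receive: Receive = { case Job(n) => sender() ! Result(n * 2) }
  }

  // Stands in for ActorA: created per request, owns its worker.
  class RequestHandler extends Actor {
    private val worker = context.actorOf(Props(new Worker), "worker")

    def receive: Receive = {
      case job: Job =>
        val requester = sender() // capture before switching behaviour
        worker ! job
        context.become {
          case result: Result =>
            requester ! result
            context.stop(self) // stopping the parent also stops its children
        }
    }
  }

  // Only the ActorSystem is long-lived; the handler exists for one request.
  def handle(system: ActorSystem, n: Int): Result = {
    implicit val timeout: Timeout = Timeout(3.seconds)
    val handler = system.actorOf(Props(new RequestHandler))
    Await.result((handler ? Job(n)).mapTo[Result], 3.seconds) // demo only
  }
}
```

For the "long-lived" variant you would instead create the handler once at boot, give it a name, and look it up per request, without ever calling stop on it.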