Akka IO and TestActorRef - scala

If I need to write an integration test involving HTTP request via spray-can, how can I make sure that spray-can is using CallingThreadDispatcher?
Currently the following actor will print None
class Processor extends Actor {
override def receive = {
case Msg(n) =>
val response = (IO(Http) ? HttpRequest(GET, Uri("http://www.google.com"))).mapTo[HttpResponse]
println(response.value)
}
}
How can I make sure that the request is being performed on the same thread as the test (resulting in a synchronous request)?

It seems like strange way to do integration internal-testing as you don't mock the "Google", so is more like integration external-testing and synchronous TestActorRef doesn't much fit here. The requirement to control threads inside spray is also pretty tricky. However, if you really need that for http-request - it's possible. In general case, you have to setup several dispatchers in your application.conf:
"manager-dispatcher" (from Http.scala) to dispatch your IO(Http) ? req
"host-connector-dispatcher" to use it by HttpHostConnector(or ProxyHttpHostConnector) which actually dispatch your request
"settings-group-dispatcher" for Http.Connect
They all are decribed in Configuration Section of spray documentation. And they all are pointing to "akka.actor.default-dispatcher" (see Dispatchers), so you can change all of them by changing this one.
The problem here is that calling thread is not guaranteed to be your thread actually, so it will NOT help much with your tests. Just imagine if some of actors registers some handler, responding to your message:
//somewhere in spray...
case r#Request => registerHandler(() => {
...
sender ! response
})
The response may be sent from another thread, so response.value may still be None in current. Actually, the response will be sent from the listening thread of underlying socket library, indepently from your test's thread. Simply saying, request may be sent in one (your) thread, but the response is received in another.
If you really really need to block here, I would recommend you to move such code samples (like IO(Http) ? HttpRequest) out and mock them in any convinient way inside your tests. Smtng like that:
trait AskGoogle {
def okeyGoogle = IO(Http) ? HttpRequest(GET, Uri("http://www.google.com"))).mapTo[HttpResponse]
}
trait AskGoogleMock extends AskGoogle {
def okeyGoogle = Await.result(super.okeyGoogle, timeout)
}
class Processor extends Actor with AskGoogle {
override def receive = {
case Msg(n) =>
val response = okeyGoogle
println(response.value)
}
}
val realActor = system.actorOf(Props[Processor])
val mockedActor = TestActorRef[Processor with AskGoogleMock]
By the way, you can mock IO(HTTP) with another TestActorRef to the custom actor, which will do the outside requests for you - it should require minimal code changes if you have a big project.

Related

Non-blocking updates of mutable state with Akka Actors

EDIT: clarification of intent:
I have a (5-10 second) scala computation that aggregates some data from many AWS S3 objects at a given point in time. I want to make this information available through a REST API. I'd also like to update this information every minute or so for new objects that have been written to this bucket in the interim. The summary itself will be a large JSON blob, and can save a bunch of AWS calls if I cache the results of my S3 API calls from the previous updates (since these objects are immutable).
I'm currently writing this Spray.io based REST service in Scala. I'd like the REST server to continue serving 'stale' data even if a computation is currently taking place. Then once the computation is finished, I'd like to atomically start serving requests of the new data snapshot.
My initial idea was to have two actors, one doing the Spray routing and serving, and the other handling the long running computation and feeding the most recent cached result to the routing actor:
class MyCompute extends Actor {
var myvar = 1.0 // will eventually be several megabytes of state
import context.dispatcher
// [ALTERNATIVE A]:
// def compute() = this.synchronized { Thread.sleep(3000); myvar += 1.0 }
// [ALTERNATIVE B]:
// def compute() = { Thread.sleep(3000); this.synchronized { myvar += 1.0 }}
def compute() = { Thread.sleep(3000); myvar += 1.0 }
def receive = {
case "compute" => {
compute() // BAD: blocks this thread!
// [FUTURE]:
Future(compute()) // BAD: Not threadsafe
}
case "retrieve" => {
sender ! myvar
// [ALTERNATIVE C]:
// sender ! this.synchronized { myvar }
}
}
}
class MyHttpService(val dataService:ActorRef) extends HttpServiceActor {
implicit val timeout = Timeout(1 seconds)
import context.dispatcher
def receive = runRoute {
path("ping") {
get {
complete {
(dataService ? "retrieve").map(_.toString).mapTo[String]
}
}
} ~
path("compute") {
post {
complete {
dataService ! "compute"
"computing.."
}
}
}
}
}
object Boot extends App {
implicit val system = ActorSystem("spray-sample-system")
implicit val timeout = Timeout(1 seconds)
val dataService = system.actorOf(Props[MyCompute], name="MyCompute")
val httpService = system.actorOf(Props(classOf[MyHttpService], dataService), name="MyRouter")
val cancellable = system.scheduler.schedule(0 milliseconds, 5000 milliseconds, dataService, "compute")
IO(Http) ? Http.Bind(httpService, system.settings.config.getString("app.interface"), system.settings.config.getInt("app.port"))
}
As things are written, everything is safe, but when passed a "compute" message, the MyCompute actor will block the thread, and not be able to serve requests to the MyHttpService actor.
Some alternatives:
akka.agent
The akka.agent.Agent looks like it is designed to handle this problem nicely (replacing the MyCompute actor with an Agent), except that it seems to be designed for simpler updates of state:: In reality, MyCompute will have multiple bits of state (some of which are several megabyte datastructures), and using the sendOff functionality would seemingly rewrite all of that state every time which would seemingly apply a lot of GC pressure unnecessarily.
Synchronization
The [Future] code above solves the blocking problem, but as if I'm reading the Akka docs correctly, this would not be threadsafe. Would adding a synchronize block in [ALTERNATIVE A] solve this? I would also imagine that I only have to synchronize the actual update to the state in [ALTERNATIVE B] as well. I would seemingly also have to do the same for the reading of the state as in [ALTERNATIVE C] as well?
Spray-cache
The spray-cache pattern seems to be built with a web serving use case in mind (small cached objects available with a key), so I'm not sure if it applies here.
Futures with pipeTo
I've seen examples of wrapping a long running computation in a Future and then piping that back to the same actor with pipeTo to update internal state.
The problem with this is: what if I want to update the mutable internal state of my actor during the long running computation?
Does anyone have any thoughts or suggestions for this use case?
tl;dr:
I want my actor to update internal, mutable state during a long running computation without blocking. Ideas?
So let the MyCompute actor create a Worker actor for each computation:
A "compute" comes to MyCompute
It remembers the sender and spawns the Worker actor. It stores the Worker and the Sender in Map[Worker, Sender]
Worker does the computation. On finish, Worker sends the result to MyCompute
MyCompute updates the result, retrieves the orderer of it from the Map[Worker, Sender] using the completed Worker as the key. Then it sends the result to the orderer, and then it terminates the Worker.
Whenever you have blocking in an Actor, you spawn a dedicated actor to handle it. Whenever you need to use another thread or Future in Actor, you spawn a dedicated actor. Whenever you need to abstract any complexity in Actor, you spawn another actor.

Assert order of messages received using Akka TestProbe

We have an actor that we are writing unit tests for, and as part of the tests we want to assert that certain messages are sent to another actor in a certain order. In our unit tests, the actor receiving the messages is represented by an Akka TestProbe, which gets injected into the actor under test when it is created.
It is no problem to assert that the messages were sent to the test probe, however we have been struggling to work out a way to assert they are sent in the correct order (we could not find any suitable methods for doing this in the documentation). Any ideas how we achieve this?
Below is a minimal implementation that highlights the problem.
Implementation
case class Message(message: String)
case class ForwardedMessage(message: String)
class ForwardingActor(forwardTo: ActorRef) extends Actor {
def receive = {
case Message(message) =>
forwardTo ! ForwardedMessage(message)
}
}
Unit Test
class ForwardMessagesInOrderTest extends TestKit(ActorSystem("testSystem"))
with WordSpecLike
with MustMatchers {
"A forwarding actor" must {
val forwardingReceiver = TestProbe()
val forwardingActor = system.actorOf(Props(new ForwardingActor(forwardingReceiver.ref)))
"forward messages in the order they are received" in {
forwardingActor ! Message("First message")
forwardingActor ! Message("Second message")
// This is the closest way we have found of achieving what we are looking for, it asserts
// that both messages were received, but doesn't assert correct order. The test will pass
// regardless which way round we put the messages below.
forwardingReceiver.expectMsgAllOf(
ForwardedMessage("Second message"),
ForwardedMessage("First message"))
}
}
}
I'm going to suggest two changes to your test spec. First, when creating the actor under test, use a TestActorRef like so:
val forwardingActor = TestActorRef(new ForwardingActor(forwardingReceiver.ref))
Using a TestActorRef will assure that the CallingThreadDispatcher is used, removing any complications from testing async code (which an actor is). Once you do that, you can change your assertions to:
forwardingReceiver.expectMsg(ForwardedMessage("First message"))
forwardingReceiver.expectMsg(ForwardedMessage("Second message"))
These assertions are inherently In-Order, so if things came in out of this order, they will fail. This should fix your issues.

spray, scala : change the timeout

I want to change the timeout in a spray application, but what is the simplest way to achieve this? I saw some examples on github but they are rather complicated.
thanks.
I tried this :
class MyServiceActor extends Actor with MyService {
sender ! SetRequestTimeout(scala.concurrent.duration.Duration(120,"second"))
sender ! spray.io.ConnectionTimeouts.SetIdleTimeout(scala.concurrent.duration.Duration(120,"second"))
// the HttpService trait defines only one abstract member, which
// connects the services environment to the enclosing actor or test
def actorRefFactory = context
// this actor only runs our route, but you could add
// other things here, like request stream processing
// or timeout handling
def receive = runRoute( myRoute )
}
but the timeout seems to stay at ~5 seconds.
you should be able to configure the timeout using the timeout configuration value for the underlying spray can server
spray.can.server {
request-timeout = 2s
}

Puzzled by the spawned actor from a Spray route

I am doing some Http request processing using Spray. For a request I spin up an actor and send the payload to the actor for processing and after the actor is done working on the payload, I call context.stop(self) on the actor to wind the actor down.The idea is to prevent oversaturation of actors on the physical machine.
This is how I have things set up..
In httphandler.scala, I have the route set up as follows:
path("users"){
get{
requestContext => {
val userWorker = actorRefFactory.actorOf(Props(new UserWorker(userservice,requestContext)))
userWorker ! getusers //get user is a case object
}
}
} ~ path("users"){
post{
entity(as[UserInfo]){
requestContext => {
userInfo => {
val userWorker = actorRefFactory.actorOf(Props(new UserWorker(userservice,requestContext)))
userWorker ! userInfo
}
}
}
}
}
My UserWorker actor is defined as follows:
trait RouteServiceActor extends Actor{
implicit val system = context.system
import system.dispatcher
def processRequest[responseModel:ToResponseMarshaller](requestContex:RequestContext)(processFunc: => responseModel):Unit = {
Future{
processFunc
} onComplete {
case Success(result) => {
requestContext.complete(result)
}
case Failure(error) => requestContext.complete(error)
}
}
}
class UserWorker(userservice: UserServiceComponent#UserService,requestContext:RequestContext) extends RouteServiceActor{
def receive = {
case getusers => processRequest(requestContext){
userservice.getAllUsers
}
context.stop(self)
}
case userInfo:UserInfo => {
processRequest(requestContext){
userservice.createUser(userInfo)
}
context.stop(self)
}
}
My first question is, am I handling the request in a true asynchronous fashion? What are some of the pitfalls with my code?
My second question is how does the requestContext.complete work? Since the original request processing thread is no longer there, how does the requestContext send the result of the computation back to the client.
My third question is that since I am calling context.stop(self) after each of my partial methods, is it possible that I terminate the worker while it is in the midst of processing a different message.
What I mean is that while the Actor receives a message to process getusers, the same actor is done processing UserInfo and terminates the Actor before it can get to the "getusers" message. I am creating new actors upon every request, but is it possible that under the covers, the actorRefFactory provides a reference to a previously created actor, instead of a new one?
I am pretty confused by all the abstractions and it would be great if somebody could break it down for me.
Thanks
1) Is the request handled asynchronously? Yes, it is. However, you don't gain much with your per-request actors if you immediately delegate the actual processing to a future. In this simple case a cleaner way would be to write your route just as
path("users") {
get {
complete(getUsers())
}
}
def getUsers(): Future[Users] = // ... invoke userservice
Per-request-actors make more sense if you also want to make route-processing logic run in parallel or if handling the request has more complex requirements, e.g. if you need to query things from multiple service in parallel or need to keep per-request state while some background services are processing the request. See https://github.com/NET-A-PORTER/spray-actor-per-request for some information about this general topic.
2) How does requestContext.complete work? Behind the scenes it sends the HTTP response to the spray-can HTTP connection actor as a normal actor message "tell". So, basically the RequestContext just wraps an ActorRef to the HTTP connection which is safe to use concurrently.
3) Is it possible that "the worker" is terminated by context.stop(self)? I think there's some confusion about how things are scheduled behind the scenes. Of course, you are terminating the actor with context.stop but that just stops the actor but not any threads (as threads are managed completely independently from actor instances in Akka). As you didn't really make use of an actor's advantages, i.e. encapsulating and synchronizing access to mutable state, everything should work (but as said in 1) is needlessly complex for this use case). The akka documentation has lots of information about how actors, futures, dispatchers, and ExecutionContexts work together to make everything work.
In addition to jrudolph answer your spray routing structure shouldn't even compile, cause in your post branch you don't explicitly specify a requestContext. This structure can be simplified a bit to this:
def spawnWorker(implicit ctx: RequestContext): ActorRef = {
actorRefFactory actorOf Props(new UserWorker(userservice, ctx))
}
lazy val route: Route = {
path("users") { implicit ctx =>
get {
spawnWorker ! getUsers
} ~
(post & entity(as[UserInfo])) {
info => spawnWorker ! info
}
}
}
The line info => spawnWorker ! info can be also simplified to spawnWorker ! _.
Also there is an important point concerning explicit ctx declaration and complete directive. If you explicitly declared ctx in your route, you can't use complete directive, you have to explicitly write ctx.complete(...), link on this issue

Is it possible to receive or react to a message outside of act function?

I'm using RXTX library to send some data to serial port. After sending data I must wait 1 second for an ACK. I have this functionality implemented using an ArrayBlockingQueue:
Eg.
val queue = ArrayBlockingQueue(1)
def send(data2Send : Array[Byte]) : Array[Byte]{
out.write(data2Send)
queue.poll(1000)
}
def receive(receivedData : Array[Byte]){
queue.add(receivedData)
}
This works perfectly, but since I'm learning Scala I would like to use Actors insted of threads and locking structures.
My first attempt is as follows:
class Serial {
sender = new Sender(...)
new Receiver(...).start
class Sender {
def send(data2Send : Array[Byte]) : Array[Byte]{
out.write(data2Send)
receiveWithin(WAIT_TIMEOUT_MILLIS) {
case response => response
case TIMEOUT => null
}
}
}
class Receiver extends Actor{
def act{
loop{
sender ! read()
}
}
}
}
But this code throws a java.lang.AssertionError: assertion failed: receive from channel belonging to other actor. I think the problem is that I can't use receive or react outside the act definition. Is right the aproach that I'm following?
Second attempt:
class Serial {
new Sender(...).start
new Receiver(...).start
def send() = (sender ?! data2Send).asInstanceOf(Array[Byte])
class Sender {
def act() {
loop{
receive{
out.write(data2Send)
receiveWithin(WAIT_TIMEOUT_MILLIS) {
case response => response
case TIMEOUT => null
}
}
}
}
}
class Receiver extends Actor{
def act{
loop{
sender ! read()
}
}
}
}
In this second attempt I get java.util.NoSuchElementException: head of empty list when sender ! read() line is executed. And it looks a lot more complex
Unless you are using NIO, you can't avoid blocking in this situation. Ultimately your message is coming from a socket which you need to read() from (i.e. some thread, somewhere, must block).
Looking at your 3 examples (even assuming they all worked), if you found a bug at 2am in six months' time, which code snippet do you think you'd rather be looking at? I know which one I would choose!
Actors are great for sending asynchronous messages around event-driven systems. Where you hit some external API which uses sockets, you can wrap an actor-like facade around them, so that they can interact with other parts of the system (so that the rest of the system is protected from knowing the implementation details). A few pointers:
For the actual actor which has to deal with reading/writing to the socket, keep it simple.
Try and organize the system such that communication with this actor is asynchronous (i.e. other actors are not blocking and awaiting replies)
Given that the scala actor library is imminently being deprecated, I'd start using akka