How to ensure message consistency when using futures in Akka - scala

I would like to understand how to work with a stateful actor when I have async calls within the action.
Consider the following actor:
@Singleton
class MyActor @Inject() () extends Actor with LazyLogging {
import context.dispatcher
override def receive: Receive = {
case Test(id: String) =>
Future { logger.debug(s"id [$id]") }
}
}
and a call to this actor:
Stream.range(1, 100).foreach { i =>
MyActor ! Test(i.toString)
}
This will give me an inconsistent printing of the series.
How am I supposed to use futures inside an actor without losing the entire "one message after another" functionality?

Store the Future in a var inside the actor; then, on every subsequent message, chain the new work onto it with flatMap:
if(storedFut == null) storedFut = Future { logger.debug(s"id [$id]") }
else storedFut = storedFut.flatMap(_ => Future { logger.debug(s"id [$id]") })
flatMap is exactly what sequences Futures: the next Future is not started until the previous one has completed.
Sidenote
If you want things to happen in parallel, you're in the nondeterministic zone, where you cannot impose ordering.
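A minimal sketch of the chaining idea, using only scala.concurrent so that it runs without Akka. OrderedEffects, submit, and done are hypothetical names, not part of any library. Inside an actor, tail would be a private field, which is safe to mutate because receive handles one message at a time. The recover step is an added assumption, so that one failed effect does not stall everything queued behind it.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper: keeps the last Future of the chain and appends to it.
// Not thread-safe by itself; call submit from a single thread (or one actor).
final class OrderedEffects(implicit ec: ExecutionContext) {
  // Starts as an already-completed Future, so no null check is needed.
  private var tail: Future[Unit] = Future.successful(())

  // Runs `effect` only after every previously submitted effect has finished.
  def submit(effect: => Unit): Future[Unit] = {
    tail = tail
      .recover { case _ => () } // assumption: a failure must not stall the chain
      .flatMap(_ => Future(effect))
    tail
  }

  // Completes once everything submitted so far has run.
  def done: Future[Unit] = tail
}

object OrderedEffectsDemo {
  // Submits 1..n and returns the order in which the effects actually ran.
  def run(n: Int): List[Int] = {
    import ExecutionContext.Implicits.global
    val seen = ListBuffer.empty[Int]
    val effects = new OrderedEffects
    (1 to n).foreach { i =>
      effects.submit(seen.synchronized { seen += i; () })
    }
    Await.result(effects.done, 10.seconds)
    seen.synchronized(seen.toList)
  }

  def main(args: Array[String]): Unit =
    println(run(10).mkString(", "))
}
```

Each effect still runs asynchronously on the execution context; only the order between effects is fixed, so throughput drops to one effect at a time.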

What you're observing is not a violation of Akka's "message ordering per sender–receiver pair" guarantee; it is the nondeterministic nature of Futures, as @almendar mentions in his or her answer.
The Test messages are being sent to MyActor sequentially, from Test("1") to Test("99") (Stream.range excludes its upper bound), and MyActor is processing each message in its receive block in that same order. However, you're logging each message inside a Future, and the order in which those Futures are completed is nondeterministic. This is why you see the "inconsistent printing of the series."
To get the desired behavior of sequential logging of the messages, don't wrap the logging in a Future. If you must use a Future inside an actor's receive block, then @almendar's approach of using a var inside the actor is safe.

You can use context.become together with stash: while a Future is running, stash incoming messages; when the Future completes, unstash them and process the next message.
The documentation has more on how to use Stash, with an example: http://doc.akka.io/api/akka/current/akka/actor/Stash.html
Remember that, because of network characteristics, message ordering is guaranteed only if the messages are sent from the same machine.
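The become-plus-stash approach could look roughly like the sketch below, written against classic Akka actors. SequentialActor, EffectDone, and the log callback are hypothetical names introduced for illustration; only Actor, Stash, Status.Failure, and pipeTo come from Akka. The Future reports back to the actor with pipeTo(self), and the actor does not accept the next Test until that report arrives.

```scala
import akka.actor.{Actor, Stash, Status}
import akka.pattern.pipe
import scala.concurrent.Future

final case class Test(id: String)
case object EffectDone // hypothetical internal message, not part of Akka

// `log` is a hypothetical callback standing in for logger.debug.
class SequentialActor(log: String => Unit) extends Actor with Stash {
  import context.dispatcher

  def receive: Receive = idle

  // No Future in flight: start one and switch to `busy`.
  private def idle: Receive = {
    case Test(id) =>
      Future { log(s"id [$id]"); EffectDone }.pipeTo(self)
      context.become(busy)
  }

  // A Future is in flight: hold back everything until it reports completion.
  private def busy: Receive = {
    case EffectDone | Status.Failure(_) =>
      unstashAll()
      context.become(idle)
    case _ =>
      stash()
  }
}
```

unstashAll() puts the stashed messages back at the front of the mailbox in the order they were stashed, so the original arrival order is kept.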

Another way would be for the actor to send itself a message in Future.onComplete, assuming that there are no restrictions on the order of processing:
// in receive (requires import scala.util.{Failure, Success})
val future = Future { logger.debug(s"id [$id]") }
future.onComplete {
case Success(value) => self ! TestResult(s"Got the callback, meaning = $value")
case Failure(e) => self ! TestError(e)
}

Related

Akka Actor - pipeTo - Is it inadvisable to use values from the received message in the piped reply?

I'm handling a Future in an Actor with the pipeTo pattern which seems to be working ok.
In the example I've mocked up below UserProxyActor asks UserActivityActor with a Get(userId) message.
I want to include the parameters of the Get message in the response so that the receiving actor has everything it needs to process the message. For example, insert the activities into a DB with the related userId.
Is the userId available in the map call or does it get "closed over"?
Is this going to work because the ask pattern will block?
Is there some much nicer way to do this that I haven't come across?
class UserActivityActor(repository: UserActivityRepository) extends Actor {
import akka.pattern.pipe
import UserActivityActor._
implicit val ec: ExecutionContext = context.dispatcher
def receive = {
case Get(userId) =>
// user's historical activities are retrieved
// via the separate repository
repository.queryHistoricalActivities(userId)
.map(a => UserActivityReceived(userId, a)) // wrap the completed future value in a message
.recover{case ex => RepoFailure(ex.getMessage)} // wrap failure in a local message type
.pipeTo(sender())
}
}
class UserProxyActor(userActivities: ActorRef) extends Actor {
import UserProxyActor._
import akka.pattern.{ ask, pipe }
implicit val ec: ExecutionContext = context.dispatcher
implicit val timeout = Timeout(5 seconds)
def receive = {
case GetUserActivities(user) =>
(userActivities ? UserActivityActor.Get(user))
.pipeTo(sender())
}
}
Is the userId available in the map call or does it get "closed over"?
Yes. As long as Get is immutable, userId will be available inside the map call.
Is this going to work because the ask pattern will block?
The actor receives a Get message, creates a Future and then processes the next message. There is no blocking at all. The Future returned by ask is not completed until the reply arrives or the timeout expires.
Is there some much nicer way to do this that I haven't come across?
It looks fine, provided repository.queryHistoricalActivities(userId) is not a blocking call.
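To illustrate the first point, here is a small sketch that runs without Akka. Get and UserActivityReceived mirror the question's messages, while queryHistoricalActivities and handle are hypothetical stand-ins for the repository call and the actor's case branch. The value bound by the pattern is an immutable local, so the lambda passed to map captures it safely. Note that sender() is different: it is a method that reads the actor's current state, which is why the question's code is right to evaluate it eagerly in pipeTo(sender()) rather than inside a callback.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ClosureCaptureDemo {
  // Hypothetical message types mirroring the ones in the question.
  final case class Get(userId: String)
  final case class UserActivityReceived(userId: String, activities: List[String])

  // Hypothetical stand-in for repository.queryHistoricalActivities.
  def queryHistoricalActivities(userId: String)(
      implicit ec: ExecutionContext): Future[List[String]] =
    Future(List(s"login:$userId", s"logout:$userId"))

  // Mirrors the `case Get(userId) => ...map(...)` branch of the actor.
  def handle(msg: Get)(implicit ec: ExecutionContext): Future[UserActivityReceived] =
    msg match {
      case Get(userId) =>
        // userId is an immutable local value bound by the pattern; the lambda
        // captures that value, so later messages cannot change it.
        queryHistoricalActivities(userId).map(a => UserActivityReceived(userId, a))
    }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val replies = Future.sequence(List(handle(Get("alice")), handle(Get("bob"))))
    Await.result(replies, 5.seconds).foreach(println)
  }
}
```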
I don't think there's anything wrong with what you did.
The only thing I want to mention is that I personally prefer not to use ask when communicating between actors (and keep in mind, this is largely a personal preference), so I'd do something like this:
class UserActivityActor(repository: UserActivityRepository) extends Actor {
import akka.pattern.pipe
import UserActivityActor._
implicit val ec: ExecutionContext = context.dispatcher
def receive = {
case Get(userId) =>
// user's historical activities are retrieved
// via the separate repository
repository.queryHistoricalActivities(userId)
.map(a => UserActivityReceived(userId, a)) // wrap the completed future value in a message
.recover{case ex => RepoFailure(ex.getMessage)} // wrap failure in a local message type
.pipeTo(sender())
}
}
class UserProxyActor(userActivities: ActorRef) extends Actor {
import UserProxyActor._
def receive = {
case GetUserActivities(user) =>
userActivities.forward(UserActivityActor.Get(user))
}
}
This does remove the timeout behavior, though: in your original implementation the requesting actor waits at most 5 seconds before receiving a failure, whereas in mine it may wait indefinitely.

Akka: Keep unmatched messages in the mailbox

I'm familiar with Erlang/Elixir, in which messages that are in a process' mailbox remain in the mailbox until they are matched:
The patterns Pattern are sequentially matched against the first message in time order in the mailbox, then the second, and so on. If a match succeeds and the optional guard sequence GuardSeq is true, the corresponding Body is evaluated. The matching message is consumed, that is, removed from the mailbox, while any other messages in the mailbox remain unchanged.
(http://erlang.org/doc/reference_manual/expressions.html#receive)
However, with Akka Actors unmatched messages are removed from the mailbox.
This is annoying when implementing for instance forks in a dining philosophers simulation:
import akka.actor._
object Fork {
def props(id: Int): Props = Props(new Fork(id))
final case class Take(philosopher: Int)
final case class Release(philosopher: Int)
final case class TookFork(fork: Int)
final case class ReleasedFork(fork: Int)
}
class Fork(val id: Int) extends Actor {
import Fork._
object Status extends Enumeration {
val FREE, TAKEN = Value
}
private var _status: Status.Value = Status.FREE
private var _held_by: Int = -1
def receive = {
case Take(philosopher) if _status == Status.FREE => {
println(s"\tPhilosopher $philosopher takes fork $id.")
take(philosopher)
sender() ! TookFork(id)
context.become(taken, false)
}
case Release(philosopher) if _status == Status.TAKEN && _held_by == philosopher => {
println(s"\tPhilosopher $philosopher puts down fork $id.")
release()
sender() ! ReleasedFork(id)
context.unbecome()
}
}
def take(philosopher: Int) = {
_status = Status.TAKEN
_held_by = philosopher
}
def release() = {
_status = Status.FREE
_held_by = -1
}
}
When a Take(<philosopher>) message is sent to the fork,
we want the message to stay in the mailbox until the fork is released and the message is matched. However, in Akka Take(<philosopher>) messages are dropped from the mailbox if the fork is currently taken, since there is no match.
Currently, I solve this problem by overriding the unhandled method of the Fork actor and forwarding the message to the fork again:
override def unhandled(message: Any): Unit = {
self forward message
}
I believe this is terribly inefficient as it keeps sending the message to the fork until it is matched. Is there another way to solve this problem which does not involve continuously forwarding unmatched messages?
I believe that worst case I will have to implement a custom mailbox type that mimics Erlang mailboxes, as described here: http://ndpar.blogspot.com/2010/11/erlang-explained-selective-receive.html
EDIT: I modified my implementation based on Tim's advice and I use the Stash trait as suggested. My Fork actor now looks as follows:
class Fork(val id: Int) extends Actor with Stash {
import Fork._
// Fork is in "taken" state
def taken(philosopher: Int): Receive = {
case Release(`philosopher`) => {
println(s"\tPhilosopher $philosopher puts down fork $id.")
sender() ! ReleasedFork(id)
unstashAll()
context.unbecome()
}
case Take(_) => stash()
}
// Fork is in "free" state
def receive = {
case Take(philosopher) => {
println(s"\tPhilosopher $philosopher takes fork $id.")
sender() ! TookFork(id)
context.become(taken(philosopher), false)
}
}
}
However, I don't want to write the stash() and unstashAll() calls everywhere. Instead, I want to implement a custom mailbox type that does this for me, i.e. stashes unhandled messages and unstashes them when a message has been processed by the actor. Is this possible?
I tried to implement a custom mailbox which does this, however, I can't determine whether a message did or did not match the receive block.
The problem with forward is that it may re-order the messages if there are multiple messages waiting to be processed, which is probably not a good idea.
The best solution here would seem to be to implement your own queue inside the actor that gives the semantics that you want. If you can't process a message immediately then put it on the queue, and when the next message arrives you can process as much of the queue as possible. This would also allow you to detect when senders give inconsistent messages (e.g. Release on a fork that they did not Take) which otherwise will just build up in the incoming mailbox.
I would not worry about efficiency until you can prove it is a problem, but it will be more efficient if each receive function only processes the messages that are relevant in that particular state.
I would avoid using var in the actor by putting the state in the parameters to the receive methods. And the _status value is implicit in the choice of receive handler and doesn't need to be stored as a value. The taken receive handler only needs to process Release messages and the main receive handler only needs to process Take messages.
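The "own queue inside the actor" idea can be sketched as a pure state transition, independent of Akka. ForkQueueSketch, State, and step are hypothetical names; in a real actor, step would be called from the receive handler, the resulting State would be passed to context.become, and the granted philosopher would be sent a TookFork reply.

```scala
import scala.collection.immutable.Queue

object ForkQueueSketch {
  sealed trait Msg
  final case class Take(philosopher: Int) extends Msg
  final case class Release(philosopher: Int) extends Msg

  // heldBy = None means the fork is free. `waiting` holds philosophers whose
  // Take could not be served yet, in arrival order.
  final case class State(
      heldBy: Option[Int] = None,
      waiting: Queue[Int] = Queue.empty[Int])

  // Handles one message. Returns the new state and, if the fork changed
  // hands in this step, the philosopher who now holds it.
  def step(s: State, m: Msg): (State, Option[Int]) = m match {
    case Take(p) if s.heldBy.isEmpty =>
      (s.copy(heldBy = Some(p)), Some(p))
    case Take(p) =>
      (s.copy(waiting = s.waiting.enqueue(p)), None)
    case Release(p) if s.heldBy.contains(p) =>
      s.waiting.dequeueOption match {
        case Some((next, rest)) => (State(Some(next), rest), Some(next))
        case None               => (State(), None)
      }
    case Release(_) =>
      // Inconsistent Release (the sender does not hold the fork): ignored
      // here, but this is the place to log or reject it.
      (s, None)
  }

  def main(args: Array[String]): Unit = {
    val msgs: List[Msg] = List(Take(1), Take(2), Release(2), Release(1))
    // Fold the messages through `step`, collecting who was granted the fork.
    val (finalState, grants) =
      msgs.foldLeft((State(), Vector.empty[Option[Int]])) {
        case ((st, acc), m) =>
          val (next, granted) = step(st, m)
          (next, acc :+ granted)
      }
    println(grants)
    println(finalState)
  }
}
```

Because waiting requests are served in arrival order, this avoids the re-ordering problem that re-forwarding unmatched messages can cause.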
There exists a sample project in the Akka repository that houses multiple implementations of the "Dining Philosophers" problem. A key difference between your approach and theirs is that they implement both the utensils and the philosophers as actors, whereas you define only the utensil as an actor. The sample implementations show how to model the problem without dealing with unhandled messages or using a custom mailbox.

Ensure message order in test when mixing futures with actor messages

I'm testing an actor that uses an asynchronous future-based API. The actor uses the pipe pattern to send a message to itself when a future completes:
import akka.pattern.pipe
// ...
// somewhere in the actor's receive method
futureBasedApi.doSomething().pipeTo(self)
In my test I mock the API so I control future completion via promises. However, this is interleaved with other messages sent directly to the actor:
myActor ! Message("A")
promiseFromApiCall.success(Message("B"))
myActor ! Message("C")
Now I'm wondering how I can guarantee that the actor receives and
processes message B between message A and C in my test because message B is actually sent in another thread, so I can't control the order
in which the actor's mailbox receives the messages.
I thought about several possible solutions:
Sleep after each message for a few milliseconds to make another order very unlikely.
Wait for the actor to acknowledge each message, although acknowledgement is only required for testing.
Send message B directly to the actor to simulate completion of the future, and write a separate test that ensures that the pipe pattern is properly used (the test above would not fail if the actor did not pipe the result message to itself).
I don't really like any of these options, but I tend towards the last one. Is there a better way to enforce a certain message order in the tests?
Clarification: The question is not how to deal with the fact that messages might be received in random order in production. Controlling the order in the test is essential to make sure that the actor can actually deal with different message orders.
One idea is to define a flag in your actor that indicates whether the actor has received message B. When the actor receives message C, the actor can stash that message C if the flag is false, then unstash it once the actor receives message B. For example:
class MyActor extends Actor with Stash {
def receiveBlock(seenMsgB: Boolean, seenMsgC: Boolean): Receive = {
case MakeApiCall =>
callExternalApi().mapTo[MessageB].pipeTo(self)
case m: MessageB if seenMsgC => // assume msg C has been stashed
unstashAll()
// ...do something with msg B
context.become(receiveBlock(true, seenMsgC)) // true, true
case m: MessageB if !seenMsgC =>
// ...do something with message B
context.become(receiveBlock(true, seenMsgC)) // true, false
case m: MessageC if seenMsgB =>
// ...do something with message C
context.become(receiveBlock(seenMsgB, true)) // true, true
case m: MessageC if !seenMsgB =>
stash()
context.become(receiveBlock(seenMsgB, true)) // false, true
case ...
}
def receive = receiveBlock(false, false)
}
After reading a lot more about akka, I finally found a better solution: Replacing the actor mailbox with one I can observe in the tests. This way I can wait until the actor receives a new message after I complete the promise. Only then the next message is sent. The code for this TestingMailbox is given at the end of the post.
Update: In Akka Typed this can be achieved very elegantly with a BehaviorInterceptor. Just wrap the Behavior under test with a custom interceptor that forwards all messages and signals but lets you observe them.
The mailbox solution for untyped Akka is given below.
The actor can be configured like this:
actorUnderTest = system.actorOf(Props[MyActor].withMailbox("testing-mailbox"))
I have to make sure the "testing-mailbox" is known by the actor system by providing a configuration:
class MyTest extends TestKit(ActorSystem("some name",
ConfigFactory.parseString("""{
testing-mailbox = {
mailbox-type = "my.package.TestingMailbox"
}
}""")))
with BeforeAndAfterAll // ... and so on
With this being set up, I can change my test like this:
myActor ! Message("A")
val nextMessage = TestingMailbox.nextMessage(actorUnderTest)
promiseFromApiCall.success(Message("B"))
Await.ready(nextMessage, 3.seconds)
myActor ! Message("C")
With a little helper method, I can even write it like this:
myActor ! Message("A")
receiveMessageAfter { promiseFromApiCall.success(Message("B")) }
myActor ! Message("C")
And this is my custom mailbox:
import akka.actor.{ActorRef, ActorSystem}
import akka.dispatch._
import com.typesafe.config.Config
import scala.concurrent.{Future, Promise}
object TestingMailbox {
val promisesByReceiver =
scala.collection.concurrent.TrieMap[ActorRef, Promise[Any]]()
class MessageQueue extends UnboundedMailbox.MessageQueue {
override def enqueue(receiver: ActorRef, handle: Envelope): Unit = {
super.enqueue(receiver, handle)
promisesByReceiver.remove(receiver).foreach(_.success(handle.message))
}
}
def nextMessage(receiver: ActorRef): Future[Any] =
promisesByReceiver.getOrElseUpdate(receiver, Promise[Any]).future
}
class TestingMailbox extends MailboxType
with ProducesMessageQueue[TestingMailbox.MessageQueue] {
import TestingMailbox._
def this(settings: ActorSystem.Settings, config: Config) = this()
final override def create(owner: Option[ActorRef],
system: Option[ActorSystem]) =
new MessageQueue()
}
If message order is that important, use ask (?), which returns a Future, and chain the calls even if you don't expect any response from the actor.

Puzzled by the spawned actor from a Spray route

I am doing some HTTP request processing using Spray. For each request I spin up an actor and send it the payload for processing; once the actor is done working on the payload, I call context.stop(self) to wind it down. The idea is to prevent oversaturation of actors on the physical machine.
This is how I have things set up.
In httphandler.scala, I have the route set up as follows:
path("users"){
get{
requestContext => {
val userWorker = actorRefFactory.actorOf(Props(new UserWorker(userservice,requestContext)))
userWorker ! getusers //get user is a case object
}
}
} ~ path("users"){
post{
entity(as[UserInfo]){
requestContext => {
userInfo => {
val userWorker = actorRefFactory.actorOf(Props(new UserWorker(userservice,requestContext)))
userWorker ! userInfo
}
}
}
}
}
My UserWorker actor is defined as follows:
trait RouteServiceActor extends Actor{
implicit val system = context.system
import system.dispatcher
def processRequest[responseModel:ToResponseMarshaller](requestContext:RequestContext)(processFunc: => responseModel):Unit = {
Future{
processFunc
} onComplete {
case Success(result) => {
requestContext.complete(result)
}
case Failure(error) => requestContext.complete(error)
}
}
}
class UserWorker(userservice: UserServiceComponent#UserService,requestContext:RequestContext) extends RouteServiceActor{
def receive = {
case getusers => {
processRequest(requestContext){
userservice.getAllUsers
}
context.stop(self)
}
case userInfo:UserInfo => {
processRequest(requestContext){
userservice.createUser(userInfo)
}
context.stop(self)
}
}
}
My first question is, am I handling the request in a true asynchronous fashion? What are some of the pitfalls with my code?
My second question is how does the requestContext.complete work? Since the original request processing thread is no longer there, how does the requestContext send the result of the computation back to the client.
My third question is that since I am calling context.stop(self) after each of my partial methods, is it possible that I terminate the worker while it is in the midst of processing a different message.
What I mean is that while the Actor receives a message to process getusers, the same actor is done processing UserInfo and terminates the Actor before it can get to the "getusers" message. I am creating new actors upon every request, but is it possible that under the covers, the actorRefFactory provides a reference to a previously created actor, instead of a new one?
I am pretty confused by all the abstractions and it would be great if somebody could break it down for me.
Thanks
1) Is the request handled asynchronously? Yes, it is. However, you don't gain much with your per-request actors if you immediately delegate the actual processing to a future. In this simple case a cleaner way would be to write your route just as
path("users") {
get {
complete(getUsers())
}
}
def getUsers(): Future[Users] = // ... invoke userservice
Per-request actors make more sense if you also want to make route-processing logic run in parallel or if handling the request has more complex requirements, e.g. if you need to query things from multiple services in parallel or need to keep per-request state while some background services are processing the request. See https://github.com/NET-A-PORTER/spray-actor-per-request for some information about this general topic.
2) How does requestContext.complete work? Behind the scenes it sends the HTTP response to the spray-can HTTP connection actor as a normal actor message "tell". So, basically the RequestContext just wraps an ActorRef to the HTTP connection which is safe to use concurrently.
3) Is it possible that "the worker" is terminated by context.stop(self)? I think there's some confusion about how things are scheduled behind the scenes. Of course, you are terminating the actor with context.stop, but that stops only the actor, not any threads (threads are managed completely independently of actor instances in Akka). As you didn't really make use of an actor's advantages, i.e. encapsulating and synchronizing access to mutable state, everything should work (but, as said in 1, it is needlessly complex for this use case). The Akka documentation has lots of information about how actors, futures, dispatchers, and ExecutionContexts work together.
In addition to jrudolph's answer: your Spray routing structure shouldn't even compile, because in your post branch you don't explicitly specify a requestContext. The structure can be simplified a bit to this:
def spawnWorker(implicit ctx: RequestContext): ActorRef = {
actorRefFactory actorOf Props(new UserWorker(userservice, ctx))
}
lazy val route: Route = {
path("users") { implicit ctx =>
get {
spawnWorker ! getUsers
} ~
(post & entity(as[UserInfo])) {
info => spawnWorker ! info
}
}
}
The line info => spawnWorker ! info can also be simplified to spawnWorker ! _.
There is also an important point concerning an explicit ctx declaration and the complete directive: if you explicitly declare ctx in your route, you can't use the complete directive; you have to write ctx.complete(...) explicitly.

How to check scala actors status from main process?

I am relatively new to Scala actors. I have a huge map that is grouped into smaller blocks and processed by actors. The number of actors created varies with the map size. The actors work well and the process completes. But how do I check the status of the generated actors? In Java I am familiar with thread-pool executor services. How is this done in Scala?
There are multiple ways to do what you want:
1. Have the worker actor send a message back to the sender to inform it that an operation has completed. Each actor has a reference to the sender actor (the one that sent the message), which you can use to send back a completion message. The sender can then handle that message.
2. Instead of sending a message via a tell (e.g. actor ! msg), use ask, which returns a Future. You can set up a callback on the Future that runs upon completion.
3. If the worker actors are launched for a one-time operation, have each worker stop itself once the operation finishes. The parent actor (the one that created the worker) can monitor the worker via the DeathWatch mechanism, which informs the parent when the child actor is terminated. In this approach, termination means the operation has been completed. However, you will need to keep track of how many terminations the parent receives in order to determine when all the worker actors have finished.
Which approach to use depends on your use case and the nature of the operations. The most common and flexible approach is #1. Example (not tested):
case class PerformWork(i: Int)
case object WorkDone
class ParentActor(size: Int) extends Actor {
for (i <- 1 to size) {
val actor = context.actorOf(Props[Worker], s"worker$i")
actor ! PerformWork(i)
}
var result = 0
def receive = {
case WorkDone => {
result += 1
if (result == size) {
// work is done
}
}
}
}
class Worker extends Actor {
def receive = {
case m: PerformWork => {
// do some work
// ...
sender ! WorkDone
}
}
}
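The DeathWatch approach (the third option in the list) could be sketched as follows, again for classic Akka actors. OneShotWorker, WatchingParent, and the onAllDone callback are hypothetical names; PerformWork is repeated from the example above only to keep the snippet self-contained. The Akka calls used are context.watch, context.stop, and the Terminated message.

```scala
import akka.actor.{Actor, Props, Terminated}

final case class PerformWork(i: Int)

// Each worker handles exactly one PerformWork and then stops itself.
class OneShotWorker extends Actor {
  def receive: Receive = {
    case PerformWork(_) =>
      // do some work
      context.stop(self)
  }
}

// `onAllDone` is a hypothetical callback run once every worker has terminated.
class WatchingParent(size: Int, onAllDone: () => Unit) extends Actor {
  // Number of workers that have not terminated yet.
  private var remaining = size

  override def preStart(): Unit =
    for (i <- 1 to size) {
      val worker = context.actorOf(Props(new OneShotWorker), s"worker$i")
      context.watch(worker) // DeathWatch: delivers Terminated when it stops
      worker ! PerformWork(i)
    }

  def receive: Receive = {
    case Terminated(_) =>
      remaining -= 1
      if (remaining == 0) onAllDone()
  }
}
```

One limitation of this sketch is that a worker stopped because of a failure also produces Terminated, so termination alone does not prove the work succeeded; combining it with an explicit WorkDone message is one way to tell the two cases apart.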