Multiple Future calls in an Actor's receive method - scala

I'm trying to make two external calls (to a Redis database) inside an Actor's receive method. Both calls return a Future and I need the result of the first Future inside the second.
I'm wrapping both calls inside a Redis transaction to avoid anyone else from modifying the value in the database while I'm reading it.
The internal state of the actor is updated based on the value of the second Future.
Here is what my current code looks like which I is incorrect because I'm updating the internal state of the actor inside a Future.onComplete callback.
I cannot use the PipeTo pattern because I need both both Future have to be in a transaction.
If I use Await for the first Future then my receive method will block.
Any idea how to fix this ?
My second question is related to how I'm using Futures. Is this usage of Futures below correct? Is there a better way of dealing with multiple Futures in general? Imagine if there were 3 or 4 Future each depending on the previous one.
import akka.actor.{Props, ActorLogging, Actor}
import akka.util.ByteString
import redis.RedisClient
import scala.concurrent.Future
import scala.util.{Failure, Success}
object GetSubscriptionsDemo extends App {
val akkaSystem = akka.actor.ActorSystem("redis-demo")
val actor = akkaSystem.actorOf(Props(new SimpleRedisActor("localhost", "dummyzset")), name = "simpleactor")
actor ! UpdateState
}
case object UpdateState
class SimpleRedisActor(ip: String, key: String) extends Actor with ActorLogging {
//mutable state that is updated on a periodic basis
var mutableState: Set[String] = Set.empty
//required by Future
implicit val ctx = context dispatcher
var rClient = RedisClient(ip)(context.system)
def receive = {
case UpdateState => {
log.info("Start of UpdateState ...")
val tran = rClient.transaction()
val zf: Future[Long] = tran.zcard(key) //FIRST Future
zf.onComplete {
case Success(z) => {
//SECOND Future, depends on result of FIRST Future
val rf: Future[Seq[ByteString]] = tran.zrange(key, z - 1, z)
rf.onComplete {
case Success(x) => {
//convert ByteString to UTF8 String
val v = x.map(_.utf8String)
log.info(s"Updating state with $v ")
//update actor's internal state inside callback for a Future
//IS THIS CORRECT ?
mutableState ++ v
}
case Failure(e) => {
log.warning("ZRANGE future failed ...", e)
}
}
}
case Failure(f) => log.warning("ZCARD future failed ...", f)
}
tran.exec()
}
}
}
The compiles but when I run it gets struck.
2014-08-07 INFO [redis-demo-akka.actor.default-dispatcher-3] a.e.s.Slf4jLogger - Slf4jLogger started
2014-08-07 04:38:35.106UTC INFO [redis-demo-akka.actor.default-dispatcher-3] e.c.s.e.a.g.SimpleRedisActor - Start of UpdateState ...
2014-08-07 04:38:35.134UTC INFO [redis-demo-akka.actor.default-dispatcher-8] r.a.RedisClientActor - Connect to localhost/127.0.0.1:6379
2014-08-07 04:38:35.172UTC INFO [redis-demo-akka.actor.default-dispatcher-4] r.a.RedisClientActor - Connected to localhost/127.0.0.1:6379
UPDATE 1
In order to use pipeTo pattern I'll need access to the tran and the FIRST Future (zf) in the actor where I'm piping the Future to because the SECOND Future depends on the value (z) of FIRST.
//SECOND Future, depends on result of FIRST Future
val rf: Future[Seq[ByteString]] = tran.zrange(key, z - 1, z)

Without knowing too much about the redis client you are using, I can offer an alternate solution that should be cleaner and won't have issues with closing over mutable state. The idea is to use a master/worker kind of situation, where the master (the SimpleRedisActor) receives the request to do the work and then delegates off to a worker that performs the work and responds with the state to update. That solution would look something like this:
object SimpleRedisActor{
case object UpdateState
def props(ip:String, key:String) = Props(classOf[SimpleRedisActor], ip, key)
}
class SimpleRedisActor(ip: String, key: String) extends Actor with ActorLogging {
import SimpleRedisActor._
import SimpleRedisWorker._
//mutable state that is updated on a periodic basis
var mutableState: Set[String] = Set.empty
val rClient = RedisClient(ip)(context.system)
def receive = {
case UpdateState =>
log.info("Start of UpdateState ...")
val worker = context.actorOf(SimpleRedisWorker.props)
worker ! DoWork(rClient, key)
case WorkResult(result) =>
mutableState ++ result
case FailedWorkResult(ex) =>
log.error("Worker got failed work result", ex)
}
}
object SimpleRedisWorker{
case class DoWork(client:RedisClient, key:String)
case class WorkResult(result:Seq[String])
case class FailedWorkResult(ex:Throwable)
def props = Props[SimpleRedisWorker]
}
class SimpleRedisWorker extends Actor{
import SimpleRedisWorker._
import akka.pattern.pipe
import context._
def receive = {
case DoWork(client, key) =>
val trans = client.transaction()
trans.zcard(key) pipeTo self
become(waitingForZCard(sender, trans, key) orElse failureHandler(sender, trans))
}
def waitingForZCard(orig:ActorRef, trans:RedisTransaction, key:String):Receive = {
case l:Long =>
trans.zrange(key, l -1, l) pipeTo self
become(waitingForZRange(orig, trans) orElse failureHandler(orig, trans))
}
def waitingForZRange(orig:ActorRef, trans:RedisTransaction):Receive = {
case s:Seq[ByteString] =>
orig ! WorkResult(s.map(_.utf8String))
finishAndStop(trans)
}
def failureHandler(orig:ActorRef, trans:RedisTransaction):Receive = {
case Status.Failure(ex) =>
orig ! FailedWorkResult(ex)
finishAndStop(trans)
}
def finishAndStop(trans:RedisTransaction) {
trans.exec()
context stop self
}
}
The worker starts the transaction and then makes calls into redis and ultimately completes the transaction before stopping itself. When it calls redis, it gets the future and pipes back to itself for the continuation of the processing, changing the receive method between as a mechanism of showing progressing through its states. In a model like this (which I suppose is somewhat similar to the error kernal pattern), the master owns and protects the state, delegating the "risky" work off to a child who can figure out what the change for the state should be, but the changing is still owned by the master.
Now again, I have no idea about the capabilities of the redis client you are using and if it is safe enough to even do this kind of stuff, but that's not really the point. The point was to show a safer structure for doing something like this that involves futures and state that needs to be changed safely.

Using the callback to mutate internal state is not a good idea, excerpt from the akka docs:
When using future callbacks, such as onComplete, onSuccess, and onFailure, inside actors you need to carefully avoid closing over the containing actor’s reference, i.e. do not call methods or access mutable state on the enclosing actor from within the callback.
Why do you worry about pipeTo and transactions?
Not sure how redis transactions work, but I would guess that the transaction does not encompass the onComplete callback on the second future anyways.
I would put the state into a separate actor which you pipe the future too. This way you have a separate mailbox, and the ordering there will be the same as the ordering of the messages that came in to modify the state. Also if any read requests come in, they will also be put in the correct order.
Edit to respond to edited question: Ok, so you don't want to pipe the first future, that makes sense, and should be no problem as the first callback is harmless. The callback of the second future is the problem, as it manipulates the state. But this future can be pipe without the need for access to the transaction.
So basically my suggestion is:
val firstFuture = tran.zcard
firstFuture.onComplete {
val secondFuture = tran.zrange
secondFuture pipeTo stateActor
}
With stateActor containing the mutable state.

Related

How to read only Successful values from a Seq of Futures

I am learning akka/scala and am trying to read only those Futures that succeeded from a Seq[Future[Int]] but cant get anything to work.
I simulated an array of 10 Future[Int] some of which fail depending on the value FailThreshold takes (all fail for 10 and none fail for 0).
I then try to read them into an ArrayBuffer (could not find a way to return immutable structure with the values).
Also, there isn't a filter on Success/Failure so had to run an onComplete on each future and update buffer as a side-effect.
Even when the FailThreshold=0 and the Seq has all Future set to Success, the array buffer is sometimes empty and different runs return array of different sizes.
I tried a few other suggestions from the web like using Future.sequence on the list but this throws exception if any of future variables fail.
import akka.actor._
import akka.pattern.ask
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Timeout, Failure, Success}
import concurrent.ExecutionContext.Implicits.global
case object AskNameMessage
implicit val timeout = Timeout(5, SECONDS)
val FailThreshold = 0
class HeyActor(num: Int) extends Actor {
def receive = {
case AskNameMessage => if (num<FailThreshold) {Thread.sleep(1000);sender ! num} else sender ! num
}
}
class FLPActor extends Actor {
def receive = {
case t: IndexedSeq[Future[Int]] => {
println(t)
val b = scala.collection.mutable.ArrayBuffer.empty[Int]
t.foldLeft( b ){ case (bf,ft) =>
ft.onComplete { case Success(v) => bf += ft.value.get.get }
bf
}
println(b)
}
}
}
val system = ActorSystem("AskTest")
val flm = (0 to 10).map( (n) => system.actorOf(Props(new HeyActor(n)), name="futureListMake"+(n)) )
val flp = system.actorOf(Props(new FLPActor), name="futureListProcessor")
// val delay = akka.pattern.after(500 millis, using=system.scheduler)(Future.failed( throw new IllegalArgumentException("DONE!") ))
val delay = akka.pattern.after(500 millis, using=system.scheduler)(Future.successful(0))
val seqOfFtrs = (0 to 10).map( (n) => Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )
flp ! seqOfFtrs
The receive in FLPActor mostly gets
Vector(Future(Success(0)), Future(Success(1)), Future(Success(2)), Future(Success(3)), Future(Success(4)), Future(Success(5)), Future(Success(6)), Future(Success(7)), Future(Success(8)), Future(Success(9)), Future(Success(10)))
but the array buffer b has varying number of values and empty at times.
Can someone please point me to gaps here,
why would the array buffer have varying sizes even when all Future have resolved to Success,
what is the correct pattern to use when we want to ask different actors with TimeOut and use only those asks that have successfully returned for further processing.
Instead of directly sending the IndexedSeq[Future[Int]], you should transform to Future[IndexedSeq[Int]] and then pipe it to the next actor. You don't send the Futures directly to an actor. You have to pipe it.
HeyActor can stay unchanged.
After
val seqOfFtrs = (0 to 10).map( (n) => Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )
do a recover, and use Future.sequence to turn it into one Future:
val oneFut = Future.sequence(seqOfFtrs.map(f=>f.map(Some(_)).recover{ case (ex: Throwable) => None})).map(_.flatten)
If you don't understand the business with Some, None, and flatten, then make sure you understand the Option type. One way to remove values from a sequence is to map values in the sequence to Option (either Some or None) and then to flatten the sequence. The None values are removed and the Some values are unwrapped.
After you have transformed your data into a single Future, pipe it over to FLPActor:
oneFut pipeTo flp
FLPActor should be rewritten with the following receive function:
def receive = {
case printme: IndexedSeq[Int] => println(printme)
}
In Akka, modifying some state in the main thread of your actor from a Future or the onComplete of a Future is a big no-no. In the worst case, it results in race conditions. Remember that each Future runs on its own thread, so running a Future inside an actor means you have concurrent work being done in different threads. Having the Future directly modify some state in your actor while the actor is also processing some state is a recipe for disaster. In Akka, you process all changes to state directly in the primary thread of execution of the main actor. If you have some work done in a Future and need to access that work from the main thread of an actor, you pipe it to that actor. The pipeTo pattern is functional, correct, and safe for accessing the finished computation of a Future.
To answer your question about why FLPActor is not printing out the IndexedSeq correctly: you are printing out the ArrayBuffer before your Futures have been completed. onComplete isn't the right idiom to use in this case, and you should avoid it in general as it isn't good functional style.
Don't forget the import akka.pattern.pipe for the pipeTo syntax.

integrating with a non thread safe service, while maintaining back pressure, in a concurrent environment using Futures, akka streams and akka actors

I am using a third party library to provide parsing services (user agent parsing in my case) which is not a thread safe library and has to operate on a single threaded basis. I would like to write a thread safe API that can be called by multiple threads to interact with it via Futures API as the library might introduce some potential blocking (IO). I would also like to provide back pressure when necessary and return a failed future when the parser doesn't catch up with the producers.
It could actually be a generic requirement/question, how to interact with any client/library which is not thread safe (user agents/geo locations parsers, db clients like redis, loggers collectors like fluentd), with back pressure in a concurrent environments.
I came up with the following formula:
encapsulate the parser within a dedicated Actor.
create an akka stream source queue that receives ParseReuqest that contains the user agent and a Promise to complete, and using the ask pattern via mapAsync to interact with the parser actor.
create another actor to encapsulate the source queue.
Is this the way to go? Is there any other way to achieve this, maybe simpler ? maybe using graph stage? can it be done without the ask pattern and less code involved?
the actor mentioned in number 3, is because I'm not sure if the source queue is thread safe or not ?? I wish it was simply stated in the docs, but it doesn't. there are multiple versions over the web, some stating it's not and some stating it is.
Is the source queue, once materialized, is thread safe to push elements from different threads?
(the code may not compile and is prone to potential failures, and is only intended for this question in place)
class UserAgentRepo(dbFilePath: String)(implicit actorRefFactory: ActorRefFactory) {
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
implicit val askTimeout = Timeout(5 seconds)
// API to parser - delegates the request to the back pressure actor
def parse(userAgent: String): Future[Option[UserAgentData]] = {
val p = Promise[Option[UserAgentData]]
parserBackPressureProvider ! UserAgentParseRequest(userAgent, p)
p.future
}
// Actor to provide back pressure that delegates requests to parser actor
private class ParserBackPressureProvider extends Actor {
private val parser = context.actorOf(Props[UserAgentParserActor])
val queue = Source.queue[UserAgentParseRequest](100, OverflowStrategy.dropNew)
.mapAsync(1)(request => (parser ? request.userAgent).mapTo[Option[UserAgentData]].map(_ -> request.p))
.to(Sink.foreach({
case (result, promise) => promise.success(result)
}))
.run()
override def receive: Receive = {
case request: UserAgentParseRequest => queue.offer(request).map {
case QueueOfferResult.Enqueued =>
case _ => request.p.failure(new RuntimeException("parser busy"))
}
}
}
// Actor parser
private class UserAgentParserActor extends Actor {
private val up = new UserAgentParser(dbFilePath, true, 50000)
override def receive: Receive = {
case userAgent: String =>
sender ! Try {
up.parseUa(userAgent)
}.toOption.map(UserAgentData(userAgent, _))
}
}
private case class UserAgentParseRequest(userAgent: String, p: Promise[Option[UserAgentData]])
private val parserBackPressureProvider = actorRefFactory.actorOf(Props[ParserBackPressureProvider])
}
Do you have to use actors for this?
It does not seem like you need all this complexity, scala/java hasd all the tools you need "out of the box":
class ParserFacade(parser: UserAgentParser, val capacity: Int = 100) {
private implicit val ec = ExecutionContext
.fromExecutor(
new ThreadPoolExecutor(
1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue(capacity)
)
)
def parse(ua: String): Future[Option[UserAgentData]] = try {
Future(Some(UserAgentData(ua, parser.parseUa(ua)))
.recover { _ => None }
} catch {
case _: RejectedExecutionException =>
Future.failed(new RuntimeException("parser is busy"))
}
}

Am I safely mutating in my akka actors or is this not thread safe?

I am a little confused if I am safely mutating my mutable maps/queue inside of my actor.
Can someone tell me if this code is thread-safe and correct?
class SomeActor extends Actor {
val userQ = mutable.Queue.empty[User]
val tranQ = mutable.Map.empty[Int, Transaction]
def receive = {
case Blank1 =>
if(userQ.isEmpty)
userQ ++= getNewUsers()
case Blank2 =>
val companyProfile = for {
company <- api.getCompany() // Future[Company]
location <- api.getLoc() // Future[Location]
} yield CompanyProfile(company, location)
companyProfile.map { cp =>
tranQ += cp.id -> cp.transaction // tranQ mutatated here
}
}
}
Since I am mutating the tranQ with futures, is this safe?
It is my understanding that each actor message is handled in a serial fashion, so although maybe frowned upon I can use mutable state like this.
I am just confused if using it inside of a future call like tranQ is safe or not.
No, your code is not safe.
While an actor processes one message at a time, you will lose this guarantee as soon as Futures are involed. At that point, the code inside the Future is executed on a (potentially) different thread and the next message might be handled by the actor.
A typical pattern to work around this issue is to send a message with the result of the Future using the pipeTo pattern, like so:
import akka.pattern.pipe
def receive: Receive {
case MyMsg =>
myFutureOperation()
.map(res => MyMsg2(res))
.pipeTo(self)
case MyMsg2(res) =>
// do mutation now
}
More information about using Futures can be found in akka's documentation: http://doc.akka.io/docs/akka/2.5/scala/futures.html

One-off object pooling with Actor provider

I have an object with heavy initialization cost and memory footprint. Initialization time is human-noticeable but creation frequency is low.
class HeavyClass {
heavyInit()
}
My solution is to create a Provider actor that would have a single object created ahead of time and provide it instantly on request. The provider would then go on with creating the next object.
class HeavyClassProvider extends Actor {
var hc: Option[HeavyClass] = Some(new HeavyClass())
override def receive = {
case "REQUEST" =>
sender ! { hc getOrElse new HeavyClass() }
self ! "RESPAWN"
hc = None
case "RESPAWN" if (hc == None) => hc = Some(new HeavyClass())
}
}
And a consumer:
abstract class HeavyClassConsumer extends Actor {
import context.dispatcher
import akka.pattern.ask
import scala.concurrent.duration._
import akka.util.Timeout
implicit val timeout = Timeout(5, SECONDS)
var provider: ActorRef
var hc: Option[HeavyClass] = None
override def receive = {
case "START" =>
((provider ask "REQUEST").mapTo[HeavyClass]
onSuccess { case h: HeavyClass => hc = Some(h) })
}
}
Is this a common pattern ? The code feels wacky, is there an obvious cleaner way of doing this ?
The problem with your solution is that when you call new HeavyClass() your actor will block until it will process that computation. Doing it in a Future or in another Actor avoids that. Here is one way to do it:
import akka.pattern.pipe
...
class HeavyClassProvider extends Actor {
// start off async computation during init:
var hc: Future[HeavyClass] = Future(new HeavyClass)
override def receive = {
case "REQUEST" =>
// send result to requester when it's complete or
// immediately if its already complete:
hc pipeTo sender
// start a new computation and send to self:
Future(new HeavyClass) pipeTo self
case result: HeavyClass => // new result is ready
hc = Future.successful(result) // update with newly computed result
case Status.Failure(f) => // computation failed
hc = Future.failed[HeavyClass](f)
// maybe request a recomputation again
}
}
(I didn't compile it)
One particularity about my first solution is that it does not restrict how many Futures are computed at the same time. If you receive multiple requests it will compute multiple futures which might not be desirable, although there is no race condition in this Actor. To restrict that simply introduce a Boolean flag in the Actor that tells you if you are computing something already. Also, all these vars can be replaced with become/unbecome behaviors.
Example of a single concurrent Future computation given multiple requests:
import akka.pattern.pipe
...
class HeavyClassProvider extends Actor {
// start off async computation during init:
var hc: Future[HeavyClass] = Future(new HeavyClass) pipeTo self
var computing: Boolean = true
override def receive = {
case "REQUEST" =>
// send result to requester when it's complete or
// immediately if its already complete:
hc pipeTo sender
// start a new computation and send to self:
if(! computing)
Future(new HeavyClass) pipeTo self
case result: HeavyClass => // new result is ready
hc = Future.successful(result) // update with newly computed result
computing = false
case Status.Failure(f) => // computation failed
hc = Future.failed[HeavyClass](f)
computing = false
// maybe request a recomputation again
}
}
EDIT:
After discussing requirements further in the comments here is yet another implementation that sends a new object to the sender/client on each request in non-blocking manner:
import akka.pattern.pipe
...
class HeavyClassProvider extends Actor {
override def receive = {
case "REQUEST" =>
Future(new HeavyClass) pipeTo sender
}
}
And then it can be simplified to:
object SomeFactoryObject {
def computeLongOp: Future[HeavyClass] = Future(new HeavyClass)
}
In this case no actors are needed. The purpose of using an Actor in these cases as a synchronization mechanism and non-blocking computation is for that Actor to cache results and provide async computation with more complex logic than just Future, otherwise Future is sufficient.
I suspect it's more often done with a synchronized factory of some sort, but the actor seems as good of a synchronization mechanism as any, especially if the calling code is already built on async patterns.
One potential problem with your current implementation is that it can't parallelize creation of multiple HeavyClass objects that are requested "all at once". It might be the case that this is a feature and that parallel-creation of several would bog down the system. If, on the other hand, it's "just slow", you might want to spin off the creation of the "on-demand" instances into its own thread/actor.

program hangs when using multiple futures with multiple remote actors

I start two remote actors on one host which just echo whatever is sent to them. I then create another actor which sends some number of messages (using !! ) to both actors and keep a List of Future objects holding the replies from these actors. Then I loop over this List fetching the result of each Future. The problem is that most of the time some futures never return, even thought the actor claims it has sent the reply. The problem happens randomly, sometimes it will get through the whole list, but most of the time it gets stuck at some point and hangs indefinitely.
Here is some code which produces the problem on my machine:
Sink.scala:
import scala.actors.Actor
import scala.actors.Actor._
import scala.actors.Exit
import scala.actors.remote.RemoteActor
import scala.actors.remote.RemoteActor._
object Sink {
def main(args: Array[String]): Unit = {
new RemoteSink("node03-0",43001).start()
new RemoteSink("node03-1",43001).start()
}
}
class RemoteSink(name: String, port: Int) extends Actor
{
def act() {
println(name+" starts")
trapExit=true
alive(port)
register(Symbol(name),self)
loop {
react {
case Exit(from,reason) =>{
exit()
}
case msg => reply{
println(name+" sending reply to: "+msg)
msg+" back at you from "+name
}
}
}
}
}
Source.scala:
import scala.actors.Actor
import scala.actors.Actor._
import scala.actors.remote.Node;
import scala.actors.remote.RemoteActor
import scala.actors.remote.RemoteActor._
object Source {
def main(args: Array[String]):Unit = {
val peer = Node("127.0.0.1", 43001)
val source = new RemoteSource(peer)
source.start()
}
}
class RemoteSource(peer: Node) extends Actor
{
def act() {
trapExit=true
alive(43001)
register(Symbol("source"),self)
val sinks = List(select(peer,Symbol("node03-0"))
,select(peer,Symbol("node03-1"))
)
sinks.foreach(link)
val futures = for(sink <- sinks; i <- 0 to 20) yield sink !! "hello "+i
futures.foreach( f => println(f()))
exit()
}
}
What am I doing wrong?
I'm guessing your problem is due to this line:
futures.foreach( f => println(f()))
in which you loop through all your futures and block on each in turn, waiting for its result. Blocking on futures is generally a bad idea and should be avoided. What you want to do instead is specify an action to carry out when the future's result is available. Try this:
futures.foreach(f => f.foreach(r => println(r)))
Here's an alternate way to say that with a for comprehension:
for (future <- futures; result <- future) { println(result) }
This blog entry is an excellent primer on the problem of blocking on futures and how monadic futures overcome it.
I have seen a similar case as well. When the code inside the thread throws certain types of exception and exits, the corresponding future.get never returns. One can try with raising an exception of java.lang.Error vs java.lang.NoSuchMethodError. The latter's corresponding future would never return.