I have an intermediate remote actor (B) that is supposed to forward messages back and forth between A and C (like A <-> B <-> C). In B's code I have something like
loop {
  react {
    case msg =>
      val A = sender
      // 2) Should this be synchronous with !?
      C ! msg
      // 1) What's better: react or receive?
      react {
        case response => A ! response
      }
  }
}
3 Questions:
1) What's better react or receive (to nest within a react)?
2) Given that a response will be sent back, should !? be used instead of !
3) Any other recommendation for this scenario?
Thank you all!
In the standard actor model, messages are handled atomically: you cannot receive and process a message while you are still processing another one, which is exactly what you are trying to do here.
However, Scala Actors have relaxed semantics, which may allow you to do that.
For Question 1, you should be clear on the differences between react and receive: receive blocks the underlying thread while it waits, whereas react is event-based and never returns normally. In any case, you can easily use react (as used here http://www.scala-lang.org/docu/files/actors-api/actors_api_guide.html)
Alternatively, you could avoid nesting altogether. After your actor has sent the request, its state should change so that on the next loop cycle it looks for the reply.
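The non-nested approach described above can be sketched as a small state machine. This is only a model of B's behaviour in plain Scala, not actor code: State, Idle, AwaitingReply, Send and step are hypothetical names, and actors are identified by plain strings.

```scala
// Hypothetical model of actor B's behaviour as a pure state machine.
sealed trait State
case object Idle extends State
final case class AwaitingReply(requester: String) extends State

// An outgoing message: who it goes to and what it carries.
final case class Send(to: String, payload: Any)

// One loop cycle: given the current state and an incoming message,
// return the next state and the message to send (if any).
def step(state: State, from: String, payload: Any): (State, Option[Send]) =
  state match {
    case Idle =>
      // A request arrived: remember who asked and pass it on to C.
      (AwaitingReply(from), Some(Send("C", payload)))
    case AwaitingReply(requester) if from == "C" =>
      // The reply arrived: hand it back and return to the idle state.
      (Idle, Some(Send(requester, payload)))
    case waiting =>
      // Anything else while waiting is ignored in this sketch;
      // a real actor would have to queue or reject it.
      (waiting, None)
  }
```

In a real actor the state would live in a var (or in become/unbecome with Akka), and messages arriving while B waits would need an explicit policy.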
You may also want to upgrade to Scala 2.10, which integrates actors from Akka; that model is clearer and easier to use.
Related
We have a fairly complex system developed using Akka HTTP and Actors model. Until now, we extensively used ask pattern and mixed Futures and Actors.
For example, an actor gets a message, needs to execute 3 operations in parallel, combine a result out of that data and return it to the sender. What we do is:
1) declare a new variable in the actor's receive callback to store the sender (since we use Future.map, sender may be a different one by the time the callback runs).
2) execute all 3 futures in parallel using Future.sequence (sometimes it is a call to a function that returns a future, and sometimes it is an ask to another actor to get something from it)
3) combine the results of all 3 futures using map or flatMap on the result of Future.sequence
4) pipe the final result to the sender using pipeTo
Here is a code simplified:
case RetrieveData(userId, `type`, id, lang, paging, timeRange, platform) => {
val sen = sender
val result: Future[Seq[Map[String, Any]]] = if (paging.getOrElse(Paging(0, 0)) == Paging(0, 0)) Future.successful(Seq.empty)
else {
val start = System.currentTimeMillis()
val profileF = profileActor ? Get(userId)
Future.sequence(Seq(profileF, getSymbols(`type`, id), getData(paging, timeRange, platform))).map { result =>
logger.info(s"Got ${result.size} news in ${System.currentTimeMillis() - start} ms")
result
}.recover { case ex: Throwable =>
logger.error(s"Failure on getting data: ${ex.getMessage}", ex)
Seq.empty
}
}
result.pipeTo(sen)
}
Function getAndProcessData contains Future.sequence with executing 3 futures in parallel.
Now, as I'm reading more and more on Akka, I see that using ask creates another listener actor. Questions are:
As we use ask extensively, can it lead to too many threads being used in the system, and perhaps to thread starvation at times?
Using Future.map a lot also often means running on a different thread. I read about the single-threaded illusion of actors, which can easily be broken by mixing in Futures.
Also, can this affect performance in a bad way?
Do we need to store sender in the temp variable sen, since we're using pipeTo? Could we just do pipeTo(sender)? Also, does declaring sen in almost every receive callback waste too many resources? I would expect its reference to be removed once the operation is complete.
Is there a chance to design such a system in a better way, meaning that we don't use map or ask so much? I looked at examples where you just pass a replyTo reference to some actor and then use tell instead of ask. Also, sending a message to self and then replying to the original sender can replace working with Future.map in some scenarios. But how can it be designed, keeping in mind that we want to perform 3 async operations in parallel and return formatted data to the sender? We need all 3 operations to complete before we can format the data.
I tried not to include too many examples; I hope you understand our concerns and problems. Many questions, but I would really love to understand how it works, simply and clearly
Thanks in advance
If you want to do 3 things in parallel you are going to need to create 3 Future values which will potentially use 3 threads, and that can't be avoided.
I'm not sure what the issue with map is, but there is only one call in this code and that is not necessary.
Here is one way to clean up the code to avoid creating unnecessary Future values (untested!):
case RetrieveData(userId, `type`, id, lang, paging, timeRange, platform) =>
if (paging.forall(_ == Paging(0, 0))) {
sender ! Seq.empty
} else {
val sen = sender
val start = System.currentTimeMillis()
val resF = Seq(
profileActor ? Get(userId),
getSymbols(`type`, id),
getData(paging, timeRange, platform),
)
Future.sequence(resF).onComplete {
case Success(result) =>
val dur = System.currentTimeMillis() - start
logger.info(s"Got ${result.size} news in $dur ms")
sen ! result
case Failure(ex) =>
logger.error(s"Failure on getting data: ${ex.getMessage}", ex)
sen ! Seq.empty
}
}
You can avoid ask by creating your own worker thread that collects the different results and then sends the result to the sender, but that is probably more complicated than is needed here.
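The Future.sequence pattern discussed above can be tried in isolation with nothing but the standard library. This is a minimal sketch: fetchProfile, fetchSymbols, fetchData and retrieveAll are hypothetical stand-ins for the ask and the two helper calls in the question, and the global execution context is assumed.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-ins for the three asynchronous lookups.
def fetchProfile(): Future[Map[String, Any]] = Future(Map("kind" -> "profile"))
def fetchSymbols(): Future[Map[String, Any]] = Future(Map("kind" -> "symbols"))
def fetchData(): Future[Map[String, Any]] = Future(Map("kind" -> "data"))

def retrieveAll(): Future[Seq[Map[String, Any]]] = {
  // Calling the three functions starts all three futures before any is awaited,
  // so they can run concurrently.
  val pending = Seq(fetchProfile(), fetchSymbols(), fetchData())
  // sequence keeps the input order; recover turns any failure into an empty result.
  Future.sequence(pending).recover { case _: Exception => Seq.empty }
}
```

The results come back in the order the futures were listed, regardless of which one finishes first.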
An actor only consumes a thread in the dispatcher when it is processing a message. Since the number of messages the actor spawned to manage the ask will process is one, it's very unlikely that the ask pattern by itself will cause thread starvation. If you're already very close to thread starvation, an ask might be the straw that breaks the camel's back.
Mixing Futures and actors can break the single-thread illusion, if and only if the code executing in the Future accesses actor state (meaning, basically, vars or mutable objects defined outside of a receive handler).
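The hazard can be reproduced without Akka. In this sketch a plain var stands in for the actor's mutable sender, and a Promise stands in for the asynchronous work; demo, currentSender, gate and captured are hypothetical names. It is meant to show why copying the value into a local val before the Future callback matters.

```scala
import scala.concurrent.{Await, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def demo(): (String, String) = {
  var currentSender = "A"        // stands in for the actor's sender
  val gate = Promise[Unit]()     // the async work finishes when this completes
  val captured = currentSender   // like `val sen = sender`

  // Reads the var when the callback runs, not when it was registered.
  val unsafe = gate.future.map(_ => currentSender)
  // Reads the value that was captured up front.
  val safe = gate.future.map(_ => captured)

  currentSender = "B"            // the actor has moved on to the next message
  gate.success(())               // only now does the async work complete

  (Await.result(unsafe, 5.seconds), Await.result(safe, 5.seconds))
}
```

The first element of the result is the later sender and the second is the original one, which is why the captured val costs almost nothing and is worth keeping whenever sender is used inside a callback.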
Request-response and at-least-once (between them, they cover at least most of the motivations for the ask pattern) will in general limit throughput compared to at-most-once tells. Implementing request-response or at-least-once without the ask pattern might in some situations (e.g. using a replyTo ActorRef for the ultimate recipient) be less overhead than piping asks, but probably not significantly. Asks as the main entry-point to the actor system (e.g. in the streams handling HTTP requests or processing messages from some message bus) are generally OK, but asks from one actor to another are a good opportunity to streamline.
Note that, especially if your actor imports context.dispatcher as its implicit ExecutionContext, transformations on Futures are basically identical to single-use actors.
Situations where you want multiple things to happen, especially when you need to manage partial failure, are potential candidates for a saga actor that organizes one particular request/response. Future.sequence followed by recover is a possible sign of this situation, especially if the recover gets nontrivial.
I would suggest that, instead of Future.sequence, you use Source from Akka Streams, which runs the futures in parallel and also lets you set the parallelism.
Here is the sample code:
Source.fromIterator(() => Seq(profileF, getSymbols(`type`, id), getData(paging, timeRange, platform)).iterator)
  .mapAsync(parallelism = 3)(identity) // each element is already a Future, so just await it
  .runWith(Sink.seq)
This will return Future[Seq[Map[String, Any]]]
I'm working on implementing a small language to send tasks for execution and control execution flow. After sending a task to my system, the user gets a future (on which they can call a blocking get() or flatMap()). My question is: is it OK to send futures in Akka messages?
Example: actor A sends a message Response to actor B and Response contains a future among its fields. Then at some point A will fulfill the promise from which the future was created. After receiving the Response, B can call flatMap() or get() at any time.
I'm asking because Akka messages should be immutable and work even if actors are on different JVMs. I don't see how my example above can work if actors A and B are on different JVMs. Also, are there any problems with my example even if actors are on same JVM?
Something similar is done in the accepted answer in this stackoverflow question. Will this work if actors are on different JVMs?
Without remoting it's possible, but still not advisable. With remoting in play it won't work at all.
If your goal is to have an API that returns Futures, but uses actors as the plumbing underneath, one approach could be that the API creates its own actor internally that it asks, and then returns the future from that ask to the caller. The actor spawned by the API call is guaranteed to be local to the API instance and can communicate with the rest of the actor system via the regular tell/receive mechanism, so that there are no Futures sent as messages.
class MyTaskAPI(actorFactory: ActorRefFactory) {
def doSomething(...): Future[SomethingResult] = {
val taskActor = actorFactory.actorOf(Props[MyTaskActor])
(taskActor ? DoSomething(...)).mapTo[SomethingResult]
}
}
where MyTaskActor receives the DoSomething, captures the sender, sends out the request for task processing and likely switches to a state in which it waits for SomethingResult, which it finally passes on to the captured sender before stopping itself. This approach creates two actors per request, one explicitly (the MyTaskActor) and one implicitly (the handler of the ask), but keeps all state inside actors.
Alternately, you could use the ActorDSL to create just one actor inline of doSomething and use a captured Promise for completion instead of using ask:
class MyTaskAPI(system: ActorSystem) {
  def doSomething(...): Future[SomethingResult] = {
    val p = Promise[SomethingResult]()
    // actor(...) needs an implicit ActorRefFactory (such as the system) in scope
    val tmpActor = actor(new Act {
      become {
        case msg: SomethingResult =>
          p.success(msg)
          context.stop(self)
      }
    })
    system.actorSelection("user/TaskHandler").tell(DoSomething(...), tmpActor)
    p.future
  }
}
This approach is a bit off the top of my head and it does use a shared value between the API and the temp actor, which some might consider a smell, but should give an idea how to implement your workflow.
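The core idea, keeping the Promise on the caller's side and letting only plain values travel through the messaging layer, can be sketched without Akka. TaskApi and its submit parameter are hypothetical: submit stands in for whatever sends the task to the actors and later invokes the callback with the result.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

// `submit` is a hypothetical stand-in for the actor plumbing:
// it takes a task and a callback to invoke with the result.
final class TaskApi(submit: (String, String => Unit) => Unit) {
  def doSomething(task: String): Future[String] = {
    val p = Promise[String]()
    // Only the task (a plain value) crosses the boundary; the promise stays local.
    submit(task, result => p.trySuccess(result))
    p.future
  }
}
```

Because the Future never leaves the JVM that created it, it does not matter whether the component behind submit is local or remote.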
If you're asking if it's possible, then yes, it's possible. Remote actors are basically interprocess communication. If you set everything up on both machines to a state where both can properly handle the future, then it should be good. You don't give any working example so I can't really delve deeper into it.
From the class Principles of Reactive Programming on Coursera:
"If an actor sends multiple messages to the same destination, they will not arrive out of order (this is Akka-specific)."
actors A and B
A sends B msg1
A sends B msg2
B will receive msg1 and then msg2
Warning: I've never programmed in Erlang
I believe that this message ordering semantic isn't guaranteed in Erlang. This seems like a HUGE difference that affects the different types of programs you could write using what are supposed to be similar frameworks.
For example in Akka you could do:
case class Msg(x: Int)
case object Report
class B extends Actor {
  var i: Int = 0
  def receive = {
    case Msg(x) => i = i + x
    case Report => sender ! i
  }
}
then you could do
A send B Msg(5)
A send B Msg(6)
A send B Report // guarantees that the sum will be 11
My main point is that in Erlang it seems that you couldn't guarantee that the sum returned would be 11. Does Erlang discourage or even forbid Actors from containing any mutable state? Can anyone elaborate on the different types of programs that can and cannot be written with Actors in Scala's Akka vs. Erlang?
As Pascal said, the order of messages between two processes IS guaranteed.
In Erlang, the only way to have some "mutable state" is to hide it behind an actor. It is usually done this way:
loop(Sum) ->
    NewSum = receive
        {msg, Number, Sender} -> add_and_reply(Sum, Number, Sender);
        _ -> Sum
    end,
    loop(NewSum).

add_and_reply(Sum, Number, Sender) ->
    NewSum = Sum + Number,
    Sender ! NewSum,
    NewSum.
This way, you don't mutate anything. You create new state and pass it as an argument in endless recursion. The actor that runs the loop makes sure that all the calls are served one by one, because it accepts only one message at a time.
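The same state-passing style can be written in Scala. This is a sketch over a finite list of messages rather than a real mailbox; Message, Add, Other and run are hypothetical names, and replies are left out.

```scala
import scala.annotation.tailrec

sealed trait Message
final case class Add(n: Int) extends Message
case object Other extends Message

// Processes the inbox one message at a time and returns the final state.
def run(inbox: List[Message]): Int = {
  @tailrec
  def loop(sum: Int, remaining: List[Message]): Int = remaining match {
    case Nil            => sum                  // inbox drained: final state
    case Add(n) :: rest => loop(sum + n, rest)  // new state passed as an argument
    case _ :: rest      => loop(sum, rest)      // unknown message: state unchanged
  }
  loop(0, inbox)
}
```

No variable is ever reassigned; each step receives the previous sum and hands a new one to the next call.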
For me, the main difference between Erlang and Akka is preemptive scheduling. In Erlang, you can write actor that does this:
loop(Foo) ->
something_processor_consuming(),
loop(Foo).
and your system will still work. In most languages where actors are added as a library, such an actor would occupy its thread forever, blocking one CPU core. Erlang is able to preempt the process, run something else and come back to it later, so it behaves well even if you make a mistake. Read more here. In Erlang, you can kill a process from outside with exit(Pid, kill) and it will terminate immediately. In Akka, the actor continues processing until it is ready for the next message (you can simulate that in Erlang using the trap_exit flag).
Erlang was built with fault tolerance and soft real-time systems in mind from the start, so OTP emphasises supervising processes. For example, in Scala a supervisor has the chance to resume a child on error, while in Erlang the child crashes on error and has to be restarted. This rests on the assumption that a crash means bad state, which we don't want to propagate.
The answer is yes, it is guaranteed: see the FAQ (it is not explicitly written, but the only way to send messages in a known order is to send them from the same process)
10.8 Is the order of message reception guaranteed?
Yes, but only within one process.
If there is a live process and you send it message A and then message B, it's guaranteed that if message B arrived, message A arrived before it.
On the other hand, imagine processes P, Q and R. P sends message A to Q, and then message B to R. There is no guarantee that A arrives before B. (Distributed Erlang would have a pretty tough time if this was required!)
Yes, you can guarantee this case -- "Erlang messages" does not mean "simple UDP".
A can send B "1,2,3,4,5" and it will get exactly "1,2,3,4,5" in that order, regardless where A and B are in a cluster -- consider the implication of the last part of that statement....
What is not guaranteed is what order messages "a,b,c,d,e" from C to B will arrive at B relative to possible interleaving with A's concurrent message stream. "1,2,a,b,c,3,4,5,d,e" is as likely as "1,2,3,4,5,a,b,c,d,e" or any other interleaving of the two streams of independently ordered messages.
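The guarantee can be stated as a check: whatever B receives, filtering it down to one sender's messages must give back exactly what that sender sent. A small sketch, assuming every message value is distinct; keepsOrder is a hypothetical helper name.

```scala
// True if the messages from one sender appear in `delivered`
// in the same order in which they were sent.
// Assumes all message values are distinct.
def keepsOrder[A](delivered: Seq[A], sent: Seq[A]): Boolean =
  delivered.filter(x => sent.contains(x)) == sent
```

Applied to the interleavings above, both senders' streams pass the check for "1,2,a,b,c,3,4,5,d,e", while a delivery that swaps 1 and 2 would fail it.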
This is a design question;
Say I have a tree of actors which do a bunch of processing. The processing is kicked off by a client/connection actor (i.e. the tree is the server). Eventually the client actor wants a response. I.e. I have an actor system that looks like this.
ActorA <---reqData--- Client_Actor
   | msgA                  /|\
  \|/                       |
ActorB                      |
 msgB |  \ msgD             |
     \|/  \/                |
ActorC   ActorD ---msgY---->|
   |___________msgX_________|
The response that the client system wants is the output from the leaf actors (i.e. ActorC and/or ActorD). These actors in the tree may be interacting with external systems. This tree may be a set of pre-defined, possibly routed actors (i.e. Client_Actor just has an ActorRef to the root of the actor tree, ActorA).
The question is what is the best pattern to manage sending the response (msgX &/or msgY) from the final/leaf actors back to the client actor?
I can think of the following options;
Create a tree for each connection client and get the actors to keep track of the sender, when they get a msgX or msgY, send it back to the original sender ref so the messages are passed back up through the tree. I.e each actor will keep a ref of the original sender.
Somehow send down the Client_Actor ref in the reqData message and replicate this for all messages used in the tree so the leaf actors can reply directly to the Client_actor... This seems like the most performant option. Not sure how to do this (I'm thinking a trait somehow on the message case classes that holds the client actor ref)...
Somehow lookup the client actor based on a unique id in the messages passed through the tree or use the actorselection (not sure how well this would work with remoting)...
Something better...
FYI I'm using Akka 2.2.1.
Cheers!
You could use the forward method to pass the message on to the next actor at each level while keeping the original sender.
in Client_Actor:
actorA ! "hello"
in ActorA:
def receive = {
case msg =>
???
actorB forward msg
}
in ActorB:
def receive = {
case msg =>
???
actorC forward msg
}
in ActorC:
def receive = {
case msg =>
???
sender ! "reply" // sender is Client_Actor!
}
In this case, the 'sender' field of the message will never change, so ActorC will reply to the original Client_Actor!
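The difference between forwarding and re-sending can be modelled in a few lines of plain Scala. This is not Akka code: Envelope and Node are hypothetical names, and the sender is just a string.

```scala
// A message together with the name of the actor it claims to come from.
final case class Envelope(payload: Any, sender: String)

final class Node(val name: String) {
  // tell: the envelope names this node as the sender.
  def tell(payload: Any): Envelope = Envelope(payload, sender = name)
  // forward: pass the envelope on with its original sender intact.
  def forward(incoming: Envelope): Envelope = incoming
}
```

If an intermediate node used tell on the payload instead of forward, the leaf would see that intermediate node as the sender and its reply would stop there.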
You can extend this further by using the tell method variant that lets you specify the sender:
destinationActor.tell("my message", someSenderActor);
The simplest way is to send messages together with a ref to the source Client_Actor
Client
sendMsg(Client to, Client resultTo)
Client_Actor
req_data(Client to){
sendMsg(to, this);
}
This is a good option if you don't know which Client will have the result for the original requester and which will not.
If you do know this and there is only one Client_Actor (as in a tree where only the leaves respond, and always to the Client_Actor), you can do something like this:
Client
register_actor(Client actor){this.actor = actor;}
call_actor(){ this.actor.sendMsg(); }
For situations like this, I wrote something called a ResponseAggregator. It is an Actor instantiated as needed (rather than as a persistent single instance) taking as arguments a destination ActorRef, an arbitrary key (to distinguish the aggregator if a single destination gets fed by more than one aggregator), a completion predicate that takes a Seq[Any] holding responses received by the aggregator so far and which returns true if those responses represent completion of the aggregation process and a timeout value. The aggregator accepts and collects incoming messages until the predicate returns true or the timeout expires. Once aggregation is complete (including due to timeout) all the messages that have been received are sent to the destination along with a flag indicating whether or not aggregation timed out.
The code is a bit too big to include here and is not open source.
For this to work, the messages propagating through the system must bear ActorRefs indicating to whom a response message is to be sent (I rarely design actors that reply only to sender).
I often define the replyTo field of a message value as ActorRef* and then use my MulticastActor class, which enables the !* "send to multiple recipients" operator. This has the advantage of syntactic cleanliness in the message construction (by comparison to using Option[ActorRef] or Seq[ActorRef]) and has equal overhead (requiring the construction of something to capture the reply-to actor ref or refs).
Anyway, with these things, you can set up pretty flexible routing topologies.
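The collecting part of the aggregator described above can be sketched in plain Scala. This is not the author's implementation: Aggregator and offer are hypothetical names, and the destination ActorRef, the key and the timeout are all left out.

```scala
// Collects messages until the completion predicate is satisfied.
final class Aggregator(isComplete: Seq[Any] => Boolean) {
  private var received = Vector.empty[Any]

  // Adds one response; returns everything collected so far
  // once the predicate reports completion, otherwise None.
  def offer(msg: Any): Option[Seq[Any]] = {
    received = received :+ msg
    if (isComplete(received)) Some(received) else None
  }
}
```

For the tree in the question, the predicate could simply wait for two responses (msgX and msgY); in an actor, offer would be the body of receive and the Some case would send the batch to the destination.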
I've now written a few applications using scala actors and I'm interested in how people have approached or dealt with some of the problems I've encountered.
A plethora of Message classes or !?
I have an actor which reacts to a user operation and must cause something to happen. Let's say it reacts to a message UserRequestsX(id). A continuing problem I have is that, because I want to modularize my programs, a single actor on its own is unable to complete the action without involving other actors. For example, suppose I need to use the id parameter to retrieve a bunch of values and then these need to be deleted via some other actor. If I were writing a normal Java program, I might do something like:
public void reportTrades(Date date) {
Set<Trade> trades = persistence.lookup(date);
reportService.report(trades);
}
Which is simple enough. However, using actors this becomes a bit of a pain because I want to avoid using !?. One actor reacts to the ReportTrades(date) message but it must ask a PersistenceActor for the trades and then a ReportActor to report them. The only way I've found of doing this is to do:
react {
case ReportTrades(date) =>
persistenceActor ! GetTradesAndReport(date)
}
So that in my PersistenceActor I have a react block:
react {
case GetTradesAndReport(date) =>
val ts = trades.get(date) // from persistent store
reportActor ! ReportTrades(ts)
}
But now I have 2 problems:
I have to create extra message classes to represent the same request (i.e. "report trades"). In fact I have three in this scenario but I may have many more - it becomes a problem keeping track of these
What should I call the first and third message ReportTrades? It's confusing to call them both ReportTrades (or if I do, I must put them in separate packages). Essentially there is no such thing as overloading a class by val type.
Is there something I'm missing? Can I avoid this? Should I just give up and use !? Do people use some organizational structure to clarify what is going on?
To me, your ReportTrades message is mixing two different concepts. One is a Request, the other is a Response. They might be named GetTradesReport(Date) and SendTradesReport(List[Trade]), for example. Or, maybe, ReportTradesByDate(Date) and GenerateTradesReport(List[Trade]).
Are there some objections to using reply? Or passing trades around? If not, your code would probably look like
react {
case ReportTrades(date) => persistenceActor ! GetTrades(date)
case Trades(ts) => // do smth with trades
}
and
react {
case GetTrades(date) => reply(Trades(trades.get(date)))
}
respectively.
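The naming idea can be made concrete by giving the request and the response their own types. A minimal sketch in plain Scala: Trade's fields and handleGetTrades are assumptions for illustration, with the handler standing in for the persistence actor's react block.

```scala
import java.time.LocalDate

// Hypothetical shape of a trade, just enough for the example.
final case class Trade(id: Int, date: LocalDate)

// Request and response get distinct names and types.
final case class GetTrades(date: LocalDate)
final case class Trades(ts: List[Trade])

// Stand-in for the persistence actor's handler: one request in, one reply out.
def handleGetTrades(store: Map[LocalDate, List[Trade]])(req: GetTrades): Trades =
  Trades(store.getOrElse(req.date, Nil))
```

With this split, ReportTrades(date) remains the only message that means "the user asked for a report", and the other two names describe what is being exchanged rather than repeating the request.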