Why is the actor "ask" pattern considered an anti-pattern or "code smell"?

From what I've gathered, the "ask" pattern is considered a bad practice and should be avoided. Instead, the recommended pattern is the "actor per request" model. However, this doesn't make sense to me, as the "ask" pattern does exactly this - it creates a lightweight actor per request. So why is this then considered bad, especially when futures are far more composable and are able to more elegantly handle the collation of multiple send/receives?

From Akka docs:
"There are performance implications of using ask since something needs
to keep track of when it times out, there needs to be something that
bridges a Promise into an ActorRef and it also needs to be reachable
through remoting. So always prefer tell for performance, and only ask
if you must."
But sometimes you want to send a message from outside of an actor, in which case you can use ask. Using ask guarantees that the returned Future completes within the specified timeout, either with the reply or with a timeout failure, and sometimes that's what you want. However, before you reach for the ask pattern, ask yourself whether you could just use Futures instead.
There is a place for ask, but its use should be very limited for the reasons quoted above.
You don't have to use actor per request. Some actors are meant to be long lived and some not. If an actor performs a potentially dangerous or blocking operation you might want to create one per request. Whatever fits your application logic.
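To make the quoted cost concrete, here is a minimal sketch of the bookkeeping an ask-style call needs: a Promise that is completed by the reply, plus a timer that fails it when the timeout passes. This is plain Scala, not Akka's implementation; `AskSketch`, `ask` and the `send` callback (a stand-in for "tell the actor and give it somewhere to reply") are names made up for illustration.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{Future, Promise}
import scala.concurrent.duration.FiniteDuration

object AskSketch {
  // One daemon timer thread tracks every pending timeout.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "ask-timer")
      t.setDaemon(true)
      t
    }
  })

  // `send` is a hypothetical stand-in for telling an actor and handing it a reply callback.
  def ask[A](send: (A => Unit) => Unit, timeout: FiniteDuration): Future[A] = {
    val promise = Promise[A]()
    // Bookkeeping that a plain tell does not need: a scheduled timeout per request.
    val pending = timer.schedule(new Runnable {
      def run(): Unit = promise.tryFailure(new TimeoutException(s"no reply within $timeout"))
    }, timeout.toMillis, TimeUnit.MILLISECONDS)
    send { reply =>
      // The first completion wins; cancel the timer once a reply has arrived.
      if (promise.trySuccess(reply)) pending.cancel(false)
    }
    promise.future
  }
}
```

Every call allocates a Promise and a scheduled task, which is roughly the overhead the docs are warning about. With tell, none of this exists.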

Related

Akka and singleton actors

I've recently started messing around with Akka's actors and http modules. However, I've stumbled upon a rather annoying little quirk, namely, creating singleton actors.
Here are two examples:
1)
I have an in-memory cache. My service is quite small (it's an app, rather), so I really like this in-memory model. I can hold most information relevant to the user in a Map (well, a map of lists, but still, quite an easy structure to reason about) and I don't get the overhead and complexity of a redis, geode or aerospike.
The only problem is that this in-memory cache can be modified by multiple sources, and said modifications must be synchronous. Instead of synchronizing all 3 access methods for this structure (e.g. by building a message queue or implementing locks), I thought I'd just wrap the structure and its access methods in an actor: built-in message queue, easy receive->send logic, and if things scale up it will be very easy to replace with DA actors over a dedicated in-memory db.
2) I have a "Service" layer that should be used to dispatch actors for various jobs (access the database, access the in-memory cache, do this computation with data and deliver the result to the user... etc).
It makes sense for this Service layer to be a "singleton" of sorts, a closure over some functions, since it does nothing that's blocking or cpu/memory intensive in any way. It simply assigns tasks further down the line (e.g. decides how many actors/threads/w.e. should be created and where a request should go).
However, this thing would require either:
a) Making both objects singleton actors or
b) Making both objects actual "objects" (as in the scala object notation that designates a single named singleton with functions that have closures over its scope)
There are plenty of problems with b). Namely, the service layer will have to get an actor system "passed" to it (and I'm not sure that's a best practice) in order to create actors; rather than creating its own children, it will create children using the global actor system, and the messaging and monitoring logic will be a lot more awkward and unintuitive. Also, the in-memory cache will not have the advantage of the built-in message queue (I'm not saying it's hard to implement one, but this seems like one of those situations where one goes "Oh, jolly, it's good that I have actors and I don't have to spend time implementing and testing this code").
a) seems to have the problem of being generally speaking poorly documented and unadvised in the akka documentation. I mean:
http://doc.akka.io/docs/akka/2.4/scala/cluster-singleton.html
Look at this shit, half of the docs are warning against using it, it was its own dependency, and quite frankly it's very hard to read for a poor sod like me who hasn't set foot in the functional & concurrent programming ivory tower.
So, ahm. Could any of you guys explain to me why it's bad to use singleton actors? How do you design singletons if they can't be actors? Is there any way to design singleton actors that won't cause a lot of damage down the line? Is the whole "service" model of having "global" services that are called rather than instantiated "un-Akka-like"?
Just to clarify the documentation, they're not warning against using it. They're warning that there are circumstances in which using a singleton will cause problems, which are expected given the circumstances. They mention the following situations:
If the singleton is a performance bottleneck. This makes sense. If everything relies on a single object that does work slowly, everything will be slow.
If the actor needs to be non-stop available, you'll run into problems if the singleton ever goes down, because those messages can't just be handled by another instance. It will take some amount of time to re-start the singleton before its work can be resumed.
The biggest problem happens if you have auto-downing turned on. Auto-downing is a policy by which an unreachable node is assumed to be down, and removed from the network. If you do this, but the node is not actually down but just unreachable due to a network partition, both sides of the partition will decide that they're the surviving nodes and create their own singletons. So now you have two singletons. Which is, of course, not what you want from a singleton. But you should never use auto-downing outside of testing anyway. It's a terrible recovery strategy that was included for completeness and convenience in testing.
So I don't read that as recommending against using it. Just being clear about the expected pitfalls if you do use it, based on the nature of the structure.

How to use Scala Futures the right way?

I'm wondering whether Futures are better used only in conjunction with Actors, rather than in a program that does not use Actors. Said differently, is performing asynchronous computation with Futures something that is better done within an Actor system?
Here is why I'm saying that:
1 -
You perform a computation whose result would trigger some action that you may do in another thread.
For instance, I have a long operation to determine the price of something, and from my main thread I decide to launch an asynchronous process for it. In the meantime I could be doing other things; then, when the response is ready/available or communicated back to me, I continue on that path.
I can see that with actors this is handy, because you can pipe a result to an actor. But with a typical threading model, you can either block or .... ?
2 -
Another issue: let's say I need to update the ages of a list of participants by getting some information online. Let's assume I have just one future for that task. Isn't closing over the participant list the wrong thing to do? Multiple threads may be accessing that participant list at the same time, so making the update within the future would simply be wrong, and in that case we would need a Java concurrent collection, wouldn't we?
Maybe I see it the wrong way, and futures are not meant to do side effects at all.
But in that case, fair enough, no side effects, but we still have the problem of getting a value back to the calling thread, which can only be done by blocking. I mean, let's imagine that the result would help the calling thread update some data structure. How do you do that update asynchronously without closing over that data structure somehow?
I believe callbacks such as onComplete can be used for side-effecting (am I right here?).
Still, the callback would have to close over the data structure anyway. Hence I don't see how to do it without using Actors.
PS: I like actors, I'm just trying to better understand the usage of futures without actors. I read everywhere that one should use actors only when necessary, that is, when state needs to be managed. It seems to me that, overall, using futures without actors always involves blocking somewhere down the line if the result needs to be communicated back at some point to the thread that initiated the asynchronous task.
Actors are good when you are dealing with mutable state, because they encapsulate the mutable state and allow only message-based interaction.
You can use a Future to execute in a different thread. You don't have to block on a Future, because Scala's Futures compose. So if you have multiple Futures in your code, you don't have to wait/block for all of them to complete. For example, if your pipeline is completely non-blocking or async (e.g., Play and Spray), you can return a Future back to the client.
Futures are lightweight compared to actors because you don't need a complete ActorSystem.
Here is a quote from Martin Odersky that I really like.
There is no silver bullet for all concurrency issues; the right
solution depends on what one needs to achieve. Do you want to define
asynchronous computations that react to events or streams of values?
Or have autonomous, isolated entities communicating via messages? Or
define transactions over a mutable store? Or, maybe the primary
purpose of parallel execution is to increase the performance? For each
of these tasks, there is an abstraction that does the job: futures,
reactive streams, actors, transactional memory, or parallel
collections.
So choose your abstraction based on your use case and needs.
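As a rough illustration of "Futures compose", here is a sketch of the participant-age example from the question written without any shared mutable list: each lookup returns a value, and the combined Future yields a new list. `Participant`, `lookupAge` and `refreshAges` are hypothetical names, and the lookup is faked with a trivial computation.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureComposition {
  implicit val ec: ExecutionContext = ExecutionContext.global

  final case class Participant(name: String, age: Int)

  // Hypothetical stand-in for the slow online lookup from the question.
  def lookupAge(name: String): Future[Int] = Future(name.length + 20)

  // Builds a new list instead of mutating a shared one; nothing blocks here.
  def refreshAges(participants: List[Participant]): Future[List[Participant]] =
    Future.traverse(participants) { p =>
      lookupAge(p.name).map(age => p.copy(age = age))
    }
}
```

The caller keeps chaining with map/flatMap (or hands the Future to a framework that accepts one), so nothing closes over a mutable structure. Blocking with Await is only needed at the very edge of a program, e.g. in a main method or a test.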

Should Akka FSM be used for batch operations?

I have read about FSM in Akka. Should it be used for batch operations? E.g. should I store, for example, 100 messages and pass them into a batch method? Or is Akka FSM just designed to work with states, and should it not be used for performance optimization with "classical" batch operations?
You certainly could use it like that if you want to, I see no direct problem with it. It all depends on your exact use case though.
In general switching a state is as expensive as an Actor#become, which may be little or much – depending on your access patterns.
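For what it's worth, the batching idea boils down to a two-state machine. The sketch below is plain Scala rather than Akka's FSM API (all names are made up for illustration): it shows the transitions you would encode, with a batch handed out whenever the buffer reaches the chosen size.

```scala
object BatchingSketch {
  sealed trait State
  case object Idle extends State
  final case class Collecting(buffer: Vector[String]) extends State

  // Returns the next state plus a batch to flush, if one just became full.
  def onMessage(state: State, msg: String, batchSize: Int): (State, Option[Vector[String]]) =
    state match {
      case Idle if batchSize <= 1 =>
        (Idle, Some(Vector(msg)))                 // degenerate case: every message is a batch
      case Idle =>
        (Collecting(Vector(msg)), None)           // first message starts a new batch
      case Collecting(buf) if buf.size + 1 >= batchSize =>
        (Idle, Some(buf :+ msg))                  // batch is full: flush and go back to Idle
      case Collecting(buf) =>
        (Collecting(buf :+ msg), None)            // keep buffering
    }
}
```

In an actor you would typically also flush on a timer, so a partially filled batch does not sit in the buffer forever; that part is left out here.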

Typed messages in akka

The Akka framework recommends using typed actors only for interacting with external code. However, standard actors from Akka are untyped. Is there any better way to create type-safe actors? Are there some other actor frameworks or type-safe wrappers around Akka?
If you really want actors with static typing, then you might as well go ahead and use typed actors throughout your code. However, this is strongly discouraged for a couple of reasons.
1.) You run the risk of your system degenerating into a bunch of RPCs. An actor's receive method makes it pretty obvious that the whole thing is about message passing, much less so if you're just calling methods on a typed actor.
2.) An actor just really doesn't have a type. While it's running, the messages an actor is able to process may change depending on what state it is in, as may what it does with those messages. This is an excellent way of modeling a lot of protocols, and Akka actors have first-class support for it with FSMs.
So if you really want to do it, you're free to use typed actors everywhere and it'll work, but you should really think hard about the problem you're trying to solve before doing so.
For compile-time checking, see the SynapseGrid framework. It defines a SystemBuilder that constructs the DataFlow topology. During construction, the types of the data that pass through are guaranteed to be checked. The resulting system is then converted to a RuntimeSystem with nested and properly interconnected actors.
Why is this a problem for you? akka.actor.Actor has the receive method of type PartialFunction that will only be called for messages that it can handle. Why do you need compile time checks? But to answer your question: one way would be - for an external api - to build a wrapper around your ActorRef that then sends the messages to the actor.
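A minimal sketch of that wrapper idea: define the protocol as a sealed trait and expose only a typed `tell`, so callers cannot send anything else. To keep the snippet free of dependencies, `untypedSend` stands in for an untyped `actorRef ! msg`; `CounterCommand` and `CounterRef` are hypothetical names, not part of Akka.

```scala
object TypedFacadeSketch {
  // The messages the wrapped actor is assumed to understand.
  sealed trait CounterCommand
  case object Increment extends CounterCommand
  final case class Add(n: Int) extends CounterCommand

  // `untypedSend` stands in for an untyped `actorRef ! msg`.
  final class CounterRef(untypedSend: Any => Unit) {
    // Only CounterCommand values compile here; anything else is rejected by the compiler.
    def tell(cmd: CounterCommand): Unit = untypedSend(cmd)
  }
}
```

With a real actor you would construct it as something like `new CounterRef(msg => actorRef ! msg)`. The actor's receive is still a PartialFunction over Any, so this only protects the sending side.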
Things are moving quite fast, so I thought I'd give an update:
1. Typed actors are deprecated
2. Instead, a new concept called Akka Typed is being developed at the moment
As I understand it, this should be the definitive solution for a typed actor system. But since this is at least the third try, and it is planned for Akka 2.4 at the earliest, this claim remains to be proven.
I personally look forward to having both systems available: the existing one for more dynamic use cases, the new one for more robust ones.

Akka framework support for finding duplicate messages

I'm trying to build a high-performance distributed system with Akka and Scala.
If a message requesting an expensive (and side-effect-free) computation arrives, and the exact same computation has already been requested before, I want to avoid computing the result again. If the computation requested previously has already completed and the result is available, I can cache it and re-use it.
However, the time window in which duplicate computation can be requested may be arbitrarily small. e.g. I could get a thousand or a million messages requesting the same expensive computation at the same instant for all practical purposes.
There is a commercial product called Gigaspaces that supposedly handles this situation.
However there seems to be no framework support for dealing with duplicate work requests in Akka at the moment. Given that the Akka framework already has access to all the messages being routed through the framework, it seems that a framework solution could make a lot of sense here.
Here is what I am proposing for the Akka framework to do:
1. Create a trait to indicate a type of messages (say, "ExpensiveComputation" or something similar) that are to be subject to the following caching approach.
2. Smartly (hashing etc.) identify identical messages received by (the same or different) actors within a user-configurable time window. Other options: select a maximum buffer size of memory to be used for this purpose, subject to (say LRU) replacement etc. Akka can also choose to cache only the results of messages that were expensive to process; the messages that took very little time to process can be re-processed again if needed; no need to waste precious buffer space caching them and their results.
3. When identical messages (received within that time window, possibly "at the same time instant") are identified, avoid unnecessary duplicate computations. The framework would do this automatically, and essentially, the duplicate messages would never get received by a new actor for processing; they would silently vanish and the result from processing it once (whether that computation was already done in the past, or ongoing right then) would get sent to all appropriate recipients (immediately if already available, and upon completion of the computation if not). Note that messages should be considered identical even if the "reply" fields are different, as long as the semantics/computations they represent are identical in every other respect. Also note that the computation should be purely functional, i.e. free from side-effects, for the caching optimization suggested to work and not change the program semantics at all.
If what I am suggesting is not compatible with the Akka way of doing things, and/or if you see some strong reasons why this is a very bad idea, please let me know.
What you are asking is not dependent on the Akka framework; rather, it's a matter of how you architect your actors and messages. First, ensure that your messages are immutable and have appropriately defined identities via the equals/hashCode methods. Case classes give you both for free; however, if you have ActorRefs embedded in the message for reply purposes, you will have to override the identity methods. The case class parameters should also have the same properties recursively (immutable and proper identity).
Secondly, you need to figure out how the actors will handle storing and identifying current/past computations. The easiest way is to uniquely map requests to actors. This way that actor, and only that actor, will ever process that specific request. This can be done easily given a fixed set of actors and the hashCode of the request. Bonus points if the actor set is supervised, where the supervisor manages the load balancing/mapping and replaces failed actors (Akka makes this part easy).
Finally, the actor itself can maintain a response-caching behavior based on the criteria you described. Everything is thread safe in the context of the actor, so an LRU cache keyed by the request itself (good identity properties, remember) is easy with any type of behavior you want.
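The three steps above can be sketched without Akka, using Futures in place of actors. The names (`Compute`, `workerIndex`, `InFlightCache`) are made up for illustration, and eviction (LRU, time window) is left out. Requests compare equal regardless of who asked, a hash picks the owner, and a cache of Futures lets duplicates that arrive while the computation is still running share its result.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{ExecutionContext, Future, Promise}

object DedupSketch {
  // Only the first parameter list takes part in a case class's equals/hashCode,
  // so two requests that differ only in the reply address still compare equal.
  final case class Compute(input: Int)(val replyTo: String)

  // Map a request to one of a fixed number of workers via its hash.
  def workerIndex(request: Compute, workers: Int): Int =
    Math.floorMod(request.hashCode, workers)

  final class InFlightCache(compute: Compute => Int)(implicit ec: ExecutionContext) {
    // A concurrent map, because no actor serialises access in this sketch.
    private val results = TrieMap.empty[Compute, Future[Int]]

    def resultFor(request: Compute): Future[Int] = {
      val promise = Promise[Int]()
      results.putIfAbsent(request, promise.future) match {
        case Some(existing) => existing            // duplicate: share the pending or finished result
        case None =>
          promise.completeWith(Future(compute(request)))  // first request: actually compute
          promise.future
      }
    }
  }
}
```

Inside a single actor a plain mutable Map would do, since the actor processes one message at a time. The concurrent map is only needed here because several threads may call `resultFor` at once.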
As Neil says, this is not really framework functionality; it's rather trivial to implement this and even abstract it into its own trait.
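trait CachingExpensiveThings { self: Actor =>
  val cache = ...  // elided in the original; the code below assumes java.util.Map-style get/put
  def receive: Actor.Receive = {
    case s: ExpensiveThing => cachedOrCache(s)
  }
  // Reply from the cache when possible; otherwise compute, store and reply.
  // Note: reply_? comes from an older Akka API; with current classic actors
  // the equivalent would be sender() ! result.
  def cachedOrCache(s: ExpensiveThing) = cache.get(s) match {
    case null =>
      val result = compute(s)
      cache.put(s, result)
      self.reply_?(result)
    case cached => self.reply_?(cached)
  }
  // Implemented by the concrete actor.
  def compute(s: ExpensiveThing): Any
}
class MyExpensiveThingCalculator extends Actor with CachingExpensiveThings {
  def compute(s: ExpensiveThing) = s match {
    case l: LastDigitOfPi => ...
    case ts: TravellingSalesman => ...
  }
}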
I do not know whether all of these responsibilities should be handled by Akka alone. As usual, it all depends on the scale, and in particular on the number of attributes that define the uniqueness of a message.
For the cache mechanism, the already mentioned approach of uniquely mapping requests to actors is the way to go, especially since it could be backed by persistence.
For identity, instead of checking simple equality (which may be a bottleneck), I would rather use a graph-based algorithm like signal-collect.