Typed messages in akka - scala

Akka framework recommends using typed actor only for interacting with external code. However, standard actors from akka are untyped. Is there any better way to create type safe actors? Are there some other actor frameworks or type safe wrappers around akka?

If you really want actors with static typing, then you might as well go ahead and use typed actors throughout your code. This is strongly discouraged for a couple of reasons.
1.) You run the risk of your system degenerating into a bunch of RPCs. An actor's receive method makes it pretty obvious that the whole thing is about message passing, much less so if you're just calling methods on a typed actor.
2.) An actor just really doesn't have a type. While it's running, the messages an actor is able to process may change depending on what state is in, as may what it does with those messages. This is an excellent way of modeling a lot of protocols, and Akka actors have first class support for it with FSMs.
So if you really want to do it, you're free to used typed actors everywhere and it'll work, but you should really think hard about the problem you're trying to solve before doing so.

For compile time checking see SynapseGrid framework. It defines a SystemBuilder that constructs the DataFlow topology. While constructing it is guaranteed that types that pass by are checked. Then the resulting system is converted to RuntimeSystem with nested and properly interconnected actors.

Why is this a problem for you? akka.actor.Actor has the receive method of type PartialFunction that will only be called for messages that it can handle. Why do you need compile time checks? But to answer your question: one way would be - for an external api - to build a wrapper around your ActorRef that then sends the messages to the actor.

Things are going quite fast, I thought about giving an update
1. Typed actors are deprecated
2. Instead a new concept of Akka Typed is being devloped at the momemnt
As I understood this should be the definitive solution to an typed actor system. But since this is at least the third try and planned earliest for Akka 2.4, this claim remains to be proven.
I personally do look forward to have both systems available: the existing one for more dynamic use cases, the new one for more robust ones

Related

Akka and singleton actors

I've recently started messing around with akka's actors and http modules. However I've stumbled upon a rather annoying little quirk, namely, creating singelton actors.
Here are two examples:
1)
I have an in-memory cache, my service is quite small (its an app rather) so I really like this in memory model. I can hold most information relevant to the user in a Map (well, a map of lists, but still, quite an easy to reason about structure) and I don't get the overhead and complexity of a redis, geode or aerospike.
The only problem is that this in-memory chache can be modified, by multiple sources and said modifications must be synchronous. Instead of synchornizing all 3 acess methods for this structure (e.g. by building a message queue or implementing locks) I thought I'd just wrap the structure and its access methods into an actor, build in message queue, easy receive->send logic and if things scale up it will be very easy to replace with a DA actors over a dedicated in memory db.
2) I have a "Service" layer that should be used to dispatch actors for various jobs (access the database, access the in-memory cache, do this computation with data and deliver the result to the user... etc).
It makes sense of this Service layer to be a "singleton" of sorts, a closure over some functions, since it does nothing that's blocking or cpu/memory intensive in any way, it simply assigns tasks further down the line (e.g. decides how many actors/thread/w.e should be created and where a request should go)
However, this thing would require either:
a) Making both object singleton actors or
b) Making both objects actual "objects"(as in the scala object notation that designates a single named singleton with functions that have closures over its scope)
There are plenty of problems with b), namely that the service layer will either have to get an actors system "passed" to it (and I'm not sure that's a best practice) in order o create actors, rather than creating its own "childrens" it will create children's using the global actors system and the messaging and monitoring logic will be a lot more awkward and unintuitive. Also, that the in-memory cache will not have the advantage of the built in message que (I'm not saying its hard to implement one, but this seems like one of those situation where one goes "Oh, jolly, its good that I have actors and I don't have to spend time implementing and testing this code")
a) seems to have the problem of being generally speaking poorly documented and unadvised in the akka documentation. I mean:
http://doc.akka.io/docs/akka/2.4/scala/cluster-singleton.html
Look at this shit, half of the docs are warning against using it, it was its own dependency and quite frankly its very hard to read for a poor sod like me which hasn't set foot in the functional&concurrent programming ivory tower.
So, ahm. Could any of you guys explain to me why its bad to use singleton actors ? How do you design singletons if they can't be actors ? Is there any way to design singleton actors that won't cause a lot of damage down the line ? Is the whole "service" model of having "global" services that are called rather than instantiated "un akka like" ?
Just to clarify the documentation, they're not warning against using it. They're warning that there are circumstances in which using a singleton will cause problems, which are expected given the circumstances. They mention the following situations:
If the singleton is a performance bottleneck. This makes sense. If everything relies on a single object that does work slowly, everything will be slow.
If the actor needs to be non-stop available, you'll run into problems if the singleton ever goes down, because those messages can't just be handled by another instance. It will take some amount of time to re-start the singleton before its work can be resumed.
The biggest problem happens if you have auto-downing turned on. Auto-downing is a policy by which an unreachable node is assumed to be down, and removed from the network. If you do this, but the node is not actually down but just unreachable due to a network partition, both sides of the partition will decide that they're the surviving nodes and create their own singletons. So now you have two singletons. Which is, of course, not what you want from a singleton. But you should never use auto-downing outside of testing anyway. It's a terrible recovery strategy that was included for completeness and convenience in testing.
So I don't read that as recommending against using it. Just being clear about the expected pitfalls if you do use it, based on the nature of the structure.

Is the actor model not an anti-pattern, as the fire-and-forget style forces actors to remember a state?

When learning Scala, one of the first things I learned was that every function returns something. There is no "void"-function/method as there is, for instance in Java. Thus many Scala-functions are true functions, in a mathematic way, and objects can remain largely stateless.
Now I learned that the actor model is a very popular model among functional languages like Scala. However, actors promote a fire-and-forget style of programming, and callers usually don't expect callees to directly reply to messages (except when using the "ask"/"?"-method). Therefore, actors need to remember some sort of state.
Am I right assuming that the actor model is more like a trade-off between scalability and maintainability (due to its statefulness), and could sometimes even be considered an anti-pattern?
Yes you're essentially right (I'm not quite sure what you have in mind when you say scalability vs maintainability).
Actors are popular in Scala because of Akka (which presumably is in turn popular because of the support it gets from Lightbend). It is, not however, the case that actors are overwhelmingly popular in general in the functional programming world (although implementations exist for all the languages I'm thinking of). Below are my vastly simplified impressions (so take them with the requisite amount of salt) of two other FP language communities, both of which use actors (far?) less frequently than Scala does.
The Haskell community tends to use either STM/channels (often in an STM context). Straight up MVars also get used surprisingly often.
The Clojure community sometimes touts its own built-in version of STM, but its flagship concurrency model is really core.async, which is at its heart again channels.
As an aside STM, channels, and actors can all be layered upon one another; its sort of weird to compare them as if they were mutually exclusive approaches. In practice though it's rare to see them all used in tandem.
Actors do indeed involve state (and in the case of Akka skirt type safety) and as a result are very expressive and can pretty much do anything concurrency-wise. In this way they're similar to side-effectful functions, which are more expressive than pure functions. Indeed actors in a way are the pure essence of OO, with all its pros and cons.
As such there is a sizable chunk of the Scala community that would say yes, if most of the time when you face concurrency issues, you're using actors, that's probably an anti-pattern.
If you can, try to get away with just using Futures or scalaz.concurrent.Tasks. In return for less expressiveness you get more composability.
If your problem naturally lends itself to a single, global state (e.g. in the form of global invariants that you want to enforce), think about STM. In the Scala community, although an STM library exists, my impression is that STM is usually emulated by using actors.
If your concurrency problems mainly revolves around streaming multiple sources of data, think about using one of Scala's streaming libraries.
Actors are specifically a tool in the toolbox for handling and distributing state. So yes, they should have state - if they don't then you just could use Futures.
Please note however that Actors (at least Akka Actors) handle distribution (running location-transparently on multiple nodes) which neither functions of Futures are able to do. The concurrency aspects of Actors are a result of them handling the more complex case - networking. In that sense, Actors unify the remote case with the local case, by making the remote case be first-class. And as it turns out, on networks messaging is exactly what you can both count and build on if you want reliable, resilient and also fast systems.
Hope this answers the "big picture" part of your question.

Process-level parallelism in Scala

I'd like to use reflection in combination with parallel processing in Scala, but I'm getting bitten by reflection's lack of thread safety.
So, I'm considering just running each task in its own process (not thread).
Is there any easy way to do this?
For example, is there a way to configure .par so it spawns processes, not threads? Or is there some function fork that takes a closure and runs it in a new process?
EDIT: Futures are apparently a good way to go.
However, I still need to figure out how to run them in separate processes.
EDIT 2: I'm still having concurrency issues, even when using Akka's "fork-join-executor" dispatcher, which sure sounds like it should be forking processes. However, when I run ManagementFactory.getRuntimeMXBean().getName() inside the Futures, it seems everything still lives in the same process.
Is this the right way to check for actual process-level parallelism?
Am I using the correct Akka dispatcher?
EDIT 3: I realize reflection sucks. Unfortunately it is used in a library I need.
Have you looked into Scala Actors or Akka? There may be no more compelling reason to use Scala than for parallel and asynchronous programming. It's baked into the language. Check out these facilities. I'm pretty sure you'll find what you need.
There's little information as regards the problem you're trying to solve here...previous answers are pretty much on the ball - look at Actors etc...Akka and you may find that you don't need to necessarily do anything too complicated. Introspection/reflection in a multi-threaded environment usually means a messy and not well thought-out strategy in terms of decomposing the problem in hand.

Is the actor model limited to specific languages?

I was reading an interesting blog post about Erlang/OTP and the actor model. I also hear that Scala supports the actor model. From the little I gathered so far, the actor model breaks down processing into components that communicate with each other by passing messages. Typically, those processes are immutable.
Are those features language-specific though or more at the architecture level? more specifically, can't you just implement the same actor model in almost any language, and just use some form of message-queue to pass messages between worker processes? (for example, use something like celery). Or is it that those languages like Erlang and Scala simply do this transparently and much faster?
Certainly you can define an "Actor Library" in virtually any language, but in Erlang the model is baked-in to the language, and is really the only concurrency model available.
While Scala's actors system is well implemented, at the end of the day, it still vulnerable to some hazards that Erlang is immune from. I'll draw your attention to this paper.
This would be the case for any Actor library implemented in any imperative language that supports shared mutable state.
An interesting exception to this is Nodes.js. Some work is being done with actors between Nodes that probably exhibit the same isolation properties as Erlang, simply because there is no shared mutable state.
Actor model is not limited to any specific platform or programming language, it's just a model after all.
Erlang and Scala have really good and useful implementations of this model, which fits nicely in typical technology stack of these platforms and helps to effectively solve certain kinds of tasks.
To add to the points mentioned above, the fact that in Erlang actor model is the only way you can program, makes your code scalable from the get-go. Erlang processes are lightweight, and you can spawn 10-100K on one machine (I don't think you can do it with python), this changes the way you approach problems. For example, in our product we parse web server logs with Erlang and spawn an Erlang process to handle each line. That way, if one log line is corrupted, or the process that handles it crashes, nothing happens to the other ones.
Another difference is when you start using OTP you get processes supervisors and you can make processes connected so if one terminates all the others do.
Other than that, Erlang has some other nice feature (which can be found in other languages through libraries, but again here it's baked in) like pattern matching and hot deploy.
No, there is nothing language-specific about the Actor Model. In fact, you already mention Scala in your question, where actors are not part of the language but are instead implemented as a library. (Three competing libraries, actually.)
However, just like Functional Programming or Object-Oriented Programming, having direct support for Actor Programming, or at least support for some abstractions that make it easier to implement, in the language will lead to a very different programming experience. Anyone who has ever done Functional Programming or Object-Oriented Programming in C will probably understand this.

Scala actor memory leaks, are they as bad as It was or improving?

I am currently studding Scala 2.8 using Programming in Scala 2nd edition.
But I am beginning to get really concerned about posts like this Clojure vs Scala
Is Scala this bad about memory leaks, this is not the first info I heard about issues with actors and memory leaks.
Is it that bad? are new versions fixing it in reasonable time? Is Akka going to solve all of it if or when if it's merged?
Because seeing big issues with one of scala biggest strong points (at least for me Erlang like actors are one of the major candies of the lang) is really a major drawback if they are not being able to fix them and improve on top of it.
I know of people using huge numbers of actors, so I'm pretty sure memory leaks are not widespread.
Did Scala Actors have memory leaks back in 2009 (Scala 2.7.x)? Yes, they did. For example, SI-1801 and SI-1948.
Right now, there are three tickets open on memory leaks that I could find: SI-3467,SI-3920 and SI-3921.
I do take issue with one comment you made, however:
one of scala biggest strong points (at least for me Erlang like actors
are one of the major candies of the lang)
Actors are NOT part of the language! They are a library! That is the whole point of Scala, it is the very meaning of "scalable" from which the name Scala came: that you can add stuff like this through libraries.
There are, right now, four different actor implementations in Scala: main library, Scalaz, Lift and Akka. There's absolutely no reason for you to tie yourself to the standard library one. In fact, one of the problems with the actors in the main library is that they were written more to prove that one could do it than to solve real problems.
If you want to use actors, use Akka. You can use it right now. Hell, you can even use it with Java, if you're into syntactic masochism. Akka is a superb library, which goes far beyond simply providing actors, and into providing all the supporting tool to make them useful (like supervisors and load balancers), plus other tools to fully support concurrency, like Agents (Clojure-style), STM (Multiverse-based), integration with Spring, Camel, AMQP, etc.
Scala's strength is making it possible to seemingly extend it through libraries. If you limit yourself to what's on the standard library, you are throwing that away.
You should try Akka. It's really robust, lightweight and tunable. For instance, you can bound mailbox sizes (and choose what to do when mailboxes are full).
I don't know Scala internals much, but I would guess that Scala's actors implementation is queuing every message without limit.
If the actors don't pull from the queue fast enough, the queue grows and consumes memory.
I guess a less memory angry implementation would limit the number of queued messages at once, and thus consume less memory (but would also block message senders while the queue is full).