I am aware this is a very imprecise question and might be deemed inappropriate for Stack Overflow. Unfortunately, smaller applications (in terms of the number of actors) and 'tutorial-like' ones don't help me develop an intuition about the overhead of message dispatch, or about where the sweet spot for granularity lies between a 'Scala object' and a 'CORBA object'.
While keeping the state of a conversation with a client, for example, almost certainly deserves an actor, in most real use cases that would involve conditional/parallel/alternative interactions modelled by many classes. This leaves the choice between treating actors as facades to quite complex services, similar to the justly retired EJB, or treating them like Smalltalk objects, firing messages at each other willy-nilly whenever the communication can possibly be implemented asynchronously.
Apart from the overhead of message passing itself, there will also be overhead involved in lifecycle management, and I am wary of potential problems caused by chained restarts of whole subtrees of actors due to exceptions or other errors in their root.
For the sake of this question we may assume that the vast majority of the communication happens within a single machine and that network crossing is insignificant.
I am not sure what you mean by an "overhead of message passing itself".
When network/serialisation is not involved, the overhead is negligible: one side pushes a message onto a queue, the other side reads it from that queue.
Akka claims that it can push as many as 50 million messages per second on a single machine. This means that you wouldn't use actors merely as façades for complex subsystems. You would rather model them as much smaller "working units". They can be more complex than Smalltalk objects when convenient. You could have, say, a KafkaConsumerActor which internally uses other "normal" classes such as Connection, Configuration, etc.; these don't have to be Akka actors. But it is still small enough to be a simple working unit doing one simple thing (consuming a message and sending it somewhere).
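To make that concrete, here is a minimal sketch, not a real Kafka client: Configuration, Connection and the Poll message are invented for illustration, but it shows the shape of a small working unit that delegates to ordinary classes.

import akka.actor.{Actor, ActorRef}

// Hypothetical plain classes; the actor owns them, they are not actors themselves
class Configuration(val bootstrapServers: String)
class Connection(config: Configuration) { def poll(): Option[String] = None /* stub */ }

case object Poll

class KafkaConsumerActor(config: Configuration, downstream: ActorRef) extends Actor {
  private val connection = new Connection(config) // only this actor ever touches it

  def receive: Receive = {
    case Poll =>
      connection.poll().foreach(downstream ! _) // consume a record and send it onwards
  }
}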
50 million a second is really a lot.
The memory footprint is also extremely small. Akka itself claims that you can have ~2.5 million actors for just 1 GB of heap. Compared to what a typical system does, that is, indeed, nothing.
As for lifecycle, creating an actor is not much heavier than creating a class instance and a mailbox, so I don't really expect it to be significant.
That said, you typically don't have many actors in your system that handle one message and die. Normally you spawn actors which live much longer. For instance, an actor that calculates your mortgage repayments based on the parameters you provide doesn't have any reason to die at all.
Also, Akka makes it very simple to use actor pools (different kinds of them), so performance here is very tweakable.
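For example, here is a rough sketch using Akka's classic routing API (the Worker actor and the message are made up): a round-robin pool spreads messages over a fixed number of identical actors.

import akka.actor.{Actor, ActorSystem, Props}
import akka.routing.RoundRobinPool

class Worker extends Actor {
  def receive = { case job => /* do the actual work for this job */ }
}

val system = ActorSystem("demo")
// 8 identical workers behind one router; messages are dealt out round-robin
val workers = system.actorOf(RoundRobinPool(8).props(Props[Worker]), "workers")
workers ! "job-1" // lands in one of the 8 workers' mailboxes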
The last point is that you should look at Akka's overhead in context. For example, if your system is doing database queries, serving/performing HTTP requests, or doing significant IO of some sort, then the overhead of those activities probably makes the overhead of Akka so insignificant that you wouldn't even bother thinking about it. A 50 ms round trip to the DB is equivalent to the overhead of ~2.5 million Akka messages (at 50 million messages per second, 50 ms buys you 2.5 million sends). Does it matter?
So can you find an edge case scenario where Akka would force you to pay performance penalties? Probably. Akka is not a golden hammer (and nothing is).
But with all the above in mind, you should ask whether it is Akka that is the performance bottleneck in your specific context, or whether you are wasting time on micro-optimisation.
If I wanted to port a Go library that uses goroutines, would Scala be a good choice because its inbox/Akka framework is similar in nature to coroutines?
Nope, they're not. Goroutines are based on the theory of Communicating Sequential Processes, as specified by Tony Hoare in 1978. The idea is that there can be two processes or threads that act independently of one another but share a "channel," which one process/thread puts data into and the other process/thread consumes. The most prominent implementations you'll find are Go's channels and Clojure's core.async, but at this time they are limited to the current runtime and cannot be distributed, even between two runtimes on the same physical box.
CSP evolved to include a static, formal process algebra for proving the existence of deadlocks in code. This is a really nice feature, but neither Goroutines nor core.async currently support it. If and when they do, it will be extremely nice to know before running your code whether or not a deadlock is possible. However, CSP does not support fault tolerance in a meaningful way, so you as the developer have to figure out how to handle failure that can occur on both sides of channels, and such logic ends up getting strewn about all over the application.
Actors, as specified by Carl Hewitt in 1973, involve entities that have their own mailbox. They are asynchronous by nature, and have location transparency that spans runtimes and machines - if you have a reference (Akka) or PID (Erlang) of an actor, you can message it. This is also where some people find fault in Actor-based implementations, in that you have to have a reference to the other actor in order to send it a message, thus coupling the sender and receiver directly. In the CSP model, the channel is shared, and can be shared by multiple producers and consumers. In my experience, this has not been much of an issue. I like the idea of proxy references that mean my code is not littered with implementation details of how to send the message - I just send one, and wherever the actor is located, it receives it. If that node goes down and the actor is reincarnated elsewhere, it's theoretically transparent to me.
Actors have another very nice feature - fault tolerance. By organizing actors into a supervision hierarchy per the OTP specification devised in Erlang, you can build a domain of failure into your application. Just like value classes/DTOs/whatever you want to call them, you can model failure, how it should be handled and at what level of the hierarchy. This is very powerful, as you have very little failure handling capabilities inside of CSP.
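As a small, hedged illustration of what that looks like in Akka (the actor names and the choice of exceptions are invented for the example), a parent declares how its children's failures are handled:

import akka.actor.{Actor, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Escalate, Restart}
import scala.concurrent.duration._

// A child that may fail while handling a message
class Divider extends Actor {
  def receive = { case n: Int => sender() ! 100 / n } // throws ArithmeticException for 0
}

class DividerSupervisor extends Actor {
  // Failure is modelled here: restart the child on arithmetic errors, escalate the rest
  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 3, withinTimeRange = 1.minute) {
      case _: ArithmeticException => Restart
      case _                      => Escalate
    }

  private val divider = context.actorOf(Props[Divider], "divider")

  def receive = { case msg => divider forward msg }
}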
Actors are also a concurrency paradigm, where the actor can have mutable state inside of it and a guarantee of no multithreaded access to the state, unless the developer building an actor-based system accidentally introduces it, for example by registering the Actor as a listener for a callback, or going asynchronous inside the actor via Futures.
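A tiny sketch of that guarantee (the counter protocol is made up): the var below is safe precisely because only the actor's own message handling ever touches it.

import akka.actor.Actor

class CounterActor extends Actor {
  private var count = 0 // mutable, but only ever accessed from receive

  def receive: Receive = {
    case "increment" => count += 1
    case "get"       => sender() ! count
    // The pitfalls mentioned above would be, e.g., closing over `count` inside a
    // Future or a registered callback, which reintroduces multithreaded access.
  }
}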
Shameless plug - I'm writing a new book with the head of the Akka team, Roland Kuhn, called Reactive Design Patterns where we discuss all of this and more. Green threads, CSP, event loops, Iteratees, Reactive Extensions, Actors, Futures/Promises, etc. Expect to see a MEAP on Manning by early next month.
There are two questions here:
Is Scala a good choice to port goroutines?
This is an easy question, since Scala is a general purpose language, which is no worse or better than many others you can choose to "port goroutines".
There are of course many opinions on why Scala is better or worse as a language (e.g. here is mine), but these are just opinions, and don't let them stop you.
Since Scala is general purpose, it "pretty much" comes down to: everything you can do in language X, you can do in Scala. If it sounds too broad.. how about continuations in Java :)
Are Scala actors similar to goroutines?
The only similarity (aside from the nitpicking) is that they both have to do with concurrency and message passing. But that is where the similarity ends.
Since Jamie's answer gave a good overview of Scala actors, I'll focus more on Goroutines/core.async, but with some actor model intro.
Actors help things to be "worry free distributed"
Where a "worry free" piece is usually associated with terms such as: fault tolerance, resiliency, availability, etc..
Without going into great detail about how actors work, in two simple terms actors have to do with:
Locality: each actor has an address/reference that other actors can use to send it messages
Behavior: a function that gets applied/called when a message arrives at the actor
Think "talking processes" where each process has a reference and a function that gets called when a message arrives.
There is much more to it of course (e.g. check out Erlang OTP or the Akka docs), but the above two are a good start.
Where it gets interesting with actors is the implementation. The two big ones at the moment are Erlang OTP and Scala Akka. While they both aim to solve the same thing, there are some differences. Let's look at a couple:
I intentionally do not use lingo such as "referential transparency", "idempotence", etc.; it does no good besides causing confusion, so let's just talk about immutability [a "can't change that" concept]. Erlang as a language is opinionated, and it leans towards strong immutability, while in Scala it is too easy to make actors that change/mutate their state when a message is received. It is not recommended, but mutability in Scala is right there in front of you, and people do use it.
Another interesting point that Joe Armstrong talks about is the fact that Scala/Akka is limited by the JVM, which just wasn't designed with "being distributed" in mind, while the Erlang VM was. It comes down to many things such as process isolation, per-process vs. whole-VM garbage collection, class loading, process scheduling and others.
The point of the above is not to say that one is better than the other, but to show that the purity of the actor model as a concept depends on its implementation.
Now to goroutines..
Goroutines help to reason about concurrency sequentially
As other answers already mentioned, goroutines take their roots in Communicating Sequential Processes, which is a "formal language for describing patterns of interaction in concurrent systems", which by definition can mean pretty much anything :)
I am going to give examples based on core.async, since I know its internals better than goroutines'. But core.async was built after the goroutines/CSP model, so there should not be too many conceptual differences.
The main concurrency primitive in core.async/goroutines is the channel. Think of a channel as a "queue on rocks". This channel is used to "pass" messages. Any process that would like to "participate in the game" creates or gets a reference to a channel and puts/takes (i.e. sends/receives) messages to/from it.
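If it helps, here is a very rough Scala analogy; this is neither Go nor core.async, just a plain blocking queue standing in for the "queue on rocks" idea (real channels add parking, select/alt!, closing, and so on):

import java.util.concurrent.LinkedBlockingQueue

val channel = new LinkedBlockingQueue[String](16) // a bounded "channel"

channel.put("hello")     // a producer puts a message (blocks when the buffer is full)
val msg = channel.take() // a consumer takes it (blocks when the buffer is empty)
println(msg)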
Free 24 hour Parking
Most of work that is done on channels usually happens inside a "Goroutine" or "go block", which "takes its body and examines it for any channel operations. It will turn the body into a state machine. Upon reaching any blocking operation, the state machine will be 'parked' and the actual thread of control will be released. This approach is similar to that used in C# async. When the blocking operation completes, the code will be resumed (on a thread-pool thread, or the sole thread in a JS VM)" (source).
It is a lot easier to convey with a visual. In a diagram of blocking IO execution, you can see that threads mostly spend their time waiting for work. In the same work done via the "goroutine"/"go block" approach, 2 threads do all the work that 4 threads did in the blocking approach, while taking the same amount of time.
The kicker in the above description is that "threads are parked" when they have no work, which means their state gets "offloaded" to a state machine and the actual live JVM thread is free to do other work (see the source for a great visual).
Note: in core.async, a channel can be used outside of "go blocks", where it will be backed by a JVM thread without the parking ability: i.e. if it blocks, it blocks a real thread.
Power of a Go Channel
Another huge thing about "goroutines"/"go blocks" is the set of operations that can be performed on a channel. For example, a timeout channel can be created, which will close in X milliseconds. Or the select/alt! function which, when used in conjunction with many channels, works like an "are you ready?" polling mechanism across different channels. Think of it as a socket selector in non-blocking IO. Here is an example of using a timeout channel and alt! together:
(defn race [q]
  (searching [:.yahoo :.google :.bing])
  (let [t (timeout timeout-ms)
        start (now)]
    (go
      (alt!
        (GET (str "/yahoo?q=" q))  ([v] (winner :.yahoo v (took start)))
        (GET (str "/bing?q=" q))   ([v] (winner :.bing v (took start)))
        (GET (str "/google?q=" q)) ([v] (winner :.google v (took start)))
        t ([v] (show-timeout timeout-ms))))))
This code snippet is taken from wracer, where it sends the same request to all three: Yahoo, Bing and Google, and returns a result from the fastest one, or times out (returns a timeout message) if none returned within a given time. Clojure may not be your first language, but you can't disagree on how sequential this implementation of concurrency looks and feels.
You can also merge/fan-in/fan-out data from/to many channels, map/reduce/filter/... channel data, and more. Channels are also first-class citizens: you can pass a channel to a channel.
Go UI Go!
Since core.async "go blocks" have this ability to "park" execution state and have a very sequential "look and feel" when dealing with concurrency, what about JavaScript? There is no concurrency in JavaScript, since there is only one thread, right? And the way concurrency is mimicked is via 1024 callbacks.
But it does not have to be this way. The above example from wracer is in fact written in ClojureScript that compiles down to JavaScript. Yes, it will work on the server with many threads and/or in a browser: the code can stay the same.
Goroutines vs. core.async
Again, a couple of implementation differences [there are more] to underline the fact that the theoretical concept does not map one-to-one onto practice:
In Go, a channel is typed, in core.async it is not: e.g. in core.async you can put messages of any type on the same channel.
In Go, you can put mutable things on a channel. It is not recommended, but you can. In core.async, by Clojure design, all data structures are immutable, hence data inside channels feels a lot safer for its wellbeing.
So what's the verdict?
I hope the above sheds some light on the differences between the actor model and CSP.
Not to cause a flame war, but to give you yet another perspective, here is what Rich Hickey says:
"I remain unenthusiastic about actors. They still couple the producer with the consumer. Yes, one can emulate or implement certain kinds of queues with actors (and, notably, people often do), but since any actor mechanism already incorporates a queue, it seems evident that queues are more primitive. It should be noted that Clojure's mechanisms for concurrent use of state remain viable, and channels are oriented towards the flow aspects of a system."(source)
However, in practice, WhatsApp is based on Erlang OTP, and it seems to have sold pretty well.
Another interesting quote is from Rob Pike:
"Buffered sends are not confirmed to the sender and can take arbitrarily long. Buffered channels and goroutines are very close to the actor model.
The real difference between the actor model and Go is that channels are first-class citizens. Also important: they are indirect, like file descriptors rather than file names, permitting styles of concurrency that are not as easily expressed in the actor model. There are also cases in which the reverse is true; I am not making a value judgement. In theory the models are equivalent."(source)
Moving some of my comments to an answer. It was getting too long :D (Not to take away from jamie's and tolitius's posts; they're both very useful answers.)
It isn't quite true that you can do exactly the same things in Akka that you do with goroutines. Go channels are often used as synchronization points. You cannot reproduce that directly in Akka. In Akka, post-sync processing has to be moved into a separate handler ("strewn" in jamie's words :D). I'd say the design patterns are different. You can kick off a goroutine with a chan, do some stuff, and then use <- to wait for it to finish before moving on. Akka has a less powerful form of this with ask, but ask isn't really the Akka way, IMO.
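For reference, this is roughly what that ask-then-wait shape looks like in Akka; a sketch only, where worker and the "compute" message are hypothetical, and blocking with Await is exactly the kind of thing that isn't "the Akka way":

import akka.actor.ActorRef
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

def waitForWorker(worker: ActorRef): Any = {
  implicit val timeout: Timeout = Timeout(5.seconds)
  val reply = worker ? "compute"        // a Future of the eventual reply
  Await.result(reply, timeout.duration) // the closest Akka gets to a blocking <-
}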
Chans are also typed, while mailboxes are not. That's a big deal IMO, and it's pretty shocking for a Scala-based system. I understand that become is hard to implement with typed messages, but maybe that indicates that become isn't very Scala-like. I could say that about Akka generally. It often feels like its own thing that happens to run on Scala. Goroutines are a key reason Go exists.
Don't get me wrong; I like the actor model a lot, and I generally like Akka and find it pleasant to work in. I also generally like Go (I find Scala beautiful, while I find Go merely useful; but it is quite useful).
But fault tolerance is really the point of Akka IMO. You happen to get concurrency with that. Concurrency is the heart of goroutines. Fault-tolerance is a separate thing in Go, delegated to defer and recover, which can be used to implement quite a bit of fault tolerance. Akka's fault tolerance is more formal and feature-rich, but it can also be a bit more complicated.
All said, despite having some passing similarities, Akka is not a superset of Go, and they diverge significantly in features. Akka and Go are quite different in how they encourage you to approach problems, and things that are easy in one are awkward, impractical, or at least non-idiomatic in the other. And that is the key differentiator between any two systems.
So bringing it back to your actual question: I would strongly recommend rethinking the Go interface before bringing it to Scala or Akka (which are also quite different things IMO). Make sure you're doing it the way your target environment means to do things. A straight port of a complicated Go library is likely to not fit in well with either environment.
These are all great and thorough answers. But for a simple way to look at it, here is my view. Goroutines are a simple abstraction of Actors. Actors are just a more specific use-case of Goroutines.
You could implement actors using goroutines by creating a goroutine alongside a channel. By deciding that the channel is 'owned' by that goroutine, you're saying that only that goroutine will consume from it. Your goroutine simply runs an inbox-message-matching loop on that channel. You can then simply pass the channel around as the 'address' of your "actor" (goroutine).
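Since this thread is Scala-centric, here is a rough Scala translation of that pattern, with a plain blocking queue and a thread standing in for the Go channel and goroutine (the message types are invented):

import java.util.concurrent.LinkedBlockingQueue

sealed trait Msg
case class Greet(name: String) extends Msg
case object Stop extends Msg

val inbox = new LinkedBlockingQueue[Msg]() // the "channel"; also the "actor"'s address

val actorLike = new Thread(new Runnable {
  def run(): Unit = {
    var running = true
    while (running) inbox.take() match { // the inbox-message-matching loop
      case Greet(name) => println("hello, " + name)
      case Stop        => running = false
    }
  }
})
actorLike.start()

inbox.put(Greet("world")) // anyone holding `inbox` can "send to the actor"
inbox.put(Stop)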
But as Goroutines are an abstraction, a more general design than actors, Goroutines can be used for far more tasks and designs than Actors.
A trade-off, though, is that since actors are a more specific case, implementations of actors like Erlang's can optimize them better (tail recursion on the inbox loop) and can provide other built-in features more easily (multi-process and multi-machine actors).
Can we say that in the actor model the addressable entity is the actor, the recipient of the message, whereas with Go channels the addressable entity is the channel, the pipe through which the message flows?
With a Go channel, you send a message to the channel; any number of recipients can be listening, and one of them will receive the message.
With an actor, only the one actor whose actor-ref you send the message to will receive it.
I've read the Reactive Manifesto a couple of times and tried to wrap my head around all this reactive, async, non-blocking stuff. It's clear how to make scalable systems on top of Actors, but will I get the same effect, in terms of scalability and asynchronous execution, if I actively use Scala Futures all over my code, so that every method either accepts or returns a Future? Would such a service be scalable and responsive? Let's say that in this question I'm not much interested in the event-driven and resilient parts of the service.
This answer reflects my experience using Akka in Scala and its Actor and Future types. I don't consider myself an expert, yet, but I've done a few systems using these libraries and feel I am beginning to develop a sense for how to use them.
The choice of using an Actor vs. a Future is about the nature of the concurrency you require. Futures compose monadically and into DAG-structured graphs, making certain computational structures very elegant to express. But they have severe limitations: the computation performed concurrently within a Future must be self-contained (or at most reference only immutable external state), or you have not solved the problem of inter-thread interference, with all the attendant risks such as deadlock, race conditions, unpredictable behavior and data structure corruption.
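A small sketch of that monadic composition (loadUser and loadOrders are hypothetical, self-contained steps):

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def loadUser(id: Long): Future[String]    = Future { "user-" + id }
def loadOrders(user: String): Future[Int] = Future { user.length /* stand-in */ }

// The for-comprehension wires the steps into a small DAG; each step stays self-contained
val summary: Future[String] =
  for {
    user   <- loadUser(42L)
    orders <- loadOrders(user)
  } yield user + " has " + orders + " orders"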
When your computation involves long-lived, mutable state, an Actor encapsulates that state securely, preventing corruption and race conditions. On the other hand, actors are not composable, but they do provide a lot of flexibility in how you construct networks of interacting computations. This is true only provided you don't limit actor responses to always and only being sent back (the response to a request) to the actor from which the request was received. If you only send responses to the requesting actor, you're limited to a tree-structured pattern of inter-actor interaction.
In any real, non-trivial system, you're very likely to use both formalisms, Future and Actor.
Is there a good reason for the added complexity of Futures (vs parallel collections) when processing a list of items in parallel?
List(...).par.foreach(x=>longRunningAction(x))
vs
Future.traverse(List(...)) (x=>Future(longRunningAction(x)))
I think the main advantage is that you can access the result of each future as soon as it is computed, while with a parallel collection you would have to wait for the whole computation to be done. A disadvantage might be that you end up creating lots of futures. If you later end up calling Future.sequence anyway, there really is no advantage.
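To illustrate the difference (longRunningAction here is just a stand-in):

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def longRunningAction(x: Int): Int = x * 2 // stand-in for the real work

val futures = List(1, 2, 3).map(x => Future(longRunningAction(x)))

// React to each result as soon as it is ready...
futures.foreach(_.foreach(result => println("done: " + result)))

// ...versus waiting for all of them, which is roughly what the parallel collection gives you
val all: Future[List[Int]] = Future.sequence(futures)
println(Await.result(all, 10.seconds))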
Parallel collections will kill off some threads as we get closer to processing all the items, so the last few items might be processed by a single thread.
Please see my question for more details on this behavior: Using ThreadPoolTaskSupport as tasksupport for parallel collections in scala
Futures do no such thing, and all your threads are in use until all items are processed. Hence, unless your tasks are so small that you don't care about the loss of parallelism for the last few tasks, and you are using a huge number of threads which have to be killed off as soon as possible, Futures are better.
Futures become useful as soon as you want to compose your deferred / concurrent computations. Futures (the good kind, anyway, such as Akka's) are monadic and hence allow you to build arbitrarily complex computational structures with all the concurrency and synchronization handled properly by the Futures library.
I'm new to Scala in general and Actors in particular and my problem is so basic, the online resources I have found don't cover it.
I have a CPU-intensive, easily parallelized algorithm that will be run on an n-core machine (I don't know n). How do I implement this in Actors so that all available cores address the problem?
The first way I thought of was to simply break the problem into m pieces (where m is some medium number like 10,000) and create m Actors, one for each piece, give each Actor its little piece, and let 'em go.
Somehow, this struck me as inefficient. Zillions of Actors just hanging around, waiting for some CPU love, pointlessly switching contexts...
Then I thought, make some smaller number of Actors, and feed each one several pieces. The problem was, there's no reason to expect the pieces are the same size, so one core might get bogged down, with many of its tasks still queued, while other cores are idle.
I noodled around with a Supervisor that knew which Actors were busy, and eventually realized that this has to be a solved problem. There must be a standard pattern (maybe even a standard library) for dealing with this very generic issue. Any suggestions?
Take a look at the Akka library, which includes an implementation of actors. The Dispatchers module gives you more options for limiting actors to CPU threads (HawtDispatch-based event-driven) and/or balancing the workload (work-stealing event-based).
Generally, there are two kinds of actors: those that are tied to threads (one thread per actor), and those that share one or more threads, working behind a scheduler/dispatcher that allocates resources (i.e. the opportunity to execute a task / handle an incoming message against a controlled thread pool or a single thread).
I assume you use the second type of actors, event-driven actors, because you mention that you run 10k of them. No matter how many event-driven actors you have (thousands or millions), all of them will be fighting for the small thread pool to handle their messages. Therefore, you will get even worse performance dividing your task queue into that huge number of portions: the scheduler will either try to handle the messages sent to 10k actors against a fixed thread pool (which is slow), or will allocate new threads in the pool (if the pool is not bounded), which is dangerous (in the worst case, 10k threads will be started to handle the messages).
Event-driven actors are good for short (ideally non-blocking) tasks. If you're dealing with CPU-intensive tasks, I'd limit the number of threads in the scheduler/dispatcher pool (when you use event-driven actors) or the number of actors themselves (when you use thread-based actors) to the number of cores, to achieve the best performance.
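A generic JVM sketch of that advice (not the scala.actors scheduler API itself; in Akka you would express the same idea through dispatcher configuration):

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

// Cap the pool for CPU-bound work at the number of available cores
val cores = Runtime.getRuntime.availableProcessors()
val cpuBoundPool = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(cores))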
If you want this to be done automatically (adjusting the number of threads in the dispatcher pool to the number of cores), you should use HawtDispatch (or its Akka implementation), as was proposed earlier:
The 'HawtDispatcher' uses the HawtDispatch threading library, which is a Java clone of libdispatch. All actors with this type of dispatcher are executed on a single, system-wide, fixed-size thread pool. The number of threads will match the number of cores available on your system. The dispatcher delivers messages to the actors in the order that they were produced at the sender.
You should look into Futures, I think. In fact, you probably need a thread pool which simply queues tasks once the maximum number of threads has been reached.
Here is a small example involving futures: http://blog.tackley.net/2010/01/scala-futures.html
I would also suggest that you don't pay too much attention to context switching, since you really can't do anything but rely on the underlying implementation. Of course, a rule of thumb would be to keep the number of active threads around the number of physical cores, but as I noted above, this can be handled by a thread pool with a FIFO queue.
NOTE that I don't know whether actors in general, or futures, are implemented with this kind of pool.
For thread pools, look at this: http://www.scala-lang.org/api/current/scala/concurrent/ThreadPoolRunner.html
and maybe this: http://www.scala-lang.org/api/current/scala/actors/scheduler/ResizableThreadPoolScheduler.html
Good luck
EDIT
Check out this piece of code using futures:
import scala.actors.Futures._

object FibFut {
  // A deliberately naive Fibonacci, so the futures have something to chew on
  def fib(i: Int): Int = if (i < 2) 1 else fib(i - 1) + fib(i - 2)

  def main(args: Array[String]) {
    // Kick off all 43 computations concurrently...
    val fibs = for (i <- 0 to 42) yield future { fib(i) }
    // ...then collect the results in the order we choose, not the order they finish in
    for (f <- fibs) println(f())
  }
}
It showcases a very good point about futures, namely that you decide in which order to receive the results (as opposed to the normal mailbox system, which is FIFO, i.e. the fastest actor sends its result first).
For any significant project, I generally have a supervisor actor, a collection of worker actors each of which can do any work necessary, and a large number of pieces of work to do. Even though I do this fairly often, I've never put it in a (personal) library because the operations end up being so different each time, and the overhead is pretty small compared to the whole coding project.
Be aware of actor starvation if you end up utilizing the general actor threadpool. I ended up simply using my own algorithm-task-owned threadpool to handle the parallelization of a long-running, concurrent task.
The upcoming Scala 2.9 is expected to include parallel data structures which should automatically handle this for some uses. While it does not use Actors, it may be something to consider for your problem.
While this feature was originally slated for 2.8, it has been postponed until the next major release.
A presentation from the last ScalaDays is here:
http://days2010.scala-lang.org/node/138/140