What do Futures and Agents in Akka offer over Clojure's counterparts?

Having watched the presentation Composable Futures with Akka 2.0, I am curious to know what additional features the Akka implementations of Futures and Agents bring over Clojure's.

"Agents in Akka are inspired by agents in Clojure." This is the first line in Agent documentation on Akka and hopefully it clears the agents part of question. As far as futures are concerned, they are both same conceptually (i.e invoking an operation on a separate thread). The underlying implementation are based on java.util.concurrent, so both using same underlying infrastructure.
Scala part:
The important part is how the word composable comes into play (for both agents and futures). If you go to the Akka docs you will find that you can use higher-order functions like map, filter, etc. on Akka futures, i.e. a map operation on a future returns another future (and similarly for filter). This allows you to easily compose/chain futures together and wait on the final future for the final value. All of this is possible because map, filter, for comprehensions, etc. are based on Scala's (monadic) API, which allows any new type to provide its own implementations of these functions.
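A minimal sketch of that composition, using the standard scala.concurrent Futures that Akka's implementation was folded into as of Scala 2.10:

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Each transformation returns a new Future; nothing blocks until Await.
val f: Future[Int] = Future(21).map(_ * 2).filter(_ % 2 == 0)
val g: Future[String] = for (x <- f) yield s"result: $x"
println(Await.result(g, 1.second)) // result: 42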
Clojure part:
Now, on the Clojure side of things, map, filter, etc. are just normal functions that work on collections, i.e. on anything that can be traversed, and hence are a different concept from Scala's monadic API. So in Clojure you will compose futures in other ways; after all, Clojure (and Lisp in general) allows composability in many, many ways.


What is the use of FastFuture in akka

What is the use of FastFuture in Akka? It is not clear from the documentation:
Provides alternative implementations of the basic transformation operations defined on Future, which try to avoid scheduling to an ExecutionContext if possible, i.e. if the given future value is already present.
How is it different from Future? Can someone explain with an example in which cases it should be used, and what benefit it provides in terms of performance or otherwise?
When an ExecutionContext is used in map calls, plain Scala Futures pay an extra scheduling cost, whereas Akka's FastFuture can perform the map on the same thread, avoiding a potential context switch and the cache misses it can cause for very short tasks (like simple number crunching). So for fast map operations, FastFuture should be faster.
Please note that flatMap usually requires an ExecutionContext with FastFuture too, as it needs one to schedule the generated Futures.
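A minimal sketch, assuming the FastFuture that ships with akka-http (akka.http.scaladsl.util.FastFuture):

import scala.concurrent.{ExecutionContext, Future}
import akka.http.scaladsl.util.FastFuture
import akka.http.scaladsl.util.FastFuture._

implicit val ec: ExecutionContext = ExecutionContext.global

// The value is already present, so .fast.map applies the function on the
// calling thread instead of scheduling a task on the ExecutionContext.
val cached: Future[Int] = FastFuture.successful(41)
val result: Future[Int] = cached.fast.map(_ + 1)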
It might be worth checking Viktor Klang's blog and the related discussion about Futures on the Scala contributors page.

Is the actor model not an anti-pattern, as the fire-and-forget style forces actors to remember a state?

When learning Scala, one of the first things I learned was that every function returns something. There is no "void" function/method as there is, for instance, in Java. Thus many Scala functions are true functions in the mathematical sense, and objects can remain largely stateless.
Now I have learned that the actor model is very popular among functional languages like Scala. However, actors promote a fire-and-forget style of programming, and callers usually don't expect callees to reply directly to messages (except when using the "ask"/"?" method). Therefore, actors need to remember some sort of state.
Am I right assuming that the actor model is more like a trade-off between scalability and maintainability (due to its statefulness), and could sometimes even be considered an anti-pattern?
Yes, you're essentially right (though I'm not quite sure what you have in mind when you say scalability vs. maintainability).
Actors are popular in Scala because of Akka (which presumably is in turn popular because of the support it gets from Lightbend). It is not, however, the case that actors are overwhelmingly popular in the functional programming world in general (although implementations exist for all the languages I'm thinking of). Below are my vastly simplified impressions (so take them with the requisite amount of salt) of two other FP language communities, both of which use actors (far?) less frequently than Scala does.
The Haskell community tends to use STM or channels (the latter often built on STM). Straight-up MVars also get used surprisingly often.
The Clojure community sometimes touts its own built-in version of STM, but its flagship concurrency model is really core.async, which is at its heart again channels.
As an aside, STM, channels, and actors can all be layered on top of one another; it's sort of weird to compare them as if they were mutually exclusive approaches. In practice, though, it's rare to see them all used in tandem.
Actors do indeed involve state (and in the case of Akka skirt type safety) and as a result are very expressive and can pretty much do anything concurrency-wise. In this way they're similar to side-effectful functions, which are more expressive than pure functions. Indeed actors in a way are the pure essence of OO, with all its pros and cons.
As such, there is a sizable chunk of the Scala community that would say yes: if you reach for actors most of the time you face a concurrency problem, that's probably an anti-pattern.
If you can, try to get away with just using Futures or scalaz.concurrent.Tasks. In return for less expressiveness you get more composability.
If your problem naturally lends itself to a single, global state (e.g. in the form of global invariants that you want to enforce), think about STM. In the Scala community, although an STM library exists, my impression is that STM is usually emulated by using actors.
If your concurrency problems mainly revolve around streaming multiple sources of data, think about using one of Scala's streaming libraries.
Actors are specifically a tool in the toolbox for handling and distributing state. So yes, they should have state - if they don't, then you could just use Futures.
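For illustration, a minimal sketch of a stateful actor using the classic Akka API (the Counter protocol is made up for the example):

import akka.actor.{Actor, ActorSystem, Props}

// The mutable count lives only inside the actor; messages are processed
// one at a time, so no synchronization is needed.
class Counter extends Actor {
  private var count = 0
  def receive: Receive = {
    case "inc" => count += 1
    case "get" => sender() ! count
  }
}

val system = ActorSystem("demo")
val counter = system.actorOf(Props(new Counter), "counter")
counter ! "inc" // fire-and-forget; the actor remembers the state change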
Please note, however, that Actors (at least Akka Actors) handle distribution (running location-transparently on multiple nodes), which neither functions nor Futures are able to do. The concurrency aspects of Actors are a result of them handling the more complex case - networking. In that sense, Actors unify the remote case with the local case by making the remote case first-class. And as it turns out, on networks, messaging is exactly what you can both count on and build on if you want reliable, resilient and also fast systems.
Hope this answers the "big picture" part of your question.

Pub/Sub Vs Observer Vs Reactive

When I have used Pub/Sub pattern frameworks like MVVMLight before, I have seen that the subscribers' calls are handled synchronously. From a scalability point of view, does a reactive framework like Rx help, given that the publisher and subscriber are completely decoupled and scalable? Which pattern helps scalability?
I don't know the specifics of MVVMLight, but in general Pub/Sub is a pattern where:
Publishers and subscribers don't know about each other. They only know about a broker, where they publish/consume messages.
As a result, the publication and consumption of messages is done asynchronously and is completely decoupled. This means that the publication/consumption side can be scaled independently and in case of failures of one part, the other part is able to keep working.
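A toy, hypothetical sketch of the shape of the pattern (a real broker would dispatch asynchronously, over queues or a network, rather than in a loop):

import scala.collection.mutable

// Publishers and subscribers only ever see the broker, never each other.
class Broker[A] {
  private val handlers = mutable.Buffer.empty[A => Unit]
  def subscribe(handler: A => Unit): Unit = handlers += handler
  def publish(msg: A): Unit = handlers.foreach(h => h(msg))
}

val broker = new Broker[String]
broker.subscribe(msg => println(s"subscriber got: $msg"))
broker.publish("hello") // the publisher never learns who received it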
Now, reactive programming is a pattern used to model changes and their propagation across multiple actors. As such, it's not so much concerned with implementation details, but more focused on providing an abstract, declarative interface, which makes it easier to work with streams of events and perform processing on top of them. Straight from ReactiveX's documentation:
ReactiveX is not biased toward some particular source of concurrency or asynchronicity. Observables can be implemented using thread-pools, event loops, non-blocking I/O, actors (such as from Akka), or whatever implementation suits your needs, your style, or your expertise. Client code treats all of its interactions with Observables as asynchronous, whether your underlying implementation is blocking or non-blocking and however you choose to implement it.
So, decoupling and scalability will mainly depend on the implementation used underneath; the chief benefit of the framework is the abstract, declarative interface it provides.
Regarding the observer pattern (which is mentioned in the question's title): it's a rather low-level primitive that can be used to achieve the same goal, but can probably lead to a much more complex codebase. For more details on the pitfalls of observer pattern when compared with more abstract reactive frameworks, you can read the following paper:
Deprecating the Observer pattern with Scala.React
The reactive programming paradigm is often presented in object-oriented languages as an extension of the Observer design pattern. You can also compare the main reactive streams pattern with the familiar Iterator design pattern, as there is a duality to the Iterable-Iterator pair in all of these libraries. One major difference is that, while an Iterator is pull-based, reactive streams are push-based.
Using an iterator is an imperative programming pattern, even though the method of accessing values is solely the responsibility of the Iterable. Indeed, it is up to the developer to choose when to access the next() item in the sequence. In reactive streams, the equivalent of the above pair is Publisher-Subscriber. But it is the Publisher that notifies the Subscriber of newly available values as they come, and this push aspect is the key to being reactive. Also, operations applied to pushed values are expressed declaratively rather than imperatively: The programmer expresses the logic of the computation rather than describing its exact control flow.
Source: https://projectreactor.io/docs/core/release/reference/#intro-reactive
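For a concrete feel, here is a small declarative, push-based pipeline sketched with Akka Streams (assuming Akka 2.6+, where the implicit ActorSystem provides the materializer); an Rx or Reactor pipeline would look much the same:

import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

implicit val system: ActorSystem = ActorSystem("streams")

// The code declares what to compute; the source pushes values downstream
// as they become available, with backpressure handled by the library.
Source(1 to 10)
  .map(_ * 2)
  .filter(_ % 3 == 0)
  .runWith(Sink.foreach(println))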

Mixing Parallel Collections with Akka

How well do Scala parallel collection operations get along with the concurrency/parallelism used by Akka Actors (and Futures), with respect to efficient scheduling on the system?
Actors' and Futures' execution is handled by an ExecutionContext, generally provided by the Dispatcher. What I can find on parallel collections indicates that they use a TaskSupport object. I found an ExecutionContextTaskSupport object that may connect the two, but I am not sure.
What is the proper way to mix the two concurrency solutions, or is it advised not to?
At present this is not supported / handled well.
Prior to Scala 2.11-M7, attempting to use the dispatcher as the ExecutionContext throws an exception.
That is, the following code in an actor's receive will throw a NotImplementedError:
import scala.collection.parallel.ExecutionContextTaskSupport

val par = List(1, 2, 3).par
// Route the parallel collection's tasks through the actor's dispatcher:
par.tasksupport = new ExecutionContextTaskSupport(context.dispatcher)
par.foreach(println)
Incidentally, this has been fixed in 2.11-M7, though it was not done to correct the above issue.
Reading through the notes on the fix, it sounds like the implementation provided by ExecutionContextTaskSupport in the above case could have some overhead compared to using one of the other TaskSupport implementations directly; however, I have done nothing to test that interpretation or evaluate the magnitude of any impact.
A Note on Parallel Collections:
By default, parallel collections use the global ExecutionContext (ExecutionContext.Implicits.global), just as you might for Futures. While this is well behaved, if you want to be constrained by the dispatcher (using context.dispatcher) - as you are likely to do with Futures in Akka - you need to set a different TaskSupport, as shown in the code sample above.
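For reference, a self-contained sketch outside an actor (assuming Scala 2.13+ with the scala-parallel-collections module):

import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ExecutionContextTaskSupport
import scala.concurrent.ExecutionContext

val xs = (1 to 1000).par
// Pin the collection's tasks to a chosen ExecutionContext instead of the
// default ForkJoinPool; in Akka this would be a dispatcher.
xs.tasksupport = new ExecutionContextTaskSupport(ExecutionContext.global)
println(xs.map(_ * 2).sum)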

How is Scala suitable for Big Scalable Application

I am taking the course Functional Programming Principles in Scala on Coursera.
I fail to understand how, with immutability, so many functions, and so much reliance on recursion, Scala is really suitable for real-world applications.
I mean, coming from imperative languages, I see a risk of stack overflows, of garbage collection kicking in, and, with multiple copies of everything, of running out of memory.
What am I missing here?
Stack overflow: it's possible to make your recursive function tail recursive. Add @tailrec from scala.annotation.tailrec to make sure your function is 100% tail recursive. A tail-recursive function is basically a loop.
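A minimal sketch:

import scala.annotation.tailrec

// The annotation makes compilation fail if the call is not in tail
// position, guaranteeing the recursion compiles down to a loop.
@tailrec
def sum(xs: List[Int], acc: Int = 0): Int = xs match {
  case Nil    => acc
  case h :: t => sum(t, acc + h)
}

sum(List.fill(1000000)(1)) // one million elements, no stack overflow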
Most importantly, recursive solutions are only one of many available patterns. See "Effective Java" for why mutability is bad. Immutable data is much better suited to large applications: no need to synchronize access, clients can't mess with data internals, and so on. Immutable structures are also very efficient in many cases. If you add an element to the head of a list, elem :: list, all data is shared between the two lists - awesome! Only a new head cell is created and pointed at the existing list. Imagine having to create a deep clone of a list every time a client asks for one.
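For example:

val list = List(2, 3, 4)
// Prepending allocates a single new cell; the tail is shared, not copied,
// so this is O(1) in time and extra memory.
val bigger = 1 :: list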
Expressions in Scala are more succinct and can be lazier - filter, map and the like can be applied only as needed. You can do the same in Java, but the ceremony takes forever, so devs usually just create multiple temporary collections along the way.
Martin Odersky defines mutability as a dependence on time/history. That's very interesting, because it means you can use a var inside a function as long as no other code can be affected in any way, i.e. the results are always the same.
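A small illustration of that point:

// Referentially transparent despite the local var: callers can never
// observe the mutation, so the function has no dependence on time/history.
def factorial(n: Int): BigInt = {
  var acc = BigInt(1)
  for (i <- 1 to n) acc *= i
  acc
}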
Look at Option[T] and compare it to null. Use Options in for comprehensions. Exceptions become truly exceptional, and Option, Try, Box, and Either communicate failures in a very nice way.
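For instance, with a made-up lookup function:

// If either lookup returns None, the whole expression is None -
// the failure propagates without null checks or exceptions.
def lookup(key: String): Option[Int] = Map("a" -> 1, "b" -> 2).get(key)

val total: Option[Int] = for {
  a <- lookup("a")
  b <- lookup("b")
} yield a + b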
Scala lets you write more modular and generic code with less effort than Java. Find a good piece of Scala code and try to see how you would do it in Java - it will be self-evident.
Real-world applications are becoming more event-driven, which involves passing data between different processes or systems, and that calls for immutable data structures.
In most cases we are either manipulating data or waiting on a resource. In those cases it's easy to hook in a callback with Actors.
Take a look at
http://pavelfatin.com/scala-for-project-euler/
which gives some examples of using functions like map, filter, etc. Functions like these are used routinely in Ruby applications.
The combination of immutability and recursion avoids a lot of stack overflow problems. This comes in handy when dealing with event-driven applications.
akka.io is a classic example of what can be built very concisely in Scala.