How to use Scala Futures the right way?

I'm wondering whether Futures are better used only in conjunction with Actors, rather than in a program that does not use Actors at all. Put differently, is performing asynchronous computation with Futures something that is best done within an actor system?
Here is why I'm saying that:
1 -
You perform a computation whose result would trigger some action that you may carry out in another thread.
For instance, I have a long-running operation to determine the price of something. From my main thread I launch an asynchronous task for it; in the meantime I can do other things, and when the result is ready/available and communicated back to me, I continue down that path.
I can see that with actors this is handy, because you can pipe a result to an actor. But with a typical threading model, you can either block or ... ?
2 -
Another issue: say I need to update the ages of a list of participants by getting some information online, and assume I have just one Future for that task. Isn't closing over the participant list the wrong thing to do? Multiple threads may be accessing that participant list at the same time, so making the update within the Future would simply be wrong, and in that case we would need a Java concurrent collection, wouldn't we?
Maybe I see it the wrong way; Futures are not meant to do side effects
at all.
But in that case, fair enough, no side effects; we still have the problem of getting a value back to the calling thread, which can only be done by blocking. I mean, let's imagine that the result would help the calling thread update some data structure. How do you do that update asynchronously without closing over that data structure somehow?
I believe callbacks such as onComplete can be used for
side-effecting (am I right here?)
Still, the callback would have to close over the data structure anyway. Hence I don't see how to avoid using an Actor.
PS: I like actors; I'm just trying to better understand the use of Futures without actors. I read everywhere that one should use actors only when necessary, that is, when state needs to be managed. It seems to me that, overall, using Futures without actors always involves blocking somewhere down the line, if the result needs to be communicated back at some point to the thread that initiated the asynchronous task.

Actors are good when you are dealing with mutable state, because they encapsulate the mutable state and allow only message-based interaction.
You can use a Future to execute work on a different thread. You don't have to block on a Future because Scala's Futures compose: if you have multiple Futures in your code, you don't have to wait/block for all of them to complete. For example, if your pipeline is completely non-blocking or async (e.g. Play and Spray), you can return a Future back to the client.
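For example, here is a minimal sketch of composing two Futures with a for-comprehension; the fetchPrice and fetchDiscount helpers and their types are hypothetical, just to echo the pricing example from the question:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical asynchronous operations
def fetchPrice(id: String): Future[BigDecimal]    = Future { BigDecimal(100) }
def fetchDiscount(id: String): Future[BigDecimal] = Future { BigDecimal(0.1) }

// Compose without blocking: the result is itself a Future that a
// non-blocking framework (Play, Spray) can return to the client.
def finalPrice(id: String): Future[BigDecimal] =
  for {
    price    <- fetchPrice(id)
    discount <- fetchDiscount(id)
  } yield price * (BigDecimal(1) - discount)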
Futures are lightweight compared to actors because you don't need a complete ActorSystem.
Here is a quote from Martin Odersky that I really like.
There is no silver bullet for all concurrency issues; the right
solution depends on what one needs to achieve. Do you want to define
asynchronous computations that react to events or streams of values?
Or have autonomous, isolated entities communicating via messages? Or
define transactions over a mutable store? Or, maybe the primary
purpose of parallel execution is to increase the performance? For each
of these tasks, there is an abstraction that does the job: futures,
reactive streams, actors, transactional memory, or parallel
collections.
So choose your abstraction based on your use case and needs.

Related

Atomic function/method in scala (without introducing actor system overheads)

I currently use an Akka actor to establish a code block that is executed atomically and in a thread-safe manner (Akka mailbox semantics impose atomicity by virtue of processing one message at a time).
However, this introduces the need for an actor system and additional side effects or bloat (having to manually propagate exceptions to the caller, losing type safety on ask, and in general using message semantics rather than function calls).
Can a thread-safe atomic code block be accomplished in Scala in a simpler way? Would you apply @volatile to a function?
It depends on what kind of shared state you want to protect here:
The easiest and most universal choice is plain old synchronized. However, unlike Akka, it's completely blocking, so it may easily kill your performance and, of course, your code style, as it's hard to control messy side effects. It may also allow deadlocks.
Java's locks are the same approach, but might be a little better for performance.
Another option is plain old Java AtomicReference (which implements CAS operations) and related classes. The positive thing about them is that they're non-blocking; developers actually use them to build high-performance collections. The ways of using locks and CAS are described here. Both are pretty low-level mechanisms, so I would not recommend using them much, especially for business logic (any actor implementation would be better).
If your shared state is a collection, you may want to use plain old Java concurrent collections (they have atomic operations like putIfAbsent). Scala also has the interesting non-blocking TrieMap, for instance. A short sketch of the CAS and concurrent-collection approaches appears after this list.
Scala STM is also an alternative.
Finally, this question is dedicated to lightweight actor model implementations.
P.S. The @volatile annotation is nothing more than an analog of Java's volatile keyword. You can put it on a method only because any annotation can be put on almost anything.
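To make the CAS and concurrent-collection options above concrete, here is a minimal sketch; the names (total, ages) are purely illustrative and this is just one way to use these classes:

import java.util.concurrent.atomic.AtomicReference
import scala.collection.concurrent.TrieMap

// CAS: retry until the compare-and-set of the new value succeeds
val total = new AtomicReference[BigDecimal](BigDecimal(0))
def addToTotal(amount: BigDecimal): Unit = {
  var done = false
  while (!done) {
    val current = total.get()
    done = total.compareAndSet(current, current + amount)
  }
}

// Lock-free map: putIfAbsent is atomic, no external synchronization needed
val ages = TrieMap.empty[String, Int]
ages.putIfAbsent("alice", 30)
ages.update("alice", 31)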
Depending on what you're trying to achieve, the simplest might be old synchronized:
//your mutable state
private var x = 0
//better than locking on 'this' is to have a dedicated lock
private val lock = new Object
def add(i: Int): Unit = lock.synchronized { x += i }
This is the 'old Java' way, but it might work for you depending on what you're doing. Of course, this is the fastest way to deadlocks if your synchronized operation is more complex and/or you need high throughput.

Why is the actor "ask" pattern considered an anti-pattern or "code smell?"

From what I've gathered, the "ask" pattern is considered a bad practice and should be avoided. Instead, the recommended pattern is the "actor per request" model. However, this doesn't make sense to me, as the "ask" pattern does exactly this - it creates a lightweight actor per request. So why is this then considered bad, especially when futures are far more composable and are able to more elegantly handle the collation of multiple send/receives?
From Akka docs:
"There are performance implications of using ask since something needs
to keep track of when it times out, there needs to be something that
bridges a Promise into an ActorRef and it also needs to be reachable
through remoting. So always prefer tell for performance, and only ask
if you must."
But sometimes you want to send a message from outside an actor, in which case you can use ask (a small sketch follows below). Using ask guarantees that the returned Future completes within the specified timeout (it fails if no reply arrives in time), and sometimes that's what you want. However, when you use the ask pattern you should ask yourself whether you could just use Futures instead.
There is a place for ask, but it should have very limited use for the aforementioned reasons.
You don't have to use an actor per request. Some actors are meant to be long-lived and some are not. If an actor performs a potentially dangerous or blocking operation, you might want to create one per request. Whatever fits your application logic.
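As a minimal sketch of asking from outside an actor (the PriceService protocol and all names here are hypothetical):

import akka.actor.ActorRef
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

// Hypothetical message protocol
case class GetPrice(productId: String)
case class Price(value: BigDecimal)

def priceFor(service: ActorRef, productId: String): Future[Price] = {
  // ask needs an implicit timeout; the Future fails if no reply arrives in time
  implicit val timeout: Timeout = Timeout(3.seconds)
  (service ? GetPrice(productId)).mapTo[Price]
}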

Scala futures and threads

Reading the Scala source code for scala.concurrent.Future and scala.concurrent.impl.Future, it seems that every future composition via map dispatches a new task to an executor. I assume this commonly triggers a context switch for the current thread, and/or the assignment of a thread to the job.
Considering that function flows need to pass Futures between them to act on the results of Futures without blocking (or without delving into callback spaghetti), isn't this "reactive" paradigm very pricey in practice when code is written in a modular way, where each function only does something small and passes along to other functions?
It depends on the execution context, so you can choose the strategy.
Your executor can also just run the task in the calling thread, keeping the map calls on the same thread. You can choose your own strategy by passing the execution context explicitly, or by using the implicit one.
I would first test what the default fork/join pool does by logging which thread is used. AFAIK newer versions of it sometimes utilize the submitting thread; however, I don't know whether that applies to Scala Future callbacks.
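A minimal sketch of both ideas, logging which thread runs each map step under the default global context and then supplying an explicit single-threaded executor (the names here are just illustrative):

import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}
import scala.concurrent.ExecutionContext.Implicits.global

// Default strategy: the global fork/join-backed pool chooses the thread
val f: Future[Int] =
  Future { println(s"compute on ${Thread.currentThread.getName}"); 21 }
    .map { n => println(s"map on ${Thread.currentThread.getName}"); n * 2 }

// Explicit strategy: pass your own ExecutionContext to map
val singleThreadEc: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())

val g: Future[Int] =
  f.map { n => println(s"map on ${Thread.currentThread.getName}"); n + 1 }(singleThreadEc)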

What are the options to use JDBC in a non-blocking way in Play?

I wonder what is the best (recommended, approved, etc.) way to do non-blocking JDBC queries in a Play! application using Play's connection pool (in Scala, and to PostgreSQL if it matters). I understand that JDBC is definitely blocking per se, but surely there are approaches to doing the calls in separate threads (e.g. using futures or actors) to avoid blocking the calling thread.
Suppose I decide to wrap the calls in futures; which execution context should I use, Play's default one? Or is it better to create a separate execution context for handling DB queries?
I know that there are some libraries for this like postgresql-async, but I really want to understand the mechanics :)
Suppose I decide to wrap the calls in futures; which execution context should I use, Play's default one? Or is it better to create a separate execution context for handling DB queries?
It is better to use a separate execution context in this case. That way there is no chance that your non-blocking jobs (most of Play's default work) submitted to the default execution context get jammed behind blocking JDBC calls in jobs you submit to the same execution context.
I suggest reading this (especially the second part) to get a general idea of how you could deal with execution contexts in different situations (including the case of blocking database queries), and then referring to this to get more details on configuring your scenario in Play.
Suppose I decide to wrap the calls in futures; which execution
context should I use, Play's default one?
If you do that, you gain nothing; it's like not using futures at all. Wrapping blocking calls in futures only helps if you execute them in a separate execution context.
In Play, you basically have the following two approaches to choose from when dealing with blocking IO:
Turn Play into a one-thread-per-request framework by drastically increasing the size of the default execution context. No futures needed; just call your blocking database as always. Simple, but not the intention behind Play.
Create specific execution contexts for your blocking IO calls and gain fine-grained control over what you are doing (a sketch follows after the docs link below).
See the docs: "Understanding Play thread pools"
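As a rough sketch of the second approach, along the lines of the Play thread-pool docs (the dispatcher name "database.dispatcher" and the repository class are illustrative): define a dedicated pool in application.conf and look it up as an ExecutionContext that you use only for the blocking JDBC work.

# application.conf
database.dispatcher {
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 10
  }
}

import javax.inject.Inject
import akka.actor.ActorSystem
import scala.concurrent.{ExecutionContext, Future}

class UserRepository @Inject()(system: ActorSystem) {
  // Dedicated pool for blocking calls, so Play's default context stays free
  private implicit val dbEc: ExecutionContext =
    system.dispatchers.lookup("database.dispatcher")

  def findAge(userId: Long): Future[Int] = Future {
    // blocking JDBC call would go here (placeholder)
    ???
  }
}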

CQRS - sending response by command with immediate consistency

I have the following architecture:
Of course, there are ports and adapters, and everything else you can imagine...
What do you suggest: how do I send a REST response with immediate consistency? Should I add another event bus and raise an event? (I guess the projection must send something about the success.)
How do I handle errors in an event-based system like this? (The event bus is not strictly necessary; I can solve loose coupling with an IoC container, but I don't think sending a callback through so many objects would be a good solution.)
It's not hard: instead of sending a command, you can call the command handler directly from the controller, or have a service method that handles the input and returns something. The important bit is that all of this is done synchronously (i.e. you wait until the handler finishes). The domain event handlers are unaffected; they can stay async. A minimal sketch follows at the end of this answer.
If you don't want to go 'hybrid' and want to always use the same workflow (as described in your pic), things are more complicated: you need the client to check frequently whether the operation has completed. I think the better way is to be flexible, so for some tasks you can use the 'old' ways. The domain events will still be generated and handled; that part does not change. You're just changing the way a 'command' is executed.
Also, it's worth mentioning that you shouldn't expect responses from event handlers; if it makes you feel better, use 'request-response' terminology instead of command-response.
Btw, you don't break CQRS this way: as long as your domain model isn't used to do queries, i.e. you have different models for writes and reads, it is still CQRS.
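A minimal sketch of that 'hybrid' idea (all names, such as RegisterParticipant and RegistrationHandler, are hypothetical): the controller calls the handler synchronously and builds the response from its return value, while the domain events it emits are still published for asynchronous handling.

// Hypothetical command and result types
case class RegisterParticipant(name: String)
sealed trait Result
case class Registered(id: Long)     extends Result
case class Rejected(reason: String) extends Result

trait EventBus { def publishAsync(event: Any): Unit }

class RegistrationHandler(bus: EventBus) {
  // Called directly (and synchronously) from the controller
  def handle(cmd: RegisterParticipant): Result =
    if (cmd.name.isEmpty) Rejected("name is empty")
    else {
      val id = 42L                                    // persist to the write model here
      bus.publishAsync(("ParticipantRegistered", id)) // projections/read model update asynchronously
      Registered(id)                                  // immediate, consistent response for the controller
    }
}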
Immediate consistency, but at what cost? Are you using DTC?
What if you later want to have more than one subscriber for a given event in the read model; how many transactions will be involved in a DTC transaction scope? For you to have immediate consistency your events need to be handled synchronously, so what is the benefit of this architecture?
You can have immediate consistency and even immediate user notifications with client callbacks (SignalR), but IMHO you should change a few things in your architecture, starting with dropping the immediate-consistency requirement.
Why do you think you need that, btw?