Actors (scala/akka): is it implied that the receive method will be accessed in a threadsafe manner?


I assume that the messages will be received and processed in a threadsafe manner. However, I have been reading (some) akka/scala docs but I didn't encounter the keyword 'threadsafe' yet.

It is probably because the actor model assumes that each actor instance processes its own mailbox sequentially. That means two or more threads should never execute a single actor instance's code concurrently. Technically you could create a method in an actor's class (because it is still an object) and call it from multiple threads concurrently, but this would be a major departure from the actor's usage rules and you would do it "at your own risk", because you would then lose all the thread-safety guarantees of that model.
This is also one of the reasons why Akka introduced the concept of an ActorRef: a handle that lets you communicate with the actor through message passing, but not by calling its methods directly.
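
To make the point above concrete, here is a minimal sketch, assuming the Akka classic API (2.4 or later, where ActorSystem has terminate()). The Counter actor, its Increment/GetCount messages and CounterDemo are hypothetical names invented for this illustration. The actor keeps a plain unsynchronized var, and several threads talk to it only through its ActorRef.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical message protocol for this sketch.
case object Increment
case object GetCount

class Counter extends Actor {
  // Plain, unsynchronized state: it is only ever touched inside receive.
  private var count = 0

  def receive: Receive = {
    case Increment => count += 1
    case GetCount  => sender() ! count
  }
}

object CounterDemo {
  // Sends senders * perSender increments from plain threads, then asks for the total.
  def run(senders: Int, perSender: Int): Int = {
    val system = ActorSystem("counter-demo")
    try {
      val counter: ActorRef = system.actorOf(Props(new Counter), "counter")
      val threads = (1 to senders).map { _ =>
        new Thread(() => (1 to perSender).foreach(_ => counter ! Increment))
      }
      threads.foreach(_.start())
      threads.foreach(_.join()) // every Increment is enqueued before we ask
      implicit val timeout: Timeout = Timeout(5.seconds)
      Await.result((counter ? GetCount).mapTo[Int], 5.seconds)
    } finally {
      system.terminate()
    }
  }

  def main(args: Array[String]): Unit = println(run(8, 1000))
}
```

No locks or atomics are needed in Counter, because the senders never touch count directly; they can only enqueue messages.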

I think we have it pretty well documented: http://doc.akka.io/docs/akka/2.3.9/general/jmm.html

Actors are thread-safe. The actor system (Akka) gives each actor its own 'lightweight thread'. This is not a real thread, but Akka gives the developer the impression that an actor always runs in its own thread. As a result, any operation performed while handling a message is, for all practical purposes, thread-safe.
However, you should not undermine Akka by using mutable messages or public state. If you develop your actors as stand-alone units of functionality, they will be thread-safe.
See also:
http://doc.akka.io/docs/akka/2.3.12/general/actors.html#State
and
http://doc.akka.io/docs/akka/2.3.12/general/jmm.html for a more in-depth study of the Akka memory model and how it manages threading issues.

Related

Num of actor instance

I'm new to akka-actor and confused with some problems:
When I create an actorSystem and use actorOf(Props(classOf[AX], ...)) to create an actor in the main method, how many instances of my actor AX are there?
If the answer to Q1 is just one, does this mean that whatever data structure I create in the AX actor class's definition will only be accessed by one thread at a time, so that I need not worry about concurrency problems?
What if one of my actor's actions (one case in the receive method) is a time-consuming task that takes quite a long time to finish? Will my single actor instance stop responding until it finishes that task?
If the answer to Q3 is yes, what am I supposed to do to prevent my actor from not responding? Should I start another thread and have it send a message back to the actor when the task finishes? Is there a best practice I should follow?

Yes, the actor system creates only 1 actor instance each time you call the 'actorOf' method. However, when using a Router it is possible to create 1 router which spreads the load over any number of actors. So in that case it is possible to construct multiple instances, but 'normally' using actorOf creates just 1 instance.
Yes, within an actor you do not have to worry about concurrency, because Akka guarantees that any actor processes only 1 message at a time. You must take care not to mutate the state of the actor from code outside the actor, so whenever you expose actor state, always do so using an immutable class. Case classes are excellent for this. Also beware of modifying the actor state when completing a Future from inside the actor. Since the Future runs on its own thread, you could have a concurrency issue when the Future completes while the actor is processing the next message.
The actor executes on 1 thread at the time, but this might be a different thread each time the actor executes.
Akka is a highly concurrent and distributed framework; everything is asynchronous and non-blocking, and you must do the same within your application. Scala and Akka provide several ways to do this. Whenever you have a time-consuming task within an actor, you can delegate it to another actor created just for this purpose, use Futures, or use Scala's 'async/await/blocking'. When using 'blocking' you hint to the runtime that a blocking action is being performed, so that it may start an additional thread to prevent thread starvation. The Scala Concurrent programming book is an excellent guide for learning this. Also look at the concurrent package ScalaDocs and the Neophyte's Guide to Scala.
If the actor really has to wait for the time consuming task to complete, then yes, your actor can only respond when that's finished. But this is a very 'request-response' way of thinking. Try to get away from this. The actor could also respond immediately indicating the task has started and send an additional message once the task has been completed.
With time-consuming tasks, always be sure to use a different thread pool, so that the ActorSystem is not blocked because all of its available threads are used up by time-consuming tasks. For Futures you can provide a separate ExecutionContext (do not use the ActorSystem's dispatcher for this!), and via Akka's configuration you can also configure certain actors to run on a different thread pool.
See 3.
Success!

one instance (if you declare a router in your props then (maybe) more than one)
Yes. This is one of the advantages of actors.
Yes. An Actor will process messages sequentially.
You can use scala.concurrent.Future (do not use actor state in the future) or delegate the work to a child actor (the main actor can manage the state and can respond to messages). Future or child-actor depends on use case.
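
As an illustration of the advice in the two answers above (run slow work in a Future on a separate thread pool, and do not touch actor state from inside the Future), here is a hedged sketch assuming the Akka classic API and its pipe pattern. ImageActor, Resize, Resized and ImageDemo are hypothetical names, and Thread.sleep stands in for the slow task.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.pattern.{ask, pipe}
import akka.util.Timeout
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical protocol for this sketch.
final case class Resize(name: String)
final case class Resized(name: String, replyTo: ActorRef)

class ImageActor(blockingEc: ExecutionContext) extends Actor {
  // Dedicated pool for slow work, kept apart from the actor system's dispatcher.
  private implicit val ec: ExecutionContext = blockingEc
  private var finished = Vector.empty[String] // only touched inside receive

  def receive: Receive = {
    case Resize(name) =>
      val replyTo = sender() // capture now; do not call sender() inside the Future
      Future {
        Thread.sleep(50) // stands in for slow, blocking work
        Resized(name.toUpperCase, replyTo)
      }.pipeTo(self) // the result arrives later as an ordinary message

    case Resized(name, replyTo) =>
      finished :+= name // state changes happen here, one message at a time
      replyTo ! finished.size
  }
}

object ImageDemo {
  def run(): Int = {
    val pool   = Executors.newFixedThreadPool(4)
    val system = ActorSystem("image-demo")
    try {
      val actor = system.actorOf(
        Props(new ImageActor(ExecutionContext.fromExecutor(pool))), "images")
      implicit val timeout: Timeout = Timeout(5.seconds)
      val replies = Seq("a", "b", "c").map(n => (actor ? Resize(n)).mapTo[Int])
      replies.map(Await.result(_, 5.seconds)).max
    } finally {
      system.terminate()
      pool.shutdown()
    }
  }
}
```

The actor stays responsive while the three tasks run, because receive returns as soon as each Future has been started. Whether a Future or a child actor fits better depends on the use case, as noted above.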

Akka: what is the reason of processing messages one at a time in an Actor?

It is said:
Akka ensures that each instance of an actor runs in its own lightweight thread and that messages are processed one at a time.
Can you please explain what is the reason of processing messages one at a time in an Actor?

This way we can guarantee thread safety inside an Actor.
Because an actor only ever handles one message at a time, accessing the actor's local state is safe, even though the actor may switch the threads it executes on. Akka guarantees that state written while handling message M1 is visible to the actor when it handles M2, even though it may now be running on a different thread (normally this kind of safety comes at a huge cost; Akka handles it for you).
It also originates from the original Actor model, a concurrency abstraction in which actors handle messages one by one and respond by performing any of these actions: sending other messages, changing their behaviour, or creating new actors.
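
To show how one-at-a-time processing can coexist with a shared thread pool, here is a deliberately simplified toy written against the Scala and Java standard libraries only. It is not Akka's implementation; MiniActor and MiniActorDemo are hypothetical names. The idea is that a mailbox may be drained by any pool thread, but an atomic flag ensures that only one drain runs at a time, and the flag's write/read pair makes state written by one drain visible to the next.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.ExecutionContext

// Toy actor: NOT Akka's code, just the one-message-at-a-time idea.
final class MiniActor[A <: AnyRef](handler: A => Unit)(implicit ec: ExecutionContext) {
  private val mailbox   = new ConcurrentLinkedQueue[A]()
  private val scheduled = new AtomicBoolean(false)

  def !(msg: A): Unit = {
    mailbox.offer(msg)
    trySchedule()
  }

  // At most one drain task can be scheduled or running at any moment.
  private def trySchedule(): Unit =
    if (!mailbox.isEmpty && scheduled.compareAndSet(false, true))
      ec.execute(() => drain())

  private def drain(): Unit =
    try {
      var msg = mailbox.poll()
      while (msg != null) {
        handler(msg) // runs on whichever pool thread picked up the task
        msg = mailbox.poll()
      }
    } finally {
      scheduled.set(false)
      trySchedule() // a message may have arrived after the last poll
    }
}

object MiniActorDemo {
  def run(senders: Int, perSender: Int): Int = {
    var count = 0 // unsynchronized on purpose; only the handler writes it
    val latch = new CountDownLatch(senders * perSender)
    val actor = new MiniActor[String]({ _ =>
      count += 1
      latch.countDown()
    })(ExecutionContext.global)
    val threads = (1 to senders).map { _ =>
      new Thread(() => (1 to perSender).foreach(_ => actor ! "tick"))
    }
    threads.foreach(_.start())
    latch.await(10, TimeUnit.SECONDS)
    count
  }
}
```

If the handler could run on two threads at once, count += 1 would lose updates; because drains never overlap, it does not.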

How to handle concurrent access to a Scala collection?

I have an Actor that - in its very essence - maintains a list of objects. It has three basic operations, an add, update and a remove (where sometimes the remove is called from the add method, but that aside), and works with a single collection. Obviously, that backing list is accessed concurrently, with add and remove calls interleaving each other constantly.
My first version used a ListBuffer, but I read somewhere it's not meant for concurrent access. I haven't gotten concurrent access exceptions, but I did note that finding & removing objects from it does not always work, possibly due to concurrency.
I was halfway rewriting it to use a var List, but removing items from Scala's default immutable List is a bit of a pain - and I doubt it's suitable for concurrent access.
So, basic question: What collection type should I use in a concurrent access situation, and how is it used?
(Perhaps secondary: Is an Actor actually a multithreaded entity, or is that just my wrong conception and does it process messages one at a time in a single thread?)
(Tertiary: In Scala, what collection type is best for inserts and random access (delete / update)?)
Edit: To the kind responders: Excuse my late reply, I'm making a nasty habit out of dumping a question on SO or mailing lists, then moving on to the next problem, forgetting the original one for the moment.

Take a look at the scala.collection.mutable.Synchronized* traits/classes.
The idea is that you mixin the Synchronized traits into regular mutable collections to get synchronized versions of them.
For example:
import scala.collection.mutable._
val syncSet = new HashSet[Int] with SynchronizedSet[Int]
val syncArray = new ArrayBuffer[Int] with SynchronizedBuffer[Int]

You don't need to synchronize the state of the actors. The aim of the actors is to avoid tricky, error prone and hard to debug concurrent programming.
The actor model ensures that the actor consumes messages one by one and that you will never have two threads consuming messages for the same actor.

Scala's immutable collections are suitable for concurrent usage.
As for actors, a couple of things are guaranteed, as explained in the Akka documentation:
the actor send rule: where the send of the message to an actor happens before the receive of the same actor.
the actor subsequent processing rule: where processing of one message happens before processing of the next message by the same actor.
You are not guaranteed that the same thread processes the next message, but you are guaranteed that the current message will finish processing before the next one starts, and also that at any given time, only one thread is executing the receive method.
So that takes care of a given Actor's persistent state. With regard to shared data, the best approach as I understand it is to use immutable data structures and lean on the Actor model as much as possible. That is, "do not communicate by sharing memory; share memory by communicating."

What collection type should I use in a concurrent access situation, and how is it used?
See @hbatista's answer.
Is an Actor actually a multithreaded entity, or is that just my wrong conception and does it process messages one at a time in a single thread
The second (though the thread on which messages are processed may change, so don't store anything in thread-local data). That's how the actor can maintain invariants on its state.
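
For the add/update/remove case in the question, a minimal sketch could look like the following, assuming the Akka classic API. Registry, its messages and RegistryDemo are hypothetical names. The collection is an immutable Map held in a private var, so no synchronized collection is needed and a snapshot can be handed out safely.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical protocol for this sketch.
final case class Add(id: Int, value: String)
final case class Update(id: Int, value: String)
final case class Remove(id: Int)
case object Snapshot

class Registry extends Actor {
  // Immutable Map in a private var: replaced on change, never shared mutably.
  private var items = Map.empty[Int, String]

  def receive: Receive = {
    case Add(id, value)    => items += (id -> value)
    case Update(id, value) => if (items.contains(id)) items += (id -> value)
    case Remove(id)        => items -= id
    case Snapshot          => sender() ! items // safe to share: it is immutable
  }
}

object RegistryDemo {
  def run(): Map[Int, String] = {
    val system = ActorSystem("registry-demo")
    try {
      val registry = system.actorOf(Props(new Registry), "registry")
      registry ! Add(1, "a")
      registry ! Add(2, "b")
      registry ! Update(1, "A")
      registry ! Remove(2)
      implicit val timeout: Timeout = Timeout(5.seconds)
      Await.result((registry ? Snapshot).mapTo[Map[Int, String]], 5.seconds)
    } finally {
      system.terminate()
    }
  }
}
```

A Map keyed by id also covers the tertiary question reasonably well, since insert, update and delete by key are all cheap; a Vector or ListBuffer kept private to the actor would be just as safe.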

How does I/O work in Akka?

How does the actor model (in Akka) work when you need to perform I/O (ie. a database operation)?
It is my understanding that a blocking operation will throw an exception (and essentially ruin all concurrency due to the evented nature of Netty, which Akka uses). Hence I would have to use a Future or something similar - however I don't understand the concurrency model.
Can 1 actor be processing multiple message simultaneously?
If an actor makes a blocking call in a future (ie. future.get()) does that block only the current actor's execution; or will it prevent execution on all actors until the blocking call has completed?
If it blocks all execution, how does using a future assist concurrency (ie. wouldn't invoking blocking calls in a future still amount to creating an actor and executing the blocking call)?
What is the best way to deal with a multi-staged process (ie. read from the database; call a blocking webservice; read from the database; write to the database) where each step is dependent on the last?
The basic context is this:
I'm using a Websocket server which will maintain thousands of sessions.
Each session has some state (ie. authentication details, etc);
The Javascript client will send a JSON-RPC message to the server, which will pass it to the appropriate session actor, which will execute it and return a result.
Execution of the RPC call will involve some I/O and blocking calls.
There will be a large number of concurrent requests (each user will be making a significant amount of requests over the WebSocket connection and there will be a lot of users).
Is there a better way to achieve this?

Blocking operations do not throw exceptions in Akka. You can make blocking calls from an actor (which you probably want to minimize, but that's another story).
No, 1 actor instance cannot.
It will not block any other actors. You can influence this by using a specific dispatcher. Futures use the default dispatcher (normally the global event-driven one), so they run on a thread in a pool. You can choose which dispatcher you want to use for your actors (per actor, or for all). If you really wanted to create a problem, you might be able to pass exactly the same (thread-based) dispatcher to futures and actors, but that would take some intent on your part. And if you have a huge number of futures blocking indefinitely and the executor service has been configured with a fixed number of threads, you could exhaust the executor service. So that is a lot of 'ifs'. A call to f.get blocks only if the Future has not completed yet. It blocks the 'current thread' of the actor from which you call it (if you call it from an actor, which is not necessary, by the way).
You do not necessarily have to block. You can use a callback instead of f.get. You can even compose Futures without blocking. Check out the talk by Viktor on 'the promising future of akka' for more details: http://skillsmatter.com/podcast/scala/talk-by-viktor-klang
I would use async communication between the steps (if the steps are meaningful processes on their own): use an actor for every step, where every actor sends a one-way message to the next, and possibly also one-way messages to some other non-blocking actor that supervises the process. This way you can create chains of actors, and you can make many such chains. In front of them you can put a load-balancing actor, so that if an actor blocks in one chain, another actor of the same type in another chain might not. That would also work for your 'context' question: pass off the workload to local actors and chain them up behind a load-balancing actor.
As for Netty (and I assume you mean remote actors, because that is the only thing Netty is used for in Akka), pass off your work as soon as possible to a local actor or a future (with a callback) if you are worried about timing or about preventing Netty from doing its job in some way.
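
The suggestion above to compose Futures instead of blocking can be sketched with the standard library alone. In this illustration the four stage functions are hypothetical stand-ins for the read/call/read/write steps in the question; each returns a Future, and a for-comprehension chains them so that every step starts only when the previous one has completed, without any thread waiting on get.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object Pipeline {
  // Hypothetical stages; real ones would wrap database and web-service calls.
  def readUser(id: Int)(implicit ec: ExecutionContext): Future[String] =
    Future(s"user-$id")
  def callService(user: String)(implicit ec: ExecutionContext): Future[Int] =
    Future(user.length)
  def readQuota(score: Int)(implicit ec: ExecutionContext): Future[Int] =
    Future(score * 10)
  def writeResult(user: String, quota: Int)(implicit ec: ExecutionContext): Future[String] =
    Future(s"$user:$quota")

  // Each step depends on the previous result; nothing blocks in between.
  def run(id: Int)(implicit ec: ExecutionContext): Future[String] =
    for {
      user  <- readUser(id)
      score <- callService(user)
      quota <- readQuota(score)
      saved <- writeResult(user, quota)
    } yield saved

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // A callback instead of a blocking get.
    run(7).foreach(result => println(s"finished: $result"))
    // Blocking here only keeps the demo JVM alive until the result is printed.
    Await.ready(run(7), 5.seconds)
  }
}
```

Inside an actor, the final Future would typically be sent back as a message rather than awaited.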

Blocking operations will generally not throw exceptions, but waiting on a future (for example by using !! or !!! send methods) can throw a time out exception. That's why you should stick with fire-and-forget as much as possible, use a meaningful time-out value and prefer callbacks when possible.
An Akka actor cannot explicitly process several messages in a row, but you can adjust the throughput value via the config file. The actor will then process several messages (i.e. its receive method will be called several times sequentially) if its message queue is not empty: http://akka.io/docs/akka/1.1.3/scala/dispatchers.html#id5
Blocking operations inside an actor will not "block" all actors, but if you share threads among actors (the recommended usage), one of the dispatcher's threads will be blocked until the operation resumes. So try composing futures as much as possible, and beware of the time-out value.
3 and 4. I agree with Raymond's answers.

What Raymond and paradigmatic said, but also, if you want to avoid starving the thread pool, you should wrap any blocking operations in scala.concurrent.blocking.
It's of course best to avoid blocking operations, but sometimes you need to use a library that blocks. If you wrap said code in blocking, it will let the execution context know you may be blocking this thread so it can allocate another one if needed.
The problem is worse than paradigmatic describes since if you have several blocking operations you may end up blocking all threads in the thread pool and have no free threads. You could end up with deadlock if all your threads are blocked on something that won't happen until another actor/future gets scheduled.
Here's an example:
import scala.concurrent.Future
import scala.concurrent.blocking
...
Future {
  // Mark each potentially slow call so the execution context can compensate.
  val image = blocking { load_image_from_potentially_slow_media() }
  val enhanced = image.enhance()
  blocking {
    if (oracle.queryBetter(image, enhanced)) {
      write_new_image(enhanced)
    }
  }
  enhanced
}
Documentation is here.

Scala: Why are Actors lightweight?

What makes actors so lightweight?
I'm not even sure how they work. Aren't they separate threads?

When they say lightweight, they mean that actors are not mapped one-to-one to threads.
JVM offers shared memory threads with locks as the primary form of concurrency abstractions. But shared memory threads are quite heavyweight and incur severe performance penalties from context switching overheads. For an actor implementation based on a one-to-one mapping with JVM threads, the process payload per Scala actor will not be as lightweight that we can spawn a million instances of an actor for a specific computation. Hence Scala actors have been designed as lightweight event objects, which get scheduled and executed on an underlying worker thread pool that gets automatically resized when all threads block on long running operations. In fact, Scala implements a unified model of actors - thread based and event based. Scala actors offer two form of suspension mechanisms - a full stack frame suspension (implemented as receive) and a suspension based on a continuation closure (implemented as react). In case of event based actors, a wait on react is represented by a continuation closure, i.e. a closure that captures the rest of the actor's computation. When the suspended actor receives a message that matches one of the patterns specified in the actor, the continuation is executed by scheduling the task to one of the worker threads from the underlying thread pool. The paper "Actors that Unify Threads and Events" by Haller and Odersky discusses the details of the implementation.
Source

Important reference: Actors that Unify Threads and Events
I don't think we should overstate how lightweight actors are.
First, thread-based actors use one thread per actor, so they are not lightweight at all.
Event-based actors are where actors start to feel lightweight. They are lightweight because a worker thread does not wait and get switched out for another thread; it simply moves from one piece of work to the next, and so keeps spinning on useful computation.
