Suspend Akka Actors - scala

I am trying to use Akka to implement the following (I think I'm trying to use Akka the proper way):
I have a system where I have n resource listeners. Essentially a resource listener is an entity that will listen on an input resource and publish what it sees (e.g. polling a database, tailing a log file, etc.).
So I want to use Akka actors to do these little units of work (listening on a resource). I've noticed that Akka gives me a thread pool of t threads, which may be fewer than the number of listeners. Unfortunately for me, getting a message from these resource listeners might block, so it could take seconds or even minutes before the next message pops up.
Is there any way to suspend a resource listener so it leaves the thread to another actor and we'll come back to it a little later in time?

Executive Summary
What you want is for your producer API (the resources) to be asynchronous, or at least support non-blocking operations (so that you can do polling). If the API does not support that, then there is no way to retrofit this property, not even using the almighty actors ;-)
Strategies for Different Situations
Only Blocking API
If the resources only support the blocking getWhatever() method of retrieving things, then you must allocate one thread per resource. An Actor with a PinnedDispatcher could be a way to do this. But be aware that the actor will not be responsive while waiting for events from the resource.
Non-Blocking but Synchronous API
If there is a peek() or poll() method on the resource API, you can use one actor per resource, have them share a thread (or pool) and schedule the polling as required (e.g. every 100ms or whatever you need). This has the huge advantage that nobody is actually blocked and the whole system remains responsive. But latency for event reception will be of the order of your schedule interval.
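As a rough illustration of the polling strategy, here is a minimal sketch in plain Java (JVM standard library only, not Akka). The class name PollingListeners, the use of in-memory queues as "resources", and the interval values are all assumptions made for the example. In an actor system the scheduled task would more likely send a "poll now" message to the actor; the point here is only that several listeners can share one thread as long as each poll returns immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical example: many listeners share ONE thread by polling on a schedule.
final class PollingListeners {

    /** Polls every resource on a single shared thread and returns what was published. */
    static List<String> run(List<Queue<String>> resources, long intervalMs, long runForMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<String> published = new CopyOnWriteArrayList<>();

        for (int i = 0; i < resources.size(); i++) {
            final int id = i;
            final Queue<String> resource = resources.get(i);
            // Each listener is just a periodic task; poll() never blocks.
            scheduler.scheduleWithFixedDelay(() -> {
                String event;
                while ((event = resource.poll()) != null) {
                    published.add("listener-" + id + ": " + event);
                }
            }, 0, intervalMs, TimeUnit.MILLISECONDS);
        }

        try {
            Thread.sleep(runForMs);               // let the listeners run for a while
            scheduler.shutdownNow();
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            scheduler.shutdownNow();
        }
        return new ArrayList<>(published);
    }

    public static void main(String[] args) {
        List<Queue<String>> resources = new ArrayList<>();
        resources.add(new ConcurrentLinkedQueue<String>(List.of("row 1", "row 2")));
        resources.add(new ConcurrentLinkedQueue<String>(List.of("log line")));
        run(resources, 100, 300).forEach(System.out::println);
    }
}
```

The trade-off mentioned above is visible in the parameters: a smaller intervalMs lowers latency but makes the shared thread do more empty polls.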
Proper Asynchronous API
If you have enough good karma to encounter a nice asynchronous API, then simply register a callback which will send a message to the actor whenever an event occurs. Sadly, this is not the norm.
PS:
The JVM does not support wrapping up the current call stack, doing something else and returning to that same processing state later. A method can only be popped off the stack when it is actually finished.

In general, you should try to avoid blocking operations in actors. For file IO, there are asynchronous libraries, and for some databases, too. If that is not an option for you, you can change the default dispatcher so that the underlying thread pool expands as needed.

One option is to call your blocking APIs inside Futures. The Futures should use an ExecutionContext (thread pool) that is separate from the Actors' ExecutionContext.
See this blog post for an example (specifically CacheActor.findValueForSender).
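To make the idea concrete, here is a minimal sketch in plain Java rather than Scala/Akka, using CompletableFuture as a stand-in for a Scala Future and a dedicated ExecutorService as a stand-in for the separate ExecutionContext. The names BlockingCallIsolation, blockingFetch and the pool size of 4 are assumptions for illustration, not part of any real API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example: keep blocking calls on their own pool so that the
// threads running actors (or request handlers) are never tied up by them.
final class BlockingCallIsolation {

    // Dedicated pool for blocking work; the size 4 is an arbitrary assumption.
    static final ExecutorService BLOCKING_POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread t = new Thread(runnable, "blocking-io");
        t.setDaemon(true); // do not keep the JVM alive just for this pool
        return t;
    });

    /** Stand-in for a blocking API such as a synchronous DB or file read. */
    static String blockingFetch(String key) {
        try {
            Thread.sleep(100); // simulates waiting on the resource
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key + " fetched on " + Thread.currentThread().getName();
    }

    /** Runs the blocking call on the dedicated pool and returns immediately. */
    static CompletableFuture<String> fetchAsync(String key) {
        return CompletableFuture.supplyAsync(() -> blockingFetch(key), BLOCKING_POOL);
    }

    public static void main(String[] args) {
        CompletableFuture<String> result = fetchAsync("user-42");
        System.out.println("caller thread is free while the fetch runs");
        System.out.println(result.join()); // join() only so the demo prints something
    }
}
```

In an actor, the future's result would normally be sent back to the actor as a message instead of being joined, so that actor state is only ever touched from the actor's own message handling.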

Related

What is the essential difference between Akka and ThreadPool+BlockingQueue in ONE process?

We know Akka is one implementation of the actor pattern. Without Akka, I usually implement a simple actor pattern using ThreadPool+BlockingQueue: messages are offered to the queue, and the workers (actors) take messages from the queue and then do what they should do. Of course, this kind of implementation only works within ONE process.
So, within one process:
What's the essential difference between the two (Akka vs. ThreadPool+BlockingQueue)?
Moreover, what's the difference between the actor pattern and the producer-consumer model?
Actor model is indeed quite similar to producer-consumer model (P-C).
However, if you use a blocking queue with P-C your application won't be completely non-blocking and asynchronous. The promise of actor model and Akka is that all messages are sent asynchronously and don't block the sender.
Another aspect is that managing these queues gets quite cumbersome once you have many consumers and producers. With actors you simply send a message and don't have to think about these low-level details. Under the hood Akka will keep a message queue (aka mailbox) per actor, with a dispatcher assigning actors to the thread pool to process those messages.
It's much easier to use Akka to achieve a highly performant and resilient application than to code it yourself. You get fault tolerance, resource management, location transparency, routing, distributed and async processing, and hierarchical supervision out of the box. Not to mention other frameworks and libraries leveraging these features to give you even more (reactive streams, akka http, etc). There are lots of patterns already developed for you, so why bother with your own?
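The "mailbox per actor plus dispatcher" idea can be sketched in a few lines of plain Java. This is a toy under stated assumptions, not how Akka is actually implemented: MiniActor, tell and countMessages are hypothetical names, and supervision, routing and everything else Akka provides are left out. What it does show is the difference from a shared BlockingQueue: sending never blocks, no thread sits waiting on an empty queue, and each actor's messages are processed by one thread at a time.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Toy actor: a private mailbox plus a flag saying whether the actor is
// currently scheduled on the shared pool (the "dispatcher").
final class MiniActor<M> {
    private final Queue<M> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService dispatcher;
    private final Consumer<M> behavior;

    MiniActor(ExecutorService dispatcher, Consumer<M> behavior) {
        this.dispatcher = dispatcher;
        this.behavior = behavior;
    }

    /** Non-blocking send: enqueue, then make sure the actor gets a thread. */
    void tell(M message) {
        mailbox.add(message);
        trySchedule();
    }

    private void trySchedule() {
        // The compare-and-set guarantees at most one drain task per actor at a time.
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            dispatcher.execute(this::drain);
        }
    }

    private void drain() {
        try {
            M message;
            while ((message = mailbox.poll()) != null) {
                behavior.accept(message);
            }
        } finally {
            scheduled.set(false);
            trySchedule(); // a message may have arrived after the last poll
        }
    }

    /** Demo: several senders hammer one counting actor; returns the final count. */
    static int countMessages(int senders, int perSender) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CountDownLatch done = new CountDownLatch(senders * perSender);
        int[] count = {0}; // plain, unsynchronized state: only the actor touches it
        MiniActor<String> counter = new MiniActor<>(pool, msg -> {
            count[0]++;
            done.countDown();
        });
        for (int s = 0; s < senders; s++) {
            pool.execute(() -> {
                for (int i = 0; i < perSender; i++) counter.tell("tick");
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("messages processed: " + countMessages(8, 1000));
    }
}
```

Note that the counter is a plain int with no locking; it stays consistent only because the mailbox discipline serializes access, which is the property the actor model gives you.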

How to deal with long-lasting operations in Reliable Actors or stateful Reliable Service and 're-process' failed states

I'm new to Service Fabric Reliable Actors technology and trying to figure out best practices for this specific scenario:
Let's say we have some legacy code that needs to call new code built on SF Reliable Actors. Actors of a certain type, "ActorExecutor", are going to asynchronously call some third-party service that can sometimes get stuck for a pretty long time, longer than the actor's calling client is ready to wait, or even experience prolonged underlying communication issues. We do not want the client (legacy code) to get blocked by any sort of issue in ActorExecutor; it does not expect to receive any value or status back from the actor. Should we use an SF ReliableQueue for that? Should we use some sort of actor-broker to receive requests from the client and store them in a queue: Client->ActorBroker->ActorExecutor? Could reminders be helpful here?
One more question in this regard: given that many thousands of actors might be stuck in a 'third-party incomplete call' at the same time, and we want to reactivate them and repeat the very last call, should we write a new tool for that? In NServiceBus you can create an error queue in MSMQ where all failed ('unable to process') messages land, and we were then able to simply re-process them at any time in the future. From my understanding, there is no such thing in Service Fabric, and it's something we need to build on our own.
An event-driven approach can help you here. Instead of waiting for the Actor to return from the call to a service, you can enqueue a task on it, requesting it to perform some action. The service-calling Actor would function autonomously, processing items from its task queue. This will allow it to perform retries and error handling. After a successful call, a new event can notify the rest of the system.
Maybe this project can help you to get started.
edits:
At this time, I don't believe you can use reliable collections in Actors. So a queue inside the state of an Actor, is a regular (read-only) collection.
Process the queue using an Actor Timer. Don't use the threadpool, as it's not persistent and won't survive crashes and Actor garbage collections.

Making a HTTP API server asynchronous with Future, how does it make it non-blocking?

I am trying to write a HTTP API server which does basic CRUD operation on a specific resource. It talks to an external db server to do the operations.
Future support in scala is pretty good, and for all non-blocking computation, future is used. I have used future in many places where we wrap an operation with future and move on, when the value is eventually available and the call back is triggered.
Coming to an HTTP API server's context, it is possible to implement non-blocking asynchronous calls, but a GET or a POST call still blocks the main thread, right?
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful right?
The main thread (created when the http request was received) could delegate and get a Future back, but is it still blocked until the onSuccess is triggered, which happens when the value is available, meaning the db call was successful?
I am failing to understand how an HTTP server could be designed to maximize efficiency: what happens when a few hundred requests hit a specific endpoint, and how is that dealt with? I've been told that Slick takes the best approach.
Could someone explain a successful http request lifecycle with Future and without Future, assuming there are 100 db connection threads?
When a GET request is made, a success 200 means the data is written to
the db successfully and not lost. Until the data is written to the
server, the thread that was created is still blocking until the final
acknowledgement has been received from the database that the insert is
successful right?
The thread that was created for the specific request need not be blocked at all. When you start an HTTP server, you always have the "main" thread running and waiting for requests to come in. Once a request arrives, it is usually handed off to a thread taken from a thread pool (or ExecutionContext). The thread serving the request doesn't need to block on anything; it only needs to register a callback which says "once this future completes, please complete this request with a success or failure indication". In the meantime, the client socket is still waiting for a response from your server; nothing has been returned yet. If, for example, we're on Linux and using epoll, we pass the kernel a list of file descriptors to monitor for incoming data, and we get a notification back once data becomes available.
We get this for free when running on top of the JVM because of how java.nio is implemented on Linux.
The main thread (created when http request was received) could delegate
and get a Future back, but is it still blocked until the onSuccess is
triggered which gets triggered when the value is available, which means
the db call was successful.
The main thread usually won't be blocked, as it is what is in charge of accepting new incoming connections. If you think about it logically, if the main thread blocked until your request completed, we could only serve one request at a time, and who wants a server which can only handle a single request at a time?
In order to be able to accept multiple requests, it will never handle the processing of the route on the thread on which it accepts the connection; it will always delegate that work to a background thread.
In general, there are many ways of doing efficient IO in both Linux and Windows. The former has epoll while the latter has IO completion ports. For more on how epoll works internally, see https://eklitzke.org/blocking-io-nonblocking-io-and-epoll
First off, there has to be something blocking the final main thread for it to keep running. But it's no different than having a threadpool and joining to it. I'm not exactly sure what you're asking here, since I think we both agree that using threads/concurrency is better than a single threaded operation.
Future is easy and efficient because it abstracts all the thread handling away from you. Typically, futures run on the global ExecutionContext (scala.concurrent.ExecutionContext.global), which is just a default thread pool. Once you kick off a Future, its body is scheduled on a thread from that pool and your program execution continues. There are also convenient constructs to directly manipulate the results of a future. For example, you can map and flatMap on futures, and once the future completes, your transformation is run on its result.
It's not like single threaded languages where a single future will actually block the entire execution if you have a blocking call.
When you're comparing efficiency, what are you comparing it to?
In general "non-blocking" may mean different things in different contexts: non-blocking = asynchronous (your second question) and non-blocking = non-blocking IO (your first question). The second question is a bit simpler (it addresses a more traditional, well-known aspect, let's say), so let's start with it.
The main thread(created when http request was received) could delegate and get a Future back, but is it still blocked until the onSuccess is triggered which gets triggered when the value is available, which means the db call was successful.
It is not blocked, because the Future runs on a different thread, so your main thread and the thread where you execute your db call logic run concurrently (the main thread is still able to handle other requests while the db call code of the previous request is executing).
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful right?
This aspect is about IO. The thread making the DB call (network IO) is not necessarily blocked. It is blocked in the old "thread per request" model, where the thread really waits and you need to create another thread for another DB request. However, nowadays non-blocking IO has become popular. You can google for more details about it, but in general it allows you to use one thread for several IO operations.
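The request lifecycle described in the answers above can be sketched as follows, in plain Java with CompletableFuture standing in for a Scala Future. Everything here is an assumption for illustration: insertAsync simulates an asynchronous database driver with a timer (a real driver would be backed by non-blocking IO), and handlePost is a hypothetical request handler. The point is that the handler returns as soon as the callback is registered, so the calling thread can start the next request while earlier database calls are still in flight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical example: handling requests by registering callbacks instead of blocking.
final class AsyncRequestLifecycle {

    // Timer that stands in for the DB acknowledging a write some time later.
    static final ScheduledExecutorService DB_TIMER =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread t = new Thread(runnable, "fake-db");
                t.setDaemon(true);
                return t;
            });

    /** Simulated async insert: the ack arrives after ~100 ms; no thread waits for it. */
    static CompletableFuture<Integer> insertAsync(String row) {
        CompletableFuture<Integer> ack = new CompletableFuture<>();
        DB_TIMER.schedule(() -> ack.complete(1), 100, TimeUnit.MILLISECONDS);
        return ack;
    }

    /** Registers "what to answer once the DB is done" and returns immediately. */
    static CompletableFuture<String> handlePost(String body) {
        return insertAsync(body)
                .thenApply(rowsWritten -> "200 OK")
                .exceptionally(error -> "500 Internal Server Error");
    }

    /** One thread starts all requests back to back, then collects the responses. */
    static List<String> serve(int requests) {
        List<CompletableFuture<String>> inFlight = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            inFlight.add(handlePost("row-" + i)); // does not wait for the DB
        }
        List<String> responses = new ArrayList<>();
        for (CompletableFuture<String> response : inFlight) {
            responses.add(response.join()); // only the demo joins; a server would write from the callback
        }
        return responses;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> responses = serve(100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(responses.size() + " responses in about " + elapsedMs + " ms");
    }
}
```

With a blocking handler, 100 sequential requests at 100 ms each would need roughly 10 seconds on one thread or 100 threads in parallel; here the waits overlap because no thread is held during them.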

in scala-akka actor should I open a future when handling a message?

I have an actor which receives a message to perform an operation like sending an alert, and this is the actor's sole purpose: to send alerts. Should I now open a future and wrap the sending of an alert in another async mechanism? Or is my actor's receive-and-send-alert already the async process I was waiting for?
It really depends on what sending that alert entails. More specifically, does sending that alert entail blocking code? You want to avoid blocking in the default dispatcher for your actor system. If sending the alert entails a blocking operation (like say a synchronous I/O operation), then you can consider a couple of different approaches:
1) Give this actor its own dispatcher so its blocking won't affect the main dispatcher for your actor system. This technique is referred to as "bulk heading" (I prefer "fire-walling") as you are sealing off your slow/blocking (and potentially dangerous code) from the rest of your application. If you have a single actor that sends these alerts, you might want to consider a Pinned Dispatcher which would result in this actor having its own thread for its execution.
2) Wrap the potentially dangerous alert sending code in a Future. This moves the blocking code out of your main dispatcher and into another ExecutionContext (provided you provide a different one for these types of things and don't just use the dispatcher of the actor itself), which should be safer. I don't love this approach as it can lead to closing over mutable state and potentially leading to race conditions (which actors try to avoid in the first place), but if done safely, it can be effective.
I would honestly stick with approach 1 as it better fits the Akka idiom, but that's just my opinion. Either approach can work for you as I'm sure there are a couple of other ones as well.
If sending the alert is non-risky and/or non-blocking, then there is no need IMO to use a Future.
That depends on whether sending this alert is a "long running blocking process". If it is, you should spin it off as a future, yes. If it's not, simply rely on the Actor to perform this action.

The Scala way to use one actor per socket connection

I am wondering how it is possible to avoid one socket connection per thread in Scala. I have thought a lot about it, but I always end up with some code which is listening for incoming data for each client connection.
The problem is that I want to develop an application which should simultaneously handle perhaps a couple of thousand connections. However, I of course do not want to create a thread for each connection, because of the lack of scalability and the cost of context switching.
What would be the "right" way to do this? In my world it should be possible to have one actor for each connection without the need to block one thread per actor.
In the book "Programming Scala" the authors used a library called naggati which provides a framework that combines NIO and actors, http://programming-scala.labs.oreilly.com/ch09.html.
I have an application that mixes actors with non-blocking sockets (i.e. NIO). The way I have done this is to have a dedicated IO thread, which sends messages to actors (in much the same way it would delegate work to a thread pool in a Java system) using the reactor pattern.
Obviously, using the old blocking sockets, you are restricted to one thread per connection. An actor could handle this, but of course this places a restriction on the number of connections which can be handled simultaneously.
In the case of a single IO thread, this is a bottleneck in theory but not much in practice (in our observations) as the IO thread is doing computationally non-intensive work. There are plenty of good discussions to be found on the NIO reactor pattern.
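For readers unfamiliar with the reactor pattern, here is a minimal sketch in plain Java NIO. It is not the code from the application described above. To stay self-contained it uses in-process java.nio.channels.Pipe objects instead of real sockets (a socket server would register SocketChannels in the same way), and the per-connection StringBuilder is a stand-in for sending a message to that connection's actor. The name MiniReactor and the message text are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical example: ONE IO thread watches many connections via a Selector.
final class MiniReactor {

    /** Opens `connections` pipes, writes one message to each, and reads them all on one thread. */
    static List<String> run(int connections) {
        try (Selector selector = Selector.open()) {
            List<Pipe> pipes = new ArrayList<>();
            StringBuilder[] inbox = new StringBuilder[connections]; // stand-in for per-connection actors
            int bytesExpected = 0;

            for (int i = 0; i < connections; i++) {
                Pipe pipe = Pipe.open();
                pipes.add(pipe);
                inbox[i] = new StringBuilder();
                pipe.source().configureBlocking(false);
                // The attachment remembers which connection a key belongs to.
                pipe.source().register(selector, SelectionKey.OP_READ, i);

                byte[] payload = ("hello from " + i).getBytes(StandardCharsets.US_ASCII);
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) pipe.sink().write(out);
                bytesExpected += payload.length;
            }

            ByteBuffer buffer = ByteBuffer.allocate(256);
            int bytesSeen = 0;
            while (bytesSeen < bytesExpected) {
                selector.select(); // the only place where the IO thread waits
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isReadable()) continue;
                    buffer.clear();
                    int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
                    if (n > 0) {
                        bytesSeen += n;
                        buffer.flip();
                        // In an actor system this would be something like actor.tell(bytes).
                        inbox[(Integer) key.attachment()].append(StandardCharsets.US_ASCII.decode(buffer));
                    }
                }
            }

            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
            List<String> received = new ArrayList<>();
            for (StringBuilder messages : inbox) received.add(messages.toString());
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        run(3).forEach(System.out::println);
    }
}
```

The IO thread does nothing but move bytes and hand them off, which matches the observation above that it is rarely the bottleneck in practice.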