Deploying a new Verticle for every HTTP Request? - vert.x

Currently on application startup I'm deploying a single verticle and calling createHttpServer(serverOptions).
I've set up a request().connection().closeHandler for handling a closed connection event, primarily so when clients decide to cancel their request, we halt our execution of that request.
However, when I set up that handler in that same verticle, it only seems to execute the closeHandler code once any synchronous code has finished executing and we're waiting on databases to respond via Futures and asynchronous handlers.
If instead of that, I deploy a worker verticle for each new HTTP request, it properly interrupts execution to execute the closeHandler code.
As I understand it, the HttpServer is already supposed to handle scalability of requests on its own since it can handle many at once without deploying new verticles. Essentially, this sounds like a hacky workaround that may affect our thread loads or things of that nature once our application is in full swing. So my questions are:
Is this the right way of doing this?
If not, what is the correct method or paradigm to follow?
How do you cancel the execution of a verticle from within the verticle itself, inside that closeHandler? And by cancel execution, I mean including any Futures waiting to be completed.
Why does closeHandler only execute asynchronously when using this multiple-verticle approach? The normal way, simply executing requests on the allotted thread pool, postpones closeHandler's execution until the event loop finishes its queue; we need this to happen asynchronously.

I think you need to understand Vert.x better. Vert.x does not start and stop a thread per request. Verticles are long-lived; each handles multiple events during its lifetime, but never concurrently. Also, you should not deploy worker (or non-worker) verticles per request.
Instead, you deploy a pool of verticles (worker and non-worker) and Vert.x divides the load between them. An HTTP server is placed in front; it receives requests and forwards them to the verticle(s) to be handled.
To stop processing a request, you need to keep a flag somewhere that is set when the connection is closed. You can then check it during your processing and stop. Just don't forget to clear the flag at the beginning of each request.
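For illustration, a minimal sketch of that flag pattern against the Vert.x 4 core API, written in Scala (assuming Scala 2.12+ SAM conversion for Vert.x's Handler interface); queryDb and render are hypothetical asynchronous stages of request processing:

import io.vertx.core.{Future => VxFuture, Vertx}
import io.vertx.core.http.HttpServerRequest
import java.util.concurrent.atomic.AtomicBoolean

object FlagSketch {
  // Hypothetical asynchronous stages standing in for real DB/rendering work.
  def queryDb(req: HttpServerRequest): VxFuture[String] = VxFuture.succeededFuture("row")
  def render(row: String): VxFuture[String] = VxFuture.succeededFuture(s"<p>$row</p>")

  def main(args: Array[String]): Unit = {
    Vertx.vertx().createHttpServer().requestHandler { (req: HttpServerRequest) =>
      val closed = new AtomicBoolean(false) // one flag per request
      req.connection().closeHandler((_: Void) => closed.set(true))

      queryDb(req)
        .compose { row =>
          // Check the flag between asynchronous steps and bail out early.
          if (closed.get()) VxFuture.failedFuture[String]("client disconnected")
          else render(row)
        }
        .onSuccess((body: String) => req.response().end(body))
        .onFailure((_: Throwable) => if (!closed.get()) req.response().setStatusCode(500).end())
    }.listen(8080)
  }
}

Because the flag here is created per request rather than stored on the verticle, there is nothing to clear between requests.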

Deploying or undeploying verticles doesn't affect the thread count. Vert.x uses thread pools of a limited size.
Undeploying verticles is a means to downscale your service. Ideally, you shouldn't undeploy verticles at all; deploying and undeploying do have a performance impact.
closeHandler, as I mentioned previously, is a callback method to release resources.
Vert.x Future doesn't provide cancellation means. The reason is that even Java's Future.cancel() is a cooperative operation.
To work around this, passing a reference to an AtomicBoolean, as suggested above, and checking it before every synchronous step is probably the best way. You will still be blocked by synchronous operations, though.

Related

Why is a non-blocking web request efficient, given that we are holding a server thread in both cases?

The question is about the Play framework specifically, although the concept is generic.
Quoting from:
https://www.playframework.com/documentation/2.6.18/ScalaAsync
The web client will be blocked while waiting for the response, but nothing will be blocked on the server, and server resources can be used to serve other clients.
Using a Future is only half of the picture though! If you are calling out to a blocking API such as JDBC, then you still will need to have your ExecutionStage run with a different executor, to move it off Play's rendering thread pool.
I understand the part that the original web application threads will be freed; however, another thread will still be needed to actually perform the CPU-intensive action and calculate the result, which is then propagated to the client (which is blocked meanwhile).
How is this better than synchronously performing the execution in Play's action code? We would have to increase the number of threads (as blocking requests consume threads), but the total number of active threads on the server would remain the same.
Can someone also shed light on how Play tracks the blocked client and returns the response in the non-blocking action scenario?
Using different thread pools for rendering and long-running operations is desirable because that way the long-running operations can use all of the threads in their pool without blocking rendering.
Imagine this situation:
10 clients make requests for resources that require long-running operations.
Then a client tries to access a resource that doesn't.
Here are two ways that this could be handled:
You have a pool with 10 threads used for everything. These fill up doing your long-running operations, and the other client — who has a simpler request! — has to wait for one of the long-running calls to finish.
You have two thread pools, one with 5 threads used for rendering and another with 5 threads used for long-running operations. The rendering threads quickly give the long-running operations work to the other pool, freeing them to respond to the eleventh client's request.
The second situation is definitely better, but I would like to point out another reason for having multiple thread pools: sometimes different operations require different kinds of system resources. For example, rendering might be CPU-bound, while database calls might be mostly network-bound, or CPU-bound but done on a different machine (the database server). If you use the same thread pool for both, the threads might get busy waiting for network calls to finish while the CPU sits mostly idle, even if you have several CPU-bound tasks queued. That would be an inefficient use of resources, so you should have different thread pools for different kinds of tasks.
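As a concrete example, in Play 2.6 (Scala) the documented way to isolate blocking work is a CustomExecutionContext backed by its own Akka dispatcher. A sketch; the dispatcher name "database.dispatcher" and the pool size are illustrative:

import javax.inject.{Inject, Singleton}
import akka.actor.ActorSystem
import play.api.libs.concurrent.CustomExecutionContext

// Looks up a dedicated dispatcher, configured in application.conf, e.g.:
//
//   database.dispatcher {
//     executor = "thread-pool-executor"
//     thread-pool-executor { fixed-pool-size = 5 }
//   }
@Singleton
class DatabaseExecutionContext @Inject()(system: ActorSystem)
    extends CustomExecutionContext(system, "database.dispatcher")

A controller can then wrap a blocking JDBC call in Future(blockingCall())(databaseExecutionContext), keeping Play's default (rendering) pool free for requests like the eleventh client's.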

How to deal with long-lasting operations in Reliable Actors or stateful Reliable Service and 're-process' failed states

I'm new to Service Fabric Reliable Actors technology and trying to figure out best practices for this specific scenario:
Let's say we have some legacy code alongside which we want to run new code built on SF Reliable Actors. Actors of a certain type, "ActorExecutor", are going to asynchronously call some third-party service that sometimes could get stuck for a pretty long time, longer than the actor's calling client is ready to wait, or even experience some prolonged underlying communication issues. We do not want the client (legacy code) to get blocked by any sort of issue in ActorExecutor; it does not expect to receive any value or status back from the actor. Should we use an SF ReliableQueue for that? Should we use some sort of actor broker to receive requests from the client and store them in a queue: Client->ActorBroker->ActorExecutor? Could reminders be helpful here?
One more question in this regard: given that many thousands of actors might get stuck in a 'third-party incomplete call' at the same time, and that we would want to reactivate them and repeat their very last call, should we write a new tool for that? In NServiceBus you can create an error queue in MSMQ where all failed 'unable to process' messages land, and then we were able to simply re-process them anytime in the future. From my understanding, there is no such thing in Service Fabric, and it's something we need to build on our own.
An event-driven approach can help you here. Instead of waiting for the Actor to return from the call to a service, you can enqueue a task on it, requesting it to perform some action. The service-calling Actor would function autonomously, processing items from its task queue. This allows it to perform retries and error handling. After a successful call, a new event can notify the rest of the system.
Maybe this project can help you to get started.
Edit:
At this time, I don't believe you can use reliable collections in Actors. So a queue inside the state of an Actor is a regular (read-only) collection.
Process the queue using an Actor Timer. Don't use the threadpool, as it's not persistent and won't survive crashes and Actor garbage collections.
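Setting the Service Fabric APIs aside, the shape of the pattern (enqueue work, drain it on a timer, re-enqueue failures for retry) can be sketched in plain Scala; callThirdParty stands in for the third-party call:

import java.util.concurrent.{ConcurrentLinkedQueue, Executors, TimeUnit}
import scala.util.{Failure, Success, Try}

final case class Task(payload: String)

class QueueWorker(callThirdParty: Task => Try[Unit]) {
  private val queue = new ConcurrentLinkedQueue[Task]()
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  // Callers enqueue and return immediately; they never wait on the third party.
  def enqueue(task: Task): Unit = queue.add(task)

  // A periodic "timer" drains the queue; failed tasks wait for a later tick.
  scheduler.scheduleWithFixedDelay(() => {
    val n = queue.size() // drain only what was queued at tick start
    for (_ <- 0 until n) {
      val task = queue.poll()
      if (task != null) {
        callThirdParty(task) match {
          case Success(_) => // success: notify the rest of the system here
          case Failure(_) => queue.add(task) // re-enqueue for retry
        }
      }
    }
  }, 0, 1, TimeUnit.SECONDS)
}

In the Actor version, the queue would live in the actor's state and the tick would be the Actor Timer mentioned above.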

The right way to call a fire-and-forget method on a Service Fabric service

I have a method on ServiceA that I need to call from ServiceB. The method takes upwards of 5 minutes to execute and I don't care about its return value. (Output from the method is handled another way)
I have setup my method in IServiceA like this:
[OneWay]
Task LongRunningMethod(int param1);
However, that doesn't appear to run, because I am getting System.TimeoutException: This can happen if message is dropped when service is busy or its long running operation and taking more time than configured Operation Timeout.
One choice is to increase the timeout, but it seems that there should be a better way.
Is there?
For fire-and-forget or long-running operations, the best solution is using a message bus as middleware that will handle the dependency between both processes.
To do what you want without middleware, your caller would have to worry about many things, like: timeouts (as in your case), delivery guarantees (confirmation), service availability, exceptions, and so on.
With the middleware, the only thing your application logic needs to worry about is the delivery guarantee; the rest is handled by the middleware and the receiver.
There are many options, like:
Azure Service Bus
Azure Storage Queue
MSMQ
Event Hub
and so on.
I would not recommend the SF communication, Task.Run(), or thread workarounds that many places suggest, because they will just bring you extra work and won't run as smoothly as the middleware approach.

Making an HTTP API server asynchronous with Future: how does it make it non-blocking?

I am trying to write an HTTP API server which does basic CRUD operations on a specific resource. It talks to an external db server to do the operations.
Future support in Scala is pretty good, and futures are used for all non-blocking computation. I have used futures in many places where we wrap an operation with a future and move on; when the value is eventually available, the callback is triggered.
Coming to an HTTP API server's context, it is possible to implement non-blocking asynchronous calls, but a GET or a POST call still blocks the main thread, right?
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocked until the final acknowledgement has been received from the database that the insert was successful, right?
The main thread (created when the HTTP request was received) could delegate and get a Future back, but is it still blocked until onSuccess is triggered, which happens when the value is available, meaning the db call was successful?
I am failing to understand how an HTTP server could be designed to maximize efficiency: what happens when a few hundred requests hit a specific endpoint, and how is that dealt with? I've been told that Slick takes the best approach.
Could someone explain a successful HTTP request lifecycle with a future and without a future, assuming there are 100 db connection threads?
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocked until the final acknowledgement has been received from the database that the insert was successful, right?
The thread that was created for the specific request need not be blocked at all. When you start an HTTP server, you always have the "main" thread ongoing and waiting for requests to come in. Once a request starts, it is usually offloaded to a thread taken from a thread pool (or ExecutionContext). The thread serving the request doesn't need to block on anything; it only needs to register a callback which says "once this future completes, please complete this request with a success or failure indication". In the meanwhile, the client socket is still pending a response from your server; nothing returns. If, for example, we're on Linux and using epoll, then we pass the kernel a list of file descriptors to monitor for incoming data and wait for that data to become available, at which point we get back a notification.
We get this for free when running on top of the JVM due to how java.nio is implemented for Linux.
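Concretely, in Scala the serving code just attaches a callback and moves on; fetchFromDb and completeRequest below are placeholders for the async DB call and the response writer:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object CallbackSketch extends App {
  def fetchFromDb(id: Long): Future[String] = Future("row") // placeholder async call
  def completeRequest(status: Int, body: String): Unit = println(s"$status: $body")

  val pending = fetchFromDb(42L)
  // Registering the callback returns immediately; the callback later runs
  // on a pool thread when the future completes.
  pending.onComplete {
    case Success(row) => completeRequest(200, row)
    case Failure(err) => completeRequest(500, err.getMessage)
  }

  Await.ready(pending, 1.second) // only to keep this demo JVM alive
}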
The main thread (created when the HTTP request was received) could delegate and get a Future back, but is it still blocked until onSuccess is triggered, which happens when the value is available, meaning the db call was successful?
The main thread usually won't be blocked, as it is what's in charge of accepting new incoming connections. If you think about it logically, if the main thread blocked until your request completed, that would mean we could serve only one concurrent request, and who wants a server that can handle only a single request at a time?
In order to accept multiple requests, the server never processes a route on the thread on which it accepts the connection; it always delegates that work to a background thread.
In general, there are many ways of doing efficient IO in both Linux and Windows. The former has epoll while the latter has IO completion ports. For more on how epoll works internally, see https://eklitzke.org/blocking-io-nonblocking-io-and-epoll
First off, there has to be something blocking the final main thread for it to keep running, but that's no different from having a threadpool and joining on it. I'm not exactly sure what you're asking here, since I think we both agree that using threads/concurrency is better than a single-threaded operation.
Future is easy and efficient because it abstracts all the thread handling from you. By default, all new futures run in the global implicit ExecutionContext, which is just a default threadpool. Once you kick off a Future, it is scheduled to run on one of those threads and your program execution continues. There are also convenient constructs to directly manipulate the results of a future. For example, you can map and flatMap on futures, and once a future (thread) returns, your transformation runs.
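For example, a small sketch on the implicit global pool; each step is scheduled when its input is ready, and nothing here blocks except the final print:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global // the default pool
import scala.concurrent.duration._

object FutureChain extends App {
  val userId: Future[Int] = Future { 42 } // e.g. some parsing/auth work
  // flatMap chains another asynchronous call; map transforms the eventual value.
  val profile: Future[String] = userId.flatMap(id => Future(s"profile-$id"))
  val page: Future[String] = profile.map(p => s"<html>$p</html>")

  println(Await.result(page, 1.second)) // blocking only to print the demo result
}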
It's not like single-threaded languages, where a single future will actually block the entire execution if you make a blocking call.
When you're comparing efficiency, what are you comparing it to?
In general "non-blocking" may mean different things in different contexts: non-blocking = asynchronous (your second question) and non-blocking = non-blocking IO (your first question). The second question is a bit simpler (addresses more traditional or well-known aspect let's say), so let's start from it.
The main thread (created when the HTTP request was received) could delegate and get a Future back, but is it still blocked until onSuccess is triggered, which happens when the value is available, meaning the db call was successful?
It is not blocked, because the Future runs on a different thread, so your main thread and the thread where you execute your db call logic run concurrently (the main thread is still able to handle other requests while the db call code of the previous request is executing).
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocked until the final acknowledgement has been received from the database that the insert was successful, right?
This aspect is about IO. The thread making the DB call (network IO) is not necessarily blocked. That was the case in the old "thread per request" model, where the thread really is blocked and you need to create another thread for another DB request. Nowadays, however, non-blocking IO has become popular. You can google for more details, but in general it allows you to use one thread for several IO operations.
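Since the question mentions Slick: in Slick 3 a query is turned into a Future by db.run, and the calling thread is not held while the query is in flight. A sketch; the table mapping and the "mydb" config entry are illustrative:

import slick.jdbc.H2Profile.api._
import scala.concurrent.Future

object SlickSketch {
  // Illustrative table mapping; assumes a matching "users" table exists.
  class Users(tag: Tag) extends Table[(Long, String)](tag, "users") {
    def id = column[Long]("id", O.PrimaryKey)
    def name = column[String]("name")
    def * = (id, name)
  }

  val db = Database.forConfig("mydb") // assumes a "mydb" entry in application.conf
  val users = TableQuery[Users]

  // db.run hands the query to Slick's own executor and returns immediately.
  val result: Future[Option[String]] =
    db.run(users.filter(_.id === 1L).map(_.name).result.headOption)
}

Under the hood Slick still runs the blocking JDBC work, but on its own configured pool (its AsyncExecutor), so the request-serving threads stay free.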

Suspend Akka Actors

I am trying to use Akka to implement the following (I think I'm trying to use Akka the proper way):
I have a system where I have n resource listeners. Essentially, a resource listener is an entity that will listen on an input resource and publish what it sees (e.g. polling a database, tailing a log file, etc.).
So I want to use Akka actors to do these little bits of work units (listening on a resource). I've noticed that Akka gives me a thread pool of t threads, which may be fewer than the number of listeners. Unfortunately for me, getting a message from these resource listeners might be blocking, so it could take seconds or minutes before the next message pops up.
Is there any way to suspend a resource listener so it leaves the thread to another actor and we'll come back to it a little later in time?
Executive Summary
What you want is for your producer API (the resources) to be asynchronous, or at least support non-blocking operations (so that you can do polling). If the API does not support that, then there is no way to retrofit this property, not even using the almighty actors ;-)
Strategies for Different Situations
Only Blocking API
If the resources only support the blocking getWhatever() method of retrieving things, then you must allocate one thread per resource. An Actor with a PinnedDispatcher could be a way to do this. But be aware that the actor will not be responsive while waiting for events from the resource.
Non-Blocking but Synchronous API
If there is a peek() or poll() method on the resource API, you can use one actor per resource, have them share a thread (or pool) and schedule the polling as required (e.g. every 100ms or whatever you need). This has the huge advantage that nobody is actually blocked and the whole system remains responsive. But latency for event reception will be on the order of your schedule interval.
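A sketch of this polling variant with classic Akka actors; Resource.poll() stands in for whatever non-blocking peek the API offers:

import akka.actor.{Actor, ActorSystem, Props}
import scala.concurrent.duration._

trait Resource { def poll(): Option[String] } // hypothetical non-blocking API

class PollingListener(resource: Resource) extends Actor {
  import context.dispatcher // ExecutionContext for the scheduler
  private case object Poll

  override def preStart(): Unit = self ! Poll

  def receive: Receive = {
    case Poll =>
      // Publish anything that is ready, then re-schedule instead of blocking;
      // the thread is free for other actors between polls.
      resource.poll().foreach(event => context.system.eventStream.publish(event))
      context.system.scheduler.scheduleOnce(100.millis, self, Poll)
  }
}

// Usage, e.g.: ActorSystem("listeners").actorOf(Props(new PollingListener(myResource)))

Re-sending Poll from the handler, rather than using a fixed periodic timer, guarantees polls never overlap, at the cost of slight drift in the interval.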
Proper Asynchronous API
If you have enough good karma to encounter a nice asynchronous API, then simply register a callback which will send a message to the actor whenever an event occurs. Sadly, this is not the norm.
PS:
The JVM does not support wrapping up the current call stack, doing something else, and returning to that same processing state later. A method can only be popped off the stack when it is actually finished.
In general, you should try to avoid blocking operations in actors. For file IO there are asynchronous libraries, and for some databases, too. If that is not an option for you, you can change the default dispatcher so that the underlying thread pool expands as needed.
One option is to call your blocking APIs inside Futures. The Futures should use an ExecutionContext (thread pool) that is separate from the Actors' ExecutionContext.
See this blog post for an example (specifically CacheActor.findValueForSender).
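A sketch of that approach; blockingLookup stands in for the blocking API, and the result is piped back to the requester as an ordinary message:

import akka.actor.Actor
import akka.pattern.pipe
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object BlockingCaller {
  // A dedicated pool, so blocked threads never starve the actors' dispatcher.
  implicit val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(10))
}

class BlockingCaller(blockingLookup: String => String) extends Actor {
  import BlockingCaller.blockingEc

  def receive: Receive = {
    case key: String =>
      // Run the blocking call off the actor's thread; pipeTo sends the
      // result (or failure) back to the asker as a message.
      Future(blockingLookup(key)).pipeTo(sender())
  }
}

The actor itself stays responsive while the lookup runs, which mirrors the pattern in the linked blog post.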