What happens to vertx.eventloop thread once the control passes to blockingHandler? - vert.x

I am using Vert.x as an API gateway, and each request has to go through multiple handlers.
Sample code snippet:
router.route(BASE_PATH)
    .method(HttpMethod.POST)
    .handler(LoggerHandler.create(LoggerFormat.SHORT))
    .handler(BodyHandler.create())
    .blockingHandler(this::authRouter)
    .blockingHandler(this::reqValidationRouter)
    .handler(this::downStreamRouter)
    .blockingHandler(this::responseTransformRouter);
What happens to event loop threads when the control passes to blockingHandler? Do they continue to accept more requests? If yes, what happens when the blocking handler execution completes?
Does this switching from eventLoop to blockingHandler (workerPool) and then back to eventLoop have any performance implications?
What is the ideal way to handle multiple handlers?
Thanks,
Nitish Goyal

What happens to event loop threads when the control passes to blockingHandler? Do they continue to accept more requests?
Yes, the event loop will offload the blocking handler part to the worker pool and handle other events.
If yes, what happens when the blocking handler execution completes?
An event with the result is added to the event loop queue.
Does this switching from eventLoop to blockingHandler (workerPool) and then back to eventLoop have any performance implications?
Switching between threads is not free but the cost should be negligible in the overall latency for such a use case (api gateway).
What is the ideal way to handle multiple handlers?
Ideally you'd avoid blocking code in your Vert.x Web handlers.
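The hand-off described in this answer can be sketched with plain JDK executors. This is only an illustration, not Vert.x code: the single-threaded "event-loop" executor, the "worker" pool, and the class name OffloadSketch are all assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the event-loop / worker-pool hand-off. This is not Vert.x code.
class OffloadSketch {

    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static List<String> run() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService workers = Executors.newFixedThreadPool(2, r -> new Thread(r, "worker"));
        List<String> log = new CopyOnWriteArrayList<>();
        try {
            // Request 1: the blocking step runs on the worker pool...
            CompletableFuture<Void> first = CompletableFuture
                .supplyAsync(() -> {
                    sleep(200); // pretend this is a blocking auth check
                    return "blocking step ran on " + Thread.currentThread().getName();
                }, workers)
                // ...and its result is queued back onto the event loop as a new task.
                .thenAcceptAsync(message -> {
                    log.add(message);
                    log.add("result handled on " + Thread.currentThread().getName());
                }, eventLoop);

            // Request 2: the event loop is free to handle it while request 1 is still blocked.
            CompletableFuture
                .runAsync(() -> log.add("other request handled on " + Thread.currentThread().getName()), eventLoop)
                .join();

            first.join();
            return log;
        } finally {
            eventLoop.shutdown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log shows the blocking step on a worker thread and both the unrelated request and the final result on the event-loop thread, which is the behaviour the answer describes.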

Related

Deploying a new Verticle for every HTTP Request?

Currently on application startup I'm deploying a single verticle and calling createHttpServer(serverOptions).
I've set up a request().connection().closeHandler for handling a closed connection event, primarily so when clients decide to cancel their request, we halt our execution of that request.
However, when I set up that handler in that same verticle, it only seems to execute the closeHandler code once any synchronous code is finished executing and we're waiting on databases to respond via Futures and asynchronous handlers.
If instead of that, I deploy a worker verticle for each new HTTP request, it properly interrupts execution to execute the closeHandler code.
As I understand it, the HttpServer is already supposed to handle scalability of requests on its own since it can handle many at once without deploying new verticles. Essentially, this sounds like a hacky workaround that may affect our thread loads or things of that nature once our application is in full swing. So my questions are:
Is this the right way of doing this?
If not, what is the correct method or paradigm to follow?
How do you cancel the execution of a verticle from within the verticle itself, inside that closeHandler? By cancelling execution, I mean including any Futures waiting to be completed.
Why does closeHandler only execute asynchronously with this multiple-verticle approach? Doing it the normal way, simply executing requests on the allotted thread pool, postpones closeHandler's execution until the event loop finishes its queue. We need this to happen asynchronously.
I think you need to understand Vert.x better. Vert.x does not start and stop a thread per request. Verticles are long-lived, and each one handles multiple events during its lifetime, but never concurrently. Also, you should not deploy worker (or non-worker) verticles per request.
What you do is deploy a pool of verticles (worker and non-worker), and Vert.x divides the load between them. An HTTP server is placed in front and will receive requests and forward them to the verticle(s) to be handled.
To stop processing a request, you need to keep a flag somewhere which is set if the connection is closed. Then you can check it during processing and stop. Just don't forget to clear the flag at the beginning of each request.
Deploying or undeploying verticles doesn't affect threads count. Vert.x uses thread pools of a limited size.
Undeploying verticles is a means to downscale your service. Ideally, you shouldn't undeploy verticles at all. Deploying or undeploying does have a performance impact.
closeHandler, as I mentioned previously, is a callback method to release resources.
Vert.x Future doesn't provide cancellation means. The reason is that even Java's Future.cancel() is a cooperative operation.
To work around this, passing a reference to an AtomicBoolean, as suggested above, and checking it before every synchronous step is probably the best way. You will still be blocked by synchronous operations, though.
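A minimal sketch of that flag-checking approach, using only the JDK. The class CancellableRequest and its methods are hypothetical names for this example, and the "steps" are placeholders for real synchronous work. In a Vert.x application, cancel() would presumably be invoked from the connection's closeHandler.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of cooperative cancellation; the steps are placeholders.
class CancellableRequest {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    // Would be called from something like a connection close callback.
    void cancel() {
        cancelled.set(true);
    }

    // Runs the steps in order, checking the flag before each one.
    // A step that is already running is never interrupted.
    // Returns how many steps actually ran.
    int process(List<Runnable> steps) {
        int completed = 0;
        for (Runnable step : steps) {
            if (cancelled.get()) {
                break;
            }
            step.run();
            completed++;
        }
        return completed;
    }

    // Three steps, with the client "disconnecting" during the second one.
    static int demo() {
        CancellableRequest request = new CancellableRequest();
        return request.process(List.<Runnable>of(
            () -> { },           // step 1: runs normally
            request::cancel,     // step 2: simulates the connection closing mid-step
            () -> { }));         // step 3: skipped because the flag is now set
    }

    public static void main(String[] args) {
        System.out.println("steps run: " + demo()); // prints "steps run: 2"
    }
}
```

Creating one such object per request also sidesteps the need to clear a shared flag at the start of each request.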

Making a HTTP API server asynchronous with Future, how does it make it non-blocking?

I am trying to write an HTTP API server which does basic CRUD operations on a specific resource. It talks to an external db server to do the operations.
Future support in Scala is pretty good, and Future is used for all non-blocking computation. I have used Futures in many places where we wrap an operation in a Future and move on; when the value is eventually available, the callback is triggered.
Coming to an HTTP API server's context: it is possible to implement non-blocking asynchronous calls, but doesn't a GET or a POST call still block the main thread?
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful right?
The main thread (created when the http request was received) could delegate and get a Future back, but is it still blocked until onSuccess is triggered, which happens when the value is available, meaning the db call was successful?
I am failing to understand how an HTTP server could be designed to maximize efficiency, and what happens when a few hundred requests hit a specific endpoint and how that is dealt with. I've been told that slick takes the best approach.
Could someone explain a successful http request lifecycle with a Future and without one, assuming there are 100 db connection threads?
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful right?
The thread that was created for the specific request need not be blocked at all. When you start an HTTP server, you always have the "main" thread running and waiting for requests to come in. Once a request starts, it is usually offloaded to a thread taken from a thread pool (or ExecutionContext). The thread serving the request doesn't need to block on anything; it only needs to register a callback which says "once this future completes, please complete this request with a success or failure indication". In the meantime, the client socket is still pending a response from your server; nothing returns. If, for example, we're on Linux and using epoll, then we pass the kernel a list of file descriptors to monitor for incoming data and wait; once data becomes available, the kernel notifies us.
We get this for free when running on top of the JVM due to how java.nio is implemented for Linux.
The main thread (created when http request was received) could delegate and get a Future back, but is it still blocked until the onSuccess is triggered which gets triggered when the value is available, which means the db call was successful.
The main thread usually won't be blocked, as it is what's in charge of accepting new incoming connections. If you think about it logically, if the main thread blocked until your request completed, we could only serve one request at a time, and who wants a server which can only handle a single request at a time?
In order to be able to accept multiple requests, the server never processes the route on the thread that accepts the connection; it always delegates that work to a background thread.
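The "register a callback instead of blocking" idea can be sketched in Java with CompletableFuture, which plays a role similar to a Scala Future here. Everything in this example is an assumption made for illustration: insertRow is a placeholder for a real database call, and the response strings are invented.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: the request thread registers callbacks and returns immediately.
class CallbackSketch {

    // Placeholder for a database insert that takes a while (or fails).
    static CompletableFuture<Integer> insertRow(ExecutorService dbPool, boolean fail) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(100); // simulated network round trip to the database
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (fail) {
                throw new IllegalStateException("db down");
            }
            return 1; // rows written
        }, dbPool);
    }

    static String serve(boolean fail) {
        ExecutorService dbPool = Executors.newFixedThreadPool(2);
        try {
            // No waiting here: we only describe what should happen once the insert finishes.
            CompletableFuture<String> response = insertRow(dbPool, fail)
                .thenApply(rows -> "200 OK, rows=" + rows)
                .exceptionally(error -> "500 Internal Server Error");

            // A real server would write the response from inside the callback.
            // We join only so this demo can return the final value.
            return response.join();
        } finally {
            dbPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(serve(false)); // 200 OK, rows=1
        System.out.println(serve(true));  // 500 Internal Server Error
    }
}
```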
In general, there are many ways of doing efficient IO in both Linux and Windows. The former has epoll while the latter has IO completion ports. For more on how epoll works internally, see https://eklitzke.org/blocking-io-nonblocking-io-and-epoll
First off, there has to be something blocking the final main thread for it to keep running. But it's no different than having a threadpool and joining to it. I'm not exactly sure what you're asking here, since I think we both agree that using threads/concurrency is better than a single threaded operation.
Future is easy and efficient because it abstracts all the thread handling away from you. Typically, new futures run in the global implicit ExecutionContext (once you import it), which is just a default thread pool. Once you kick off a Future, its body is scheduled on a thread from that pool and your program execution continues. There are also convenient constructs to directly manipulate the results of a future. For example, you can map and flatMap on futures, and once that future completes, it will run your transformation.
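As a rough Java analogue of map and flatMap on Scala futures, CompletableFuture offers thenApply and thenCompose. The sketch below is illustrative only; findUserId and loadProfile are hypothetical helpers invented for the example.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: transforming and chaining futures without blocking in between.
class ComposeSketch {

    // Placeholder lookup: pretends the user's id is the length of the name.
    static CompletableFuture<Integer> findUserId(String name) {
        return CompletableFuture.supplyAsync(() -> name.length());
    }

    // Placeholder second asynchronous call that depends on the first result.
    static CompletableFuture<String> loadProfile(int id) {
        return CompletableFuture.supplyAsync(() -> "profile-" + id);
    }

    static String run() {
        return findUserId("alice")
            .thenApply(id -> id * 10)                 // like map: transform the value
            .thenCompose(ComposeSketch::loadProfile)  // like flatMap: chain another future
            .join();                                  // block only at the very end, for the demo
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "profile-50"
    }
}
```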
It's not like single threaded languages where a single future will actually block the entire execution if you have a blocking call.
When you're comparing efficiency, what are you comparing it to?
In general "non-blocking" may mean different things in different contexts: non-blocking = asynchronous (your second question) and non-blocking = non-blocking IO (your first question). The second question is a bit simpler (addresses more traditional or well-known aspect let's say), so let's start from it.
The main thread(created when http request was received) could delegate and get a Future back, but is it still blocked until the onSuccess is triggered which gets triggered when the value is available, which means the db call was successful.
It is not blocked, because the Future runs on a different thread, so your main thread and the thread executing your db call logic run concurrently (the main thread is still able to handle other requests while the db call of the previous request is executing).
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful right?
This aspect is about IO. The thread making the DB call (network IO) is not necessarily blocked. It is blocked in the old "thread per request" model, where the thread really waits and you need to create another thread for another DB request. Nowadays, however, non-blocking IO has become popular. You can search for more details about it, but in general it allows you to use one thread for several IO operations.

In a Scala Akka actor, should I open a Future when handling a message?

I have an actor which receives a message to perform an operation such as sending an alert; sending alerts is this actor's purpose. Should I now open a Future and wrap the sending of an alert in another async mechanism? Or is my actor's receive-then-send-alert flow already the async process I was looking for?
It really depends on what sending that alert entails. More specifically, does sending that alert entail blocking code. You want to avoid blocking in the default dispatcher for your actor system. If sending the alert entails a blocking operation (like say a synchronous I/O operation), then you can consider a couple of different approaches:
1) Give this actor its own dispatcher so its blocking won't affect the main dispatcher for your actor system. This technique is referred to as "bulkheading" (I prefer "fire-walling"), as you are sealing off your slow/blocking (and potentially dangerous) code from the rest of your application. If you have a single actor that sends these alerts, you might want to consider a Pinned Dispatcher, which would result in this actor having its own thread for its execution.
2) Wrap the potentially dangerous alert sending code in a Future. This moves the blocking code out of your main dispatcher and into another ExecutionContext (provided you provide a different one for these types of things and don't just use the dispatcher of the actor itself), which should be safer. I don't love this approach as it can lead to closing over mutable state and potentially leading to race conditions (which actors try to avoid in the first place), but if done safely, it can be effective.
I would honestly stick with approach 1 as it better fits the Akka idiom, but that's just my opinion. Either approach can work for you as I'm sure there are a couple of other ones as well.
If sending the alert is non-risky and/or non-blocking, then there is no need IMO to use a Future.
That depends on whether sending this alert is a "long running blocking process". If it is, you should spin it off as a Future, yes. If it's not, simply rely on the actor to perform this action.

What is the difference between asynchronous and synchronous HTTP request?

What is the difference between asynchronous and synchronous HTTP request?
Synchronous:
A synchronous request blocks the client until the operation completes. In that case, the JavaScript engine of the browser is blocked.
Asynchronous:
An asynchronous request doesn't block the client, i.e. the browser stays responsive. In the meantime, the user can perform other operations. In that case, the JavaScript engine of the browser is not blocked.
Check out Determining synchronous vs. asynchronous in web applications for previous discussion. In short:
Asynchronous APIs do not block. Every synchronous call waits and blocks for your results to come back. This is just a sleeping thread and wasted computation.
If you need something to happen, send an asynchronous request and do further computation when the request returns. This means your thread sits idle and can pick up other work.
Asynchronous requests are the way to scale to thousands of concurrent users.
Sachin Gandhwani's answer is very well explained in simple words. In case you are still not convinced with the difference of asynchronous HTTP request and synchronous HTTP request, you can read this - https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests
A synchronous client constructs an HTTP structure, sends a request, and waits for a response. An asynchronous client constructs an HTTP structure, sends a request, and moves on. In this case, the client is notified when the response arrives. The original thread, or another thread, can then process the response. Although asynchronous behavior can result in faster overall execution, synchronous behavior might be preferred in certain cases where more simplified code is necessary.
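The two client styles can be compared side by side with the JDK's own java.net.http.HttpClient (Java 11+), whose send method blocks and whose sendAsync method returns a CompletableFuture. To keep this sketch self-contained, it starts a throwaway local server with the JDK's built-in com.sun.net.httpserver.HttpServer. The "pong" body and the class name are assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: one synchronous and one asynchronous request against a local test server.
class SyncVsAsyncSketch {

    static String[] run() {
        HttpServer server;
        try {
            // Port 0 asks the OS for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/", exchange -> {
            byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/"))
                .build();

            // Synchronous: the calling thread waits here until the response has arrived.
            String syncBody = client.send(request, HttpResponse.BodyHandlers.ofString()).body();

            // Asynchronous: sendAsync returns at once; the callback runs when the response arrives.
            CompletableFuture<String> pending = client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
            // ...the calling thread could do other work here...
            String asyncBody = pending.join(); // joined only so the demo can return the value

            return new String[] { syncBody, asyncBody };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        String[] bodies = run();
        System.out.println("sync: " + bodies[0] + ", async: " + bodies[1]);
    }
}
```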

Suspend Akka Actors

I am trying to use Akka to implement the following (I think I'm trying to use Akka the proper way):
I have a system where I have n resource listeners. Essentially a resource listener is an entity that will listen on an input resource and publish what it sees (i.e. polling a database, tailing a log file, etc.).
So I want to use Akka actors to do these little units of work (listening on a resource). I've noticed that Akka gives me a thread pool of t threads, which may be fewer than the number of listeners. Unfortunately for me, getting a message from these resource listeners might be blocking, so it could take seconds or minutes before the next message pops up.
Is there any way to suspend a resource listener so it leaves the thread to another actor and we'll come back to it a little later in time?
Executive Summary
What you want is for your producer API (the resources) to be asynchronous, or at least support non-blocking operations (so that you can do polling). If the API does not support that, then there is no way to retrofit this property, not even using the almighty actors ;-)
Strategies for Different Situations
Only Blocking API
If the resources only support the blocking getWhatever() method of retrieving things, then you must allocate one thread per resource. An Actor with a PinnedDispatcher could be a way to do this. But be aware that the actor will not be responsive while waiting for events from the resource.
Non-Blocking but Synchronous API
If there is a peek() or poll() method on the resource API, you can use one actor per resource, have them share a thread (or pool) and schedule the polling as required (i.e. every 100ms or whatever you need). This has the huge advantage that nobody is actually blocked and the whole system remains responsive. But latency for event reception will be of the order of your schedule interval.
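The scheduled-polling strategy can be sketched outside Akka with a ScheduledExecutorService: several resources share one thread, and each is polled at a fixed interval using a non-blocking poll(). This is only an illustration under assumptions: the resources are modelled as in-memory queues, the 10 ms interval is arbitrary, and PollingSketch is a made-up name.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: many listeners, one shared thread, nobody blocks.
class PollingSketch {

    // Polls every resource every 10 ms until the expected number of events was seen
    // (or 5 seconds have passed), then returns the events in the order they were observed.
    static List<String> collect(List<Queue<String>> resources, int expectedEvents) {
        ScheduledExecutorService sharedThread = Executors.newSingleThreadScheduledExecutor();
        List<String> seen = new CopyOnWriteArrayList<>();
        CountDownLatch remaining = new CountDownLatch(expectedEvents);
        try {
            for (Queue<String> resource : resources) {
                sharedThread.scheduleAtFixedRate(() -> {
                    String event = resource.poll(); // returns null immediately if nothing is there
                    if (event != null) {
                        seen.add(event);
                        remaining.countDown();
                    }
                }, 0, 10, TimeUnit.MILLISECONDS);
            }
            remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            sharedThread.shutdownNow();
        }
        return List.copyOf(seen);
    }

    public static void main(String[] args) {
        Queue<String> database = new ConcurrentLinkedQueue<>(List.of("db-1", "db-2"));
        Queue<String> logFile = new ConcurrentLinkedQueue<>(List.of("log-1"));
        System.out.println(collect(List.of(database, logFile), 3));
    }
}
```

As the answer notes, the trade-off is latency: an event may wait up to one polling interval before it is noticed.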
Proper Asynchronous API
If you have enough good karma to encounter a nice asynchronous API, then simply register a callback which will send a message to the actor whenever an event occurs. Sadly, this is not the norm.
PS:
The JVM does not support wrapping up the current call stack, doing something else, and returning to that same processing state later. A method can only be popped off the stack when it has actually finished.
In general, you should try to avoid blocking operations in actors. For file IO there are asynchronous libraries, and for some databases, too. If that is not an option for you, you can change the default dispatcher configuration so that the underlying thread pool expands as needed.
One option is to call your blocking APIs inside Futures. The Futures should use an ExecutionContext (thread pool) that is separate from the Actors' ExecutionContext.
See this blog post for an example (specifically CacheActor.findValueForSender).