Starting with reactive DB access in a blocking monolith (WildFly)

In a DB-heavy monolith based on WildFly, does it make sense to transform the DB access to a reactive one for starters? Should I see performance benefits?
Also, the DB is Sybase, and the only 'generic' JDBC driver I know of is from Vert.x, but this implies that I will have to put Vert.x inside my WildFly. I understand that they are sort of alternatives, but I can't find any other options.
I would love to hear your thoughts on the two points I am raising. In general, I can't commit to a full transition from WildFly to Quarkus/Vert.x from the get-go, as it would take a lot of resources, so I thought I could start smaller...

Vert.x is a toolkit, which means, for example, that you do not need to use the web server it provides, nor any other module. It's also very lightweight, so you will only add a few more dependencies to your application. So, yes, it can make sense to integrate Vert.x.
vertx-jdbc-client, however, cannot magically transform blocking calls into non-blocking calls. Instead, it off-loads the blocking calls onto Vert.x's worker thread pool. That has another effect: the DB call you used to wait for will return immediately, leaving you with nothing but a Future. That Future will eventually hold the expected result.
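For illustration, here is a minimal sketch of what that looks like, assuming the Vert.x 4 JDBCPool flavor of vertx-jdbc-client; the Sybase URL, driver class, and query are placeholders:

```java
import io.vertx.core.Future;
import io.vertx.core.Vertx;
import io.vertx.core.json.JsonObject;
import io.vertx.jdbcclient.JDBCPool;
import io.vertx.sqlclient.Tuple;

public class ReactiveDao {

    private final JDBCPool pool;

    public ReactiveDao(Vertx vertx) {
        // URL and driver class are placeholders for your Sybase setup.
        pool = JDBCPool.pool(vertx, new JsonObject()
                .put("url", "jdbc:sybase:Tds:dbhost:5000/mydb")
                .put("driver_class", "com.sybase.jdbc4.jdbc.SybDriver")
                .put("max_pool_size", 16));
    }

    public Future<String> findCustomerName(int id) {
        // execute() returns immediately; the blocking JDBC work happens on
        // Vert.x's worker pool, and the Future completes when it is done.
        return pool.preparedQuery("SELECT name FROM customers WHERE id = ?")
                .execute(Tuple.of(id))
                .map(rows -> rows.iterator().next().getString("name"));
    }
}
```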
Going further upstream in your code (the direction the user's request came from), this means that you will have to either
(1) defer processing of the result via Future.map() or Future.compose(), or
(2) block the thread to get the result immediately.
You will gain nothing by (2), so rule that out.
When you go for (1), you must defer all further processing, up to the point where the incoming request is originally handled. If that is, for example, a Servlet, you have to use Asynchronous Processing to make sure that WildFly does not commit the response after the doGet, doPost, etc. method exits.
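A hedged sketch of how that could look, reusing the hypothetical ReactiveDao from above with the Servlet 3.0 async API:

```java
import java.io.IOException;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/customer", asyncSupported = true)
public class CustomerServlet extends HttpServlet {

    private ReactiveDao dao; // the hypothetical DAO sketched above

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        // Tell the container not to commit the response when doGet() returns.
        AsyncContext ctx = req.startAsync();
        dao.findCustomerName(Integer.parseInt(req.getParameter("id")))
                .onSuccess(name -> {
                    try {
                        ctx.getResponse().getWriter().write(name);
                    } catch (IOException ignored) {
                    }
                    ctx.complete(); // the response is committed here instead
                })
                .onFailure(err -> {
                    ((HttpServletResponse) ctx.getResponse())
                            .setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
                    ctx.complete();
                });
    }
}
```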
The result of all this will be that WildFly now handles your request asynchronously, with Vert.x managing the DB interaction. You can do that. But it would be more idiomatic to your current setup to just use Asynchronous Processing (or Spring's @Async feature) and wrap all of your code in a Runnable. Neither approach will speed up request processing itself, because the processing depends on the slower DB. However, WildFly will be able to process more requests, because the threads it assigns to requests will no longer be blocked.
All that said, if you want to migrate to Quarkus in small steps, you should do it service by service. Identify the Servlets (or Controllers) that do the work, and port them one by one to Quarkus. If sessions are your problem, you could possibly share them between WildFly and Quarkus using Infinispan.

Related

Will WebFlux have any bottlenecks in such an architecture?

We're currently about to migrate from a monolithic design to a microservice architecture, trying to choose the best way to replace JAX-WS with REST, and we're considering Spring WebFlux.
We currently have a JAX-WS endpoint deployed on Tomcat EE, serving requests from third-party clients. The web service endpoint makes a long-running blocking call to the database and then sends a SOAP response to the client with the data retrieved from the DB (Oracle).
The Oracle DB will be replaced with a NoSQL database soon (possibly MongoDB). Since MongoDB supports asynchronous calls, we're considering substituting the current implementation with a microservice exposing a REST endpoint based on WebFlux.
We have about 2,500 req/s at peak, so the current endpoint often goes down with an OutOfMemoryError. That was the root cause that pushed us toward migration.
My idea is to create a non-blocking endpoint that calls MongoDB asynchronously and sends a REST response to the client. So I have a few questions concerning the basic features WebFlux provides:
1. As far as I understand, WebFlux provides built-in backpressure control at the business level (not TCP flow control), and it generally works via Reactive Streams. Since our clients are not reactive, does that mean this kind of backpressure control is not implementable here?

2. Suppose calls to the new database remain long-running in the new architecture. Since Netty uses an EventLoop to serve incoming requests, is the following situation possible: the microservice has accepted all incoming HTTP connections, invoked async calls to the DB, and subscribed the resulting Monos on a scheduler, but, since the request volume keeps growing explosively, the application keeps creating new workers in the scheduler pools until it crashes? Is this a realistic scenario?

3. Suppose calls to the database remain synchronous. Is there a way to handle them with WebFlux such that the microservice remains reachable under load?

4. Which bottlenecks can be found in such a design? Does this solution look adequate?

5. Does Netty (or Reactor Netty) have a tool to limit the number of requests processed simultaneously? Say I want the endpoint to serve no more than 100 parallel requests and reject all requests above that limit; is that possible?

6. Suppose I create a huge number of threads serving async (or maybe sync) calls to the DB. Where is the breaking point at which the application will crash or stop responding to incoming HTTP requests? What will happen there: will we run out of memory, or...?
Finally, there were no major performance issues during our pilot project. But unfortunately we didn't take into account some specific Linux (and OpenShift) TCP tuning properties.
They can significantly affect overall performance; in our case, we gained about 10 times more throughput after tuning.
So pay attention to net.core.somaxconn and other related parameters.
I've summarized our expertise in an article.
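For reference, this is the kind of kernel tuning meant here; the values below are purely illustrative and need to be validated against your own load profile:

```
# /etc/sysctl.conf -- illustrative values only, not recommendations
net.core.somaxconn = 4096            # backlog of accepted connections awaiting the app
net.ipv4.tcp_max_syn_backlog = 8192  # backlog of half-open (SYN_RECV) connections
```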

How to handle timeouts in a REST client when calling methods with side effects

Let's say we have a REST client with some UI that lists items it GETs from the server. The server also exposes some REST methods to manipulate the items (POST / PUT).
Now the user triggers one of those calls that are supposed to change the data on the server side. The UI will reflect the server state change if the call was successful.
But what are good strategies to handle the situation when the server is not available?
What are reasonable timeout lengths (especially in a 3G / cloud setup)?
How do you handle the timeout in the client, considering the fact that the client can't tell whether the operation succeeded or not?
Are there any common patterns to solve that, other than a complete client termination (and subsequent restart)?
This will be application specific. You need to decide what makes the most sense in your usage case.
Perhaps start with a timeout similar to the default PHP session timeout of 24 minutes. Adjust as necessary based on testing.
Do you have server and client mixed up here? If so, the server cannot tell whether the client has timed out, other than by reaching the end of a session. The client can always query the server for a progress update.
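A common pattern that builds on this: give each state-changing request a client-generated reference, and after a timeout query the server to learn whether the operation actually happened before retrying. A minimal sketch with Java 11's HttpClient; the endpoints and the clientRef lookup are hypothetical:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

public class ItemClient {

    private final HttpClient http = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    public void createItem(String clientRef, String json) throws Exception {
        HttpRequest post = HttpRequest.newBuilder(URI.create("https://api.example.com/items"))
                .timeout(Duration.ofSeconds(10)) // fail fast instead of hanging the UI
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        try {
            http.send(post, HttpResponse.BodyHandlers.ofString());
        } catch (HttpTimeoutException e) {
            // Outcome unknown: ask the server whether the item exists
            // before deciding to retry the POST.
            HttpRequest probe = HttpRequest.newBuilder(
                    URI.create("https://api.example.com/items/" + clientRef)).GET().build();
            int status = http.send(probe, HttpResponse.BodyHandlers.discarding()).statusCode();
            if (status == 404) {
                // The POST never took effect; it is safe to retry.
            }
        }
    }
}
```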
This one is a little general to provide an answer for.

Play & Akka and blocking threads for database access

I want to make a call to a database which has lots of data and it might take a while to return.
I plan to do that work inside a call to Akka.future(f) and use an Async{} to render the response when the work is done.
Does it make sense to do that, or should I just do the long database call in the controller, without sending the work to Akka?
Or is there a way to do non-blocking database access?
If you're forced to use a blocking driver for your database (if for some reason the async driver for MySQL doesn't work out) consider setting up an Actor pool (using routing) with a PinnedDispatcher.
The PinnedDispatcher provides a thread per actor, and by setting up the router you gain the ability to adjust the number of threads strictly responsible for handling the database calls. Easy scaling. Also, by using actors you can structure the messages between actors (e.g., a message carrying the results of the database call) a little more easily. A sketch of this setup follows below.
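A hedged sketch of such a pool, using the Akka Java API; the dispatcher name, pool size, and query helper are assumptions:

```java
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.routing.RoundRobinPool;

// Assumes this dispatcher block in application.conf:
//   db-pinned-dispatcher {
//     type = PinnedDispatcher
//     executor = "thread-pool-executor"
//   }
public class DbActor extends AbstractActor {

    public static final class Query {
        public final String sql;
        public Query(String sql) { this.sql = sql; }
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(Query.class, q -> {
                    // The blocking JDBC call runs here, on this actor's own thread.
                    Object result = runBlockingQuery(q.sql);
                    getSender().tell(result, getSelf());
                })
                .build();
    }

    private Object runBlockingQuery(String sql) {
        return null; // placeholder for the actual JDBC work
    }

    public static ActorRef createPool(ActorSystem system) {
        // Five routees, one dedicated thread each: effectively a 5-thread DB pool.
        return system.actorOf(
                new RoundRobinPool(5).props(
                        Props.create(DbActor.class).withDispatcher("db-pinned-dispatcher")),
                "db-pool");
    }
}
```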
You can use Akka.future(f) and provide your own Akka configuration file to get more threads to process your database accesses. Look at this config file for example.
But you pointed it out: the real problem is in using a database driver that blocks. I don't know which DB you are using, but it's worth taking a look at MongoDB with ReactiveMongo, for example. With ReactiveMongo, all MongoDB operations are completely non-blocking and asynchronous. There is a good introduction here. Moreover, it works very well with Play Framework (check the ReactiveMongo Play plugin).
EDIT: You can also check "Configuring Playframework's internal Akka system" to tune the worker threads number.
If the response is blocked on completion of the database call, then it's only useful to make it asynchronous if you can get other work done towards assembling the response while the call runs.
Non-blocking database access could mean a couple of different things: a client library that gives you a callback-based API, which would be pretty similar to the future solution, or one that uses non-blocking sockets to save on thread usage. I'm assuming you mean the former, in which case I think it would be functionally equivalent to using a future.

Is it good to put JDBC operations in actors?

I am building a traditional webapp that does database CRUD operations through JDBC, and I am wondering whether it is good to put the JDBC operations into actors, off the current request-processing thread. I did some searching but found no tutorials or sample applications that demonstrate this.
So what are the pros and cons? Will this asynchronization improve the capacity of the app server (i.e., the number of concurrent requests processed), like NIO does?
Whether putting JDBC access in actors is 'good' or not greatly depends upon the rest of your application.
Most web applications today are synchronous, thanks to the Servlet API that underlies most Java (and Scala) web frameworks. While we're now seeing support for asynchronous servlets, that support hasn't worked its way into all frameworks. Unless you start with a framework that supports asynchronous processing, your request processing will be synchronous.
As for JDBC, JDBC is synchronous. Realistically there's never going to be anything done about that, given the burden that would place on modifying the gazillion JDBC driver implementations that are out in the world. We can hope, but don't hold your breath.
And the JDBC implementations themselves don't have to be thread safe, so invoking an operation on a JDBC connection prior to the completion of some other operation on that same connection will result in undefined behavior. And undefined behavior != good.
So my guess is that you won't see quite the same capacity improvements that you see with NIO.
Edit: I just discovered adbcj, an asynchronous database driver API. It's a very early, experimental project, written for a master's thesis. It's a worthy experiment, and I hope it succeeds. Check it out!
But, if you are building an asynchronous, actor-based system, I really like the idea of having data access or repository actors, much in the same way you would have data access or repository objects in a layered OO architecture.
Actors guarantee that messages are processed one at a time, which is ideal for accessing a single JDBC connection. (One word of caution: most connection pools default to handing out connection-per-thread, which does not play well with actors. Instead you'll need to make sure that you are using a connection-per-actor. The same is true for transaction management.)
This allows you to treat the database like the asynchronous remote system we ought to have been treating it as all along. This also means that results from your data access/repository actors are futures, which are composable. This makes it easier to coordinate data access with other asynchronous activities.
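A hedged sketch of that composition, using the ask pattern from a recent Akka Java API; the repository actors and message types are hypothetical:

```java
import java.time.Duration;
import java.util.concurrent.CompletionStage;
import akka.actor.ActorRef;
import akka.pattern.Patterns;

public class OrderService {

    private static final Duration TIMEOUT = Duration.ofSeconds(5);

    private final ActorRef userRepo;  // hypothetical repository actors that
    private final ActorRef orderRepo; // each own their JDBC connection

    public OrderService(ActorRef userRepo, ActorRef orderRepo) {
        this.userRepo = userRepo;
        this.orderRepo = orderRepo;
    }

    // Two DB lookups chained together without blocking a thread in between.
    public CompletionStage<Object> ordersForUser(int userId) {
        return Patterns.ask(userRepo, new FindUser(userId), TIMEOUT)
                .thenCompose(user -> Patterns.ask(orderRepo, new FindOrders(user), TIMEOUT));
    }

    // Message types the repository actors are assumed to understand.
    public static final class FindUser {
        public final int id;
        public FindUser(int id) { this.id = id; }
    }

    public static final class FindOrders {
        public final Object user;
        public FindOrders(Object user) { this.user = user; }
    }
}
```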
So, is it good? Probably, if it fits within the architecture of the rest of your system. Will it improve capacity? That will depend on your overall system, but it sounds like a very worthy experiment.

Implement server-push with GWTP

I've got a project using GWTP (which involves MVP separation, Gin, and Dispatch). Now I'm in a situation where changes on the server need to be pushed to specific clients.
I've been reading the gwt-comet and gwteventservice documentation. It seems the first doesn't work with RPC and the second encapsulates RPC, and I don't know how to fit that into my current command pattern from GWTP. Ideas?
I have been using gwt-comet (http://code.google.com/p/gwt-comet/). It's a native comet implementation that works pretty well, much like RPC; you can send Strings or your GWT-serialized objects as well. And the best thing is that you don't need to do much to make it work.
I used the "Server Push in GWT" approach described at http://code.google.com/p/google-web-toolkit-incubator/wiki/ServerPushFAQ; it seemed to work fairly well for a small project.
This is really a servlet problem, not a GWT or GWTP problem.
So there are a few approaches to doing this. The most stable, in my opinion, is a long (blocking) poll servlet. This is basically a servlet that is polled by the client and holds the connection open for some period of time if there is no message to 'push' to the client; if too much time passes (this is to get around HTTP timeouts), a heartbeat of some kind is returned. Either way, when the servlet request returns, the client just makes another request. To my mind this is the most portable and stable way, since it uses only the core Servlet API and doesn't suffer from network issues, and the blocking portion allows the poll to 'park' at the server for some period of time, which reduces total request load while still delivering new information to the client very quickly when it becomes available.
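A minimal blocking long-poll sketch; the single shared event queue stands in for whatever per-client routing you actually need, and the 30-second park time is an assumption:

```java
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/poll")
public class LongPollServlet extends HttpServlet {

    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

    // Server-side code calls this whenever there is something to push.
    public void publish(String event) {
        events.offer(event);
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String event = null;
        try {
            // Park the request for up to 30 s, well under typical HTTP timeouts.
            event = events.poll(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // On timeout, return a heartbeat; the client re-polls either way.
        resp.setContentType("application/json");
        resp.getWriter().write(event != null ? event : "{\"heartbeat\":true}");
    }
}
```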
The next way to achieve this is via WebSockets. This is great once you get it working, and in my opinion it is the way of the future without question. I think this is a good one to get familiar with, since it will be, in my opinion, a paradigm shift in web applications once it catches a head of steam, so we all need to be up to speed. Basically, you have a JavaScript 'socket' open via port 80 (this is one of the best features, since you don't have to open any firewall holes) and can communicate in two directions across that socket.
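A minimal server-side sketch using the standard JSR-356 API (a hedged illustration; the endpoint path and broadcast-to-everyone delivery are assumptions):

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import javax.websocket.OnClose;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

@ServerEndpoint("/push")
public class PushEndpoint {

    // One entry per connected client.
    private static final Set<Session> sessions = new CopyOnWriteArraySet<>();

    @OnOpen
    public void onOpen(Session session) {
        sessions.add(session);
    }

    @OnClose
    public void onClose(Session session) {
        sessions.remove(session);
    }

    // Server-side code calls this to push a message to every client.
    public static void broadcast(String message) {
        for (Session session : sessions) {
            session.getAsyncRemote().sendText(message); // non-blocking send
        }
    }
}
```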
Comet can also work, but it will generally lock you down to one server type, which may be alright for your application. One caveat here: I have only done very small tests with comet; it was flaky when I set it up and not as steady as the blocking-poll solution.
Now the neatest one, in my opinion, but probably limited to single-domain intranet applications due to network constraints, is applet-based push. This setup (which could be done with UDP or a straight socket; I did it all over HTTP just to keep it conceptually simpler) uses an applet to spin up a Jetty server instance on the client, and then has the page publish the client's Jetty 'endpoint' to the server. At that point, the client can contact the server using the server's servlets, and the server can contact the client at the servlet(s) exposed on the client's Jetty server. This is true push. It's neat, but there are network nightmares.
So of all the above, I use long polling, keep my eye on WebSockets since they are the future in my mind, and really like the applet-based version, although it's quite restricted in use due to the network resolution limitations.
Once you have decided on this, from GWTP you would just have actions or JSNI bridge methods as needed to connect to your server and receive responses. I won't go into that, since it is really a core servlet/HTTP/JavaScript question more than a GWT- or GWTP-centric one.
I hope that helps!