I'm new to both Play framework and Akka Toolkit.
We are trying to build an orchestration layer between the web client and microservices using Play.
So basically for every request from the client, Play has to do a WS call and return the JSON (as well as cache it).
Now when doing the WS call, we can use Play Async APIs or use Akka actors.
Does either of these options have a clear advantage over the other?
Is there any recommendation on when one should venture into using Akka actors along with Play compared to directly using Play Async APIs?
In Akka, the main notion is the actor, which is an object with memory to keep state. Operations on that state are serialized by the system and cannot interfere with one another. With Java 8 promises/futures, the main notion is the asynchronous method call, and if different methods belong to the same object, it is the user's responsibility to provide serial access, which is not easy and can be error-prone.
Then, using futures implies creation of a Future object for each separate operation, which can be considered as overhead if the operations are fine-grained.
On the other hand, CompletableFuture has ways to combine several events into one, such as allOf(), which has no direct analogue in Akka.
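As a rough illustration of the allOf() point, here is a minimal Java 8 sketch that fires several calls concurrently and gathers their results. The callService method is a hypothetical stand-in for a non-blocking WS call; it is not part of Play or Akka.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class AllOfExample {

    // Hypothetical stand-in for a non-blocking WS call to one microservice.
    static CompletableFuture<String> callService(String name) {
        return CompletableFuture.supplyAsync(() -> "{\"service\":\"" + name + "\"}");
    }

    // Starts all calls concurrently and completes once every one has finished.
    static CompletableFuture<List<String>> callAll(List<String> names) {
        List<CompletableFuture<String>> calls = names.stream()
                .map(AllOfExample::callService)
                .collect(Collectors.toList());
        return CompletableFuture
                .allOf(calls.toArray(new CompletableFuture[0]))
                // join() does not block here: allOf guarantees each future is done.
                .thenApply(ignored -> calls.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(callAll(Arrays.asList("users", "orders")).join());
    }
}
```

Note that allOf() itself only yields a CompletableFuture<Void>, so the individual results still have to be collected from the original futures, as done in thenApply above.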
We know Akka is one implementation of the actor pattern. Without Akka, I usually implement a simple actor pattern using ThreadPool+BlockingQueue: messages are offered to the queue, and the workers (actors) take messages from the queue and then do what they should do. Of course, this kind of implementation only works within ONE process.
So, within one process:
What's the essential difference between these two (Akka vs. ThreadPool+BlockingQueue)?
Moreover, what's the difference between the actor pattern and the producer-consumer model?
The actor model is indeed quite similar to the producer-consumer model (P-C).
However, if you use a blocking queue with P-C your application won't be completely non-blocking and asynchronous. The promise of actor model and Akka is that all messages are sent asynchronously and don't block the sender.
Another aspect is that managing these queues gets quite cumbersome once you have many consumers and producers. With actors you simply send a message and don't have to think about these low-level details. Under the hood, Akka keeps a message queue (a mailbox) per actor, with a dispatcher assigning actors to threads from a pool to process those messages.
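To make the mailbox-per-actor idea concrete, here is a toy sketch in plain Java. It is not how Akka is implemented; MiniActor and its methods are hypothetical names, and the sketch only shows the principle: sending never blocks, each actor drains its own queue, and a shared thread pool plays the role of the dispatcher.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class MiniActor<M> {
    private final Queue<M> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService dispatcher;
    private final Consumer<M> behavior;

    MiniActor(ExecutorService dispatcher, Consumer<M> behavior) {
        this.dispatcher = dispatcher;
        this.behavior = behavior;
    }

    // Non-blocking send: enqueue the message and return immediately.
    void tell(M message) {
        mailbox.add(message);
        trySchedule();
    }

    // At most one drain task per actor is ever queued or running.
    private void trySchedule() {
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            dispatcher.execute(this::drain);
        }
    }

    // Processes messages one at a time, so the behavior needs no locks.
    private void drain() {
        try {
            M message;
            while ((message = mailbox.poll()) != null) {
                behavior.accept(message);
            }
        } finally {
            scheduled.set(false);
            trySchedule(); // a message may have arrived after the last poll
        }
    }

    // Demo: four senders each send 250 increments to one actor on a shared pool.
    static int demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        int[] counter = {0}; // actor-private state, touched only by the behavior
        CountDownLatch done = new CountDownLatch(1000);
        MiniActor<Integer> actor = new MiniActor<>(pool, n -> {
            counter[0] += n;
            done.countDown();
        });
        for (int sender = 0; sender < 4; sender++) {
            pool.execute(() -> {
                for (int i = 0; i < 250; i++) {
                    actor.tell(1);
                }
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Compared with a single shared BlockingQueue, the senders here never wait on the consumer, and the counter is updated without synchronization because only one drain task per actor can run at a time.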
It's much easier to use Akka to build a highly performant and resilient application than to code it all yourself. You get fault tolerance, resource management, location transparency, routing, distributed and async processing, and hierarchical supervision out of the box. Not to mention other frameworks and libraries that leverage these features to give you even more (reactive streams, akka http, etc). There are lots of patterns already developed for you, so why bother with your own?
I'm learning Akka Streams, but obviously this is relevant to any streaming framework :)
Quoting the Akka documentation:
Reactive Streams is just to define a common mechanism of how to move
data across an asynchronous boundary without losses, buffering or
resource exhaustion
Now, from what I understand, before streams (taking an HTTP server as an example), when a request arrived and the receiver had not yet finished with the previous one, the new incoming requests would be collected in a buffer holding the waiting requests. The problem is that this buffer has an unknown size, and at some point, if the server is overloaded, we can lose requests that were waiting.
Then stream processing came into play and bounded this buffer to make it controllable, so we can predefine the number of messages (requests, in my example) we want to have in line and take care of them one at a time.
My question: if we configure a source in our server to hold at most 3 messages, what happens to the 4th one when it arrives?
I mean, when another server calls us and we are already taking care of 3 requests, what will happen to its request?
What you're describing is not actually the main problem that Reactive Streams implementations solve.
Backpressure in terms of the number of requests is solved with regular networking tools. For example, in Java you can configure the thread pool of a networking library (for example Netty) to some parallelism level, and the library will take care of accepting as many requests as possible. Or, if you use the synchronous sockets API, it is even simpler: you can postpone calling accept() on the server socket until all of the currently connected clients are served. In either case, there is no "buffer" on either side; until the server accepts a connection, the client is simply blocked (either inside a system call for blocking APIs, or in an event loop for async APIs).
What Reactive Streams implementations solve is how to handle backpressure inside a higher-level data pipeline. Reactive streams implementations (e.g. akka-streams) provide a way to construct a pipeline of data in which, when the consumer of the data is slow, the producer will slow down automatically as well, and this would work across any kind of underlying transport, be it HTTP, WebSockets, raw TCP connections or even in-process messaging.
For example, consider a simple WebSocket connection, where the client sends a continuous stream of information (e.g. data from some sensor), and the server writes this data to some database. Now suppose that the database on the server side becomes slow for some reason (networking problems, disk overload, whatever). The server now can't keep up with the data the client sends, that is, it cannot save it to the database in time before the new piece of data arrives. If you're using a reactive streams implementation throughout this pipeline, the server will signal to the client automatically that it cannot process more data, and the client will automatically tweak its rate of producing in order not to overload the server.
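The sensor scenario can be sketched in-process with the Reactive Streams interfaces that ship with the JDK (java.util.concurrent.Flow and SubmissionPublisher, Java 9+). This is only an illustration under assumptions: SlowWriter and the 5 ms sleep are hypothetical stand-ins for the slow database, and there is no real network involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

final class BackpressureExample {

    // A deliberately slow consumer that asks for one element at a time.
    static final class SlowWriter implements Flow.Subscriber<Integer> {
        final List<Integer> saved = new ArrayList<>();
        final CountDownLatch finished = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // signal demand for exactly one item
        }

        @Override public void onNext(Integer reading) {
            try {
                Thread.sleep(5); // pretend this is a slow database write
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            saved.add(reading);
            subscription.request(1); // only now ask for the next item
        }

        @Override public void onError(Throwable t) { finished.countDown(); }

        @Override public void onComplete() { finished.countDown(); }
    }

    static List<Integer> run(int count) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        SlowWriter writer = new SlowWriter();
        // Small per-subscriber buffer: submit() blocks while it is full,
        // which is how the slow consumer throttles the producer.
        try (SubmissionPublisher<Integer> sensor = new SubmissionPublisher<>(executor, 2)) {
            sensor.subscribe(writer);
            for (int i = 0; i < count; i++) {
                sensor.submit(i);
            }
        } // close() signals onComplete after the buffered items are delivered
        try {
            writer.finished.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return writer.saved;
    }

    public static void main(String[] args) {
        System.out.println(run(20));
    }
}
```

Nothing is dropped and the buffer never grows beyond its bound; the producer loop simply runs at the pace the writer allows. Libraries such as akka-streams apply the same demand-signalling idea across whole pipelines and transports.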
Naturally, this can be done without any Reactive Streams implementation, e.g. by manually controlling acknowledgements. However, like many other libraries, Reactive Streams implementations solve this problem for you. They also provide an easy way to define such pipelines, and usually they have interfaces for various external systems like databases. In particular, such libraries may implement backpressure on the lowest level, down to the TCP connection, which may be hard to do manually.
As for Reactive Streams itself, it is just a description of an API which can be implemented by a library. It defines common terms and behavior and allows such libraries to be interchangeable or to interact easily, e.g. you can connect an akka-streams pipeline to a Monix pipeline using the interfaces from the specification, and the combined pipeline will work seamlessly and support all of the backpressure features of Reactive Streams.
I am going to keep it short: we have a product that uses BPM and an internal queue with lots of EJBs (pojo implementation). We decided to add REST to the product, and we zeroed in on JAX-RS, with Swagger for documentation.
Now, we have created an endpoint for an asynchronous scenario: when a REST request arrives, we start the BPMN flow asynchronously and then wait, up to an agreed timeout, for the flow to finish. In parallel, an internal queue receives a message when the BPMN flow has finished processing, and from that message we can construct the REST response.
I am looking for some enterprise pattern or utility framework to help me achieve this rather than inventing it myself. I know Camel has lots of such patterns, but I am not so sure about it. I am looking for something available as a JDK 1.6 compatible framework to simulate this synchronous behavior.
I would prefer something like RxJava or some observer/notifier pattern, and probably no internal JMS queues to pass messages between threads. A concurrent and thread-safe solution is what I am looking for.
If you are going to use JAX-RS, then you should probably become familiar with the Asynchronous Server API. For a slow but synchronous operation, you would simply dispatch a task to your executor, and resume the suspended request when you have a result.
Another approach is to store the suspended request in a shared data structure, with a worker responsible for observing the completed flows, looking up the suspended request and dispatching the response.
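A minimal sketch of that shared-data-structure idea, using only java.util.concurrent classes and syntax that also compile on JDK 1.6. The class name PendingReplies, the correlation id and the String payload are all hypothetical; a real version would hold the suspended request or response object rather than a String.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

final class PendingReplies {
    // One single-slot queue per in-flight request, keyed by a correlation id.
    private final ConcurrentMap<String, BlockingQueue<String>> pending =
            new ConcurrentHashMap<String, BlockingQueue<String>>();

    // Called by the request thread before it starts the flow.
    void register(String correlationId) {
        pending.put(correlationId, new ArrayBlockingQueue<String>(1));
    }

    // Called by the request thread; returns null on timeout or interruption.
    String await(String correlationId, long timeoutMillis) {
        BlockingQueue<String> slot = pending.get(correlationId);
        if (slot == null) {
            throw new IllegalStateException("not registered: " + correlationId);
        }
        try {
            return slot.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            pending.remove(correlationId); // never leak entries, even on timeout
        }
    }

    // Called by the worker that observes completed flows.
    // Returns false if nobody is waiting any more (for example after a timeout).
    boolean complete(String correlationId, String result) {
        BlockingQueue<String> slot = pending.get(correlationId);
        return slot != null && slot.offer(result);
    }

    public static void main(String[] args) {
        final PendingReplies replies = new PendingReplies();
        replies.register("req-1");
        new Thread(new Runnable() {
            public void run() {
                replies.complete("req-1", "flow finished");
            }
        }).start();
        System.out.println(replies.await("req-1", 2000));
    }
}
```

Registering before the flow starts avoids losing a completion that arrives very quickly. This version still parks the request thread while it waits; with the JAX-RS asynchronous API the map would instead hold the suspended response, and the worker would resume it directly.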
The ResponseServlet from Michael Barker's ticketing demonstration shows this basic idea (Barker's code uses servlets rather than JAX-RS, and Disruptor rather than RxJava, so you'll need to translate).
Additional resources on async response processing
https://dennis-xlc.gitbooks.io/restful-java-with-jax-rs-2-0-2rd-edition/content/en/part1/chapter13/server_asynchronous_response_processing.html
http://www.nurkiewicz.com/2014/12/asynchronous-timeouts-with.html
I want to make a call to a database which has lots of data and it might take a while to return.
I plan to do that work inside a call to Akka.future(f) and use an Async{} to render the response when the work is done.
Does it make sense to do that, or should I just do the long database call in the controller, without sending the work to Akka?
Or is there a way to do non blocking database access?
If you're forced to use a blocking driver for your database (if for some reason the async driver for MySQL doesn't work out) consider setting up an Actor pool (using routing) with a PinnedDispatcher.
The PinnedDispatcher provides a thread per actor and, by setting up the router, will give you the ability to adjust the number of threads strictly responsible for handling the database calls. Easy scaling. Also, by using Actors you can structure the messages between actors (e.g. a message having the results of the database call) a little easier.
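The underlying idea (a fixed, adjustable number of threads reserved for blocking database calls) can be sketched in plain Java as well. This is an analogue of the concept, not Akka's PinnedDispatcher or router API; BlockingDbPool and slowQuery are hypothetical names, and the sleep stands in for a real blocking driver call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BlockingDbPool {
    // Threads reserved exclusively for blocking database work.
    private final ExecutorService dbThreads;

    BlockingDbPool(int threads) {
        this.dbThreads = Executors.newFixedThreadPool(threads);
    }

    // Hypothetical blocking query; a real one would call the database driver.
    private static String slowQuery(String id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row-" + id;
    }

    // The caller gets a future immediately; the blocking happens on dbThreads.
    CompletableFuture<String> find(String id) {
        return CompletableFuture.supplyAsync(() -> slowQuery(id), dbThreads);
    }

    void shutdown() {
        dbThreads.shutdown();
    }

    public static void main(String[] args) {
        BlockingDbPool pool = new BlockingDbPool(4);
        System.out.println(pool.find("42").join());
        pool.shutdown();
    }
}
```

The pool size is the single knob: it caps how many database calls block at once while the request-handling threads stay free.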
You can use Akka.future(f) and provide your own Akka configuration file to get more threads to process your database accesses. Look at this config file for example.
But you pointed it out: the real problem is in using a database driver that blocks. I don't know which DB you are using, but it's worth taking a look at MongoDB with ReactiveMongo, for example. With ReactiveMongo, all MongoDB operations are non-blocking and asynchronous. There is a good introduction here. Moreover, it integrates very well with Play Framework (check the ReactiveMongo Play Plugin).
EDIT: You can also check "Configuring Playframework's internal Akka system" to tune the worker threads number.
If the response is blocked on completion of the database call, then it's only useful to make it asynchronous if you can get other work done towards assembling the response while the call runs.
Non blocking database access could mean a couple different things: A client library that gives you a callback based API, which would be pretty similar to the future solution, or one that uses non-blocking sockets to save on thread usage. I'm assuming you mean the former, in which case I think it'd be functionally equivalent to using a future.
I am building a traditional webapp that does database CRUD operations through JDBC, and I am wondering whether it is good to put JDBC operations into actors, outside the current request processing thread. I did some searching but found no tutorials or sample applications that demonstrate this.
So what are the pros and cons? Will making this asynchronous improve the capacity of the app server (i.e. the number of concurrent requests processed), like NIO does?
Whether putting JDBC access in actors is 'good' or not greatly depends upon the rest of your application.
Most web applications today are synchronous, thanks to the Servlet API that underlies most Java (and Scala) web frameworks. While we're now seeing support for asynchronous servlets, that support hasn't worked its way up through all frameworks. Unless you start with a framework that supports asynchronous processing, your request processing will be synchronous.
As for JDBC, JDBC is synchronous. Realistically there's never going to be anything done about that, given the burden that would place on modifying the gazillion JDBC driver implementations that are out in the world. We can hope, but don't hold your breath.
And the JDBC implementations themselves don't have to be thread safe, so invoking an operation on a JDBC connection prior to the completion of some other operation on that same connection will result in undefined behavior. And undefined behavior != good.
So my guess is that you won't see quite the same capacity improvements that you see with NIO.
Edit: Just discovered adbcj; an asynchronous database driver API. It's an experimental project written for a master's thesis, very early, experimental. It's a worthy experiment, and I hope it succeeds. Check it out!
But if you are building an asynchronous, actor-based system, I really like the idea of having data access or repository actors, much in the same way you would have data access or repository objects in a layered OO architecture.
Actors guarantee that messages are processed one at a time, which is ideal for accessing a single JDBC connection. (One word of caution: most connection pools default to handing out connection-per-thread, which does not play well with actors. Instead you'll need to make sure that you are using a connection-per-actor. The same is true for transaction management.)
This allows you to treat the database like the asynchronous remote system we ought to have been treating it as all along. This also means that results from your data access/repository actors are futures, which are composable. This makes it easier to coordinate data access with other asynchronous activities.
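A rough plain-Java sketch of such a repository "actor": a single-threaded executor serves as the mailbox, so the one connection it owns is only ever touched by one operation at a time, and every result comes back as a composable future. UserRepository is a hypothetical name, and the HashMap is a stand-in for a JDBC connection that is not thread safe; this is not Akka code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class UserRepository {
    // One thread, one "connection": operations run strictly one at a time.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    // Hypothetical stand-in for a JDBC connection owned by this repository.
    private final Map<Integer, String> connection = new HashMap<>();

    CompletableFuture<Void> save(int id, String name) {
        return CompletableFuture.runAsync(() -> connection.put(id, name), mailbox);
    }

    CompletableFuture<String> find(int id) {
        return CompletableFuture.supplyAsync(() -> connection.get(id), mailbox);
    }

    void close() {
        mailbox.shutdown();
    }

    static String demo() {
        UserRepository repo = new UserRepository();
        try {
            repo.save(1, "Ada");
            repo.save(2, "Alan");
            // Results are futures, so two lookups compose without blocking.
            return repo.find(1)
                    .thenCombine(repo.find(2), (a, b) -> a + " & " + b)
                    .join();
        } finally {
            repo.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Ada & Alan"
    }
}
```

Because the executor processes tasks in submission order, the two saves are guaranteed to run before the lookups, with no locking around the connection.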
So, is it good? Probably, if it fits within the architecture of the rest of your system. Will it improve capacity? That will depend on your overall system, but it sounds like a very worthy experiment.