Finagle and Akka, why not use them together? - scala

I have used neither Finagle nor Akka in practice, but I have been reading a lot about them.
Finagle is an RPC system and Akka is a toolkit for highly concurrent applications, so why does everyone compare them as two alternatives that cannot be used together? Every search I've done suggests using one or the other; no one suggests using them together.
Finagle, for example, has a very interesting way of defining endpoints via Thrift and its IDL. With this IDL we could define a custom endpoint and, through Scrooge or some other code generation tool, get a service with very little effort. The generated client also takes care of many common client concerns automatically (reconnection, timeouts, retries, load balancing, connection pooling, ...).
Akka, on the other hand, solves a lot of concurrency headaches and scales extremely well without the complexity of hand-controlled threading.
In summary, why not use them together?
Finagle + Thrift (with its IDL): It facilitates service design and development as well as deployment (which includes ease of scaling-out).
Akka: It uses all the server power through its Actor system and it scales extremely well if I change server properties (for example if it's deployed on EC2 and I convert my node from m1.small to m1.large).
What do you think?
NOTE: Assume that the issue of mapping Futures and Promises is resolved, as well as the mismatch between FuturePools and ExecutionContexts. The pattern would be to convert Finagle futures to the Scala way of using Futures.

You are right in that service discovery and service implementation are orthogonal concerns, and I can follow your argument about using Finagle for the former and Akka for the latter. You could in principle use the two together without seeking a grand unification of futures, since you only need to send the service’s reply back to the requesting Actor in a message, i.e. you would need to add your own little “pipeTo” pattern on top of Twitter futures.
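A rough sketch of that "pipeTo" idea, using only JDK classes so it runs without either library. Everything here is a stand-in and an assumption on my part: CompletableFuture plays the role of the Twitter future, a BlockingQueue plays the role of the actor's mailbox, and the names Reply, pipeTo and demo are made up for illustration. It is not Finagle's or Akka's API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PipeToSketch {
    // Hypothetical message type delivered to the "actor".
    static final class Reply {
        final String body;      // set on success
        final Throwable error;  // set on failure
        Reply(String body, Throwable error) {
            this.body = body;
            this.error = error;
        }
    }

    // Registers a callback that forwards the outcome of the future
    // (value or failure) to the mailbox as an ordinary message.
    static void pipeTo(CompletableFuture<String> future, BlockingQueue<Reply> mailbox) {
        future.whenComplete((value, failure) -> mailbox.add(new Reply(value, failure)));
    }

    // Runs one round trip and returns the body the "actor" received.
    static String demo() {
        BlockingQueue<Reply> mailbox = new LinkedBlockingQueue<>();
        // Stand-in for an asynchronous RPC call.
        CompletableFuture<String> rpcCall = CompletableFuture.supplyAsync(() -> "pong");
        pipeTo(rpcCall, mailbox);
        try {
            Reply reply = mailbox.poll(5, TimeUnit.SECONDS);
            return reply == null ? null : reply.body;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point is that the receiving side never touches the future itself; it only sees a message, so no unification of the two future types is needed for this part.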

Related

How to configure tomcat in http4s

I wonder if there is any option to configure Tomcat when using the http4s server API. The Tomcat builder allows changing some basic options, but beyond those there is not much that can be set. Could I somehow provide a server.xml file? Or get access to the Tomcat instance?
Tomcat, AFAIR is a server running applications defined as WARs. That is: your app is not a server, your app is logic bound to functionalities provided by this external layer.
Http4s server talks to the external world directly, it manages requests lifecycle through FS2 streams, if you use it then you probably talk to DB through e.g. Doobie or some other Cats library, which also manages its own thread pools and transactions instead of relying on ThreadLocals and other JavaEE-like and JPA-like conventions.
Long story short: these models are incompatible.
You would have to rip out 90% of Http4s to leave something that could be agnostic to the implementation, and then wire it to Tomcat, but that would leave only, IDK, DSLs for building requests? And for that you'd have better chances defining things with e.g. Endpoints4s or Tapir and implementing an interpreter which binds things to Tomcat.
But at that point there would probably be 0 benefits from using Tomcat or any other servlet container: in Java EE it is usually easy to monitor things because you have 1 thread per ongoing request, and all DB connections and similar resources are put into ThreadLocals as request-scoped things. Meanwhile virtually all IO monads use thread pools (separate ones for different boundaries), so your operation can span several threads. All the conventions that containers rely on to monitor things (which are quite inflexible performance-wise) go to hell.
Meanwhile Http4s has its own ways of tracking things (through middleware), similarly you can add instrumentation to DB queries. So things provided by servlet container are also available there, though it requires a bit of effort to configure them.
The bottom line is: if you are using Http4s, you don't need a servlet container, and if you are using a servlet container, then you don't get nice integration with anything that uses an IO monad (scala.concurrent.Future, cats.effect.IO, monix.eval.Task, zio.ZIO, etc.), as they aren't guaranteed to run the whole operation on the same thread (invalidating a lot of assumptions made by certain Java frameworks).
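The thread-affinity problem can be seen with nothing but the JDK. This is only a small illustration under my own assumptions: REQUEST_ID is a made-up request-scoped value, and a single-thread executor stands in for the thread pools an IO runtime would use. It is not Tomcat or Http4s code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalSketch {
    // Hypothetical request-scoped value, stored per thread.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    // Sets the value on the calling thread, then reads it from a pool thread.
    // The pool thread has its own (empty) slot, so the read yields null.
    static String readFromPoolThread() {
        REQUEST_ID.set("req-42");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = REQUEST_ID::get;
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            REQUEST_ID.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("value seen on pool thread: " + readFromPoolThread());
    }
}
```

As soon as part of the request runs on another thread, whatever was stashed in the ThreadLocal is simply not there, which is why conventions built on one thread per request stop working.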

Play framework web application: when to use Akka?

I'm transitioning from Java to Scala, and started using Play as application server. My Java legacy application (the one I'm trying to replace) is built on three layers: servlets, session beans and entity beans. I read that Akka actors would replace session beans, is that accurate? When is it appropriate to use Akka actors in a Play web application?
I don't think there is any rule of thumb like "convert Session Beans / Entity Beans to actors".
You may need to look at your requirements. It's worth considering what the actor model is used for: the actor model is a concurrency model that avoids concurrent access to mutable state, using asynchronous communication mechanisms to provide concurrency.
This is valuable because using shared state from multiple threads gets really hard, especially when there are relationships among different components of the shared state that must be kept synchronized.
However, if you have domain components in which:
You don't allow concurrency, OR
You don't allow mutable state (as in functional programming), OR
You must rely on some synchronous communications mechanism,
then the actor model will not provide much (if any) benefit.
Take a look at this URL if you haven't already http://www.infoq.com/news/2014/02/akka-ejbs-concurrency
To answer the second part of the question:
When is it appropriate to use Akka actors in a Play web application?
Scala is not just syntactically sugared Java; it is meant for writing in a functional programming (FP) style. Scala makes writing functional code easier and more obvious. When you do this, all your methods become functions (they hold no state changes).
From here, when you do want to hold global state, you encapsulate it within an Actor and interact with it only via a message queue, and therefore don't have to concern yourself with threading logic.
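To make the "state behind a message queue" idea concrete, here is a hand-rolled sketch in plain Java. It is not Akka and not how Akka is implemented; CounterActorSketch, its mailbox and its methods are names I made up. A single worker thread drains a queue, so the counter is only ever touched by that one thread and no locks are needed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActorSketch {
    // The only way to touch the counter is to enqueue a message.
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    private int count = 0; // read and written only by the worker thread

    CounterActorSketch() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    mailbox.take().run(); // process one message at a time
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Fire-and-forget message.
    void increment() {
        mailbox.add(() -> count++);
    }

    // Request-reply message: the answer comes back through a future.
    CompletableFuture<Integer> get() {
        CompletableFuture<Integer> reply = new CompletableFuture<>();
        mailbox.add(() -> reply.complete(count));
        return reply;
    }

    // Four threads send 1000 increments each, then the total is requested.
    static int demo() {
        CounterActorSketch actor = new CounterActorSketch();
        Thread[] senders = new Thread[4];
        for (int i = 0; i < senders.length; i++) {
            senders[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    actor.increment();
                }
            });
            senders[i].start();
        }
        try {
            for (Thread sender : senders) {
                sender.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return actor.get().join();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 4000
    }
}
```

With a plain shared int and no synchronization the four senders would lose updates; here the queue serializes them. A real actor library adds supervision, routing and remoting on top of this basic shape.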
A great course to get started with this is https://www.coursera.org/course/progfun
A great book to get started with Scala is:
Scala for the Impatient by Cay S. Horstmann

Difference in message-passing model of Akka and Vert.x

I am a Scala programmer and understand Akka from a developer's point of view. I have not looked into the Akka library's code. I have read about the two types of actors in the Akka model, thread-based and event-based, but not having run Akka at large scale I don't have experience configuring Akka for production. And I am completely new to Vert.x. So, from the perspective of choosing a reactive application stack, I want to know:
Is the message-passing model of Akka and Vert.x very different? How?
Are the data-structures behind Akka's actors and Vert.x's verticles to buffer messages very different?
On a superficial view they're really similar, although I personally find Vert.x's ideas closer to an MQ system than to Akka. The Vert.x topology is flatter: a verticle shares a message with another verticle and receives a response. Akka is more like a tree, where you have several actors and can supervise actors using other actors. For simple projects that may not be a big deal, but for big projects you could appreciate a more hierarchical system.
Vert.x, on the other hand, offers better interoperability between very popular languages*. For me that is a big point: where you would otherwise need to mix actors with an MQ system and deal with more complexity, Vert.x makes it simple and elegant. So which is better? It depends. If your system will be built only in Scala, then Akka could be the best way. If you need communication with JavaScript, Ruby, Python, Java, etc. and don't need a complex hierarchy, then Vert.x is the way to go.
*(using JSON, which could be an advantage or disadvantage compared to)
Also you must consider that Vert.x is a full solution: TCP, HTTP server, routing, even WebSocket! That is pretty amazing because they offer a full stack and the API is very clean. If you choose Akka you would need to use a framework like Play, Xitrum or Spray. Personally I don't like any of them.
Also remember that Vert.x is not an opinionated platform; you can use Akka or Kafka with it, for example, with almost no overhead. The way every part of the system is decoupled inside a verticle makes it so simple.
Vert.x is a big project with an amazing perspective but really new, if you need a solution now maybe it would not be the better option, fortunately you can learn both and use both in the same project.
After a bit of Google searching I have figured out that a detailed comparison of Akka vs Vert.x has not yet been done (at least I couldn't find one).
Computation model:
Vert.x is based on an event-driven model.
Akka is based on the Actor model of concurrency.
Reactive Streams:
Vert.x has Reactive Streams built in.
Akka supports Reactive Streams via Akka Streams. Akka has stream operators (via a Scala DSL) that are very concise and clean.
HTTP Support
Vert.x has built-in support for creating network services (HTTP, TCP, etc.)
Akka has Akka HTTP for that
Scala support
Vert.x is written in Java
Akka is written in Scala and it's fun to work with
Remote services
Vert.x supports services, so we need to explicitly create services
Akka has Actors which can be deployed anywhere on the network, with support for clustering, replication, load-balancing, supervision etc.
References:
https://groups.google.com/forum/#!topic/vertx/ppSKmBoOAoQ
https://blog.openshift.com/building-distributed-and-event-driven-applications-in-java-or-scala-with-akka-on-openshift/
https://en.wikipedia.org/wiki/Vert.x
http://akka.io/

What actor based web frameworks are available for Scala?

I need to build a highly concurrent web service which will expose a REST-based API for JavaScript (front end) and Rails (back end). The web service will act as a data access API on top of MongoDB.
I already wrote an initial implementation using NodeJS and would like to try Scala based solution. I'm also considering Erlang, for which every web framework is actor based.
So I'm looking for a web framework explicitly built using Actors in order to support a massive load of requests. I'm very new to Scala and I don't quite understand how Actors might work if almost all frameworks for Scala are based on Java servlets, which create a thread for each request and would simply exhaust all resources in my scenario.
If you're really going to have 10k+ long active connections at a time, then any standard Java application server/framework (maybe, except for Netty) will not work for you - all of them are consuming lots of memory (even if any kind of smart NIO is used). You'd better stick to a clustered event-loop based solution (like node.js that you've already tried), mongrel backed with zeroMQ, nginx with the mode for writing into MQ polled by Scala Actors, etc.
Among the Scala/Java frameworks, Lift has good async support for REST (though it's not directly tied to actors). OTOH, LinkedIn uses Scalatra + stdlib actors for their REST services behind Signal, and feels just fine.
Another option is Play framework. The latest 1.1 release supports Scala. It also supports akka as a module.
As for Scalatra itself, they have been working on a new request abstraction called SSGI (akin to the Servlet/Rack/WSGI/WAI layer), which they said should enable them to break from solely running as a Servlet and also run on top of something built with Netty. See thread here.
http://github.com/scalatra/ssgi
There are some other interesting frameworks at the Scalatra level of simplicity, designed from the ground up to support asynchronous web services (they won't tie up a thread per request):
https://github.com/jdegoes/blueeyes - Not a servlet; built on Netty.
("loosely inspired by ... Scalatra")
http://spray.cc/ - Built on Akka actors, Akka Mist. Servlet 3.0 or Jetty continuations
("spray was heavily inspired by BlueEyes and Scalatra.")
And at a lower level:
https://github.com/rschildmeijer/loft - "Continuation based non-blocking, asynchronous, single threaded web server."
Not production-ready, but rather interesting-looking. Continuations require the compiler plugin.
http://liftweb.net/ Indeed, a request starts off as a servlet, but then lift uses comet support found in many servlet containers to break away from the thread, keeping the request context (which the container then doesn't destroy) which then can be used to output data in actors.
http://akkasource.org also has support for rest, but it will block the thread until the actor finishes with its work

When should I use RequestFactory vs GWT-RPC?

I am trying to figure out if I should migrate my GWT-RPC calls to the new GWT 2.1 RequestFactory calls.
Google documentation vaguely mentions that RequestFactory is a better client-server communication method for "data-oriented services"
What I can distill from the documentation is that there is a new Proxy class that simplifies the communication (you don't pass back and forth the actual entity but just the proxy, so it is lighter weight and easier to manage)
Is that the whole point or am I missing something else in the big picture?
The big difference between GWT RPC and RequestFactory is that the RPC system is "RPC-by-concrete-type" while RequestFactory is "RPC-by-interface".
RPC is more convenient to get started with, because you write fewer lines of code and use the same class on both the client and the server. You might create a Person class with a bunch of getters and setters and maybe some simple business logic for further slicing-and-dicing of the data in the Person object. This works quite well until you wind up wanting to have server-specific, non-GWT-compatible, code inside your class. Because the RPC system is based on having the same concrete type on both the client and the server, you can hit a complexity wall based on the capabilities of your GWT client.
To get around the use of incompatible code, many users wind up creating a peer PersonDTO that shadows the real Person object used on the server. The PersonDTO just has a subset of the getters and setters of the server-side, "domain", Person object. Now you have to write code that marshalls data between the Person and PersonDTO object and all other object types that you want to pass to the client.
RequestFactory starts off by assuming that your domain objects aren't going to be GWT-compatible. You simply declare the properties that should be read and written by the client code in a Proxy interface, and the RequestFactory server components take care of marshaling the data and invoking your service methods. For applications that have a well-defined concept of "Entities" or "Objects with identity and version", the EntityProxy type is used to expose the persistent identity semantics of your data to the client code. Simple objects are mapped using the ValueProxy type.
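To illustrate what "RPC-by-interface" buys you, here is a sketch that uses only java.lang.reflect.Proxy. This is emphatically not RequestFactory's API or implementation (the real thing uses EntityProxy/ValueProxy interfaces and generated code); Person, PersonView and viewOf are hypothetical names. It only shows the idea that the client-facing contract is an interface listing the exposed properties, while the concrete domain class stays on the server.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyInterfaceSketch {
    // Server-side domain class; imagine it also contains code
    // that could never be compiled for the client.
    static final class Person {
        private final String name;
        private final String passwordHash; // deliberately not exposed
        Person(String name, String passwordHash) {
            this.name = name;
            this.passwordHash = passwordHash;
        }
        public String getName() { return name; }
        public String getPasswordHash() { return passwordHash; }
    }

    // Client-facing view: declares only the properties the client may read.
    interface PersonView {
        String getName();
    }

    // Builds a view whose getters are looked up by name on the domain object,
    // so no hand-written copying code is needed per property.
    static <T> T viewOf(Object domain, Class<T> viewType) {
        InvocationHandler handler = (proxy, method, args) ->
            domain.getClass()
                  .getMethod(method.getName(), method.getParameterTypes())
                  .invoke(domain, args);
        return viewType.cast(Proxy.newProxyInstance(
            viewType.getClassLoader(), new Class<?>[] { viewType }, handler));
    }

    static String demo() {
        PersonView view = viewOf(new Person("Ada", "s3cr3t"), PersonView.class);
        return view.getName();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // Ada
    }
}
```

Code holding a PersonView has no way to reach getPasswordHash, and Person itself never needs to be visible to it. That is the same separation the PersonDTO approach achieves, minus the hand-written marshalling.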
With RequestFactory, you pay an up-front startup cost to accommodate more complicated systems than GWT RPC easily supports. RequestFactory's ServiceLayer provides significantly more hooks to customize its behavior by adding ServiceLayerDecorator instances.
I went through a transition from RPC to RF. First I have to say my experience is limited in that I used zero EntityProxies.
Advantages of GWT RPC:
It's very easy to set-up, understand and to LEARN!
Same class-based objects are used on the client and on the server.
This approach saves tons of code.
Ideal, when the same model objects (and POJOS) are used on either client and server, POJOs == MODEL OBJECTs == DTOs
Easy to move stuff from the server to client.
Easy to share implementation of common logic between client and server (this can turn out as a critical disadvantage when you need a different logic).
Disadvantages of GWT RPC:
Impossible to have different implementation of some methods for server and client, e.g. you might need to use different logging framework on client and server, or different equals method.
REALLY BAD implementation that is not further extensible: most of the server functionality is implemented as static methods on a RPC class. THAT REALLY SUCKS.
e.g. It is impossible to add server-side errors obfuscation
Some security XSS concerns that are not quite elegantly solvable, see docs (I am not sure whether this is more elegant for RequestFactory)
Disadvantages of RequestFactory:
REALLY HARD to understand from the official doc what the merit of it is! It starts right at the completely misleading term PROXIES - these are actually DTOs of RF that are created by RF automatically. Proxies are defined by interfaces, e.g. @ProxyFor(Journal.class). The IDE checks whether corresponding methods exist on Journal. So much for the mapping.
RF will not do much for you in terms of commonalities of client and server because
On the client you need to convert "PROXIES" to your client domain objects and vice-versa. This is completely ridiculous. It could be done in few lines of code declaratively, but there's NO SUPPORT FOR THAT! If only we could map our domain objects to proxies more elegantly, something like JavaScript method JSON.stringify(..,,) is MISSING in RF toolbox.
Don't forget you are also responsible for setting transferable properties of your domain objects to proxies, and so on recursively.
POOR ERROR HANDLING on the server - stack traces are omitted by default on the server and you're getting empty, useless exceptions on the client. Even when I set a custom error handler, I was not able to get to the low-level stack traces! Terrible.
Some minor bugs in IDE support and elsewhere. I filed two bug requests that were accepted. Not an Einstein was needed to figure out that those were actually bugs.
DOCUMENTATION SUCKS. As I mentioned proxies should be better explained, the term is MISLEADING. For the basic common problems, that I was solving, DOCS IS USELESS. Another example of misunderstanding from the DOC is connection of JPA annotations to RF. It looks from the succinct docs that they kinda play together, and yes, there is a corresponding question on StackOverflow. I recommend to forget any JPA 'connection' before understanding RF.
Advantages of RequestFactory
Excellent forum support.
IDE support is pretty good (but is not an advantage in contrast with RPC)
Flexibility of your client and server implementation (loose coupling)
Fancy stuff, connected to EntityProxies, beyond simple DTOs - caching, partial updates, very useful for mobile.
You can use ValueProxies as the simplest replacement for DTOs (but you have to do all not so fancy conversions yourself).
Support for Bean Validations JSR-303.
Considering other disadvantages of GWT in general:
Impossible to run integration tests (GWT client code + remote server) with provided JUnit support <= all JSNI has to be mocked (e.g. localStorage), SOP is an issue.
No support for testing setup - headless browser + remote server <= no simple headless testing for GWT, SOP.
Yes, it is possible to run selenium integration tests (but that's not what I want)
JSNI is very powerful, but at those shiny talks they give at conferences they do not talk much about the fact that writing JSNI code also has some rules. Again, figuring out how to write a simple callback was a task worthy of a true researcher.
In summary, the transition from GWT RPC to RequestFactory is far from a WIN-WIN situation when RPC mostly fits your needs. You end up writing tons of conversions from client domain objects to proxies and vice-versa. But you get some flexibility and robustness in your solution. And support on the forum is excellent, on Saturday as well!
Considering all advantages and disadvantages I just mentioned, it pays really well to think in advance whether any of these approaches actually brings improvement to your solution and to your development set-up without big trade-offs.
I find the idea of creating Proxy classes for all my entities quite annoying. My Hibernate/JPA pojos are auto-generated from the database model. Why do I now need to create a second mirror of those for RPC? We have a nice "estivation" framework that takes care of "de-hibernating" the pojos.
Also, the idea of defining service interfaces that don't quite implement the server side service as a java contract but do implement the methods - sounds very J2EE 1.x/2.x to me.
Unlike RequestFactory which has poor error handling and testing capabilities (since it processes most of the stuff under the hood of GWT), RPC allows you to use a more service oriented approach. RequestFactory implements a more modern dependency injection styled approach that can provide a useful approach if you need to invoke complex polymorphic data structures. When using RPC your data structures will need to be more flat, as this will allow your marshaling utilities to translate between your json/xml and java models. Using RPC also allows you to implement more robust architecture, as quoted from the gwt dev section on Google's website.
"Simple Client/Server Deployment
The first and most straightforward way to think of service definitions is to treat them as your application's entire back end. From this perspective, client-side code is your "front end" and all service code that runs on the server is "back end." If you take this approach, your service implementations would tend to be more general-purpose APIs that are not tightly coupled to one specific application. Your service definitions would likely directly access databases through JDBC or Hibernate or even files in the server's file system. For many applications, this view is appropriate, and it can be very efficient because it reduces the number of tiers.
Multi-Tier Deployment
In more complex, multi-tiered architectures, your GWT service definitions could simply be lightweight gateways that call through to back-end server environments such as J2EE servers. From this perspective, your services can be viewed as the "server half" of your application's user interface. Instead of being general-purpose, services are created for the specific needs of your user interface. Your services become the "front end" to the "back end" classes that are written by stitching together calls to a more general-purpose back-end layer of services, implemented, for example, as a cluster of J2EE servers. This kind of architecture is appropriate if you require your back-end services to run on a physically separate computer from your HTTP server."
Also note that setting up a single RequestFactory service requires creating around 6 or so Java classes, whereas RPC only requires 3. More code == more errors and complexity in my book.
RequestFactory also has a little bit more overhead during the request processing, as it has to marshal serialization between the data proxies and actual java models. This added interface adds extra processing cycles which can really add up in an enterprise or production environment.
I also do not believe that RequestFactory services are serialization like RPC services.
All in all, after using both for some time now, I always go with RPC as it's more lightweight, easier to test and debug, and faster than using RequestFactory. RequestFactory might be more elegant and extensible than its RPC counterpart, but the added complexity does not necessarily make it a better tool.
My opinion is that the best architecture is to use two web apps, one client and one server. The server is a simple, lightweight, generic Java webapp that uses the servlet.jar library. The client is GWT. You make RESTful requests via GWT-RPC into the server side of the client web application. The server side of the client is just a pass-through to Apache HTTP client, which uses a persistent tunnel into the request handler you have running as a single servlet in your server servlet web application. The servlet web application should contain your database application layer (Hibernate, Cayenne, SQL, etc.). This allows you to fully divorce the database object models from the actual client, providing a much more extensible and robust way to develop and unit test your application. Granted, it requires a tad bit of initial setup time, but in the end it allows you to create a dynamic request factory sitting outside of GWT. This allows you to leverage the best of both worlds. Not to mention being able to test and make changes to your server side without having to have the GWT client compiled or built.
I think it's really helpful if you have a heavy pojo on the client side, for example if you use Hibernate or JPA entities.
We adopted another solution, using a Django style persistence framework with very light entities.
The only caveat I would put in is that RequestFactory uses the binary data transport (deRPC maybe?) and not the normal GWT-RPC.
This only matters if you are doing heavy testing with SyncProxy, Jmeter, Fiddler, or any similar tool that can read/evaluate the contents of the HTTP request/response (like GWT-RPC), but would be more challenging with deRPC or RequestFactory.
We have a very large implementation of GWT-RPC in our project.
Actually we have 50 Service interfaces with many methods each, and we have problems with the size of the TypeSerializers generated by the compiler, which makes our JS code huge.
So we are considering moving towards RequestFactory.
I have spent a couple of days reading and digging around the web, trying to find out what other people are doing.
The most important drawback I saw, and maybe I could be wrong, is that with RequestFactory you are no longer in control of the communication between your server domain objects and your client ones.
What we need is to apply the load / save pattern in a controlled way. I mean, for example, the client receives the whole object graph belonging to a specific transaction, does its updates and then sends the whole graph back to the server. The server is responsible for doing validation, comparing old with new values, and doing persistence. If 2 users from different sites get the same transaction and make some updates, the resulting transaction shouldn't be the merged one. One of the updates should fail in my scenario.
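The "one of the two updates must fail" requirement is essentially an optimistic version check on save. Here is a minimal sketch of that rule in plain Java, independent of both GWT-RPC and RequestFactory; the Transaction and Store classes, the version field and the in-memory map are all assumptions for illustration, not anything from our project.

```java
import java.util.HashMap;
import java.util.Map;

class VersionedSaveSketch {
    // Hypothetical whole-object payload sent to the client and back.
    static final class Transaction {
        final long id;
        final int version;    // version the client loaded
        final String payload;
        Transaction(long id, int version, String payload) {
            this.id = id;
            this.version = version;
            this.payload = payload;
        }
    }

    // In-memory stand-in for the server-side persistence layer.
    static final class Store {
        private final Map<Long, Transaction> rows = new HashMap<>();

        synchronized void insert(Transaction t) { rows.put(t.id, t); }

        synchronized Transaction load(long id) { return rows.get(id); }

        // Accepts the edit only if it was based on the latest version;
        // a stale edit is rejected instead of being merged.
        synchronized boolean save(Transaction edited) {
            Transaction current = rows.get(edited.id);
            if (current == null || current.version != edited.version) {
                return false;
            }
            rows.put(edited.id,
                new Transaction(edited.id, edited.version + 1, edited.payload));
            return true;
        }
    }

    // Two users load the same version; only the first save wins.
    static String demo() {
        Store store = new Store();
        store.insert(new Transaction(1L, 0, "original"));
        Transaction seenByA = store.load(1L);
        Transaction seenByB = store.load(1L);
        boolean first = store.save(new Transaction(1L, seenByA.version, "edit by A"));
        boolean second = store.save(new Transaction(1L, seenByB.version, "edit by B"));
        return first + "/" + second + "/" + store.load(1L).payload;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true/false/edit by A
    }
}
```

Whichever transport is used, the server needs the version the client started from to come back with the edited object graph.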
I don't see that RequestFactory helps supporting this kind of processing.
Regards
Daniel
Is it fair to say that when considering a limited MIS application, say with 10-20 CRUD'able business objects, and each with ~1-10 properties, that really it's down to personal preference which route to go with?
If so, then perhaps projecting how your application is going to scale could be the key in choosing your route GWT RPC or RequestFactory:
My application is expected to stay with that relatively limited number of entities but will massively increase in terms of their numbers. 10-20 objects * 100,000 records.
My application is going to increase significantly in the breadth of entities but the relative numbers involved of each will remain low. 5000 objects * 100 records.
My application is expected to stay with that relatively limited number of entities AND will stay in relatively low numbers of e.g. 10-20 objects * 100 records
In my case, I'm at the very starting point of trying to make this decision. Further complicated by having to change UI client side architecture as well as making the transport choice. My previous (significantly) large scale GWT UI used the Hmvc4Gwt library, which has been superseded by the GWT MVP facilities.