Play framework to build application with no UI and need to accept requests using REST and ipc and/or message queues - scala

I have to build a component that runs in a jvm, uses MongoDB as database and doesn't need a UI. It will be integrated into other products. I'm planning to build this using scala and related tools.
My first thoughts are to just let it expose REST API and let other products integrate using the API. While this is acceptable for some products, it isn't for others due to performance reasons. So I have to enable other components to communicate to this using either http or ipc or message queues. How can achieve this without much duplication of business logic.
Would Play framework be the right choice for this even though there is no UI involved and there is a need to accept messages via http or ipc or message queues?

Using Play for that is ok but there are frameworks better fit for what you are planning to do, as you already said, play has a lot of support for frontend features you don't need.
It will not so much affect the runtime speed as the time you will need for programming, compiling, building and deployment.
There are some framworks that might fit you needs better:
Scalatra Nice, easy to use, integrates good with JavaEE-Stack http://www.scalatra.org/
Finatra Cool if you have the twitter stack running. Metrics and other stuff almost for free http://finatra.info/
Skinny Framework : Looks nice, never tried myself
Spray : Cool features to come, a little elitist

Related

Constructing a back-end suitable for app and web interface

Let's suppose I was going to design a platform like Airbnb. They have a website as well as native apps on various mobile platforms.
I've been researching app design, and from what I've gathered, the most effective way to do this is to build an API for the back-end, like a REST API using something like node.js, and SQL or mongoDB. The font-end would then be developed natively on each platform which makes calls to the API endpoints to display and update data. This design sounds like it works great for mobile development, but what would be the best way to construct a website that uses the same API?
There are three approaches I can think of:
Use something completely client-side like AangularJS to create a single-page application front end which ties directly into the REST API back-end. This seems OK, but I don't really like the idea of a single-page application and would prefer a more traditional approach
Create a normal web application (in PHP, python, node.js, etc), but rather than tying the data to a typical back end like mySQL, it would basically act as an interface to the REST API. For example when you visit www.example.com/video/3 the server would then call the corresponding REST endpoint (ie api.example.com/video/3/show) and render the HTML for the user. This seems like kind of a messy approach, especially since most web frameworks are designed to work with a SQL backend.
Tie the web interface in directly in with the REST api. For example, The endpoint example.com/video/3/show can return both html or json depending on the HTTP headers. The advantage is that you can share most of your code, however the code would become more complex and you can't decouple your web interface from the API.
What is the best approach for this situation? Do you choose to completely decouple the web application from the REST API? If so, how do you elegantly interface between the two? Or do you choose to merge the REST API and web interface into one code base?
It's a usually a prefered way but one should have a good command of SPA.
Adds a redundant layer from performance perspective. You will basically make twice more requests all the time.
This might work with super simple UI, when it's just a matter of serializing your REST API result into different formats but I believe you want rich UI and going this way will be a nightmare from both implementation and maintainance perspective.
SUGGESTED SOLUTION:
Extract your core logic. Put it into a separate project/assembly and reuse it both in your REST API and UI. This way you will be able to reuse the business logic which is the same both for UI and REST API and keep the representation stuff separately which is different for UI and REST API.
Hope it helps!
Both the first and the second option seem reasonable to me, in the sense that there are certain advantages in decoupling the backend API from the clients (including your web site). For example, you could have dedicated teams per each project, if there's a bug on the web/api you'd only have to release that project, and not both.
Say you're going public with your API. If you're releasing a version that breaks backwards compatibility, with a decoupled web app you'd be able to detect that earlier (say staging environment, given you're developing both in-house). However, if they were tightly coupled they'd probably work just fine, and you'll find out you've broken the other clients only once you release in production.
I would say the first option is preferable one as a generic approach. SPA first load delay problem can be resolved with server side rendering technique.
For second option you will have to face scalability, cpu performance, user session(not on rest api of course because should be stateless), caching issues both on your rest api services and normal website node instances (maybe caching not in all the cases). In most of the cases this intermediate backend layer is just unnecessary, there is not any technical limitation for doing all the stuff in the recent versions of browsers.
The third option violates the separation of concerns, in your case presentational from data models/bussines logic.

Adapter Proxy for Restful APIs

this is a general 'what technologies are available' question.
My company provides a web application with a RESTful API. However, it is too slow for my needs and some of the results are in an awkward format.
I want to wrap their restful server with a proxy/adapter server, so when you connect to the proxy you get the RESTful API I wish the real one provides.
So it needs to do a few things:
passthrough most requests
cache some requests
do some extra requests on the original server to detect if a request is cacheable
for instance: there is a request for a field in a record: GET /records/id/field which might be slow, but there is a fingerprint request GET /records/id/fingerprint which is always fast. If there exists a cache of GET /records/1/field2 for the fingerprint feedbeef, then I need to check the original server still has the fingerprint feed beef before serving the cached version.
fix headers for some responses - e.g. content-type, based upon the path
do stream processing on some large content, for instance
GET /records/id/attachments/1234
returns a 100Mb log file in text format
remove null characters from files
optionally recode the log to filter out irrelevant lines, reducing the load on the client
cache the filtered version for later requests.
While I could modify the client to achieve this functionality, such code would not be re-usable for other clients (different languages), and complicates the client logic.
I had a look at whether clojure/ring could do it, and while there is a nice little proxy middleware for it, it doesn't handle streaming content as far as I can tell - the whole 100Mb would have to be downloaded. Also it doesn't include any cache logic yet.
I took a look at whether squid could do it, but I'm not familiar with the technology, and it seems mostly concerned with passing through requests rather than modifying them on the fly.
I'm looking for hints where I might find the correct technology to implement this. I'm mostly language agnostic if learning a new language gets me access to a really simple way to do it.
I believe you should choose a platform that is easier for you to implement your custom business logic on. The following web application frameworks provide easy connectivity with REST APIs, and allow you to create a web application that could work as a REST proxy:
Play framework (Java + Scala)
express + Node.js (Javascript)
Sinatra (Ruby)
I'm more familiar with Play, of which I know it provides utilities for caching you could find useful, and is also extendable by a number of plugins.
If you are familiar with Scala, you could have a also have a look at Finagle. It is a framework build be Twitter's infrastructure team to provide protocol-agnostic connectivity. It might be an overkill for REST to REST proxy, but it provides abstractions you might find useful.
You could also look at some 3rd party services like Apitools, which allows to create a proxy programmatically (in lua). Apirise is a similar service (of which I'm a co-founder) that intends to do provide similar functionalities with a user-friendly UI.
Beeceptor does exactly what you want. It plugs in-between your web-app and original API to route requests.
For your use-case of caching a few responses, you can create a rule. That way it shall not hit the original endpoint.
The requests to original APIs can be mocked, and you can inspect response
You can simulate delays.
(Note: it is a shameless plug, I am the author of Beeceptor and thought it should help you and other developers.)
https://github.com/nodejitsu/node-http-proxy is looking useful - although I don't yet know if it can stream process for transcoding.

Using Akka to seperate a Lucene service form a website

I'm developing a website with a Lucene backend. Lucene connects directly to index files, making it difficult to develop the website from machines other than the index machine. Traditional databases have a server running to provide an intermediary between the raw data and the application.
I would like to create such an intermediary between Lucene and my web application. On first thought, Akka seems like the right tool, and I think I would use Akka futures or typed actors to perform the call. However, the Akka Typed Actors page warns:
"A bit more background: TypedActors can very easily be abused as RPC, and that is an abstraction which is well-known to be leaky. Hence TypedActors are not what we think of first when we talk about making highly scalable concurrent software easier to write correctly. They have their niche, use them sparingly."
I think the point is that RPC promotes centralization, but is my plan a good one or an abuse of Akka?
Why not use solr? It provides the application to manage your lucene indexes (as it is basically lucene with an application over the top to interact with the data. It would be easier than dealing with actors and it should provide everything you need.

When should I use RequestFactory vs GWT-RPC?

I am trying to figure out if I should migrate my gwt-rpc calls to the new GWT2.1 RequestFactory cals.
Google documentation vaguely mentions that RequestFactory is a better client-server communication method for "data-oriented services"
What I can distill from the documentation is that there is a new Proxy class that simplifies the communication (you don't pass back and forth the actual entity but just the proxy, so it is lighter weight and easier to manage)
Is that the whole point or am I missing something else in the big picture?
The big difference between GWT RPC and RequestFactory is that the RPC system is "RPC-by-concrete-type" while RequestFactory is "RPC-by-interface".
RPC is more convenient to get started with, because you write fewer lines of code and use the same class on both the client and the server. You might create a Person class with a bunch of getters and setters and maybe some simple business logic for further slicing-and-dicing of the data in the Person object. This works quite well until you wind up wanting to have server-specific, non-GWT-compatible, code inside your class. Because the RPC system is based on having the same concrete type on both the client and the server, you can hit a complexity wall based on the capabilities of your GWT client.
To get around the use of incompatible code, many users wind up creating a peer PersonDTO that shadows the real Person object used on the server. The PersonDTO just has a subset of the getters and setters of the server-side, "domain", Person object. Now you have to write code that marshalls data between the Person and PersonDTO object and all other object types that you want to pass to the client.
RequestFactory starts off by assuming that your domain objects aren't going to be GWT-compatible. You simply declare the properties that should be read and written by the client code in a Proxy interface, and the RequestFactory server components take care of marshaling the data and invoking your service methods. For applications that have a well-defined concept of "Entities" or "Objects with identity and version", the EntityProxy type is used to expose the persistent identity semantics of your data to the client code. Simple objects are mapped using the ValueProxy type.
With RequestFactory, you pay an up-front startup cost to accommodate more complicated systems than GWT RPC easily supports. RequestFactory's ServiceLayer provides significantly more hooks to customize its behavior by adding ServiceLayerDecorator instances.
I went through a transition from RPC to RF. First I have to say my experience is limited in that, I used as many EntityProxies as 0.
Advantages of GWT RPC:
It's very easy to set-up, understand and to LEARN!
Same class-based objects are used on the client and on the server.
This approach saves tons of code.
Ideal, when the same model objects (and POJOS) are used on either client and server, POJOs == MODEL OBJECTs == DTOs
Easy to move stuff from the server to client.
Easy to share implementation of common logic between client and server (this can turn out as a critical disadvantage when you need a different logic).
Disadvatages of GWT RPC:
Impossible to have different implementation of some methods for server and client, e.g. you might need to use different logging framework on client and server, or different equals method.
REALLY BAD implementation that is not further extensible: most of the server functionality is implemented as static methods on a RPC class. THAT REALLY SUCKS.
e.g. It is impossible to add server-side errors obfuscation
Some security XSS concerns that are not quite elegantly solvable, see docs (I am not sure whether this is more elegant for RequestFactory)
Disadvantages of RequestFactory:
REALLY HARD to understand from the official doc, what's the merit of it! It starts right at completely misleading term PROXIES - these are actually DTOs of RF that are created by RF automatically. Proxies are defined by interfaces, e.g. #ProxyFor(Journal.class). IDE checks if there exists corresponding methods on Journal. So much for the mapping.
RF will not do much for you in terms of commonalities of client and server because
On the client you need to convert "PROXIES" to your client domain objects and vice-versa. This is completely ridiculous. It could be done in few lines of code declaratively, but there's NO SUPPORT FOR THAT! If only we could map our domain objects to proxies more elegantly, something like JavaScript method JSON.stringify(..,,) is MISSING in RF toolbox.
Don't forget you are also responsible for setting transferable properties of your domain objects to proxies, and so on recursively.
POOR ERROR HANDLING on the server and - Stack-traces are omitted by default on the server and you re getting empty useless exceptions on the client. Even when I set custom error handler, I was not able to get to low-level stack traces! Terrible.
Some minor bugs in IDE support and elsewhere. I filed two bug requests that were accepted. Not an Einstein was needed to figure out that those were actually bugs.
DOCUMENTATION SUCKS. As I mentioned proxies should be better explained, the term is MISLEADING. For the basic common problems, that I was solving, DOCS IS USELESS. Another example of misunderstanding from the DOC is connection of JPA annotations to RF. It looks from the succinct docs that they kinda play together, and yes, there is a corresponding question on StackOverflow. I recommend to forget any JPA 'connection' before understanding RF.
Advantages of RequestFactory
Excellent forum support.
IDE support is pretty good (but is not an advantage in contrast with RPC)
Flexibility of your client and server implementation (loose coupling)
Fancy stuff, connected to EntityProxies, beyond simple DTOs - caching, partial updates, very useful for mobile.
You can use ValueProxies as the simplest replacement for DTOs (but you have to do all not so fancy conversions yourself).
Support for Bean Validations JSR-303.
Considering other disadvantages of GWT in general:
Impossible to run integration tests (GWT client code + remote server) with provided JUnit support <= all JSNI has to be mocked (e.g. localStorage), SOP is an issue.
No support for testing setup - headless browser + remote server <= no simple headless testing for GWT, SOP.
Yes, it is possible to run selenium integration tests (but that's not what I want)
JSNI is very powerful, but at those shiny talks they give at conferences they do not talk much about that writing JSNI codes has some also some rules. Again, figuring out how to write a simple callback was a task worth of true researcher.
In summary, transition from GWT RPC to RequestFactory is far from WIN-WIN situation,
when RPC mostly fits your needs. You end up writing tons conversions from client domain objects to proxies and vice-versa. But you get some flexibility and robustness of your solution. And support on the forum is excellent, on Saturday as well!
Considering all advantages and disadvantages I just mentioned, it pays really well to think in advance whether any of these approaches actually brings improvement to your solution and to your development set-up without big trade-offs.
I find the idea of creating Proxy classes for all my entities quite annoying. My Hibernate/JPA pojos are auto-generated from the database model. Why do I now need to create a second mirror of those for RPC? We have a nice "estivation" framework that takes care of "de-hibernating" the pojos.
Also, the idea of defining service interfaces that don't quite implement the server side service as a java contract but do implement the methods - sounds very J2EE 1.x/2.x to me.
Unlike RequestFactory which has poor error handling and testing capabilities (since it processes most of the stuff under the hood of GWT), RPC allows you to use a more service oriented approach. RequestFactory implements a more modern dependency injection styled approach that can provide a useful approach if you need to invoke complex polymorphic data structures. When using RPC your data structures will need to be more flat, as this will allow your marshaling utilities to translate between your json/xml and java models. Using RPC also allows you to implement more robust architecture, as quoted from the gwt dev section on Google's website.
"Simple Client/Server Deployment
The first and most straightforward way to think of service definitions is to treat them as your application's entire back end. From this perspective, client-side code is your "front end" and all service code that runs on the server is "back end." If you take this approach, your service implementations would tend to be more general-purpose APIs that are not tightly coupled to one specific application. Your service definitions would likely directly access databases through JDBC or Hibernate or even files in the server's file system. For many applications, this view is appropriate, and it can be very efficient because it reduces the number of tiers.
Multi-Tier Deployment
In more complex, multi-tiered architectures, your GWT service definitions could simply be lightweight gateways that call through to back-end server environments such as J2EE servers. From this perspective, your services can be viewed as the "server half" of your application's user interface. Instead of being general-purpose, services are created for the specific needs of your user interface. Your services become the "front end" to the "back end" classes that are written by stitching together calls to a more general-purpose back-end layer of services, implemented, for example, as a cluster of J2EE servers. This kind of architecture is appropriate if you require your back-end services to run on a physically separate computer from your HTTP server."
Also note that setting up a single RequestFactory service requires creating around 6 or so java classes where as RPC only requires 3. More code == more errors and complexity in my book.
RequestFactory also has a little bit more overhead during the request processing, as it has to marshal serialization between the data proxies and actual java models. This added interface adds extra processing cycles which can really add up in an enterprise or production environment.
I also do not believe that RequestFactory services are serialization like RPC services.
All in all after using both for some time now, i always go with RPC as its more lightweight, easier to test and debug, and faster then using a RequestFactory. Although RequestFactory might be more elegant and extensible then its RPC counter part. The added complexity does not make it a better tool necessary.
My opinion is that the best architecture is to use two web apps , one client and one server. The server is a simple lightweight generic java webapp that uses the servlet.jar library. The client is GWT. You make RESTful request via GWT-RPC into the server side of the client web application. The server side of the client is just a pass though to apache http client which uses a persistant tunnel into the request handler you have running as a single servlet in your server servlet web application. The servlet web application should contain your database application layer (hibernate, cayenne, sql etc..) This allows you to fully divorce the database object models from the actual client providing a much more extensible and robust way to develop and unit test your application. Granted it requires a tad bit of initial setup time, but in the end allows you to create a dynamic request factory sitting outside of GWT. This allows you to leverage the best of both worlds. Not to mention being able to test and make changes to your server side without having to have the gwt client compiled or build.
I think it's really helpful if you have a heavy pojo on the client side, for example if you use Hibernate or JPA entities.
We adopted another solution, using a Django style persistence framework with very light entities.
The only caveat I would put in is that RequestFactory uses the binary data transport (deRPC maybe?) and not the normal GWT-RPC.
This only matters if you are doing heavy testing with SyncProxy, Jmeter, Fiddler, or any similar tool that can read/evaluate the contents of the HTTP request/response (like GWT-RPC), but would be more challenging with deRPC or RequestFactory.
We have have a very large implementation of GWT-RPC in our project.
Actually we have 50 Service interfaces with many methods each, and we have problems with the size of TypeSerializers generated by the compiler that turns our JS code huge.
So we are analizing to move towards RequestFactory.
I have been read for a couple of days digging into the web and trying to find what other people are doing.
The most important drawback I saw, and maybe I could be wrong, is that with RequestFactory your are no longer in control of the communication between your Server Domain objects and your client ones.
What we need is apply the load / save pattern in a controlled way. I mean, for example client receive the whole object graph of objects belonging to a specific transaction, do his updates and them send the whole back to the server. The server will be responsible for doing validation, compare old with new values and do persistance. If 2 users from different sites gets the same transaction and do some updates, the resulting transaction shouldn't be the merged one. One of the updates should fail in my scenario.
I don't see that RequestFactory helps supporting this kind of processing.
Regards
Daniel
Is it fair to say that when considering a limited MIS application, say with 10-20 CRUD'able business objects, and each with ~1-10 properties, that really it's down to personal preference which route to go with?
If so, then perhaps projecting how your application is going to scale could be the key in choosing your route GWT RPC or RequestFactory:
My application is expected to stay with that relatively limited number of entities but will massively increase in terms of their numbers. 10-20 objects * 100,000 records.
My application is going to increase significantly in the breadth of entities but the relative numbers involved of each will remain low. 5000 objects * 100 records.
My application is expected to stay with that relatively limited number of entities AND will stay in relatively low numbers of e.g. 10-20 objects * 100 records
In my case, I'm at the very starting point of trying to make this decision. Further complicated by having to change UI client side architecture as well as making the transport choice. My previous (significantly) large scale GWT UI used the Hmvc4Gwt library, which has been superseded by the GWT MVP facilities.

Need opinion regarding design/architecture of a web application

I am working on a web application which needs to get data from some local and some non local resources and then display it. As it could take arbitrary amount of time to get the data from these resources I am thinking of using the actors concept so that each actor is responsible for getting data from the respective resource. The request thread will wait for each actor to finish its task and then use ajax to update only the portion of the web page that is dependent on that data. This way user will start seeing the data as soon as it is received rather than wait for all of them to finish and then get a first look at the data.
I am planning to look into scala/lift framework for this. I have read some articles on the web for scala/lift and want to explore if this is the correct way to approach this problem and also if scala/lift is good platform of choice. I have worked in Java and C# previously. Any opinions, comments, suggestions are welcome.
Thanks,
gary
Take a look a message queue technology like Java's JMS. Message queues allow you to handle long running background tasks asynchronously and reliably. This is the technique sites like Flickr and YouTube use to do media transcoding asynchronously. You can use a Java EE server, or a JMS technology like Apache's ActiveMQ, and then layer your Scala/Lift code on top of it.
Richard Monson-Haefel's book on JMS covers it well.
For more general help with web site scaling and construction, take a look at Todd Hoff's excellent blog, highscalability.com/. There are some good pointers to using message queues to offload long-running tasks this way.
BTW, Twitter uses Scala for something much like what you're considering. Here's an interview with some of their developers; they describe one way they use Scala:
Robey Pointer: A lot of our architecture is based on letting Rails do what it does best, which is the AJAX, the web front ends, the website—what the user sees. Anything we can offload out of the request/response cycle, we do. So we queue those tasks into a messaging system and have back-end daemons handle them.
If the non-local resources originate at some other service or system, Event Driven Architecture might work for you. Instead of pulling from the non-local resources you could set up this web-application as a subscriber to the events published by these services. Upon receiving a message regarding part of its functionality it would cache locally the data it's interested in. This should let you escape the issue of asynchronous update of parts of the page (all data would be accessible locally).
Udi Dahan blogs about this approach a lot and is also an author of a .NET message bus (NServiceBus) that can be used in such scenarios. See for example http://msdn.microsoft.com/en-us/architecture/aa699424.aspx
Actors would be a way to go. You're essentially setting up a light weight version of JMS. And Lift does the comet stuff very well.
In addition to the Scala actors, and the Lift Actors, you also have akka actors. When Scala Swarm becomes production ready you'll be ready for that too.
If the delayed information is distinct from the information that needs to display immediately, with Lift you can use a LazyLoad snippet that has the longer-running computation (calling the web service) as part of its logic. Lift will take care of inserting it into the page when its ready.