Easiest form of process communication in Scala

What I want to do: I want to add communication capabilities to a couple of applications (soon to be JAR libraries for Java) written in Scala, and I want to do it in the most painless way possible: no Tomcat, WARs, paths for GET requests, RPC servers, etc.
What I have done: I've been checking a number of libraries, like Jetty, JAX-RS, Jackson, etc. But then I see the examples, and they usually involve many different folders for configuration, WSDL files, and so on. Most of the examples lack a main method, and I don't have a clear picture of how many additional requirements they may have (e.g. Tomcat).
What I am planning to do: I'm considering simply opening a socket on the "server" to listen, connecting from the "client", and transferring some JSON in both directions. This should be fairly standard, so that I can use other programming languages in compatible ways (e.g. Python).
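Roughly, this is the kind of thing I have in mind; a minimal sketch with a placeholder port and payloads, not production code:

    // Sketch: plain JDK sockets, one newline-delimited JSON message
    // per connection. Port 9000 and the payloads are placeholders.
    import java.io.PrintWriter
    import java.net.ServerSocket
    import scala.io.Source

    object Server extends App {
      val server = new ServerSocket(9000)
      while (true) {
        val sock = server.accept()
        val request = Source.fromInputStream(sock.getInputStream, "UTF-8").getLines().next()
        println(s"received: $request")
        new PrintWriter(sock.getOutputStream, true).println("""{"status":"ok"}""")
        sock.close()
      }
    }

The client would be the mirror image: open a java.net.Socket, write a line of JSON, read a line back.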
What I am asking: I would like to know whether there is some library that makes this easier. Not necessarily using raw sockets, but setting up some process communication in just a few lines, maybe not as simple as Node.js, but something similar.
Bonus: It would be cool to
be able to use other programming languages (e.g. Python) by using open standards
have authentication
But I don't really need any of those at this point.

I think you need an RPC client/server system; I would suggest taking one of these:
Finagle - a super flexible and powerful RPC client/server library from Twitter. You can define your service with Thrift, and it will generate client/server stubs in Scala; with Thrift it should also be straightforward to add Python support. (A minimal HTTP example is sketched after this list.)
Spray - a much smaller library, focused on creating REST services. It's not as powerful as Finagle, but it's much easier, and REST allows you to use clients written in any language.
Remotely - "an elegant RPC system for reasonable people". An interesting and very promising project, though it may be difficult to start with because of its extensive use of Scalaz, Shapeless, and macros.
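To give a feel for the first option, here's a minimal Finagle HTTP echo service; a sketch that assumes the finagle-http artifact is on the classpath (the Thrift route would generate its stubs from an IDL instead):

    // A tiny Finagle HTTP service that echoes the request body
    // (e.g. a JSON payload) straight back to the caller.
    import com.twitter.finagle.{Http, Service}
    import com.twitter.finagle.http.{Request, Response, Status}
    import com.twitter.util.{Await, Future}

    object EchoServer extends App {
      val service = new Service[Request, Response] {
        def apply(req: Request): Future[Response] = {
          val rep = Response(req.version, Status.Ok)
          rep.contentString = req.contentString // echo the JSON back
          Future.value(rep)
        }
      }
      Await.ready(Http.serve(":8080", service)) // serve until killed
    }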

Honestly, if you want something that is cross-language, simple, straightforward, and concise, then you do not want to use plain old sockets!
Check out Dropwizard. It is amazing, and I use it for small and large projects alike! It usually needs no more than a single configuration file. It supports authentication too!
Out of the box it gives you really great inter-process communication over JSON (using Jackson) and much, much more. There is also pretty decent Scala support for Dropwizard.
If you must roll your own, then I'd recommend using Jackson for JSON parsing. It's super simple to use and also has great Scala support.
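For instance, round-tripping a case class looks roughly like this; a sketch assuming the jackson-module-scala dependency, with Message as an illustrative type:

    // JSON (de)serialization with Jackson's Scala module.
    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.module.scala.DefaultScalaModule

    case class Message(kind: String, body: String)

    object JsonDemo extends App {
      val mapper = new ObjectMapper()
      mapper.registerModule(DefaultScalaModule) // enables case class support

      val json = mapper.writeValueAsString(Message("greeting", "hello"))
      println(json) // {"kind":"greeting","body":"hello"}

      val parsed = mapper.readValue(json, classOf[Message])
      println(parsed) // Message(greeting,hello)
    }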

If you've got a "controlled" use case where the client and server are on the same LAN and deployed in tandem, I'd (controversially) recommend Java RMI; it's dumb and JVM-specific (and uses a Java-specific protocol), but it's very simple to use.
If you need something more robust and cross-language, I'd recommend Apache Thrift. You write your interfaces in a platform-independent interface definition language, and it's very clear which changes are compatible and which are not; the thrift compiler generates skeleton interfaces for you to use, and then you just write an implementation of that interface and a couple of lines to start the server (as you can see from the example on the homepage). It's also got good support for async implementations if you need the performance. Thrift itself is reasonably standard and cross-platform, with its own binary protocol, or you can use JSON as a transport if you really want to (I'd recommend against that though).
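To show how little server code that amounts to, here's a sketch assuming thrift --gen java has produced a hypothetical one-method PingService from your IDL:

    // Serving a Thrift-generated service from Scala.
    import org.apache.thrift.server.{TServer, TSimpleServer}
    import org.apache.thrift.transport.TServerSocket

    object PingServer extends App {
      class Handler extends PingService.Iface {
        override def ping(msg: String): String = s"pong: $msg"
      }

      val transport = new TServerSocket(9090)
      val processor = new PingService.Processor(new Handler)
      val server = new TSimpleServer(new TServer.Args(transport).processor(processor))
      server.serve() // single-threaded; prefer TThreadPoolServer in production
    }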

RabbitMQ provides one easy way to do what you want without writing a server and implementing your own persistence, flow control, authentication, etc. You can brew or apt-get install it.
You start up a broker daemon process (i.e. a process that manages message queues).
In the Scala producer, you can use the Java client API (available via Maven) to send JSON strings to named queues without any fuss (e.g. no definition languages).
Then, in your other Scala program, connect to the broker, listen for messages on the queue, and parse the incoming JSON (see the sketch after this list).
Because it is so popular, there are many tutorials online for different patterns you may want to use to distribute the messages, e.g. pub/sub, one-to-one, exactly-once delivery, etc.
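Both sides are only a few lines with the official Java client (com.rabbitmq:amqp-client); the queue name "tasks" below is illustrative:

    // Publish and consume JSON strings via a RabbitMQ broker on localhost.
    import com.rabbitmq.client.{AMQP, ConnectionFactory, DefaultConsumer, Envelope}

    object RabbitDemo extends App {
      val factory = new ConnectionFactory()
      factory.setHost("localhost")
      val connection = factory.newConnection()
      val channel = connection.createChannel()
      channel.queueDeclare("tasks", false, false, false, null)

      // Producer side: send a JSON string, no definition language needed.
      channel.basicPublish("", "tasks", null, """{"msg":"hello"}""".getBytes("UTF-8"))

      // Consumer side: listen on the queue and handle incoming JSON.
      val consumer = new DefaultConsumer(channel) {
        override def handleDelivery(tag: String, env: Envelope,
                                    props: AMQP.BasicProperties,
                                    body: Array[Byte]): Unit =
          println(s"got: ${new String(body, "UTF-8")}")
      }
      channel.basicConsume("tasks", true, consumer)
    }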

Related

REST APIs in Go - using net/http vs. a library like Gorilla

I see that Go itself has the net/http package, which is adequate for providing everything you need to get your own REST APIs up and running. However, there are also a variety of frameworks; the most popular is perhaps Gorilla.
Considering that one of the main things I need to do going forward is to build REST APIs that access some back-end storage (databases, caches, etc.) to perform CRUD operations, should I go with Go's standard library itself, or should I consider using a framework?
Normally, people write a new library or framework to solve a problem present in an existing one. But a lot of frameworks also tend to make things worse when the actual demands are simple.
So I have a few questions:
Is Go's standard library good enough to support basic to moderate REST functionality?
If I do end up using the built-in library and tomorrow have to change to a framework (like Gorilla), how difficult/costly would that be?
Are frameworks really addressing the problems or just making simple problems complex?
I would be extremely grateful if someone who has been through this choice themselves could share their thoughts while I do more research of my own.
The net/http package is probably sufficient for most scenarios, but if you want to ease your development, you should use a third-party package, such as Gorilla.
For example, net/http's ServeMux does a great job of routing incoming requests for fixed URL paths, but for pretty paths that use variables you will need to implement a custom multiplexer, whereas with Gorilla you get this for free.
Another example is if you want to specify RESTful resources with proper HTTP methods: it is hard to work with the standard http.ServeMux, while with Gorilla's mux package, requests can be matched based on URL host, path, path prefix, schemes, header and query values, and HTTP methods.
One of the great benefits of Gorilla is that it is fully compatible with the net/http package and can be substituted in the future.
I totally encourage you to use Gorilla's toolkit to develop REST services.
The built-in net/http package is sufficient to build a complete REST API. However, some of the libraries can make building an API slightly easier, particularly if the REST API is complex. Changing from the built-in facilities to any decent framework is relatively straightforward - they generally accept handlers of the http.Handler type.
In the end, though, this is an extremely situational choice. The best thing you can do is examine each available solution, contrast and compare, and build a proof of concept with the top options if you possibly can. First-hand experience will guide you best.

Bidirectional messaging with request-response pattern (Protobuf for data serialization)

The base problem is as follows:
Two nodes communicating over a single socket;
Request-reply pattern;
Both nodes are client and server, i.e. node A makes requests to node B and node B makes requests to node A;
This would be easily solved with two sockets, but I have only one. You can think of the problem as having to create multiple virtual sockets/channels over a single socket. Do you know of a well-tested messaging library that supports such a use case?
In addition:
Support for C++ and Java;
The data serialization is handled by me using Google Protocol Buffers; and
If it is possible to achieve all this using RPC then great, otherwise I'll manage using protobuf.
I'd prefer not to write my own library, and to use something well-tested and well-supported instead. I've looked into ZeroMQ, but it doesn't seem to support the third requirement (request-reply from A to B and from B to A simultaneously over a single socket). RabbitMQ is another possibility, but it may not support this requirement either. (I don't have experience with these libraries, so maybe I'm wrong...)
(I wonder if I'm asking for too much.)
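To make the "virtual channels" idea concrete, the framing I have in mind looks roughly like this (a purely illustrative sketch; the names are mine):

    // Every Protocol Buffers payload would travel inside a small
    // envelope so request/reply can flow in both directions at once.
    case class Envelope(
      channelId: Int,      // which "virtual socket" the frame belongs to
      messageId: Long,     // lets a reply be correlated with its request
      isRequest: Boolean,  // both peers may originate requests
      payload: Array[Byte] // the serialized protobuf message
    )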
I don't have a complete answer for you, but I note that what you're asking for is emphatically supported by Cap'n Proto's RPC protocol: calls can be initiated in either direction.
It's not exactly what you want because:
Cap'n Proto has its own serialization format. The usage model is heavily inspired by Protobuf, but it isn't Protobuf.
The Java implementation does not yet support RPC (planned, though).
If you care about Windows: the C++ implementation does not yet support RPC on MSVC, mostly due to MSVC being woefully behind on C++11 support. :/
But it might be worth checking out, for inspiration if nothing else.
(Disclosure: I'm the author of Cap'n Proto, and also the author of most of Google's open source Protobuf code.)

How to implement an XEP-0289 FMUC plugin on an XMPP server?

I need to implement a distributed XMPP MUC application along the lines of XEP-0289, minus some of the features. In essence, I want a bare-bones implementation of the plugin. My concern is to address fault tolerance; as of now I do not want to worry about the performance considerations specified in XEP-0289.
I have looked into SleekXMPP as a tool for developing server-side plugins, but I don't know how comfortable it would be to use for such an implementation. Other options I have looked at are Openfire and Tigase. I am comfortable with Python/Java, and other key features to consider would be good documentation, ease of use, etc. Keeping that in mind, I would like to know the preferred path to take for this development.
Any guidance will be appreciated.
You should be able to write a MUC component that includes FMUC (or similar). The general way to do this would be to use a library that supports XEP-0114 components (e.g. SleekXMPP (Python) or Swiften (C++)) and implement MUC+FMUC through that. You haven't said what your concerns with SleekXMPP are, but it's a fairly well-respected library in the XMPP community, so it seems a fair choice (I'd pick Swiften, but I'm biased as one of the authors).
Your second option (patching the server directly) isn't generally the XMPPish way of adding customisations (as it's vendor-specific), but should also work if you can find someone sufficiently familiar with the server code, or if you're willing to become so.
To achieve fault tolerance (assuming you mean resilience to server failures) you'd need to run your XMPP server clustered, and also cluster your FMUC implementation. With that done, the usual XMPP fail-over using SRV records in DNS should ensure other servers retry connections to another host.
On a side note, the next version of FMUC (XEP-0289) will have some of the features of the current revision stripped out, and a number of improvements made based on deployment experience, so if your work is not time-critical, it might be of benefit to you to read that when it's released. I also note that there exists at least one implementation of FMUC already (Isode's M-Link, on which I work), and there is interest from other vendors, so using the standard protocol might benefit you in terms of not re-inventing the wheel.

Why use backend server and RPC in Web Server infrastructure?

I'm interested in creating a web application, and I've just done some research on what makes a good web server. I've searched through Facebook, Twitter, and Foursquare; they share what software they used to build their infrastructure.
For me, some of the software used are new. I'd like to ask some questions here.
Why create a backend server; isn't a web server running PHP enough? Why use Java/Scala for the backend? Do we really need an RPC framework such as Thrift or Protocol Buffers? What is an RPC framework used for? Is it used for communication between the frontend and backend servers?
I'd really appreciate answers to these questions, or suggestions for books I should read.
Thank you.
It sounds as though you'd like to build a scalable backend infrastructure that ultimately will be used to do the following:
Serve content. This is the web server layer.
Perform some type of back-end processing for user requests coming in from the web server layer, and communicate with the data store. Call this the application server layer.
Save session state and user data in a distributed, fault tolerant, eventually consistent key value store.
Also, it sounds as though you want to do this using commodity PC hardware.
This is a tall order.
Foursquare uses Scala with the Lift framework and Jetty as their web server.
Facebook uses many different technologies. I know that for their data store they use HBase (they were previously using Cassandra).
Yahoo uses HBase to keep track of user statistics.
Twitter started as a Ruby-backed website and moved to Scala. Twitter is incrementally moving from MySQL (I assume sharded) to Cassandra using their proprietary incremental database conversion tool.
As far as scaling on the application server and web server end, what really counts is having a language that can spawn new user processes in user space, plus a manager process that assigns incoming requests to new worker processes. Think of it as running a very efficient company: the more work you've got coming in, the more people you hire. This is the Actor model. Some languages have actors built in (Erlang); others have actors implemented as frameworks (Akka) or libraries (Scala's native actors). Apparently, Scala's native actors are buggy, so some people got together and implemented the Akka framework for Scala and Java.

There's a lot of discussion online regarding actors and which language and libraries one should use. Erlang has a lot going for it out of the box; however, Scala runs on the JVM and allows you to reuse a lot of the existing Java web libraries (which could have some issues if they happen to declare static objects). Erlang has actors and the OTP libraries, but apparently does not have the rich libraries that Java has. So for me it really boils down to Scala (with Akka) or Erlang.
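As a taste of the model, here's a minimal manager/worker sketch with Akka's classic actors (assuming the akka-actor dependency; the names and the round-robin dispatch are illustrative):

    import akka.actor.{Actor, ActorSystem, Props}

    // Each worker handles whatever "requests" it is handed.
    class Worker extends Actor {
      def receive = {
        case job: String => println(s"${self.path.name} handled $job")
      }
    }

    object ActorDemo extends App {
      val system = ActorSystem("demo")
      val workers = Vector.tabulate(4)(i => system.actorOf(Props[Worker], s"worker-$i"))
      // The "manager": hand each incoming request to the next worker.
      Seq("req-1", "req-2", "req-3").zipWithIndex.foreach {
        case (job, i) => workers(i % workers.size) ! job
      }
    }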
For the web server, with Scala you can use any Java app server; Foursquare uses Jetty for most things. Jetty is not written in Scala, but since Scala compiles down to bytecode that runs on the JVM, it easily interoperates with any Java app server.
People also say that there aren't that many Erlang programmers and that Erlang is harder to learn (functional vs. imperative programming). Scala is functional and imperative at the same time (meaning you can do either).
Erlang is functional. Now, functional programming has a lot going for it, as one expert functional programmer can get a lot more done than an expert imperative programmer; Yahoo Stores was originally written and maintained in Lisp (a functional language) by one man. On the other hand, imperative programming is easier to learn and widely used in team settings. Imperative languages are good for some things, functional languages for others. The right tool for the right job.
Back to the web server discussion: with Erlang, you can use Yaws, or you can run a framework (Chicago Boss).
There is plenty more discussion of the Scala vs Erlang debate online.
On the database end, you have a lot of choices.
You can even eschew the database altogether and save your data in Mnesia (Erlang's runtime data store).
My answer is not complete, as this topic (scaling app servers, databases, and web servers) is very complicated and full of debate. Some frameworks even blur the distinction between the tiers (web server, application server, database) and integrate much of the functionality of these layers within the framework itself.
For example, I encountered a lot of problems developing complex webapps using PHP only. PHP has no threads, and it lacks many of the good things that Scala (or another good modern language with rich syntax) has. PHP is slow compared to a compiled JVM language, and it is less secure in my opinion. It is good for grabbing a bunch of data and rendering it as an HTML page, but heavy processing under high load is not its strength. An RPC framework, as you suggest, serves as the communication layer between such tiers.

Any success using Apache Thrift on iPhone?

Has anybody done or seen a deployment of Apache Thrift in an iPhone app?
I am wondering if it is a reasonable solution for a high-volume, low(er)-latency network service for iPhones compared to HTTP.
One noteworthy thing I found is a bug report about running Thrift on the iPhone, which seems to have been fixed. But that doesn't necessarily indicate that it's a done deal.
Thrift and HTTP aren't mutually exclusive. In fact, Thrift now ships with an HTTP transport implementation to use. It's also a really nice way to auto-generate server/client code that avoids a lot of marshalling/unmarshalling boilerplate while still being really fast. Its internal representation is basically binary JSON, so it's very similar to a RESTful web service (except easier to code against and much, much faster).
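On the JVM side, for instance, using that HTTP transport is only a few lines; a sketch assuming a hypothetical Thrift-generated PingService stub and a server exposing it at the URL below:

    // Calling a Thrift service over HTTP (org.apache.thrift).
    import org.apache.thrift.protocol.TBinaryProtocol
    import org.apache.thrift.transport.THttpClient

    object HttpPing extends App {
      val transport = new THttpClient("http://localhost:9090/thrift")
      val client = new PingService.Client(new TBinaryProtocol(transport))
      println(client.ping("hello")) // an ordinary method call over HTTP
      transport.close()
    }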
So... anyone able to answer the original question? If not, I'll dive in myself with Thrift's included Cocoa support and see how it works on the iPhone.
Just my two cents..
The accepted answer to this question is an opinion against using a technology, not an answer to whether it is possible.
Thrift is an interface definition language (IDL), like Protobuf and Cap'n Proto. IDLs permit the definition of a platform-agnostic client/server protocol. JSON and plist don't provide the same level of type conformance.
Having previously led an iOS team with tens of millions of MAUs using Google Protobuf v2.5 on iOS, Android, Windows, and server teams, I can attest that IDLs are great on mobile. Apple uses them for syncing iWork content.
My current team uses Thrift for iOS and Android clients, with a mostly Scala backend. I much prefer it to Protobuf.
We send Thrift payloads over HTTPS and WebSockets. Once you have defined your wire communication protocol (i.e. frame structure) in Thrift, it's very easy to evolve your APIs.
However, on iOS in particular there are some implementation issues. The current version of the library is quite poorly packaged, and if you hope to build an Objective-C framework (e.g. for iOS 8+), then you will not be able to do so out of the box with v0.9.2. This is because the library headers use local imports (#import "TProtocol.h" instead of #import <Thrift/TProtocol.h>) with no umbrella headers. Worst of all, the Objective-C code generator produces very messy classes, which also use local imports from the Thrift library.
Some of these issues are pretty damning. It indicates to me that while use of an IDL is very much a good engineering decision, not many iOS teams are using Thrift, unless they're huge with the resources to write their own library.
I've always disliked frameworks that use a common interface definition to build out both server and client code. It keeps both sides too much in lockstep, when in reality server API changes must remain flexible about the client versions communicating with them.
There are helpful libraries that make JSON or plist communication over HTTP pretty easy, backed by decades of debugging and understanding of the HTTP protocol and how to use it well. Ignore that at your peril.
I have used Thrift's Objective-C bindings for a large iPhone app with a few million users. As one of the posters mentioned, we can use HTTP, which gets the best of both worlds. However, there is no asynchronous HTTP client for Thrift, so we had to build an event-based wrapper to allow non-blocking I/O calls. The underlying layer still issues one call at a time, which hit us in a big way, because we have one server call that takes a long time but does not block UI flow, and another really fast one that does block UI flow. If the underlying layer is busy with the slow command, our fast command just has to wait. I am trying to build async HTTP in C++ which can then be used on the iPhone, but that is some way off from being ready.
Thrift as an external API doesn't make sense; use it internally and rock and roll.