Why use backend server and RPC in Web Server infrastructure? - webserver

I'm interested in creating a web application and I've just done some research on the what makes a good web server. I've search through the facebook, twitter and foursquare. They share what software they used to build their infrastructure.
For me, some of the software used are new. I'd like to ask some questions here.
why create a back end server, isn't a web server running PHP is enough? Why use java/scala for backend? Do we really need RPC framework such as thrift/protocol buffer? What is that RPC framework used for? Is it used for communication between frontend and backend servers?
Really appreciate for those who answer my questions, or if there's some books you would suggest me to read.
Thank you.

It sounds as though you'd like to build a scalable backend infrastructure that ultimately will be used to do the following:
Serve content. This is the web server layer.
Perform some type of back end processing for user requests
coming in from the web server layer and communicate with the data store. Call this the application server layer.
Save session state and user data in a distributed, fault tolerant, eventually consistent key value store.
Also, it sounds as though you want to do this using commodity PC hardware.
This is a tall order.
Foursquare uses Scala with the Lift framework, jetty for their web server. Here's more. And more.
Facebook uses many different technologies. I know that for their data store they use HBase (they were using Cassandra)
Yahoo uses HBase to keep track of user statistics.
Twitter started as a Ruby-backend web site. They moved to Scala. Twitter is incrementally moving from mysql (I assume sharded) to Cassandra using their proprietary incremental database conversion tool.
As far as scaling on the application server and web server end, I know that what really counts is having a language that has the ability to spawn new user processes in user space and a manager process that assigns new worker processes the requests coming in. Think of it as running a very efficient company. The more work you've got coming in, the more people you hire. This is the Actor model. Some languages have actors built in,(erlang) others have actors implemented as frameworks(akka) or libraries (Scala native). Apparently, Scala's native actors are buggy so some people got together and implemented the akka framework for Scala and Java. There's a lot of discussion online regarding actors and which language and libraries one should use. Erlang has a lot going for it out of the box, however, Scala runs in the JVM and allows you to reuse a lot of the existing Java web libraries (which could have some issue if they happen to have static objects declared in them) Erlang has actors and the OTP libraries, but apparently does not have the rich libraries that Java has. So, for me it really boils down to Scala (with akka) or Erlang.
For the web server, with Scala, you can use any java app server. Foursquare uses jetty for most things. It's not written in Scala, but since Scala compiles down to bytecode that runs on the JVM it easily interops with any java app server.
People also say that there aren't that many Erlang programmers and that Erlang is harder to learn (functional programming vs imperative programming) Scala is functional and imperative at the same time (meaning you can do either)
Erlang is functional. Now, functional programming has a lot of things going for it as one expert functional programmer can get a lot more done than an expert imperative programmer. Yahoo stores was originally written and maintained in Lisp (functional language) by one man. On the other hand, imperative programming is easier to learn and used widely in a team setting. Imperative languages are good for some things, functional languages for
others. The right tool for the right job.
Back to the web server discussion, with Erlang, you can use yaws or you can run a framework (Chicago Boss)
Here's more on the Scala vs Erlang debate.
Another link.
More here.
And another.
Another opinion.
On the database end, you have a lot of choices. See here.
You can even eschew the database all-together and save your data in mnesia (Erlang's runtime data store)
My answer is not complete as this topic (scaling app servers, databases and web servers) is very complicated and full of debate. Some frameworks even blur the tiers (web server, application server, database) distinction and integrate a lot of the functionality of these layers within the framework itself.

For example, I encounter a lot of problems developing complex webapp using PHP only. PHP has no threads, php is lacking many good things that has scala, or another good modern language with rich syntax. PHP is slow comparing to compiled JVM language. PHP is less secure in my opinion. It is good to get a bunch of data and render as HTML page, but processing for high load is not its plus. RPC as you suggest serves as communication layer.

Related

Easiest form of process communication in Scala

What I want to do: I want to add communication capabilities to a couple of applications (soon to be jar libraries for Java) in Scala, and I want to do it in the most painless way, with no Tomcat, wars, paths for GET requests, RPC servers, etc.
What I have done: I've been checking a number of libraries, like Jetty, JAX-RS, Jackson, etc. But then I see the examples and they usually involve many different folders for configuration, WSDL files, etc. Most of the examples lack a main method and I don't have a clear picture about how many additional requirements may they have (e.g. Tomcat).
What I am planning to do: I'm considering to simply open a socket on the "server" to listen, then connect with the "client" and transfer some JSON, in both directions. This should be fairly standard so that I can use other programming languages in compatible ways (e.g. Python).
What I am asking: I would like to know whether there is some library that makes this easier. Not necessarily using raw sockets, but setting up some process communication in just a few lines, maybe not as simple as Node.js, but something similar.
Bonus: It would be cool to
be able to use other programming languages (e.g. Python) by using open standards
have authentication
But I don't really need any of those at this point.
I think you need RPC client/server system, I would suggest to take one of these two:
Finagle - super flexible and powerful RPC client/server from Finagle. You can define your service with Thrift, and it will generate stubs for client/server in scala. With Thrift it should be straightforward to add Python support.
Spray - much smaller library, focused on creating REST services. It's not so powerful as Finagle, however much easier. And REST allows you to use any other clients
Remotely - an elegant RPC system for reasonable people. Interesting and very promising project, however maybe difficult to start with because of extensive Scalaz+Shapeless+Macro usage
Honestly if you want something that is cross-language compatible, simple, straightforward, and concise then you do not want to use plain old sockets!
Check out dropwizard. It is amazing and I use it for small and large projects alike! It is usually configured by no more than a single configuration file. It supports authentication too!
Out of the box it gives you really great inter-process communication over JSON (using Jackson) and much much more. There is also pretty decent Scala support for dropwizard.
If you must roll your own then I'd recommend using Jackson for JSON parsing. It's super simple to use and also has great scala support.
If you've got a "controlled" use case where the client and server are on the same LAN and deployed in tandem, I'd (controversially) recommend Java RMI; it's dumb and JVM-specific (and uses a Java-specific protocol), but it's very simple to use.
If you need something more robust and cross-language, I'd recommend Apache Thrift. You write your interfaces in a platform-independent interface definition language, and it's very clear which changes are compatible and which are not; the thrift compiler generates skeleton interfaces for you to use, and then you just write an implementation of that interface and a couple of lines to start the server (as you can see from the example on the homepage). It's also got good support for async implementations if you need the performance. Thrift itself is reasonably standard and cross-platform, with its own binary protocol, or you can use JSON as a transport if you really want to (I'd recommend against that though).
RabbitMQ provides one easy way to do what you want without writing a server and implementing your own persistence, flow control, authentication, etc. You can brew or apt-get install it.
You start up a broker daemon process (i.e. manages message queues)
In the Scala producer, you can use Maven-provided Java API to send JSON strings without any fuss (e.g. no definition languages) to specified queues
Then in your other Scala program, connect to the broker, and listen for messages on the queue, and parse the incoming JSON
Because it is so popular, there are many tutorials online for different patterns you may want to use to distribute the messages, e.g. pub/sub, one-to-one, exactly-once delivery, etc.

HTTP messaging gateway with Scala

I'm developing a gateway that will be located between mobile web apps and web services on backend systems. The purpose of this gateway is to protect web apps against changes to backend web service api's, to introduce concurrency, transform messages, buffering, etc.
My proposed architecture is as follows:
Platform independent mobile web apps using PhoneGap (done)
Gateway is a web service using Scala for business logic and ZeroMQ for message passing (new)
Backends are existing web services (existing)
The gateway is purely responsible passing, translation, aggregation, etc. of messages and does not need to keep state or do user authentication at this point - it's simply responsible for being a single interface that knows how to talk to mobile apps on the one side and one or more services on the other side.
I'm strongly considering using Scala as the development language because it seems well suited for this type of application, but what would be the correct architecture for such a Scala service? I looked at frameworks such as Lift and Play and also considered doing a simple "java" based web service and just use Scala as to implement my business logic. I believe strongly in keeping things as simple as possible. I'm wary of complicated setups and thousands of lines of dead code in frameworks that may never be used. On the other side, limiting oneself to a 'role your own' solution and creating a lot of work and having to maintain code that may have been part of existing solutions is also not ideal.
Some things to consider: I'm the architect and developer, but my knowledge of Scala is limited to the first half of "Programming in Scala, Second Edition". Also, my time is very limited. Still, I want to get this right the first time.
I'm hoping that some clever gent or dame will provide me with insights to this type of solution and maybe a link or two to get started quick. I really need to get going FAST, but hope that the experience or insights of another professional could help me avoid pitfalls along the way. Any insights to development environments and tools will also be helpful. I have to develop on Mac (company rules) but will be deploying on Ubuntu server. I'm currently juggling between installs of Eclipse or Idea as IDE's and the scala compiler or sbt for building.
UPDATE
Thanks for all the potential answers below. I had a look at each and every suggestion and all of them have merit. The problem is now to bet on the right horse. Spray is possibly the simplest solution to the problem, but I also found Finagle. It seams like a stunning solution to my problem. I'm a bit concerned that it is built on top of Netty instead of Akka. Does anyone see any problem with that. I was hoping to keep my solution as purely Scala as possible, but Finagle appears to be the most mature of the lot. Any ideas?
It might be worth taking a look at Akka, which provides a lightweight framework for writing concurrent, fault-tolerant applications on the JVM, as well as useful built-in abstractions for writing web services. In addition, the Spray project looks like it could be useful for your purposes, providing HTTP server and client libraries on top of Akka (Although, unlike Akka itself, I haven't used this so myself so can't vouch for it).
If you're using a web service that is purely for other services, then an Akka based service like Spray or Blueeyes is probably your best bet. You can use Jerkson to do the JSON mapping to and from your case classes and use Akka-Camel to link to ZeroMQ.
As mistertim says, Akka is a good option. I'm biased, but I think Lift and its RestHelper is a fantastic way to go also (I've built several APIs for iPhone apps using Lift). Beyond those, I know several people who are very happy with Unfiltered.
Unfiltered is very interesting option if you want to keep it lightweight, as is not a full blown web framework.
To be honest, the documentation can use more detail, but to provide a web service you just write:
object MyService extends unfiltered.filter.Plan {
def intent = {
case req # GET(Path ("/myendpoint")& Params(params)) =>
for {
param1<-params("param1")
param2<-params("param2")
} yield ResponseString(yourMethod(param1,param2))
}
override def main(args:Array[String]) = unfiltered.jetty.Http.anylocal.filter(MyService).run();
}
Very useful if you don't want to use a framework and don't want to mix java/java annotations with Scala (otherwise, you have the usual java solutions)
Still, I would recommend to look into Akka if you think you think actors will fit your problem
If your system is middle-ware and get the things done quickly, then Spray will be a best. Spray was developed on top of Akka. Akka has ZeroMQ support.
in future if you are going to add some other web like module along with your middle-ware, then choosing Lift + Akka will be better because, Lift also provides Spray like web service stuff + it will be easy to start developing other modules.
You can choose SBT as your build, there are some project templates, template generation tools available, so you can get the project build very quickly.
In my experience , SBT + InteliJ Idea works well, have a look on sbt-idea plugin
Spray Project Template
Lift Project Template Generator

Advices on server implementation for server/client structure iOS App development?

There are must be a lot of apps that are designed to communicate with server. My question is only about App installed on iOS device + Server side service interaction. Web app is not what I am talking about, and there should be no webpage involved in this discussion at all. Typical examples are Apps like Instagram and Twitter, in which most of the information exchanged between the App and the server is just data like String, Image and Integers(wrapped in JSON or XML), no webpage presentation needed.
My question will be: if you are an independent app developer, and you are designing such an app from scratch without any existing website API, database structure or application(so you are not limited by any existing API or database structure or application protocol), What will be the most efficient approach?
What the sever side need to do are:
receive data send by the App;
process the data with designed logic;
interact with database(like MySQL);
do necessary data mining and analysis---this could be a constantly running service or one time task requested by the App client;
send the data back to the App upon request or spontaneously;
exchange or broadcast the data between/among different App clients (i.e.: group chatroom and peer to peer message);
As far I as know there are 3 obvious options to implement the server side:
PHP
Python
Ruby on Rails
(please feel free to add more options)
My questions are:
which one is the most appropriate choice to implement the server side?
If the App is focusing intensively on natural human language/text searching, analyzing and data mining, which one is the best choice? I heard Python is doing pretty good in this area.
Any advice on the database choices? I am using MySQL for now, and I found it's quite powerful for my purposes, I heard Twitter is switching to Cassandra. Will that be too difficult to start with?
For the server end, if you need to build a Server management interface, for you as an admin to manage and monitor the community, membership, data and such, is there any existing solution, or framework or tool for that? what will be the most efficient approach?
If a new programmer has no experience in non of them, which one you suggest he/she to start with?
Is there any good reference material or sample code on the server side in such context we can learn from?
I know there are a lot of very experienced experts on these areas on Stackoverflow, but I saw more newbies who just entered the iOS developing area without much knowledge in server/database programming experience. And I hope this thread can help these who are thinking to design an App with server/client structure but have no idea where to start with.
ps: I will keep updating this question thread and adding my findings on this topic, to help all other users at stackoverflow. :-) Please try to make your answer informative, easy to understand, and constructive. I guess most of readers for this thread will be new members of this great community.
Are you sure you want to spend time & money to develop your own Server & develop your own API?
There are lots of mBaaS (mobile Backend as a Service) providers today such QuickBlox, Parse,StackMob, which are ready to use and they have great Custom Objects API and some of predefined modules. They have great free plans with big quota. Some of them such QuickBlox has Enterprise plan - so you can buy license and they server team update server for you purpose.
So, i recommend not develop your server and think about mBaaS market.
Just about your issue - I can recommend look at QuickBlox Custom Objects code sample and also Custom Objects API. Custom Objects module provides flexibility to define any data structure(schema) you need. Schema is defined in Administration Panel. The schema is called Class and contains field names and their type. I think it's what you need.
which one is the most appropriate choice to implement the server
side?
Well that depends on what you know, there is reason to choice one of the other
If the App is focusing intensively on natural human language/text
searching, analyzing and data mining, which one is the best choice? I
heard Python is doing pretty good in this area.
This would reflect on your first question, you pick the language on you needs. Thus if python makes it easier then pick that one.
Any advice on the database choices? I am using MySQL for now, and I
found it's quite powerful for my purposes, I heard Twitter is
switching to Cassandra. Will that be too difficult to start with?
Again not one that is easy to answer, since it all has to do with requirements. But any SQL server will do. Cassandra is meant for "scalability and high availability without compromising performance" accourding to there website. Do you think you webservice will get many request then it might be a choice to consider.
For the server end, if you need to build a Server management
interface, for you as an admin to manage and monitor the community,
membership, data and such, is there any existing solution, or
framework or tool for that? what will be the most efficient approach?
This again is only going to be answered when you pick the a SQL server and server language.
If a new programmer has no experience in non of them, which one you
suggest he/she to start with?
Start with something simpler, you are really going out on a limb here.
Is there any good reference material or sample code on the server
side in such context we can learn from?
Propably there is some, but you should really start small and work from there.
Twitter started out as a Rub on Rails app and is working on scalability and availability which ruby is not really good ar (that is my person opinion). or Look at facebook they have written a php to c compiler to make php run faster.
The only thing I can say to start code, when you app does take off then tackle the some of the performance issues.
And since you state that you are new to programming do not bite of more then you can chew.
This is a huge question and I don't think there is a best answer. It most depends on what you care about, such as how quickly the development process, how easily the implementation, etc.
And which one is popular, which one is cool, I don't think it make really sense.
In my personal opinion, I'm good at ASP.NET and I can get Windows server easily, so I'll start with an ASP.NET service to provides data.
And, to be continued.

What is the good starting point to developing RESTful web service in Clojure?

I am looking into something lightweight, that, at a minimum should support the following features:
Support for easy definition of actions through metadata
Wrapper that extracts parameters from request into clojure map, or as function parameters
Support for multiple forms of authentication (basic, form, cookie)
basic authorization based of api method metadata
session object wrapped in clojure map
live coding from REPL (no need to restart server)
automatic serialization of return value to json and xml
have nice (pluggable) url parameter handling (eg /action/par1/par2 instead of /action?par1=val1&par2=val2)
I know it is relatively easy to roll my own micro-framework for each one of these options, but why reinvent the wheel if something like that already exists? Especially if it is:
Active project with rising number of contributors/users
Have at least basic documentation and tutorial online.
First of all, I think that you are unlikely to find a single shrinkwrapped solution to do all this in Clojure (except in the form of a Java library to be used through interop). What is becoming Clojure's standard Web stack comprises a number of libraries which people mix and match in all sorts of ways (since they happily tend to be perfectly compatible).1
Here's a list of some building blocks which you might find useful:
Ring -- Clojure's basic HTTP request handling library; all the other webby libraries (for writing routes &c.) that I know of are compatible with Ring. Ring is being actively developed, has a robust community, is very well-written and has a nice SPEC document detailing its design philosophy. This blog post provides a nice example of how it might be used (reacting to GitHub commits).
Sandbar -- currently an authentication library, more types of functionality planned; under development.
Compojure -- a mature and robust library which provides a nice DSL for writing routes to be used on top of Ring. This will give you the nice URL parameter handling.
Compojure-rest -- "a library for building RESTful applications on top of Compojure". Compojure-rest is, as far as I can tell, in its early stages of development; perhaps you might see this as an opportunity to influence its design. :-)
For dealing with XML, there's clojure.contrib.lazy-xml (and the helper library clojure.contrib.zip-filter.xml) and Enlive (the built-in clojure.xml namespace is currently not very usable); these would be used in tandem (though for your purposes the former might suffice).
For JSON, there a library in contrib and clojure-json (and I think there was at least one other lib I seem to be forgetting now...); pick the one you like best.
All of will be perfectly happy with a REPL-driven development style (see the accepted answer to this SO question for a Ring trick which is very much to the purpose here). I suppose the above collection of links does leave a few blind spots (in particular, the authentication story is still being ironed out, as far as I can tell), but hopefully it's a good start.
1The only single-package solution for building webapps in Clojure that I know of is Conjure, inspired by Rails; unfortunately I have to admit that I don't know much about it, so if you feel interested, follow the link and look around the sources, wiki &c.
While building my first Clojure rest service I found myself asking often the same question. The Clojure Toolbox helped me a lot: http://www.clojure-toolbox.com/
If you are looking for some sample, real-world, illustrative code to get you started, then you could study this clojure-news-feed on github project which demonstrates how to implement a non-trivial RESTful web service with compojure/ring that wraps both SQL (postgresql or mysql) and NoSQL (cassandra), search (solr), caching (redis), event logging (kafka), connection pooling (c3po), and real-time metrics via JMX.
This blog about Building a Scalable News Feed Web Service in Clojure provides a good introduction. I ran some load tests against this service on a humble AWS deployment and got about eighty transactions per second with less than a half second average latency per transaction.
Take a look at liberator library http://clojure-liberator.github.io/liberator/ It's noy a standalone solution, buy very good for rest service definition.
Just to provide an updated answer to this old question, currently (in 2018) I think Luminus provides an excellent starting point. It's using many of the libraries (ring, compojure, etc.) mentioned in previous answers, is modular and as close to "single package" as you can get with Clojure. Specifically for REST, take a look at compojure-api. Luminus recommends buddy for authentication, I've had good success using it both for traditional session-based auth as well as Oauth and stateless JWTs.

Concurrent program languages for Chat and Twitter-like Apps

I need to create a simple Chat system like facebook chat and a twitter-like app.
What is the best concurrent program languages in this case ?
Erlang, Haskell, Scala or anything else ?
Thanks ^_^
Chat system like facebook chat
Facebook chat is written in Erlang, http://www.facebook.com/note.php?note_id=14218138919
and a twitter-like app
What aspect? Message delivery? The web frontend? It's all message routing ultimately. Haskell's been used recently for a couple of real time production trading systems, using heavy multicore concurrency. http://www.starling-software.com/misc/icfp-2009-cjs.pdf
More relevant: what's the scale: how many users do you expect to serve concurrently?
Erlang is my choice of drug for something like that. But I would also check out Node.js if you feel more comfortable in JavaScript.
Complete chat application source code using Scala/Lift.
8 minutes and 20 seconds real-time, unedited, webcast writing a chat app from scratch using Scala/Lift.
This is actually a relatively easy problem to solve that can be done by any language with decent threading support including (eg Java, C# and others). The model is fairly simple:
Each client connects to a Web server with an AJAX request that is configured on the server not to timeout;
If it does timeout for any reason the client is configured to issue another AJAX request;
The server endpoint for that AJSX call does a thread wait operation on some monitor waiting for updates;
When the user sends a chat it is sent to the server and any relevant monitors are signaled;
Any threads listening to that monitor are woken up, retrieve any messages waiting for them and return that as the AJAX result to the client;
The client renders those messages and then issues another AJAX request to listen to for messages.
This is the basic framework but isn't the end of the story. Any scalable chat system will support either internal or external federation (meaning the clients can connect to more than one server) but unless you're Google, Facebook or Twitter you're unlikely to have this problem.
If you do you need some kind of message queue/bus for inter-server communication.
This is something you need a heavy duty multithreaded language like Erlang for but it, Haskell and others are of course capable of doing it.
F# for future.
Client/Server chat F#
Concurrent Programming - A Primer
Also take a look at Twisted(Python).
I'd recommend taking a look at Akka (www.akkasource.org)
It's a Scala Actor framework with ALOT of plumbing for making scalable backend applications.
Out of the box it supports:
Actor Supervision
Remote actors
Cluster Membership
Comet over JAX-RS (Project Atmosphere)
HTTP Auth for your services
Distributed storage engine support (Cassandra, MongoDB)
+ a whole lot more
I would go for erlang, it's efficiency in comet-enabled apps is largely proven.
Go with a framework such as nitrogen, where you can start comet requests just as easily as Jquery does ajax.
If it's actually going to be a simple application (and not under very high load), then the answer is to use whichever language you already know that has decent threading. Erlang, Scala, Clojure, Haskell, F#, etc., all do a perfectly good job at things like this--but so do Java and C#. You'll be fine if you pick whichever of these you know and/or like.
If you're using this as an excuse to learn a new generally useful language, I'd pick Scala as a nice blend of highly performant (which Erlang is not in general, though it is fantastic with efficient concurrency), highly functional (which Java and C# are not), highly deployable (due to running on the JVM), and reasonably familiar (assuming you know C-ish languages).
If you're using this as an excuse to practice fault-tolerant concurrency, use Erlang. That's what it was designed for and it does it extremely well.