I need to create a simple Chat system like facebook chat and a twitter-like app.
What is the best concurrent program languages in this case ?
Erlang, Haskell, Scala or anything else ?
Thanks ^_^
Chat system like facebook chat
Facebook chat is written in Erlang, http://www.facebook.com/note.php?note_id=14218138919
and a twitter-like app
What aspect? Message delivery? The web frontend? It's all message routing ultimately. Haskell's been used recently for a couple of real time production trading systems, using heavy multicore concurrency. http://www.starling-software.com/misc/icfp-2009-cjs.pdf
More relevant: what's the scale: how many users do you expect to serve concurrently?
Erlang is my choice of drug for something like that. But I would also check out Node.js if you feel more comfortable in JavaScript.
Complete chat application source code using Scala/Lift.
8 minutes and 20 seconds real-time, unedited, webcast writing a chat app from scratch using Scala/Lift.
This is actually a relatively easy problem to solve that can be done by any language with decent threading support including (eg Java, C# and others). The model is fairly simple:
Each client connects to a Web server with an AJAX request that is configured on the server not to timeout;
If it does timeout for any reason the client is configured to issue another AJAX request;
The server endpoint for that AJSX call does a thread wait operation on some monitor waiting for updates;
When the user sends a chat it is sent to the server and any relevant monitors are signaled;
Any threads listening to that monitor are woken up, retrieve any messages waiting for them and return that as the AJAX result to the client;
The client renders those messages and then issues another AJAX request to listen to for messages.
This is the basic framework but isn't the end of the story. Any scalable chat system will support either internal or external federation (meaning the clients can connect to more than one server) but unless you're Google, Facebook or Twitter you're unlikely to have this problem.
If you do you need some kind of message queue/bus for inter-server communication.
This is something you need a heavy duty multithreaded language like Erlang for but it, Haskell and others are of course capable of doing it.
F# for future.
Client/Server chat F#
Concurrent Programming - A Primer
Also take a look at Twisted(Python).
I'd recommend taking a look at Akka (www.akkasource.org)
It's a Scala Actor framework with ALOT of plumbing for making scalable backend applications.
Out of the box it supports:
Actor Supervision
Remote actors
Cluster Membership
Comet over JAX-RS (Project Atmosphere)
HTTP Auth for your services
Distributed storage engine support (Cassandra, MongoDB)
+ a whole lot more
I would go for erlang, it's efficiency in comet-enabled apps is largely proven.
Go with a framework such as nitrogen, where you can start comet requests just as easily as Jquery does ajax.
If it's actually going to be a simple application (and not under very high load), then the answer is to use whichever language you already know that has decent threading. Erlang, Scala, Clojure, Haskell, F#, etc., all do a perfectly good job at things like this--but so do Java and C#. You'll be fine if you pick whichever of these you know and/or like.
If you're using this as an excuse to learn a new generally useful language, I'd pick Scala as a nice blend of highly performant (which Erlang is not in general, though it is fantastic with efficient concurrency), highly functional (which Java and C# are not), highly deployable (due to running on the JVM), and reasonably familiar (assuming you know C-ish languages).
If you're using this as an excuse to practice fault-tolerant concurrency, use Erlang. That's what it was designed for and it does it extremely well.
Related
I am working on a multiplayer game which will be a Facebook app. Doing some research, I found out that for server-side pushes I need comet which is best implemented in Node.js or Python.
But Facebook's API is only written in javascript and PHP. I know there are third party APIs, but I do not want to go with them. I can do all the Facebook code client-side in javascript I guess, but that will be a little difficult, especially when it is so easy to do in PHP.
According to me, my options are summarized as below
Leave server-side pushes and stick with periodic Ajax requests + PHP.
Stick with Node.js and leave PHP and do all FB programming in javascript (if that is even possible, which I think it is).
use server side pushes in Apache (which I heard is not a good way to go).
Go with a technology like Java with some comet support and FB API's. (I don't know Pyhton).
HTML5 has introduced server side updates as well ,maybe it can help. (haven't given it much thought though)
Which is the best way to go? I have good experience with Java, PHP, and javascript.
All comet is is a normal HTTP ajax request where the server intentionally delays the response if there are no results, and continues polling the data-source server-side until there are results or the request times out. It is a good approximation of push technology if important events are fairly sparse (i.e. if there are frequently many seconds in a row where there are no updates).
I don't think PHP is a great language in general, but it should not be much harder to do comet (also known as long polling) in PHP than in Python etc. So if you don't have any other reason not to use PHP, then go for it. You should also be able to interact with Facebook's API from other languages like Python or Javascript/Node.js without too much trouble.
HTML5 has, among other things, web sockets, which are quite different from HTTP requests and which can have much better latency than long polling techniques, especially for very frequent updates. Web socket data is closer to what you might imagine "push" technology means -- comet is really just an approximation of "push" implemented through a delayed pull. Whether sockets or comet or just normal non-delayed ajax requests are best for your game depends entirely on the specifics of your game, and of your server resources.
I'm interested in creating a web application and I've just done some research on the what makes a good web server. I've search through the facebook, twitter and foursquare. They share what software they used to build their infrastructure.
For me, some of the software used are new. I'd like to ask some questions here.
why create a back end server, isn't a web server running PHP is enough? Why use java/scala for backend? Do we really need RPC framework such as thrift/protocol buffer? What is that RPC framework used for? Is it used for communication between frontend and backend servers?
Really appreciate for those who answer my questions, or if there's some books you would suggest me to read.
Thank you.
It sounds as though you'd like to build a scalable backend infrastructure that ultimately will be used to do the following:
Serve content. This is the web server layer.
Perform some type of back end processing for user requests
coming in from the web server layer and communicate with the data store. Call this the application server layer.
Save session state and user data in a distributed, fault tolerant, eventually consistent key value store.
Also, it sounds as though you want to do this using commodity PC hardware.
This is a tall order.
Foursquare uses Scala with the Lift framework, jetty for their web server. Here's more. And more.
Facebook uses many different technologies. I know that for their data store they use HBase (they were using Cassandra)
Yahoo uses HBase to keep track of user statistics.
Twitter started as a Ruby-backend web site. They moved to Scala. Twitter is incrementally moving from mysql (I assume sharded) to Cassandra using their proprietary incremental database conversion tool.
As far as scaling on the application server and web server end, I know that what really counts is having a language that has the ability to spawn new user processes in user space and a manager process that assigns new worker processes the requests coming in. Think of it as running a very efficient company. The more work you've got coming in, the more people you hire. This is the Actor model. Some languages have actors built in,(erlang) others have actors implemented as frameworks(akka) or libraries (Scala native). Apparently, Scala's native actors are buggy so some people got together and implemented the akka framework for Scala and Java. There's a lot of discussion online regarding actors and which language and libraries one should use. Erlang has a lot going for it out of the box, however, Scala runs in the JVM and allows you to reuse a lot of the existing Java web libraries (which could have some issue if they happen to have static objects declared in them) Erlang has actors and the OTP libraries, but apparently does not have the rich libraries that Java has. So, for me it really boils down to Scala (with akka) or Erlang.
For the web server, with Scala, you can use any java app server. Foursquare uses jetty for most things. It's not written in Scala, but since Scala compiles down to bytecode that runs on the JVM it easily interops with any java app server.
People also say that there aren't that many Erlang programmers and that Erlang is harder to learn (functional programming vs imperative programming) Scala is functional and imperative at the same time (meaning you can do either)
Erlang is functional. Now, functional programming has a lot of things going for it as one expert functional programmer can get a lot more done than an expert imperative programmer. Yahoo stores was originally written and maintained in Lisp (functional language) by one man. On the other hand, imperative programming is easier to learn and used widely in a team setting. Imperative languages are good for some things, functional languages for
others. The right tool for the right job.
Back to the web server discussion, with Erlang, you can use yaws or you can run a framework (Chicago Boss)
Here's more on the Scala vs Erlang debate.
Another link.
More here.
And another.
Another opinion.
On the database end, you have a lot of choices. See here.
You can even eschew the database all-together and save your data in mnesia (Erlang's runtime data store)
My answer is not complete as this topic (scaling app servers, databases and web servers) is very complicated and full of debate. Some frameworks even blur the tiers (web server, application server, database) distinction and integrate a lot of the functionality of these layers within the framework itself.
For example, I encounter a lot of problems developing complex webapp using PHP only. PHP has no threads, php is lacking many good things that has scala, or another good modern language with rich syntax. PHP is slow comparing to compiled JVM language. PHP is less secure in my opinion. It is good to get a bunch of data and render as HTML page, but processing for high load is not its plus. RPC as you suggest serves as communication layer.
I've recently decided to take on a pretty big software engineering project that will involve developing a client-server based application. My plan is to develop as many clients as possible: including native iPhone, Android and Blackberry Apps as well as a web-based app.
For my server I'm planning on using a VPS (possibly from slicehost.com) running a flavor of Linux with a MySQL database. My first question is what should be my strategy for clients to interface with the server. My ideas are:
HTTP-POST or GET based communication with a PHP script.
This is something I'm very familiar with - passing information to a PHP script from a form, working with it and returning output. I'm assuming I'd want to return output to clients as some sort of XML or JSON based string. I'm also assuming I'd want to create a well defined API for clients that want to interface with my server.
Socket based communication with either a PHP script, Java program, or C++ program
This I'm less familiar with. I've worked with basic tutorials on creating a script or simple application that creates a socket, listens for a connection and returns data. I'm assuming there is far less communication data-overhead with this method than an HTTP based method. My dream is for there to be A LOT of concurrent clients in use, all working with the server/database. I'm not sure if a simple HTTP/PHP script based communication design can scale effectively to meet the needs of many clients. Also, I may eventually want the capability of a Server-Push to clients triggered by various server events. I'm also unsure of what programming language is best suited for this. If efficiency is a big concern I'd imagine a PHP script might not be efficient enough?
Is there a commonly accepted way of doing this? For me this is an attempt to bridge a gap between some of my current skills. I have a lot of experience with PHP and interfacing with a MySQl database to serve dynamic web pages. I also have a lot of experience developing native iPhone applications (however none that have had any significant server-based communication). Also I've worked with Java/C++, and I've developed applications in both languages that have interfaced with MySQL.
I don't anticipate my clients sending/receiving a tremendous amount of data to/from a server. Something on par with a set of strings per a given client-side event.
Another question: Using a VPS - good idea? I obviously don't want to pay for a full-dedicated server (slicehost offers a VPS starting at ~ $20/month), and I'm assuming a VPS will be capable of meeting the requirements of a few initial clients. As more and more users begin to interface with my server, I'm assuming it will be possible to migrate to larger and larger 'slices' and possibly eventually moving to a full-dedicated server if necessary.
Thanks for the advice! :)
I'd say go with the simplicity of HTTP, at least until your needs outgrow its capabilities. (The more stateful your application needs to be, the less HTTP fits).
For low cost and scalability, you probably can't go wrong with a cloud like Rackspace's or Amazon's. But I'm just getting started with those, my servers have been VPSs from tektonic until now.
I want to run Cross-Platform XMPP Instant Messenger. What Server-Side language should I choose?
There's also http://prosody.im/ which uses Lua. Their mission statement:
Prosody is a modern flexible communications server for Jabber/XMPP written in Lua. It aims to be easy to set up and configure, and light on resources. For developers it aims to be easy to extend and give a flexible system on which to rapidly develop added functionality, or prototype new protocols.
Prosody is licensed under the permissive MIT/X11 license.
If you want to use a XMPP server, take a look at Ejabberd(written in Erlang) or maybe Tigase(written in Java)
If you want to create your own, use:
Twisted Matrix, a Python framework to develop asynchronous applications
Erlang, a functional language designed with concurrency in mind.
Java with the NIO asynchronous library
Depending on how close to the XMPP spec you want to be, C++ might be an option but it will be quite challenging, as there is a fair amount of logic to implement :-)
If you want to optimize for speed, identify the bottlenecks of your application, and look into writing specific parts in C(XML parsing or string handling).
I am planning to build a web chat on my site. I know two way of doing this: one is using XMPP web client (through flash, long TCP connection), and the other is facebook way, long-polling.
But facebook is going to update their chat to support Jabber (XMPP), so can some one tell what way is better? (including upgrading to XMPP)
I've had pretty good results with long polling in my applications, but the bigger question is whether you're going to face the C10K problem. If so, structuring your code to deal with that kind of intense workload will likely dominate all other design considerations, at least in the short term. :-)
Other than server load, the primary consideration for which strategy to use will be client environment compatibility -- to be able to work from behind draconian firewalls that only allow HTTP or in browser environments that prohibit any plugins, long polling is the only way to survive, but it has more overhead than the simple TCP connection approach.
They have different pros and cons, for eg: TCP requires a plugin (at least until HTML5 web sockets get widely supported), Long Polling is less performant, etc. I'm not an specialist on this differences, that's why I'll recomend you to avoid making this choice, avoid developing and tuning that each approach involves, avoid the future changes in available technologies (ie. as HTML5 arrival), using a library that abstracts the transport method used, and chooses the best approach based on client capabilities:
http://socket.io/
this wonderful library makes creating realtime apps amazingly simple! and there are various server-side implementations: Python (Tornado), Java, Google GO, Rack (Ruby), besides the mainstream implementation in Node.js (server-side JavaScript)