Which is better? Long TCP connection or long-polling? - chat

I am planning to build a web chat on my site. I know two way of doing this: one is using XMPP web client (through flash, long TCP connection), and the other is facebook way, long-polling.
But facebook is going to update their chat to support Jabber (XMPP), so can some one tell what way is better? (including upgrading to XMPP)

I've had pretty good results with long polling in my applications, but the bigger question is whether you're going to face the C10K problem. If so, structuring your code to deal with that kind of intense workload will likely dominate all other design considerations, at least in the short term. :-)
Other than server load, the primary consideration for which strategy to use will be client environment compatibility -- to be able to work from behind draconian firewalls that only allow HTTP or in browser environments that prohibit any plugins, long polling is the only way to survive, but it has more overhead than the simple TCP connection approach.

They have different pros and cons, for eg: TCP requires a plugin (at least until HTML5 web sockets get widely supported), Long Polling is less performant, etc. I'm not an specialist on this differences, that's why I'll recomend you to avoid making this choice, avoid developing and tuning that each approach involves, avoid the future changes in available technologies (ie. as HTML5 arrival), using a library that abstracts the transport method used, and chooses the best approach based on client capabilities:
http://socket.io/
this wonderful library makes creating realtime apps amazingly simple! and there are various server-side implementations: Python (Tornado), Java, Google GO, Rack (Ruby), besides the mainstream implementation in Node.js (server-side JavaScript)

Related

Developing Chat/Real time web application

I am currently doing my research on building a chat system with more than 10k users connected online. I came across technologies and ways to do it such as jabber(XMPP), websockets, long polling, push. As far as I now, long polling might not work given the number of users. I know there is a lot of ways to accomplish this. I also know that facebook and Google chat systems are developed on XMPP.
I would truly appreciate if anyone could point me to the right direction. I believe all these methods and technologies out there are good depending on the scale of the project. I definitely need performance and scalability.
I've used Socket.io together with NodeJS for such a chat application. It scaled to over 10K concurrent users on moderate servers and there was a lot of room to grow.
This does depend on your limitations, tho.
What kind of hardware are you planning on using?
Which operating system would power your servers?
Which client platforms are you targeting?
Do you have an existing infrastructure you need to fit this into?
Do you have a previously selected programming language?
The existing skill sets your team members have and your team's ability to adopt new platforms and languages if necessary.
Take all of the above into consideration when making your decision.
Personally, I've found XMPP to be quite adequate, but a bit bloated for my purposes. YMMV.
You are comparing a fruit basket and three different variety of oranges.
XMPP is the only protocol that you have mentioned that actually is designed to support a chat system (of which many exist). The others are simply asynchronous messaging protocols/techniques. XMPP already supports http based chat via BOSH. Without a doubt, it will also support WebSockets when the specification is finalized. There is actually a draft of this already written, but at this point it appears to be a draft using a draft, so there will probably be few, if any, implementations.
Using XMPP would allow you to build on a proven technology for implementing a chat system and would allow you to choose what transport you want to use "under the hood". You haven't actually said whether you need a http based transport or not, but with XMPP you can use the stock tcp socket based transport or a http based one (BOSH) with the knowledge that it will also support WebSockets in the future.
The other benefit is of course that this is a widely used standard that will allow reuse of existing clients, servers and libraries in pretty much all popular (and not so popular) languages and platforms.
Scalability is not too much of a concern with the numbers you are quoting, as most (maybe all) existing xmpp servers will handle that many users.

Multiplayer game in Facebook

I am working on a multiplayer game which will be a Facebook app. Doing some research, I found out that for server-side pushes I need comet which is best implemented in Node.js or Python.
But Facebook's API is only written in javascript and PHP. I know there are third party APIs, but I do not want to go with them. I can do all the Facebook code client-side in javascript I guess, but that will be a little difficult, especially when it is so easy to do in PHP.
According to me, my options are summarized as below
Leave server-side pushes and stick with periodic Ajax requests + PHP.
Stick with Node.js and leave PHP and do all FB programming in javascript (if that is even possible, which I think it is).
use server side pushes in Apache (which I heard is not a good way to go).
Go with a technology like Java with some comet support and FB API's. (I don't know Pyhton).
HTML5 has introduced server side updates as well ,maybe it can help. (haven't given it much thought though)
Which is the best way to go? I have good experience with Java, PHP, and javascript.
All comet is is a normal HTTP ajax request where the server intentionally delays the response if there are no results, and continues polling the data-source server-side until there are results or the request times out. It is a good approximation of push technology if important events are fairly sparse (i.e. if there are frequently many seconds in a row where there are no updates).
I don't think PHP is a great language in general, but it should not be much harder to do comet (also known as long polling) in PHP than in Python etc. So if you don't have any other reason not to use PHP, then go for it. You should also be able to interact with Facebook's API from other languages like Python or Javascript/Node.js without too much trouble.
HTML5 has, among other things, web sockets, which are quite different from HTTP requests and which can have much better latency than long polling techniques, especially for very frequent updates. Web socket data is closer to what you might imagine "push" technology means -- comet is really just an approximation of "push" implemented through a delayed pull. Whether sockets or comet or just normal non-delayed ajax requests are best for your game depends entirely on the specifics of your game, and of your server resources.

Enterprise Web Application Architecture Question

I am wondering if it would be possible to develop an enterprise-level web application without the use of a standard MVC structure and application server by carrying the business/flow logic and session data to the client-size Javascript and make it talk to REST data services directly...maybe we could make use of an authorization/authentication layer and a second validation layer sitting on top of the data services. All these services operate on standard HTTP methods, support configurable logging&monitoring, and content or query parameters are all contained in the HTTP request/response body. Static HTML and Javascript are served to the browser and the rest is carried out by Javascript functions talking to the HTTP-based authorization/authentication, validation and then data services. Do you think this kind of an architecture could satisfy enterprise-level web application requirements?
It's possible but unlikely; what are the drivers that suggest this architecture to you? Is it just to be different or are there some specific aspects that this best addresses?
by carrying the business/flow logic
and session data to the client-side
Javascript and make it talk to REST
data services directly
In theory you'd still be able to have an appropriately layered solution (Business Logic (BL) script vs UI focused script) but practically speaking it'd be messy, and you'd lose the ability to physically separate it into different tiers. This could "bite" you at any number of places in the life of the system.
"Enterprise" grade systems are seldom small, I hate to think how much logic you'd be having to send over the wire to support a given action/process.
Putting all the BL into a scripting language ties you to that platform, and platforms change over time. The bad thing about scripts is that whilst they are stable to a degree I'd suggest they are more exposed to change than server-based platforms like Java or .Net. In an enterprise scenario, the servers will have very tight change control and upgrade paths mapped out for them - whereas browsers are much more open to regular change.
There's the issue of compatibility - unless you're tied to a specific browser (to the version level) guaranteeing consistent behavior is going to be harder, and will likely require more development effort. Let's say you deliver the solution successfully; what do you do when the business wants to take advantage of mobile computing - say iPads? Your only option is going to be a browser - you won't be able to take advantage of any of the native advantages of the platform. "The web and browsers" might seem like they'll be around forever - but then I'm guessing that what MainFrame folks said at the time. A server-centered solution is going to give you more life for less expense.
Staffing will be an issue - you'll need very strong JavaScript and server-side developers.
Security: having your core BL out on the client where it's much more exposed sounds very dangerous.
EDIT:
Web Apps can be sow for many reasons - not many of which are reason enough to put all your BL in JavaScript on the client. Building apps for performance is a whole field of endeavor on its own - I suggest you get more familiar with Architecting and implementing for performance before you write off n-tier web apps altogether :)
Regarding keeping your layers separated: there are different ways of doing this but it boils down to abstraction - and more correctly to keeping good design principles in mind; if you haven't heard of SOLID that would be a good place to start. In terms of implementation start reading up on Dependency Inversion (FYI - self-promotion, the articles mine and is .Net focused, but you should have no problem tracking down Java-based ones too).

Socket vs HTTP based communication for a mobile client/server application

I've recently decided to take on a pretty big software engineering project that will involve developing a client-server based application. My plan is to develop as many clients as possible: including native iPhone, Android and Blackberry Apps as well as a web-based app.
For my server I'm planning on using a VPS (possibly from slicehost.com) running a flavor of Linux with a MySQL database. My first question is what should be my strategy for clients to interface with the server. My ideas are:
HTTP-POST or GET based communication with a PHP script.
This is something I'm very familiar with - passing information to a PHP script from a form, working with it and returning output. I'm assuming I'd want to return output to clients as some sort of XML or JSON based string. I'm also assuming I'd want to create a well defined API for clients that want to interface with my server.
Socket based communication with either a PHP script, Java program, or C++ program
This I'm less familiar with. I've worked with basic tutorials on creating a script or simple application that creates a socket, listens for a connection and returns data. I'm assuming there is far less communication data-overhead with this method than an HTTP based method. My dream is for there to be A LOT of concurrent clients in use, all working with the server/database. I'm not sure if a simple HTTP/PHP script based communication design can scale effectively to meet the needs of many clients. Also, I may eventually want the capability of a Server-Push to clients triggered by various server events. I'm also unsure of what programming language is best suited for this. If efficiency is a big concern I'd imagine a PHP script might not be efficient enough?
Is there a commonly accepted way of doing this? For me this is an attempt to bridge a gap between some of my current skills. I have a lot of experience with PHP and interfacing with a MySQl database to serve dynamic web pages. I also have a lot of experience developing native iPhone applications (however none that have had any significant server-based communication). Also I've worked with Java/C++, and I've developed applications in both languages that have interfaced with MySQL.
I don't anticipate my clients sending/receiving a tremendous amount of data to/from a server. Something on par with a set of strings per a given client-side event.
Another question: Using a VPS - good idea? I obviously don't want to pay for a full-dedicated server (slicehost offers a VPS starting at ~ $20/month), and I'm assuming a VPS will be capable of meeting the requirements of a few initial clients. As more and more users begin to interface with my server, I'm assuming it will be possible to migrate to larger and larger 'slices' and possibly eventually moving to a full-dedicated server if necessary.
Thanks for the advice! :)
I'd say go with the simplicity of HTTP, at least until your needs outgrow its capabilities. (The more stateful your application needs to be, the less HTTP fits).
For low cost and scalability, you probably can't go wrong with a cloud like Rackspace's or Amazon's. But I'm just getting started with those, my servers have been VPSs from tektonic until now.

Concurrent program languages for Chat and Twitter-like Apps

I need to create a simple Chat system like facebook chat and a twitter-like app.
What is the best concurrent program languages in this case ?
Erlang, Haskell, Scala or anything else ?
Thanks ^_^
Chat system like facebook chat
Facebook chat is written in Erlang, http://www.facebook.com/note.php?note_id=14218138919
and a twitter-like app
What aspect? Message delivery? The web frontend? It's all message routing ultimately. Haskell's been used recently for a couple of real time production trading systems, using heavy multicore concurrency. http://www.starling-software.com/misc/icfp-2009-cjs.pdf
More relevant: what's the scale: how many users do you expect to serve concurrently?
Erlang is my choice of drug for something like that. But I would also check out Node.js if you feel more comfortable in JavaScript.
Complete chat application source code using Scala/Lift.
8 minutes and 20 seconds real-time, unedited, webcast writing a chat app from scratch using Scala/Lift.
This is actually a relatively easy problem to solve that can be done by any language with decent threading support including (eg Java, C# and others). The model is fairly simple:
Each client connects to a Web server with an AJAX request that is configured on the server not to timeout;
If it does timeout for any reason the client is configured to issue another AJAX request;
The server endpoint for that AJSX call does a thread wait operation on some monitor waiting for updates;
When the user sends a chat it is sent to the server and any relevant monitors are signaled;
Any threads listening to that monitor are woken up, retrieve any messages waiting for them and return that as the AJAX result to the client;
The client renders those messages and then issues another AJAX request to listen to for messages.
This is the basic framework but isn't the end of the story. Any scalable chat system will support either internal or external federation (meaning the clients can connect to more than one server) but unless you're Google, Facebook or Twitter you're unlikely to have this problem.
If you do you need some kind of message queue/bus for inter-server communication.
This is something you need a heavy duty multithreaded language like Erlang for but it, Haskell and others are of course capable of doing it.
F# for future.
Client/Server chat F#
Concurrent Programming - A Primer
Also take a look at Twisted(Python).
I'd recommend taking a look at Akka (www.akkasource.org)
It's a Scala Actor framework with ALOT of plumbing for making scalable backend applications.
Out of the box it supports:
Actor Supervision
Remote actors
Cluster Membership
Comet over JAX-RS (Project Atmosphere)
HTTP Auth for your services
Distributed storage engine support (Cassandra, MongoDB)
+ a whole lot more
I would go for erlang, it's efficiency in comet-enabled apps is largely proven.
Go with a framework such as nitrogen, where you can start comet requests just as easily as Jquery does ajax.
If it's actually going to be a simple application (and not under very high load), then the answer is to use whichever language you already know that has decent threading. Erlang, Scala, Clojure, Haskell, F#, etc., all do a perfectly good job at things like this--but so do Java and C#. You'll be fine if you pick whichever of these you know and/or like.
If you're using this as an excuse to learn a new generally useful language, I'd pick Scala as a nice blend of highly performant (which Erlang is not in general, though it is fantastic with efficient concurrency), highly functional (which Java and C# are not), highly deployable (due to running on the JVM), and reasonably familiar (assuming you know C-ish languages).
If you're using this as an excuse to practice fault-tolerant concurrency, use Erlang. That's what it was designed for and it does it extremely well.