Developing Chat/Real time web application - xmpp

I am currently doing my research on building a chat system with more than 10k users connected online. I came across technologies and ways to do it such as jabber(XMPP), websockets, long polling, push. As far as I now, long polling might not work given the number of users. I know there is a lot of ways to accomplish this. I also know that facebook and Google chat systems are developed on XMPP.
I would truly appreciate if anyone could point me to the right direction. I believe all these methods and technologies out there are good depending on the scale of the project. I definitely need performance and scalability.

I've used Socket.io together with NodeJS for such a chat application. It scaled to over 10K concurrent users on moderate servers and there was a lot of room to grow.
This does depend on your limitations, tho.
What kind of hardware are you planning on using?
Which operating system would power your servers?
Which client platforms are you targeting?
Do you have an existing infrastructure you need to fit this into?
Do you have a previously selected programming language?
The existing skill sets your team members have and your team's ability to adopt new platforms and languages if necessary.
Take all of the above into consideration when making your decision.
Personally, I've found XMPP to be quite adequate, but a bit bloated for my purposes. YMMV.

You are comparing a fruit basket and three different variety of oranges.
XMPP is the only protocol that you have mentioned that actually is designed to support a chat system (of which many exist). The others are simply asynchronous messaging protocols/techniques. XMPP already supports http based chat via BOSH. Without a doubt, it will also support WebSockets when the specification is finalized. There is actually a draft of this already written, but at this point it appears to be a draft using a draft, so there will probably be few, if any, implementations.
Using XMPP would allow you to build on a proven technology for implementing a chat system and would allow you to choose what transport you want to use "under the hood". You haven't actually said whether you need a http based transport or not, but with XMPP you can use the stock tcp socket based transport or a http based one (BOSH) with the knowledge that it will also support WebSockets in the future.
The other benefit is of course that this is a widely used standard that will allow reuse of existing clients, servers and libraries in pretty much all popular (and not so popular) languages and platforms.
Scalability is not too much of a concern with the numbers you are quoting, as most (maybe all) existing xmpp servers will handle that many users.

Related

Connection between programs over the network

I want to dive into the whole diversity of tools which provide connection between programs over the network.
To clarify the question, I divide it on subquestions:
Why some groups of programs (or specific tools/frameworks/approaches with programming languages where this frameworks can be used) were popular in each period of time? (I expect description of problems which were solved, description of tools, why those tools are considered as best solution to those problems at that time, why some tools lost popularity)
What is the entire history of software communication over the network? (tools/approaches popularity precisely to decades)
What are the modern solutions to this problem?
I can distinguish only two significant approaches.
RPC, RMI and their implementations (I saw this, but it is about concrete problem and specific tools to solve this problem, I want to see the place of this problem in the whole picture of interconnection programs over the network. I heard about implementations: ONC RPC, XML-RPC, CORBA, DCOM, gRPC, but which are active now? which are reasonable to use? which are preferable and why? I want answers not to be opinion based, so I accept answers like "technology A better than technology B for problem X because ..." only if there is reliable research/statistics or facts). I heard that RPC and RMI were popular 10 years ago. Are they still?
Web services: REST, SOAP.
Am I miss something? Maybe there are some technologies which solve problem completely new way? Maybe there are technologies which can be treated as replacement to RPC(RMI) and Web Services? Can we replace RPC(RMI) by REST for any task? Can we replace RPC(RMI) by REST only for modern tasks? Should I separate technologies not as RPC and Web Services, but in some other manner?
As a partial answer, I can give you my feedback on the use of RabbitMQ.
As explain here, it provides a lot of different ways to use it :
RPC by implementing a "callback" queue
One to one, one to many routing strategy to propagate your events through your whole infrastructure and target the right destination.
It comes with the ability to persist messages to avoid loosing data when a crash appears but also with some plugins to increase possibilities (e.g x-delayed plugin)
This technologie written in Erlang is powerful and is a must try in term of communication between programs.
To your question „Am I missing something“: yes.
Very popular communication patterns are the so-called Event-Driven or Message-Driven protocols. This type of protocols are often used in distributed systems such web applications, microservices and IoT-Environments. The communication is complete asynchronously and allows building scalable and loosely coupled systems.
There are many different frameworks and methods for Event-Driven systems like WebSockets, WebHooks, Pub-Sub and Messaging-Librarys like AcitveMQ, OpenMQ, RabbitMQ, ZeroMQ and MQTT.
Hope this info helps for your research.

How to implement XEP-0289 FMUC plugin on a XMPP server?

I need to implement a distributed XMPP MuC application on the lines of XEP-0289 minus some of the features, in essence I want to have a bare bones implementation of the plugin, my concern is to address fault-tolerance and as of now I do not want to worry about the performance considerations as specified in 289.
I have looked into SleekXmpp as a tool to develop server side plugins, but don't know how comfortable it would be to use it for such an implementation, other options I have looked at are OpenFire , Tigase. I am comfortable with Python/Java and other key features to consider would be good documentation, ease of use etc keeping that in mind I would like to know what would be the preferred path to take for this development.
Any guidance will be appreciated.
you should be able to write a MUC component that includes FMUC (or similar). The general way to do this would be to use a library that supports XEP-0114 components (e.g. SleekXMPP (Python), Swiften (C++)) and implement MUC+FMUC through that. You haven't said what your concerns with SleekXMPP are, but it's a fairly well-respected library in the XMPP community, so seems a fair choice (I'd pick Swiften, but I'm biased as one of the authors).
Your second option (patching the server directly) isn't generally the XMPPish way of adding customisations (as it's vendor-specific), but should also work if you can find someone sufficiently familiar with the server code, or if you're willing to become so.
To achieve fault tolerance (assuming you mean resilience to server failures) you'd need to run your XMPP server clustered, and also cluster your FMUC implementation. With that done, the usual XMPP fail-over using SRV records in DNS should ensure other servers retry connections to another host.
On a side note, the next version of FMUC (XEP-0289) will have some of the features of the current revision stripped out, and a number of improvements made based on deployment experience, so if your work is not time-critical, it might be of benefit to you to read that when it's released. I also note that there exists at least one implementation of FMUC already (Isode's M-Link, on which I work), and there is interest from other vendors, so using the standard protocol might benefit you in terms of not re-inventing the wheel.

Real-time auction updates - Comet? Tornado? ActiveMQ?

I'm in the process of deciding how to write an online auction application. I would like to provide real-time updates to the site users. My background is with LAMP (although, in my case, the 'P' would be more for Perl than PHP). I've considered ActiveMQ, but I'm wondering if there are better options.
My primary concerns are scalability and speed. It could have several simultaneous auctions taking place, with [hopefully] many users participating in each auction. Whatever solution that I decide on would have to accommodate such a scenario. Of course, this is all in theory so I have no idea how many concurrent users that I might have, but I'd like to have the means to support tens of thousands of users.
Another concern is ease of implementation. I've spent the past few days reading docs and tutorials and, so far, nothing has come across as anything less than a bit of a pain in the rear to deal with, which is actually what has led me here to seek some advice.
I was hoping to use a web framework, such as Codeigniter (PHP) or Catalyst (Perl), because I intend to pay a contractor or two to help with some of the bulk of the coding, and I like the idea of having a framework to somewhat enforce a design pattern. However, the more that I look into this, I'm just not seeing an obvious solution to 1) use a framework, and 2) provide real-time auction updates (other than Tornado, I guess - maybe I'm answering my own question. ;)).
So, with all that said, short of using polling (which I'm not really interested in doing), is there a way that I can accomplish these real-time updates using a language like Perl or PHP for my server-side code? I know that ActiveMQ supports STOMP, and I actually have this working on my local machine (using Jetty since it requires a servlet to publish/consume messages from client-side javascript), but is there a better option here?
I'm sorry that I don't have a more direct question, but after several days of looking at docs and tutorials, I'm more lost than ever!
Part of your problem is that your mixing a variety of concepts together. If I read things correctly you have a problem statement of:
I'm building an online auction site and would like to insure that my visitors have real-time updates of prices on the items they are viewing.
Now between the Browser and the Server you'll probably use a Comet style request pattern to handle communications, you could also look at socket.io as a backup pattern. This polling will require a server that is able to handle lots of simultaneous open connections, which Tornado is a good candidate (there are others, but given you asked in relationship to Tornado it's good).
Now that we've gone from 1000+ of Browsers to a handful of Tornado servers, you need a way to communicate between them. In the the last of publish/subscribe message patterns you have a few choices:
RabbitMQ (AMQP)
ZeroMQ
Redis Pub/Sub
All three a good choices, with their own pros/cons. Personally I've used Redis and Rabbit on different projects and just toyed with ZeroMQ. The message broker is a whole decision tree that is going to be based on what you have available.

How to implement group chatting/message-board mobile app?

I am trying to write a iPhone group chatting/message-board app which will have a backend component. I expect users to register with our system and start posting messages on chatrooms/message-boards. These message-boards can have more than 2 individuals, must support real time notifications and should be accessible from any other clients (like web) as well.
I stumbled upon http://code.google.com/p/xmppframework/ . I realize that XMPP is a very attractive proposition for our needs but I am seriously worried about the infrastructure complexities and scale issues. Besides, XMPP has way too much to offer for my needs. Looks like, XMPP might be the only choice for my pleasure in pain, but I wanted to see what you experts have to say on this.
Any thoughts?
Thanks,
My advice is: whichever protocol you're choosing, do not try to invent your own protocol. Go for XMPP or if you can find an alternative which you find more compelling, use that. Especially if there's already a nice framework for you to use. Why ? Because a single developer new to a field is seldom smarter than a bunch of people with experience ;-) Make use of other peoples' experience by using an established protocol, and make use of existing frameworks to avoid coding mistakes and investing a lot of time to solve a problem yet again that was already solved.
That being said, XMPP is widely deployed and thus would make for a good choice if you later plan to write additional clients for other platforms or want to have third-party clients connect to your server.

Which is better? Long TCP connection or long-polling?

I am planning to build a web chat on my site. I know two way of doing this: one is using XMPP web client (through flash, long TCP connection), and the other is facebook way, long-polling.
But facebook is going to update their chat to support Jabber (XMPP), so can some one tell what way is better? (including upgrading to XMPP)
I've had pretty good results with long polling in my applications, but the bigger question is whether you're going to face the C10K problem. If so, structuring your code to deal with that kind of intense workload will likely dominate all other design considerations, at least in the short term. :-)
Other than server load, the primary consideration for which strategy to use will be client environment compatibility -- to be able to work from behind draconian firewalls that only allow HTTP or in browser environments that prohibit any plugins, long polling is the only way to survive, but it has more overhead than the simple TCP connection approach.
They have different pros and cons, for eg: TCP requires a plugin (at least until HTML5 web sockets get widely supported), Long Polling is less performant, etc. I'm not an specialist on this differences, that's why I'll recomend you to avoid making this choice, avoid developing and tuning that each approach involves, avoid the future changes in available technologies (ie. as HTML5 arrival), using a library that abstracts the transport method used, and chooses the best approach based on client capabilities:
http://socket.io/
this wonderful library makes creating realtime apps amazingly simple! and there are various server-side implementations: Python (Tornado), Java, Google GO, Rack (Ruby), besides the mainstream implementation in Node.js (server-side JavaScript)