Ejabberd MUC with more than 5K users - xmpp

Ejabberd is a massively scalable server; here's an article which shows that Ejabberd can support 2M+ concurrent users.
But for Multi-User Chat (MUC), Ejabberd supports only 5K users (as per the ejabberd module code: here).
Ejabberd should be able to handle more users than that, so my questions to the Ejabberd gurus out there:
Why does Ejabberd impose a limit of only 5K users in a MUC?
How can I support more than 5K users in a MUC?
Will clustering be able to mitigate this limitation?
Thanks in advance.

Why does Ejabberd impose a limit of only 5K users in a MUC?
Because it makes no sense to have 5000 people sharing their presence with each other, and chatting with all the other 4999.
How can I support more than 5K users in a MUC?
Very simple, just edit the source code and add a 0 to 5000. Recompile, reinstall. But then don't complain if your machine lags. You do this at your own risk.
Will clustering be able to mitigate this limitation?
No, because each room is handled by a single process, and that process lives on one specific machine.
XEP-0045 (MUC) is not designed to have thousands of chatbots in the same room. It is for people chatting. If you are not having people chatting, use a proper tool for your task, like MucSub, or PubSub...
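To put rough numbers on the presence-sharing argument above, here is a back-of-the-envelope sketch. It assumes the room broadcasts presence for every occupant, as XEP-0045 describes by default; the function names are illustrative and not part of ejabberd.

```python
def join_stanzas(occupants_after_join: int) -> int:
    """Presence stanzas the room sends when one occupant joins.

    The room sends the newcomer one presence per existing occupant, then
    broadcasts the newcomer's presence to every occupant, newcomer included.
    """
    existing = occupants_after_join - 1
    return existing + occupants_after_join


def fill_room_stanzas(size: int) -> int:
    """Total presence stanzas sent while a room grows from empty to `size`."""
    return sum(join_stanzas(n) for n in range(1, size + 1))


def message_fanout(occupants: int) -> int:
    """Copies delivered for one public message (it is reflected to the sender too)."""
    return occupants


if __name__ == "__main__":
    for size in (50, 5_000, 50_000):
        print(size, join_stanzas(size), fill_room_stanzas(size), message_fanout(size))
```

The total grows with the square of the room size, and all of it is produced by the single process that owns the room, which is why adding a zero to the limit is cheap to do and expensive to run.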

You're using the term "user", but obviously you are not thinking of persons, of human beings, right?
I hope you don't plan to have 5,000 human beings joining and leaving a single chatroom, sharing their presence with the other 4,999 people every time they join and leave. I hope you don't expect those 5,000 people to send public messages to all the other 4,999 people.
That would be a cricket cage.

Related

ejabberd 2 user MUC vs. normal 1-to-1 chat

I've been working with ejabberd for some time now, but due to some recent issues and requirements, I'm curious about something.
If I create a MUC room with 2 users in it, does it differ from normal 1-to-1 chat messaging (performance wise)?
What happens if I always use MUCs for all 1to1 chats?
Does it have any performance overheads or disadvantages?
Do my connections suffer from performance penalties, and does this generally consume more resources or impose any kind of restrictions or penalties?
Any help or insights would be much appreciated.
I don't know how ejabberd implements XMPP, but from a protocol perspective:
"Normal" one-to-one chats are stateless on the server side. All context (message history etc.) is maintained by the client; the server just relays messages back and forth. Multi-User Chats, on the other hand, are maintained by the server. Resources (participant list, room settings, message history) have to be stored somewhere, and that responsibility lies with the server.
One-to-one messages are "ad hoc". When one party wants to chat, they just send a message to the recipient. A MUC room, on the other hand, has to be created and configured before the conversation starts, and the other party has to be invited to join the room before the conversation can begin. This adds complexity and/or time.
Multi-User Chats offer more features, but it is debatable whether they make sense in the context of one-to-one conversations (e.g. does kicking someone out of a conversation make sense?). You would also have to configure the chatrooms properly, so that they are not discoverable (nobody can list the conversations), third parties cannot join them unless invited, users cannot freely change nicknames, etc.
Yes, MUC has an overhead which is the MUC process management itself.

Persistent XMPP MUC (XEP-45), like WhatsApp groupchats

From the spec —
7.14 Exiting a Room
In order to exit a multi-user chat room, an occupant sends a presence
stanza of type "unavailable" to the <room@service/nick> it is
currently using in the room.
Example 80. Occupant Exits a Room
<presence
    from='hag66@shakespeare.lit/pda'
    to='coven@chat.shakespeare.lit/thirdwitch'
    type='unavailable'/>
This implies that as soon as the user disconnects from the XMPP server, he is removed from the group on the server side. The issue is simple: I don't want this behavior. I want behavior similar to what WhatsApp does, i.e. even if the user goes offline, he is still part of the MUC room (which is configured to be persistent on the server side) and will receive messages from other occupants.
Given the spec and the documentation for XEP-0045 and XMPPFramework for iOS, I have no idea how to accomplish this or if it's possible to accomplish this in the traditional ejabberd server.
XEP-0045 was designed more than 10 years ago. Back then, the designers had something like IRC channels in mind. Everything in XEP-0045 is based on the assumption that a user enters and leaves a room when they start or terminate their client.
WhatsApp groupchats are different: a user who joins a groupchat is able to view the (complete) history of that chat. Even if the user's client is offline/unavailable, the user is still considered part of the groupchat.
The XMPP community is currently working on a new XEP that provides such functionality. It is called XEP-0369: Mediated Information eXchange. It is the spiritual successor of XEP-0045, providing the features one would expect from modern groupchats.
You could emulate something quite like this by using server-side history of the MUC (Message Archive Management, XEP-0313), so that when a client logs in they're able to request the history of the MUC while they weren't in it.
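As a rough illustration of the archive request mentioned above, the following sketch builds a XEP-0313 query for the most recent messages of a room. It assumes the server exposes a MUC archive under the `urn:xmpp:mam:2` namespace and that the requesting user is allowed to read it; the helper name, the default `id` and the room JID are made up for the example. The empty `<before/>` element is the XEP-0059 way of asking for the last page of results.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"                  # XEP-0313 namespace (assumed version 2)
RSM_NS = "http://jabber.org/protocol/rsm"  # XEP-0059 Result Set Management


def build_mam_last_page_query(room_jid: str, max_messages: int, iq_id: str = "mam1") -> str:
    """Return an <iq/> asking the room's archive for its last `max_messages` messages."""
    iq = ET.Element("iq", {"type": "set", "to": room_jid, "id": iq_id})
    # xmlns is written as a plain attribute so the output has no ns0: prefixes.
    query = ET.SubElement(iq, "query", {"xmlns": MAM_NS})
    rsm = ET.SubElement(query, "set", {"xmlns": RSM_NS})
    ET.SubElement(rsm, "max").text = str(max_messages)
    ET.SubElement(rsm, "before")  # empty <before/> means "the last page"
    return ET.tostring(iq, encoding="unicode")


if __name__ == "__main__":
    print(build_mam_last_page_query("coven@chat.shakespeare.lit", 20))
```

A client would send this stanza after logging in and then merge the returned messages into its local view of the room.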
If you also want to be able to show the offline pseudo-occupants of a room, the easiest way to do this is probably to map a pubsub node per room to store the list of these pseudo-occupants that clients could read to supplement the usual occupancy list.
There are probably other solutions here, but those that come immediately to mind for me involve changing the behaviour of the server in non-standard ways, such as allowing normal occupants to query a membership list, which normally only admins can do.
The WhatsApp model is much simpler than you imagine: the server keeps the user's session alive even if the user disconnects, and re-sends the pending messages when the client reattaches to the session. XEP-0198 (Stream Management) introduces a similar concept for traditional XMPP sessions. You only need to configure a longer inactivity period (XEP-0198 setups typically assume 300 seconds, but WhatsApp-like messengers hold the session for 24+ hours).
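On the client side, the XEP-0198 request for a resumable session looks roughly like the stanza built below. This is only a sketch: the `max` attribute is the client's preferred resumption time in seconds, the server is free to answer with a shorter value in its `<enabled/>` reply, and the server-side timeout setting (not shown here) is what actually decides how long a session survives. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

SM_NS = "urn:xmpp:sm:3"  # XEP-0198 Stream Management namespace


def build_sm_enable(resume_seconds: int) -> str:
    """Return an <enable/> element asking for a resumable stream."""
    enable = ET.Element(
        "enable",
        {"xmlns": SM_NS, "resume": "true", "max": str(resume_seconds)},
    )
    return ET.tostring(enable, encoding="unicode")


if __name__ == "__main__":
    # Ask for 24 hours; the server may grant less.
    print(build_sm_enable(24 * 60 * 60))
```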
Yes, you can make your group persistent by setting its configuration this way:
    // Run this for each <field/> of the room configuration form.
    NSString *var = [field attributeStringValueForName:@"var"];
    if ([var isEqualToString:@"muc#roomconfig_persistentroom"])
    {
        // Replace the current value with "1" to mark the room as persistent.
        [field removeChildAtIndex:0];
        [field addChild:[NSXMLElement elementWithName:@"value" stringValue:@"1"]];
    }

How do I set up an IRC server that connects to a big IRC network?

I would like to connect my IRC server to the hackint network so that all chats and channels hosted on my IRC server are also mirrored on hackint.net.
How can I set up that connection?
I have had no luck searching for it, because I can't seem to find the right keywords (peering?).
You don't automatically connect your ircd to a big IRC network.
A big IRC network has linking procedures and doesn't accept links from just anyone. You need to join the queue along with many others who want the same thing, you have to build trust and friendships with other server administrators, and above all you must be altruistic and not just want to become an IRC operator/admin.
Did you know that a lot of big IRC servers (e.g. on EFnet, freenode etc.) are donated for free by businesses, and the donors don't even have any access beyond that of a regular user?
You may want to read up on the linking policies of big networks to get an idea of the requirements. They expect servers with some level of performance, resilience to DDoS and good routing, not a simple VPS:
https://www.dal.net/?page=Application%20Guidelines
http://ircnet.barfooze.de/articles/linking/
http://www.efnet.info/?module=docs&doc=16&type=html
You can use a somewhat different concept:
Set up a Matrix Synapse node and an IRC bridge.
All chats and users will be stored locally on your homeserver and, whenever possible, mirrored to the IRC channel and back.
This is what I ended up with, so I'll set this as accepted answer, although the other hints are good to know and also valid.
IRCD-Hybrid -- High Performance Internet Relay Chat:
apt-get install ircd-hybrid
Then you have to adapt the configuration file to Connect Multiple IRC Servers
In the IRC world it's called "linking a server"

XMPP - Retrieve last n messages from chat room

Does anyone know if there is a way to query the last n messages in a MUC in XMPP (specifically ejabberd) without joining the room?
Thanks.
No, not without modifications to the server software.
If you do actually join, you can specify the amount of history you want with the <history/> element; see Managing Discussion History in XEP-0045.
Messages are kept in the memory of each chat room's process.
You will have to modify the code to expose access to that data structure programmatically.
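A join presence that limits the returned history could be built like this. It is a minimal sketch: the room JID and nickname are placeholders and the helper name is made up, while the `<history maxstanzas='…'/>` child inside the `http://jabber.org/protocol/muc` element is the XEP-0045 mechanism referred to above.

```python
import xml.etree.ElementTree as ET

MUC_NS = "http://jabber.org/protocol/muc"


def build_join_presence(room_jid: str, nick: str, max_stanzas: int) -> str:
    """Return a MUC join presence asking for at most `max_stanzas` history messages."""
    presence = ET.Element("presence", {"to": f"{room_jid}/{nick}"})
    # xmlns is written as a plain attribute so the output has no ns0: prefixes.
    x = ET.SubElement(presence, "x", {"xmlns": MUC_NS})
    ET.SubElement(x, "history", {"maxstanzas": str(max_stanzas)})
    return ET.tostring(presence, encoding="unicode")


if __name__ == "__main__":
    print(build_join_presence("coven@chat.shakespeare.lit", "thirdwitch", 20))
```

The room may still return fewer messages than requested, since its own history size setting caps what it keeps.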

Horizontal scalability for distributed apps, how to achieve that?

I would like to disregard web applications here, because to scale them horizontally, ie to use multiple server instances together, it is "sufficient" to just duplicate the server software over the machines and just use a sort of router that forwards requests to the "less busy" server machine.
But what if my server application allows users to interact with each other in real time?
If the response to the request of a certain client X depends on the context of a client Y whose connection is managed by another machine, then inter-machine communication is needed.
I'd like to know what kind of design solutions people have used in such cases.
For example, the people at Facebook must have already encountered such a situation when enabling the chat feature of their social app.
Thank you in advance for any advice.
One way to achieve that is to use a distributed cache like memcached (Facebook also uses that approach).
All the information that is needed on all nodes is then stored in that cache (and in a database if it needs to be permanent), so every node can access that information with very little latency between the nodes.
regards
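The shared-cache idea can be sketched as follows. Everything here is hypothetical: `SharedRegistry` is an in-memory stand-in for a distributed cache such as memcached, and `Node` models one server machine. The point is only that a node looks up which machine owns the recipient's connection before forwarding a message to it.

```python
from typing import Dict, List, Optional


class SharedRegistry:
    """In-memory stand-in for a distributed cache: user id -> owning node name."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}

    def set(self, user: str, node_name: str) -> None:
        self._owner[user] = node_name

    def get(self, user: str) -> Optional[str]:
        return self._owner.get(user)


class Node:
    """One server machine holding some of the client connections."""

    def __init__(self, name: str, registry: SharedRegistry, cluster: Dict[str, "Node"]) -> None:
        self.name = name
        self.registry = registry
        self.cluster = cluster
        self.inboxes: Dict[str, List[str]] = {}
        cluster[name] = self

    def connect(self, user: str) -> None:
        """Accept a client connection and publish its location in the registry."""
        self.inboxes[user] = []
        self.registry.set(user, self.name)

    def send(self, sender: str, recipient: str, text: str) -> bool:
        """Route a message to whichever node owns the recipient; False if offline."""
        owner = self.registry.get(recipient)
        if owner is None:
            return False
        # In a real deployment this would be a network call between machines.
        self.cluster[owner].deliver(recipient, f"{sender}: {text}")
        return True

    def deliver(self, user: str, message: str) -> None:
        self.inboxes[user].append(message)


if __name__ == "__main__":
    registry, cluster = SharedRegistry(), {}
    node_a = Node("node-a", registry, cluster)
    node_b = Node("node-b", registry, cluster)
    node_a.connect("x")
    node_b.connect("y")
    node_a.send("x", "y", "hello")
    print(node_b.inboxes["y"])
```

A real cache would also need entries to expire or be removed when a client disconnects, which this sketch leaves out.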
You should consider solutions that provide transparent horizontal database scalability and guarantee ACID semantics. There are many products that offer this at various levels. The people at Facebook, whom you mention, solved the problem by accepting eventual consistency, but your question leads me to believe that you can't accept eventual consistency.