group chat approaches for crowds - chat

I'm planning to write a group chat platform to use in crowded situations, like events, parties or shows, for example.
the simple approach would be to put everybody in the same room. but having a thousand people talking in the same room doesn't work. the many possible parallel conversations overlap and none can actually be understood or followed.
I'm not talking here about performance issues. I'm looking for design options. I couldn't find any discussion like this out there. if anyone has a link or a suggestion, that would be fine :)
so far, I could think about the following alternatives and corresponding downsides:
I could offer multiple rooms with limited capacity. let's say 50 people per room. each user could explicitly pick a room to join, knowing its current occupancy beforehand, or could be randomly put in any non-empty and non-full room.
the problem with having multiple rooms is that a user can be in only one room at a time, and so, if I want to talk to the host of the party, I must get into the room he is in, or no deal. so... just picking a non-full room to join may just not be good enough.
the same happens with being randomly put in a room. that may be good to keep rooms balanced, but might cause the friend I just invited to land in a different random room, and we get separated.
another possibility would be to have a single room, a thousand people inside, but only some messages would be broadcast to everybody in the room. the problem is choosing who is allowed to talk, and why would anyone join a chat group to be just a spectator :P
for example, for starters, the first 50 users to join would be allowed to talk. as they leave, the next ones in the queue would gain the opportunity to join the conversation.
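a minimal sketch of the capacity-limited rooms idea, with one possible way around the "friend lands in another room" problem: an invited user asks to join the inviter's room first and only falls back to random placement if that room is full. all names here (RoomAllocator, join, join_with) are hypothetical, and the cap of 50 is just the number from the example above.

```python
import random


class RoomAllocator:
    """Hypothetical allocator: capacity-limited rooms, random or friend-directed joins."""

    def __init__(self, capacity=50, rng=None):
        self.capacity = capacity
        self.rooms = []      # each room is a set of user ids
        self.room_of = {}    # user id -> index of the room the user is in
        self.rng = rng or random.Random()

    def _place(self, user, index):
        self.rooms[index].add(user)
        self.room_of[user] = index
        return index

    def join(self, user):
        """Put the user in a random non-empty, non-full room; open a new room if none exists."""
        candidates = [i for i, room in enumerate(self.rooms)
                      if 0 < len(room) < self.capacity]
        if not candidates:
            self.rooms.append(set())
            return self._place(user, len(self.rooms) - 1)
        return self._place(user, self.rng.choice(candidates))

    def join_with(self, user, friend):
        """Put the user in the friend's room if it has space, otherwise fall back to join()."""
        index = self.room_of.get(friend)
        if index is not None and len(self.rooms[index]) < self.capacity:
            return self._place(user, index)
        return self.join(user)
```

this does not solve the "talk to the host" case when the host's room is already full. it only keeps invited pairs together while there is space.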
or maybe only the most active (by some ranking) would be allowed to talk.
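the first variant (a fixed number of speakers, with the queue advancing as speakers leave) can be sketched like this. SpeakerQueue and its methods are hypothetical names, and the ranking-based variant is not covered.

```python
from collections import deque


class SpeakerQueue:
    """Hypothetical single-room model: everyone reads, only max_speakers may post."""

    def __init__(self, max_speakers=50):
        self.max_speakers = max_speakers
        self.speakers = set()
        self.waiting = deque()  # spectators, in arrival order

    def join(self, user):
        # Early arrivals become speakers; later ones wait as spectators.
        if len(self.speakers) < self.max_speakers:
            self.speakers.add(user)
        else:
            self.waiting.append(user)

    def leave(self, user):
        if user in self.speakers:
            self.speakers.remove(user)
            # Promote the longest-waiting spectator to the freed slot.
            if self.waiting:
                self.speakers.add(self.waiting.popleft())
        elif user in self.waiting:
            self.waiting.remove(user)

    def can_talk(self, user):
        return user in self.speakers
```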
another, hybrid alternative would be to allow users to create their own rooms and (auto-)close these rooms when they get empty, and only invited people can join their rooms.
this alternative does not solve the problem of trying to talk to the host, but gives the users the responsibility to keep their rooms' conversations healthy.
a last hardcore approach would be to put everybody in the same room and use machine learning to broadcast each message to a limited set of people (selected by the ML algorithm), possibly grouped by interest or by participation in the conversation.
the problem here is that recently joined users do not have enough data to be put in any cohort. actually, most chat messages are just too short and too similar for an ML classifier to work well.
so....
I'm looking for any reference, suggestion, paper, idea or anything that could help this analysis.
I believe this question has objective answers. please do not close it as not constructive. and... in case of unavoidable closing, please tell me the correct place to ask this question (and this would also be an answer to my question, since it would help my analysis by getting me to the right forum).
thanks in advance :D

Related

Proper state management architecture to implement read/unread of items

Context: We are implementing a news app. For now, you can assume the news to be the same across all users, and maintains an order based on the parameters we set (according to trends, and date).
Problem: We are not sure what the best implementation for keeping track of what users read is. We want to be able to configure a way in which we can track what users read and what they didn’t.
Assumption: You can assume that the posts in the database are in a descending order, based on time.
So, the ideal scenario is this: when posts A, B, C, D, E have been fetched from the server in the app and the user has read A and B, the user only gets to see C, D, E when they check for next posts. If they go to previous, they see posts in the following order: B -> A.
Furthermore, when P,Q is added to the database, now, the user must see next posts in the order of P->Q->C->D->E and so on.
Example: Let us assume there are 20 news posts in our app right now, and Gavin picks up his phone and starts reading from our app. In the midst of his usage, he finds himself occupied with some other work, so he quits the app after reading 5 news posts.
The challenge for us now is to figure the best way to make sure Gavin doesn’t have to re-read the 5 posts he already did.
One way we thought we could solve this problem is through use of index. We can assume uniform ordering for our posts as mentioned in the context, so we could use an index to track where Gavin was last in the order of news and show him news based on that index.
However, one problem with that approach is, we could easily have 5 new posts when Gavin picks up his phone and uses our app again. So, if we have the news based on date, technically that indexing approach means that we omit 5 unread new posts instead of the 5 read old ones.
We've also thought of maintaining three lists: Read, Unread and New so that we fetch only posts that are not in our lists. For example, in my initial example: A-B-C-D-E is in unread initially. Then, after user reads A-B, read becomes A-B. Meanwhile, when P-Q is added in the database, P-Q is added to the list of unread posts as P-Q-C-D-E.
How do you solve this problem? Any suggestions are welcome, as we suspect we're not thinking outside the box when it comes to a solution. Thank you! :)
When I first read the problem, the solution that came to my mind was also to keep two lists, read and unread: new posts are added to the end of the unread list, and the unread list is shown in reverse order so the most recent ones are on top. Is it the most efficient way, though? That is debatable. For example, if the number of new posts grows a lot, it will be memory-inefficient. But I assume the numbers are small in general.
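A minimal sketch of the read/unread idea, assuming each post has a stable ID and the server returns the feed already ordered. It tracks which IDs were read instead of a position index, so new posts arriving at the top do not shift anything. ReadTracker and its method names are hypothetical.

```python
class ReadTracker:
    """Hypothetical per-user tracker keyed by post ID rather than by position."""

    def __init__(self):
        self.read_ids = set()   # fast membership test
        self.read_order = []    # post IDs in the order the user read them

    def mark_read(self, post_id):
        if post_id not in self.read_ids:
            self.read_ids.add(post_id)
            self.read_order.append(post_id)

    def next_posts(self, feed):
        """Unread posts, in the order the server returned the feed."""
        return [post_id for post_id in feed if post_id not in self.read_ids]

    def previous_posts(self):
        """Already-read posts, most recently read first."""
        return list(reversed(self.read_order))
```

With the feed A, B, C, D, E and A, B marked as read, next_posts returns C, D, E and previous_posts returns B, A. If the server later returns P, Q, A, B, C, D, E, next_posts returns P, Q, C, D, E, which matches the ordering asked for in the question.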

Instant Messaging Schema design advice

I'm trying to build Instant Messaging functionality in my app as part of a bigger project.
Chats can have more than 2 participants (group chats)
If participant A deletes a message, it should still be visible to participant B (that's why I used the Message Participants table)
Same applies to Conversation.
By same logic, if all participants delete the conversation/message, it should be erased from DB.
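A rough sketch of the deletion rule described above, assuming a simplified two-table layout (the table and column names are made up, since the original diagram is not reproduced here) and using SQLite in memory only so that the example is self-contained:

```python
import sqlite3


def create_schema(conn):
    # Simplified, hypothetical tables: one row per message, one row per (message, viewer).
    conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute(
        "CREATE TABLE message_participant ("
        " message_id INTEGER NOT NULL,"
        " user_id INTEGER NOT NULL,"
        " PRIMARY KEY (message_id, user_id))"
    )


def delete_for_user(conn, message_id, user_id):
    """Hide the message for one user; erase it once no participant can see it."""
    with conn:  # one transaction for the whole rule
        conn.execute(
            "DELETE FROM message_participant WHERE message_id = ? AND user_id = ?",
            (message_id, user_id),
        )
        remaining = conn.execute(
            "SELECT COUNT(*) FROM message_participant WHERE message_id = ?",
            (message_id,),
        ).fetchone()[0]
        if remaining == 0:
            conn.execute("DELETE FROM message WHERE id = ?", (message_id,))


def visible_to(conn, user_id):
    """IDs of the messages the user can still see."""
    rows = conn.execute(
        "SELECT m.id FROM message m"
        " JOIN message_participant mp ON mp.message_id = m.id"
        " WHERE mp.user_id = ? ORDER BY m.id",
        (user_id,),
    )
    return [row[0] for row in rows]
```

The same pattern would apply to conversations with a Conversation Participants table.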
Questions :
I'm afraid that this schema is too cumbersome, meaning that the queries will be too slow once the app reaches a certain traffic mark (1k active users? I'm guessing)
Message Participants will have multiple records for each message - one for each participant in the chat. Instant Messaging means it will involve those writes with very tight timings. Wouldn't that be a problem?
Should I add a layer of Redis DB, to manage a chat's active session's messaging? it will store the recent messages, and actively sync the PostgreSQL db with those messages (perhaps with Async transactions functionality that postgresql has?)
UPDATED schema :
I would also gladly hear ideas for having a "read" status functionality. I'm assuming it's much more complex with Group chats, so at least offering that for 1:1 chats would be nice.
I am a little confused by your diagram. Shouldn't the Conversation Participants be linked to the Conversations instead of the Message? The FKs look all right, just the lines appear wrong.
I wouldn't be worried about performance yet. The Premature Optimization Anti-Pattern warns us not to give up a clean design for performance reasons until we know whether we are going to have a performance problem. You anticipate 1000 users - that's not much for a modern information system. Even if they are all active at the same time and enter a message every 10 seconds, this will just mean 100 transactions per second, which is nothing to be afraid of. Of course, I don't know the platform on which you are going to run this. But it should be an easy task to set up those tables and write a simple test program that inserts those records as fast as possible.
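A simple test program of that kind could look like the sketch below. It is only an illustration: it uses an in-memory SQLite database as a stand-in for the real platform (so the absolute numbers say nothing about PostgreSQL), and the table names, column names and function name are assumptions.

```python
import sqlite3
import time

# The estimate from the text: 1000 users, one message every 10 seconds each.
required_rate = 1000 / 10  # messages per second


def measure_insert_rate(n_messages=10_000, participants_per_chat=5):
    """Insert messages plus one participant row per viewer; return (messages/sec, rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, conversation_id INTEGER, body TEXT)"
    )
    conn.execute("CREATE TABLE message_participant (message_id INTEGER, user_id INTEGER)")
    start = time.perf_counter()
    for i in range(n_messages):
        with conn:  # one transaction per posted message
            cur = conn.execute(
                "INSERT INTO message (conversation_id, body) VALUES (?, ?)",
                (i % 100, "hello"),
            )
            conn.executemany(
                "INSERT INTO message_participant (message_id, user_id) VALUES (?, ?)",
                [(cur.lastrowid, user) for user in range(participants_per_chat)],
            )
    elapsed = time.perf_counter() - start
    rows = conn.execute("SELECT COUNT(*) FROM message_participant").fetchone()[0]
    conn.close()
    return n_messages / elapsed, rows


if __name__ == "__main__":
    rate, rows = measure_insert_rate()
    print(f"required: {required_rate:.0f} msg/s, measured: {rate:.0f} msg/s ({rows} rows)")
```

Running the same loop against the real database and hardware is what would actually answer the question.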
Your second question makes me wonder how "instant" you expect your message passing to be. Shall all viewers of a message receive each keystroke of the text within a millisecond? Or do they just need to see each message appear right after it was posted? Anyway, the limiting factor for user responsiveness will probably be the network, not the database.
Maybe this is not mainly a database design issue. Let's assume you will have a tremendous rate of postings and viewings. But not all conversations will be busy all the time. If the need arises - but not earlier - it might be necessary to hold the currently busy conversations in memory and use the database just as a backup for later, when they aren't busy any more.
Concerning your additional comments:
100k users: This is not a topic for this forum; it concerns the business development of a startup. Many founders of startup companies imagine huge masses of users being attracted to their site, while in reality most startups just fail or only reach very few. So beware of investments (in money, but also in design and implementation effort) that will only pay off in the highly improbable case that your company becomes the next WhatsApp.
In case you don't really anticipate such masses of users but just want to imagine this as a programming exercise, you still have a difficult task. You won't have the platform to simulate the traffic, so there is no way to make measurements on where you actually have a performance problem to solve. That's one of the reasons for the Premature Optimization warning: Unless you know positively where you have a bottleneck, you - and all of us - will be just guessing and probably make the wrong decisions.
Marking a message as read is easy: Introduce a boolean attribute read at Message Participants, and set it to true as soon as, well, the user has read the message. It's up to your business requirements in which cases and to whom you show this.

ejabberd MUC and MUC/Sub - Clarification

I've recently been playing with the new MUC-Sub module in ejabberd - the use case being I need to have WhatsApp-like permanent rooms in my mobile app. Before I go much further into using MUC/Sub, can an ejabberd expert opine on the below concepts please? It is probably a lack of full knowledge of ejabberd on my part, hence the basic questions. Or else please let me know a good place to start understanding the below better... I did study these two links in detail already (https://blog.process-one.net/xmpp-mobile-groupchat-introducing-muc-subscription/ and https://docs.ejabberd.im/developer/proposed-extensions/muc-sub/). Thanks!
Essentially, if we need an MUC room to stop being destroyed when all users go offline, could we not simply disable that feature - so that the service continues to operate even when participants leave or the room is empty? The service could still be made to keep pointing to the original room participants who joined the room, and in case a message is sent in the room, the message would get queued up on each participant's stream. If a participant is offline, the message would enter his / her offline messages list (instead of the archive / MAM that MUC-Sub is currently utilizing). Why did we need to rely on the Pub-Sub and MAM model if this problem could have been solved by simply retaining the participant's reference in the room (even after he / she goes offline) and then leveraging the mod_offline module (which should happen automatically)?
I'm sure there is a fundamental reason here that I'm overlooking, but I'd appreciate it if someone could shed some light please!
As the blog post explains, this is not a matter of keeping the chat room alive or not. The fact that users cannot receive pushes when offline or when they reconnect, if they do not join again, is because MUC is based on presence. A user that is not present in the room is not an occupant of the room and is not supposed to receive anything.
I recommend you carefully read XEP-0045 (MUC) and the MUC Sub blog post again. The issue MUC Sub solves should then be more obvious.
If you do that, you will notice that XEP-0045 defines the idea of a persistent room:
Persistent Room
A room that is not destroyed if the last occupant exits; antonym: Temporary Room.
The default in ejabberd is to create the room as temporary when a user joins, but the room's settings can be changed so that it becomes persistent. In that case, it is not destroyed when the last occupant leaves. You need to change the room's configuration options (the same form you used to enable MUC Sub on that room).
You would generally want to combine this option for the room with enabling MUC Sub, so that MUC rooms are kept around even if no users are present in them.

Can creating a table per synchronous chat instance be a wrong idea

I am developing a chat-like web app to experiment with live group editing.
My idea is something like Wave, where you can edit even the messages you have already sent.
I was planning to use MongoDB or something similar, with one table per chat.
My reason for that is: say there are 100 texts in one instance of a chat, and we have 10 such
chats. Then there will be 1000 texts in the table in which they are stored. So even if a person
in one chat edits one of his texts, the db has to look through all 1000. So if I use a table per chat,
I felt it could improve speed and performance.
But I want to know from people who have done this before.
There are at least 2 obvious issues with the approach (not going into the schema design):
You can't have an unlimited number of namespaces (I am not sure how big your use case is).
Check this: http://docs.mongodb.org/manual/reference/limits/#namespaces
The write lock is per db, so making multiple tables/collections won't speed up insertion.

XMPP multiplayer gaming: should I store opponents as roster contacts?

I've read all 484 pages of Professional XMPP and read countless forum threads regarding rosters + XMPP and this question is still something I am struggling to solve. I'm looking for insight on best practices, so I at least know which direction to go in.
I'm building a cross-platform (web, iOS & Xbox), turn-based board game. Every player can have up to 100 different matches active at any given moment -- so they could easily skip from one game where it's not their turn to one where it is.
The game will feature a lobby where your list of active games is displayed, along with the name and online status of each opponent for that game (you may have up to 3 opponents, 4 total players per game).
Additionally, each player will have a friends list accessible from a different area which also lists online status.
I am using XMPP behind the scenes, completely transparent to the players, no one will ever sign in with a Jabber client or anything of the sort. I have complete control over how the information is displayed and utilized.
The main aspects I am using XMPP to solve are: notifications when an opponent has made a move, seeing my friends' online statuses, seeing my opponents' online statuses, and in-game text chat.
So here's where I start having trouble: obviously your friends list will be contacts in your roster, so you can see their online status. But what about opponents? These are usually random opponents you will only play a single match with and never again -- yet your game with them may last up to 2 weeks.
Keeping in mind that everything is behind the scenes (ex: automatic subscription confirmations, etc) -- would the best course of action be to add each opponent to another group in your roster while the game is in progress and then remove them after the game is complete? That way you get presence notifications when that player is online? Or is this a case where PubSub could be utilized?
I've also considered using multi-user chat so I'd always have access to every user's online status without subscriptions, but that seems far from efficient when there could be up to 20k players online at any given moment. Definitely sounds like a battery hog on mobile devices as well.
My other solution is to use shared roster lists. Create a roster list for each game and assign that list to each player. Then delete the shared roster list once the game is complete.
I would choose Pubsub here. Of course, this means that you have to do some server side work too.
Send a directed presence to the opponents. This will allow them to see your presence.
I would consider using a multi-user chat for each game, and your own extension to the MUC protocol to handle game-state messages (opponent has made a move). The user can have a roster of their friends at the "global" level, but can still communicate with their opponents (and receive presence) using the MUC level (unless they decide to then add them as a friend).
See also: Advantages of Pubsub versus MUC
I agree that using MUCs (instant) would be better in this scenario. If you need to clean up pubsub nodes of unwanted subscribers, it will definitely be a pain in the ass.