Should I use Redis to store chat messages?

I am currently working on a chat, and I wonder whether I could use Redis to store the chat messages. The messages will only be shown on the web, and I want a chat history of at least 20 messages for each private chat. The chat subscribers are already stored in MongoDB.
I mainly want to use Redis so that I can keep the messages out of MongoDB and gain speed.
I already use Pub/Sub, but what about storing a copy in Redis Lists? Also what about reading statuses, how could I implement that?

With persistence enabled, Redis loses data only when it stops uncleanly, for example in a power outage, and how much is lost depends on the persistence settings. If the system is shut down properly, Redis saves its data and nothing is lost.
It is a good approach to dump data from Redis to MongoDB or another database when a size limit is reached or on a schedule (weekly or monthly), so that your real-time chat database stays lightweight.
Many modern systems prepare for power outages: a UPS keeps the machine running long enough for it to shut down properly.
see : https://hackernoon.com/how-to-shutdown-your-servers-in-case-of-power-failure-ups-nut-co-34d22a08e92
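The Redis List idea from the question and the size-limit offload suggested above can be combined. The following is only a minimal sketch: the key layout `chat:<id>:messages`, the function names, and the `archive` callback (standing in for a MongoDB write) are assumptions for illustration. `client` is expected to behave like a redis-py client (`rpush`, `llen`, `lpop`, `lrange`).

```python
import json

HISTORY_LIMIT = 20  # messages kept in Redis per chat, as asked in the question


def store_message(client, archive, chat_id, message):
    """Append a message to the chat's Redis List and offload any overflow.

    `client` is anything exposing redis-py style rpush/llen/lpop.
    `archive` is a hypothetical callable that persists one message elsewhere
    (for example in MongoDB).
    """
    key = f"chat:{chat_id}:messages"  # assumed key layout
    client.rpush(key, json.dumps(message))
    # Move the oldest entries out until only HISTORY_LIMIT remain.
    while client.llen(key) > HISTORY_LIMIT:
        oldest = client.lpop(key)
        archive(chat_id, json.loads(oldest))


def recent_messages(client, chat_id):
    """Return the history still held in Redis, oldest first."""
    key = f"chat:{chat_id}:messages"
    return [json.loads(raw) for raw in client.lrange(key, 0, -1)]
```

The push and the trimming loop are separate commands, so this sketch is not atomic; with several writers per chat you would probably want a transaction or a server-side script instead.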
Also what about reading statuses, how could I implement that?
It depends on the protocol you are implementing. If you are using XMPP, see this.
Otherwise, you can add a property to the message model, e.g. "DeliveryStatus", and set it to one of your enum values (1. Sent, 2. Delivered, 3. Read). Mark a message as Sent as soon as the server receives it. For Delivered and Read, your clients send back packets indicating that the respective action has occurred.
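A minimal sketch of such a status property, assuming a hypothetical `Message` model and an `apply_receipt` helper (neither comes from a specific library). The only design choice added here is that the status never moves backwards, since receipts may arrive late or out of order.

```python
from dataclasses import dataclass
from enum import IntEnum


class DeliveryStatus(IntEnum):
    SENT = 1
    DELIVERED = 2
    READ = 3


@dataclass
class Message:
    message_id: str
    body: str
    # A message counts as Sent as soon as the server has received it.
    status: DeliveryStatus = DeliveryStatus.SENT


def apply_receipt(message, reported):
    """Advance the status when a client reports Delivered or Read.

    A late Delivered receipt must not downgrade a message already marked Read.
    """
    if reported > message.status:
        message.status = reported
    return message.status
```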

As pointed out in the comment above, the important thing to consider here is the persistence model. Redis offers some persistence (snapshots and AOF files). The important thing is to first understand what you need:
Can you afford to lose all the data? Can you afford to lose some of the data? If the answer is no, then perhaps you should not bother with Redis.

Related

Application Advice: SSE vs. WebSockets

I'm writing an application in which users will be able to send money to each other. I've built out most of it, but now comes the most important part, managing transactions as they occur.
What I'd like to do upon a successful transaction is to send an update to the recipient. Right now, my thinking is to do this via SSE or WebSockets. For this particular app, it doesn't appear that I need bi-directional communication, since the response would only be sent to the recipient's instance which should be listening for a response from the server.
I might be answering my own question here, but I also wanted to factor in scale. If my app grows to a million users, for instance, which technology would best be able to handle the number of transactions being processed without failure?
I'm also a little unsure as to how to implement this for the case where there is a multiplicity of users, but I only want a particular user to receive the update.
Any advice would be greatly appreciated.
Thanks!

Which is best, polling or real-time, for Google applications like Gmail or Google Drive?

In general everyone says real-time is best for the performance of an application, but is it good to make every application real-time?
There are some cases where polling might be better than real-time streaming. Essentially, it's when you have a massive event stream and the client cannot easily cope with this stream in real time. For example, you are pushing tons of events to a mobile device that dequeues the data more slowly than the producer emits it. In such a case, thanks to polling, the client can ask for a new batch of data, process it at its own pace, then ask for another batch. Of course, all this makes sense only if the data producer (the server) is able to resample the data flow, so that at each request it doesn't need to send all the same data it would have sent when streaming.
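One way to picture the resampling mentioned above is a server-side buffer that keeps only the latest value per key between polls. This is a sketch under that assumption; the class name `ResamplingFeed` and its methods are made up for illustration.

```python
class ResamplingFeed:
    """Keeps only the latest value per key until the client polls."""

    def __init__(self):
        self._pending = {}

    def publish(self, key, value):
        # A newer value for the same key replaces the older one,
        # so a slow client never has to catch up on stale updates.
        self._pending[key] = value

    def poll(self, max_items):
        """Return up to max_items pending updates and forget them."""
        keys = list(self._pending)[:max_items]
        return {key: self._pending.pop(key) for key in keys}
```

If a producer publishes three values for the same key before the client polls, the client receives a single update with the latest value rather than three.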
So, to go back to your specific question, neither Gmail nor Google Drive produces so much real-time data that polling is needed (I know this sounds counterintuitive!), and I would say that real-time streaming would always be better than polling for them. But streaming is a bit more delicate than polling. You must monitor whether the connection is healthy. It could be half-closed or half-open, and you need bidirectional heartbeats to make sure it's fully alive. In case of disconnection, you must be able to automatically reconnect and restore the state from before the connection broke.
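The heartbeat and reconnect concerns can be sketched independently of any transport. The names `HeartbeatMonitor` and `backoff_delays`, the timeout, and the doubling delays are illustrative assumptions, not part of a particular streaming library.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer was last heard from."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_seen = clock()

    def beat(self):
        """Call whenever a heartbeat (or any data) arrives from the peer."""
        self._last_seen = self._clock()

    def is_alive(self):
        """False once the peer has been silent for longer than the timeout."""
        return self._clock() - self._last_seen <= self._timeout


def backoff_delays(base=1.0, cap=30.0):
    """Yield reconnect delays in seconds, doubling up to a cap."""
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)
```

Each side would run its own monitor and send its own heartbeats. When `is_alive()` turns false, the client drops the connection, waits for the next delay, reconnects, and then restores its state.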

How to design a real-time database update system?

I am designing a whatsapp like messenger application for the desktop using WPF and .Net. Now, when a user creates a group I want other members of the group to receive a notification that they were added to a group. My frontend is built in C#.Net, which is connected to a RESTful Webservice (Ruby on Rails). I am using Postgres for the database. I also have a Redis layer to cache my rails models.
I am considering the following options.
1) Use Postgres's built-in NOTIFY/LISTEN mechanism, which the clients can subscribe to directly. I foresee two issues here:
i) Postgres might not be able to handle tens of thousands of clients subscribed directly.
ii) There is no guarantee of delivery if the client is disconnected.
2) Use Redis' Pub/Sub mechanism, to which the clients can subscribe. I am still concerned about the lack of guaranteed delivery here.
3) Use a message queue like RabbitMQ. The producer for this queue will be Postgres, which will push messages in through triggers. The consumers will of course be the .Net clients.
So far, I am inclined to use the 3rd option.
Does anyone have any suggestions how to design this?
In an application like WhatsApp itself, the client running in your phone is an integral part of a large and complex event-based, distributed system.
Without more context, it would be impossible to point in the right direction. That said:
For option 1: You seem to imply that each client, as in a WhatsApp client, would directly (or through some web service) communicate with Postgres as an event bus, which is not sound and would not scale because you can only have ONE Postgres instance.
For option 2: You have the same problem as in option 1, with worse failure modes.
For option 3: RabbitMQ seems like a reasonable ally here. It is distributed in nature and scales well. As a matter of fact, it runs on Erlang, just as most of WhatsApp does. Using triggers inside Postgres to publish messages, however, does not make a lot of sense.
You need a message bus because you would have lots of updates to do in the background, not to directly connect your users to each other. As you said, clients can be offline.
Architecture is more about deferring decisions than taking them.
I suggest that you start simple. Build a small, monolithic, synchronous system first, pushing updates as persisted data to all the involved users. For example, in a group of n users, just write n records to a table. It is already complicated to reliably keep track of who has received and read what.
These heavy "group" updates can later be moved to long-running processes using RabbitMQ or the like, but a system with several thousand users can very well work without such a thing, especially because a simple message from user A to user B would not need many writes.
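The "write n records to a table" starting point can be sketched as follows. SQLite (from the Python standard library) stands in for Postgres here purely so the example is self-contained; the `notifications` table, its columns, and the function names are assumptions made for illustration.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE notifications ("
        " id INTEGER PRIMARY KEY,"
        " user_id TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " received INTEGER NOT NULL DEFAULT 0)"
    )


def notify_group(conn, member_ids, payload):
    """Write one notification row per group member in a single transaction."""
    with conn:
        conn.executemany(
            "INSERT INTO notifications (user_id, payload) VALUES (?, ?)",
            [(user_id, payload) for user_id in member_ids],
        )


def fetch_pending(conn, user_id):
    """Return a user's unreceived payloads and mark them as received."""
    with conn:
        rows = conn.execute(
            "SELECT id, payload FROM notifications"
            " WHERE user_id = ? AND received = 0 ORDER BY id",
            (user_id,),
        ).fetchall()
        conn.executemany(
            "UPDATE notifications SET received = 1 WHERE id = ?",
            [(row_id,) for row_id, _ in rows],
        )
    return [payload for _, payload in rows]
```

Because every update is a persisted row, a client that was offline simply fetches its pending rows when it comes back, which addresses the delivery concern raised for options 1 and 2.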

What are the possible use cases of the OrientDb Live Query feature?

I apologise if the question is naive. I wanted to understand what could be a few possible use cases of the live query feature.
Let's say - My database state changes but it doesn't change every minute (or hour). If I execute a live query against my database/class/cluster, I'm not really expecting the callback to be called anytime soon. But, hey, I would still want to be notified when there's a state change.
My need with Orientdb is more on lines of ElasticSearch's percolator bundled with a publish-subscribe system.
Is live query meant to cater to such use cases too? Or is my understanding of live query very limited? What could be a few possible use cases for the live query feature?
Thanks!
Whether or not Live Queries will be appropriate for your use case depends on a few things. There are several reasons why live queries can make sense. A few questions to ask are:
How frequently does the data change?
How soon after the data changes do you need to know about it?
How many different groups of data (e.g. classes, clusters) do you need to deal with?
How many clients are connected to the server?
If the data does not change very often, or if you can wait a set period of time before an update, or you don't have many clients (hitting the DB directly), or if you only have one thing feeding the database, then you might want to just do polling. There is a balance between holding a connection open that you send a message on very infrequently (live queries) and polling too often.
For example, it's possible that you have an application server (Tomcat, Node, etc.) and that your clients connect via web sockets. Now let's say your app server makes one (or a few pooled) live queries to the database, and the database has an update. The update might just go from the database to the app server (e.g. Node). Node may then be responsible for fanning out that message across 100 web sockets (one for each connected client). In this case, the fact that Node is connected to the database in a persistent way, with a live query open, is not that big of a deal.
The question is: if you have thousands of clients connected, do they all need an immediate update? If so, are you planning on having them poll at a short interval? If so, you could probably benefit from a live query. Lots of clients polling at a short interval will generate a lot of unnecessary traffic and queries.
Unfortunately, at the end of the day, the answer is "it depends". You probably need to prototype and then instrument under load to see what your tradeoffs are. But in principle, it is less about how frequently updates come, and more about how often you would have clients poll and how many clients you have. If the answer is "short intervals and a lot of clients", give live queries a try.
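The fan-out described above, one live query feeding many web sockets, can be sketched without any database or socket library. The `FanOut` class and the `send` callables are hypothetical stand-ins; this does not use the OrientDB client API.

```python
class FanOut:
    """One upstream subscription, many downstream client connections."""

    def __init__(self):
        self._clients = {}

    def connect(self, client_id, send):
        # `send` stands in for a per-client web socket send function.
        self._clients[client_id] = send

    def disconnect(self, client_id):
        self._clients.pop(client_id, None)

    def on_database_update(self, record):
        """Callback for the single live query; relays to every client."""
        for send in list(self._clients.values()):
            send(record)
        return len(self._clients)
```

The database then sees one persistent subscriber regardless of how many clients are connected to the app server.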

Interprocess messaging - MSMQ, Service Broker,?

I'm in the planning stages of a .NET service which continually processes incoming messages, which involves various transformations, database inserts and updates, etc. As a whole, the service is huge and complicated, but the individual tasks it performs are small, simple, and well-defined.
For this reason, and in order to allow for easy expansion in future, I want to split the service into several smaller services which basically perform part of the processing before passing it onto the next service in the chain.
In order to achieve this, I need some kind of intermediary messaging system that will pass messages from one service to another. I want this to happen in such a way that if a link in the chain crashes or is taken offline briefly, the messages will begin to queue up and get processed once the destination comes back online.
I've always used message queuing for this type of thing, but have recently been made aware of SQL Service Broker which appears to do something similar. Is SQLSB a viable alternative for this scenario and, if so, would I see any performance benefits by using that instead of standard Message Queuing?
Thanks
It sounds to me like you may be after a service bus architecture. This would provide you with the coordination and fault tolerance you are looking for. I'm most familiar with, and partial to, NServiceBus, but there are others, including Mass Transit and Rhino Service Bus.
If most of these steps initiate from a database state and end up in a database update, then merging your message storage with your data storage makes a lot of sense:
a single product to backup/restore
consistent state backups
a single high-availability/disaster recoverability solution (DB mirroring, clustering, log shipping etc)
database scale storage (IO capabilities, size and capacity limitations etc as per the database product characteristics, not the limits of message store products).
a single product to tune, troubleshoot, administer
In addition, there are serious performance considerations: having your message store be the same as the data store means you are not required to do two-phase commit on every message interaction. Using a separate message store requires you to enroll the message store and the data store in a distributed transaction (even if it is on the same machine), which requires two-phase commit and is much slower than the single-phase commit of database-only transactions.
Using a message store in the database as opposed to an external one also has advantages like queryability (you can run SELECT over the message queues).
Now if we translate the abstract terms 'message store in the database' as being Service Broker and 'non-database message store' as being MSMQ, you can see my point about why SSB will run circles around MSMQ any time.
My recent experiences with both approaches (starting with SQL Server Service Broker) have left me crying out to get my messages out of SQL Server. The problem is quasi-political, but you might want to consider it: SQL Server in my organisation is managed by a specialized DBA, while application servers (i.e. messaging like NServiceBus) are managed by developers and the network team. Any change to the database servers requires painful performance analysis from the DBA and is steeped in fear that our queuing engine, living in the same space, might bring down the standard SQL workloads.
SSSB is pretty difficult to manage (not unlike messaging middleware), but the difference is that I have more room to screw something up in the messaging world (the worst that may happen is a pile of messages building up somewhere and logs filling up), while I can't afford any mistakes in the SQL world, where customer transactional data lives and is vital for the business (including data from legacy systems). I really don't want to get those 'unexpected database growth' or 'wait time alert' or 'why is my temp db growing without end' emails anymore.
I've learned that application servers are cheap. Just add message handlers, add machines... easy. Virtually no license costs. With SQL Server it is exactly the opposite. It now appears to me that using Service Broker for messaging is like using an expensive car to plow a potato field. It is much better for other things.