I'm using PostgreSQL's NOTIFY command to send async events to inform external programs of the changes happening inside a database. It works perfectly, but now I've got a new scenario: I need to have several databases within a single instance of PostgreSQL.
From reading the documentation and testing it myself, I've found that NOTIFY does not cross the borders of a database (it does not reach other databases within the same PostgreSQL instance):
Whenever the command NOTIFY channel is invoked, either by this session
or another one connected to the same database, all the sessions
currently listening on that notification channel are notified, and
each will in turn notify its connected client application.
Which means I have to listen for notifications on each database separately. And since I'm planning to give my users the capability to instantiate their own database on demand, it also means I have to open a new listener connection for every new database. That poses a challenge, and I'd much prefer to keep a constant number of listener connections, regardless of the number of databases.
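For illustration, here is a minimal sketch of the current per-database approach, assuming psycopg2; the database names and the channel name are placeholders. The point is simply that the number of listening connections grows with the number of databases:

```python
# Minimal sketch (assuming psycopg2): one listener connection per database,
# so the connection count grows with the number of databases.
import select
import psycopg2

DB_NAMES = ["tenant_a", "tenant_b", "tenant_c"]  # placeholder database names

connections = []
for db in DB_NAMES:
    conn = psycopg2.connect(dbname=db, user="listener", host="localhost")
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute("LISTEN my_channel;")  # channel name is a placeholder
    connections.append(conn)

while True:
    # Wait until any of the per-database connections has a notification.
    ready, _, _ = select.select(connections, [], [], 5)
    for conn in ready:
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            dbname = conn.get_dsn_parameters()["dbname"]
            print(f"{dbname}: {note.channel} {note.payload}")
```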
Does anyone know how to send notifications across databases in PostgreSQL, or of some other feature I could use instead?
Related
I currently have a server that watches some events on the ethereum blockchain. When those events are triggered, my server, being subscribed to them, picks them up, does some processing, and fills my database accordingly.
That said, for scalability purposes, let's say I'd now like to have several instances of my server. So now I have 3 servers that watch the ethereum blockchain for events and fill my db.
What is the proper/standard way to handle the fact that all my servers will be pushing the same data to my db?
I'm using a postgres database to maintain a list of rooms and the users connected to each room.
Users can enter a room whenever they want, but they should leave the room when they close the browser.
A good flow of events should be
User enters room (user's room var is set) -> ... -> User disconnects and server notices (user's room var is unset)
But what if this happens?
User enters room (user's room var is set) -> ... -> Server crashes or shuts down for updates -> User disconnects and server doesn't notice (user's room var is still set) -> Server is back on
In this last case, the database state is already broken. What's the best way to deal with something like this? Thanks
Let's divide the answer into 2 aspects:
User Aspect:
Regardless of the language at hand, you should be made aware of disconnection events through socket event/exception handling.
If the server crashes, your user will experience an abrupt socket disconnection/connection closing/session termination, depending on which framework you are using. TCP sockets also have keepalive (SO_KEEPALIVE) exactly for that (you can usually control these, or similar, settings from the high-level protocol).
So all you need to do in that case is run maintenance code on the user's end (unset the variable, in the case you describe).
Server Aspect:
It's a bit trickier here. What you are basically looking for is ephemeral state management, meaning the ability to react to abrupt service/server termination (server crashes that leave a corrupted/unclean state) and to clean up after it.
For that, technologies like Zookeeper or Consul exist. I personally recommend Zookeeper, as I have built similar solutions on top of it several times in the past.
With Zookeeper, when your server starts up it can, for instance, create an EPHEMERAL node. That node is created when the server goes up and remains there for as long as the server is alive and connected to the Zookeeper cluster. If the server crashes unexpectedly, this node is removed.
You can then have a separate application/script that listens for events on that zk node/path. If the node is suddenly removed, you can run a cleanup routine on the database.
This approach supports multiple app instances, of course - you can listen for events under a path and have every server instance register a different node under it. Each node can carry instance-specific identifiers, and you can use those to clean up that specific instance's state from the database.
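As a rough sketch of that pattern with the Python kazoo client (the paths and the instance identifier below are made up for illustration): the server registers an ephemeral node at startup, and a separate watcher reacts when one of those nodes disappears.

```python
# Rough sketch using the kazoo Zookeeper client; paths/ids are illustrative.
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# --- Server side: register an ephemeral node on startup. ---
instance_id = "server-42"  # hypothetical instance identifier
zk.ensure_path("/myapp/instances")
zk.create(f"/myapp/instances/{instance_id}", b"instance metadata", ephemeral=True)
# The node disappears automatically if this process dies or loses its ZK session.

# --- Watcher side (normally a separate script): react when a node vanishes. ---
known = set()

@zk.ChildrenWatch("/myapp/instances")
def on_instances_changed(children):
    global known
    gone = known - set(children)
    for instance in gone:
        # Run the database cleanup routine for that instance here,
        # e.g. unset the "room" variable for the users it was handling.
        print(f"instance {instance} disappeared, cleaning up its state")
    known = set(children)
```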
It can also be wise to move the clean-up/maintenance duty into a separate component.
(Note that ZooKeeper requires careful attention when dealing with connection/state events.)
Some additional Zookeeper reading material
Final Thoughts:
Of course the answer can be fine-tuned based on specific needs that were not presented in the question.
When building complex, stateful solutions, I personally aim to deal with crashes on all ends of the solution, playing it 'safe' where possible.
An application that I'm working on uses AFTER CREATE/UPDATE/DELETE triggers to create pg_notify notifications when certain actions occur within the system. Currently, we have a small Node.JS application that LISTENs for the events and then immediately turns around and posts them to an AWS SNS topic, which gets forwarded to our SQS event queue. From that queue, we trigger all sorts of things based on the event (emails, SMSs, lambdas, long running jobs, etc).
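To make the shape of that middle piece concrete, here is roughly what it does, sketched in Python with psycopg2 and boto3 rather than Node.JS; the channel name, topic ARN, and connection details are placeholders:

```python
# Sketch of the forwarder in the middle (Python stand-in for the Node.JS app).
import select
import boto3
import psycopg2

sns = boto3.client("sns", region_name="us-east-1")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:app-events"  # placeholder ARN

conn = psycopg2.connect(dbname="app", user="listener", host="db.example.internal")
conn.autocommit = True
cur = conn.cursor()
cur.execute("LISTEN app_events;")  # channel used by the pg_notify triggers (placeholder)

while True:
    ready, _, _ = select.select([conn], [], [], 10)
    if not ready:
        continue  # timeout, loop again
    conn.poll()
    while conn.notifies:
        note = conn.notifies.pop(0)
        # Forward the notification payload to SNS; SQS is subscribed to the topic.
        sns.publish(TopicArn=TOPIC_ARN, Message=note.payload)
```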
This architecture works well, but the Node.JS application that sits in between the PostgreSQL instance and the SNS topic seems a bit fragile. I can't really run two copies in two availability zones, because messages will be duplicated.
I'm looking for a better way to get these Postgres notifications into SQS. Are there any options out there for this? If Postgres Aurora has something, we might consider that.
Use your current strategy of a small application that LISTENs for events. Just introduce a deduplication step between that app and your event subscribers. This will allow you to run several instances of your app.
For example, you could use a FIFO SQS queue. These automatically drop duplicate messages. Since FIFO queues cannot subscribe to SNS, you'd need to put messages directly to the queue instead of through SNS.
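As a sketch with boto3 (the queue URL and group ID are placeholders): each listener instance computes the same deduplication ID for the same notification and pushes it straight to the FIFO queue, which drops the second copy within its deduplication window.

```python
import hashlib
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/app-events.fifo"  # placeholder

def forward(payload: str) -> None:
    # Both app instances derive the same deduplication ID from the same
    # notification, so the FIFO queue keeps only one copy.
    dedup_id = hashlib.sha256(payload.encode()).hexdigest()
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=payload,
        MessageGroupId="pg-events",        # required for FIFO queues
        MessageDeduplicationId=dedup_id,
    )
```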
Alternatively, you could use DynamoDB to store checksums of your recent messages and if your app encounters a duplicate, drop it manually (make sure to use conditional writes to prevent race conditions).
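A rough sketch of that variant, again with boto3 and an assumed DynamoDB table named recent-event-checksums keyed on checksum: a conditional put either claims the checksum or fails, and a failed condition means another instance already handled the event.

```python
import hashlib
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")
TABLE = "recent-event-checksums"  # hypothetical table with partition key "checksum"

def is_first_delivery(payload: str) -> bool:
    checksum = hashlib.sha256(payload.encode()).hexdigest()
    try:
        dynamodb.put_item(
            TableName=TABLE,
            Item={"checksum": {"S": checksum}},
            # The conditional write fails if another instance already recorded
            # this checksum, avoiding the race between instances.
            ConditionExpression="attribute_not_exists(checksum)",
        )
        return True
    except dynamodb.exceptions.ConditionalCheckFailedException:
        return False  # duplicate: another instance got there first
```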
Some options I've found:
Continue with the current method
I could keep the current small application that redirects events from my PostgreSQL RDS and dumps them into SNS->SQS. I can deploy it in a single-region, max 1/min 1 auto-scaling group to make sure there is never more than one copy running at a time.
Ditch my RDS and use a self-hosted database
I could ditch RDS and run PostgreSQL on an EC2 instance and then use PL/Python along with the AWS-SDK to make calls to SNS instead of using pg_notify. I don't like this idea, because I lose the ease of use that comes with RDS.
For now, I'll be sticking with the current method, unless someone has some other ideas that I could explore. I'm sure there will be more options in the future (like when Aurora PostgreSQL adds support for invoking Lambdas, as Aurora MySQL already does).
I have a distributed REST application, written in C++, with an integrated SQLite DB. The application is self-contained - no Apache or IIS server, and no external MySQL. The application is the logic behind a hardware sensor: it monitors sensor(s), identifying and storing data of interest, and generating "events" when data of interest repeats. The creation of data of interest is synchronized across the Internet to multiple instances of the application, using REST to communicate the synchronization.
Using basic authentication over https, each instance maintains a local key/value store of remote instances' user/pass authentication data. This is necessary because each communication with a remote instance of the application requires authentication.
My question is how to handle the situation when the human operator changes either the username or password in the application, while the application is in active synchronization with remote instances.
I'm thinking this is really no different from any other material application data changing - when a local username/password changes, a REST communication is posted to each synchronized instance containing the changed data for that remote's local key/value store. Any communications that fail get queued for when that remote is back online, as that is material information the remote needs to maintain synchronization.
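Very roughly, and only to illustrate the idea (Python and requests stand in for the C++ REST client here; the endpoint path is hypothetical): post the change to every known peer and queue whatever fails.

```python
# Sketch: propagate a credential change to every peer; queue failures for retry.
from collections import deque
import requests

retry_queue = deque()  # communications to replay when a peer comes back online

def propagate_credential_change(peers, new_username, new_password):
    """peers maps a remote base URL to the (user, password) we authenticate with."""
    body = {"username": new_username, "password": new_password}
    for base_url, (remote_user, remote_pass) in peers.items():
        try:
            resp = requests.post(
                f"{base_url}/sync/credentials",   # hypothetical endpoint
                json=body,
                auth=(remote_user, remote_pass),  # basic auth over https
                timeout=5,
            )
            resp.raise_for_status()
        except requests.RequestException:
            # Same treatment as any other failed sync: queue it for later.
            retry_queue.append((base_url, body))
```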
Because the communications occur over https, the fact that authentication data is being passed around is okay.
I thought I might need special logic to handle the race condition where one instance tries to communicate with another, but the other has just changed its authentication fields. The sender will queue the failed communication with my current logic, and when the remote sends its updated authentication data, the locally queued failed communications will start succeeding. So that does not appear to be an issue.
I guess this is a request to anyone who's been here before: what did you do? Maybe my search terms are weak here, because I'm not finding discussion of this issue.
I am designing a WhatsApp-like messenger application for the desktop using WPF and .Net. Now, when a user creates a group, I want the other members of the group to receive a notification that they were added to it. My frontend is built in C#/.Net and is connected to a RESTful web service (Ruby on Rails). I am using Postgres for the database. I also have a Redis layer to cache my Rails models.
I am considering the following options.
1) Use Postgres's built-in NOTIFY/LISTEN mechanism, which the clients can subscribe to directly. I foresee two issues here:
i) Postgres might not be able to handle tens of thousands of clients subscribed directly.
ii) There is no guarantee of delivery if the client is disconnected
2) Use Redis' Pub/Sub mechanism to which the clients can subscribe. I am still concerned with no guarantee of delivery here.
3) Use a messaging queue like RabbitMQ. The producer for this queue will be Postgres, which will push messages in through triggers. The consumers, of course, will be the .Net clients.
So far, I am inclined to use the 3rd option.
Does anyone have any suggestions how to design this?
In an application like WhatsApp itself, the client running in your phone is an integral part of a large and complex event-based, distributed system.
Without more context, it would be impossible to point in the right direction. That said:
For option 1: You seem to imply that each client, as in a WhatsApp client, would directly (or through some web service) communicate with Postgres as an event bus, which is not sound and would not scale because you can only have ONE Postgres instance.
For option 2: You have the same problem as in option 1, with worse failure modes.
For option 3: RabbitMQ seems like a reasonable ally here. It is distributed in nature and scales well. As a matter of fact, it runs on Erlang, just as most of WhatsApp does. Using triggers inside Postgres to publish messages, however, does not make a lot of sense.
You need a message bus because you would have lots of updates to do in the background, not to directly connect your users to each other. As you said, clients can be offline.
Architecture is more about deferring decisions than taking them.
I suggest that you start simple. Build a small, monolithic, synchronous system first, pushing updates as persisted data to all the involved users. For example: in a group of n users, just write n records to a table. It is already complicated to reliably keep track of who has received and read what.
These heavy "group" updates can then be moved to long-running processes using RabbitMQ or the like, but a system with several thousand users can work very well without such a thing, especially because a simple message from user A to user B does not need many writes.
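To illustrate the simple starting point (assuming Postgres accessed via psycopg2; the table and column names are invented): a group event is just one pending-notification row per member.

```python
import psycopg2

def notify_group_members(conn, group_id, member_ids, event_text):
    # One pending-notification row per member; each client marks its row as
    # delivered/read when it fetches it, which is what actually needs tracking.
    with conn.cursor() as cur:
        cur.executemany(
            """
            INSERT INTO pending_notifications (user_id, group_id, body)
            VALUES (%s, %s, %s)
            """,
            [(member_id, group_id, event_text) for member_id in member_ids],
        )
    conn.commit()
```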