How to always gracefully disconnect sockets on server kill? - sockets

In our Flask-SocketIO app, we have a socket.on("disconnect") handler that is called whenever a socket client disconnects, to handle the db state updates. However, when our server is killed due to a restart or a crash, this disconnect handler cannot run (the server is temporarily down), and the events are discarded. When the server is back up, the disconnects for those frontends can never be processed, so the state is inconsistent.
Is there a way to "cache" these disconnect events and run them when the server is back up? The end goal is ideally to have all the sockets reconnect automatically as well, but currently we do this disconnect and reconnect manually. Our setup is Gunicorn Flask threads, load balanced by Nginx, with a Redis event queue and Flask-SocketIO.

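You should register a process signal handler:
import signal

def handler(signum, frame):
    # Loop over all connected sockets and run the disconnect handling for each,
    # then let the process exit.
    raise SystemExit(0)

# Register the handler for SIGTERM, which a normal kill or restart sends
# (SIGALRM would only fire when an alarm timer expires).
signal.signal(signal.SIGTERM, handler)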

The best way to handle this is to not force-kill your server. Instead, handle a signal such as SIGINT and disconnect all clients in a corresponding signal handler.
Another alternative is to keep track of your connected clients in a database (redis, for example). If the application is killed and restarted, it can go through those users and perform any necessary cleanup before starting a new server instance.
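The second suggestion (tracking connected clients in a store and cleaning up on startup) could look roughly like the sketch below. The key name `connected_sids`, the helper names, and the `cleanup` callback are hypothetical. The in-memory class is only a stand-in so the sketch runs on its own; it mimics the `sadd`, `srem` and `smembers` set commands that a Redis client would provide.

```python
class InMemoryStore:
    """Stand-in for a Redis client, exposing only the set commands used below."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        self._sets.setdefault(key, set()).add(member)

    def srem(self, key, member):
        self._sets.get(key, set()).discard(member)

    def smembers(self, key):
        return set(self._sets.get(key, set()))


CONNECTED_KEY = "connected_sids"  # hypothetical key name


def on_connect(store, sid):
    # Record the client as connected as soon as it connects.
    store.sadd(CONNECTED_KEY, sid)


def on_disconnect(store, sid, cleanup):
    # Normal path: run the db state update, then forget the client.
    cleanup(sid)
    store.srem(CONNECTED_KEY, sid)


def recover_after_restart(store, cleanup):
    # Startup path: every id still recorded belongs to a client whose
    # disconnect was never processed because the server died first.
    stale = store.smembers(CONNECTED_KEY)
    for sid in stale:
        on_disconnect(store, sid, cleanup)
    return stale


# Simulated crash: two clients connect, the server dies, then restarts.
store = InMemoryStore()
on_connect(store, "sid-1")
on_connect(store, "sid-2")
cleaned = []
recovered = recover_after_restart(store, cleaned.append)
```

With a real Redis client the member values may come back as bytes unless the client is configured to decode responses, so the cleanup callback would need to account for that.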


What should the server do when a connected client's process is force-killed and both sides use a TCP socket?

When using net and stream sockets, after a client has connected to the server, what should the server do if the connected client's process is force-killed?
Does the server know when a connected client's process was force-killed?
The server knows when a client socket gets closed, which it implicitly does when the process owning the socket gets killed. The server does not get the reason why the socket gets closed though.
So there is no way for the server to react specifically to a socket being closed because the process was killed. The server can only react to a socket being closed at a time when it does not expect that to happen. How the server should react depends on the specific use case, i.e. there is no universal behavior.
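As a minimal illustration of what the server actually observes, the sketch below uses `socket.socketpair()` to stand in for a real client/server TCP connection. The helper name `peer_has_closed` is made up for this example. A closed peer shows up as an empty read (or, in some cases, a connection reset), with no indication of why the peer went away.

```python
import socket


def peer_has_closed(conn):
    """Return True if reading reports end-of-stream or a reset.

    Note: this blocks until data or a close arrives, and any data
    that was read is discarded, so it is only a demonstration.
    """
    try:
        data = conn.recv(4096)
    except ConnectionResetError:
        return True
    return data == b""


# socketpair() gives two connected sockets; one plays the client.
server_side, client_side = socket.socketpair()

# Closing the client end is what the OS does implicitly when the
# owning process is killed.
client_side.close()

closed = peer_has_closed(server_side)
server_side.close()
```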

Is it possible to deploy without downtime and without disconnecting connected TCP sockets?

There is a long-lived TCP connection. Up to two clients can connect to a server. In other words, the load is not high. However, once a TCP connection is made, the socket will not be disconnected unless there is an accident, such as a server power down or network failure. Is it possible to reuse an existing TCP socket when restarting the process? I think a TCP load balancer like AWS NLB cannot be used, since the existing socket won't be moved to a new application. I'd like to have a deployment without downtime, as the system I'm working on can suffer financial damage when a socket is broken and data is lost. Low-level socket programming is ok.
I have read CloudFlare's article explaining Nginx's graceful reload mechanism. Since an HTTP server opens and closes sockets frequently, that article assumes that the server's connections will be closed at some point, but my situation is slightly different. So I'm not sure if this can be used.
A socket can be shared between multiple processes, for example by opening the socket in a parent process and forking a child process. But when the last process using the socket closes it, the socket and thus the underlying connection are implicitly closed.
This means you must make sure that there is always a process open which uses the socket. This can be done, for example, if the deployment of the new software does not first exit the old process and then create the new one, but instead starts the new process and has the old process transfer the socket to it. See "Can I share a file descriptor to another process on linux or are they local to the process?"
for how this can be done in Linux. Another way would be to use file descriptor inheritance when doing a fork().
Note that this sharing of file descriptors will only work with plain sockets, where the state is fully kept in the OS kernel. It will be much harder or impossible with TLS sockets, since in that case the current user space state also needs to be shared somehow.
Another way is to have an intermediate "proxy" which on the one hand holds the stable socket connection to your fragile application and on the other hand has robust socket handling (i.e. reconnects when needed) toward the application you want to update. This proxy then transfers the traffic between both sides and reconnects the socket whenever a problem occurs.
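A rough sketch of handing a connected socket from the old process to the new one over a Unix domain socket follows. It assumes Linux (or another Unix) and Python 3.9 or later, where `socket.send_fds` and `socket.recv_fds` are available. The function names `hand_over` and `take_over` are made up, and both "processes" are simulated inside one process to keep the example self-contained.

```python
import socket


def hand_over(control, conn):
    """Old process: send the connection's file descriptor to the new process."""
    socket.send_fds(control, [b"fd"], [conn.fileno()])


def take_over(control):
    """New process: receive the descriptor and wrap it in a socket object."""
    _msg, fds, _flags, _addr = socket.recv_fds(control, 16, 1)
    return socket.socket(fileno=fds[0])


# The long-lived connection: 'client' is the remote peer,
# 'server_conn' is the end owned by the old server process.
client, server_conn = socket.socketpair()

# A Unix domain socket pair used only for the handover itself.
old_ctl, new_ctl = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

hand_over(old_ctl, server_conn)
adopted = take_over(new_ctl)

# The old process can now close its copy (or exit); the connection
# stays open because the adopted descriptor still refers to it.
server_conn.close()

client.sendall(b"ping")
reply = adopted.recv(4)

for s in (client, adopted, old_ctl, new_ctl):
    s.close()
```

Only the kernel-level socket moves across. Any data the old process has already read but not yet handled, and any protocol state held in user space, would have to be transferred separately.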

How to prevent an Amazon ELB sitting in front of RabbitMQ from closing the connection with Celery?

The ELB automatically closes a connection after 60 seconds of idling, and the TCP connection switches to the CLOSE_WAIT state.
However, Celery does not notice this and keeps publishing task messages.
The messages are kept in the send buffer.
When the buffer is full, the Celery publishing call blocks.
Possible damages:
Messages in the send buffer will be lost.
The blocking publishing call is very harmful to single-threaded ioloop frameworks, e.g. Tornado.
Possible solutions:
Set BROKER_TRANSPORT_OPTIONS = {'confirm_publish': True} to make Celery wait for an ack for each published message; if the ack is not received, it rebuilds the connection and sends again. This only applies to py-amqp and reduces performance.
Use the Celery-RabbitMQ heartbeat to keep the connection active and avoid the ELB's automatic close. This adds network overhead, and in a bad network environment the heartbeat might not be delivered to both ends, in which case this solution does not work.
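Put together, the two options might appear in an old-style Celery settings module as below. `BROKER_TRANSPORT_OPTIONS` is taken from the text above, and `BROKER_HEARTBEAT` is the old-style setting name for the broker heartbeat. The value of 20 seconds is only an assumption, chosen to stay well below the 60-second idle timeout mentioned above, and `ELB_IDLE_TIMEOUT_SECONDS` is a made-up constant for readability, not a Celery setting.

```python
# Idle timeout described above; not a Celery setting, just a reference value.
ELB_IDLE_TIMEOUT_SECONDS = 60

# Option 1: wait for a broker ack on every publish (py-amqp transport only).
BROKER_TRANSPORT_OPTIONS = {"confirm_publish": True}

# Option 2: heartbeat interval in seconds (assumed value), kept shorter
# than the idle timeout so the connection never looks idle to the ELB.
BROKER_HEARTBEAT = 20
```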

How to deploy a WebSocket server?

When deploying a web application running on a traditional web server, you usually restart the web server after the code updates. Due to the nature of HTTP, this is not a problem for the users. On the next request they will get the latest updates.
But what about a WebSocket server? If I restart or kill the old process all connected users will get disconnected. So my question is, what kind of strategy have you used to deploy a WebSocket server smoothly?
You're right, every connected user will be disconnected if the server restarts.
I think the least bad solution is to have the client reconnect in its onClose handler.
WebSockets is just a transport mechanism. Libraries exist that build on that transport and provide heartbeats, browser fallbacks, graceful reconnects, and handling for other edge cases found in real-time applications.
In our WebSocket-enabled application, such a library is central to ensuring our continuous deployment setup doesn't break users' active socket connections.
If clients are connected directly to the server that does all the socket networking and application logic, then yes, they will be disconnected, because the TCP layer that holds the connection goes away with the process.
If you have a gateway that clients connect to, and that gateway application runs on another server and forwards messages to the logical server, then the logical server sends its responses to the gateway, which passes them back to the clients. With such an infrastructure, you have to implement queuing of packets on the gateway until it re-establishes the connection with the logical server. The logical server might notify the gateway before it restarts. That way the client keeps its connection; it just won't receive any responses for a while.
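The queuing on the gateway could be sketched like this. The class and method names are hypothetical, and the backend connection is reduced to a plain callable so the example stays self-contained; a real gateway would also need a bound on the queue and a policy for what to do when it fills up.

```python
from collections import deque


class BufferingGateway:
    """Holds client messages while the logical server is unavailable."""

    def __init__(self):
        self.backend_send = None  # callable while the backend is connected
        self.pending = deque()    # messages waiting for the backend

    def backend_down(self):
        # Called when the logical server announces a restart or drops.
        self.backend_send = None

    def backend_up(self, send):
        # Called after reconnecting: flush queued messages in arrival order.
        self.backend_send = send
        while self.pending:
            send(self.pending.popleft())

    def from_client(self, message):
        # Client connections stay open either way; only delivery is delayed.
        if self.backend_send is None:
            self.pending.append(message)
        else:
            self.backend_send(message)


delivered = []
gateway = BufferingGateway()
gateway.from_client("a")              # backend not connected yet: queued
gateway.from_client("b")
gateway.backend_up(delivered.append)  # flushes "a" and "b"
gateway.from_client("c")              # forwarded immediately
```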
Or you can implement reconnection on the client side.
With HTTP, every time you navigate to a page, the browser creates a socket connection to the server, transmits all the data, and closes it (in most cases). After that, all website data is local until you navigate away.
With WebSockets the connection is continuous, and there is no reconnection per request. That's why you have to implement a simple mechanism: when the WebSocket gets a close event, the client tries to reconnect periodically.
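The periodic reconnect can be sketched as a retry loop with a growing delay. This is written in Python for illustration (a browser client would do the equivalent in its close handler). The function name, the parameters, and the choice of doubling the delay are all assumptions rather than anything prescribed above, and `connect` stands for whatever call opens the connection.

```python
import time


def reconnect_with_backoff(connect, max_attempts=5, base_delay=1.0,
                           max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, waiting longer after each failure."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Demo: a connect function that fails twice (server still restarting)
# and then succeeds. Sleeps are recorded instead of actually waiting.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionRefusedError("server not up yet")
    return "connected"


waits = []
result = reconnect_with_backoff(flaky_connect, sleep=waits.append)
```

Adding some random jitter to the delay is a common refinement, so that all clients do not reconnect at the same instant after a deploy.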
It is more based on your specific needs.

Is there a way to wait for a listening socket on win32?

I have a server and client program on the same machine. The server is part of an application- it can start and stop arbitrarily. When the server is up, I want the client to connect to the server's listening socket. There are win32 functions to wait on file system changes (ReadDirectoryChangesW) and registry changes (RegNotifyChangeKeyValue)- is there anything similar for network changes? I'd rather not have the client constantly polling.
There is no such Win32 API; however, this can easily be accomplished by using an event. The client waits for that event to be signaled, and the server signals the event when it starts up.
The related API functions you will need are CreateEvent, OpenEvent, SetEvent, ResetEvent and WaitForSingleObject.
If your server will run as a service, then for Vista and up it will run in session 0 isolation. That means you will need to use an event with a name prefixed with "Global\".
You probably do have a good reason for needing this, but before you implement this please consider:
Is there some reason you need to connect right away? I see this as a non-issue, because when you perform an action in the client, you can make a new server connection at that point.
Is the server starting and stopping more frequently than the client? You could switch the roles of who listens and who connects.
Consider using some form of Windows synchronization, such as semaphore. The client can wait on the synchronization primitive and the server can signal it when it starts up.
Personally I'd use a UDP broadcast from the server and have the "client" listening for it. The server could broadcast a UDP packet every X period whilst running and when the client gets one, if it's not already connected, it could connect.
This has the advantage that you can move the client onto a different machine without any issues (and since the main connection from client to server is sockets already it would be a pity to tie the client and server to the same machine simply because you selected a local IPC method for the initial bootstrap).
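A minimal sketch of the UDP announcement idea follows, using Python's standard `socket` module rather than the Win32 API directly. The payload `server-up` and the function names are made up. The demo sends to the loopback address so that it runs on a single machine; a real deployment would send to a broadcast address on an agreed port and repeat the announcement periodically.

```python
import socket

BEACON = b"server-up"  # hypothetical announcement payload


def open_listener(host="", port=0):
    """Client side: bind a UDP socket that waits for announcements."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


def send_beacon(address):
    """Server side: announce that the server is running."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # SO_BROADCAST is required when 'address' is a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(BEACON, address)


def wait_for_server(listener, timeout=None):
    """Block until an announcement arrives; return the sender or None."""
    listener.settimeout(timeout)
    try:
        data, sender = listener.recvfrom(1024)
    except socket.timeout:
        return None
    return sender if data == BEACON else None


# Demo on loopback with an OS-assigned port.
listener = open_listener("127.0.0.1", 0)
port = listener.getsockname()[1]
send_beacon(("127.0.0.1", port))
sender = wait_for_server(listener, timeout=2.0)
listener.close()
```

The client blocks in `recvfrom` instead of polling, and once an announcement arrives it can open the real connection to the sender's address.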