Dropping a streaming HTTP connection as soon as possible after losing connection - sockets

So what we're trying to achieve is maintaining a vast number of concurrent connections from mobile devices to our Erlang HTTP server. Mobile devices of course can have have pretty intermittent connections, so we're looking to drop dead connections as soon as possible to avoid their overhead.
Now, I'm not sure at what level we should be detecting dead connections. TCP has keepalive packets, which require an ACK. So ideally we'd send a keepalive packet ever 15 seconds, and if we didn't receive the ACK within the next 15 seconds then we'd drop the connection. However, I've no idea if this is even possible in Erlang. Also, I think there's the possibility that some NATs, wi-fi routers and mobile networks are ACKing the keepalives for a certain amount of time, correct me if I'm wrong. Is that the case, and if so is there any TCP-level alternative way of doing 'heartbeats'?
We've also tried an application-level heartbeat - sending a \n down the HTTP stream. However, even with all applicable Erlang options set, including send_timeout, we're not getting any error for about 5 minutes under certain circumstances, such as, say, the mobile device straying too far from its wi-fi router.
How best can we implement a streaming HTTP connection that the server will drop as soon as possible after losing contact? Any help'd be much appreciated!

You can add a specific watchdog for HTTP connection. Watchdog will have configurable timeout that will be reset after each operation (read or write) on connection. And if there were no operations on socket within specified timeout - connection is closed.
This approach will eliminate the problem of stale connections (connections perfectly healthy but without any I/O activity). And if clients is out of coverage - connection will last only up to specified timeout. Also no keep-alive mechanism is needed when using watchdog approach.
The only drawback is that server will not detect broken connections immediately but will instead wait timeout specified in connection watchdog.

Isac's comment answered it for me - configuring the socket keep alive timeout at the machine level.
See http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html

Related

Why is my TCP socket showing connected but not responding?

I have a program using a bi-directional TCP socket to send messages from the host PC to a VLinx ethernet-to-serial converter and then on to a PLC via RS-232. During heavy traffic the socket will intermittently stop communicating although all soft tests of the connection show that it is connected, active and writeable. I suspect that something is interrupting the connection causing the socket to close with out FIN/ACK. How can I test to see where this disconnect might be occuring?
The program itself is written in VB6 and uses Catalyst SocketTools/SocketWrench as opposed to the standard Winsock library. The methodology, properties and code seem to be sound since the same setup works reliably at two other sites. It's just this one site in particular where this problem occurs. It only happens during production when there is traffic on the network and can lose connection anywhere between 20 - 100 times per 10-hour day.
There are redundant tests in place to catch this loss of communication and keep the system running. We have tests on ACK messages, message queue size, time between transmissions (tokens on 2s interval), etc. Typically, the socket will not be unresponsive for more than 30 seconds before it is caught, closed and re-established which works properly >99% of the time.
Previously I had enabled the SocketTools logging capabilities which did not capture any relevant information. Most recently I have tried to have the system ping the VLinx on the first sign of a missed message (2.5 seconds). Those pings have always been successful, meaning that if there is a momentary loss of connection at a switch or AP it does not stay disconnected for long.
I do not have access to the network hardware aside from the PC and VLinx that we own. The facility's IT is also not inclined to help track these kinds of things down because they work on a project-based model.
Does anyone have any suggestions what I can do to try and determine where the problem is occurring so that I can then try to come up with a permanent solution to this issue rather than the band-aid of reconnecting multiple times per day?
A tool like Wireshark may be helpful in seeing what's going on at the network level. The logging facility in SocketTools/SocketWrench can only report what's going on at the API level, and it sounds like whatever the underlying problem is occurs at a lower level in the TCP stack.
If this is occurring after periods of relative inactivity, followed by a burst of activity, one thing you could try doing is enabling keep-alive and see if that makes any difference.

why many libraries does not detect dead TCP connections?

TCP has a keep-alive mechanism to detect dead connections, but it surprised me that this option is turned off by default and many libraries/tools do not utilize this feature.
If I am understanding correctly, a TCP connection blocked in a recv call won't be able to detect if a connection has been actually aborted by peer if all the FIN/RST packets from peer have been lost.
A timeout parameter on client side may alleviate the issue but many libraries does not have a option to set timeout either. One example is that the mysql-python connector does not have a recv timeout option. Another example is that a Nginx server talks to a gunicorn backend with proxy_pass, gunicorn workers may stop responding due to dead connections on it, but there is no way for gunicorn workers to detect it.
Could anyone can explain the reason or correct me if I am wrong?
The term "dead connection" is a bit ambiguous -- it could mean any of the following:
The peer program closed its socket (or the peer program exited or crashed, and the peer computer's OS closed the socket as part of its standard process-cleanup)
Connectivity to the peer computer has suddenly been lost (this could happen because the peer computer lost power, or somebody pulled out the Ethernet cord that was connecting the peer computer to the router, or the peer's ISP had a router failure, or your ISP had a router failure, or etc)
The peer program is still running but simply decided (for some reason, probably due to a bug) to stop calling recv() on his TCP socket anymore.
The packet-path between your program and the remote peer still exists, sort of, but something along that path is dropping so many packets that the effective transmission rate of the TCP connection has dropped to approximately zero.
So the first question to answer is, which of the above conditions will the TCP layer detect on its own?
Condition (1) is the easy case -- the peer's TCP stack will send you the FIN packets, and when your program's network stack receives them, it will know for sure that the TCP connection is closed and act accordingly, and therefore your recv() call will return 0 very quickly.
In condition (2), the answer is "sometimes" -- in particular, if your program has any TCP data in the socket's output buffer that it is trying to send to the peer, and it never gets any ACK packets back regarding that data, then after a certain number of timeouts (and subsequent packet-resend attempts), your computer's TCP stack will give up, declare the connection dead, and unilaterally close the TCP connection; at which point recv() will return 0. If there are no outgoing TCP data packets trying to be sent, on the other hand, then the local TCP stack won't be waiting for any ACKs to come back, and therefore it won't time out when it doesn't get them, and therefore it won't ever give up and close the TCP connection. In this scenario, your recv() call could well block indefinitely, because the TCP connection is idle and the TCP stack has no way of knowing that the peer is gone (as opposed to simply not sending any data right now). It is this scenario that the SO_KEEPALIVE option was meant to handle, but since the designers of the SO_KEEPALIVE option wanted to conserve bandwidth by default, and sending automatic keepalive packets uses up additional bandwidth, they decided to make the keepalive option disabled by default. Also, the default send-a-keepalive interval is often quite long by modern standards (e.g. hours) and on some OS's it is difficult to change except on a system-wide basis, which make SO_KEEPALIVE of limited usefulness for many applications.
For conditions (3) and (4), the TCP connection isn't really "dead", it's just that some device (either the peer program, or a piece of networking gear somewhere between your program and the peer) is being uncooperative. Since the TCP layer can't know what the applications that are using it are trying to achieve, it wisely doesn't try to second-guess them in this regard, and it leaves the TCP connection open unless you explicitly tell it to close() the connection.
So now that we've described the TCP layer's behavior, what about the applications and API's that use it? i.e. why don't they try to improve on the basic TCP-stack behavior by offering better detection? The answer is that some of them do; e.g. by periodically sending dummy "ping" messages across any socket that would otherwise be idle, simply to "stimulate" the TCP stack into detecting when no ACKs are coming back as described in the paragraph about condition (2), above. Some go even further and expect the remote peer to send a corresponding "pong" message to come back on the same socket within (so many) seconds, and if it doesn't, the program will unilaterally close the socket. This sort-of works, but it also makes assumptions about the performance of your network, and that can lead to false positives and therefore unwanted disconnections when the peer is connecting via a slow or unreliable network, which is why many applications/libraries don't implement this (or at least don't enable it by default).
It's not surprising to me that keep-alive is turned off by default.
Because it's always possible that the peer program can freeze due to a bug or error, etc. In this case recv also blocks forever even if the TCP connection is alive. So keep-alive may be not so useful after all (except to prevent router from dropping connection). Various reasons might cause your recv to block forever anyway.
Besides, a low-level underlying protocol for general purpose should probably be kept as simple as possible.
In addition, I'm not surprised by your examples about not being able to set timeout either. Look at the most popular software tools in this world. They are polished, evolved, optimized, and used for such a long time. Yet many of them still freeze, crash, or misbehave rather frequently. Writing correct code is meticulous work. Not to mention further requirements like security, cross-platform, backward compatibility. Programmer's life is not easy.

Do I need to `ping` connected websocket connections?

I would like to keep the Websocket connection alive for an undefined amount of time. The socket will ideally be sending data every so often but this is not assured, and I also would not like to make assumptions since a user can be in an idle state.
I have an object that stores references to all websocket connections. Would it be appropriate for me to schedule a function every x number of minutes? seconds? that basically iterates through all the connections, pings them and then discards those that haven't received pongs? Or do I need to enable a flag that automatically keeps the connection alive?
I am using the ws library on my server, but create websocket connections natively on the client.
There's no good way for you, on the client end of things, to know how many proxies, firewalls, NATs, etc occur in the network path from your client machine to the destination server. Any one of those could have its own separate idle timer. Using TCP keepalive may work, but only for the TCP session from your client to the next hop -- which may or may not actually be the end server.
Given the above, I would recommend that yes, you should ping your connected WebSocket sessions periodically. Whether you receive the pong from the server is, from the point of view of keeping your connections alive through that (possibly convoluted) chain of network middleboxes, irrelevant; you simply want to make sure that everything along the path sees some traffic flowing in order to reset their idle timers.
Obviously you want to trade off how often you ping your connected WebSocket sessions with how much overhead is incurred; pinging every 1 second would be a bit much, for example. You may need some fine-tuning to determine, experimentally, just what a good ping interval is for your needs.
Hope this helps!

Does listen() backlog affect established TCP connections?

Would it be naive to create a TCP socket with a listen backlog set to minimum as a way of rate limiting new incoming connections? The server workload in question doesn't expect many new connections at any time but spends a lot of time servicing long open persistent connections. It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text. Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
Specifically the platform in use is Linux, and although it may be handled differently in different OSs, I expect them to all behave roughly the same.
EDIT What I mean by the "same" is that backlog doesn't affect established connections, though I do understand Linux discards them while Windows sends a reset.
Does listen() backlog affect established TCP connections?
It affects established connections that the server hasn't accepted yet via accept(), only in the sense that it limits the number of such connections that can exist.
Would it be naive to create a TCP socket with a listen backlog set to minimum as a way of rate limiting new incoming connections?
All it would accomplish would be to unnecessarily fail some connecting clients. They won't get any service until your server gets around to it anyway, and once the backlog queue fills they are rate-limited by your service code anyway. There is no particular reason why shortening the queue would have any beneficial effect. The other problem with the idea is that it isn't readily possible to determine what the minimum actually is, or whether you succeeded in setting it as the backlog queue length.
It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text.
That is correct. There is no reason why it should affect them: that's why you won't find it written down anywhere, any more than the fact that the phase of the moon doesn't affect it either.
Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving
No.
or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
They're not dropped. They simply aren't even created if they won't fit on the backlog queue. Ergo their resource consumption at the server is zero.
Specifically the platform in use is Linux, and although it may be handled differently in different OSs, I expect them to all behave roughly the same.
They don't. On Windows, an incoming connection when the backlog queue is full causes an RST to be issued. On other platforms it is simply ignored.
What you describe are several types of attacks like flooding, syn attacks and other goodies resulting in denial of service.
This topic is not easy, because protection has to be implemented in all the layers, including TCP. For instance a SYN attack, fiddling with the sequence numbers, ... . At that point the packet in question already came a long way, through the ethernet layer and ip layer, bottom line it is taking resources. So if your system is under attack, the attacking packets are in your data stream just like the good ones are. The faster you can detect a packet is faulty and drop it, the better. Usually a system that is under attack will be slower. Well at least the systems that I have worked with.
Some attacks try to bring your system in a faulty state permanently, this by exploiting bugs. For instance TCP has a receive queue, if packets are constantly arriving out of order they will be stored in that receive queue. If the missing packet never arrives, then this receive queue could keep on growing and growing. Without the proper defense , this would lead to the system going completely out of resources.
There are specialised tools (codenumicon for instance) to check the vulnerability of a TCP stack implementation. You can assume that the one on linux has been properly tested using similar tools.
An attack can also occur on the application layer. If you have a TCP server and it allows only a limited amount of sessions. A malicious user can simply take all the connections simply by establishing all the connections and then not doing anything with it. So you have to create some defense as well. Weather or not you set this limit very low or high does not change a thing. A malicious user will try anything to bring your system down. You need to built in defense anyway. You can connect to a webserver (HTTP) simply using telnet. If you don't send anything the server's defense will come into play and close the connection.
So bringing the amount of possible connections to a low value and thinking that this in itself is a form of protection is indeed naive.
Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
They are using resources of your machine and will make your system run slower.
It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text.
If it is normal user trying to establish a connection, even if he is doing it continuously, retrying upon failure. The influence will be minimal, close to nothing. But a malicious user that is flooding connections attempts will have influence on the system performance, because the system has to spent time identifying those flawed packets and dropping them asap.

Best socket options for client and sever that continuously transfer data

I am using Java (although I think the socket options is implement in most languages) to implement a client and server. The server sends data to the client for processing which the client acknowledges. On another port the client then sends the results of the processing back to the server. When it comes to options such as
SO_LINGER
SO_KEEPALIVE
SO_NODELAY
SO_REUSEADDRESS
SO_SENDBUFFER
SO_RECBUFFER
TCP_NODELAY
We have noticed that the connection between the client and server occasionally breaks. There will be a timeout on the send or the receive. When this happens will kill the socket and open a new one to continue.
What would be the best options to set in terms of the above scenario and is there anything that we could do from our side (programmatically or options-wise) to try minimize the amount of times the connection is dropped. We are using normal TCP/IP.
UPDATE:
The bounty on this ends soon. I haven't had a satisfactory answer yet so it is still open. I think everyone is missing the point of the quest. What is the best practice with regards to the options above for sockets that continuously chat. I have already got a ping packet in that if there is no work to be done (hardly ever the scenario) the normal message is sent with no inner elements so there is always processing.
Strictly speaking, you don't need any of these socket options:
* SO_LINGER
You need to set SO_LINGER only if your application still has outstanding packets to send when close(2) or shutdown(2) has been called. Not really applicable for your application.
* SO_KEEPALIVE
Sending keepalive-pings every two hours would really only help very long-lived but -very- quiet connections going through stateful firewalls with very long session timeouts. (Two hours between pings is entirely too long to be practical in today's Internet.)
* SO_NODELAY
This (presumably an alias for TCP_NODELAY) disables Nagle's algorithm, which is just a small-packet-avoidance problem. Perhaps Nagle is getting in the way in your application, but it takes special sequences of packets to introduce 500ms delays into processing; it never just hangs connections.
* SO_REUSEADDRESS
Useful for all 'servers' that listen on well-known port numbers; use on 'clients' is almost always covering up some bug or other, but it is sometimes necessary if requests must come from a well-known port number.
* SO_SENDBUFFER
* SO_RECBUFFER
These buffer sizes influence the kernel-side buffer sizes maintained for receiving or sending data while your program (receive buffer) or the socket (send buffer) isn't yet ready to accept more data. If these are set too small, your application might not transfer data as smoothly as possible, reducing throughput, but it should not lead to any stalls if these are set smaller than optimal. Of course, too large may put unreasonable demands on kernel memory, but there should be a reasonable system-wide maximum allowed size.
* TCP_NODELAY
Disables Nagle. Not likely to do more than introduce 500ms delays if your application sends multiple small packets before attempting a blocking read.
Really, you shouldn't need to set any socket options.
Can you distill your code into something that could be pasted here and tested or inspected? I'm used to TCP sessions surviving for days or weeks without trouble, so this is pretty surprising.
First I think that this page is relevant, regarding half-open connections.
http://nitoprograms.blogspot.com/2009/05/detection-of-half-open-dropped.html
That being said, TCP is designed to hide connection problems, so you may often find yourself in cases where the connection is broken, but neither side thinks it is. You have addressed this partially by using timeouts and taking that as a sign the connection is broken.
Since you are writing the client and server, I would avoid relying on TCP to tell you when the connection is broken altogether. I would just have the server also acknowledge the receipt of the result from the client. Then both sides will expect immediate responses to their messages, and you can track which messages have been ack'd and set an appropriately small timeout for receiving the ack. This is not a timeout on the send or receive, but a timeout on the time between sending a message and receiving the ack for that message. Then you can set the timeout appropriately depending on the quality of your connection (e.g. very small if you are running on loopback, but large if running over wireless with a weak signal).
Regarding the options you list, you will want to use SO_REUSEADDRESS so that you won't be prevented from reopening the socket, for example if it hasn't finished closing from a previously killed process.
You probably have, but it is best to check the obvious....
Have you verified that it IS the socket that is timing out, and not your code? Sockets are fairly stable, and while there might be an issue somewhere, it seems more likely that it is in your code. I would use logs, timestamps, and synchronised clocks to be sure.
There may be an issue that you genuinely DO take a long time to do the calculation, so maybe adding a 'I'm still thinking about it' message to your protocol that gets sent regularly, to keep the connection alive?
Of course networks will drop out from time to time regardless of what you do, and it sounds like you are already handling that case nicely.
try these options
SO_LINGER - for specyfying when the Socket close s called while some unsent data in the queue
TCP_NODELAY - For non blocking datat transfer
I would strongly encourage you to use a ping/echo model between client and server, so that if no data is sent for x seconds a ping message needs to be send. A typical reason for a break might be a firewall, which shuts down socketss because of inactivity.
The typical issue where the TCP model fails are physical problems e.g. a pulled/broken cable and hangs on one side, where technically someone is listening until a queue overrun kicks in (which might never happen given your amount of data).
What are the chances the connection is going through a NAT firewall somewhere along the way? Stateful firewalls maintain a table of open connections so that packets belonging to an allowed connection can quickly pass through the system, without forcing firewall admins to write overly-complex rule sets.
The downside is that this table can grow immensely large, so it must be pruned as connections are closed or as they appear to have simply grown stale and died quietly. A connection that has gone silent for 20 minutes is usually quiet enough to reaped. (Which is really very quick, as the TCP KEEPALIVE is typically two hours, making it nearly useless in the face of NAT firewalls.)
So: is this going through a NAT firewall? Is the connection quiet for long stretches? If so, add a ping/pong to your protocol, and fire it every few minutes.