I've read that NAT routers "assume a connection has been terminated if no data has been sent for a certain time period."
I've also read that TCP keepalive packets usually shouldn't contain any data.
So my questions are:
Are the above statements true?
Do NAT routers consider empty TCP keepalive packets when reordering/cleaning their tables?
I'm asking this because I need a reliable connection between two endpoints where both of them have to be able to detect and react to connection problems. I know that I might just implement a keepalive mechanism myself but I want to know whether the TCP implementation could be used for that.
I do believe the second statement refers to payload (The shortest possible TCP/IP packet is 40 bytes long - 20 bytes for TCP header + 20 bytes for IPv4 header).
Regarding the first, here's a quote from RFC 2663:
End of session for TCP, UDP and others
The end of a TCP session is detected when FIN is acknowledged by
both halves of the session or when either half receives a segment with
the RST bit in TCP flags field. However, because it is impossible for
a NAT device to know whether the packets it sees will actually be
delivered to the destination [...] the NAT device cannot safely assume
that the segments containing FINs or SYNs will be the last packets of
the session [...] Consequently, a session can be assumed to have been
terminated only after a period of 4 minutes subsequent to this
detection. The need for this extended wait period is described in RFC
793 [Ref 7], which suggests a TIME-WAIT duration of 2 * MSL (Maximum
Segment Lifetime) or 4 minutes.
Reference: https://www.rfc-editor.org/rfc/rfc2663
To my understanding, any packets that identifies a session would reset the TTL counter - but that depends heavily on implementation, since 'data' can be understood as 'packet' (minimum 40 bytes) or 'packet payload'. Nonetheless, #CodeCaster is spot-on; never assume that a connection is alive, make sure it is before sending (and, if possible and depending on criticality, acknowledge receipt.)
Related
I have a C/C++ application set up as follows:
A non-blocking TCP server socket on a linux platform
A thread which writes a small packet (less than 20 bytes) to the socket at 1 Hz
The socket is configured with keepalive enabled and with: keepidle=5, keepintvl=5 and keepcnt=3
My intention is that the keepalive mechanism should detect a physical disconnection of the network link. However, when the link is cut, I do not see the zero-length packets which should be generated by the keepalive mechanism (I am using tcpdump to monitor traffic). My impression is that what happens is this: after the cable disconnection, the application keeps making send requests and the fact that there are pending send requests prevents the keepalive mechanism from being activated. Is this explanation valid?
In order to check my explanation, I have modified my test as follows:
A non-blocking TCP server socket on a linux platform
A cyclical thread which writes a small packet (about 100 bytes) to the socket with a period of 30 seconds
The socket is configured with keepalive enabled and with: keepidle=5, keepintvl=5 and keepcnt=2
In this case, if I cut the connection, the keepalive mechanism triggers within about 15-20 seconds (which is what I would expect).
On a related point, I would like to understand the exact semantics of tcp_keepidle. This is defined as: "The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes". What exactly does 'idle' means in this context? Does it simply mean that nothing is received and nothing is put on the network; or does it mean that nothing is received and no send requests are made to the socket?
Say Server S have a successful TCP connection with Client C.
C is keep sending 256-byte-long packets to S.
Is it possible that one of packets only receive part of it, but the connection does not break (Can continue receive new packets correctly)?
I thought TCP protocol itself will guarantee not lose any bytes while connecting. But seems not?
P.S. I'm using Python's socketserver library.
The TCP protocol does guarantee delivery. Thus (assuming there are no bugs in your code and in the TCP stack), the scenario you describe is impossible.
Do bear in mind that TCP is stream- rather than packet-oriented. This means that you may need to call recv() multiple times to read the entire 256-byte packet.
As #NPE said, TCP is a stream oriented protocol, that means that there is no guarantee on how much data bytes are sent in each TCP packet nor how many bytes are available for reading in the receiving socket. What TCP ensures is that the receiving socket will be provided with the data bytes in the same order that they were sent.
Consider a communication through a TCP connection socket between two hosts A and B.
When the application in A requests to send 256 bytes, for example, the A's TCP stack can send them in one, or several individual packets, or even wait before sending them. So, B may receive one or several packets with all or part of the bytes requested to be sent by A, and so, when the application in B is notified of the availability of received bytes, it's not sure that it could read at once the 256 bytes.
The only guaranteed thing is that the bytes B reads are in the same order that A sent them.
If client socket sends:
Packet A - dropped
Packet B
Packet C
Will server socket receive and queue B and C and then when A is received B and C will be passed to the server application immediately? Or B and C will be resent too? Or no packets will be sent at all until A is delivered?
TCP is a sophisticated protocol that changes many parameters depending on the current network state, there are whole books written about the subject. The clearest way to answer your question is to say that TCP generally maintains a given send 'window' size in bytes. This is the amount of data that will be sent until previously sent acknowledgments are successfully returned.
In older TCP specifications a dropped packet within that window would result in a complete resend of data from the dropped packet onwards. To solve this problem as it's obviously a little wasteful, TCP now employs a selective acknowledgment (SACK) option (RFC 2018). This would result in just the lost/corrupted packet being resent.
Back to your example, assuming the window size is large enough to encompass all three packets, and providing you are taking advantage of the latest TCP standard (don't see why you wouldn't), if packet A were dropped only packet A would be resent. If all packets are individually larger than the window then the packets must be sent and acknowledged sequentially.
It depends on the latencies. In general, first A is resent. If the client gets it and already has B and C, it can acknowledge them as well.
If this happens fast enough, B and C won't be resent, or maybe only B.
I would be greatful for help, understanding how long it takes to establish a TCP connection when I have the Ping RoundTripTip:
According to Wikipedia a TCP Connection will be established in three steps:
1.SYN-SENT (=>CLIENT TO SERVER)
2.SYN/ACK-RECEIVED (=>SERVER TO CLIENT)
3.ACK-SENT (=>CLIENT TO SERVER)
My Questions:
Is it correct, that the third transmission (ACK-SENT) will not yet carry any payload (my data) but is only used for the connection establishement.(This leads to the conclusion, that the fourth packt will be the first packt to hold any payload....)
Is it correct to assume, that when my Ping RoundTripTime is 20 milliseconds, that in the example given above, the TCP Connection establishment would at least require 30 millisecons, before any data can be transmitted between the Client and Server?
Thank you very much
Tom
Those things are basically correct, though #2 assumes that the round-trip time is symmetric.
To measure this, called the "Time to Syn/ACK" (which is NOT the Time to Establish a connection - the connection is only half-open when in that state, you need the 3rd packet acknowledging the establishment to consider it established), you usually need professional tools that include their own TCP stack, enabling that kind of measurement. The most used one is called the Spirent Avalanche, but you also have Ixia's IxLoad or BreakingPoint Systems box (BPS has now been acquired by Ixia btw).
Note that, yes, 3rd packet won't have any data, and that is also true of the first two. They are only Syn and Syn+Ack flagged (those are TCP flags), and contain no application data. This initial exchange, called the Three-way Handshake therefore causes some overhead, which is why TCP is typically not used in real-time applications (voice, live video, etc..).
Also as stated, you can't assume that Latency = RTT/2. It is in fact very complicated to measure one-way latency above layer 3 (IP) - and you are already at layer 4 (TCP) here. This blog post covers in details the challenge of this: http://synsynack.wordpress.com/2012/04/09/realistic-latency-measurement-in-the-application-layers/
One of our customers is having trouble submitting data from our application (on their PC) to a server (different geographical location). When sending packets under 1100 bytes everything works fine, but above this we see TCP retransmitting the packet every few seconds and getting no response. The packets we are using for testing are about 1400 bytes (but less than 1472). I can send an ICMP ping to www.google.com that is 1472 bytes and get a response (so it's not their router/first few hops).
I found that our application sets the DF flag for these packets, and I believe a router along the way to the server has an MTU less than/equal to 1100 and dropping the packet.
This affects 1 client in 5000, but since everybody's routes will be different this is expected.
The data is a SOAP envelope and we expect a SOAP response back. I can't justify WHY we do it, the code to do this was written by a previous developer.
So... Are there any benefits OR justification to setting the DF flag on TCP packets for application data?
I can think of reasons it is needed for network diagnostics applications but not in our situation (we want the data to get to the endpoint, fragmented or not). One of our sysadmins said that it might have something to do with us using SSL, but as far as I know SSL is like a stream and regardless of fragmentation, as long as the stream is rebuilt at the end, there's no problem.
If there's no good justification I will be changing the behaviour of our application.
Thanks in advance.
The DF flag is typically set on IP packets carrying TCP segments.
This is because a TCP connection can dynamically change its segment size to match the path MTU, and better overall performance is achieved when the TCP segments are each carried in one IP packet.
So TCP packets have the DF flag set, which should cause an ICMP Fragmentation Needed packet to be returned if an intermediate router has to discard a packet because it's too large. The sending TCP will then reduce its estimate of the connection's Path MTU (Maximum Transmission Unit) and re-send in smaller segments. If DF wasn't set, the sending TCP would never know that it was sending segments that are too large. This process is called PMTU-D ("Path MTU Discovery").
If the ICMP Fragmentation Needed packets aren't getting through, then you're dealing with a broken network. Ideally the first step would be to identify the misconfigured device and have it corrected; however, if that doesn't work out then you add a configuration knob to your application that tells it to set the TCP_MAXSEG socket option with setsockopt(). (A typical example of a misconfigured device is a router or firewall that's been configured by an inexperienced network administrator to drop all ICMP, not realising that Fragmentation Needed packets are required by TCP PMTU-D).
The operation of Path-MTU discovery is described in RFC 1191, https://www.rfc-editor.org/rfc/rfc1191.
It is better for TCP to discover the Path-MTU than to have every packet over a certain size fragmented into two pieces (typically one large and one small).
Apparently, some protocols like NFS benefit from avoiding fragmentation (link text). However, you're right in that you typically shouldn't be requesting DF unless you really require it.