Sending data immediately after accept. Data loss possibility - sockets

I have read the following on MSDN about the accept function:
https://msdn.microsoft.com/pl-pl/library/windows/desktop/ms737526(v=vs.85).aspx
When using the accept function, realize that the function may return
before connection establishment has traversed the entire distance
between sender and receiver. This is because the accept function
returns as soon as it receives a CONNECT ACK message; in ATM, a
CONNECT ACK message is returned by the next switch in the path as soon
as a CONNECT message is processed (rather than the CONNECT ACK being
sent by the end node to which the connection is ultimately
established). As such, applications should realize that if data is
sent immediately following receipt of a CONNECT ACK message, data loss
is possible, since the connection may not have been established all
the way between sender and receiver.
Could someone explain this in more detail? What does it have to do with SYN and SYN-ACK? What's the problem here? When can such data loss happen, and how can it be prevented?

You're omitting an important paragraph on that page, right before your quote:
The following are important issues associated with connection setup,
and must be considered when using Asynchronous Transfer Mode (ATM)
with Windows Sockets 2
That is, it is only applicable when you use things like AF_ATM and SOCKADDR_ATM. It is not relevant for TCP, which you seem to imply with:
What does it have to do with SYN and SYN-ACK

error/timeout detection from socket send() call

I am troubleshooting a socket connection issue where a peer irregularly gets WSAETIMEDOUT (10060) from socket send(), and I would like to understand the details at the actual TCP level.
The actual implementation uses a blocking Winsock socket and has the following call pattern:
auto result = ::send(...);
if (result == SOCKET_ERROR)
{
auto err = ::WSAGetLastError();
// err can be WSAETIMEDOUT
}
As far as I understand, send returns immediately once the outgoing data has been copied to the kernel buffer [as asked in another SO question].
On the other hand, I assume that the error WSAETIMEDOUT should be caused by a missing TCP ACK from the receiving side. Right? (see Steffen Ullrich's answer)
What I am not sure about is whether such a WSAETIMEDOUT only happens when the option SO_SNDTIMEO is set.
The default value of SO_SNDTIMEO is 0, meaning never time out. Does that mean an unsuccessful send would block forever, or is there a built-in/hard-coded timeout on Windows for such a case?
And how does TCP retransmission come into play?
I assume that an unacknowledged packet triggers retransmission. But what happens if all retransmission attempts fail? Does the socket connection just stall, or is WSAETIMEDOUT raised (independent of SO_SNDTIMEO)?
My assumption about my connection issue is this:
The current send operation returns SOCKET_ERROR with error code WSAETIMEDOUT because the outgoing data cannot be copied to the kernel buffer, which is still occupied with old outgoing data that is either lost or cannot be ACKed by the peer in time. Is my understanding right?
Possible causes may be: an intermediate router drops packets, the intermediate network gets disconnected, or the peer has a problem receiving. What else?
What can go wrong on the receiving side?
Maybe the peer application hangs and stops reading data from the socket buffer. The receive buffer (on the receiver's side) fills up and blocks the sender from sending more data.
Thank you for clarifying all my questions.
On the other hand, I assume that the error WSAETIMEDOUT should be caused by a missing TCP ACK from the receiving side. Right?
No. send does not provide any information about whether the data were acknowledged by the other side. A timeout simply means that the data could not be stored in the local socket buffer in time, because the buffer was already full the whole time. The socket buffer stays full if data cannot be delivered to the other side, i.e. if the recipient does not read the data, or does not read it fast enough.
But what happens if all retransmission attempts fail? Does the socket connection just stall?
TCP sockets will not try to retransmit data forever, but give up after some time, treat the connection as dead, and consider the associated socket closed. This error is propagated to the application within send. Thus, in this case, send might return WSAETIMEDOUT (or ETIMEDOUT on UNIX systems) due to the retransmission timeout even before the send timeout of the socket (SO_SNDTIMEO) has expired.
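For reference, a minimal sketch (not from the question) of how a send timeout could be configured on a blocking Winsock socket; the helper name, the 5-second value and the already-connected socket s are illustrative assumptions:

#include <winsock2.h>

// Minimal sketch, assuming `s` is an already-connected blocking TCP socket.
// On Windows SO_SNDTIMEO is a DWORD in milliseconds; 0 (the default) means never time out.
int sendWithTimeout(SOCKET s, const char* data, int length)
{
    DWORD sendTimeoutMs = 5000; // illustrative value
    ::setsockopt(s, SOL_SOCKET, SO_SNDTIMEO,
                 reinterpret_cast<const char*>(&sendTimeoutMs), sizeof(sendTimeoutMs));

    // send() returns as soon as the data has been copied into the kernel send buffer.
    // It only blocks (and can then time out) when that buffer is full, i.e. when the
    // peer is not acknowledging/draining previously sent data.
    int result = ::send(s, data, length, 0);
    if (result == SOCKET_ERROR && ::WSAGetLastError() == WSAETIMEDOUT)
    {
        // the send buffer stayed full for the whole timeout period
    }
    return result;
}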

What is the difference between Nagle algorithm and 'stop and wait'?

I saw the socket option TCP_NODELAY, which is used to turn the Nagle algorithm on or off.
I checked what the Nagle algorithm is, and it seems similar to 'stop and wait'.
Can someone give me a clear difference between these two concepts?
In a stop and wait protocol, one
1. sends a message to the peer
2. waits for an ack for that message
3. sends the next message
(i.e. one cannot send a new message until the previous one has been acknowledged)
Nagle's algorithm as used in TCP is orthogonal to this concept. When the application sends some data over TCP, the protocol buffers the data and waits a little while to see if there is more data to be sent, instead of sending it to the peer immediately.
If the application has more data to send in this small timeframe, the protocol stack merges that data into the current buffer and can send it as one large message.
This concept could very well be applied to a stop and wait protocol as well. (Note that TCP is not a stop and wait protocol.)
The Nagle Algorithm is used to control whether the socket provider sends outgoing data immediately as-is at the cost of less efficient network transmissions (off), or if it buffers outgoing data so it can make more efficient network transmissions at the cost of speed (on).
Stop and Wait is a mechanism used to ensure the integrity of transmitted data, by making the sender send a frame of data and then wait for an acknowledgement from the receiver before sending another frame, thus ensuring frames are received in the same order in which they are sent.
These two features operate independently of each other.
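For reference, a minimal sketch of toggling Nagle's algorithm on a connected Winsock socket via TCP_NODELAY; the helper name and the socket s are illustrative assumptions:

#include <winsock2.h>
#include <ws2tcpip.h>

// Minimal sketch, assuming `s` is an already-connected TCP socket.
// TCP_NODELAY = TRUE turns Nagle's algorithm off (small writes go out immediately);
// TCP_NODELAY = FALSE leaves it on (small writes may be coalesced into fewer segments).
void setNoDelay(SOCKET s, bool noDelay)
{
    BOOL flag = noDelay ? TRUE : FALSE;
    ::setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                 reinterpret_cast<const char*>(&flag), sizeof(flag));
}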

Does TCP ensure packets are received in the sequence the server sends them?

I'm working on a game server that communicates with a game client, and I wonder whether the packets the server sends to the client remain in sequence when the client receives them.
For example, the server sends packets A, B, C;
could the client receive them as B, A, C?
I have read the great blog http://packetlife.net/blog/2010/jun/7/understanding-tcp-sequence-acknowledgment-numbers/
It seems that every packet sent by the server has a corresponding ACK from the client, but it does not say why the packets received by the client are in the same sequence as sent by the server.
It's worth reading TCP's RFC, particularly section 1.5 (Operation), which explains the process. In part, it says:
The TCP must recover from data that is damaged, lost, duplicated, or delivered out of order by the internet communication system. This is achieved by assigning a sequence number to each octet transmitted, and requiring a positive acknowledgment (ACK) from the receiving TCP. If the ACK is not received within a timeout interval, the data is retransmitted. At the receiver, the sequence numbers are used to correctly order segments that may be received out of order and to eliminate duplicates. Damage is handled by adding a checksum to each segment transmitted, checking it at the receiver, and discarding damaged segments.
I don't see where it's ever made explicit, but since the acknowledgement (as described in section 2.6) describes the next expected packet, the receiving TCP implementation is only ever acknowledging consecutive sequences of packets from the beginning. That is, if you never receive the first packet, you never send an acknowledgement, even if you've received all other packets in the message; if you've received 1, 2, 3, 5, and 6, you only acknowledge 1-3.
For completeness, I'd also direct your attention to section 2.6, again, after it describes the above-quoted section in more detail:
An acknowledgment by TCP does not guarantee that the data has been delivered to the end user, but only that the receiving TCP has taken the responsibility to do so.
So, TCP ensures the order of packets, unless the application doesn't receive them. That exception probably wouldn't be common, except for cases where the application is unavailable, but it does mean that an application shouldn't assume that a successful send is equivalent to a successful reception. It probably is, for a variety of reasons, but it's explicitly outside of the protocol's scope.
TCP guarantees sequence and integrity of the byte stream. You will not receive data out of sequence. From RFC 793:
Reliable Communication: A stream of data sent on a TCP connection is delivered reliably and in
order at the destination.

Half-Established TCP Connections

Half-Established Connections
With a half-established connection I mean a connection for which the client's call to connect() returned successfully, but the server's call to accept() didn't. This can happen the following way: The client calls connect(), resulting in a SYN packet to the server. The server goes into state SYN-RECEIVED and sends a SYN-ACK packet to the client. This causes the client to reply with ACK, go into state ESTABLISHED and return from the connect() call. If the final ACK is lost (or ignored, due to a full accept queue at the server, which is probably the more likely scenario), the server is still in state SYN-RECEIVED and the accept() does not return. Due to timeouts associated with the SYN-RECEIVED state, the SYN-ACK will be resent, allowing the client to resend the ACK. If the server is able to process the ACK eventually, it will go into state ESTABLISHED as well. Otherwise it will eventually reset the connection (i.e. send a RST to the client).
You can create this scenario by starting lots of connections on a single listen socket (if you do not adjust the backlog and tcp_max_syn_backlog). See this question and this article for more details.
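For illustration only, a rough sketch of how this might be provoked with plain POSIX sockets; the port, the backlog of 1, the loop count and the exact kernel behaviour (which depends on settings such as tcp_syncookies and tcp_abort_on_overflow) are assumptions, not taken from the experiments below:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Minimal sketch (error handling omitted): the listener never calls accept() and
// uses a tiny backlog, so once the accept queue is full the server side may drop
// or ignore the final ACK while connect() still succeeds on the client side.
int main()
{
    int listener = ::socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5555);                    // illustrative port
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    ::bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    ::listen(listener, 1);                          // deliberately tiny accept queue
    // ... deliberately never call accept() ...

    // Open many client connections: the first few fill the accept queue, the rest
    // can end up half-established (connect() succeeds, accept() never returns them).
    for (int i = 0; i < 100; ++i)
    {
        int c = ::socket(AF_INET, SOCK_STREAM, 0);
        ::connect(c, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
        // keep `c` open; closing it would tear the connection down
    }
    ::pause();
}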
Experiments
I performed several experiments (with variations of this code) and observed some behaviour I cannot explain. All experiments were performed using Erlang's gen_tcp and a current Linux, but I strongly suspect that the answers are not specific to this setup, so I tried to keep it more general here.
connect() -> wait -> send() -> receive()
My starting point was to establish a connection from the client, wait between 1 and 5 seconds, send a "Ping" message to the server and wait for the reply. With this setup I observed that the receive() failed with the error closed when I had a half-established connection. There was never an error during the send() on a half-established connection. You can find a more detailed description of this setup here.
connect() -> long wait -> send()
To see if I could get errors while sending data on a half-established connection, I waited for 4 minutes before sending data. The 4 minutes should cover all timeouts and retries associated with the half-established connection. Sending data was still possible, i.e. send() returned without error.
connect() -> receive()
Next I tested what happens if I only call receive() with a very long timeout (5 minutes). My expectation was to get a closed error for the half-established connections, as in the original experiments. Alas, nothing happened, no error was thrown and the receive eventually timed out.
My questions
Is there a common name for what I call a half-established connection?
Why is the send() on a half-established connection successful?
Why does a receive() only fail if I send data first?
Any help, especially links to detailed explanations, is welcome.
From the client's point of view, the session is fully established, it sent SYN, got back SYN/ACK and sent ACK. It is only on the server side that you have a half-established state. (Even if it gets a repeated SYN/ACK from the server, it will just re-ACK because it's in the established state.)
The send on this session works fine because as far as the client is concerned, the session is established. The sent data does not have to be acknowledged by the far side in order to succeed (the send system call is finished when the data is copied into kernel buffers) but see below.
I believe here that the send actually generates an error on the connection (probably a RST) because the receiving system cannot ACK data on a session it has not finished establishing. My guess is that any system call referencing the socket on the client side that happens after the send plus a short delay (i.e. when the RST has had a chance to come back) will result in an error.
The receive by itself never causes an error because the client side doesn't need to do anything (I mean TCP protocol-wise) for a receive; it's just idly waiting. But once you send some data, you've forced the server side's hand: it either has completed the session establishment (in which case it can accept the data) or it must send a reset (my guess here that it can't "hold" undelivered data on a session that isn't fully established).
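A minimal sketch of how this would look from the client's side with plain POSIX sockets; the helper name, the 1-second delay and the ECONNRESET outcome are assumptions based on the RST guess above:

#include <cerrno>
#include <sys/socket.h>
#include <unistd.h>

// Minimal sketch (error handling trimmed): the send() itself succeeds because the
// data only has to reach the local kernel buffer. If the server answers with a RST,
// the failure only surfaces on a *later* call on the same socket.
void probeHalfEstablished(int sock)
{
    const char ping[] = "Ping";
    ::send(sock, ping, sizeof(ping), 0);   // returns success: data is in the kernel buffer

    ::sleep(1);                            // give a possible RST time to arrive

    char buf[128];
    ssize_t n = ::recv(sock, buf, sizeof(buf), 0);
    if (n < 0 && errno == ECONNRESET)
    {
        // the connection was reset after the send; this matches the "closed" error
        // observed in the experiments above
    }
}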

How synchronized are sockets, if at all?

I have already read this question about socket synchronization, but I still don't get it.
Recently I was working on a relatively simple client/server app where the communication happens over a TCP socket. The client is written in PHP using the C-like functions (especially fsockopen and fgetc) PHP provides to interact with sockets; the server is written in node.js using a Stream for outputting data.
The protocol is quite simple, the message is just a string which ends with a 0-byte character.
Basically it works like this:
SERVER: Message 1
CLIENT: Ack 1
SERVER: Message 2
CLIENT: Ack 2
....
This really worked fine, as my client processed one message at a time by reading char by char from the socket until a 0-byte was encountered, which designates the end of the message. Then the client writes back to the server that it has successfully received the message (that's the Ack <message id> part).
Now this happened:
SERVER: Message 1
CLIENT: Ack 1
SERVER: Message 2
CLIENT: Ack 2
SERVER: Message 3
Message 4
Message 5
Message 6
CLIENT: <DOH!>
....
This means the server unexpectedly sent multiple messages in one "batch" to the client, although every message is a single stream.write(...) operation on the server. It seemed like the messages were buffered somewhere and then sent to the client at once. My client code couldn't cope with multiple messages in the socket WITHOUT an Ack response in between, so it cut off the remaining messages after id 3.
So my question is:
How synchronized are sockets in their reads and writes? From the question above I understand that a socket is basically two uni-directional pipes, which means they are not synchronized at all?
How can it happen that some messages were sent to my client in a simple "one message-one ack" manner and then suddenly multiple messages are written to the stream?
Does it actually change the picture if the socket is opened in a blocking/non-blocking manner?
I tested this on a Ubuntu VM (so no load or anything that could provoke strange behaviour) using PHP 5.4 and node 0.6.x.
TCP is an abstraction of a bi-directional stream, and as such has no concept of messages and cannot preserve message boundaries. There is no guarantee how multiple send() or recv() calls will map to TCP packets. You should treat send() as if calling it multiple times is equivalent to calling it once with the concatenation of all the data. More importantly, when receiving, you should make sure that your code interprets the incoming data exactly the same way, no matter how it was split over individual recv() calls.
To receive properly, you can use a buffer where you store incomplete messages. But be careful that when you have an incomplete message in a buffer, the next recv() call may complete the current message, as well as provide zero or more complete messages, and possibly part of another incomplete message.
The blocking or non-blocking mode doesn't change anything here - it's only about the way your application interfaces with the OS.
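For illustration, a minimal sketch of such a receive buffer for the 0-byte-delimited protocol from the question; this is C++ rather than the PHP/node.js used there, and the function name and chunk size are assumptions:

#include <string>
#include <vector>
#include <sys/socket.h>

// Minimal sketch: accumulate whatever recv() returns and split out complete
// messages on the 0-byte delimiter. A single recv() may complete the current
// message, deliver several more, and leave a partial one behind in `pending`.
std::vector<std::string> readMessages(int sock, std::string& pending)
{
    std::vector<std::string> complete;
    char chunk[4096];
    ssize_t n = ::recv(sock, chunk, sizeof(chunk), 0);
    if (n <= 0)
        return complete;                            // error or connection closed

    pending.append(chunk, static_cast<size_t>(n));

    size_t pos;
    while ((pos = pending.find('\0')) != std::string::npos)
    {
        complete.push_back(pending.substr(0, pos)); // one full message, delimiter stripped
        pending.erase(0, pos + 1);                  // keep the (possibly partial) remainder
    }
    return complete;                                // leftover bytes stay in `pending`
}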
There are two synchronization concepts to deal with:
The (generally) synchronous operation of send() or recv().
The asynchronous way that one process sends a message and the way the other process handles the message.
If you can, try to avoid a design that keeps a client and server in process-synchronized "lock step" with each other. That's asking for trouble. What if one of the processes closes unexpectedly? The other process/thread might hang on a recv() that will never come. It's one thing for your design to expect each message to be acknowledged eventually, but it's quite another for your design to expect that only one message can be sent, then it must be acknowledged, before you may send another.
Consider this:
Server: send 1
Client: ack 1
Server: send 2
Server: send 3
Client: ack 2
Server: send 4
Client: ack 3
Client: ack 4
A design that can accommodate this situation is better than one that expects:
Server: send 1
Client: ack 1
Server: send 2
Client: ack 2
Server: send 3
Client: ack 3
Server: send 4
Client: ack 4