TCP/IP using Ada Sockets: How to correctly finish a packet? [duplicate] - sockets

I'm attempting to implement the Remote Frame Buffer protocol using Ada's Sockets library and I'm having trouble controlling the length of the packets that I'm sending.
I'm following the RFC 6143 specification (https://tools.ietf.org/pdf/rfc6143.pdf), see comments in the code for section numbers...
-- Section 7.1.1
String'Write (Comms, Protocol_Version);
Put_Line ("Server version: '"
& Protocol_Version (1 .. 11) & "'");
String'Read (Comms, Client_Version);
Put_Line ("Client version: '"
& Client_Version (1 .. 11) & "'");
-- Section 7.1.2
-- Server sends security types
U8'Write (Comms, Number_Of_Security_Types);
U8'Write (Comms, Security_Type_None);
-- client replies by selecting a security type
U8'Read (Comms, Client_Requested_Security_Type);
Put_Line ("Client requested security type: "
& Client_Requested_Security_Type'Image);
-- Section 7.1.3
U32'Write (Comms, Byte_Reverse (Security_Result));
-- Section 7.3.1
U8'Read (Comms, Client_Requested_Shared_Flag);
Put_Line ("Client requested shared flag: "
& Client_Requested_Shared_Flag'Image);
Server_Init'Write (Comms, Server_Init_Rec);
The problem seems to be (according to Wireshark) that my calls to the various 'Write procedures are causing bytes to queue up on the socket without getting sent.
Consequently two or more packets' worth of data are being sent as one, causing malformed packets. Sections 7.1.2 and 7.1.3 are being sent consecutively in one packet instead of being broken into two.
I had wrongly assumed that 'Reading from the socket would cause the outgoing data to be flushed out, but that does not appear to be the case.
How do I tell Ada's Sockets library "this packet is finished, send it right now"?

To emphasize user207421's comment (https://stackoverflow.com/users/207421/user207421):
I'm not a protocols guru, but in my experience the use of TCP (see RFC 793) is often misunderstood.
Quoting the question:
The problem seems to be (according to Wireshark) that my calls to the various 'Write procedures are causing bytes to queue up on the socket without getting sent.
Consequently two or more packets' worth of data are being sent as one, causing malformed packets. Sections 7.1.2 and 7.1.3 are being sent consecutively in one packet instead of being broken into two.
In short, TCP is not message-oriented.
With TCP, writing to a socket only appends data to the TCP byte stream. The stack is free to send that data in one segment or in several, so if you implement a message-oriented protocol on top of TCP, the receiver may need to reconstruct the messages itself. Usually, a special end-of-message sequence of characters is appended to each message.
RFC 793 describes the model as follows:
Processes transmit data by calling on the TCP and passing buffers of data as arguments. The TCP packages the data from these buffers into segments and calls on the internet module to transmit each segment to the destination TCP. The receiving TCP places the data from a segment into the receiving user's buffer and notifies the receiving user. The TCPs include control information in the segments which they use to ensure reliable ordered data transmission.
See also https://stackoverflow.com/a/11237634/7237062, quoting:
TCP is a stream-oriented connection, not message-oriented. It has no
concept of a message. When you write out your serialized string, it
only sees a meaningless sequence of bytes. TCP is free to break up
that stream up into multiple fragments and they will be received at
the client in those fragment-sized chunks. It is up to you to
reconstruct the entire message on the other end.
In your scenario, one would typically send a message length prefix.
This way, the client first reads the length prefix so it can then know
how large the incoming message is supposed to be.
or TCP Connection Seems to Receive Incomplete Data, quoting:
The recv function can receive as little as 1 byte, you may have to call it multiple times to get your entire payload. Because of this, you need to know how much data you're expecting. Although you can signal completion by closing the connection, that's not really a good idea.
Update:
I should also mention that the send function has the same conventions as recv: you have to call it in a loop because you cannot assume that it will send all your data. While it might always work in your development environment, that's the kind of assumption that will bite you later.
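The "call recv in a loop" advice quoted above can be sketched in Python as follows. This is only an illustration, not the Ada code from the question: recv_exact is a hypothetical helper name, and the 12-byte length is the size of the RFB ProtocolVersion message from RFC 6143, used here as a convenient example.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # an empty result means the peer closed the connection
            raise ConnectionError(f"peer closed with {remaining} bytes still expected")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Local demonstration: the sender writes in two pieces, the receiver still
# gets one logical 12-byte message regardless of how the bytes were split.
a, b = socket.socketpair()
a.sendall(b"RFB 003")
a.sendall(b".008\n")
version = recv_exact(b, 12)
a.close()
b.close()
```

The receiver never depends on how the sender's writes were grouped into segments; it only depends on knowing how many bytes the next message has.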

Related

Bidirectional communication of Unix sockets

I'm trying to create a server that sets up a Unix socket and listens for clients which send/receive data. I've made a small repository to recreate the problem.
The server runs and it can receive data from the clients that connect, but I can't get the server response to be read from the client without an error on the server.
I have commented out the offending code on the client and server. Uncomment both to recreate the problem.
When the code to respond to the client is uncommented, I get this error on the server:
thread '<unnamed>' panicked at 'called Result::unwrap() on an Err value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', src/main.rs:77:42
Your code calls set_read_timeout to set the timeout on the socket. Its documentation states that on Unix it results in a WouldBlock error in case of timeout, which is precisely what happens to you.
As to why your client times out, the likely reason is that the server calls stream.read_to_string(&mut response), which reads the stream until end-of-file. On the other hand, your client calls write_all() followed by flush(), and (after uncommenting the offending code) attempts to read the response. But the attempt to read the response means that the stream is not closed, so the server will wait for EOF, and you have a deadlock on your hands. Note that none of this is specific to Rust; you would have the exact same issue in C++ or Python.
To fix the issue, you need to use a protocol in your communication. A very simple protocol could consist of first sending the message size (in a fixed format, perhaps 4 bytes in length) and only then the actual message. The code that reads from the stream would do the same: first read the message size and then the message itself. Even better than inventing your own protocol would be to use an existing one, e.g. to exchange messages using serde.
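A minimal sketch of the size-then-message protocol described above, written in Python rather than Rust to keep it short. The 4-byte big-endian header and the names send_msg and recv_msg are assumptions made for this illustration; they are not taken from the question's repository.

```python
import socket
import struct

HEADER = struct.Struct(">I")  # assumed format: 4-byte unsigned length, big-endian


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send the payload preceded by its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_msg(reader) -> bytes:
    """Read one length-prefixed message from a buffered binary reader."""
    header = reader.read(HEADER.size)
    if len(header) < HEADER.size:
        raise EOFError("stream ended before a full length header arrived")
    (length,) = HEADER.unpack(header)
    payload = reader.read(length)
    if len(payload) < length:
        raise EOFError("stream ended in the middle of a message")
    return payload


client, server = socket.socketpair()
server_reader = server.makefile("rb")
client_reader = client.makefile("rb")

send_msg(client, b"hello server")
request = recv_msg(server_reader)  # returns without waiting for end-of-file
send_msg(server, b"hello client")
response = recv_msg(client_reader)

for obj in (server_reader, client_reader, client, server):
    obj.close()
```

Because the server knows where the request ends, it can reply while the connection is still open, which avoids the wait-for-EOF deadlock.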

I cannot send short messages over TCP

I am having trouble tuning TCP client-server communication.
My current project has a client running on a PC (C#) and a server
running on embedded Linux 4.1.22-ltsi.
They currently exchange data over UDP.
The client and server work in blocking mode and
send each other short messages
(16, 60, 200 bytes, etc.) that contain either a command or a set of parameters.
The messages do not include any header with the message length because
UDP is a message-oriented protocol: its recvfrom() API returns the number of bytes received.
My server's program structure depends on receiving and processing each message whole, one at a time.
The problem arises when I try to switch from UDP to TCP.
The server's receive buffer (for the recv() TCP API) is 2048 bytes:
#define UDP_RX_BUF_SIZE 2048
numbytes = recv(fd_connect, rx_buffer, UDP_RX_BUF_SIZE, MSG_WAITALL/*BLOCKING_MODE*/);
So recv() returns only when rx_buffer is full, i.e. after it has received
2048 bytes. This breaks the whole program design. In other words, when the client sends a 16-byte command
to the server and waits for an answer, the server's recv() keeps the message
"in its stomach" until it has received 2048 bytes.
I tried to fix it as below, without success:
On client side (C#) I set the socket parameter theSocket.NoDelay.
When I checked this on the sniffer I saw that client sends messages "as I want",
with requested length.
On server side I set TCP_NODELAY socket option to 1
int optval= 1;
setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval));
On server side (Linux) I checked socket options SO_SNDLOWAT/SO_RCVLOWAT and they are 1 byte each one.
Please see the attached picture of the sniffer's log. 10.0.0.10 is the client and 10.0.0.106 is the server. It shows that the client sets the PSH (push) flag, telling the server side to pass the incoming data to the application immediately instead of filling a buffer.
Additional question: what are the SSH encrypted packets that run between the two sides? I suppose they are sent by my Eclipse debugger on the PC (which runs the server application through the same Ethernet connection). Am I right?
So, my problem is how to make the recv() API return each short message (16, 60, 200 bytes, etc.) instead of accumulating them until the receive buffer fills.
TCP is connection-oriented and preserves the order of the bytes sent, but it delivers a stream of bytes, not individual messages as UDP does. Note that the MSG_WAITALL flag in your recv() call asks it to block until the full requested length (2048 bytes here) has arrived, which is the behaviour you describe.
So the sender needs to put the message length (and optionally a marker) in the initial bytes of each message.
The receiver can then first read the message length, read data until that length is reached, and then expect the next length.
You can also look at libraries such as Netty or ZeroMQ, which do this extra work for you.
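The length-then-data loop described above can be sketched as an incremental decoder. Python is used for brevity although the server in the question is written in C. The 2-byte big-endian length header and the names FrameDecoder and frame are assumptions made for this illustration.

```python
import struct

LENGTH = struct.Struct(">H")  # assumed header: 2-byte unsigned length, big-endian


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return LENGTH.pack(len(payload)) + payload


class FrameDecoder:
    """Accumulates received bytes and returns every complete message."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        self._buf += data
        messages = []
        while len(self._buf) >= LENGTH.size:
            (length,) = LENGTH.unpack_from(self._buf, 0)
            end = LENGTH.size + length
            if len(self._buf) < end:
                break  # message incomplete, wait for more bytes
            messages.append(bytes(self._buf[LENGTH.size:end]))
            del self._buf[:end]
        return messages


# Three messages of 16, 60 and 200 bytes arrive as two arbitrary chunks.
stream = frame(b"A" * 16) + frame(b"B" * 60) + frame(b"C" * 200)
decoder = FrameDecoder()
first = decoder.feed(stream[:50])   # holds message 1 and part of message 2
second = decoder.feed(stream[50:])  # completes message 2 and holds message 3
```

In a real server each chunk passed to feed would come from a plain recv() call (without MSG_WAITALL), so a 16-byte command is decoded as soon as it arrives.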

Using "send" to tcp socket/Windows/c

For the C send function (in blocking mode), it is specified that the function returns the number of bytes sent once they are "received on destinations". I'm not sure that I understand all the nuances, even after writing a demo app with WSAIoctl and WSARecv on the server side.
When does send return fewer bytes than requested in the buffer-length parameter?
What counts as "received on destinations"? My first guess is that it means the data sits in the server's OS buffer and the server application has been notified. My second guess is that it means the server application's recv call has read it fully.
Unless you are using a (somewhat exotic) library, a send on a socket returns the number of bytes successfully passed to the TCP buffer, not the number of bytes received by the peer (see Microsoft's docs, for example).
When you are streaming data via a socket, you need to check how many bytes were actually accepted into the TCP send buffer. That's why a send call usually sits inside a loop that issues several sends if needed.
Errors in send are local: for example, if the socket is closed by the peer during a send operation (making your socket invalid) or if the operation times out (TCP buffer not emptying, i.e. the peer not receiving data fast enough, or some other trouble).
After all sends have completed, you have no easy way of knowing whether the peer received all the bytes you sent. You'll usually just issue closesocket and make sure that your socket has a proper linger option set (i.e. only close after a timeout or after successfully finishing the send). Alternatively, you wait for a confirmation from the peer (for example, via a recv that returns zero bytes, indicating that the connection was gracefully closed).
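The send loop mentioned above might look like the following sketch, shown in Python instead of C for brevity. send_all is a hypothetical helper name, and TrickleSocket is an invented stand-in that accepts at most 3 bytes per call so that the partial-send case is reproducible; a real socket decides for itself how much it accepts.

```python
def send_all(sock, data: bytes) -> int:
    """Keep calling send() until every byte has been accepted locally."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])  # may accept only part of the buffer
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total += sent
    return total


class TrickleSocket:
    """Hypothetical stand-in for a socket that takes at most 3 bytes per send."""

    def __init__(self) -> None:
        self.accepted = bytearray()
        self.calls = 0

    def send(self, data) -> int:
        chunk = bytes(data[:3])
        self.accepted += chunk
        self.calls += 1
        return len(chunk)


fake = TrickleSocket()
count = send_all(fake, b"0123456789")
```

The returned count only says that the bytes were handed to the local stack, not that the peer has read them. Python's own socket.sendall performs the same kind of loop.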

What is meant by record or data boundaries in the sense of TCP & UDP protocol?

I am learning about sockets and came across the term "data boundaries" (or "record boundaries") in connection with SOCK_SEQPACKET. Can anyone explain in simple words what a data boundary is and how SOCK_SEQPACKET differs from SOCK_STREAM and SOCK_DGRAM?
This answer https://stackoverflow.com/a/9563694/1076479 has a good succinct explanation of message boundaries (a different name for "record boundaries").
Extending that answer to SOCK_SEQPACKET:
SOCK_STREAM provides reliable, sequenced communication of streams of data between two peers. It does not maintain message (record) boundaries, which means the application must manage its own boundaries on top of the stream provided.
SOCK_DGRAM provides unreliable transmission of datagrams. Datagrams are self-contained capsules and their boundaries are maintained. That means if you send a 20 byte buffer on peer A, peer B will receive a 20 byte message. However, they can be dropped, or received out of order, and it's up to the application to figure that out and handle it.
SOCK_SEQPACKET is a newer technology that is not yet widely used, but tries to marry the benefits of both of the above. That is, it provides reliable, sequenced communication that also transmits entire "datagrams" as a unit (and hence maintains message boundaries).
It's easiest to demonstrate the concept of message boundaries by showing what happens when they're neglected. Beginners often post client code like this here on SO (using Python for convenience):
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('192.168.4.122', 9000))
s.send(b'FOO') # Send string 1
s.send(b'BAR') # Send string 2
reply = s.recv(128) # Receive reply
And server code similar to this:
lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lsock.bind(('', 9000))
lsock.listen(5)
csock, caddr = lsock.accept()
string1 = csock.recv(128) # Receive first string
string2 = csock.recv(128) # Receive second string <== XXXXXXX
csock.send(b'Got your messages') # Send reply
They don't understand then why the server hangs on the second recv call, while the client is hung on its own recv call. That happens because both strings the client sent (may) get bundled together and received as a single unit in the first recv on the server side. That is, the message boundary between the two logical messages was not preserved, and so string1 will often contain both chunks run together: 'FOOBAR'
(Often there are other timing-related aspects to the code that influence when/whether that actually happens or not.)
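One way to repair the example above is to add the boundaries back at the application level. The sketch below assumes a newline is an acceptable delimiter (it must never occur inside a message) and uses a local socket pair in place of the 192.168.4.122:9000 connection so that it runs on its own.

```python
import socket

client, server = socket.socketpair()
server_reader = server.makefile("rb")
client_reader = client.makefile("rb")

# Client: terminate every logical message with the assumed delimiter.
client.sendall(b"FOO\n")  # Send string 1
client.sendall(b"BAR\n")  # Send string 2

# Server: readline() returns one delimited message at a time, even if both
# strings arrived together in a single chunk.
string1 = server_reader.readline().rstrip(b"\n")
string2 = server_reader.readline().rstrip(b"\n")
server.sendall(b"Got your messages\n")

# Client: the reply is delimited the same way.
reply = client_reader.readline().rstrip(b"\n")

for obj in (server_reader, client_reader, client, server):
    obj.close()
```

Neither side hangs, because each read now waits for a delimiter rather than for whatever one recv call happens to return.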

How is determining body length by closing connection reliable (RFC 2616 4.4.5)

I can't get one thing straight. The RFC 2616 in 4.4.5 states that Message Length can be determined "By the server closing the connection.".
This implies, that it is valid for a server to respond (e.g. returning a large image) with a response, that has no Content-Length in the header, but the client is supposed to keep fetching till the connection is closed and then assume all data has been downloaded.
But how is a client to know for sure that the connection was closed intentionally by the server? A server app could have crashed in the middle of sending the data and the server's OS would most likely send FIN packet to gracefully close the TCP connection with the client.
You are absolutely right, that mechanism is totally unreliable. This is covered in RFC 7230:
Since there is no way to distinguish a successfully completed,
close-delimited message from a partially received message interrupted
by network failure, a server SHOULD generate encoding or
length-delimited messages whenever possible. The close-delimiting
feature exists primarily for backwards compatibility with HTTP/1.0.
Fortunately, most HTTP traffic today is HTTP/1.1, with Content-Length or "Transfer-Encoding" explicitly defining the end of the message.
The lesson is that a message must have its own way of terminating; we cannot repurpose the underlying transport layer's EOF as the message's EOF.
On that note, a (well-formed) HTML document, or a .gif, .avi, etc., does define its own termination; we will know if we have received an incomplete document. Therefore it is not much of a problem to transmit it over HTTP/1.0 without Content-Length.
However, for plain text documents, JavaScript, CSS, etc., EOF is used to mark the end of the document, so they are problematic over HTTP/1.0.
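The difference between the two delimiting styles can be illustrated with a small sketch. read_body is a hypothetical helper, not part of any HTTP library, and io.BytesIO stands in for a connection that ended after 9 of the 11 bytes of a body.

```python
import io


def read_body(stream, content_length=None):
    """Return (body, complete).

    complete is True or False for a length-delimited body, and None for a
    close-delimited body, where truncation cannot be detected.
    """
    if content_length is None:
        return stream.read(), None  # read until EOF, completeness unknown
    body = stream.read(content_length)
    return body, len(body) == content_length


# The intended body was b"hello world" (11 bytes), but the stream ended early.
with_length = read_body(io.BytesIO(b"hello wor"), content_length=11)
close_delimited = read_body(io.BytesIO(b"hello wor"))
intact = read_body(io.BytesIO(b"hello world"), content_length=11)
```

With a declared length the receiver can tell that two bytes are missing; in the close-delimited case the same nine bytes look like a complete message.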