c++ posix sockets recv functionality - sockets

I have a perhaps noobish question to ask, I've looked around but haven't seen a direct answer addressing it and thought I might get a quick answer here. In a simple TCP/IP client-server select loop using bsd sockets, if a client sends two messages that arrive simultaneously at a server, would one call to recv at the server return both messages bundled together in the buffer, or does recv force each distinct arriving message to be read separately?
I ask because I'm working in an environment where I can't tell how the client is building its messages to send. Normally recv reports that 12 bytes are read, then 915, then 12, then 915, and so on in an alternating pattern... but sometimes it reports 927 (which is 915+12). I was thinking that either the client is bundling some of its information together before sending it to the server, or that the messages arrive before recv is invoked and recv then pulls all the pending bytes at once. So I wanted to make sure I understood recv's behavior properly. I think perhaps I'm missing something here in my understanding, and I hope someone can point it out, thanks!

TCP/IP is a stream-based transport, not a datagram-based transport. In a stream, there is no 1-to-1 correlation between send() and recv(). That is only true for datagrams. So, you have to be prepared to handle multiple possibilities:
a single call to send() may fit in a single TCP packet and be read in full by a single call to recv().
a single call to send() may span multiple TCP packets and need multiple calls to recv() to read everything.
multiple calls to send() may fit in a single TCP packet and be read in full by a single call to recv().
multiple calls to send() may span multiple TCP packets and require multiple calls to recv() to read everything.
To illustrate this, consider two messages are being sent - send("hello", 5) and send("world", 5). The following are a few possible combinations when calling recv():
"hello" "world"
"hel" "lo" "world"
"helloworld"
"hel" "lo" "worl" "d"
"he" "llow" "or" "ld"
Get the idea? This is simply how TCP/IP works. Every TCP/IP implementation has to account for this fragmentation.
In order to receive data properly, there has to be a clear separation between logical messages, not individual calls to send(), as it may take multiple calls to send() to send a single message, and multiple recv() calls to receive a single message in full. So, taking the earlier example into account, let's add a separator between the messages:
send("hello\n", 6);
send("world", 5);
send("\n", 1);
On the receiving end, you would call recv() as many times as it takes until a \n character is received, then you would process everything you had received leading up to that character. If there is any read data left over when finished, save it for later processing and start calling recv() again until the next \n character, and so on.
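A minimal C++ sketch of that delimiter-based receive loop, assuming a connected TCP socket descriptor (read_line() and the leftover buffer are illustrative names, not from the answer above):
#include <cstddef>
#include <optional>
#include <string>
#include <sys/socket.h>

// Keeps calling recv() until a '\n' arrives, returning one complete line.
// Bytes read past the '\n' stay in "leftover" for the next call.
std::optional<std::string> read_line(int fd, std::string &leftover)
{
    for (;;) {
        std::size_t pos = leftover.find('\n');
        if (pos != std::string::npos) {
            std::string line = leftover.substr(0, pos);
            leftover.erase(0, pos + 1);
            return line;
        }
        char buf[4096];
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n <= 0)                     // 0 = peer closed, -1 = error
            return std::nullopt;
        leftover.append(buf, static_cast<std::size_t>(n));
    }
}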
Sometimes, it is not possible to place a unique character between messages (maybe the message body allows all characters to be used, so there is no distinct character available to use as a separator). In that case, you need to prefix each message with its length instead, either as a preceding integer, a structured header, etc. Then you simply call recv() as many times as needed until you have received the full integer/header, then you call recv() as many times as needed to read just as many bytes as the length/header specifies. When finished, save any remaining data if needed, and start calling recv() all over again to read the next message length/header, and so on.
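A similarly hedged sketch of the length-prefixed approach, assuming the peer sends a 4-byte big-endian length header (recv_exact() and recv_message() are illustrative helper names):
#include <arpa/inet.h>
#include <sys/socket.h>
#include <cstddef>
#include <cstdint>
#include <vector>

// Loops recv() until exactly "len" bytes have arrived (or the peer closes).
static bool recv_exact(int fd, void *dst, std::size_t len)
{
    char *p = static_cast<char *>(dst);
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0) return false;       // 0 = connection closed, -1 = error
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Reads one framed message: a 4-byte length, then that many payload bytes.
static bool recv_message(int fd, std::vector<char> &payload)
{
    std::uint32_t net_len = 0;
    if (!recv_exact(fd, &net_len, sizeof net_len)) return false;
    std::uint32_t len = ntohl(net_len);
    payload.resize(len);
    return len == 0 || recv_exact(fd, payload.data(), len);
}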

It is definitely valid for both messages to be returned in a single recv call (see Nagle's Algorithm). TCP/IP guarantees order (the bytes from the messages won't be mixed). In addition to them being returned together in a single call, it is also possible for a single message to require multiple calls to recv (although it would be unlikely with packets as small as described).

The only thing you can count on is the order of the bytes. You cannot count on how they are partitioned into recv calls. Sometimes things get merged, either at the endpoint or along the way. Things can also get broken up along the way and so arrive independently. It does sound like your sender is sending alternating 12-byte and 915-byte messages, but you can't count on it.

Related

Does UDP allow repacketization?

I know that for TCP you can have for example Nagle's Algorithm enabled. However, can you have something similar for UDP?
Practical question (assume a UDP socket):
If I call send() two times in a short period of time with 1 byte of data in each send() call, is it possible that the transport layer decides to send only 1 UDP packet with the 1 byte + 1 byte = 2 bytes of data?
Thanks in advance!
No. UDP datagrams are delivered intact exactly as sent, or not at all.
Not according to the RFC (RFC 768). Beyond the facilities of IP itself, UDP really only provides, as extras, port-based routing and a little bit of extra detection for corruption or misrouting.
That means there's no facility to combine datagrams. In fact, since it's meant to be transaction-oriented, I would say that combining two transactions into one may well be a bad idea in terms of keeping those transactions disparate.
Otherwise, you would need a layer above UDP which could figure out how to extract these transactions from a datagram. At the moment, that's not necessary since the datagram is the transaction.
As added support (though not, of course, definitive) for this contention, see the UDP wikipedia page:
Datagrams – Packets are sent individually and are checked for integrity only if they arrive. Packets have definite boundaries which are honored upon receipt, meaning a read operation at the receiver socket will yield an entire message as it was originally sent.
However, the best support for it comes from one of its clients. UDP was specially engineered for TFTP (among other things) and that protocol breaks down if you cannot distinguish a transaction.
Specifically, one of the TFTP transaction types is the data transaction which consists of an opcode, block number and up to 512 bytes of data. Without a length indication at the start or a sentinel value at the end, there is no way to work out where the next transaction would start unless there is a one-to-one mapping between transaction and datagram.
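Purely as an illustration of that layout (per RFC 1350; a C++ sketch, not meant for direct wire (de)serialization because of struct padding):
#include <cstddef>
#include <cstdint>

// TFTP DATA transaction: 2-byte opcode (3 = DATA), 2-byte block number,
// then 0-512 bytes of data. The data length is not carried in the packet;
// it is recovered from the size of the UDP datagram that delivered it,
// which is why the one-datagram-per-transaction mapping matters.
struct TftpData {
    std::uint16_t opcode;      // 3 for DATA, network byte order on the wire
    std::uint16_t block;       // block number, network byte order on the wire
    std::size_t   data_len;    // 0-512, inferred from the datagram length
    unsigned char data[512];
};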
As an aside, the other four TFTP transaction types have either a fixed length or end-of-string sentinel values but the data transaction is the decider here.

Sending And Receiving Sockets (TCP/IP)

I know that it is possible for multiple packets to be stacked in the buffer waiting to be read, and that a long packet might require a loop of multiple send attempts to be fully sent. But I have a question about packaging in these cases:
If I call recv (or any alternative (low-level) function) when there are multiple packets awaiting to be read, would it return them all stacked into my buffer or only one of them (or part of the first one if my buffer is insufficient)?
If I send a long packet which requires multiple iterations to send fully, does it count as a single packet or as multiple packets? Basically, is there any marker indicating that the data sent so far is not the complete packet?
These questions came to my mind when I thought about WebSocket packaging. Special characters are used to mark the beginning and end of a packet, which sort of leads to the conclusion that it's otherwise not possible to separate multiple packets.
P.S. All the questions are about TCP/IP but you are welcomed to share information (answers) about UDP as well.
TCP sockets are stream based. The order is guaranteed but the number of bytes you receive with each recv/read could be any chunk of the pending bytes from the sender. You can layer a message based transport on top of TCP by adding framing information to indicate the way that the payload should be chunked into messages. This is what WebSockets does. Each WebSocket message/frame starts with at least 2 bytes of header information which contains the length of the payload to follow. This allows the receiver to wait for and re-assemble complete messages.
For example, in libraries/interfaces that implement the standard WebSocket API or a similar API (such as a browser), the onmessage event will fire once for each message received and the data attribute of the event will contain the entire message.
Note that in the older Hixie version of WebSockets, each frame was started with '\x00' and terminated with '\xff'. The current standardized IETF RFC 6455 (HyBi) version of the protocol uses header information that contains the length, which allows much easier processing of the frames (but note that both the old and new versions are still message based and have basically the same API).
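As a hedged C++ sketch of how a receiver works out the payload length from an RFC 6455 frame header (parse_ws_header() and WsHeader are illustrative names, and the code assumes "buf" already holds the complete header):
#include <cstddef>
#include <cstdint>

struct WsHeader {
    std::uint64_t payload_len;   // length of the payload that follows the header
    std::size_t   header_len;    // number of header bytes consumed
    bool          masked;        // client-to-server frames are masked
};

WsHeader parse_ws_header(const unsigned char *buf)
{
    // buf[0] carries the FIN bit (0x80) and opcode (0x0F); not needed here.
    WsHeader h{};
    h.masked = (buf[1] & 0x80) != 0;
    std::uint64_t len = buf[1] & 0x7F;        // 7-bit base length field
    std::size_t pos = 2;
    if (len == 126) {                         // 16-bit extended length follows
        len = (std::uint64_t(buf[2]) << 8) | buf[3];
        pos = 4;
    } else if (len == 127) {                  // 64-bit extended length follows
        len = 0;
        for (int i = 0; i < 8; ++i)
            len = (len << 8) | buf[2 + i];
        pos = 10;
    }
    if (h.masked)
        pos += 4;                             // 4-byte masking key follows
    h.payload_len = len;
    h.header_len  = pos;
    return h;
}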
A TCP connection provides a stream of bytes, so treat it as such. No application message boundaries are preserved - one send can correspond to multiple receives and the other way around. You need loops on both sides.
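A minimal sketch of the sending-side loop in POSIX C++ (send_all() is an illustrative name; real code would also handle EINTR and non-blocking EAGAIN):
#include <sys/socket.h>
#include <cstddef>

// send() may accept fewer bytes than requested, so keep calling it until
// everything has been handed to the kernel.
bool send_all(int fd, const char *data, std::size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, data, len, 0);
        if (n < 0)
            return false;
        data += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}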
UDP, on the other hand, is datagram (i.e. message) based. Here one read will always dequeue a single datagram (unless you mess with low-level flags on the socket). Even if your application buffer is smaller than the pending datagram and you read only part of it, the rest of it is lost. The way around it is to limit the size of the datagrams you send to something below the normal MTU of 1500 (less IP and UDP headers, so actually 1472).
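A short sketch of that datagram behaviour on a POSIX UDP socket (read_one_datagram() is an illustrative name):
#include <netinet/in.h>
#include <sys/socket.h>

// One recvfrom() call dequeues exactly one datagram; bytes that do not fit
// in the buffer are discarded. A second datagram, even if already queued,
// requires another recvfrom() call.
void read_one_datagram(int udp_fd)
{
    char buf[1472];                    // stays under a typical 1500-byte MTU
    sockaddr_in from{};
    socklen_t from_len = sizeof from;
    ssize_t n = recvfrom(udp_fd, buf, sizeof buf, 0,
                         reinterpret_cast<sockaddr *>(&from), &from_len);
    (void)n;                           // n is the size of this single datagram, or -1 on error
}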

If you send data over a socket in one call to send(), will it be received in one call to receive()?

I've seen several uses of sockets where programmers send a command or some information over a TCP/IP socket, and expect it to be received in one call on the receiving side.
For example, transmitting
mySocket.Send("SomeSpecificCommand")
They assume the receive side will receive all the data in one call. For example:
Dim data(255) As Byte
Dim nReceived As Long = s.Receive(data, 0, data.Count, SocketFlags.None)
Dim str As String = Encoding.ASCII.GetString(data, 0, nReceived) ' use the byte count returned by Receive
If str = "SomeSpecificCommand" Then
DoStuff()
...
The example above doesn't use any terminator, so the programmer is relying on the fact that the sockets implementation is not allowed, for example, to return "SomeSpecif" in a first call to Receive(), and "cCommand" in a later call to Receive(). (NOTE - In the example, the buffer is sized to be larger than the expected string).
I've never before given this much thought and had just assumed that this type of coding is unsafe and have always used delimiters. Have I been wasting my time (and processor cycles)?
There is no guarantee that it will all arrive at the same time. The code (the app's protocol) needs to deal with the possibility that data from one send may arrive in multiple pieces or the possibility that data from more than one send could arrive in one receive.
Short snippets of data sent in one short call to send() will usually arrive in one call to recv(), which is why code like that will work most of the time. However, it's not guaranteed and therefore bad practice to rely on it.
TCP buffers the data and may split it up as it sees fit. TCP tries to send as few packets as possible to conserve bandwidth, so it won't split up the data for no good reason. However, if it's been queueing up some data and the data from one call to send() happens to straddle a packet boundary, that data will be split up.
Alternately, TCP could try to send it in one packet, but then a router anywhere along the path to the destination could come back and say "this packet is too big!". Then TCP will split it into smaller packets.
When sending data across a network, you should expect your data to be fragmented across multiple packets and structure your code and data to deal with this. In the example case where you are sending a handful of bytes, everything will work fine... until you start sending larger packets.
If you are expecting to receive one message at a time then you can just loop reading bytes for an interval after the first bytes arrive. This is simple but inefficient.
A delimiter could be used as suggested but then you have to guard against accidentally including the delimiter within the regular data. If you are only sending text then you can use null or some non-printable character. If you are sending binary data then this becomes more difficult as any occurrence of the delimiter within the data needs to be escaped by the sender and un-escaped by the receiver.
An alternative to delimiters is to add a field to the front of the data containing a message length. This is better than using a delimiter as it removes the need for escaping data and better than simply looping until a timer expires as it will be more responsive.
No, it's not a good idea to assume that the server (assuming you're the client) is going to send you only one socket response. The server could be running through a list of procedures that return multiple results. I would continue to read from the socket until there is nothing left to pick up, then wait a few milliseconds and test again. If nothing shows up, chances are good that the server has finished sending responses.
There are several types of sockets. TCP uses SOCK_STREAM, which doesn't preserve message boundaries. SOCK_SEQPACKET sockets do preserve message boundaries.
EDIT: SCTP supports both SOCK_STREAM and SOCK_SEQPACKET.

Why don't I get all the data when with my non-blocking Perl socket?

I'm using Perl sockets in AIX 5.3, Perl version 5.8.2
I have a server written in Perl sockets. There is an option called "Blocking", which can be set to 0 or 1. When I use Blocking => 0, run the server, and have the client send data (5000 bytes), I am able to receive only 2902 bytes in one call. When I use Blocking => 1, I am able to receive all the bytes in one call.
Is this how sockets work or is it a bug?
This is a fundamental part of sockets - or rather, TCP, which is stream-oriented. (UDP is packet-oriented.)
You should never assume that you'll get back as much data as you ask for, nor that there isn't more data available. Basically, more data can come at any time while the connection is open. (The read/recv/whatever call will probably return a specific value to mean "the other end closed the connection".)
This means you have to design your protocol to handle this - if you're effectively trying to pass discrete messages from A to B, two common ways of doing this are:
Prefix each message with a length. The reader first reads the length, then keeps reading the data until it's read as much as it needs.
Have some sort of message terminator/delimiter. This is trickier, as depending on what you're doing you may need to be aware of the possibility of reading the start of the next message while you're reading the first one. It also means "understanding" the data itself in the "reading" code, rather than just reading bytes arbitrarily. However, it does mean that the sender doesn't need to know how long the message is before starting to send.
(The other alternative is to have just one message for the whole connection - i.e. you read until the connection is closed.)
Blocking means that the socket waits till there is data there before returning from a receive function. It's entirely possible there's a tiny wait on the end as well to try to fill the buffer before returning, or it could just be a timing issue. It's also entirely possible that the non-blocking implementation returns one packet at a time, no matter if there's more than one or not. In short, no, it's not a bug, but the specific 'why' of it is the old cop-out "it's implementation specific".

Send multiple datagrams using a single send() call?

When a datagram-based socket (raw socket or UDP) is used with a gather-style send, all the data is concatenated to form a single IP packet. Is there a way to send several datagrams using a single call?
The call you are looking for is sendmmsg(); however, it is not yet implemented or even up for much discussion. You can see its receive-side twin, recvmmsg(), in the latest 2.6.33 Linux kernel.
I don't think so... How would you expect the IP stack to infer where you intend each datagram boundary to be?
What you are asking is a bit funny, since gather-style send(), as the name says, gathers data from multiple places in memory and puts it together into one buffer, which it then sends.
So you have multiple pieces of data that you want to send as multiple datagrams. Why don't you send them with separate calls to send()?
You can actually call connect() on a datagram socket to specify a default target, so you can send() or write() without specifying the destination address each time.
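For example, a minimal sketch (error handling omitted; make_connected_udp() is an illustrative name):
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

// After connect(), plain send()/write() on the socket go to the fixed peer
// address, and each call still produces exactly one datagram.
int make_connected_udp(const char *ip, unsigned short port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    sockaddr_in peer{};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    inet_pton(AF_INET, ip, &peer.sin_addr);
    connect(fd, reinterpret_cast<sockaddr *>(&peer), sizeof peer);
    return fd;
}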