Send multiple datagrams using a single send() call? - sockets

When a datagram-based socket (raw socket or UDP) is used with a gather-style send, all the data is concatenated to form a single IP packet. Is there a way to send several datagrams using a single call?

The call you are looking for is sendmmsg(); however, it is not yet implemented or even up for much discussion. You can see its receive-side twin recvmmsg() in the latest 2.6.33 Linux kernel.
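For reference, sendmmsg() did eventually land in Linux 3.0. A minimal, Linux-specific sketch of sending two UDP datagrams with one call might look like the following; the destination address and port are placeholders and error handling is omitted:

    /* Sketch (Linux-specific): send two UDP datagrams with one sendmmsg() call. */
    #define _GNU_SOURCE
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in dst = {0};
        dst.sin_family = AF_INET;
        dst.sin_port = htons(9999);                  /* placeholder destination */
        inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

        char *payloads[2] = { "first datagram", "second datagram" };
        struct iovec iov[2];
        struct mmsghdr msgs[2];
        memset(msgs, 0, sizeof(msgs));

        for (int i = 0; i < 2; i++) {
            iov[i].iov_base = payloads[i];
            iov[i].iov_len  = strlen(payloads[i]);
            msgs[i].msg_hdr.msg_iov     = &iov[i];   /* one buffer per datagram */
            msgs[i].msg_hdr.msg_iovlen  = 1;
            msgs[i].msg_hdr.msg_name    = &dst;      /* each message may have its own destination */
            msgs[i].msg_hdr.msg_namelen = sizeof(dst);
        }

        int sent = sendmmsg(fd, msgs, 2, 0);         /* number of messages sent, or -1 */
        printf("datagrams sent: %d\n", sent);
        return 0;
    }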

I don't think so... How would you expect the IP stack to infer where you intend the datagram boundaries to be?

What you are asking is a bit odd, since a gather-style send(), as the name says, gathers data from multiple places in memory, puts it together into one buffer, and then sends that buffer.
So if you have multiple pieces of data that you want to send as multiple datagrams, why not send them with separate calls to send()?
You can actually call connect() on a datagram socket to specify a default target, so you can send() or write() without specifying the destination address each time.
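A small sketch of that last point, assuming a UDP socket; the address 192.0.2.1:5000 is just a placeholder. connect() on a datagram socket merely records a default peer, and each send()/write() still produces its own datagram:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in dst = {0};
        dst.sin_family = AF_INET;
        dst.sin_port = htons(5000);                  /* placeholder destination */
        inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);

        /* "Connecting" a datagram socket only records the default peer;
           no packets are exchanged on the wire. */
        connect(fd, (struct sockaddr *)&dst, sizeof(dst));

        /* Each of these calls still produces one separate datagram. */
        send(fd, "one", 3, 0);
        write(fd, "two", 3);

        close(fd);
        return 0;
    }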

Related

C# BeginSend/BeginReceive sometimes send or receive data attached [duplicate]

I have two apps sending TCP packets, both written in Python 2. When the client sends TCP packets to the server too fast, the packets get concatenated. Is there a way to make Python recover only the last sent packet from the socket? I will be sending files with it, so I cannot just use some character as a packet terminator, because I don't know the content of the file.
TCP uses packets for transmission, but this is not exposed to the application. Instead, the TCP layer may decide how to break the data into packets, even fragments, and how to deliver them. Often, this happens because of the underlying network topology.
From an application point of view, you should consider a TCP connection as a stream of octets, i.e. your data unit is the byte, not a packet.
If you want to transmit "packets", use a datagram-oriented protocol such as UDP (but beware, there are size limits for such packets, and with UDP you need to take care of retransmissions yourself), or wrap them manually. For example, you could always send the packet length first, then the payload, over TCP. On the other side, read the size first, then you know how many bytes need to follow (beware, you may need to read more than once to get everything, because of fragmentation). Here, TCP will take care of in-order delivery and retransmission, so this is easier.
TCP is a streaming protocol which does not expose individual packets. While reading from the stream and getting whole packets back might work in some configurations, it will break with even minor changes to the operating system or networking hardware involved.
To resolve the issue, use a higher-level protocol to mark file boundaries. For example, you can prefix the file with its length in octets (bytes). Or, you can switch to a protocol that already handles this kind of thing, like HTTP.
First you need to know whether the packet is combined before it is sent or after. Use Wireshark to check whether the sender is sending one packet or two. If it is sending one, then your fix is to call flush() after each write. If the receiver is combining the data after receiving it, I do not know the answer.
You could change what you are sending: send the number of bytes, followed by the bytes themselves. Then the other side would know how many bytes to read.
Normally, TCP_NODELAY prevents that. But there are very few situations where you need to switch it on; one of the few valid ones is telnet-style applications.
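If you do decide you are in one of those situations, it is a one-line socket option. A minimal sketch, assuming sock is an already-created TCP socket:

    #include <netinet/in.h>
    #include <netinet/tcp.h>   /* TCP_NODELAY */
    #include <sys/socket.h>

    /* Disable Nagle's algorithm so small writes go out immediately
       instead of being coalesced by the sender. */
    static int disable_nagle(int sock)
    {
        int on = 1;
        return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
    }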
What you need is a protocol on top of the tcp connection. Think of the TCP connection as a pipe. You put things in one end of the pipe and get them out of the other. You cannot just send a file through this without both ends being coordinated. You have recognised you don't know how big it is and where it ends. This is your problem. Protocols take care of this. You don't have a protocol and so what you're writing is never going to be robust.
You say you don't know the length. Get the length of the file and transmit that in a header, followed by that number of bytes.
For example, if the header is a 64-bit number giving the length, then when you receive the header at the server end, you read the 64-bit number as the length and then keep reading until you have received that many bytes, which should be the end of the file.
Of course, this is extremely simplistic but that's the basics of it.
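A rough sketch of that length-header idea over a plain BSD socket. The 8-byte big-endian header and the helper names send_all/recv_all/send_message/recv_message are my own choices, not anything mandated by TCP:

    #include <stdint.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Write exactly 'len' bytes, looping over partial send()s. */
    static int send_all(int sock, const void *buf, size_t len)
    {
        const char *p = buf;
        while (len > 0) {
            ssize_t n = send(sock, p, len, 0);
            if (n <= 0) return -1;
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /* Read exactly 'len' bytes, looping over partial recv()s. */
    static int recv_all(int sock, void *buf, size_t len)
    {
        char *p = buf;
        while (len > 0) {
            ssize_t n = recv(sock, p, len, 0);
            if (n <= 0) return -1;           /* error, or the peer closed early */
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /* Sender: an 8-byte big-endian length header, then the payload. */
    static int send_message(int sock, const void *data, uint64_t len)
    {
        unsigned char hdr[8];
        for (int i = 0; i < 8; i++)
            hdr[i] = (unsigned char)(len >> (56 - 8 * i));
        if (send_all(sock, hdr, sizeof(hdr)) < 0) return -1;
        return send_all(sock, data, len);
    }

    /* Receiver: read the 8-byte header, then exactly that many payload bytes.
       The caller must make sure 'buf' is large enough for the announced length. */
    static int recv_message(int sock, void *buf, uint64_t *out_len)
    {
        unsigned char hdr[8];
        uint64_t len = 0;
        if (recv_all(sock, hdr, sizeof(hdr)) < 0) return -1;
        for (int i = 0; i < 8; i++)
            len = (len << 8) | hdr[i];
        *out_len = len;
        return recv_all(sock, buf, len);
    }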
In fact, you don't have to design your own protocol. You could go to the internet and use an existing protocol. Such as HTTP.

how can I transfer large data over tcp socket

How can I transfer large data without splitting? I am using a TCP socket. It's for a game. I can't use UDP, and there might be 1200 values in an array. I am sending the array in JSON format, but the server receives it split into pieces.
Also, is there any option to send an HTTP request like TCP? I need the responses in order. Also, it should be faster.
Thanks,
You can't.
HTTP may chunk it
TCP will segment it
IP will packetize it
routers will fragment it ...
and TCP will reassemble it all at the other end.
There isn't a problem here to solve.
You do not have much control over splitting packets/datagrams. The network decides about this.
In the case of IP, you have the DF (don't fragment) flag, but I doubt it will be of much help here. If you are communicating over Ethernet, then a 1200-element array may not fit into a single Ethernet frame (the payload size is limited by the MTU, typically 1500 octets).
Why does your application depend on the whole data arriving in a single unit, rather than over a single connection (potentially comprising multiple units)?
how can I transfer large data without splitting.
I'm interpreting the above to be roughly equivalent to "how can I transfer my data across a TCP connection using as few TCP packets as possible". As others have noted, there is no way to guarantee that your data will be placed into a single TCP packet -- but you can do some things to make it more likely. Here are some things I would do:
Keep a single TCP connection open. (HTTP traditionally opens a separate TCP connection for each request, but for low-latency you can't afford to do that. Instead you need to open a single TCP connection, keep it open, and continue sending/receiving data on it for as long as necessary).
Reduce the amount of data you need to send. (i.e. are there things that you are sending that the receiving program already knows? If so, don't send them)
Reduce the number of bytes you need to send. (The easiest way to do this is to zlib-compress your message-data before you send it, and have the receiving program decompress the message after receiving it. This can give you a size-reduction of 50-90%, depending on the content of your data)
Turn off Nagle's algorithm on your TCP socket. That will reduce latency by up to roughly 200 ms and discourage the TCP stack from playing unnecessary games with your data.
Send each data packet with a single send() call (if that means manually copying all of the data items into a separate memory buffer before calling send(), then so be it; a gather write such as writev() can avoid that copy, as in the sketch below).
Note that even after you do all of the above, the TCP layer will still sometimes spread your messages across multiple packets, etc -- that's just the way TCP works. And even if your local TCP stack never did that, the receiving computer's TCP stack would still sometimes merge the data from consecutive TCP packets together inside its receive buffer. So the receiving program is always going to "receive it like splitted" sometimes, because TCP is a stream-based protocol and does not maintain message boundaries. (If you want message boundaries, you'll have to do your own framing -- the easiest way is usually to send a fixed-size (e.g. 1, 2, or 4-byte) integer byte-count field before each message, so the receiver knows how many bytes it needs to read in before it has a full message to parse)
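Tying the last two points together: a gather write lets you send a length prefix and the payload in a single call without first copying them into one buffer. A rough sketch using POSIX writev(); the 4-byte big-endian prefix and the function name are my own choices:

    #include <stdint.h>
    #include <sys/types.h>
    #include <sys/uio.h>     /* writev() */

    /* Send a 4-byte big-endian length prefix plus the payload in one
       gather-write call. Note that writev() can still return a short
       write, so production code would loop until everything is sent. */
    static ssize_t send_framed(int sock, const void *payload, uint32_t len)
    {
        unsigned char hdr[4] = {
            (unsigned char)(len >> 24), (unsigned char)(len >> 16),
            (unsigned char)(len >> 8),  (unsigned char)(len)
        };
        struct iovec iov[2] = {
            { .iov_base = hdr,             .iov_len = sizeof(hdr) },
            { .iov_base = (void *)payload, .iov_len = len         },
        };
        return writev(sock, iov, 2);
    }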
Consider the idea that the issue may be elsewhere, or that you may be sending too much unnecessary data. For example, PHP has the isset() function. If you're creating an internet-based, turn-based game, you don't need to send all 1,200 variables back and forth every single time. Just send what changed, and when the other player receives that data, only change the variables that are set.

c++ posix sockets recv functionality

I have a perhaps noobish question to ask, I've looked around but haven't seen a direct answer addressing it and thought I might get a quick answer here. In a simple TCP/IP client-server select loop using bsd sockets, if a client sends two messages that arrive simultaneously at a server, would one call to recv at the server return both messages bundled together in the buffer, or does recv force each distinct arriving message to be read separately?
I ask because I'm working in an environment where I can't tell how the client is building its messages to send. Normally recv reports that 12 bytes are read, then 915, then 12 bytes, then 915, and so on in an alternating 12 to 915 pattern... but then sometimes it reports 927 (which is 915+12). I was thinking that either the client is bundling some of its information together before it sends it out to the server, or that the messages arrive before recv is invoked and then recv pulls all the pending bytes simultaneously. So I wanted to make sure I understood recv's behavior properly. I think perhaps I'm missing something here in my understanding, and I hope someone can point it out, thanks!
TCP/IP is a stream-based transport, not a datagram-based transport. In a stream, there is no 1-to-1 correlation between send() and recv(). That is only true for datagrams. So, you have to be prepared to handle multiple possibilities:
a single call to send() may fit in a single TCP packet and be read in full by a single call to recv().
a single call to send() may span multiple TCP packets and need multiple calls to recv() to read everything.
multiple calls to send() may fit in a single TCP packet and be read in full by a single call to recv().
multiple calls to send() may span multiple TCP packets and require multiple calls to recv() for each packet.
To illustrate this, consider two messages are being sent - send("hello", 5) and send("world", 5). The following are a few possible combinations when calling recv():
"hello" "world"
"hel" "lo" "world"
"helloworld"
"hel" "lo" "worl" "d"
"he" "llow" "or" "ld"
Get the idea? This is simply how TCP/IP works. Every TCP/IP implementation has to account for this fragmentation.
In order to receive data properly, there has to be a clear separation between logical messages, not individual calls to send(), as it may take multiple calls to send() to send a single message, and multiple recv() calls to receive a single message in full. So, taking the earlier example into account, let's add a separator between the messages:
send("hello\n", 6);
send("world", 5);
send("\n", 1);
On the receiving end, you would call recv() as many times as it takes until a \n character is received, then you would process everything you had received leading up to that character. If there is any read data left over when finished, save it for later processing and start calling recv() again until the next \n character, and so on.
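A rough sketch of that receive loop in plain C; the buffer size is arbitrary and overflow handling is deliberately left out to keep the idea visible:

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Keep calling recv() and process every complete '\n'-terminated
       message; whatever follows the last '\n' is saved for the next pass. */
    static void read_messages(int sock)
    {
        char buf[4096];
        size_t used = 0;

        for (;;) {
            ssize_t n = recv(sock, buf + used, sizeof(buf) - used, 0);
            if (n <= 0) break;                   /* error or connection closed */
            used += (size_t)n;

            char *start = buf;
            char *nl;
            while ((nl = memchr(start, '\n', used - (size_t)(start - buf))) != NULL) {
                *nl = '\0';
                printf("message: %s\n", start);  /* process one complete message */
                start = nl + 1;
            }
            used -= (size_t)(start - buf);
            memmove(buf, start, used);           /* keep the partial message */
        }
    }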
Sometimes, it is not possible to place a unique character between messages (maybe the message body allows all characters to be used, so there is no distinct character available to use as a separator). In that case, you need to prefix the message with the message's length instead, as a preceding integer, a structured header, etc. Then you simply call recv() as many times as needed until you have received the full integer/header, then you call recv() as many times as needed to read just as many bytes as the length/header specifies. When finished, save any remaining data if needed, and start calling recv() all over again to read the next message length/header, and so on.
It is definitely valid for both messages to be returned in a single recv call (see Nagle's Algorithm). TCP/IP guarantees order (the bytes from the messages won't be mixed). In addition to them being returned together in a single call, it is also possible for a single message to require multiple calls to recv (although it would be unlikely with packets as small as described).
The only thing you can count on is the order of the bytes. You cannot count on how they are partitioned into recv calls. Sometimes things get merged either at the endpoint or along the way. Things can also get broken up along the way and so arrive independently. It does sound like your sender is sending alternating 12 and 915 but you can't count on it.

Sending And Receiving Sockets (TCP/IP)

I know that it is possible for multiple packets to be stacked in the buffer waiting to be read, and that a long packet might require a loop of multiple send attempts to be fully sent. But I have a question about packaging in these cases:
If I call recv (or any alternative (low-level) function) when there are multiple packets awaiting to be read, would it return them all stacked into my buffer or only one of them (or part of the first one if my buffer is insufficient)?
If I send a long packet which requires multiple iterations to be sent fully, does it count as a single packet or multiple packets? Basically, is there any marker that the packet which was sent is not yet complete?
These questions came to my mind when I thought about WebSocket framing. Special characters are used to mark the beginning and end of a packet, which sort of leads to the conclusion that it's otherwise not possible to separate multiple packets.
P.S. All the questions are about TCP/IP but you are welcomed to share information (answers) about UDP as well.
TCP sockets are stream based. The order is guaranteed but the number of bytes you receive with each recv/read could be any chunk of the pending bytes from the sender. You can layer a message based transport on top of TCP by adding framing information to indicate the way that the payload should be chunked into messages. This is what WebSockets does. Each WebSocket message/frame starts with at least 2 bytes of header information which contains the length of the payload to follow. This allows the receiver to wait for and re-assemble complete messages.
For example, in libraries/interfaces that implement the standard WebSocket API or a similar API (such as a browser), the onmessage event will fire once for each message received, and the data attribute of the event will contain the entire message.
Note that in the older Hixie version of WebSockets, each frame was started with '\x00' and terminated with '\xff'. The current standardized IETF 6455 (HyBi) version of the protocol uses the header information that contains the length which allows much easier processing of the frames (but note that both the old and new are still message based and have basically the same API).
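To make that header layout concrete, here is a much-simplified sketch of decoding the RFC 6455 base header. It only extracts FIN, the opcode, the mask bit and the payload length; real code must also read the 4-byte masking key and unmask client-to-server payloads:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Simplified RFC 6455 frame-header parser. Returns the number of header
       bytes consumed (masking key excluded), or 0 if 'len' bytes are not yet
       enough to hold the full header. */
    static size_t ws_parse_header(const unsigned char *buf, size_t len,
                                  int *fin, int *opcode, int *masked,
                                  uint64_t *payload_len)
    {
        if (len < 2) return 0;

        *fin    = (buf[0] & 0x80) != 0;
        *opcode =  buf[0] & 0x0F;
        *masked = (buf[1] & 0x80) != 0;   /* client-to-server frames are always masked */

        uint64_t plen = buf[1] & 0x7F;
        size_t   hdr  = 2;

        if (plen == 126) {                /* 16-bit extended payload length */
            if (len < 4) return 0;
            plen = ((uint64_t)buf[2] << 8) | buf[3];
            hdr  = 4;
        } else if (plen == 127) {         /* 64-bit extended payload length */
            if (len < 10) return 0;
            plen = 0;
            for (int i = 0; i < 8; i++)
                plen = (plen << 8) | buf[2 + i];
            hdr = 10;
        }

        *payload_len = plen;
        return hdr;
    }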
A TCP connection provides a stream of bytes, so treat it as such. No application message boundaries are preserved: one send can correspond to multiple receives, and the other way around. You need loops on both sides.
UDP, on the other hand, is datagram (i.e. message) based. Here one read will always dequeue a single datagram (unless you mess with low-level flags on the socket). Even if your application buffer is smaller than the pending datagram and you read only a part of it, the rest of it is lost. The way around this is to limit the size of the datagrams you send to something below the normal MTU of 1500 (less IP and UDP headers, so actually 1472).
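A small sketch of that UDP behaviour; the port is a placeholder. Each recv() returns at most one datagram, and anything that did not fit in the buffer is discarded rather than delivered on the next call:

    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5000);              /* placeholder port */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        /* 1472 bytes = typical 1500-byte Ethernet MTU minus IP and UDP headers. */
        char buf[1472];
        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n < 0) break;
            /* 'n' covers exactly one datagram (or its truncated prefix if the
               sender exceeded our buffer); the cut-off part is lost. */
            printf("got a datagram of %zd bytes\n", n);
        }
        return 0;
    }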

Determining Packets Received with Winsock2

Is there a way to determine how many packets were received while using recv() with Winsock? I am looking for a solution to implement at the client, without special requirements on the server side (which I have no control of).
You'd need to packet-sniff using something like WinPcap. Then you could correlate the packets captured with the socket used.
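A very rough sketch of that approach with the pcap API (WinPcap/Npcap expose the same interface on Windows); the device name, filter string, and port are placeholders, and error handling is mostly omitted:

    #include <pcap.h>
    #include <stdio.h>

    /* Called once per captured packet; here we just count them. */
    static void on_packet(u_char *user, const struct pcap_pkthdr *hdr,
                          const u_char *bytes)
    {
        unsigned long *count = (unsigned long *)user;
        (*count)++;
        (void)hdr;
        (void)bytes;
    }

    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        /* "eth0" is a placeholder; on Windows the device name looks
           like "\\Device\\NPF_{...}". */
        pcap_t *p = pcap_open_live("eth0", 65535, 0, 1000, errbuf);
        if (p == NULL) { fprintf(stderr, "%s\n", errbuf); return 1; }

        /* Restrict capture to the connection of interest (placeholder filter). */
        struct bpf_program prog;
        pcap_compile(p, &prog, "tcp port 12345", 1, PCAP_NETMASK_UNKNOWN);
        pcap_setfilter(p, &prog);

        unsigned long count = 0;
        pcap_loop(p, 100, on_packet, (u_char *)&count);   /* capture 100 packets */
        printf("packets seen: %lu\n", count);

        pcap_close(p);
        return 0;
    }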