Why does TCP not tell me how many bytes have been received?

Why does TCP not tell me how many bytes have been received? - sockets

When sending data on a TCP connection, the TCP stack makes sure the bytes arrive, and lost packets are resent as needed. To accomplish that, the other end sends ACK messages to acknowledge it has received the data.
As the TCP stack receives ACK messages, it knows that the other end has received some data and can stop retransmitting. While it doesn't know exactly how much data the other end has received (ACKs may get lost or delayed) it at least has a lower bound of data that has already been received on the other end.
To my knowledge, TCP implementations don't make that information available to high level callers.
Specifically I'm thinking of the POSIX socket APIs, as far as I know there's no way to tell how much data the other end has received. I only know how much I sent, but due to large buffer sizes it could take a lot of time for that data to be received.
Obviously, if I control the other end, I could occasionally report how much data has been received with explicit messages (either on the same tcp connection or via a separate channel), but that seems inelegant.
Why is that information not exposed to the user? It would be useful for things like progress bars.

Related

very long block in send(), seems to the thread related, not TCP

I have an application whose main purpose is to transform a RTP stream into an HTTP stream. One thread is receiving RTP packets and write them into a circular buffer and another thread acts as a mini webserver and answers HTTP request by reading from that buffer (only one GET request can happen at a time).
This HTTP thread, once the GET has been received is a simple loop that call send() whenever there is something in the circular buffer. But sometimes, the send() blocks for an insane amount of time (like >1s), creating audio dropout.
To be clear, RTP packets arrive in due real time, no over or underflow here. The HTTP socket is, on purpose, blocking as it is expected that the receiver regulates its flow using TCP when it does not need audio (enough on its own buffers). But the HTTP client is not overwhelmed by audio as the RTP source is, again, just doing realtime.
But obviously, something else happens and I've observed that on Linux, MacOS and Windows (the code works on all these) and on two different network topologies.
I'm wondering if the send() long blocks are not due to something else than the TCP flow control, like something I'm missing with what happens when a thread blocks in a send()

Get a wireshark trace so you can see where the TCP stall is happening. I suspect what is happening is any of the following:
You're actually sending faster than client is consuming. I think you've already ruled that out...
The more likely case is that an IP packet is getting lost and TCP is stuck waiting for the ACK, times out, and then retransmits. Meanwhile your sending thread is trying to stuff more data into the socket and it's getting backed up and eventually blocks.
One simple things you can do is to try increasing the send buffer (SO_SNDBUF) on the socket you send with. This value specifies how many untransmitted bytes that the app can write to the socket before blocking. And if possible, increase the receive buffer (SO_RCVBUF) on the client side. That way, if the network takes a burp for a couple of seconds, your socket will take longer to fill up before blocking.
int size = 512*1024;
setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size));

tcp or udp for a game server?

I know, I know. This question has been asked many times before. But I've spent an hour googling now without finding what I am looking for so I will ask it again and mention my context along with what makes the decision hard for me:
I am writing the server for a game where the response time is very important and a packet loss every now and then isn't a problem.
Judging by this and the fact that I as a server mostly have to send the same data to many different clients, the obvious answer would be UDP.
I had already started writing the code when I came across this:
In some applications TCP is faster (better throughput) than UDP.
This is the case when doing lots of small writes relative to the MTU size. For example, I read an experiment in which a stream of 300 byte packets was being sent over Ethernet (1500 byte MTU) and TCP was 50% faster than UDP.
In my case the information units I'm sending are <100 bytes, which means each one fits into a single UDP packet (which is quite pleasant for me because I don't have to deal with the fragmentation) and UDP seems much easier to implement for my purpose because I don't have to deal with a huge amount of single connections, but my top priority is to minimize the time between
client sends something to server
and
client receives response from server
So I am willing to pick TCP if that's the faster way.
Unfortunately I couldn't find more information about the above quoted case, which is why I am asking: Which protocol will be faster in my case?

UDP is still going to be better for your use case.
The main problem with TCP and games is what happens when a packet is dropped. In UDP, that's the end of the story; the packet is dropped and life continues exactly as before with the next packet. With TCP, data transfer across the TCP stream will stop until the dropped packet is successfully retransmitted, which means that not only will the receiver not receive the dropped packet on time, but subsequent packets will be delayed also -- most likely they will all be received in a burst immediately after the resend of the dropped packet is completed.
Another feature of TCP that might work against you is its automatic bandwidth control -- i.e. TCP will interpret dropped packets as an indication of network congestion, and will dial back its transmission rate in response; potentially to the point of dialing it down to near zero, in cases where lots of packets are being lost. That might be useful if the cause really was network congestion, but dropped packets can also happen due to transient network errors (e.g. user pulled out his Ethernet cable for a couple of seconds), and you might not want to handle those problems that way; but with TCP you have no choice.
One downside of UDP is that it often takes special handling to get incoming UDP packets through the user's firewall, as firewalls are often configured to block incoming UDP packets by default. For an action game it's probably worth dealing with that issue, though.
Note that it's not a strict either/or option; you can always write your game to work over both TCP and UDP, and either use them simultaneously, or let the program and/or the user decide which one to use. That way if one method isn't working well, you can simply use the other one, and it only takes twice as much effort to implement. :)
In some applications TCP is faster (better throughput) than UDP. This
is the case when doing lots of small writes relative to the MTU size.
For example, I read an experiment in which a stream of 300 byte
packets was being sent over Ethernet (1500 byte MTU) and TCP was 50%
faster than UDP.
If this turns out to be an issue for you, you can obtain the same efficiency gain in your UDP protocol by placing multiple messages together into a single larger UDP packet. i.e. instead of sending 3 100-byte packets, you'd place those 3 100-byte messages together in 1 300-byte packet. (You'd need to make sure the receiving program is able to correctly intepret this larger packet, of course). That's really all that the TCP layer is doing here, anyway; placing as much data into the outgoing packets as it has available and can fit, before sending them out.

Are TCP/IP Sockets Atomic?

It is my understanding that a write to a TCP/IP socket will be atomic if the amount of data written is small. By atomic, I mean that the receiver will receive all of the data or none of the data. However, it is not atomic, if the amount of the data written is large. Am I correct? and if so, what counts as large?
Thanks,
Bob

No. TCP is a byte-stream protocol. No messages, no datagram-like behaviour.

For UDP, that is true, because all data written by the app is sent out in one UDP datagram.
For TCP, that is not true, unless the application sends only 1 byte of data at a time. A write to a TCP socket will write all of the data to a buffer that is associated with that socket. TCP will then read data from that buffer in the background and send it to the receiver. How much data TCP actually sends in one TCP segment depends on variables of its flow control mechanisms, and other factors, including:
Receive Window published by the other node (receiver)
Amount of data sent in previous segments in flight that are not acknowledged yet
Slow start and congestion avoidance algorithm state
Negotiated maximum segment size (MSS)
In TCP, you can never assume what the application writes to a socket is actually received in one read by the receiver. Data in the socket's buffer can be sent to the receiver in one or many TCP segments. At any moment when data is made available, the receiver can perform a socket read and return with whatever data is actually available at that moment.
Of course, all sent data will eventually reach the receiver, if there is no failure in the middle preventing that, and if the receiver does not close the connection or stop reading before the data arrives.

how can I transfer large data over tcp socket

how can I transfer large data without splitting. Am using tcp socket. Its for a game. I cant use udp and there might be 1200 values in an array. Am sending array in json format. But the server receiving it like splitted.
Also is there any option to send http request like tcp? I need the response in order. Also it should be faster.
Thanks,

You can't.
HTTP may chunk it
TCP will segment it
IP will packetize it
routers will fragment it ...
and TCP will reassemble it all at the other end.
There isn't a problem here to solve.

You do not have much control over splitting packets/datagrams. The network decides about this.
In the case of IP, you have the DF (don't fragment) flag, but I doubt it will be of much help here. If you are communicating over Ethernet, then 1200 element array may not fit into an Ethernet frame (payload size is up to the MTU of 1500 octets).
Why does your application depend on the fact that the whole data must arrive in a single unit, and not in a single connection (comprised potentially of multiple units)?

how can I transfer large data without splitting.
I'm interpreting the above to be roughly equivalent to "how can I transfer my data across a TCP connection using as few TCP packets as possible". As others have noted, there is no way to guarantee that your data will be placed into a single TCP packet -- but you can do some things to make it more likely. Here are some things I would do:
Keep a single TCP connection open. (HTTP traditionally opens a separate TCP connection for each request, but for low-latency you can't afford to do that. Instead you need to open a single TCP connection, keep it open, and continue sending/receiving data on it for as long as necessary).
Reduce the amount of data you need to send. (i.e. are there things that you are sending that the receiving program already knows? If so, don't send them)
Reduce the number of bytes you need to send. (The easiest way to do this is to zlib-compress your message-data before you send it, and have the receiving program decompress the message after receiving it. This can give you a size-reduction of 50-90%, depending on the content of your data)
Turn off Nagle's algorithm on your TCP socket. That will reduce latency by 200mS and discourage the TCP stack from playing unnecessary games with your data.
Send each data packet with a single send() call (if that means manually copying all of the data items into a separate memory buffer before calling send(), then so be it).
Note that even after you do all of the above, the TCP layer will still sometimes spread your messages across multiple packets, etc -- that's just the way TCP works. And even if your local TCP stack never did that, the receiving computer's TCP stack would still sometimes merge the data from consecutive TCP packets together inside its receive buffer. So the receiving program is always going to "receive it like splitted" sometimes, because TCP is a stream-based protocol and does not maintain message boundaries. (If you want message boundaries, you'll have to do your own framing -- the easiest way is usually to send a fixed-size (e.g. 1, 2, or 4-byte) integer byte-count field before each message, so the receiver knows how many bytes it needs to read in before it has a full message to parse)

Consider the idea that the issue may be else where or that you may be sending too much unnecessary data. In example with PHP there is the isset() function. If you're creating an internet based turn based game you don't (need to send all 1,200 variables back and forth every single time. Just send what changed and when the other player receives that data only change the variables are are set.

Boost Asio UDP retrieve last packet in socket buffer

I have been messing around Boost Asio for some days now but I got stuck with this weird behavior. Please let me explain.
Computer A is sending continuos udp packets every 500 ms to computer B, computer B desires to read A's packets with it own velocity but only wants A's last packet, obviously the most updated one.
It has come to my attention that when I do a:
mSocket.receive_from(boost::asio::buffer(mBuffer), mEndPoint);
I can get OLD packets that were not processed (almost everytime).
Does this make any sense? A friend of mine told me that sockets maintain a buffer of packets and therefore If I read with a lower frequency than the sender this could happen. ¡?
So, the first question is how is it possible to receive the last packet and discard the ones I missed?
Later I tried using the async example of the Boost documentation but found it did not do what I wanted.
http://www.boost.org/doc/libs/1_36_0/doc/html/boost_asio/tutorial/tutdaytime6.html
From what I could tell the async_receive_from should call the method "handle_receive" when a packet arrives, and that works for the first packet after the service was "run".
If I wanted to keep listening the port I should call the async_receive_from again in the handle code. right?
BUT what I found is that I start an infinite loop, it doesn't wait till the next packet, it just enters "handle_receive" again and again.
I'm not doing a server application, a lot of things are going on (its a game), so my second question is, do I have to use threads to use the async receive method properly, is there some example with threads and async receive?

One option is to take advantage of the fact that when the local receive buffer for your UDP socket fills up, subsequently received packets will push older ones out of the buffer. You can set the local receive buffer size to be large enough for one packet, but not two. This will make the newest packet to arrive always cause the previous one to be discarded. When you then ask for the packet using receive_from, you'll get the latest (and only) one.
Here are the API docs for changing the receive buffer size with Boost:
http://www.boost.org/doc/libs/1_37_0/doc/html/boost_asio/reference/basic_datagram_socket/receive_buffer_size.html
The example appears to be wrong, in that it shows a tcp socket rather than a udp socket, but changing that back to udp should be easy (the trivially obvious change should be the right one).

With Windows (certainly XP, Vista, & 7); if you set your recv buffer size to zero you'll only receive datagrams if you have a recv pending when the datagram arrives. This MAY do what you want but you'll have to sit and wait for the next one if you post your recv just after the last datagram arrives ...
Since you're doing a game, it would be far better, IMHO, is to use something built on UDP rather than UDP itself. Take a look at ENet which supports reliable data over UDP and also unreliable 'sequenced' data over UDP. With unreliable sequenced data you only ever get the 'latest' data. Or something like RakNet might be useful to you as it does a lot of games stuff and also includes stuff similar to ENet's sequenced data.
Something else you should bear in mind is that with raw UDP you may get those datagrams out of order and you may get them more than once. So you're likely gonna need your own sequence number in their anyway if you don't use something which sequences the data for you.

P2engine is a flexible and efficient platform for making p2p system development easier. Reliable UDP, Message Transport , Message Dispatcher, Fast and Safe Signal/Slot...

You're going about it the wrong way. The receiving end has a FIFO queue. Once the queue gets filled new arriving packets are discarded.
So what you need to do on the receiver is just to keep reading the packets as fast as possible and process them as they arrive.
Your receiving end should easily be able to handle receiving a packet every 500ms. I'd say you've got a bug in your code and from what you describe yes you do.
It could be this, make sure in handle_receive that you only call async_receive_from if there is no error.

I think that I have your same problem, to solve the problem I read the bytes_available and compare with packet width until I receive the last package:
boost::asio::socket_base::bytes_readable command(true);
socket_server.io_control(command);
std::size_t bytes_readable = command.get();
Here is the documentation.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse