very long block in send(), seems to be thread-related, not TCP

I have an application whose main purpose is to transform an RTP stream into an HTTP stream. One thread receives RTP packets and writes them into a circular buffer; another thread acts as a mini web server and answers HTTP requests by reading from that buffer (only one GET request can be served at a time).
Once the GET has been received, this HTTP thread is a simple loop that calls send() whenever there is something in the circular buffer. But sometimes send() blocks for an insane amount of time (more than 1 s), creating audio dropouts.
To be clear, RTP packets arrive in real time, with no overflow or underflow. The HTTP socket is blocking on purpose: the receiver is expected to regulate its flow through TCP when it does not need audio (when its own buffers are full enough). But the HTTP client is not overwhelmed by audio, since the RTP source is, again, just sending in real time.
Obviously something else is happening. I've observed this on Linux, macOS and Windows (the code runs on all of them) and on two different network topologies.
I'm wondering whether the long blocks in send() are due to something other than TCP flow control, such as something I'm missing about what happens when a thread blocks in send().

Get a Wireshark trace so you can see where the TCP stall is happening. I suspect one of the following is happening:
1) You're actually sending faster than the client is consuming. I think you've already ruled that out...
2) The more likely case is that an IP packet is getting lost and TCP is stuck waiting for the ACK, times out, and then retransmits. Meanwhile your sending thread keeps trying to stuff more data into the socket, which backs up and eventually blocks.
One simple thing you can try is increasing the send buffer (SO_SNDBUF) on the socket you send with. This value specifies how many untransmitted bytes the app can write to the socket before blocking. If possible, also increase the receive buffer (SO_RCVBUF) on the client side. That way, if the network takes a burp for a couple of seconds, your socket will take longer to fill up before blocking.
int size = 512*1024;
setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size));
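The kernel is free to adjust the size you ask for, so it can help to read the value back after setting it. A minimal sketch, assuming a POSIX sockets API (on Windows the option pointers need a cast to char*); the helper name set_send_buffer is made up for illustration:

```cpp
#include <cassert>
#include <sys/socket.h>

// Hypothetical helper: asks for a send buffer of `requested` bytes on `sock`
// and returns the size the kernel reports afterwards, or -1 on error.
int set_send_buffer(int sock, int requested) {
    assert(requested > 0);
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) != 0) {
        return -1;
    }
    int actual = 0;
    socklen_t length = sizeof(actual);
    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &actual, &length) != 0) {
        return -1;
    }
    return actual;  // may differ from `requested`
}
```

On Linux, for example, the reported value is typically double the requested one (the kernel reserves room for bookkeeping) and is capped by a system-wide limit, so a very large request may be reduced.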

Related

What is the difference between Nagle algorithm and 'stop and wait'?

I saw the socket option TCP_NODELAY, which is used to turn the Nagle algorithm on or off.
I checked what the Nagle algorithm is, and it seems similar to 'stop and wait'.
Can someone give me a clear difference between these two concepts?

In a stop and wait protocol, one:
1) sends a message to the peer
2) waits for an ack for that message
3) sends the next message
(i.e. one cannot send a new message until the previous one has been acknowledged)
Nagle's algorithm as used in TCP is orthogonal to this concept. When the TCP application sends a small amount of data while earlier data is still unacknowledged, the protocol buffers the new data instead of sending it to the peer immediately, waiting to see if there's more data to be sent.
If the application has more data to send in the meantime, the protocol stack merges that data into the current buffer and can send it as one larger segment.
This concept could very well be applied to a stop and wait protocol as well. (Note that TCP is not a stop and wait protocol.)

The Nagle Algorithm is used to control whether the socket provider sends outgoing data immediately as-is at the cost of less efficient network transmissions (off), or if it buffers outgoing data so it can make more efficient network transmissions at the cost of speed (on).
Stop and Wait is a mechanism used to ensure the integrity of transmitted data, by making the sender send a frame of data and then wait for an acknowledgement from the receiver before sending another frame, thus ensuring frames are received in the same order in which they are sent.
These two features operate independently of each other.
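For reference, toggling Nagle's algorithm is a single socket option. A minimal sketch, assuming a POSIX sockets API; the two helper names are hypothetical:

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper: TCP_NODELAY = 1 turns Nagle's algorithm OFF (small
// writes go out immediately); TCP_NODELAY = 0 turns it back ON.
bool set_nagle_disabled(int sock, bool disable) {
    assert(sock >= 0);
    int flag = disable ? 1 : 0;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) == 0;
}

// Hypothetical helper: returns 1 if Nagle is disabled, 0 if enabled, -1 on error.
int nagle_disabled(int sock) {
    int flag = 0;
    socklen_t length = sizeof(flag);
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, &length) != 0) {
        return -1;
    }
    return flag != 0 ? 1 : 0;
}
```

Neither setting changes TCP's acknowledgement or windowing behaviour; the option only controls whether small writes are coalesced before transmission.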

how to receive large number of UDP packets continuously in vc++

I am writing a GUI application which receives 4Gb of data from an FPGA board as a continuous stream of UDP packets (the application is a data retrieval system).
I created my own class inherited from CAsyncSocket, and on the receive message I read packets through the ReceiveFrom API and write the data to a file.
As packets are sent continuously from the FPGA (about 400k packets of 1KB each), my application is missing packets: I receive only 200k of them. But when I monitor with Wireshark, all packets are received.
Can anyone suggest a technique or algorithm to solve this problem, so that I can receive a large number of UDP packets without loss?

The first thing to understand and accept is that you cannot guarantee that no UDP packets will be dropped. It is part of the nature of the UDP transport layer that any step in the transmission is allowed to drop a UDP packet for any reason, and that this is something that will happen from time to time. In your case, it sounds like the Windows networking stack is dropping the incoming UDP packets after receiving them from the network card, probably because the incoming-UDP-packets buffer associated with your socket is too full and does not have room to store them. This could happen for example if your write-to-disk calls occasionally take a number of milliseconds to return, during which time your app is unable to read more data from the UDP socket.
That said, there are a few things you can do to make the dropping of packets somewhat less likely.
The first (and easiest) thing to do is to increase the size of your socket's incoming-packets-buffer, using setsockopt(SO_RCVBUF). This helps because the larger the buffer is, the more time your program will have to read packets out of the buffer before the networking stack fills the buffer up entirely and starts dropping packets because it has no place to put them.
If that isn't sufficient for your purposes, the other thing you can do is spawn a separate thread that does nothing but receive incoming UDP packets and add them to a queue (for another thread to process later). Because this thread does nothing else besides receive UDP packets, it will be able to respond quickly when new packets have arrived, and thus the incoming-packets-buffer will be less likely to ever fill up and overflow. You'll probably want to run this thread at a high priority if possible, so that there is less chance of it being held off of the CPU in the case where other threads or programs are competing for CPU time.
If you've implemented both of the above and the rate of packet loss still isn't acceptable, then you may have to step back and re-evaluate your approach. This might include switching from UDP protocol to TCP, or rewriting your code as an in-kernel driver, or switching to a real-time OS that can make better guarantees about response times.
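The dedicated receive thread described above could look roughly like this. It is a sketch under stated assumptions, not the asker's MFC code: it uses POSIX sockets and the C++ standard library, and the names Packet, PacketQueue and receive_loop are made up for illustration. Raising the thread priority is platform-specific and not shown.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>
#include <sys/socket.h>
#include <sys/time.h>

using Packet = std::vector<char>;

// Thread-safe FIFO handing packets from the receive thread to a worker.
class PacketQueue {
public:
    void push(Packet p) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            packets_.push_back(std::move(p));
        }
        ready_.notify_one();
    }

    // Blocks until a packet is available, then returns it.
    Packet pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !packets_.empty(); });
        Packet p = std::move(packets_.front());
        packets_.pop_front();
        return p;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Packet> packets_;
};

// Does nothing but pull datagrams off `sock` and queue them, so slow work
// (such as writing to disk) never delays the next recv() call.
void receive_loop(int sock, PacketQueue& queue, const std::atomic<bool>& stop) {
    assert(sock >= 0);
    timeval timeout;
    timeout.tv_sec = 0;
    timeout.tv_usec = 100 * 1000;  // wake up every 100 ms to check `stop`
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));

    char buffer[2048];  // assumed to be larger than any expected datagram
    while (!stop.load()) {
        ssize_t n = recv(sock, buffer, sizeof(buffer), 0);
        if (n >= 0) {
            queue.push(Packet(buffer, buffer + n));
        }
        // n < 0 means timeout or error; loop around and re-check `stop`.
    }
}
```

A caller would start receive_loop in a std::thread and have a second thread call pop() and do the disk writes. The queue here is unbounded, so if the writer stays slower than the network for long, memory use grows instead of packets being dropped.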

TCP connection for real time

I want to use a real-time TCP connection. I have a stream of data from a server and I receive it with a client, but this client is too slow to receive as fast as the sender sends, so the server buffers the data until it reaches the destination. For example, if I "produce" data at time t, and the client is 10 times slower, then the data produced at time t will arrive at time 10t.
I want to make the server "drop" the data that can't reach the client in time, and send the new data that is expected to arrive on time. How can I do that?
P.S.: I know that the UDP protocol does this, but I want to do it with TCP.

I've done this sort of thing in the past, and got reasonably good results. Here's how I did it:
1) On the sending side, use setsockopt(SOL_SOCKET, SO_SNDBUF) to make the server's TCP socket's send buffer as small as you can get away with (since you can't drop data once it's already in the socket's send buffer, you want to keep as little data there as possible)
2) On the sending side, never proactively send() any outgoing data into the socket at all. Instead, write a function (we'll call it DumpCurrentStateToBuffer()) that writes the "current state" bytes (that you want to send to the client) into an in-memory buffer.
3) When the client's socket select()'s (or poll()'s, or whatever mechanism you use) as ready-for-write, call DumpCurrentStateToBuffer() to create a memory-buffer of bytes that are to be sent to the client. Now send that data to the client (if you're using blocking I/O you can do it synchronously, at the cost of potentially stalling your server until the data can be sent; OTOH if you're using non-blocking I/O, you may need to keep the memory-buffer and your current sent-bytes index into the buffer around as state variables, so you can keep sending more sub-chunks of the memory buffer over time, whenever the socket indicates that it can receive more bytes)
4) Once the memory-buffer's contents have been fully sent, you can free the memory buffer, and then wait for the socket to select as ready-for-write again; when it does, go to (3).
This technique doesn't solve all of TCP's non-real-time issues; for example, a dropped TCP packet will still have to be resent to the client. What it does do is guarantee that the server-to-client data backlog will never be more than one or two "states" long, because you never generate any new data unless/until there is at least some room in the socket's output buffer.
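Steps 3 and 4 can be sketched as follows for the blocking-I/O variant. This assumes POSIX poll() and send(); the function name send_latest_state and the callback parameter (standing in for DumpCurrentStateToBuffer()) are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <poll.h>
#include <string>
#include <sys/socket.h>

// Waits up to `timeout_ms` for `sock` to become writable. Only then is
// `dump_current_state` asked for a fresh snapshot, which is sent in full.
// Returns bytes sent, 0 on timeout (no snapshot was generated), -1 on error.
long send_latest_state(int sock,
                       const std::function<std::string()>& dump_current_state,
                       int timeout_ms) {
    assert(sock >= 0);
    pollfd pfd;
    pfd.fd = sock;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0) return -1;
    if (ready == 0) return 0;                  // socket still backed up
    if (!(pfd.revents & POLLOUT)) return -1;   // error or hang-up

    // Generate the data as late as possible, so it is never stale.
    const std::string state = dump_current_state();
    std::size_t sent = 0;
    while (sent < state.size()) {
        ssize_t n = send(sock, state.data() + sent, state.size() - sent, 0);
        if (n < 0) return -1;
        sent += static_cast<std::size_t>(n);
    }
    return static_cast<long>(sent);
}
```

The point of the design is the ordering: the snapshot is produced after the socket reports room, so a slow client causes snapshots to be skipped, not queued.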

Is a successful send() "atomic"?

Does a successful call to send() with the number returned equal to the amount specified in the size parameter guarantee that no "partial sends" will occur?
Or is there some way that the OS might be interrupted while servicing the system call, send part of the data, wait for a possibly long time, then send the rest and return without notifying me with a smaller return value?
I'm not talking about a case where there is not enough room in the kernel buffer; I realize that I would then get a smaller return value and have to try again.
Update:
Based on the answers so far, my question could be rephrased as follows:
Is there any way for packets/data to be sent over the wire before the call to send() returns?

> Does a successful call to send() with the number returned equal to the amount specified in the size parameter guarantee that no "partial sends" will occur?
No, it's possible that part of your data gets passed over the wire while another part only goes as far as being copied into the internal buffers of the local TCP stack. send() returns the number of bytes passed to the local TCP stack, not the number of bytes that get passed onto the wire (and even if the data reaches the wire, it might not reach the peer).
> Or is there some way that the OS might be interrupted while servicing the system call, send part of the data, wait for a possibly long time, then send the rest and return without notifying me with a smaller return value?
As send() only returns the number of bytes passed into the local TCP stack, not whether anything was actually transmitted, you can't really distinguish these two cases anyway. But yes, it's possible that only some of the data makes it over the wire. Even if there's enough space in the local buffer, the peer might not have enough space. If you send 2 bytes but the peer only has room for 1 more byte, 1 byte might be sent and the other will reside in the local TCP stack until the peer has enough room again.
(That's an extreme example; most TCP stacks protect against sending such small segments of data at a time. But the same applies if you try to send 4k of data and the peer only has room for 3k.)
> I'm not talking about a case where there is not enough room in the kernel buffer; I realize that I would then get a smaller return value and have to try again
That will only happen if your socket is non-blocking. If it's blocking and the local buffers are full, send() will wait until there's room in the local buffers again (or it might return a short count if part of the data was delivered but an error occurred in the meantime).
Edit to answer:
> Is there any way for packets/data to be sent over the wire before the call to send() returns?
Yes. That might happen for many reasons, e.g.:
1) The local buffers get filled up by that recent send() call, and you use blocking I/O.
2) The TCP stack sends your data over the wire, but the OS decides to schedule other processes to run before the sending process returns from send().
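Because send() may return a short count, callers usually wrap it in a loop. A minimal sketch, assuming POSIX sockets; the name send_all is hypothetical:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>

// Keeps calling send() until all `length` bytes have been handed to the local
// TCP stack. Returns false on the first unrecoverable error.
bool send_all(int sock, const char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::size_t sent = 0;
    while (sent < length) {
        ssize_t n = send(sock, data + sent, length - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;
        }
        sent += static_cast<std::size_t>(n);  // short count: send the rest
    }
    return true;
}
```

As the answer above stresses, a true result only means the bytes were accepted by the local stack. It says nothing about how much has reached the wire or the peer.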

Though this depends on the protocol you are using, the general answer is no.
For TCP the data gets buffered inside the kernel and then sent out at the discretion of the TCP packetization algorithm, which is pretty hairy - it keeps multiple timers, minds path MTU trying to avoid IP fragmentation.
For UDP you can only assume this kind of "atomicity" if your datagram does not exceed the link MTU (the usual maximum payload is 1472 bytes = 1500-byte Ethernet MTU - 20 bytes of IP header - 8 bytes of UDP header). Otherwise your sending host will have to IP-fragment the datagram.
Even then, intermediate routers can still IP-fragment the packet in transit if their outgoing link MTU is less than the packet size.
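The payload arithmetic can be written down as a compile-time check. This sketch assumes IPv4 with no header options; the constant and function names are made up:

```cpp
constexpr int kIpv4HeaderBytes = 20;  // IPv4 header without options (assumption)
constexpr int kUdpHeaderBytes = 8;

// Largest UDP payload that fits in a single IP packet on a link with the
// given MTU, i.e. without the sender having to fragment.
constexpr int max_udp_payload(int mtu_bytes) {
    return mtu_bytes - kIpv4HeaderBytes - kUdpHeaderBytes;
}

static_assert(max_udp_payload(1500) == 1472, "Ethernet: 1500 - 20 - 8");
```

The same function gives the limit for any other link once its MTU is known.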

Boost Asio UDP retrieve last packet in socket buffer

I have been messing around with Boost Asio for some days now, but I got stuck on this weird behavior. Please let me explain.
Computer A sends UDP packets continuously, one every 500 ms, to computer B. Computer B wants to read A's packets at its own pace but only wants A's latest packet, obviously the most up-to-date one.
It has come to my attention that when I do a:
mSocket.receive_from(boost::asio::buffer(mBuffer), mEndPoint);
I can get OLD packets that were not processed (almost every time).
Does this make any sense? A friend of mine told me that sockets maintain a buffer of packets, and therefore if I read at a lower frequency than the sender this could happen.
So, the first question is: how can I receive the latest packet and discard the ones I missed?
Later I tried using the async example from the Boost documentation, but found it did not do what I wanted.
http://www.boost.org/doc/libs/1_36_0/doc/html/boost_asio/tutorial/tutdaytime6.html
From what I could tell, async_receive_from should call the method "handle_receive" when a packet arrives, and that works for the first packet after the service was "run".
If I want to keep listening on the port, I should call async_receive_from again in the handler code, right?
BUT what I found is that this starts an infinite loop: it doesn't wait for the next packet, it just enters "handle_receive" again and again.
I'm not writing a server application, and a lot of other things are going on (it's a game), so my second question is: do I have to use threads to use the async receive method properly, and is there an example with threads and async receive?

One option is to take advantage of the fact that when the local receive buffer for your UDP socket fills up, subsequently received packets will push older ones out of the buffer. You can set the local receive buffer size to be large enough for one packet, but not two. This will make the newest packet to arrive always cause the previous one to be discarded. When you then ask for the packet using receive_from, you'll get the latest (and only) one.
Here are the API docs for changing the receive buffer size with Boost:
http://www.boost.org/doc/libs/1_37_0/doc/html/boost_asio/reference/basic_datagram_socket/receive_buffer_size.html
The example appears to be wrong, in that it shows a tcp socket rather than a udp socket, but changing that back to udp should be easy (the trivially obvious change should be the right one).

With Windows (certainly XP, Vista, & 7), if you set your recv buffer size to zero you'll only receive datagrams if you have a recv pending when the datagram arrives. This MAY do what you want but you'll have to sit and wait for the next one if you post your recv just after the last datagram arrives ...
Since you're doing a game, it would be far better, IMHO, to use something built on UDP rather than UDP itself. Take a look at ENet, which supports reliable data over UDP and also unreliable 'sequenced' data over UDP. With unreliable sequenced data you only ever get the 'latest' data. Or something like RakNet might be useful to you, as it does a lot of games stuff and also includes stuff similar to ENet's sequenced data.
Something else you should bear in mind is that with raw UDP you may get those datagrams out of order and you may get them more than once. So you're likely gonna need your own sequence number in there anyway if you don't use something which sequences the data for you.
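A sequence-number check of that kind can be very small. The sketch below assumes the sender puts a 32-bit counter in each datagram (an assumption, as is the class name); it is not how ENet or RakNet implement sequencing:

```cpp
#include <cassert>
#include <cstdint>

// Accepts a datagram only if its sequence number is newer than every one
// accepted so far, so duplicates and late arrivals are dropped.
class LatestOnlyFilter {
public:
    bool accept(std::uint32_t seq) {
        if (seen_any_) {
            // Unsigned subtraction wraps modulo 2^32, so this also works
            // when the sender's counter rolls over.
            const std::uint32_t ahead = seq - last_;
            if (ahead == 0 || ahead > 0x7FFFFFFFu) {
                return false;  // duplicate, or older than the last accepted
            }
        }
        last_ = seq;
        seen_any_ = true;
        return true;
    }

private:
    std::uint32_t last_ = 0;
    bool seen_any_ = false;
};
```

The receiver calls accept() with the counter parsed from each datagram and processes the payload only when it returns true.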

P2engine is a flexible and efficient platform for making p2p system development easier. Reliable UDP, Message Transport , Message Dispatcher, Fast and Safe Signal/Slot...

You're going about it the wrong way. The receiving end has a FIFO queue. Once the queue gets filled new arriving packets are discarded.
So what you need to do on the receiver is just to keep reading the packets as fast as possible and process them as they arrive.
Your receiving end should easily be able to handle receiving a packet every 500ms. I'd say you've got a bug in your code, and from what you describe, yes, you do.
It could be this: make sure in handle_receive that you only call async_receive_from if there is no error.

I think I have the same problem. To solve it, I read the number of available bytes and compare it with the packet size until I reach the last packet:
boost::asio::socket_base::bytes_readable command(true);
socket_server.io_control(command);
std::size_t bytes_readable = command.get();
Here is the documentation.
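A plain-sockets variant of the same idea is to drain the socket without blocking and keep only the last datagram read. This is a sketch that does not use Boost: it assumes POSIX recv() with the MSG_DONTWAIT flag (available on Linux and the BSDs, not on Windows), and the function name is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>

// Reads every datagram currently queued on `sock` without blocking and keeps
// only the last one in `latest`. Returns true if at least one was read.
bool read_latest_datagram(int sock, std::string& latest) {
    assert(sock >= 0);
    char buffer[2048];  // longer datagrams would be truncated (assumption)
    bool got_one = false;
    for (;;) {
        ssize_t n = recv(sock, buffer, sizeof(buffer), MSG_DONTWAIT);
        if (n < 0) {
            break;  // nothing left to read (EAGAIN/EWOULDBLOCK) or an error
        }
        latest.assign(buffer, static_cast<std::size_t>(n));
        got_one = true;
    }
    return got_one;
}
```

Calling this once per game frame returns the newest state received since the previous frame and discards everything older.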
