I have a non-blocking socket used for write operations (send).
I would like to know whether select() is the only way to detect:
- that the connect completed successfully when connect() returns EINPROGRESS
- that the socket is available for write operations when send() returns EWOULDBLOCK or EAGAIN
Is polling an alternative to select()? In my application I already have a thread that wakes up every second; it could cyclically check whether connect() has completed (returned 0, i.e. the connection is OK) and whether send() succeeds whenever there are bytes to be sent.
Is polling an alternative to select()?
It's an alternative, but not a good one. You don't know how long to sleep for; select() does. On average a manual poll sleeps longer than necessary on each attempt, and it wastes CPU cycles looping until it succeeds. select() has neither problem.
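For reference, here is a minimal sketch of the select() approach for the EINPROGRESS case (the socket name and timeout are placeholders, not from the question): wait for writability, then read SO_ERROR to find out whether the connect actually succeeded.

#include <sys/select.h>
#include <sys/socket.h>

/* Minimal sketch, assuming "sock" is a non-blocking TCP socket on which
 * connect() just returned -1 with errno == EINPROGRESS.  Returns 1 if the
 * connection completed, 0 if it failed, -1 if it is still pending after the
 * timeout. */
static int wait_for_connect(int sock, int timeout_sec)
{
    fd_set wfds;
    struct timeval tv = { timeout_sec, 0 };

    FD_ZERO(&wfds);
    FD_SET(sock, &wfds);

    /* select() sleeps exactly as long as needed, up to the timeout */
    if (select(sock + 1, NULL, &wfds, NULL, &tv) <= 0)
        return -1;                      /* timeout or select() error */

    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return 0;
    return err == 0;                    /* 0 means the connect succeeded */
}

The same select()-on-write call also tells you when a socket that previously returned EWOULDBLOCK/EAGAIN from send() can accept more data.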
Related
A lot of examples can be found about non-blocking TCP sockets, but I find it difficult to find a good explanation of how UDP sockets should be handled with the poll/select/epoll system calls.
Blocking or non-blocking?
When dealing with TCP sockets it makes sense to set them to non-blocking, since it only takes one slow client connection to prevent the server from serving other clients. However, there are no ACK messages in UDP, so my assumption is that writing to a UDP socket should be fast enough in either case. Does that mean we can safely use a blocking UDP socket with the poll family of system calls if each send carries only a small amount of data (10 KB, for example)? From this discussion I gather that an ARP request is the only thing that can substantially block the sendto function, but isn't that a one-time thing?
Return value of sendto
Let's say the socket is non-blocking. Can there be a scenario where I try to send 1000 bytes of data and sendto sends only part of it (say 300 bytes)? Does that mean it has just sent a UDP packet with 300 bytes, and the next time I call sendto I have to keep in mind that it will send a new UDP packet again? Is this situation still possible with blocking sockets?
Return value of recvfrom
The same question applies to recvfrom. Can there be a situation where I need to call recvfrom more than once to obtain the full UDP packet? Is that behaviour different for blocking and non-blocking sockets?
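Just to illustrate what the question is asking about (the names here are placeholders, and this is not meant as an answer), checking for a partial send on a non-blocking UDP socket comes down to comparing sendto()'s return value with the length that was passed in, and watching for EAGAIN/EWOULDBLOCK:

#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>

/* Sketch: "sock" is a non-blocking UDP socket, "dest"/"destlen" the peer address. */
static void send_datagram(int sock, const struct sockaddr *dest, socklen_t destlen,
                          const char *buf, size_t len)
{
    ssize_t n = sendto(sock, buf, len, 0, dest, destlen);
    if (n < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* the socket send buffer is full right now: retry later, e.g. after poll() */
        } else {
            perror("sendto");
        }
    } else if ((size_t)n != len) {
        /* the partial-send scenario the question asks about */
        fprintf(stderr, "sendto wrote %zd of %zu bytes\n", n, len);
    }
}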
I have an application whose main purpose is to transform an RTP stream into an HTTP stream. One thread receives RTP packets and writes them into a circular buffer, and another thread acts as a mini web server and answers HTTP requests by reading from that buffer (only one GET request can happen at a time).
Once the GET has been received, this HTTP thread is a simple loop that calls send() whenever there is something in the circular buffer. But sometimes the send() blocks for an insane amount of time (>1 s), creating audio dropouts.
To be clear, the RTP packets arrive in real time; there is no overflow or underflow here. The HTTP socket is blocking on purpose, as the receiver is expected to regulate the flow through TCP when it does not need audio (i.e. it has enough in its own buffers). But the HTTP client is not overwhelmed by audio, since the RTP source is, again, just running in real time.
Yet obviously something else is happening; I've observed this on Linux, macOS and Windows (the code runs on all of them) and on two different network topologies.
I'm wondering whether the long blocks in send() are due to something other than TCP flow control, like something I'm missing about what happens when a thread blocks in a send().
Get a Wireshark trace so you can see where the TCP stall is happening. I suspect it's one of the following:
You're actually sending faster than the client is consuming. I think you've already ruled that out...
The more likely case is that an IP packet is getting lost, and TCP is stuck waiting for the ACK, times out, and then retransmits. Meanwhile your sending thread keeps stuffing more data into the socket, which backs up and eventually blocks.
One simple thing you can do is to try increasing the send buffer (SO_SNDBUF) on the socket you send with. This value specifies how many untransmitted bytes the application can write into the socket before blocking. And if possible, increase the receive buffer (SO_RCVBUF) on the client side. That way, if the network hiccups for a couple of seconds, your socket will take longer to fill up before blocking.
int size = 512 * 1024;
/* request a 512 KB send buffer; in real code, check the return value */
setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size));
I have a TCP socket in my app. TCP keepalive is enabled with a 10-second interval.
In addition, messages also flow between the app and the server every second to get status.
So, since messages are flowing over the socket at a faster rate anyway, no keepalives will be sent at all.
Now consider this scenario: the remote server is down, so the periodic send (which happens every second) fails 3-5 times in a row. I don't think that enabling TCP keepalives lets us detect that the socket is broken, does it?
Do we then have to build logic into our code so that if this periodic message fails a certain number of times in a row, the other end is assumed dead?
Let me know.
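For context, this is roughly what a keepalive setup like the one described above looks like on Linux; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific option names, and the values below are only illustrative.

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Illustrative only: probe an idle connection every 10 seconds (Linux-specific options). */
static void enable_keepalive(int sock)
{
    int on = 1, idle = 10, intvl = 10, cnt = 3;
    setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));    /* idle time before the first probe */
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)); /* seconds between probes */
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));       /* unanswered probes before the connection is dropped */
}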
In your application it makes no sense to enable keepalive.
Keepalive is for applications that hold a connection open without using it all the time; you are using yours all the time, so keepalive is not needed.
When you send something and the other end has crashed, TCP on the client will retransmit with an increasing timeout. Eventually, if you have a blocking socket, you will get an error indication on the send operation, at which point you know you have to close the socket and retry the connection.
An error indication is when the return code of the socket operation is < 0.
I don't know the exact values of these timeouts by heart, but they can add up to a minute or longer.
When the server is shut down gracefully, meaning it closes its side of the connection, you will learn that by receiving 0 bytes on your receiving socket.
You might want to check out my answer from yesterday as well:
Reset TCP connection if server closes/crashes mid connection
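A rough sketch of the error handling described above (the function and buffer names are placeholders): a negative return from send() or recv() means the connection is dead and must be re-established, while recv() returning 0 means the peer closed its side gracefully.

#include <sys/types.h>
#include <sys/socket.h>

/* Sketch: send one status request on a blocking socket and read the reply,
 * returning -1 whenever the caller should close the socket and reconnect. */
static int exchange_status(int sock, const char *req, size_t reqlen)
{
    char reply[256];

    if (send(sock, req, reqlen, 0) < 0)
        return -1;          /* error indication (< 0): retransmissions timed out, connection reset, ... */

    ssize_t n = recv(sock, reply, sizeof(reply), 0);
    if (n == 0)
        return -1;          /* 0 bytes: the peer shut down its side gracefully */
    if (n < 0)
        return -1;          /* error indication on the read side as well */

    /* n bytes of status reply received */
    return 0;
}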
No, you don't need to assume anything. The connection will break, either because a send times out or because a keepalive times out. Either way, the connection will break and you'll start getting errors on reads and writes.
I'm writing a server in Linux that will have to support simultaneous read/write operations from multiple clients. I want to use the select function to manage read/write availability.
What I don't understand is this: Suppose I want to wait until a socket has data available to be read. The documentation for select states that it blocks until there is data available to read, and that the read function will not block.
So if I'm using select and I know that the read function will not block, why would I need to set my sockets to non-blocking?
There might be cases when a socket is reported as ready, but by the time you get to check it, it has changed state.
One good example is accepting connections. When a new connection arrives, the listening socket is reported as ready for reading. By the time you call accept, the connection might have been closed by the other side, before it ever sent anything and before we called accept. Of course, handling of this case is OS-dependent, but it's possible that accept will simply block until a new connection is established, which will make our application wait for an indefinite amount of time and prevent the processing of other sockets. If your listening socket is in non-blocking mode, this won't happen; you'll get EWOULDBLOCK or some other error, but accept will not block.
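A sketch of the accept() case just described, assuming the listening socket has already been made non-blocking (for example with fcntl(listenfd, F_SETFL, O_NONBLOCK)):

#include <sys/socket.h>
#include <errno.h>

/* Called after select() reports listenfd as readable.  Because the listening
 * socket is non-blocking, accept() cannot hang if the pending connection has
 * already gone away. */
static int accept_ready(int listenfd)
{
    struct sockaddr_storage addr;
    socklen_t addrlen = sizeof(addr);

    int conn = accept(listenfd, (struct sockaddr *)&addr, &addrlen);
    if (conn < 0) {
        if (errno == EWOULDBLOCK || errno == EAGAIN || errno == ECONNABORTED)
            return -1;      /* the connection vanished before we could accept it */
        return -1;          /* some other error: report it */
    }
    return conn;            /* new connection; remember to make it non-blocking too */
}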
Some kernels used to have (I hope it's fixed by now) an interesting bug with UDP and select. When a datagram arrives, select wakes up with the socket holding the datagram marked as ready for reading. Validation of the datagram's checksum is postponed until user code calls recvfrom (or some other API capable of receiving UDP datagrams). When the code calls recvfrom and the validating code detects a checksum mismatch, the datagram is simply dropped, and recvfrom ends up blocking until the next datagram arrives. One of the patches fixing this problem (along with a description of it) can be found here.
Apart from the kernel bugs mentioned by others, a different reason for choosing non-blocking sockets, even with a polling loop, is that they allow greater performance with fast-arriving data. Think about what happens when a blocking socket is marked as "readable". You have no idea how much data has arrived, so you can safely read it only once. Then you have to get back to the event loop so your poller can check whether the socket is still readable. This means that for every single read from or write to the socket you need at least two system calls: the select to tell you it's safe, and the read or write call itself.
With non-blocking sockets you can skip the unnecessary calls to select after the first one. When select flags a socket as readable, you can keep reading from it for as long as it returns data, which allows faster processing of quick bursts of data.
This is going to sound snarky, but it isn't. The best reason to make them non-blocking is so that you don't block.
Think about it. select() tells you there is something to read, but you don't know how much. It could be 2 bytes, it could be 2,000. In most cases it's more efficient to drain whatever data is there before going back to select. So you enter a while loop to read:
while (1)
{
    n = read(sock, buffer, 200);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        break;      // drained everything that was available right now
    if (n <= 0)
        break;      // 0 means the peer closed; anything else is a real error to handle
    // process the n bytes just read, then loop to drain the rest
}
What happens on the last read when there is nothing left to read? If the socket isn't non-blocking you will block, thereby defeating (at least partially) the point of the select().
One of the benefits is that it will catch any programming errors you make: if you try to read from a socket that would normally block you, you'll get EWOULDBLOCK instead. For objects other than sockets the exact API behaviour may differ; see http://www.scottklement.com/rpg/socktut/nonblocking.html.
I have a list of nonblocking sockets.
I could call recv on each one (in which case some calls will fail), or poll the list and later call recv on the sockets that are ready.
Is there a performance difference between these approaches?
Thanks!
Unless the rate of data on the sockets is quite high (e.g. recv() would fail less than 25% of the time), using poll() or select() is almost always the better choice.
Modern operating systems will intelligently block a poll() operation until one of the fds in the set is ready (the kernel blocks the thread on the whole set of fds and wakes it only when one of them becomes ready; in practice this still happens more often than strictly necessary, so there is some busy-waiting, but it's better than nothing), whereas a recv() loop will always result in busy-waiting.
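A minimal sketch of the poll() approach, assuming fds[] has already been filled in with the non-blocking sockets and .events = POLLIN on each entry (the buffer handling is just a placeholder):

#include <poll.h>
#include <sys/socket.h>

/* Block in one poll() call, then recv() only from the descriptors that are ready. */
static void poll_and_read(struct pollfd *fds, nfds_t nfds)
{
    char buf[4096];

    int ready = poll(fds, nfds, -1);        /* -1: block until at least one fd is ready */
    if (ready <= 0)
        return;                             /* error (or, with a finite timeout, nothing ready) */

    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN) {
            ssize_t n = recv(fds[i].fd, buf, sizeof(buf), 0);
            (void)n;    /* n > 0: data; n == 0: peer closed; n < 0 with EAGAIN: spurious wakeup */
        }
    }
}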