Is it valid to use SO_LINGER with a udp socket? - sockets

Is it valid to use SO_LINGER with a udp socket?
If yes, would you please describe the situation or illustrative-example where SO_LINGER would be appropriate with a udp socket?
a little background:
I have never used the SO_LINGER option so I am unfamiliar with what it does or when it is appropriate to use (especially with a udp socket).
more specific context for this question:
I ask because I ran across some open source code that was implementing a udp socket and using the SO_LINGER socket option (from my 30 seconds of looking at the code it looks like a pretty standard udp socket receive and then pushing the data out to some sort of consumer via gnuradio's API).
From the short reading I have done on SO_LINGER on man pages and webpages all the pages have talked about SO_LINGER being used with connection oriented protocols (i.e. tcp, et al.)... but this code is doing a non-connection oriented protocol, udp in this specific case, so I am confused why the SO_LINGER is being used with a udp socket.

When used with a positive read timeout, it causes close() to block for up to that timeout while the socket send buffer still has unsent data in it, and any difficulty writing that data will be returned as an error from the close() method. This applies equally well to UDP as to TCP. It is little used.
The other use of SO_LINGER, with a zero timeout, applies only to TCP.

Related

SFML sockets, what does a UDP socket returning "Disconnected" mean?

I'm writing a network program using SFML, and as my understanding was, UDP sockets are utterly connection-less
When i try read from my socket, I'm getting a "Disconnected" error code, but the documentation doesn't seem to mention UDP sockets being able to return this kind of error (only TCP ones being able to)
What could a UDP socket being Disconnected possibly mean?
While UDP as a protocol is "connectionless", the socket APIs support virtual connections to allow connection oriented functions to continue to work. When you call connect on a UDP socket, the OS remembers the connection data you set just as it normally would and it filters things that are not consistent with the virtual connection, this allows you to use interfaces like recv, send and getpeername because the peer is implicit. If you don't use connect, then you need to use interfaces like sendmsg. sendto, recvmsg and recvfrom where the peer is being communicated on a per packet basis.
In the case of SFML, it isn't necessarily using something that needs a connection, though, it is remapping other errors such as timeouts to Disconnected.

Differences between TCP sockets and web sockets, one more time [duplicate]

This question already has answers here:
What is the fundamental difference between WebSockets and pure TCP?
(4 answers)
Closed 4 years ago.
Trying to understand as best as I can the differences between TCP socket and websocket, I've already found a lot of useful information within these questions:
fundamental difference between websockets and pure TCP
How to establish a TCP Socket connection from a web browser (client side)?
and so on...
In my investigations, I went through this sentence on wikipedia:
Websocket differs from TCP in that it enables a stream of messages instead of a stream of bytes
I'm not totally sure about what it means exactly. What are your interpretations?
When you send bytes from a buffer with a normal TCP socket, the send function returns the number of bytes of the buffer that were sent. If it is a non-blocking socket or a non-blocking send then the number of bytes sent may be less than the size of the buffer. If it is a blocking socket or blocking send, then the number returned will match the size of the buffer but the call may block. With WebSockets, the data that is passed to the send method is always either sent as a whole "message" or not at all. Also, browser WebSocket implementations do not block on the send call.
But there are more important differences on the receiving side of things. When the receiver does a recv (or read) on a TCP socket, there is no guarantee that the number of bytes returned corresponds to a single send (or write) on the sender side. It might be the same, it may be less (or zero) and it might even be more (in which case bytes from multiple send/writes are received). With WebSockets, the recipient of a message is event-driven (you generally register a message handler routine), and the data in the event is always the entire message that the other side sent.
Note that you can do message based communication using TCP sockets, but you need some extra layer/encapsulation that is adding framing/message boundary data to the messages so that the original messages can be re-assembled from the pieces. In fact, WebSockets is built on normal TCP sockets and uses frame headers that contains the size of each frame and indicate which frames are part of a message. The WebSocket API re-assembles the TCP chunks of data into frames which are assembled into messages before invoking the message event handler once per message.
WebSocket is basically an application protocol (with reference to the ISO/OSI network stack), message-oriented, which makes use of TCP as transport layer.
The idea behind the WebSocket protocol consists of reusing the established TCP connection between a Client and Server. After the HTTP handshake the Client and Server start speaking WebSocket protocol by exchanging WebSocket envelopes. HTTP handshaking is used to overcome any barrier (e.g. firewalls) between a Client and a Server offering some services (usually port 80 is accessible from anywhere, by anyone). Client and Server can switch over speaking HTTP in any moment, making use of the same TCP connection (which is never released).
Behind the scenes WebSocket rebuilds the TCP frames in consistent envelopes/messages. The full-duplex channel is used by the Server to push updates towards the Client in an asynchronous way: the channel is open and the Client can call any futures/callbacks/promises to manage any asynchronous WebSocket received message.
To put it simply, WebSocket is a high level protocol (like HTTP itself) built on TCP (reliable transport layer, on per frame basis) that makes possible to build effective real-time application with JS Clients (previously Comet and long-polling techniques were used to pull updates from the Server before WebSockets were implemented. See Stackoverflow post: Differences between websockets and long polling for turn based game server ).

Does a SOCK_STREAM socket strip away the TCP bits around data?

Can someone explain what socket(AF_INET, SOCK_STREAM, 0) actually does when you bind it to an interface?
I have a socket like this listening on an interface, and, for example, if I do a http GET from my browser, when I do a read() on the socket, I literally see the buffer starts at the "GET /HTTP..." data that is the same data that shows up in the HTTP protocol packet of my wireshark capture of the same thing. How come I don't see the TCP SYN, SYN/ACK, ACK packets as the start of the buffer?
I thought having this socket on an interface would show me literally everything, but it seems like it only shows the data, not the metadata around it.
It's a SOCK_STREAM socket. It provides a byte-stream between two applications. You never see packets on a byte-stream socket, only a stream of bytes. That the stream of bytes takes place by an exchange of packets is an implementation detail that is made invisible to the application. (And might even be bypassed if both endpoints are on the same machine.)

Will a TCP RST cause a host to drop the receive buffer?

Upon receiving a TCP RST packet, will the host drop all the remaining data in the receive buffer that has already been ACKed by the remote host but not read by the application process using the socket?
I'm wondering if it's dangerous to close a socket as soon as I'm not interested in what the other host has to say anymore (e.g. to conserver resources); e.g. if that could cause the other party to lose any data I've already sent, but he has not yet read.
Should RSTs generally be avoided and indicate a complete, bidirectional failure of communication, or are they a relatively safe way to unidirectionally force a connection teardown as in the example above?
I've found some nice explanations of the topic, they indicate that data loss is quite possible in that case:
http://blog.olivierlanglois.net/index.php/2010/02/06/tcp_rst_flag_subtleties
http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable also gives some more information on the topic, and offers a solution that I've used in my code. So far, I've not seen any RSTs sent by my server application.
Application-level close(2) on a socket does not produce an RST but a FIN packet sent to the other side, which results in normal four-way connection tear-down. RSTs are generated by the network stack in response to packets targeting not-existing TCP connection.
On the other hand, if you close the socket but the other side still has some data to write, its next send(2) will result in EPIPE.
With all of the above in mind, you are much better off designing your own protocol on top of TCP that includes explicit "logout" or "disconnect" message.

Emulating accept() for UDP (timing-issue in setting up demultiplexed UDP sockets)

For an UDP server architecture that will have long-lived connections, one architecture is to have one socket that listens to all incoming UDP traffic, and then create separate sockets for each connection using connect() to set the remote address. My question is whether it is possible to do this atomically similar to what accept() does for TCP.
The reason for creating a separate socket and using connect() is that this makes it easy to spread the packet-processing across multiple threads, and also make it easier to have the socket directly associated with the data structures that are needed for processing.
The demultiplexing logic in the networking stack will route the incoming packets to the most specific socket.
Now my question is basically what happens when one wants to emulate accept() for UDP like this:
Use select() with a fd-set that includes the UDP server-socket.
Then read a packet from the UDP server-socket.
Then create a new UDP socket which is then connect()ed to the remote address
I call select() with a fd-set that includes both sockets.
What is returned?
given that a packet arrives to the OS somewhere between 1 and 3.
Will the packet be demultiplexed to the UDP server-socket, or will it be demultiplexed to the more specific socket created in 3. That is, at what point does demultiplexing take place? When the packet arrives, or must it happen "as if" it arrived at point 4?
Follow-up question in case the above does not work: What's the best way to do this?
I see that this discussion is from 2009, but since it keeps popping up when I search, I thought I should share my approach. Both to get some feedback and because I am curios about how the author of the question solved the problem.
The way I chose emulate UDP-accept was a combination of number one and two in nik's answer. I have a root thread which listens on a given socket. I have chosen to use TCP for simplicity, but changing this socket to UDP is not very hard. When a client wants to "connect" to my server using UDP, it first connects to the TCP socket and requests a new connection.
The root thread then proceeds by creating a UDP socket, binds it to a local interface, does connect and sets up data structures. This file descriptor is then passed to the thread that will be responsible for the connection. The IP/port information of the new UDP socket is passed back to the client, which creates a new UDP socket and sends data to the provided IP/port.
This approach works well for my use, but the additional steps for setting up a flow introduces an overhead. In some cases, this overhead might not be acceptable.
I found this question after asking it myself here...
UDP server and connected sockets
Since connect() is available for UDP to specify the peer address, I wonder why accept() wasn't made available to effectively complete the connected UDP session from the server side. It could even move the datagram (and any others from the same client) that triggered the accept() over to the new descriptor.
This would enable better server scalability (see the rationale behind SO_REUSEPORT for more background), as well as reliable DTLS authentication.
This will not work.
You have two simple options.
Create a multi-threaded program that has a 'root' thread listening on the UDP socket and 'dispatching' received packets to the correct thread based on the source. This is because you want to segregate processing by source.
Extend your protocol so the the sources accept an incoming connection on some fixed port and then continue with the protocol communication. In this case you would let the source request on the standard UDP port (of your choice), then your end will respond from a new UDP socket to the sources' UDP port. This way you have initiated a new UDP path from your end backwards to the known UDP port of each source. That way you have different UDP sockets at your end.