Emulating accept() for UDP (timing-issue in setting up demultiplexed UDP sockets) - select

For a UDP server that will have long-lived connections, one architecture is to have one socket that listens to all incoming UDP traffic, and then create a separate socket for each connection, using connect() to set the remote address. My question is whether it is possible to do this atomically, similar to what accept() does for TCP.
The reason for creating a separate socket and using connect() is that this makes it easy to spread the packet-processing across multiple threads, and also make it easier to have the socket directly associated with the data structures that are needed for processing.
The demultiplexing logic in the networking stack will route the incoming packets to the most specific socket.
Now my question is basically what happens when one wants to emulate accept() for UDP like this:
1. Use select() with a fd-set that includes the UDP server-socket.
2. Then read a packet from the UDP server-socket.
3. Then create a new UDP socket which is then connect()ed to the remote address.
4. I call select() with a fd-set that includes both sockets.
What is returned, given that a packet arrives at the OS somewhere between steps 1 and 3?
Will the packet be demultiplexed to the UDP server-socket, or to the more specific socket created in step 3? That is, at what point does demultiplexing take place: when the packet arrives, or must it happen "as if" it arrived at step 4?
Follow-up question in case the above does not work: What's the best way to do this?
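For reference, the sequence in steps 1 to 4 could be sketched roughly as below. This only illustrates the question; it does not make the operation atomic. The helper names (make_server, wait_readable, udp_accept) are made up for this example, and sharing the port through SO_REUSEADDR is an assumption that holds on Linux but differs between operating systems.

```python
import select
import socket


def make_server(host="127.0.0.1", port=0):
    """Create the shared UDP server socket (port 0 lets the OS pick one)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Assumption: SO_REUSEADDR is set on every socket that shares the port.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    return server


def wait_readable(socks, timeout=None):
    """Steps 1 and 4: block in select() until some socket has data."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


def udp_accept(server, bufsize=2048):
    """Steps 2 and 3: read one datagram, then make a socket for its sender.

    NOT atomic: datagrams from the same peer that arrive before connect()
    returns are still queued on `server`, not on the new socket.
    """
    data, peer = server.recvfrom(bufsize)
    conn = socket.socket(server.family, socket.SOCK_DGRAM)
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    conn.bind(server.getsockname())   # same local address and port
    conn.connect(peer)                # remote address is now fixed
    return conn, peer, data
```

A caller would loop on wait_readable([server, *connections]) and call udp_accept(server) whenever the server socket is readable, while still being prepared to find datagrams from an already "accepted" peer on the server socket.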

I see that this discussion is from 2009, but since it keeps popping up when I search, I thought I should share my approach, both to get some feedback and because I am curious about how the author of the question solved the problem.
The way I chose to emulate UDP-accept was a combination of options one and two in nik's answer. I have a root thread which listens on a given socket. I have chosen to use TCP for simplicity, but changing this socket to UDP is not very hard. When a client wants to "connect" to my server using UDP, it first connects to the TCP socket and requests a new connection.
The root thread then proceeds by creating a UDP socket, binds it to a local interface, does connect and sets up data structures. This file descriptor is then passed to the thread that will be responsible for the connection. The IP/port information of the new UDP socket is passed back to the client, which creates a new UDP socket and sends data to the provided IP/port.
This approach works well for my use, but the additional steps for setting up a flow introduce overhead. In some cases, this overhead might not be acceptable.

I found this question after asking it myself here...
UDP server and connected sockets
Since connect() is available for UDP to specify the peer address, I wonder why accept() wasn't made available to effectively complete the connected UDP session from the server side. It could even move the datagram (and any others from the same client) that triggered the accept() over to the new descriptor.
This would enable better server scalability (see the rationale behind SO_REUSEPORT for more background), as well as reliable DTLS authentication.

This will not work.
You have two simple options.
1. Create a multi-threaded program that has a 'root' thread listening on the UDP socket and 'dispatching' received packets to the correct thread based on the source. This is because you want to segregate processing by source.
2. Extend your protocol so that the sources accept an incoming connection on some fixed port and then continue with the protocol communication. In this case you would let the source send its request to the standard UDP port (of your choice), then your end will respond from a new UDP socket to the source's UDP port. This way you have initiated a new UDP path from your end back to the known UDP port of each source, and you have different UDP sockets at your end.
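A minimal sketch of the first option, assuming one queue per source address that a worker thread would drain. The Dispatcher class and its method names are hypothetical, chosen only for this illustration.

```python
import queue
import socket
import threading


class Dispatcher:
    """Route datagrams from one UDP socket to a per-source queue."""

    def __init__(self, sock):
        self.sock = sock
        self.queues = {}              # (ip, port) of source -> queue.Queue
        self.lock = threading.Lock()  # queues may be looked up from workers

    def queue_for(self, source):
        """Return the queue for `source`, creating it on first use."""
        with self.lock:
            q = self.queues.get(source)
            if q is None:
                q = self.queues[source] = queue.Queue()
            return q

    def dispatch_once(self, bufsize=2048):
        """Read one datagram and hand it to the queue of its sender."""
        data, source = self.sock.recvfrom(bufsize)
        self.queue_for(source).put(data)
        return source
```

The root thread would call dispatch_once() in a loop, and each worker thread would block on get() of its own queue, so processing stays segregated by source without needing more than one socket.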

Related

SFML sockets, what does a UDP socket returning "Disconnected" mean?

I'm writing a network program using SFML, and as my understanding was, UDP sockets are utterly connection-less
When I try to read from my socket, I get a "Disconnected" error code, but the documentation doesn't seem to mention UDP sockets being able to return this kind of error (only TCP ones).
What could a UDP socket being Disconnected possibly mean?
While UDP as a protocol is "connectionless", the socket APIs support virtual connections to allow connection-oriented functions to continue to work. When you call connect on a UDP socket, the OS remembers the connection data you set just as it normally would, and it filters out anything that is not consistent with the virtual connection. This allows you to use interfaces like recv, send and getpeername, because the peer is implicit. If you don't use connect, then you need to use interfaces like sendmsg, sendto, recvmsg and recvfrom, where the peer is communicated on a per-packet basis.
In the case of SFML, though, it isn't necessarily using something that needs a connection; it is remapping other errors, such as timeouts, to Disconnected.
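To make the "virtual connection" concrete, here is a small sketch using plain sockets rather than SFML. The helper name connected_pair is invented for this example; it creates two UDP sockets on the loopback interface and connects each to the other.

```python
import socket


def connected_pair(host="127.0.0.1"):
    """Return two UDP sockets that are connect()ed to each other."""
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    a.bind((host, 0))            # port 0: let the OS choose a free port
    b.bind((host, 0))
    a.connect(b.getsockname())   # no packets are sent; the OS just
    b.connect(a.getsockname())   # records the peer address
    return a, b
```

After this, a.send(b"ping") and b.recv(64) work without any address argument and getpeername() reports the recorded peer, while a datagram sent to a's port from a third socket is not delivered to a.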

Socket broadcast basics

I am building an application that will deploy in effect on multiple "clients" with a common "server". Clearly I could communicate between each client and the server using a single read-write socket for each client-server link, or a read socket and a write socket per link if I really wanted to.
But what if there are (hopefully good) reasons that the server wants to read from any client, and broadcast back to all? If you have a connectionless protocol like UDP, can the server use only a single read-write socket, or must it use one for reading and one for writing? What about the clients? And does this change if you use a connection-based protocol like TCP?
If you have a connectionless protocol like UDP, can the server use only a single read-write socket, or must it use one for reading and one for writing? What about the clients? And does this change if you use a connection-based protocol like TCP?
A socket is an endpoint which has at least a local address and port in the case of UDP and TCP. Only data received for this IP and port is delivered to the socket, and all data sent from this socket contains the local IP and port as the source. A socket can be connected, in which case the destination IP and port are also known. With TCP a socket needs to be connected; with UDP it does not.
This means:
You can use the same unconnected UDP socket to send data to multiple peers (destination is an argument for the sendto function). You cannot do this with TCP, i.e. you need a connected socket for each single peer.
You can receive data from multiple peers on an unconnected UDP socket. You cannot do this with TCP.
The special broadcast address can be used with UDP but not with TCP, since TCP needs a connection between exactly two endpoints, which is not the case with broadcast.
See also a related question with answer for more information: Bidirectional UDP Multicast
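As a rough illustration of the first two points, the sketch below uses one unconnected UDP socket both to receive from any client and to send the message back to every client seen so far. The function name relay_once and the idea of tracking peers in a set are assumptions made for this example; it uses per-peer sendto rather than the broadcast address.

```python
import socket


def relay_once(server, peers, bufsize=2048):
    """Receive one datagram and send it to every known peer.

    `server` is a single unconnected UDP socket used for both directions;
    `peers` is a set of (ip, port) tuples that grows as clients appear.
    """
    data, sender = server.recvfrom(bufsize)
    peers.add(sender)
    for peer in peers:
        server.sendto(data, peer)   # destination chosen per datagram
    return sender, data
```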
But what if there are (hopefully good) reasons that the server wants
to read from any client, and broadcast back to all?
Well, then you'd probably want to use a UDP socket (either instead of, or in addition to, some TCP sockets) :)
If you have a connectionless protocol like UDP, can the server use
only a single read-write socket, or must it use one for reading and
one for writing?
A single UDP socket is sufficient for both reading and writing (although some multithreaded designs might find it easier to use two separate sockets instead; either way will work).
What about the clients?
Clients can also use a single socket for both sending and receiving UDP packets, if that's what you're asking.
And does this change if you use a connection-based protocol like TCP?
With TCP sockets you can also use a single socket for both sending and receiving. However you will need one TCP socket for each destination that you want to send or receive to/from. (Contrast this with UDP where a single UDP socket can be used in conjunction with sendto() or recvfrom() to communicate with multiple peers)
As per your requirement, you have two options:
1. Using TCP connections only: the server reads a message from a client and, to broadcast it to all clients, writes the message to each client's TCP socket (the ones connected to the clients). The clients read that message from their TCP socket (connected to the server). This method requires that the clients and the server know each other's IP addresses.
2. Using TCP for direct client-server communication and UDP for broadcasting: in this method, the client and server communicate one-to-one over a TCP connection. To broadcast a message, the server sends it over the network using a UDP socket, and the clients have a UDP broadcast receiver for receiving it.

What exactly is Socket

I don't know exactly what socket means.
A server runs on a specific computer and has a socket that is bound to a specific port number. The server just waits, listening to the socket for a client to make a connection request.
When the server accepts the connection, it gets a new socket bound to the same local port and also has its remote endpoint set to the address and port of the client.
It needs a new socket so that it can continue to listen to the original socket for connection requests while tending to the needs of the connected client.
So, is a socket some class created in memory? And is a new instance of this class created in memory for every client connection? Inside the socket are written the local port plus the port and IP address of the connected client. Can someone explain the definition of a socket in more detail?
Thanks
A socket is effectively a type of file handle, behind which can lie a network session.
You can read and write it (mostly) like any other file handle and have the data go to and come from the other end of the session.
The specific actions you're describing are for the server end of a socket. A server establishes (binds to) a socket which can be used to accept incoming connections. Upon acceptance, you get another socket for the established session so that the server can go back and listen on the original socket for more incoming connections.
How they're represented in memory varies depending on your abstraction level.
At the lowest level in C, they're just file descriptors, a small integer. However, you may have a higher-level Socket class which encapsulates the behaviour of the low-level socket.
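A small sketch of that server-side behaviour, written in Python where the socket object is a thin wrapper around the descriptor. The helper name accept_one is made up for this example, and the client is created in the same process purely to keep it self-contained.

```python
import socket


def accept_one(host="127.0.0.1"):
    """Listen, connect a local client, and accept that one connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))          # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, peer = listener.accept()    # a NEW socket for this session
    return listener, client, conn, peer
```

Here conn.fileno() and listener.fileno() are two different small integers, conn.getsockname() reports the same local port as the listener, and conn.getpeername() reports the client's address and port.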
According to "TCP/IP Sockets in C: Practical Guide for Programmers" by Michael J. Donahoo & Kenneth L. Calvert (Chapter 1, Section 1.4, p. 7):
A socket is an abstraction through which an application may send
and receive data, in much the same way as an open file allows an application to read and write data to stable storage.
A socket allows an application to "plug in" to the network and communicate
with other applications that are also plugged in to the same network.
Information written to the socket by an application on one machine can be
read by an application on a different machine, and vice versa.
Refer to this book to get clarity about sockets from a programmer's point of view.
A network socket is one endpoint in a communication flow between two programs running over a network.
A socket is the combination of IP address plus port number
This is the typical sequence of sockets requests from a server application in the connectionless context of the Internet in which a server handles many client requests and does not maintain a connection longer than the serving of the immediate request:
Steps to implement
At the server side:
1. socket() (initialize the socket)
2. bind()
3. recvfrom() (wait for a sendto request from some client)
4. (process the sendto request)
5. sendto() (in reply to the request from the client, for example to send an HTML file)
A corresponding client sequence of sockets requests would be:
1. socket()
2. bind()
3. sendto()
4. recvfrom()
so that you can make a pipeline connection ..
for more http://www.steves-internet-guide.com/tcpip-ports-sockets
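The two call sequences above could look roughly like this in Python. The function names (serve_one, request, demo) and the uppercase "handler" are placeholders for this illustration. The client skips the explicit bind() and lets the OS assign a local port on the first sendto, which is a common simplification.

```python
import socket
import threading


def serve_one(server, handler, bufsize=2048):
    """Server side: recvfrom() one request, process it, sendto() the reply."""
    request_data, client_addr = server.recvfrom(bufsize)
    server.sendto(handler(request_data), client_addr)


def request(server_addr, payload, timeout=2.0, bufsize=2048):
    """Client side: socket(), sendto() the request, recvfrom() the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(payload, server_addr)
        reply, _ = client.recvfrom(bufsize)
        return reply


def demo():
    """Run one request/reply exchange over the loopback interface."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)
    worker = threading.Thread(target=serve_one, args=(server, bytes.upper))
    worker.start()
    try:
        return request(server.getsockname(), b"hello")
    finally:
        worker.join()
        server.close()
```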
I found this article online.
So to put it all back together, a socket is the combination of an IP
address and a port, and it acts as an endpoint for receiving or
sending information over the internet, which is kept organized by TCP.
These building blocks (in conjunction with various other protocols and
technologies) work in the background to make every google search,
facebook post, or introductory technical blog post possible.
https://medium.com/swlh/understanding-socket-connections-in-computer-networking-bac304812b5c
Socket definition
A communication between two processes running on two computer systems can be completely specified by the association: {protocol, local-address, local-process, remote-address, remote-process} We also define a half association as either {protocol, local-address, local-process} or {protocol, remote-address, remote-process}, which specify half of a connection. This half association is also called socket, or transport address. The term socket has been popularized by the Berkeley Unix networking system, where it is "an end point of communication", which corresponds to the definition of half association.

Broadcasting ip:port by socket server

I'm trying to find a way for a client to learn the socket server's ip:port without explicitly defining it. I have a socket server running on a portable device that's connected to the network over DHCP (via WiFi), and ideally clients should be able to find it automatically.
So I guess the question is whether a socket server can somehow broadcast its address over the local network. I think UPnP can do this, but I'd rather not get into it.
I'm quite sure that this question has been asked on Stack lots of times, but I couldn't find the proper keywords to search for it.
One method of doing this is via UDP broadcast packets. See beej's guide if you're using BSD sockets. And here is Microsoft's version of the same.
Assuming all the clients of the application are on the same side of a router, a broadcast address of 255.255.255.255 should be more than adequate. (IPv6 has no broadcast; the closest equivalent is the all-nodes link-local multicast address ff02::1.)
Multicast is another option, but if this is a LAN-only thing I don't think that's necessary.
Suggestion
Pick a UDP port number (for the sake of an example, say 1667). The client should listen for UDP messages on that port, bound to the wildcard address (e.g. IPEndPoint(IPAddress.Any, 1667), or whatever the equivalent is in your API). The server should broadcast its messages to 255.255.255.255:1667.
Format Suggestion
UDP Packet: First four bytes as a magic number, next four bytes an IPv4 address (and you might want to add other things like a server name).
The magic number is just in case there is a collision with another application using the same port. Check both the length of the packet and the magic number.
The server would broadcast the packet roughly every 30 seconds. (Alternatively, you could have the server send a response only when a client sends a request via broadcast.)
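A possible encoding of that packet layout, as a sketch only. The magic value and the function names are invented for this example, and the packet carries nothing beyond the magic number and the IPv4 address, so the length check is an exact match on 8 bytes.

```python
import ipaddress
import struct

MAGIC = 0x44495343                  # hypothetical magic number
PACKET = struct.Struct("!I4s")      # network byte order: magic, IPv4 address


def encode_announce(ip):
    """Build the 8-byte packet for a dotted-quad IPv4 address string."""
    return PACKET.pack(MAGIC, ipaddress.IPv4Address(ip).packed)


def decode_announce(data):
    """Return the IPv4 address as a string, or None if the packet is not ours."""
    if len(data) != PACKET.size:    # check the length first
        return None
    magic, raw_ip = PACKET.unpack(data)
    if magic != MAGIC:              # then the magic number
        return None
    return str(ipaddress.IPv4Address(raw_ip))
```

If a server name or other fields were appended, the length check would have to become a minimum length instead of an exact one.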
Some options are:
DNS-SD (which seems to translate to "Apple Bonjour"): it has libraries on macOS, but on Windows the Bonjour service needs to be installed. I don't know the Linux situation for this. So, it's multi-platform but you need external libraries.
UDP broadcast or multicast
Some other fancy things like Ethernet broadcast, raw sockets, ...
For your case (clients on a WiFi network), a UDP broadcast packet would suffice, it's multi-platform, and not too difficult to implement from the ground up.
Choosing this option, the two main algorithms are:
The server(s) send an "announce" broadcast packet, with clients listening to the broadcast address. Once clients receive the "announce" packet, they know about the server address. Now they can send UDP packets to the server (which will discover their addresses for sending a reply), or connect using TCP.
The client(s) send a "discover" broadcast packet, with the server(s) listening to the broadcast address. Once the server(s) receive the "discover" packet, they can reply directly to it with an "announce" UDP packet.
One or the other could be better for your application, it depends.
Please consider these arguments:
Servers usually listen to requests and send replies
A server that sends regular "announce" broadcast packets over a WiFi network, for a client that may arrive or not, wastes the network bandwidth, while a client knows exactly when it needs to poll for available servers, and stop once it's done.
As a mix of the two options, a server could send a "gratuitous announce" broadcast packet once it comes up, and then listen for "discover" broadcast requests from clients, replying directly to each of them using a regular UDP packet.
From here, the client can proceed as needed: send direct requests with UDP to the server, connect to a TCP address:port provided in the "announce" packet, ...
(this is the scheme I used in an application I am working on)
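The "discover" then "announce" exchange could be sketched as follows. The message contents (DISCOVER, ANNOUNCE followed by a TCP port number) and the function names are assumptions made for this illustration, not the format used by the application mentioned above. In a real deployment the target passed to send_discover would be the broadcast address and the agreed port.

```python
import socket

DISCOVER = b"DISCOVER"   # hypothetical message formats
ANNOUNCE = b"ANNOUNCE"


def send_discover(client, target):
    """Client: send a discover packet to `target` (normally the broadcast address)."""
    client.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    client.sendto(DISCOVER, target)


def answer_discover(server, service_port, bufsize=512):
    """Server: reply to one discover packet with a unicast announce."""
    data, client_addr = server.recvfrom(bufsize)
    if data != DISCOVER:
        return None                      # not for us, ignore it
    reply = ANNOUNCE + b" " + str(service_port).encode("ascii")
    server.sendto(reply, client_addr)
    return client_addr


def read_announce(client, bufsize=512):
    """Client: return (server_ip, service_port) taken from an announce reply."""
    data, server_addr = client.recvfrom(bufsize)
    parts = data.split()
    if len(parts) != 2 or parts[0] != ANNOUNCE:
        return None
    return server_addr[0], int(parts[1])
```

The server's IP address is taken from the source of the reply rather than from the payload, so the server does not need to know which of its addresses the client can reach.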

UDP for multiplayer game

I have no experience with sockets or multiplayer programming.
I need to code a multiplayer mode for a game I made in C++. It's a puzzle game, but the game mode will not be turn-based; it's more like cooperative.
I decided to use UDP, so I've read some tutorials, and all the samples I find describe how to create a client that sends data and a server that receives it.
My game will be played by two players, and both will send and receive data to/from the other.
Do I need to code a client and a server?
Should I use the same socket to send and receive?
Should I send and receive data in the same port?
Thanks, I'm kind of lost.
Read how the masters did it:
http://www.bluesnews.com/abrash/chap70.shtml
Read the code:
git clone git://quake.git.sourceforge.net/gitroot/quake/quake
Open one UDP socket and use sendto and recvfrom. The following file contains the functions for the network client.
quake/libs/net/nc/net_udp.c
UDP_OpenSocket calls socket (PF_INET, SOCK_DGRAM, IPPROTO_UDP)
NET_SendPacket calls sendto
NET_GetPacket calls recvfrom
Do I need to code a client and a server?
It depends. For a two player game, with both computers on the same LAN, or both on the open Internet, you could simply have the two computers send packets to each other directly.
On the other hand, if you want your game to work across the Internet, when one or both players are behind a NAT and/or firewall, then you have the problem that the NAT and/or firewall will probably filter out the other player's incoming UDP packets, unless the local player goes to the trouble of setting up port-forwarding in their firewall... something that many users are not willing (or able) to do. In that case, you might be better off running a public server that both clients can connect to, which forwards data from one client to another. (You might also consider using TCP instead of UDP in that case, at least as a fallback, since TCP streams are in general likely to have fewer issues with firewalls than UDP packets)
Should I use the same socket to send and receive?
Should I send and receive data in the same port?
You don't have to, but you might as well -- there's no downside to using just a single socket and a single port, and it will simplify your code a bit.
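A minimal sketch of that single-socket arrangement, shown in Python for brevity even though the game itself is in C++. The Peer class is hypothetical; each player owns one UDP socket bound to one port and uses it in both directions.

```python
import socket


class Peer:
    """One player: a single UDP socket used for both sending and receiving."""

    def __init__(self, host="127.0.0.1", port=0, timeout=2.0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind((host, port))     # port 0: let the OS pick one
        self.sock.settimeout(timeout)

    @property
    def address(self):
        """The (ip, port) the other player should send to."""
        return self.sock.getsockname()

    def send(self, payload, other_address):
        self.sock.sendto(payload, other_address)

    def receive(self, bufsize=2048):
        """Return (payload, sender_address) for the next datagram."""
        return self.sock.recvfrom(bufsize)

    def close(self):
        self.sock.close()
```

Two Peer objects can exchange messages in both directions once each knows the other's address, with no separate client and server roles at the socket level.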
Note that this answer is all about using UDP sockets. If you change your mind to use TCP sockets, it will almost all be irrelevant.
Do I need to code a client and a server?
Since you've chosen to use UDP (a fair choice if your data isn't really important and benefits more from lower latency than reliable communication), you don't have much of a choice here: a "server" is a piece of code for receiving packets from the network, and your "client" is for sending packets into the network. UDP doesn't provide any mechanism for the server to communicate to the client (unlike TCP which establishes a 2 way socket). In this case, if you want to have two way communication between your two hosts, they'll each need server and client code.
Now, you could choose to use UDP broadcasts, where both clients listen and send on the broadcast address (usually 192.168.1.255 for home networks, but it can be anything and is configurable). This is slightly more complex to code for, but it would eliminate the need for client/server configuration and may be seen as more plug 'n play for your users. However, note that this will not work over the Internet.
Alternatively, you can create a hybrid method where hosts are discovered by broadcasting and listening for broadcasts, but then once the hosts are chosen you use host to host unicast sockets. You could provide fallback to manually specify network settings (remote host/port for each) so that it can work over the Internet.
Finally, you could provide a true "server" role that all clients connect to. The server would then know which clients connected to it and would in turn try to connect back to them. This is a server at a higher level, not at the socket level. Both hosts still need to have packet sending (client) and receiving (server) code.
Should I use the same socket to send and receive?
Well, since you're using UDP, you don't really have a choice. UDP doesn't establish any kind of persistent connection that they can communicate back and forth over. See the above point for more details.
Should I send and receive data in the same port?
In light of the above question, your question may be better phrased "should each host listen on the same port?". I think that would certainly make your coding easier, but it doesn't have to. If you don't and you opt for the 3rd option of the first point, you'll need a "connect back to me on this port" datafield in the "client's" first message to the server.