What does it mean to bind a multicast (UDP) socket? - sockets

I am using multicast UDP between hosts that have multiple network interfaces.
I am using boost::asio, and am confused by the 2 operations receivers have to make: bind, then join-group.
Why do you need to specify the local address of an interface, during bind, when you do that with every multicast group that you join?
The sister-question regards the multicast port: Since during sending, you send to a multicast address & port, why, during subscription to a multicast group, you only specify the address, not the port - the port being specified in the confusing call to bind.
Note: the "join-group" is a wrapper over setsockopt(IP_ADD_MEMBERSHIP), which as documented, may be called multiple times on the same socket to subscribe to different groups (over different networks?). It would therefore make perfect sense to ditch the bind call and specify the port every time I subscribe to a group.
From what I see, always binding to "0.0.0.0" and specifying the interface address when joining the group, works very well. Confused.

To bind a UDP socket when receiving multicast means to specify an address and port from which to receive data (NOT a local interface, as is the case for TCP acceptor bind). The address specified in this case has a filtering role, i.e. the socket will only receive datagrams sent to that multicast address & port, no matter what groups are subsequently joined by the socket. This explains why when binding to INADDR_ANY (0.0.0.0) I received datagrams sent to my multicast group, whereas when binding to any of the local interfaces I did not receive anything, even though the datagrams were being sent on the network to which that interface corresponded.
Quoting from UNIX® Network Programming Volume 1, Third Edition: The Sockets Networking API by W.R Stevens.
21.10. Sending and Receiving
[...] We want the receiving socket to bind the multicast group and
port, say 239.255.1.2 port 8888. (Recall that we could just bind the
wildcard IP address and port 8888, but binding the multicast address
prevents the socket from receiving any other datagrams that might
arrive destined for port 8888.) We then want the receiving socket to
join the multicast group. The sending socket will send datagrams to
this same multicast address and port, say 239.255.1.2 port 8888.

The "bind" operation is basically saying, "use this local UDP port for sending and receiving data. In other words, it allocates that UDP port for exclusive use for your application. (Same holds true for TCP sockets).
When you bind to "0.0.0.0" (INADDR_ANY), you are basically telling the TCP/IP layer to use all available adapters for listening and to choose the best adapter for sending. This is standard practice for most socket code. The only time you wouldn't specify 0 for the IP address is when you want to send/receive on a specific network adapter.
Similarly if you specify a port value of 0 during bind, the OS will assign a randomly available port number for that socket. So I would expect for UDP multicast, you bind to INADDR_ANY on a specific port number where multicast traffic is expected to be sent to.
The "join multicast group" operation (IP_ADD_MEMBERSHIP) is needed because it basically tells your network adapter to listen not only for ethernet frames where the destination MAC address is your own, it also tells the ethernet adapter (NIC) to listen for IP multicast traffic as well for the corresponding multicast ethernet address. Each multicast IP maps to a multicast ethernet address. When you use a socket to send to a specific multicast IP, the destination MAC address on the ethernet frame is set to the corresponding multicast MAC address for the multicast IP. When you join a multicast group, you are configuring the NIC to listen for traffic sent to that same MAC address (in addition to its own).
Without the hardware support, multicast wouldn't be any more efficient than plain broadcast IP messages. The join operation also tells your router/gateway to forward multicast traffic from other networks. (Anyone remember MBONE?)
If you join a multicast group, all the multicast traffic for all ports on that IP address will be received by the NIC. Only the traffic destined for your binded listening port will get passed up the TCP/IP stack to your app. In regards to why ports are specified during a multicast subscription - it's because multicast IP is just that - IP only. "ports" are a property of the upper protocols (UDP and TCP).
You can read more about how multicast IP addresses map to multicast ethernet addresses at various sites. The Wikipedia article is about as good as it gets:
The IANA owns the OUI MAC address 01:00:5e, therefore multicast
packets are delivered by using the Ethernet MAC address range
01:00:5e:00:00:00 - 01:00:5e:7f:ff:ff. This is 23 bits of available
address space. The first octet (01) includes the broadcast/multicast
bit. The lower 23 bits of the 28-bit multicast IP address are mapped
into the 23 bits of available Ethernet address space.

Correction for What does it mean to bind a multicast (udp) socket? as long as it partially true at the following quote:
The "bind" operation is basically saying, "use this local UDP port for sending and receiving data. In other words, it allocates that UDP port for exclusive use for your application
There is one exception. Multiple applications can share the same port for listening (usually it has practical value for multicast datagrams), if the SO_REUSEADDR option applied. For example
int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); // create UDP socket somehow
...
int set_option_on = 1;
// it is important to do "reuse address" before bind, not after
int res = setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, (char*) &set_option_on,
sizeof(set_option_on));
res = bind(sock, src_addr, len);
If several processes did such "reuse binding", then every UDP datagram received on that shared port will be delivered to each of the processes (providing natural joint with multicasts traffic).
Here are further details regarding what happens in a few cases:
attempt of any bind ("exclusive" or "reuse") to free port will be successful
attempt to "exclusive binding" will fail if the port is already "reuse-binded"
attempt to "reuse binding" will fail if some process keeps "exclusive binding"

It is also very important to distinguish a SENDING multicast socket from a RECEIVING multicast socket.
I agree with all the answers above regarding RECEIVING multicast sockets.
The OP noted that binding a RECEIVING socket to an interface did not help.
However, it is necessary to bind a multicast SENDING socket to an interface.
For a SENDING multicast socket on a multi-homed server, it is very important to create a separate socket for each interface you want to send to. A bound SENDING socket should be created for each interface.
// This is a fix for that bug that causes Servers to pop offline/online.
// Servers will intermittently pop offline/online for 10 seconds or so.
// The bug only happens if the machine had a DHCP gateway, and the gateway is no longer accessible.
// After several minutes, the route to the DHCP gateway may timeout, at which
// point the pingponging stops.
// You need 3 machines, Client machine, server A, and server B
// Client has both ethernets connected, and both ethernets receiving CITP pings (machine A pinging to en0, machine B pinging to en1)
// Now turn off the ping from machine B (en1), but leave the network connected.
// You will notice that the machine transmitting on the interface with
// the DHCP gateway will fail sendto() with errno 'No route to host'
if ( theErr == 0 )
{
// inspired by 'ping -b' option in man page:
// -b boundif
// Bind the socket to interface boundif for sending.
struct sockaddr_in bindInterfaceAddr;
bzero(&bindInterfaceAddr, sizeof(bindInterfaceAddr));
bindInterfaceAddr.sin_len = sizeof(bindInterfaceAddr);
bindInterfaceAddr.sin_family = AF_INET;
bindInterfaceAddr.sin_addr.s_addr = htonl(interfaceipaddr);
bindInterfaceAddr.sin_port = 0; // Allow the kernel to choose a random port number by passing in 0 for the port.
theErr = bind(mSendSocketID, (struct sockaddr *)&bindInterfaceAddr, sizeof(bindInterfaceAddr));
struct sockaddr_in serverAddress;
int namelen = sizeof(serverAddress);
if (getsockname(mSendSocketID, (struct sockaddr *)&serverAddress, (socklen_t *)&namelen) < 0) {
DLogErr(#"ERROR Publishing service... getsockname err");
}
else
{
DLog( #"socket %d bind, %# port %d", mSendSocketID, [NSString stringFromIPAddress:htonl(serverAddress.sin_addr.s_addr)], htons(serverAddress.sin_port) );
}
Without this fix, multicast sending will intermittently get sendto() errno 'No route to host'.
If anyone can shed light on why unplugging a DHCP gateway causes Mac OS X multicast SENDING sockets to get confused, I would love to hear it.

Related

Why we use the local IP address in identifying sockets?

When a server want to create a socket, it will use a combination of its IP address and some well-known port, let us say 80. So, when a packet arrived, both the server IP and port 80 will be used to decide whether the packet goes to that socket or not.
The question is why do we need to check the IP address of the server, since the packet (aka datagram) passed the network layer check and was certainly destined for this server. In other words, the network layer will not pass the packet to transport layer if the destination IP is not the server IP, so why do we use the IP address in the socket?
And if a host (a client or a server) created multiple sockets (network processes) using both its IP and some port numbers, is there any case where the IP could be different in these sockets?
Thanks in advance!
Why do we need to check the IP address of the server, since the packet (aka datagram) passed the network layer?
The Data Link Layer uses Media Access Control (MAC) addresses to direct packets. When a packet arrives at your computer operating system (OS), it arrived either because the MAC address matched the hardware address or it was a broadcast (ff:ff:ff:ff:ff:ff).
Once the packet is received, your OS determines if it is destined for an IP address assigned to the computer. At this point, the OS has several options:
If the IP address matches an assigned IP, deliver to any waiting applications or reject the packet and handle any needed Internet Control Message Protocol (ICMP) required.
Should the IP not match an assigned, your OS checks if IP routing is enabled. Then either rejects the packet issuing any required reply or forwards the packet to the destination IP in the routing table by creating a new packet targeting the MAC address of the destination router.
If a host (a client or a server) created multiple sockets (network processes) using both its IP and some port numbers, is there any case where the IP could be different in these sockets?
If your OS assigns more than one IP address to an interface, all of those IP addresses would be available to be used. You can open sockets using any available IP (usually INADDR_ANY or similar). In a listening context, your port will be available to every IP address assigned. In a transmitting context, your IP will be set depending on the outbound interface.

Sending UDP packets to several nodes using ipv6

Currently, my application code uses udp broadcast to send the packets. While porting the application to ipv6 , how can I send the UDP packets to several nodes. Broadcast ipv4 address cant be passed directly to AF_INET6 sockets. I am new to this field.
IPv6 doesn't have broadcast. Instead, you need to use multicast, and each host wishing to receive the multicast will need to join the multicast group. Choose the multicast group carefully since IPv6 multicast has scopes and flags in the multicast addressing which you need to respect.

Why is UDP socket identified by destination IP address and destination port?

According to "Computer networking: a top-down approach", Kurose et al., a UDP socket is fully identified by destination IP and destination port.
Why do we need destination IP here? I thought UDP only need the destination port for the demultiplexing.
The machine may have multiple IPs, and different sockets may be bound to the same port on different IPs. It needs to use the destination IP to know which of these sockets the incoming datagram should be sent to.
In fact, it's quite common to use a different socket for each IP. When sending the reply, we want to ensure that the source IP matches the request's destination IP, so that the client can tell that the response came from the same server it sent to. By using different sockets for each IP, and sending the reply out the same socket that the request came in on, this consistency is maintained. Some socket implementations have an extension to allow setting the source IP at the time the reply is being sent, so they can use a single socket for all IPs, but this is not part of the standard sockets API.
I think that you are confusing UDP with Mulitcast.
Multicast is a broadcast protocol that doesn't need a destination IP address. It only needs a port number because it is delivered to all IP's on the given port.
UDP, by contrast, is only delivered to one IP. This is why it needs that destination IP address.

Setting UDP Port for receiving multicast packets and handling port conflicts

Lets say there is a LAN where several hosts have an instance of some application running.
Each instance of that application intends to receive packets sent to a particular Multicast IP Address on some port (say 5000).
For this to happen, at each host the application needs to bind socket on the port 5000.
If that port is not available on one of the hosts (i.e. a different application socket has occupied it), what are the ways to handle such a scenario?
In short,
1) What to do when an application intending to receive multicast packets on a particular port runs into a Port Conflict?
2) What are the best practices, if any, to choose a UDP port for this purpose?

Getting the IP of the interface that received a recvfrom() UDP packet (Microsoft)

Using recvfrom() on a socket bound to INADDR_ANY on a Microsoft multihomed PC.
when recvfrom() gets an UDP packet: how can I find the Interface (IP) that received the packet?
There is no way to know the receiving IP when a single listening socket is bound to multiple IPs. Instead of binding a single socket to INADDR_ANY, you can query the machine's list of local IPs using GetAdaptersInfo() and/or GetAdapterAddresses(), then create a separate listening socket for each IP. You can use getsockname() to know which IP a given socket is bound to, but only when that socket is bound to a specific IP, not multiple IPs.