IP_ADD_MEMBERSHIP fails when set both on interface and its subinterface; is that expected? - sockets

I'm debugging a 3rd-party network application and trying to figure out why it reports errors when calling setsockopt with IP_ADD_MEMBERSHIP to set up a multicast group. The application is in C++, but I've written an MWE in Python that replicates the same syscalls:
import socket
import struct
ETH0_IP = "192.168.88.85"
ETH0_1_IP = "192.168.88.254"
MULTICAST_IP = "224.0.0.7"
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
ip = socket.inet_aton(ETH0_IP)
s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, ip)
group = struct.pack("4s4s", socket.inet_aton(MULTICAST_IP), ip)
s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, group)
# s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s2 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
ip2 = socket.inet_aton(ETH0_1_IP)
s2.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, ip2)
group2 = struct.pack("4s4s", socket.inet_aton(MULTICAST_IP), ip2)
# the second group is added to the first socket so that we can only bind to one socket and read data from it
s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, group2)
At the second IP_ADD_MEMBERSHIP call I get error OSError: [Errno 98] Address already in use.
I found out this only happens when ETH0_1_IP belongs to a subinterface of the interface that has ETH0_IP, and I'm not sure if this is expected. If it is, is there a way to detect this situation and discard subinterfaces of already bound interfaces? Further, would my multicast socket receive data sent to the subinterface if registration for it fails with the above error?
For the sake of completeness:
$ cat /etc/network/interfaces
auto lo
iface lo inet loopback
iface lo inet6 loopback
auto eth0:1
iface eth0:1 inet static
address 192.168.88.254
netmask 255.255.240.0

Linux is tracking your alias interface as the same interface and so rejecting the attempt to re-use the interface.
In a bit more detail, I have run your code successfully on CentOS 7 using two separate physical interfaces with no changes. If I then change the code to use an alias on the same physical interface, it fails with the same error that you see.
Digging a little further, I see that if I dump the interface indices (using SIOCGIFINDEX) for the physical adapter and the alias, they do indeed have the same index.
If you want to use Python to check this for yourself, have a quick look at https://gist.github.com/firaxis/0e538c8e5f81eaa55748acc5e679a36e for some code (missing imports of ctypes and socket) and then try something like this:
print(Interface(name="eth0").index)
print(Interface(name="eth0:1").index)
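As a rough sketch of how the index check could be used to discard aliases before joining the group, the snippet below keeps one interface name per kernel index. It uses the standard library's socket.if_nametoindex instead of the gist's Interface class; whether that call resolves alias labels such as "eth0:1" on a given system is an assumption here, and unique_interfaces is a hypothetical helper name.

```python
import socket


def unique_interfaces(names, resolve=socket.if_nametoindex):
    """Return {index: name}, keeping the first name seen for each index.

    `resolve` maps an interface name to its kernel index; it is a
    parameter only so the logic can be exercised without real interfaces.
    """
    by_index = {}
    for name in names:
        try:
            index = resolve(name)
        except OSError:
            # Unknown interface name: skip it rather than fail.
            continue
        # An alias that shares an index with an earlier name is dropped.
        by_index.setdefault(index, name)
    return by_index


if __name__ == "__main__":
    # "eth0" and "eth0:1" are the names from the question; adjust as needed.
    print(unique_interfaces(["eth0", "eth0:1"]))
```

If both names map to the same index, only the first one survives, so IP_ADD_MEMBERSHIP would be issued once per underlying interface.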

Related

Network manager does not edit settings

I am trying to move my organisation from CentOS 7 to CentOS 8 and Rocky Linux, which use NetworkManager. Because the system is multi-homed, I am trying to set up scripting to autoconnect, since out of the box NM loses connectivity, but I am a bit stuck.
For example, if I try to run
nmcli c modify ens3 "IP4.DNS[0]" "8.8.8.8"
I get Error: invalid or not allowed setting 'IP4': 'IP4' not among [connection, 802-3-ethernet (ethernet), 802-1x, dcb, sriov, ethtool, match, ipv4, ipv6, hostname, tc, proxy]. From what I understand, NM is unable to modify these settings, but I do not understand why, or who set them up. I suspect it is somewhere in cloud-init or in the DHCP reply?
nmcli connection show ens3 | grep IP4
IP4.ADDRESS[1]:136.ZZ.XX.XXX/23
IP4.GATEWAY:136.ZZ.YY.YY
...
IP4.DOMAIN[1]:openstacklocal
[root#chkorocky syck]# nmcli c show ens3 | grep ipv4
ipv4.method: auto
ipv4.dns: --
ipv4.dns-search: --
ipv4.addresses: --
ipv4.gateway: --
Is there any way to find out where these extra attributes come from? Somehow the ipv4.XX settings do not get set at all, but other variables with similar names allow NM to work?

LuaSocket How to bind to interface?

Is there a SO_BINDTODEVICE equivalent or workaround for LuaSocket?
I've tried:
ifconfig to fetch the inet addr of my interface (e.g. ethA 1.1.1.1) + setsockname("1.1.1.1", 0). When I tcpdump on "ethA", I don't see my packet. I'm not too sure what the difference between bind and bindtodevice is. I thought bindtodevice was just a shortcut to fetch the IP address from the interface name, but that doesn't seem to be the case.
local udp = socket.udp()
udp:settimeout(1)
udp:setsockname("1.1.1.1", 0)
udp:setpeername("2.2.2.2", 12345)
udp:send(query)
The ip-multicast-if option from https://tst2005.github.io/lua-socket/udp.html, which was the only thing in the documentation that mentioned an interface, didn't seem to work.
local udp = socket.udp()
udp:setoption("ip-multicast-if", "1.1.1.1")
udp:settimeout(1)
udp:setsockname("*", 0) -- I've also tried "1.1.1.1" here and it didn't work.
udp:setpeername("2.2.2.2", 12345)
udp:send(query)
I see that it may be an option for luaposix, but I don't have that package and I don't want to bring in an additional dependency just for this.

OSError micropython

I get an OSError on the ESP8266. The first request succeeds, but the second and all later ones fail with OSError, and I don't know why. Can you help me?
Edit: I solved it. I wrote the solution after the code.
import network
name, password="wifiname", "passwordd"
wlan = network.WLAN(network.STA_IF)
wlan.active(True) # activate the interface
wlan.scan() # scan for access points
wlan.isconnected() # check if the station is connected to an AP
wlan.connect(name, password) # connect to an AP
wlan.config('mac') # get the interface's MAC address
wlan.ifconfig() # get the interface's IP/netmask/gw/DNS addresses
ap = network.WLAN(network.AP_IF) # create access-point interface
ap.active(True) # activate the interface
ap.config(essid='ESP-AP') # set the ESSID of the access point
print('Wifi connected! My IP:', wlan.ifconfig()[0])
import urequests
import time
while 1:
    try:
        t1=time.time()
        r=urequests.get('https://saitamatechnoo.web.app/')
        t2=time.time()
        print(r.status_code, 'Time:', t2-t1)
    except OSError:
        print('error')
Guys, I solved it. urequests works over a socket, and we should close every response (and with it the socket) after using it.
For example:
r=urequests.get('https://saitamatechnoo.web.app/')
print(r.status_code)
r.close()
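One way to make sure the response is closed even when reading it raises is a small wrapper with try/finally. This is only a sketch: fetch_status is a hypothetical helper, and it assumes the object returned by get has status_code and close(), as the urequests response above does.

```python
def fetch_status(get, url):
    """Call get(url), return the status code, and always close the response."""
    response = get(url)
    try:
        return response.status_code
    finally:
        # Runs on success and on error, so the underlying socket is released.
        response.close()
```

On the board it would be called as fetch_status(urequests.get, 'https://saitamatechnoo.web.app/') inside the loop.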

What will the network interrupt handler do when the NIC receives data?

As far as I know, when a packet arrives at the NIC, the DMAC will copy the packet to the kernel space. When the DMAC completes its work, it notifies the CPU, and then the CPU copies the data to the user space. Doing so will cause the memory to be read once and to be written twice. I wrote a simple program to simulate this process. This is the code:
# server.py
import socket
import sys
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = "70.202.0.116"
port = 12306
server.bind((host, port))
server.listen(5)
while True:
    conn,addr = server.accept()
    print(conn,addr)
    while True:
        data = conn.recv(4096)
        if not data:
            print("client has lost")
            conn.close()
            break
server.close()
# client.py
import socket
import sys
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = "70.202.0.116"
port = 12306
client.connect((host, port))
data = ''
for i in range(4096):
    data += 'a'
while True:
    client.send(data.encode())
client.close()
My machine has two NUMA nodes. First, I disabled NIC multi-queue with ethtool -L eno1 combined 1, so that only one network interrupt is left, and set its affinity with echo 22 > /proc/irq/137/smp_affinity_list. Core 22 is on NUMA node 1. Then I ran server.py. I used pcm-memory to monitor system memory bandwidth and got the expected output: the read-write ratio is close to 1:2.
But when I changed the affinity to core 0, which is on NUMA node 0, I got a totally different result: the read-write ratio is close to 1:1.
I want to know what the interrupt handler does during this process. Why did I get a different result?
The increased read latency could be because the device belongs to a different NUMA node. Check which NUMA node the device belongs to on the machines where the server and client are running:
# cat /sys/bus/pci/devices/<PCI device>/numa_node
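The same check can be scripted by interface name instead of PCI address. The sketch below reads /sys/class/net/<iface>/device/numa_node, which on Linux links to the same PCI device entry; nic_numa_node and the sysfs_root parameter are hypothetical names introduced for illustration.

```python
import os


def nic_numa_node(iface, sysfs_root="/sys"):
    """Return the NUMA node of a network interface's underlying device.

    A value of -1 means the kernel has no NUMA information for the device.
    Virtual interfaces have no "device" entry, so open() raises
    FileNotFoundError for them.
    """
    path = os.path.join(sysfs_root, "class", "net", iface, "device", "numa_node")
    with open(path) as f:
        return int(f.read().strip())
```

For the setup in the question, nic_numa_node("eno1") could then be compared with the node of the core handling the interrupt.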

Incorrect SO_REUSEADDR behavior on Linux

There are two uses for SO_REUSEADDR:
binding two servers on the same address (for server performance)
binding a client then a server (for example for hole punching)
It seems that the second one doesn't work on Linux (I tested on Red Hat and Chromium OS), although it works on macOS.
I wrote this small piece of code:
import socket
conn = socket.create_connection(("google.fr", 80))
if len(conn.getsockname()) == 2:
    family = socket.AF_INET
else:
    family = socket.AF_INET6
s = socket.socket(family)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s.bind(conn.getsockname())
This code works on macOS but fails with OSError: [Errno 98] Address already in use otherwise.
Is there any way to make it work? If not, where does this behavior come from?