How does the operating system map a port number to the process that uses it - sockets

For a simple Python server using a TCP socket, as below: when a TCP packet arrives and the transport layer reads the port number, how does the OS/transport layer know which thread/process to wake up (assuming the thread/process is blocked in a recv() system call)? In the code below, both the parent thread and the child thread have the connectionsocket file descriptor; how does the OS know which one to wake up? Thanks
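from socket import socket, AF_INET, SOCK_STREAM
import thread  # Python 2 module (named _thread in Python 3)
host = 'localhost'
port = 55567
buf = 1024
addr = (host, port)
welcomesocket = socket(AF_INET, SOCK_STREAM)
welcomesocket.bind(addr)
welcomesocket.listen(2)
while 1:
    # accept() blocks until a client connects and returns a new per-connection socket
    connectionsocket, clientaddr = welcomesocket.accept()
    # handler(connectionsocket, clientaddr) is defined elsewhere (not shown)
    thread.start_new_thread(handler, (connectionsocket, clientaddr))
welcomesocket.close()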

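The kernel keeps a hash table that tracks all ports in use.
When a packet arrives, the kernel looks up that table using the port information in the packet (for a connected TCP socket, the full source/destination address and port tuple), finds the associated socket, and notifies it. The data is queued on the socket, not on a particular thread, so any thread blocked in recv() on that socket can be woken to read it.
Here is how Linux does the lookup for UDP: http://lxr.free-electrons.com/source/net/ipv4/udp.c#L489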
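To make the lookup concrete, here is a toy model in Python. It is only a sketch of the idea, not kernel code: the class names (ToySocket, DemuxTable) and the two-dictionary layout are assumptions made for illustration. Established connections are keyed by the full address/port tuple, and anything else falls back to a listening socket keyed by local port.

```python
# Toy model of transport-layer demultiplexing. Illustrative only, not kernel code.

class ToySocket:
    def __init__(self, name):
        self.name = name
        self.rx_queue = []  # payloads waiting for some thread to recv() them


class DemuxTable:
    def __init__(self):
        # (local_ip, local_port, remote_ip, remote_port) -> connected socket
        self.established = {}
        # local_port -> listening socket
        self.listening = {}

    def lookup(self, local_ip, local_port, remote_ip, remote_port):
        """Return the socket a packet with these addresses belongs to, or None."""
        key = (local_ip, local_port, remote_ip, remote_port)
        sock = self.established.get(key)
        if sock is None:
            # No exact match: fall back to a listener on that local port.
            sock = self.listening.get(local_port)
        return sock

    def deliver(self, local_ip, local_port, remote_ip, remote_port, payload):
        """Queue the payload on the matching socket and return that socket."""
        sock = self.lookup(local_ip, local_port, remote_ip, remote_port)
        if sock is not None:
            sock.rx_queue.append(payload)
        return sock
```

Note that the table maps to a socket, not to a thread or process: the payload lands in the socket's queue, and whichever thread holds the descriptor and calls recv() gets it.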

Related

Binding to UDP socket *from* a specific IP address

I have packets coming from a specific device directly connected to my machine. When I do a tcpdump -i eno3 -n -n, I can see the packets:
23:58:22.831239 IP 192.168.0.3.6516 > 255.255.255.255.6516: UDP, length 130
eno3 is configured as 192.168.0.10/24
When I set the socket the typical way:
gOptions.sockfd = socket(AF_INET, SOCK_DGRAM, 0);
memset((void *)&gOptions.servaddr, 0, sizeof(struct sockaddr_in));
gOptions.servaddr.sin_family = AF_INET;
inet_pton(AF_INET, gOptions.sourceIP, &(gOptions.servaddr.sin_addr));
gOptions.servaddr.sin_port = htons(gOptions.udpPort);
bind(gOptions.sockfd, (struct sockaddr *)&gOptions.servaddr, sizeof(struct sockaddr_in));
And I use the sourceIP of "255.255.255.255" on port "6516" - it connects and reads.
What I want to do, however, is bind so that I only receive from the source IP "192.168.0.3". I have figured out how to bind to the device using either the device name ("eno3") or the IP of that interface ("192.168.0.10"), but that doesn't help, as I may have multiple devices connected to "192.168.0.10" that blab on that port, and I only want the packets from 192.168.0.3 for port 6516.
I thought s_addr (part of sin_addr) was the source IP, but it is not.
You can't bind() to a remote IP/port, only to a local IP/port. So, for what you have described, you need to bind() to the IP/port where the packets are being sent to (192.168.0.10:6516).
Now, you have two options to choose from. You can either:
use recvfrom() to receive packets, using its src_addr parameter to be given each sender's IP/port, and then you can discard packets that were not sent from the desired sender (192.168.0.3:6516).
or, use connect() to statically assign the desired sender's IP/port (192.168.0.3:6516), and then you can use recv() (not recvfrom()) to receive packets from only that sender.
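Both options can be sketched in Python. This is a minimal illustration under assumptions: it runs over loopback unicast with OS-assigned ports rather than the broadcast traffic in the question, and the helper names (recv_from_sender, demo) are made up for the example.

```python
import socket


def recv_from_sender(sock, wanted_addr, bufsize=2048):
    """Option 1: read datagrams and discard any that are not from wanted_addr."""
    while True:
        data, addr = sock.recvfrom(bufsize)
        if addr == wanted_addr:
            return data


def demo():
    # Receiver bound to a local address; port 0 lets the OS pick a free port.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    dest = receiver.getsockname()

    # Two senders: the one we want and one we want to ignore.
    wanted = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    wanted.bind(("127.0.0.1", 0))
    other = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    other.bind(("127.0.0.1", 0))

    try:
        # Option 1: filter in user space on the address recvfrom() reports.
        other.sendto(b"noise", dest)
        wanted.sendto(b"payload", dest)
        filtered = recv_from_sender(receiver, wanted.getsockname())

        # Option 2: connect() the UDP socket to the desired sender, then
        # plain recv() only returns datagrams from that peer.
        receiver.connect(wanted.getsockname())
        other.sendto(b"noise", dest)
        wanted.sendto(b"payload2", dest)
        connected = receiver.recv(2048)
        return filtered, connected
    finally:
        receiver.close()
        wanted.close()
        other.close()
```

With option 2 the filtering is done by the kernel, so unwanted datagrams never reach the application; with option 1 the application sees every datagram and can log or count the ones it discards.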

LibGDX: Error making a socket connection to *ip-adress*

I want to make 2 devices communicate via sockets.
I use this code for the client socket:
Socket socket = Gdx.net.newClientSocket(Net.Protocol.TCP, adress, 1337, socketHints);
(SocketHints: timeout = 4000)
I get a GdxRuntimeException each time this line is being executed. What is wrong with the socket?
You get that message because the socket couldn't be opened.
Note the last line about the return in the API:
newClientSocket:
Socket newClientSocket(Net.Protocol protocol, java.lang.String host, int port, SocketHints hints)
Creates a new TCP client socket that connects to the given host and port.
Parameters:
host - the host address
port - the port
hints - additional SocketHints used to create the socket. Input null to use the default setting provided by the system.
Returns:
GdxRuntimeException in case the socket couldn't be opened
Try doing some debugging to find out why you are getting this error.
Is the port already in use? Are you trying to open more than one connection on the same port? Is the server IP valid? Maybe something else is causing the issue?

Can I write() to a socket just after connect() call, but before TCP connection established?

My experiment showed that I can write to a non-blocking socket just after the connect() call, with no TCP connection established yet, and the written data is correctly received by the peer after the connection has occurred (asynchronously). Is this guaranteed on Linux / FreeBSD? I mean, will write() return > 0 when the connection is still in progress? Or maybe I was lucky and the TCP connection was successfully established between the connect() and write() calls?
The experiment code:
int fd = socket (PF_INET, SOCK_STREAM, 0);
fcntl(fd, F_SETFL, O_NONBLOCK);
struct sockaddr_in addr;
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_port = htons(_ip_port.port);
addr.sin_addr.s_addr = htonl(_ip_port.ipv4);
int res = connect(fd, (struct sockaddr*)&addr, sizeof(addr));
// HERE: res == -1, errno == 115 (EINPROGRESS)
int r = ::write(fd, "TEST", 4);
// HERE: r == 4
P.S.
I process multiple listening and connecting sockets (incoming and outgoing connections) in a single thread and manage them with epoll. Usually, when I want to create a new outgoing connection, I call non-blocking connect(), wait for EPOLLOUT (epoll event), and then write() my data. But I noticed that I can begin writing before the EPOLLOUT and get the appropriate result. Can I trust this approach, or should I stick to my old-fashioned approach?
P.P.S.
I repeated my experiment with a remote host with 170 ms latency and got different results: the write() (just after connect()) returned -1 with errno == EAGAIN. So, yes, my first experiment was not fair (connecting to fast localhost), but I still think the "write() right after connect()" can be used: if write() returns -1 and EAGAIN, I wait for EPOLLOUT and retry writing. But I agree, this is a dirty and useless approach.
Can I write() to a socket just after connect() call, but before TCP connection established?
Sure, you can. It's just likely to fail.
Per the POSIX specification of write():
[ECONNRESET]
A write was attempted on a socket that is not connected.
Per the Linux man page for write():
EDESTADDRREQ
fd refers to a datagram socket for which a peer address has
not been set using connect(2).
If the TCP connect has not completed, your write() call will fail.
At least on Linux, the socket is marked as not writable until the [SYN, ACK] is received from the peer. This means the system will not send any application data over the network until the [SYN, ACK] is received.
If the socket is in non-blocking mode, you must use select/poll/epoll to wait until it becomes writable (otherwise write calls will fail with EAGAIN and no data will be enqueued). When the socket becomes writable, the kernel has usually already sent an empty [ACK] message to the peer before the application has had time to write the first data, which results in some unnecessary overhead due to the API design.
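The wait-for-writable pattern can be sketched as follows. This is a minimal Python illustration, not the asker's code: it uses select() instead of epoll for portability, and the function name connect_then_send and its timeout parameter are assumptions for the example.

```python
import errno
import os
import select
import socket


def connect_then_send(addr, payload, timeout=5.0):
    """Start a non-blocking connect, wait until writable, then send payload."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    # connect_ex() returns an errno value instead of raising.
    err = sock.connect_ex(addr)
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))

    # The socket becomes writable once the handshake completes or fails.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not complete in time")

    # Writable does not mean success: SO_ERROR holds the connect result.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))

    sent = sock.send(payload)
    return sock, sent
```

Checking SO_ERROR after the socket turns writable matters because a refused or timed-out connect also makes the socket writable.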
What appears to work is, after calling connect on a non-blocking socket and getting EINPROGRESS, to set the socket to blocking and then start writing data. The kernel will then internally wait until the [SYN, ACK] is received from the peer and send the application data and the initial ACK in a single packet, which avoids that empty [ACK]. Note that the write call will block until the [SYN, ACK] is received and will, e.g., return -1 with errno ECONNREFUSED, ETIMEDOUT, etc. if the connection fails. This approach, however, does not work in WSL 1 (Windows Subsystem for Linux), which just fails with EPIPE immediately (no SIGPIPE though).
In any case, not much can be done to eliminate this initial round-trip time due to the design of TCP. However, if the TCP Fast Open (TFO) feature is supported by both endpoints, and you can accept its security issues, this round trip can be eliminated. See https://lwn.net/Articles/508865/ for more info.

Can I detect whether an UDP-socket or a connected UDP socket is used?

Can I detect whether a client application uses an UDP-socket or a connected UDP-socket?
If yes, how? If no, why?
As I said in my comment above, code can call connect on a UDP socket. That enforces that only traffic to/from the connection address is allowed (all other packets get dropped) and allows you to use send instead of sendto, but the traffic is still UDP.
But you can use the netstat command from the command line to see if the datagram socket has a remote address association:
For example, imagine if the code did this:
// create a datagram socket that listens on port 12345
sock = socket(AF_INET, SOCK_DGRAM, 0);
port = 12345;
addrLocal.sin_family = AF_INET;
addrLocal.sin_port = htons(port);
result = bind(sock, (sockaddr*)&addrLocal, sizeof(addrLocal));
// associate the socket only with packets arriving from 1.2.3.4:6666
addrRemote.sin_family = AF_INET;
addrRemote.sin_port = htons(6666);
addrRemote.sin_addr.s_addr = ipaddress; // e.g. "1.2.3.4"
result = connect(sock, (sockaddr*)&addrRemote, sizeof(addrRemote));
A corresponding netstat -a -u will reveal the following:
ubuntu@ip-10-0-0-15:~$ netstat -u -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
udp        0      0 ip-10-0-0-15:12345      1.2.3.4:6666            ESTABLISHED
The presence of a value other than *:* in the Foreign Address column for the UDP socket reveals that the socket has a connection address associated with it.
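From inside the process that owns the socket, the same association can be checked programmatically. The following Python sketch assumes you hold the socket object; the helper name udp_peer is hypothetical. It relies on getpeername() failing on a datagram socket that has no connected peer.

```python
import socket


def udp_peer(sock):
    """Return the connected peer address of a UDP socket, or None if unconnected."""
    try:
        return sock.getpeername()
    except OSError:
        # An unconnected socket has no peer address to report.
        return None
```

For example, a fresh socket.socket(socket.AF_INET, socket.SOCK_DGRAM) gives None, and after sock.connect(("127.0.0.1", 6666)) the helper returns that address. This does not help for a socket in another process; for that, netstat as shown above is the tool.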

Recover a TCP connection

I have a simple Python server which can handle multiple clients:
import select
import socket
import sys
host = ''
port = 50000
backlog = 5
size = 1024
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((host,port))
server.listen(backlog)
input = [server,sys.stdin]
running = 1
while running:
    inputready,outputready,exceptready = select.select(input,[],[])
    for s in inputready:
        if s == server:
            # handle the server socket
            client, address = server.accept()
            input.append(client)
        elif s == sys.stdin:
            # handle standard input
            junk = sys.stdin.readline()
            running = 0
        else:
            # handle all other sockets
            data = s.recv(size)
            if data:
                s.send(data)
            else:
                s.close()
                input.remove(s)
server.close()
One client connects to it and they can communicate. I have a third box from where I am sending a RST signal to the server (using Scapy). The TCP state diagram does not say if an endpoint is supposed to try to recover a connection when it sees a RESET. Is there any way I can force the server to recover the connection? (I want it to send back a SYN so that it gets connected to the third client)
Your question doesn't make much sense. TCP just doesn't work like that.
Re "The TCP state diagram does not say if an endpoint is supposed to try to recover a connection when it sees a RESET": RFC 793 #3.4 explicitly says "If the receiver was in any other state [than LISTEN or SYN-RECEIVED], it aborts the connection and advises the user and goes to the CLOSED state.".
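What "aborts the connection and advises the user" looks like from the application can be observed with a small Python sketch. It is an illustration under assumptions: the RST comes from the genuine peer over loopback (not a forged one from a third host), it is provoked by closing with SO_LINGER set to a zero timeout, the "ii" struct layout for the linger option is the one used on Linux, and the function name observe_reset is made up for the example.

```python
import socket
import struct


def observe_reset():
    """Make the peer abort the connection and report what recv() sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    conn.settimeout(2.0)

    # SO_LINGER enabled with a zero timeout: close() aborts the connection
    # with an RST instead of the normal FIN shutdown.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    client.close()

    try:
        data = conn.recv(1024)
        outcome = "eof" if data == b"" else "data"
    except ConnectionResetError:
        # The kernel dropped the connection and tells the application so.
        outcome = "reset"
    finally:
        conn.close()
        listener.close()
    return outcome
```

After the reset, the accepted socket is unusable; the only way to talk to that peer again is a new connection.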
An RST won't disturb a connection unless it arrives over that connection. I guess you could plausibly forge one, but you would have to know the current TCP sequence number, and you can't get that from within either of the peers, let alone a third host.
If you succeeded somehow, the connection would then be dead, finished, kaput. Can't see the point of that either.
I can't attach any meaning to your requirement for the server to send a SYN to the third host, in response to an RST from the third host, that has been made to appear as though it came from the second host. TCP just doesn't work anything like this either.
If you want the server to connect to the third host it will just have to call connect() like everybody else. In which case it becomes a client, of course.