sockets, its attributes and SO_REUSEADDR option - sockets

I have a few basic questions:
1. A socket is represented by a protocol, a local IP, a local port, a remote IP, and a remote port. Suppose such a connection exists between a client and a server. When I bound another client to the same local port and IP, the bind succeeded (I used SO_REUSEADDR), but the second client's connect() to the same remote IP and port failed. So is there no way a third process can share the same socket?
2. When we call listen() on a socket bound to a local port and IP, it listens for connections. When a client connects, it creates a socket (say A). It completes the three-way handshake, then starts a different socket (say B) and deletes socket A (Source). The new client is handled by the new socket B. So what kind of socket represents a listening socket, i.e. what are its remote IP and port? And is socket A different from the listening socket, or does adding a remote IP and port to the listening socket form A?
3. I read that SO_REUSEADDR can establish a listening socket on a port if no socket is already listening on that port and IP and all sockets on that port and IP have the SO_REUSEADDR option set. But I also came across a text which said that if a client is bound to a port and IP, another client can't bind to it (even if SO_REUSEADDR is used) unless the first client has successfully called connect(). There was no listening socket on that port and IP in this example (it is a client, so there is no call to listen()). So why isn't another client allowed?
Thanks in advance.

Correct: there is no way to create two different sockets with the same protocol, local port, local address, remote port, and remote address. There would be nothing to tell which packets belonged to which socket!
A listening socket does not have a remote address and remote port. That's OK, because there are no packets on the wire associated with this socket (yet). Actually, all sockets start out with neither a local nor remote address or port. These properties are only assigned later when bind() (for local) and connect()/accept() (for remote) are called.
Until you call connect() or listen() on a socket, there isn't any difference between a server (listening) socket and a client socket. They're the same thing. So it would be more correct here to say that no two sockets are allowed to share the same protocol, local address, and local port if neither has a remote address or port.
This isn't a problem in practice though, because you usually don't call bind() on a client socket, which means there is an implicit bind() to an ephemeral port at connect() time. These typical client sockets can't conflict with a listening socket because they go from having no addresses associated with them to having both local and remote addresses associated with them, skipping the state where they have only a local one.
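To make the point about local and remote addresses concrete, here is a minimal Python sketch. It assumes a POSIX-like system and uses the loopback address with a kernel-chosen port purely for illustration; neither appears in the question. It shows that a listening socket has a local address but no remote one, and that a client which never calls bind() gets both addresses at connect() time.

```python
import socket

# Listening socket: bound to a local address, but it has no remote address.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 asks the kernel to pick a free port
listener.listen()
listen_addr = listener.getsockname()

try:
    listener.getpeername()
    listener_has_peer = True
except OSError:
    # The socket is not connected, so there is no remote end to report.
    listener_has_peer = False

# Typical client: no explicit bind(). connect() performs the implicit bind
# to an ephemeral port and sets the remote address in one step.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listen_addr)
client_local = client.getsockname()
client_remote = client.getpeername()

print("listener local:", listen_addr, "has peer:", listener_has_peer)
print("client local:", client_local, "client remote:", client_remote)

client.close()
listener.close()
```

The client's local port is whatever ephemeral port the kernel picked, so it will differ from run to run.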

Related

On successful TCP connection between server and client

RELATED POST
The post here in the UNIX forum says:
The server will keep on listening on a port number.
The server will accept a client's connect() request using accept(). As soon as the server accepts the client request, the kernel allocates a random port number for the server for further send() and receive(), since the same port number on the server can't be used for sending as well as listening, and the previous port is still listening for new connections.
QUESTION
I have a server application S which is constantly listening on port 18333 (this is actually bitcoind testnet). Suppose another client node C connects to it on, say, port 53446 (a random port). According to the above post, S will be able to send/receive C's data only on port 53446.
But when I run bitcoind on testnet, it communicates perfectly with other nodes over a single socket connection on port 18333, without needing another port for sending/receiving. Below is a snippet; I even verified this:
bitcoin-cli -testnet -rpcport=16591 -datadir=/home/user/mytest/1/
{
"id": 1,
"addr": "178.32.61.149:18333"
}
Can anyone help me understand what is the right working in TCP socket connection?
A TCP connection is identified by a socket pair, which is uniquely identified by four parameters:
source ip
source port
dest ip
dest port
For every connection that is established to a server, the socket is basically cloned and the same port is used. So for every connection you have a socket using the same server port, which means you have n+1 sockets using the same port when there are n connections.
The kernel's TCP stack can distinguish between all these sockets and connections because a socket is either in the listening state, or it belongs to a socket pair for which all four parameters are considered.
Your second bullet is therefore wrong, because the same port is being used, as I explained above.
The server will accept a clients connect() request using accept(). As soon as the server accepts the client request, the kernel allocates a random port number for the server for further send() and receive().
In normal TCP traffic this is not the case. If a web server is listening on port 80, all packets sent back to the client will come from server port 80 (this can be verified with Wireshark, for example), but there will be a different socket for each connection (srcIP:port - dstIP:port). That information is sent in the headers of the network packets: the IP addresses and protocol code (TCP, UDP or other) in the IP header, and the port numbers in the TCP or UDP header.
But changing ports can happen when communicating over FTP, where there can be a control port (usually 21) and a negotiated data port.
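The "n+1 sockets on the same port" claim above can be checked with a short Python sketch. It is only an illustration under assumed conditions: the loopback address, a kernel-chosen port instead of 18333 or 80, and three clients in the same process.

```python
import socket

# One listening socket on a kernel-chosen port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
server_port = listener.getsockname()[1]

clients, accepted = [], []
for _ in range(3):
    c = socket.create_connection(("127.0.0.1", server_port))
    conn, _peer = listener.accept()  # a new socket object per connection
    clients.append(c)
    accepted.append(conn)

# Local port of every accepted socket, as seen by the server.
accepted_local_ports = [conn.getsockname()[1] for conn in accepted]
# Remote (client) port of every accepted socket.
peer_ports = [conn.getpeername()[1] for conn in accepted]

print("listening port:", server_port)
print("accepted sockets' local ports:", accepted_local_ports)
print("accepted sockets' remote ports:", peer_ports)

for s in clients + accepted + [listener]:
    s.close()
```

Every accepted socket reports the listening port as its local port; only the remote port differs between connections.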

In TCP, if the server uses another port to communicate, how will it inform the client?

I'm studying socket programming in C. In TCP communication, a classical situation is that once the server accept() a connect() request from a client, it will fork a new process to handle this communication. Then the child process will use another port to communicate with the client. My question is, how does the server inform the client that it will use another port rather than the original one to do the subsequent communication? Which field in the TCP header and which phase of the handshake can reflect the port change?
For example, process PA on server A is listening to its port 80. Now process PB on client B wants to connect to A's port 80. Once PA accepts PB's connecting request, it will fork a new process PA1 to handle the communication with PB. Am I right till now? Next, will PA1 still use port 80 or another port such as 1234 to communicate with PB? If it still uses 80, how can the server A distribute PB's communication to PA1? If it uses another port like 1234, how will the server A inform PB to use 1234 for the subsequent communication?
A TCP connection is uniquely identified by the tuple (source IP, source port, destination IP, destination port). The OS uses this tuple to "bind" the TCP connection to a process, meaning that it knows which process each incoming TCP packet should be delivered to.
When the server accepts the TCP connection and forks, the child process inherits the accepted socket from the original process, so it effectively takes over the binding of the TCP connection. The client on the remote machine does not know, and does not need to know, that this happened. The network keeps seeing the same thing: packets with the same tuple flowing through it.
Meanwhile, the original process keeps listening for new TCP connections. When a new connection request arrives, even from the same machine as before, its source port must be different. From the OS's perspective it is a different tuple, so it can distinguish the TCP packets and deliver them to the right process.
You may ask how the client on the remote machine knows it has to use another port to initiate a new connection. This is simply because the client OS knows (it is told through the socket library) that the process is creating a separate new connection, so it assigns another unique port number to it. That is how multiple processes can communicate with the same server port without their messages getting mixed up.
In short, accept followed by fork on the server just transfers ownership of a TCP connection binding to another process. Nothing changes in the server port used for the communication.
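The accept-then-fork handover can be sketched in Python as follows. This is an illustration, not a production server: it assumes a Unix-like system where os.fork() is available, uses the loopback address and a kernel-chosen port rather than port 80, and keeps the client in the parent process for brevity. The child reports the local port of the inherited socket back over the connection.

```python
import os
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
server_port = listener.getsockname()[1]

# For brevity the client lives in the parent process.
client = socket.create_connection(("127.0.0.1", server_port))
conn, _peer = listener.accept()

pid = os.fork()
if pid == 0:
    # Child: it inherited `conn` and handles the connection on its own.
    listener.close()
    conn.sendall(str(conn.getsockname()[1]).encode())
    conn.close()
    os._exit(0)

# Parent: drop its copy of the accepted socket; it could go on accepting.
conn.close()
client_sees_port = client.getpeername()[1]

# Read until the child closes its end of the connection.
chunks = []
while True:
    data = client.recv(64)
    if not data:
        break
    chunks.append(data)
os.waitpid(pid, 0)
child_reported_port = int(b"".join(chunks).decode())

print("listening port:", server_port)
print("port the child used:", child_reported_port)
print("server port the client sees:", client_sees_port)

client.close()
listener.close()
```

All three printed ports are the same number, which is the point: the client is never told about the fork, and no second server port is involved.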
In TCP communication, a classical situation is that once the server accept() a connect() request from a client, it will fork a new process to handle this communication.
Correct, or start a thread.
Then the child process will use another port to communicate with the client.
No. It will use the same port, via the accepted socket, inherited in the case of a child process.
My question is, how does the server inform the client that it will use another port rather than the original one to do the subsequent communication?
It doesn't, because this isn't the 'classical situation'.
Which field in the TCP header and which phase of the handshake can reflect the port change?
None. It doesn't happen that way. It would be a waste of a port.
For example, process PA on server A is listening to its port 80. Now process PB on client B wants to connect to A's port 80. Once PA accepts PB's connecting request, it will fork a new process PA1 to handle the communication with PB. Am I right till now?
Yes.
Next, will PA1 still use port 80 or another port such as 1234 to communicate with PB?
Port 80.
If it still uses 80, how can the server A distribute PB's communication to PA1?
By inheritance of the accepted socket.
If it uses another port like 1234, how will the server A inform PB to use 1234 for the subsequent communication?
Doesn't happen.
The client chooses this port, not the server. The client will choose a port that's not already in use on that particular machine, and use that port to tell its connections apart (just as the server does).
For example say the client has IP address 1.2.3.4 and the server has IP address 4.3.2.1 and listens on port 80. If the client has two connections to that server and port, how will it tell them apart? Simple -- it assigns a different source port to each one. Say one gets port 50001 and one gets port 50002, then the two connections are:
1.2.3.4:50001 -> 4.3.2.1:80
and
1.2.3.4:50002 -> 4.3.2.1:80
The server knows these ports because it gets them from the TCP SYN packets sent from the client to the server. So the client tells the server, not the other way around.
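Here is a small Python sketch of the same idea from the client's side. The loopback address and the kernel-chosen server port are assumptions made for the demo (the 1.2.3.4 and 4.3.2.1 addresses above are only examples). It opens two connections to the same server address and compares the source ports the client's kernel chose with the ports the server learns at accept() time.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
server_addr = listener.getsockname()

# Two connections from the same client to the same server IP and port.
first = socket.create_connection(server_addr)
second = socket.create_connection(server_addr)

# Source ports chosen on the client side (by the client's kernel).
client_side_ports = {first.getsockname()[1], second.getsockname()[1]}

# Ports the server learns from the incoming connection requests.
server_side_ports = set()
for _ in range(2):
    conn, (peer_ip, peer_port) = listener.accept()
    server_side_ports.add(peer_port)
    conn.close()

print("client chose:", sorted(client_side_ports))
print("server saw:  ", sorted(server_side_ports))

first.close()
second.close()
listener.close()
```

The two sets match, and the server never had any say in which numbers were picked.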

Can I have different types of sockets on same port?

I have a weird query. I have learned that a socket is a combination of IP and Port. So what is socket descriptor? Is it just an integer? What does it do?
Can I have two different socket descriptors on the same port? If yes, then can those be of different types (TCP/UDP)?
I know these are silly questions; I have been blindly using SD for quite a time now :P
TCP and UDP are independent, so you can have TCP and UDP sockets on the same port.
A socket descriptor is to a socket as a file descriptor is to a file.
A TCP connection is actually defined by the tuple: local IP, local port, remote IP, remote port. You can have multiple connections with the same local IP and port, as long as they have different remote IP and/or port.
For instance, a web server uses its local port 80 for all the connections. But each client connection will either come from a different machine (and hence a different remote IP) or different sockets on the same machine (so they'll have the same remote IP but different remote ports).
A socket descriptor is a unique integer returned by the system when you ask it to create a socket with the socket call. Each socket is identifiable by its socket descriptor.
As regards the second part of your question: you will get a different socket descriptor for each IP+PORT+PROTOCOL combination. So yes, you can have TCP and UDP sockets on the same port, but you will get two different socket descriptors.
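A quick way to see both points (descriptors are plain integers, and TCP and UDP port spaces are independent) is the Python sketch below. The helper name bind_tcp_and_udp_same_port is made up for this example, and the loopback address, the kernel-chosen port, and the retry loop (in case some other program already holds the UDP port) are all assumptions of the demo.

```python
import socket

def bind_tcp_and_udp_same_port(attempts=5):
    """Hypothetical helper: bind a TCP and a UDP socket to one port number."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        tcp.bind(("127.0.0.1", 0))  # let the kernel pick a free TCP port
        port = tcp.getsockname()[1]
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            udp.bind(("127.0.0.1", port))  # same number, different protocol
            return tcp, udp, port
        except OSError:
            # Another program may already hold this UDP port; try again.
            tcp.close()
            udp.close()
    raise RuntimeError("could not find a port free for both TCP and UDP")

tcp_sock, udp_sock, shared_port = bind_tcp_and_udp_same_port()

# fileno() exposes the underlying socket descriptor, which is just an integer.
tcp_fd = tcp_sock.fileno()
udp_fd = udp_sock.fileno()

print("shared port:", shared_port)
print("TCP descriptor:", tcp_fd, "UDP descriptor:", udp_fd)

tcp_sock.close()
udp_sock.close()
```

The two descriptors are different integers even though both sockets report the same IP and port.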
You should read network programming tutorials like these first: Beej's Network Programming Tutorial

accept() function implementation in Unix

I have looked up in BSD code but got lost somewhere :(
the reason I want to check is this:
TCP RFC (http://www.ietf.org/rfc/rfc793.txt) sec 2.7 states:
"To provide for unique addresses within each TCP, we concatenate an internet address identifying the TCP with a port identifier to create a socket which will be unique throughout all networks connected together. A connection is fully specified by the pair of sockets at the ends."
Does this mean: socket = local (ip + port) ?
If yes, then the accept function of Unix returns a new socket descriptor. Will it mean that a new socket is created (in turn a new port is created) for responding to client requests?
PS: I am a novice in network programming.
[UPDATE] I understood what I read at "How does the socket API accept() function work?".
My only doubt is: if socket = (local port + local IP), then a new socket would mean a new port for the same IP. Going by this logic, accept returns a new socket (thus a new port is created), so all sending should occur through this new port.
Is what I understand here correct?
You are mostly correct. When you accept(), a new socket is created and the listening socket stays open to allow more incoming connections but the new socket uses the same local port number as the listening socket.
A connection is defined by a 5-tuple: protocol, local-addr, local-port, remote-addr, remote-port.
Therefore, each accepted connection is unique even though they all share the same local port number because the remote ip/port is always different. The listening socket has no remote ip/port and so is also unique.

how TCP port bind

Does anybody know in detail how a port number is bound to a socket, and how the port is used to forward a packet received in the transport layer to the socket that is reading on this port?
thanks.
The application binds to a local IP address and port using the bind() function. The remote IP address and port are determined by the other end of the connection at the time a connection is established.
In the kernel, at the time a TCP connection is established the socket is put into a hash table based on data including the local address, local port, remote address, and remote port. When an incoming TCP segment arrives, these values are extracted from the header and used to look up the corresponding socket in the hash table. In Linux this lookup occurs in the function inet_lookup_established(). A similar function, inet_lookup_listener(), is used to look up a listening socket from a different hash table for a new connection; in that case the remote IP address and port are not used.
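As a rough mental model of that two-table lookup, here is a toy Python sketch. It is not how the Linux functions are written: the dictionaries, the FourTuple type, the lookup() helper, and the socket names are all invented for illustration, and the addresses are arbitrary examples.

```python
from typing import NamedTuple, Optional

class FourTuple(NamedTuple):
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int

# Toy stand-ins for the two hash tables described above.
established = {}  # FourTuple -> socket name
listeners = {}    # (local_ip, local_port) -> socket name

def lookup(local_ip: str, local_port: int,
           remote_ip: str, remote_port: int) -> Optional[str]:
    """Find the socket that should receive an incoming segment."""
    key = FourTuple(local_ip, local_port, remote_ip, remote_port)
    if key in established:
        # An existing connection: all four values must match.
        return established[key]
    # Otherwise fall back to a listener; the remote address is ignored.
    return listeners.get((local_ip, local_port))

# A server listening on port 80 with two established connections.
listeners[("4.3.2.1", 80)] = "listening socket"
established[FourTuple("4.3.2.1", 80, "1.2.3.4", 50001)] = "connection A"
established[FourTuple("4.3.2.1", 80, "1.2.3.4", 50002)] = "connection B"

print(lookup("4.3.2.1", 80, "1.2.3.4", 50001))
print(lookup("4.3.2.1", 80, "1.2.3.4", 50002))
print(lookup("4.3.2.1", 80, "9.9.9.9", 40000))
print(lookup("4.3.2.1", 8080, "1.2.3.4", 50001))
```

A segment for a known four-tuple goes to its connection, a segment from an unknown remote end on port 80 goes to the listener, and a segment for a port nobody is bound to finds no socket at all.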