What is the correct definition of a socket? - sockets

I have read contradictory definitions of what a socket consists of (mainly in this question).
The first definition is that a socket consists of the following:
{Source IP Address, Source Port Number}
The second definition is that a socket consists of the following:
{Source IP Address, Source Port Number, Destination IP Address,
Destination Port Number}
Is there an official document or something that states what the correct definition is?
Also, is the Transport protocol included in the socket?

If you look at the RFCs, e.g. RFC 793, TRANSMISSION CONTROL PROTOCOL, you will see the definition:
Multiplexing:
To allow for many processes within a single Host to use TCP
communication facilities simultaneously, the TCP provides a set of
addresses or ports within each host. Concatenated with the network
and host addresses from the internet communication layer, this forms a
socket. A pair of sockets uniquely identifies each connection. That
is, a socket may be simultaneously used in multiple connections.

The first definition applies to an unconnected TCP or UDP socket.
The second definition applies to a connected TCP or UDP socket.
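As a rough illustration of the two definitions, the following Python sketch compares a listening TCP socket, which only has a local address and port, with a connected one, which also has a remote endpoint. The function name `describe_endpoints` and the use of the loopback address are assumptions made for the example, not part of the original answer.

```python
import socket

def describe_endpoints():
    """Compare an unconnected (listening) TCP socket with a connected one."""
    # Listening socket: only {local IP, local port} is defined.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0 asks the OS for any free port
    listener.listen(1)
    listener_addr = listener.getsockname()

    # Connected socket: both a local and a remote endpoint are defined.
    client = socket.create_connection(listener_addr, timeout=2.0)
    accepted, _ = listener.accept()
    client_local = client.getsockname()  # (source IP, source port)
    client_peer = client.getpeername()   # (destination IP, destination port)

    # The listening socket has no peer, so getpeername() raises OSError.
    try:
        listener.getpeername()
        listener_has_peer = True
    except OSError:
        listener_has_peer = False

    for s in (client, accepted, listener):
        s.close()
    return listener_addr, client_local, client_peer, listener_has_peer

if __name__ == "__main__":
    listener_addr, local, peer, has_peer = describe_endpoints()
    print("listening socket:", listener_addr, "has peer:", has_peer)
    print("connected socket:", local, "->", peer)
```

The exact port numbers depend on what the operating system hands out, so they will differ from run to run.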

Related

TCP sockets: Can the transport layer access the network layer header?

In my class, we learned that a TCP socket is uniquely identified by the 4-tuple consisting of source IP, destination IP, source port, and destination port.
Now suppose I have a web server running on port 80. It has 2 TCP sockets established to two clients that have different IP addresses but somehow both use the same source port, say 12345.
We also learned in class that the transport layer header adds only the source/destination ports whereas the network layer adds the source/destination IP addresses.
Now suppose the web server receives two packets, one from each client. As mentioned, the source port, destination port, destination IP addresses are the same in these packets, so the only difference is the source IP address.
However, if demultiplexing is done by the transport layer, how can the source IP address be used to move the packet to the right socket? After all, the source IP address is only part of the network header and, as far as I understand, that header is already stripped off before the packet is passed from the network layer up to the transport layer on the receiving side.

Demultiplexing in TCP/UDP

I know there is an older answer to this question here, though it does not seem to answer my question. If in UDP two people with different IPs and different ports send data to the same server (same IP) on the same socket (since in UDP there is only one socket per application, correct me if I am wrong), how does the server recognise who is who?
Does it change anything if the two people use (by luck or not) the same port as source port but with different source IP?
The server can receive UDP datagrams from two different IP/port pairs (IP could be same, port could be same, or both could be different) on the same port. The recvfrom() function returns the source IP/port of the datagram in addition to the data.
As mentioned in the question you referenced, a UDP socket is defined only by the local IP and local port. The remote IP and port can differ for both outgoing and incoming packets.
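A minimal sketch of this behaviour, assuming everything runs on the loopback interface: one UDP server socket receives datagrams from two client sockets and tells them apart only by the source address that `recvfrom()` returns. The helper name `udp_sources` and the message texts are made up for the example.

```python
import socket

def udp_sources():
    """Send from two UDP clients to one server socket; record each source."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))        # one socket, one local IP/port
    server.settimeout(2.0)               # avoid blocking forever if a datagram is lost
    server_addr = server.getsockname()

    clients = []
    for _ in range(2):
        c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        c.bind(("127.0.0.1", 0))         # each client gets its own source port
        clients.append(c)
    client_addrs = [c.getsockname() for c in clients]

    seen = {}
    for i, c in enumerate(clients):
        c.sendto(f"hello from client {i}".encode(), server_addr)
        data, source = server.recvfrom(1024)   # source is (IP, port) of the sender
        seen[source] = data.decode()

    for s in clients + [server]:
        s.close()
    return seen, client_addrs

if __name__ == "__main__":
    seen, _ = udp_sources()
    for source, message in seen.items():
        print(source, "sent", repr(message))
```

The server would typically reply with `sendto(data, source)` on the same socket, which is how a single UDP socket can serve many remote peers.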

Why is UDP socket identified by destination IP address and destination port?

According to "Computer networking: a top-down approach", Kurose et al., a UDP socket is fully identified by destination IP and destination port.
Why do we need the destination IP here? I thought UDP only needs the destination port for demultiplexing.
The machine may have multiple IPs, and different sockets may be bound to the same port on different IPs. It needs to use the destination IP to know which of these sockets the incoming datagram should be sent to.
In fact, it's quite common to use a different socket for each IP. When sending the reply, we want to ensure that the source IP matches the request's destination IP, so that the client can tell that the response came from the same server it sent to. By using different sockets for each IP, and sending the reply out the same socket that the request came in on, this consistency is maintained. Some socket implementations have an extension to allow setting the source IP at the time the reply is being sent, so they can use a single socket for all IPs, but this is not part of the standard sockets API.
I think that you are confusing UDP with multicast.
Multicast is a broadcast protocol that doesn't need a destination IP address. It only needs a port number because it is delivered to all IPs on the given port.
UDP, by contrast, is only delivered to one IP. This is why it needs that destination IP address.

How TCP/UDP demultiplexing works?

I have the following statement.
"In TCP, the receiver host uses all of source IP, source port, destination IP and destination port to direct datagram to appropriate socket. While in UDP, the receiver only checks destination port number to direct the datagram. "
Is the above statement true?
If yes, does it mean that in TCP the same port can be used for multiple sockets in one process, while in UDP only one socket can use a port in one process? What about sockets in different processes? Can multiple processes use the same port in TCP/UDP? (in programming language: C/C++/Java)
If not, why?
"In TCP, the receiver host uses all of source IP, source port, destination IP and destination port to direct datagram to appropriate socket. While in UDP, the receiver only checks destination port number to direct the datagram. "
Is the above statement true?
Yes.
If yes, does it mean that in TCP the same port can be used for multiple sockets in one process,
Yes, under some circumstances.
while in UDP only one socket can use a port in one process?
No, see below.
What about sockets in different processes? Can multiple processes use the same port in TCP/UDP? (in programming language: C/C++/Java)
Under some circumstances, yes. A UDP port has to be designated as reusable by all processes that want to share it. A TCP port can only be reused by sockets bound to different interfaces: there is no sharing.
What that means is, in TCP, a unique communication "channel" can be described as the four-tuple: (src-ip, src-port, dst-ip, dst-port).
In UDP, all packets destined to a certain port are delivered to the only UDP socket listening on that port, regardless of the source address and port of said packet. I like to think of it as a funnel.
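The TCP side of this can be sketched as follows, assuming loopback connections: two clients connect to the same listening port, and each accepted socket keeps that same local port while differing in its remote endpoint, so incoming data is kept apart by the full four-tuple. The function name `accept_two` and the payloads are hypothetical.

```python
import socket

def accept_two():
    """Accept two TCP connections on one port and show they stay separate."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(2)
    server_addr = listener.getsockname()

    c1 = socket.create_connection(server_addr, timeout=2.0)
    c2 = socket.create_connection(server_addr, timeout=2.0)
    c1_addr, c2_addr = c1.getsockname(), c2.getsockname()

    a1, peer1 = listener.accept()
    a2, peer2 = listener.accept()
    a1.settimeout(2.0)
    a2.settimeout(2.0)

    # Both accepted sockets share the server's local port ...
    local_ports = {a1.getsockname()[1], a2.getsockname()[1]}

    # ... yet each one only receives data from its own remote endpoint.
    c1.sendall(b"one")
    c2.sendall(b"two")
    received = {peer1: a1.recv(16), peer2: a2.recv(16)}

    for s in (c1, c2, a1, a2, listener):
        s.close()
    return server_addr, local_ports, received, c1_addr, c2_addr

if __name__ == "__main__":
    server_addr, local_ports, received, _, _ = accept_two()
    print("server port:", server_addr[1], "accepted sockets use:", local_ports)
    for peer, data in received.items():
        print(peer, "->", data)
```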

TCP and UDP same ports, different process

I know you can't have two different processes using the same port, but what happens if one is using TCP and the other one UDP? Can you have two different processes, each binding a socket to the same port but with a different protocol?
The 5-tuple (protocol, source IP, source port, destination IP, destination port) must be unique. That means not only can TCP and UDP use the same port number, but several outgoing connections can even share the same protocol and local port number, as long as their destinations differ.
When listening however, sockets usually must be unique in their protocol, i.e. you can/should not open another TCP socket with the same port number.
TCP ports and UDP ports are not related to each other at all.
Yes. Two sockets can bind to the same port number as long as they use different protocols.
It's not the same port, just happens to have the same number.
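A small sketch of the point made in these answers, assuming default socket options (no `SO_REUSEADDR` or `SO_REUSEPORT`) and the loopback interface: a TCP socket and a UDP socket bind to the same port number, while a second TCP socket is refused. The function name `bind_tcp_and_udp_same_port` and the retry count are assumptions for the example; for brevity it uses one process, but the same rule applies across processes.

```python
import socket

def bind_tcp_and_udp_same_port(attempts=5):
    """Bind TCP and UDP to one port number; check a second TCP bind fails."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        tcp2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            tcp.bind(("127.0.0.1", 0))   # let the OS pick a free TCP port
            tcp.listen(1)
            port = tcp.getsockname()[1]
            try:
                udp.bind(("127.0.0.1", port))   # same number, different protocol
            except OSError:
                continue  # another program holds this UDP port; try a new number
            try:
                tcp2.bind(("127.0.0.1", port))  # same number, same protocol
                second_tcp_refused = False
            except OSError:
                second_tcp_refused = True
            return port, udp.getsockname()[1], second_tcp_refused
        finally:
            for s in (tcp, udp, tcp2):
                s.close()
    raise RuntimeError("no port number was free for both TCP and UDP")

if __name__ == "__main__":
    tcp_port, udp_port, refused = bind_tcp_and_udp_same_port()
    print("TCP port:", tcp_port, "UDP port:", udp_port)
    print("second TCP bind refused:", refused)
```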