I am trying to wrap my head around network sockets. So far my understanding is that a server creates a new socket that is bound to a specific port. Then it listens on this socket to deal with client requests.
I've read this tutorial http://docs.oracle.com/javase/tutorial/networking/sockets/definition.html and it says
If everything goes well, the server accepts the connection. Upon acceptance,
the server gets a new socket bound to the same local port and also has
its remote endpoint set to the address and port of the client. It needs
a new socket so that it can continue to listen to the original socket for
connection requests while tending to the needs of the connected client.
Here are a few things that I don't quite understand
If everything goes well, the server accepts the connection.
Does it mean that a client request successfully arrived at the listening socket?
Upon acceptance, the server gets a new socket bound to the same local port and
also has its remote endpoint set to the address and port of the client
The new socket is created. It also gets bound to the same port but it doesn't listen for incoming requests. After the server has processed the client request, the response is written to this socket and then it gets closed. Is that correct?
Does it mean that the request is somehow passed from the first socket to the second socket?
It needs a new socket so that it can continue to listen to the original
socket for connection requests while tending to the needs of the connected client.
So, a new socket is created that then listens for incoming requests. Are there different types of sockets? Some kind of "listening" sockets and others?
Why does the server have to create a new listening socket? Why can't it reuse the previous one?
No. It means that an incoming connection arrived at the server.
No. It gets closed if the server closes it. Not otherwise.
No. It means that the incoming connection causes a connection to be fully formed and a socket created at the server to represent the server-end endpoint of it.
(a) No. A new socket is created to receive requests and send responses. (b) Yes. There are passive and active sockets. A passive socket listens for connections. An active socket sends and receives data.
It doesn't have to create a new listening (passive) socket. It has to create a new active socket to be the endpoint of the new connection.
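The difference between the passive and the active socket can be observed directly. The following is a minimal Python sketch, assuming a loopback address and an OS-assigned port (port 0), neither of which comes from the question; the variable names are illustrative only.

```python
import socket

# Passive (listening) socket: bound to a local port, no remote endpoint.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port (demo assumption)
listener.listen()
listen_port = listener.getsockname()[1]

# A client connects; the kernel completes the handshake and queues the connection.
client = socket.create_connection(("127.0.0.1", listen_port))

# accept() returns a NEW active socket that represents this one connection.
conn, peer = listener.accept()

# The active socket uses the same local port as the listening socket...
same_local_port = conn.getsockname()[1] == listen_port
# ...and its remote endpoint is the client's address and port.
peer_is_client = conn.getpeername() == client.getsockname()

# The listening socket has no remote endpoint at all.
try:
    listener.getpeername()
    listener_has_peer = True
except OSError:
    listener_has_peer = False

print(same_local_port, peer_is_client, listener_has_peer)

conn.close()
client.close()
listener.close()
```

The listening socket stays open after accept() and could accept further clients; closing conn would end only that one connection.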
Is a new socket created for every request?
Most protocols, for example HTTP with keep-alive, allow multiple requests per connection.
1) An incoming connection has arrived.
2) The socket doesn't get closed.
3) There is ServerSocket and plain Socket. ServerSocket.accept() returns a Socket object when a client connects.
FTP RFC 959 specifies that the data connection is opened by the server from port 20 (default) to a random port in the client and known by the server through a PORT h1,h2,h3,h4,p1,p2 command. This is called Active Mode Transmission.
so that the host is h1.h2.h3.h4 while the port is p1 * 256 + p2.
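As a small illustration of that arithmetic, here is a Python sketch that decodes the argument of a PORT command. The function name parse_port_argument and the sample address are made up for this example.

```python
def parse_port_argument(arg):
    """Decode 'h1,h2,h3,h4,p1,p2' into (host, port)."""
    parts = [int(field) for field in arg.split(",")]
    if len(parts) != 6 or any(not 0 <= value <= 255 for value in parts):
        raise ValueError("expected six comma-separated values in 0..255")
    h1, h2, h3, h4, p1, p2 = parts
    host = f"{h1}.{h2}.{h3}.{h4}"
    port = p1 * 256 + p2  # high byte first, then low byte
    return host, port

# Hypothetical client address used only for the demo.
host, port = parse_port_argument("192,168,1,10,19,137")
print(host, port)
```

Here p1 = 19 and p2 = 137 give 19 * 256 + 137 = 5001.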
My question is: how can the server initiate multiple connections to multiple clients via the same port, which is 20 by default?
Imagine client c1 has an established connection with server data port 20 and is transferring data. How can client c2 establish a connection with the server if the data port is already used by a TCP connection?
A server using Berkeley sockets goes through a couple of phases when accepting connections. Much of the plumbing is handled by the framework or the operating system; I'll point out where. The steps are sketched below in Python.
1: Binding to the listening port
The server first asks the kernel for a socket bound to a specific port and starts listening on it:
import socket
listener = socket.create_server(("", 20))  # socket() + bind() + listen() on port 20
2: Accepting a connection
This is probably the point that causes some misconceptions. The server gets a connection through the listening socket, but instead of using the listening socket itself to talk to the new client, it gets a new socket from the kernel for that connection. The new socket keeps the same local port (20); no new port is allocated. The kernel tells connections apart by the client's address and port. This is handled by the operating system.
# Block until a client connects. When it does, accept() returns
# 'client_socket' (a new socket) to handle the new client.
client_socket, client_address = listener.accept()
# We'll use 'client_socket' to communicate with the client.
client_socket.sendall(some_buffer)  # some_buffer: the bytes to send
# 'listener' is free again to accept more connections,
# so we can do it again:
client_socket2, client_address2 = listener.accept()
# Of course, this is typically done in a loop that processes new connections all the time.
As a summary, what's happening is that the listener socket (20) is used only for accepting new connections. After a client establishes connection, a new socket is created to handle that specific connection.
You can test this by examining the connection you get as a client after connecting. You'll see that the remote port is still 20: the accepted socket shares the listening port, and connections are told apart by the client's own address and port.
All of this applies to TCP and to any protocol built on top of it through the sockets API, such as FTP.
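The "accept in a loop" idea can be sketched as a small runnable Python program. This is only one possible arrangement, assuming a thread per connection, a loopback address, an OS-assigned port, and a loop that stops after two clients so the demo terminates; the helper names handle and serve are hypothetical.

```python
import socket
import threading

def handle(conn):
    # Serve one connection: reply in upper case, then close this socket only.
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def serve(listener, max_clients):
    # A real server would loop forever; the demo stops after max_clients.
    for _ in range(max_clients):
        conn, _addr = listener.accept()  # new socket for each connection
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
port = listener.getsockname()[1]
server_thread = threading.Thread(target=serve, args=(listener, 2), daemon=True)
server_thread.start()

replies = []
for message in (b"first", b"second"):
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        replies.append(client.recv(1024))

server_thread.join(timeout=5)
listener.close()
print(replies)
```

The listening socket is only ever used for accept(); all data flows through the per-connection sockets.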
I'm studying socket programming, and the server socket accept() is confusing me. I wrote two scenarios for server socket accept(), please take a look:
When the server socket does accept(), it creates a new (client) socket that is bound to a port that is different from the port the server socket is bound to. So socket communication is done via the newly bound port, and the server socket (for accept() only) is waiting for another client connection on the originally bound port.
I think this is not quite correct, because (1) a port matches to a single process and (2) socket accept is an inside-process matter and a single process can have multiple sockets. So I thought of a second scenario, based on some Stack Overflow answers:
When a server socket does accept(), it creates a new (client) socket that is not bound to any specific port. When a client communicates with the server, it uses the port that is bound to the server socket (who accept()s connections) and which client socket to actually communicate is resolved by (sourceIP, sourcePort, destIP, destPort) tuple from TCP header(?) at Transmission level (this is also suspicious because I thought socket is somewhat of an application-level object)
This scenario also raises some questions. If the socket communications still use server socket's port, i.e. client sends some messages to the server socket port, doesn't it use the server socket's backlog queue? I mean, how can messages from a client be distinguished between connect() and read() or write()? And how can they be resolved to each client socket in the server, without any port binding?
If one of my scenarios is correct, would that answer the questions above? Or perhaps both of my scenarios are wrong. I'd be very thankful if you could guide me to correct answers, or at least towards some relevant texts to study.
When you create a socket and do a bind on that socket and then a listen, what you have is what is called a listening socket.
When a connection is established, this socket is basically cloned into a new socket, called the servicing socket. The port to which it is bound is still the same as the original port.
But there is an important distinction between this socket and the listening socket from before. Namely, it is part of a socket pair.
It is the socket pair that uniquely identifies the connection. Since a socket pair involves 2 sockets, there are 2 IP addresses and 2 ports, one for each end of the TCP communication channel. When the servicing socket is cloned, the kernel's TCP layer allocates what is called a TCB (transmission control block) and stores those 2 IP addresses and 2 ports in it. The TCB also contains the socket number that belongs to the TCB.
Each time a TCP segment comes in, the kernel checks the TCP header to see whether or not it is a SYN. A SYN means connection establishment, so the kernel goes through its list of listening sockets. If it is a normal TCP segment, not a SYN, both port numbers are in the TCP header and the IP addresses are in the IP header, so using this information the kernel is able to find the TCB that belongs to this TCP connection. (For a SYN, this information is also there, but as I said, for a SYN only the listening sockets have to be processed.)
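The lookup described here can be modelled in a few lines of Python. This is a toy model only, not how any real kernel is implemented: the dictionaries, the demultiplex function, the labels and the addresses are all invented for illustration, and a real TCB holds far more state.

```python
# Toy model: established connections are keyed by the full address 4-tuple,
# listening sockets only by the local address and port.
connections = {}  # (local_ip, local_port, remote_ip, remote_port) -> label
listeners = {}    # (local_ip, local_port) -> label

def demultiplex(local_ip, local_port, remote_ip, remote_port, syn=False):
    key = (local_ip, local_port, remote_ip, remote_port)
    if key in connections:
        # Ordinary segment of an existing connection: found by its 4-tuple.
        return connections[key]
    if syn and (local_ip, local_port) in listeners:
        # New connection: record a "TCB" for it, hand the SYN to the listener.
        connections[key] = f"conn-{len(connections) + 1}"
        return listeners[(local_ip, local_port)]
    return None  # nothing matches; a real stack would typically reply with RST

listeners[("10.0.0.1", 80)] = "listener:80"

syn_a = demultiplex("10.0.0.1", 80, "10.0.0.2", 40000, syn=True)
syn_b = demultiplex("10.0.0.1", 80, "10.0.0.3", 40000, syn=True)
data_a = demultiplex("10.0.0.1", 80, "10.0.0.2", 40000)
data_b = demultiplex("10.0.0.1", 80, "10.0.0.3", 40000)
stray = demultiplex("10.0.0.1", 80, "10.0.0.9", 1234)
print(syn_a, syn_b, data_a, data_b, stray)
```

Both clients use local port 80 on the server and even the same source port, yet their segments end up at different connections because the remote address differs.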
That is in a nutshell how it works.
This information can be found in UNIX Network Programming: The Sockets Networking API. It describes the link to the sockets, whereas other reference material usually does not go into that much detail and instead highlights the nitty-gritty of TCP.
When server socket do accept(), it creates a new (client) socket that is bind to port that is different from the port server socket is bind. So socket communication is done via newly bind port, and server socket (for accept() only) is waiting for another client connection on originally bind port.
No.
I think this is not quite proper answer
It is a wrong answer.
because (1) port matches to a single process
That doesn't mean anything relevant.
and (2) socket accept is inside-process matters
Nor does that. It doesn't appear to mean anything at all actually.
and single process can have multiple sockets.
That's true but it doesn't have any bearing on why your answer is wrong. The reason your answer is wrong is because no second port is used.
When server socket do accept(), it creates a new (client) socket that is not bind to any specific port
No. It creates a second socket that inherits everything from the server socket: port number, buffer sizes, socket options, ... everything except the file descriptor and the LISTENING state, and maybe I forgot something else. It then sets the remote IP:port of the socket to that of the client and puts the socket into ESTABLISHED state.
and when client communicates with the server
The client has already communicated with the server. That's why we are creating this socket.
it uses the port that is bind to server socket (who accept()s connections) and which client socket to actually communicate is resolved by (sourceIP, sourcePort, destIP, destPort) tuple from TCP header(?) at Transmission level
This has already happened.
This is also suspicious because I thought socket is somewhat application-level object)
No it isn't. A socket is a kernel-level object with an application-level file descriptor to identify it.
If the socket communications still use server socket's port, i.e. client sends some messages to server socket port, doesn't it uses server socket's backlog queue?
No. The backlog queue is for incoming connect requests, not for data. Incoming data goes into the socket receive buffer.
I mean, how can messages from client be distinguished between connect() and read() or write()?
Because a connect() request is sent as a segment with the SYN flag set in the TCP header, which ordinary data segments do not have. The final part of the handshake can be combined with data.
And how can they be resolved to each client sockets in server, WITHOUT any port binding?
Port binding happens the moment the socket is created in the call to accept(). You invented this difficulty yourself. It isn't real.
If one of my scenario is correct, would answer to the questions following?
Neither of them is correct.
Or possibly I'm making two wrong scenarios, so it would be very thankful for you to provide right answers, or at least some relevant texts to study.
Surely you already have relevant texts to study? If you don't, you should read RFC 793 or W.R. Stevens, TCP/IP Illustrated, volume I, relevant chapters. You have several major misunderstandings here.
From the Linux programmer's manual, as found via man 2 accept:
The accept() system call is used with connection-based socket
types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection
request on the queue of pending connections for the listening socket,
sockfd, creates a new connected socket, and returns a new file
descriptor referring to that socket. The newly created socket is not
in the listening state. The original socket sockfd is unaffected by
this call.
So what happens is that you have a listening TCP socket. Someone requests to connect().
You then call accept(). The old listening socket remains in listening mode, while a new socket is created in connected mode. Its local port is the original listening port.
That does not interfere with the listening socket, because the new socket does not listen for incoming connections.
I have looked up in BSD code but got lost somewhere :(
the reason I want to check is this:
TCP RFC (http://www.ietf.org/rfc/rfc793.txt) sec 2.7 states:
"To provide for unique addresses within each TCP, we concatenate an internet address identifying the TCP with a port identifier to create a socket which will be unique throughout all networks connected together. A connection is fully specified by the pair of sockets at the ends."
Does this mean: socket = local (ip + port) ?
If yes, then the accept function of Unix returns a new socket descriptor. Will it mean that a new socket is created (in turn a new port is created) for responding to client requests?
PS: I am a novice in network programming.
[UPDATE] I understood what I read in "How does the socket API accept() function work?".
My only doubt is: if socket = (local port + local ip), then a new socket would mean a new port for the same IP. Going by this logic, accept returns a new socket (thus a new port is created), so all sending should occur through this new port.
Is what I understand here correct?
You are mostly correct. When you accept(), a new socket is created and the listening socket stays open to allow more incoming connections but the new socket uses the same local port number as the listening socket.
A connection is defined by a 5-tuple: protocol, local-addr, local-port, remote-addr, remote-port.
Therefore, each accepted connection is unique even though they all share the same local port number because the remote ip/port is always different. The listening socket has no remote ip/port and so is also unique.
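A quick way to see this is to accept two connections on one listening socket and print their tuples. The Python sketch below assumes a loopback address and an OS-assigned port; the "tcp" string is just a label added to make up the fifth element.

```python
import socket

listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
port = listener.getsockname()[1]

# Two clients connect to the same server port.
clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(2)]
accepted = [listener.accept()[0] for _ in range(2)]

# (protocol, local-addr, local-port, remote-addr, remote-port) per connection.
tuples = [("tcp",) + s.getsockname() + s.getpeername() for s in accepted]
local_ports = {t[2] for t in tuples}
remote_ports = {t[4] for t in tuples}

for t in tuples:
    print(t)

for s in accepted + clients + [listener]:
    s.close()
```

Both tuples show the same local port, and they differ only in the remote port chosen on the client side.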
If a client listens on a socket, at http://socketplaceonnet.com for example, how does it know that there is new content? I assume the server cannot send data directly to the client, as the client could be behind a router with no port forwarding, so a direct connection is not possible. The client could be a mobile phone which changes its IP address. I understand that for the client to be a listener, the server doesn't need to know the client's IP.
Thank you
A client socket does not listen for incoming connections; it initiates an outgoing connection to the server. The server socket listens for incoming connections.
A server creates a socket, binds the socket to an IP address and port number (for TCP and UDP), and then listens for incoming connections. When a client connects to the server, a new socket is created for communication with the client (TCP only). A polling mechanism is used to determine if any activity has occurred on any of the open sockets.
A client creates a socket and connects to a remote IP address and port number (for TCP and UDP). A polling mechanism can be used (select(), poll(), epoll(), etc) to monitor the socket for information from the server without blocking the thread.
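The polling mechanism mentioned above can be sketched with Python's selectors module, which wraps select(), poll(), or epoll() depending on the platform. This is a minimal sketch, assuming a loopback address, an OS-assigned port, and a bounded loop so the demo terminates; the "listener" and "connection" tags are arbitrary labels.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, data="listener")

# The client initiates the connection and sends something.
client = socket.create_connection(listener.getsockname())
client.sendall(b"ping")

received = []
for _ in range(10):  # bounded loop so the demo always terminates
    for key, _events in sel.select(timeout=1):
        if key.data == "listener":
            # Readable listening socket: a connection is waiting to be accepted.
            conn, _addr = key.fileobj.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data="connection")
        else:
            # Readable connected socket: data from the peer has arrived.
            received.append(key.fileobj.recv(1024))
            sel.unregister(key.fileobj)
            key.fileobj.close()
    if received:
        break

sel.unregister(listener)
listener.close()
client.close()
sel.close()
print(received)
```

The same loop watches the listening socket and every accepted socket, so one thread can notice activity on any of them without blocking on a single one.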
In the case that the client is behind a router which provides NAT (network address translation), the router re-writes the address of the client to match the router's public IP address. When the server responds, the router changes its public IP address back into the client's IP address. The router keeps a table of the active connections that it is translating so that it can map the server's responses to the correct client.
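The translation table can be pictured with a toy Python model. This is only a sketch of the bookkeeping, not of any real router: the class ToyNat, its method names, the starting port 40000 and the addresses are all assumptions made for the example.

```python
class ToyNat:
    """Toy port-translating NAT: maps private endpoints to public ports."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000  # arbitrary starting port (assumption)
        self.out = {}           # (private_ip, private_port) -> public_port
        self.back = {}          # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port):
        # Rewrite the source of an outgoing packet, remembering the mapping.
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, dst_port):
        # Map a reply arriving at the public address back to the client.
        return self.back.get(dst_port)

nat = ToyNat("203.0.113.5")
first = nat.outbound("192.168.0.10", 51000)
second = nat.outbound("192.168.0.11", 51000)
reply_target = nat.inbound(first[1])
unknown = nat.inbound(9999)
print(first, second, reply_target, unknown)
```

A reply to a port with no table entry has nowhere to go, which is why a server cannot open a connection to a client behind such a router unless the client connected first or port forwarding is configured.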
A TCP iterative server accepts a client's connection, processes it, completes all requests from that client,
and disconnects. An iterative server can only process one client's requests at a time. Only when all the
requests of the current client are satisfied can the server move on to the next client. If one client occupies the
server, other clients have to wait, so TCP servers seldom use the iterative server model.
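For illustration, an iterative server might look like the following Python sketch. It assumes a loopback address, an OS-assigned port, a simple echo-style reply, and a fixed number of clients so the demo terminates; the function name iterative_server is hypothetical.

```python
import socket
import threading

def iterative_server(listener, clients_to_serve):
    for _ in range(clients_to_serve):
        conn, _addr = listener.accept()
        with conn:
            # Stay with this client until it closes the connection;
            # nobody else is served in the meantime.
            while True:
                request = conn.recv(1024)
                if not request:  # empty read: the client disconnected
                    break
                conn.sendall(b"echo:" + request)

listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
address = listener.getsockname()
server = threading.Thread(target=iterative_server, args=(listener, 2), daemon=True)
server.start()

replies = []
for name in (b"a", b"b"):
    with socket.create_connection(address) as client:
        client.sendall(name)
        replies.append(client.recv(1024))

server.join(timeout=5)
listener.close()
print(replies)
```

The inner loop is what makes the server iterative: accept() is not called again until the current client is done, so a slow client delays everyone queued behind it.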
Let's say I have a server socket listening on port 5010. When a client tries to connect to this server socket using the connect() API, the server accepts the connection in the accept() API.
The accept() API returns a new socket for the server/client connection. Now all data transfer between server and client is done using this newly created socket. Does the data transfer happen on the same port 5010? If not, how are the ports chosen when a new socket is returned as a result of accept()?
The connection between the server and the client socket is identified by the tuple (serverAddress, serverPort, clientAddress, clientPort). The server address and server port always stay the same (obviously). The client allocates a (semi-)random "source" port to avoid collisions even if re-using the same address (e.g. when there are multiple clients on the same machine).