I've read a lot of theory about sockets and client-server connections on these forums, but some points remain blurry and some answers don't satisfy me completely.
I'd also like my understanding to be confirmed, completed or corrected:
1) A socket is made out of a source IP (the IP of the client), a source port (a port automatically and randomly chosen by the OS between 1024 and 65535), a destination IP (127.0.0.1? Something I don't get here), a destination port (a developer-defined port for the server) and a protocol type.
There may be something wrong in those lines already.
But assuming it is true, how can the server differentiate two processes accessing it from the same machine? (That is, how can the developer tell them apart if he wants to prevent multiple accesses from the same machine?)
The only difference would be the source port, which is auto-filled by the OS. In that case, it would act as if it were a totally different machine, right?
2) I heard there is actually a pair of sockets: one created by the client, and one by the server.
Is there really a need for the server to have a second socket? Is this socket a simple replica kept as a copy in the "clients currently connected" list, or is it a different socket with different values?
3) When should a client "disconnect"? After each query? At the end of some process? Something else?
Thanks for the enlightenment!
1) A socket is made out of a source IP (the IP of the client), a source port (a port automatically and randomly chosen by the OS between 1024 and 65535), a destination IP (127.0.0.1? Something I don't get here), a destination port (a developer-defined port for the server) and a protocol type.
I wouldn't say a socket is "made out of" those data points; rather, a TCP connection can be uniquely identified using just those data points:
1. Source IP - the IP address of the client computer
2. Source Port - the port number (on the client computer) that the client is sending packets from and receiving packets on
3. Destination IP - the IP address of the server computer
4. Destination Port - the port number (on the server computer) that the server is sending packets from and receiving packets on
5. Protocol type - what communications protocol is in use (i.e. either TCP or UDP)
But assuming it is true, how can the server differentiate two processes accessing it from the same machine?
It can differentiate them because the 5-tuple (above) will be unique for each of the two connections. In particular, in the TCP packets the server receives from process #2, field #2 (Source Port) will be different from the value it has in the packets received from process #1.
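To make that concrete, here is a minimal sketch (assuming a POSIX system and a hypothetical server port 5000, with error handling mostly omitted) of a TCP server that prints the source IP and source port of each accepted connection; two clients connecting from the same machine will show the same IP but different source ports:
/* print each client's source IP and port (sketch; error handling mostly omitted) */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);        /* the server's listening socket */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);         /* listen on all local addresses */
    addr.sin_port = htons(5000);                      /* hypothetical server port */
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 16);

    for (;;) {
        struct sockaddr_in peer;
        socklen_t len = sizeof(peer);
        int conn = accept(srv, (struct sockaddr *)&peer, &len);  /* one new socket per connection */
        if (conn < 0)
            continue;
        char ip[INET_ADDRSTRLEN];
        inet_ntop(AF_INET, &peer.sin_addr, ip, sizeof(ip));
        /* two clients on the same machine print the same IP but different ports here */
        printf("client %s:%u connected\n", ip, (unsigned)ntohs(peer.sin_port));
        close(conn);
    }
}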
The only difference would be the source port, which is auto-filled by the OS. In that case, it would act as if it were a totally different machine, right?
The server can act however it was programmed to act -- but in most cases the server will be programmed not to care whether two client connections come from the same physical machine or not. To most servers, a client is a client, and a client's physical location is not that important.
Is there really a need for the server to have a second socket? Is this socket a simple replica kept as a copy in the "clients currently connected" list, or is it a different socket with different values?
A socket is just a data structure that lives in a computer's memory to help it keep track of the current state associated with a particular network connection. Since both the client and the server need to keep track of their end of the connection, both the client and the server will have their own socket representing their endpoint. (Note the difference between a "TCP connection", which you can imagine as an imaginary/virtual wire running from one computer to another, and the two "sockets", which would be the imaginary/virtual connectors at the ends of that wire that attach the wire to the client program on one end and the server program on the other.)
3) When should a client "disconnect"? After each query? At the end of some process? Something else?
Whenever it wants to; it's up to the programmer(*). There are startup/shutdown costs to opening and connecting a new socket, but there is also some ongoing memory and CPU overhead to keeping a socket open indefinitely, so the programmer will have to make a design decision about whether he wants to keep sockets open over extended periods, or not.
(*) Note that in a modern OS, if the client program exits or crashes, the socket will be automatically closed and the connection automatically disconnected by the OS.
I have a requirement in which a server needs to interact with 2 clients, one residing on the local machine and one on a remote machine.
So initially I was thinking of creating a socket using AF_UNIX for communication with the local client (since it's faster than AF_INET), and AF_INET for communication with the remote one, and polling between them.
But in the case of the local client, the channel will only be created at the beginning and will exist for as long as the server is running, i.e. a single accept followed by multiple reads/writes.
So, can I replace AF_UNIX with AF_INET, since connection establishment will be done only once?
Where does the performance hit come from in the case of AF_INET? Is it in the three-way handshake, or somewhere else as well?
Quote from Performance: TCP loopback connection vs Unix Domain Socket:
When the server and client benchmark programs run on the same box, both the TCP/IP loopback and unix domain sockets can be used. Depending on the platform, unix domain sockets can achieve around 50% more throughput than the TCP/IP loopback (on Linux for instance). The default behavior of redis-benchmark is to use the TCP/IP loopback.
However, make sure that the performance gain is worth the tradeoff of complicating the network stack of your application (by using various types of sockets depending on client location).
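For comparison, here is a rough client-side sketch (assuming a Linux-like system, a hypothetical socket path /tmp/demo.sock and a hypothetical loopback port 5000); the only difference between the two cases is the address family and address structure, and the read/write code that follows is identical, so switching is mostly a one-time setup change:
/* connect to a local server over AF_UNIX or over the AF_INET loopback (sketch) */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* AF_UNIX: the address is a filesystem path; no TCP handshake or per-packet TCP overhead */
int connect_unix(const char *path) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) { close(fd); return -1; }
    return fd;
}

/* AF_INET loopback: the full TCP stack is involved, even though no packet leaves the host */
int connect_loopback(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);    /* 127.0.0.1 */
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) { close(fd); return -1; }
    return fd;
}

int main(void) {
    int u = connect_unix("/tmp/demo.sock");           /* hypothetical UNIX-domain server path */
    int t = connect_loopback(5000);                   /* hypothetical TCP server on 127.0.0.1:5000 */
    printf("unix: %s, loopback: %s\n", u >= 0 ? "connected" : "failed", t >= 0 ? "connected" : "failed");
    if (u >= 0) close(u);
    if (t >= 0) close(t);
    return 0;
}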
I connect with Internet Explorer 9 to a web server hosted on an embedded system. Windows 7 is on the client side.
The web page has many tabs, and I browse across them until the problem occurs. It takes about one minute to happen.
The embedded system freezes, so it is not possible to browse, and it does not respond to ping. After a moment the embedded system recovers, because it is designed to reboot. I have attached a Wireshark trace in which you can see 92 connections (use the filter "tcp.stream eq N" with N in [0,91]). I have the source code, so I know that the embedded system does not support more than 37 simultaneous connections. Is the cause an exhaustion of resources?
But I have a more basic question, and I would really appreciate an answer to it. The web server is at 172.21.1.12 port 80 and the client is at 172.21.9.70 with variable port numbers (see the trace). Since the IP and port on the server side do not change, how many sockets are in use on the server side? The question is important because the more sockets are open, the more likely it is that resources are exhausted.
If the answer is only 1 socket, then I must conclude there is no lack of resources, because the system can support 37.
I also suggest you use the filter ip.addr == 172.21.1.12 in Wireshark.
I thought I could upload the Wireshark file. I don't know how to share it with you. Help, please?
Dropbox?
Under the caveat that you haven't specified your embedded system, most TCP stacks will create a new socket for each new connection, and the mapping from socket to connection is 1-1.
When a packet arrives to the network stack, it has to associate that packet to the right socket. Usually, this is accomplished by employing a map from the TCP 4-tuple to the socket, where the 4-tuple consists of [local-ip, local-port, remote-ip, remote-port].
A server makes its service available by listening on a fixed local port that is known to clients wanting to use the service. As you understand, this is usually port 80 for a web server, and the software interface of most TCP implementations dedicates a socket to this service so that the API can perform operations on its network parameters. However, that socket is not fully connected (the last two parts of the 4-tuple are set to a special "not specified" value, usually all bits 0). When a new connection is accepted, a new socket is created whose 4-tuple consists of the local information of the listening socket and the remote information taken from the source address and port of the SYN packet that initiated the TCP connection.
The limit on the number of connections a server can support is based on how the operating system is configured (you say yours limits it to 37). Using the 4-tuple, a single service (that is, a fixed local-ip and local-port) will have an absolute limit of (2^ADDR_BITS - RESERVED_ADDRS) × (2^16 - RESERVED_PORTS). For IPv4, the number of address bits is 32, while for IPv6, the number of address bits is 128.
When creating a connection, the client will specify the destination address and port (which fills out the remote information for the 4-tuple), but usually leave the source information unspecified. The TCP stack will choose an appropriate source address based on routing, and select an available source port (which will become the local information to complete the 4-tuple). In theory, any source port that is not being used by the selected local interface to communicate to the same remote service can be used as the local port. Most stacks will dedicate a set of the higher numbered ports for this purpose (referred to as the ephemeral port range).
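To illustrate that last point, here is a small sketch (assuming a POSIX system and a hypothetical server at 192.0.2.10:80): the client calls connect() without bind(), and getsockname() then shows which local address and ephemeral port the stack picked to complete the 4-tuple:
/* let the stack pick the source address and ephemeral port, then inspect it (sketch) */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in srv;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    inet_pton(AF_INET, "192.0.2.10", &srv.sin_addr);  /* hypothetical remote server */
    srv.sin_port = htons(80);

    /* no bind() before connect(): the stack fills in the local half of the 4-tuple */
    if (connect(fd, (struct sockaddr *)&srv, sizeof(srv)) == 0) {
        struct sockaddr_in local;
        socklen_t len = sizeof(local);
        getsockname(fd, (struct sockaddr *)&local, &len);
        char ip[INET_ADDRSTRLEN];
        inet_ntop(AF_INET, &local.sin_addr, ip, sizeof(ip));
        printf("local end chosen by the stack: %s:%u\n", ip, (unsigned)ntohs(local.sin_port));
    }
    close(fd);
    return 0;
}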
I have been working on a project involving TCP/IP socket connections and message transfer through those sockets. I am connecting to a UNIX server with a specific IP address and establishing socket connections. So far I have managed roughly 16000 connections from 1 host (in this case my own PC). And when I try establishing connections from other hosts (whether Mac OS X or Windows PCs), I reach the same maximum connection number, 16000.
I can have 65536 connections on the server side, and I have actually reached that, but only with 16000 connections from each of 4 different computers. I wonder why this limit exists and how I can establish more than 16000 connections from only 1 host.
On Windows systems the TCP stack is subject to several registry parameters. They're arcane and poorly documented, have changed with newer releases (Vista, Win7, Win8), and also vary between desktop and server OS flavors.
Some KBs and MSDN articles cover the subject:
Tuning TCP/IP for Performance (a tad dated).
TCP/IP and NBT configuration parameters
TCP/IP Configuration Parameters
But this article is more to the point for your problem: Avoiding TCP/IP Port Exhaustion. Although it is BizTalk related, the topic and solution are generic: increase MaxUserPort and decrease TcpTimedWaitDelay (careful with the latter one, though). The specific values your system ends up supporting vary, so you have to play with the settings. Make sure your test machines have a 64-bit processor, a 64-bit OS, and enough RAM (>4 GB).
For OS X I hope somebody else will provide the details.
Can two applications on the same machine bind to the same port and IP address? Taking it a step further, can one app listen to requests coming from a certain IP and the other to another remote IP?
I know I can have one application that starts off two threads (or forks) to have similar behavior, but can two applications that have nothing in common do the same?
The answer differs depending on what OS is being considered. In general though:
For TCP, no. You can only have one application listening on the same port at one time. Now if you had 2 network cards, you could have one application listen on the first IP and the second one on the second IP using the same port number.
For UDP (Multicasts), multiple applications can subscribe to the same port.
Edit: Since Linux Kernel 3.9 and later, support for multiple applications listening to the same port was added using the SO_REUSEPORT option. More information is available at this lwn.net article.
Yes (for TCP) you can have two programs listen on the same socket, if the programs are designed to do so. When the socket is created by the first program, make sure the SO_REUSEADDR option is set on the socket before you bind(). However, this may not be what you want. What this means is that an incoming TCP connection will be directed to one of the programs, not both, so it does not duplicate the connection; it just allows two programs to service incoming requests. For example, web servers will have multiple processes all listening on port 80, and the OS sends a new connection to the process that is ready to accept new connections.
SO_REUSEADDR
Allows other sockets to bind() to this port, unless there is an active listening socket bound to the port already. This enables you to get around those "Address already in use" error messages when you try to restart your server after a crash.
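As a minimal sketch (assuming a POSIX system and a hypothetical port 8080), the option must be set after socket() and before bind():
/* set SO_REUSEADDR before bind() so a restarted server can rebind while old connections linger in TIME_WAIT (sketch) */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int yes = 1;
    /* avoids "Address already in use" right after a crash/restart */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) { close(fd); return -1; }
    listen(fd, 16);
    return fd;
}

int main(void) {
    int fd = make_listener(8080);                     /* hypothetical port */
    if (fd < 0) return 1;
    close(fd);
    return 0;
}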
Yes.
1. Multiple listening TCP sockets, all bound to the same port, can co-exist, provided they are all bound to different local IP addresses. Clients can connect to whichever one they need to. This excludes 0.0.0.0 (INADDR_ANY).
2. Multiple accepted sockets can co-exist, all accepted from the same listening socket, all showing the same local port number as the listening socket.
3. Multiple UDP sockets all bound to the same port can all co-exist provided either the same condition as at (1) holds or they have all had the SO_REUSEADDR option set before binding.
4. TCP ports and UDP ports occupy different namespaces, so the use of a port for TCP does not preclude its use for UDP, and vice versa.
Reference: Stevens & Wright, TCP/IP Illustrated, Volume II.
In principle, no.
It's not written in stone, but it's the way all APIs are written: the app opens a port, gets a handle to it, and the OS notifies it (via that handle) when a client connection (or a packet, in the UDP case) arrives.
If the OS allowed two apps to open the same port, how would it know which one to notify?
But... there are ways around it:
As Jed noted, you could write a 'master' process, which would be the only one that really listens on the port and notifies others, using any logic it wants to separate client requests.
On Linux and BSD (at least) you can set up 'remapping' rules that redirect packets from the 'visible' port to different ones (where the apps are listening), according to any network related criteria (maybe network of origin, or some simple forms of load balancing).
Yes, definitely. As far as I remember, from kernel version 3.9 onwards (not sure about the exact version) support for SO_REUSEPORT was introduced. SO_REUSEPORT allows binding to the exact same port and address, as long as the first server sets this option before binding its socket.
It works for both TCP and UDP. Refer to the link for more details: SO_REUSEPORT
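A minimal sketch of that (assuming Linux 3.9+ and a hypothetical port 8080): if every process sets SO_REUSEPORT before bind(), they can all listen on the same address and port, and the kernel distributes incoming connections among them:
/* several independent processes can each run this and listen on the same port (sketch, Linux 3.9+) */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int yes = 1;
    /* must be set before bind(), by every process that shares the port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &yes, sizeof(yes));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                      /* hypothetical shared port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        perror("bind");
        return 1;
    }
    listen(fd, 16);

    /* the kernel load-balances new connections across all listeners on this port */
    int conn = accept(fd, NULL, NULL);
    printf("pid %d accepted a connection\n", (int)getpid());
    if (conn >= 0) close(conn);
    close(fd);
    return 0;
}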
No. Only one application can bind to a port at a time, and behavior if the bind is forced is indeterminate.
With multicast sockets -- which sound like nowhere near what you want -- more than one application can bind to a port as long as SO_REUSEADDR is set in each socket's options.
You could accomplish this by writing a "master" process, which accepts and processes all connections, then hands them off to your two applications who need to listen on the same port. This is the approach that Web servers and such take, since many processes need to listen to 80.
Beyond this, we're getting into specifics -- you tagged both TCP and UDP, which is it? Also, what platform?
You can have one application listening on one port for one network interface. Therefore you could have:
httpd listening on remotely accessible interface, e.g. 192.168.1.1:80
another daemon listening on 127.0.0.1:80
Sample use case could be to use httpd as a load balancer or a proxy.
When you create a TCP connection, you ask to connect to a specific TCP address, which is a combination of an IP address (v4 or v6, depending on the protocol you're using) and a port.
When a server listens for connections, it can inform the kernel that it would like to listen to a specific IP address and port, i.e., one TCP address, or on the same port on each of the host's IP addresses (usually specified with IP address 0.0.0.0), which is effectively listening on a lot of different "TCP addresses" (e.g., 192.168.1.10:8000, 127.0.0.1:8000, etc.)
No, you can't have two applications listening on the same "TCP address," because when a message comes in, how would the kernel know to which application to give the message?
However, in most operating systems you can set up several IP addresses on a single interface (e.g., if you have 192.168.1.10 on an interface, you could also set up 192.168.1.11, if nobody else on the network is using it), and in those cases you could have separate applications listening on port 8000 on each of those two IP addresses.
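As a sketch of that setup (assuming a POSIX system and that the hypothetical addresses 192.168.1.10 and 192.168.1.11 are both configured on the local machine), each application simply binds to its own address instead of the 0.0.0.0 wildcard:
/* bind a listener to one specific local IP address rather than 0.0.0.0 (sketch) */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int listen_on(const char *local_ip, unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    inet_pton(AF_INET, local_ip, &addr.sin_addr);     /* a specific local address, not INADDR_ANY */
    addr.sin_port = htons(port);
    /* bind() fails only if another socket already holds this exact address:port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) { close(fd); return -1; }
    listen(fd, 16);
    return fd;
}

int main(void) {
    /* application A could run the first call, application B the second; both can succeed
       because the two TCP addresses differ (the addresses here are hypothetical) */
    int a = listen_on("192.168.1.10", 8000);
    int b = listen_on("192.168.1.11", 8000);
    printf("A: %s, B: %s\n", a >= 0 ? "listening" : "failed", b >= 0 ? "listening" : "failed");
    if (a >= 0) close(a);
    if (b >= 0) close(b);
    return 0;
}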
Just to share what #jnewton mentioned.
I started an nginx and an embedded Tomcat process on my Mac. I can see both processes running on 8080.
LT<XXXX>-MAC:~ b0<XXX>$ sudo netstat -anp tcp | grep LISTEN
tcp46 0 0 *.8080 *.* LISTEN
tcp4 0 0 *.8080 *.* LISTEN
Another way is to use a program listening on one port that analyses the kind of traffic (ssh, https, etc.) and redirects it internally to another port on which the "real" service is listening.
For example, for Linux, sslh: https://github.com/yrutschle/sslh
If at least one of the remote IPs is already known, static, and dedicated to talking only to one of your apps, you can use an iptables rule (table nat, chain PREROUTING) to redirect incoming traffic from that address, aimed at the "shared" local port, to any other port where the appropriate application actually listens.
Yes.
From this article:
https://lwn.net/Articles/542629/
The new socket option allows multiple sockets on the same host to bind to the same port
Yes and no. Only one application can actively listen on a port. But that application can bequeath its connection to another process. So you could have multiple processes working on the same port.
You can make two applications listen for the same port on the same network interface.
There can only be one listening socket for the specified network interface and port, but that socket can be shared between several applications.
If you have a listening socket in an application process and you fork that process, the socket will be inherited, so technically there will now be two processes listening on the same port.
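A minimal sketch of that pre-fork pattern (assuming a POSIX system and a hypothetical port 8080, with error handling mostly omitted): the listening socket is created once, inherited across fork(), and both processes then call accept() on it:
/* one listening socket shared by parent and child after fork() (sketch; error handling mostly omitted) */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                      /* hypothetical port */
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 16);

    fork();  /* the child inherits srv; both processes keep running the loop below */

    for (;;) {
        int conn = accept(srv, NULL, NULL);           /* the kernel hands each connection to exactly one process */
        if (conn < 0)
            continue;
        printf("pid %d handled a connection\n", (int)getpid());
        close(conn);
    }
}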
I have tried the following, with socat:
socat TCP-L:8080,fork,reuseaddr -
And even though I have not made a connection to the socket, I cannot listen twice on the same port, in spite of the reuseaddr option.
I get this message (which I expected before):
2016/02/23 09:56:49 socat[2667] E bind(5, {AF=2 0.0.0.0:8080}, 16): Address already in use
If by applications you mean multiple processes, then yes, but generally NO.
For example, the Apache server runs multiple processes on the same port (generally 80). This is done by designating one of the processes to actually bind to the port, and then using that process to hand connections over to the various processes which accept them.
Short answer:
Going by the answer given here, you can have two applications listening on the same IP address and port number, as long as one of them uses a UDP port and the other uses a TCP port.
Explanation:
The concept of port is relevant on the transport layer of the TCP/IP stack, thus as long as you are using different transport layer protocols of the stack, you can have multiple processes listening on the same <ip-address>:<port> combination.
A doubt that people have is: if two applications are running on the same <ip-address>:<port> combination, how will a client running on a remote machine distinguish between the two? If you look at the IP-layer packet header (https://en.wikipedia.org/wiki/IPv4#Header), you will see that bits 72 to 79 are used to define the protocol; this is how the distinction is made.
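A minimal sketch of that (assuming a POSIX system and a hypothetical port 9000): a TCP socket and a UDP socket can both be bound to the same port number, whether by one process as below or by two separate processes, because the two protocols have separate port namespaces:
/* a TCP socket and a UDP socket bound to the same port number (sketch) */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int bind_port(int type, unsigned short port) {
    int fd = socket(AF_INET, type, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) { close(fd); return -1; }
    return fd;
}

int main(void) {
    int tcp_fd = bind_port(SOCK_STREAM, 9000);        /* TCP port 9000 */
    int udp_fd = bind_port(SOCK_DGRAM, 9000);         /* UDP port 9000: different namespace, so no clash */
    printf("tcp bind %s, udp bind %s\n",
           tcp_fd >= 0 ? "ok" : "failed",
           udp_fd >= 0 ? "ok" : "failed");
    if (tcp_fd >= 0) close(tcp_fd);
    if (udp_fd >= 0) close(udp_fd);
    return 0;
}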
If, however, you want to have two applications on the same TCP <ip-address>:<port> combination, then the answer is no. (An interesting exercise would be to launch two VMs, give them the same IP address but different MAC addresses, and see what happens - you will notice that sometimes VM1 gets the packets and other times VM2 gets them, depending on ARP cache refreshes.)
I feel that by making two applications run on the same <ip-address>:<port> you want to achieve some kind of load balancing. For this you can run the applications on different ports and write iptables rules to split the traffic between them.
Also see #user6169806's answer.