Are "protocols" just a human alias for *nix ports? - sockets

I'm reading the Beej guide to network programming, and I came across this:
int getaddrinfo(const char *node,     // e.g. "www.example.com" or IP
                const char *service,  // e.g. "http" or port number
                const struct addrinfo *hints,
                struct addrinfo **res);
You give this function three input parameters, and it gives you a
pointer to a linked-list, res, of results.
The node parameter is the host name to connect to, or an IP address.
Next is the parameter service, which can be a port number, like "80",
or the name of a particular service (found in The IANA Port List or
the /etc/services file on your Unix machine) like "http" or "ftp" or
"telnet" or "smtp" or whatever.
Are Unix ports and protocols the same thing? For example, is https the same as port 443, and http the same as port 80?

Quick
Ports and protocols are different things, but a protocol usually has a default port; for example, the default port for an HTTP web server is 80.
Explanation
A port is a TCP/IP (transport-level) entity: it is the numbered endpoint to which network data is sent.
A protocol is an application-level entity: it is the language the client and server use to communicate.
Basically you can speak any protocol over any port, as long as the server and client agree on both. You could speak HTTP over port 12345 and, vice versa, use port 80 to speak FTP.
For example, when you type stackoverflow.com in your browser, the browser first adds http:// before the host name and then assumes :80 as the default port for HTTP, so the URL actually accessed is http://stackoverflow.com:80/. If you type https://stackoverflow.com, the browser automatically uses :443, and so on.
But if you try to open https://stackoverflow.com:80/ you will get an error, because the server on that port expects plain HTTP, not HTTPS.
Moreover, you can configure your own server to use different ports, and then you have to give the protocol AND the port for each request, for example: http://example.com:12345/ or https://example.com:54321/
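To illustrate speaking HTTP over a port other than 80, here is a minimal Python sketch. It assumes only the standard library; the handler name HelloHandler and the function fetch_from_nonstandard_port are made up for this example. The server binds to port 0 on the loopback address, which asks the OS for any free port, and the client then talks ordinary HTTP to whatever port was chosen.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET with a tiny plain-text body (illustrative only)."""

    def do_GET(self):
        body = b"hello over HTTP\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_from_nonstandard_port():
    # Port 0 asks the OS for any free port, so this is HTTP on "not 80".
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The client must be told the port explicitly, as in http://host:port/
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return port, status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    port, status, body = fetch_from_nonstandard_port()
    print(f"HTTP {status} from port {port}: {body!r}")
```

The protocol is the same in every run; only the port number changes, which is why the client has to be given it.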

Services such as "https" and "http" have default ports, in this case 443 and 80. So they do correspond to ports. I would, however, not call them "the same thing". You could, for example, speak the http protocol over a different port if you wish. When you type http://something:8080 in your browser, for example, you are doing http over port 8080. In the case of getaddrinfo, it allows you to use the name of the service and will internally use the default port number as determined by a configuration file (e.g., /etc/services on some Unix systems).
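A rough Python sketch of that name-to-port lookup follows. socket.getservbyname is the standard-library call that consults the system services database. The helper resolve_service and the FALLBACK_PORTS table are assumptions added for this example, so that it still runs on a host that lacks /etc/services.

```python
import socket

# Illustrative fallback for hosts without a services database; not exhaustive.
FALLBACK_PORTS = {"http": 80, "https": 443, "ftp": 21, "smtp": 25, "telnet": 23}


def resolve_service(service, proto="tcp"):
    """Turn a service string such as "http" or "8080" into a port number."""
    if service.isdigit():
        return int(service)  # already a port number, nothing to look up
    try:
        # Consults the system services database (e.g. /etc/services).
        return socket.getservbyname(service, proto)
    except OSError:
        # Assumption for this sketch: fall back to a small built-in table.
        return FALLBACK_PORTS[service]


if __name__ == "__main__":
    for name in ("http", "https", "8080"):
        print(name, "->", resolve_service(name))
```

The service name is only a key into a table; the connection itself is always made to a number.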

No. A port is a number, a protocol is a specification. They are associated, for convenience, but not identical.
And ports are part of TCP/IP, not Unix. The entire question is basically just a category mistake.

Related

Docker: run multiple container on same tcp ports with different hostname

Is there a way to run multiple docker containers on the same ports? For example, I have used the ports 80/443 (HTTP), 3306 (TCP/MySQL) and 22 (TCP/SSH) in my docker-compose file. Now I want to run this docker-compose for different hostnames on the same ip address on my machine.
- traffic from example1.com (default public ip) => container1
- traffic from example2.com (default public ip) => container2
I have already found a solution only for the HTTP traffic by using an additional nginx/haproxy as a proxy on my machine. But unfortunately, this can't handle other TCP ports.
This isn't possible in the general (non-HTTP) case.
At a lower level, if I connect to 10.20.30.40:3306, the Linux kernel selects a single process that's listening on that port and sends the request there. You're not allowed to bind(2) a second process to the same port. (This is also why you get an error if you try to docker run -p picking a host port that's already in use.)
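The bind(2) restriction can be seen in a few lines of Python. This is a sketch under default socket options (no SO_REUSEADDR or SO_REUSEPORT), and try_double_bind is a name invented for the example.

```python
import errno
import socket


def try_double_bind():
    """Bind two TCP sockets to one address and port; return the second bind's errno."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        first.listen()
        taken = first.getsockname()       # (address, port) now in use
        try:
            second.bind(taken)            # same address and port again
        except OSError as exc:
            return exc.errno              # typically errno.EADDRINUSE
        return None                       # bind unexpectedly succeeded
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    code = try_double_bind()
    print("second bind failed with:", errno.errorcode.get(code, code))
```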
In the case of HTTP, there's the further detail that the host-name part of the URL is also sent in an HTTP Host: header: the Web browser both does a DNS lookup for e.g. stackoverflow.com and connects to its IP address, and also sends a Host: stackoverflow.com HTTP header. That's the specific mechanism that lets you run a proxy on port 80, and then forward to some other backend service via a virtual-host setup.
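As a sketch of how such a proxy might choose a backend, the following Python reads the Host: header from a raw request and looks it up in a table. The BACKENDS mapping, its addresses, and the function names are hypothetical; a real proxy such as nginx or haproxy does considerably more (and this simple parser does not handle IPv6 literals).

```python
# Hypothetical virtual-host table: Host header value -> backend address.
BACKENDS = {
    "example1.com": ("127.0.0.1", 8001),
    "example2.com": ("127.0.0.1", 8002),
}


def host_header(raw_request: bytes):
    """Return the lower-cased Host header value (without any :port), or None."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:          # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().split(":")[0].lower()
    return None


def pick_backend(raw_request: bytes):
    """Choose a backend by Host header; None when the host is unknown."""
    return BACKENDS.get(host_header(raw_request))


if __name__ == "__main__":
    request = b"GET / HTTP/1.1\r\nHost: example2.com\r\n\r\n"
    print(pick_backend(request))
```

Both requests arrive on the same IP address and port; the only thing that tells them apart is a field inside the HTTP request itself.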
That mechanism is very specific to HTTP, though, and doesn't work for other protocols that don't have support for it. I don't think either MySQL or ssh have similar mechanisms in their wire protocol.
(In the particular situation you describe this is probably relatively easy to handle. You wouldn't want to make either your internal database or an sshd visible publicly, so delete their ports: from your docker-compose.yml file, and then just worry about proxying the HTTP service. It's pretty unusual and a complex setup to run sshd in Docker so you also might remove that and simplify your stack a little.)

SIP over double nat

I'm developing a SIP parser in C (client only) and I'm unsure whether I need to bind the socket to a specific port (5060) when behind a double NAT. I'm sure this matters on the server side, but I'm not sure about the client side.
You don't have to use port 5060 on the client side, regardless of the NAT type. There is no disadvantage to picking a random port. The only recommendation is that once you have picked a port, you keep it across sessions; this helps NAT traversal a bit in some circumstances and avoids filling the NAT's tables with many different bindings.
Even on the server side you can use any port, but there is a big disadvantage: if you are not using the standard port 5060, users also need to type the port as part of the server address (yourdomain:port).
Think of it like HTTP. On the web server the standard port is 80, yet none of the clients (web browsers) use port 80 on the client side.
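The point that the client port can be arbitrary can be sketched with two UDP sockets on the loopback interface. This is not SIP, only a stand-in echo exchange, and udp_round_trip is a name invented here. The "server" never needs to know the client's port in advance; it reads it from the incoming datagram.

```python
import socket


def udp_round_trip():
    """Send one datagram from an OS-chosen client port and read the echo."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))   # stand-in for the well-known server port
        server.settimeout(5)
        client.bind(("127.0.0.1", 0))   # client: any free port will do
        client.settimeout(5)
        client_port = client.getsockname()[1]

        client.sendto(b"ping", server.getsockname())
        data, source = server.recvfrom(1024)
        # The server learns the client's port from the packet itself
        # and simply replies to that source address.
        server.sendto(data, source)
        reply, _ = client.recvfrom(1024)
        return client_port, source[1], reply
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    chosen, seen_by_server, reply = udp_round_trip()
    print(f"client used port {chosen}, server saw {seen_by_server}, reply {reply!r}")
```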

socket programming - why web server still using listen port 80 to communicate with client even after they accepted the connection?

Usually a web server listens for incoming connections on port 80. So my question is: shouldn't the general concept of socket programming be that port 80 is only for listening for incoming connections, and that after the server has accepted a connection it uses another port, e.g. port 12345, to communicate with the client? But when I look in Wireshark, the server is always using port 80 during the communication. I am confused here.
So what about https://www.facebook.com:443, which has hundreds of thousands of connections to it every second? Is it possible for a single port to handle such a large amount of traffic?
A particular socket is uniquely identified by a 5-tuple (i.e. a list of 5 particular properties.) Those properties are:
Source IP Address
Destination IP Address
Source Port Number
Destination Port Number
Transport Protocol (usually TCP or UDP)
These parameters must be unique for sockets that are open at the same time. Where you're probably getting confused here is what happens on the client side vs. what happens on the server side in TCP. Regardless of the application protocol in question (HTTP, FTP, SMTP, whatever,) TCP behaves the same way.
When you open a socket on the client side, it will select a random high-number port for the new outgoing connection. This is required, otherwise you would be unable to open two separate sockets on the same computer to the same server. Since it's entirely reasonable to want to do that (and it's very common in the case of web servers, such as having stackoverflow.com open in two separate tabs) and the 5-tuple for each socket must be unique, a random high-number port is used as the source port. However, each of those sockets will connect to port 80 at stackoverflow.com's webserver.
On the server side of things, stackoverflow.com can already distinguish between those two different sockets from your client, again, because they already have different client-side port numbers. When it sees an incoming request packet from your browser, it knows which of the sockets it has open with you to respond to because of the different source port number. Similarly, when it wants to send a response packet to you, it can send it to the correct endpoint on your side by setting the destination port number to the client-side port number it got the request from.
The bottom line is that it's unnecessary for each client connection to have a separate port number on the server's side because the server can already uniquely identify each client connection by its client IP address and client-side port number. This is the way TCP (and UDP) sockets work regardless of application-layer protocol.
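This can be checked on a single machine with a short Python sketch. The function name two_connections_one_port is made up for the example, and everything runs over the loopback address. Two clients connect to one listening port, and the accepted sockets are then inspected.

```python
import socket


def two_connections_one_port():
    """Open two client connections to one listening port; return both 4-tuples."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen()
    server_addr = listener.getsockname()
    clients, accepted = [], []
    try:
        for _ in range(2):
            # Each client gets its own OS-assigned source port.
            clients.append(socket.create_connection(server_addr, timeout=5))
            conn, _peer = listener.accept()
            accepted.append(conn)
        # (local address, local port, remote address, remote port) per accepted socket
        tuples = [c.getsockname() + c.getpeername() for c in accepted]
        return server_addr[1], tuples
    finally:
        for sock in clients + accepted + [listener]:
            sock.close()


if __name__ == "__main__":
    port, tuples = two_connections_one_port()
    print("listening port:", port)
    for t in tuples:
        print(t)
```

Both accepted sockets report the listening port as their local port; only the remote (client-side) port differs, and that is enough to keep the connections apart.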
Shouldn't the general concept of socket programming be that port 80 is only for listening for incoming connections, and that after the server has accepted a connection it uses another port, e.g. port 12345, to communicate with the client?
No.
But when I look in Wireshark, the server is always using port 80 during the communication.
Yes.
I am confused here.
Only because your 'general concept' isn't correct. An accepted socket uses the same local port as the listening socket.
So what about https://www.facebook.com:443, which has hundreds of thousands of connections to it every second? Is it possible for a single port to handle such a large amount of traffic?
A port is only a number. It isn't a physical thing. It isn't handling anything. TCP is identifying connections based on the tuple {source IP, source port, target IP, target port}. There's no problem as long as the entire tuple is unique.
Ports are a virtual concept, not a hardware resource. It's no harder to handle 10,000 connections on 1 port than 1 connection on each of 10,000 ports (it's probably much faster, even).
Not all servers are web servers listening on port 80, nor do all servers maintain lasting connections. HTTP in particular is a stateless protocol.
Your suggestion to open a new port for further communication is exactly what happens with the data connection in the FTP protocol, but as you have seen, this is not necessary.
Ports are not a physical concept; they exist in a standardised form to allow multiple servers to be reachable on the same host without specialised multiplexing software. Such software does still exist, but for entirely different reasons (see: sshttp). What you see as a response from the server on port 80, the server sees as a reply to you on a not-so-random port the OS assigned your connection.
When a listening server socket accepts a TCP connection, a function such as Socket java.net.ServerSocket.accept() returns a new communication socket whose local port number is the same as the port passed to java.net.ServerSocket.ServerSocket(int port).

Using port 80 for non http

Is it possible to use port 80 for non-HTTP traffic? For example, I'm making a small script that will communicate with a friend's computer over the internet, but they must set up port forwarding to get past the router. Is there a problem with using port 80 in the script so that it will be let through automatically? Is there some part of this I don't understand that will not let non-HTTP data through? Please explain :)
There is no problem doing that. In fact, Skype's default behaviour is to use port 80 and port 443 to transport voice!
There are a lot of ISPs that actually block port 80, so you might want to try a different port if you are having a problem (still needs to be forwarded)
The firewall on the computer also needs to be set to allow the incoming traffic.
This will work fine, but your friend may still need to set up port forwarding.
If your friend's PC is the one listening on port 80, he will need to set up port forwarding. Otherwise, how would the router/NAT know which computer in the house to pass the connection to?
But if your friend's PC is the one making the outbound connection, then likely no port forwarding is needed at all on his end.
In other words, port forwarding (for TCP) is only for inbound connections. The router/NAT will automatically set up a port mapping for outbound connections (as it does with all web traffic).
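The difference between the two directions can be modelled with a toy table in Python. This is a deliberately simplified sketch, not how any particular router works: the class ToyNat, its methods, the private addresses, and the starting port 40000 are all invented for illustration.

```python
class ToyNat:
    """A deliberately simplified NAT table; real routers track far more state."""

    def __init__(self):
        self.mappings = {}      # public port -> (inside host, inside port)
        self.next_port = 40000  # arbitrary starting point for this sketch

    def outbound(self, inside_host, inside_port):
        """An inside host connects out: a mapping is created automatically."""
        public_port = self.next_port
        self.next_port += 1
        self.mappings[public_port] = (inside_host, inside_port)
        return public_port

    def forward(self, public_port, inside_host, inside_port):
        """A manual port-forwarding rule, as configured on the router."""
        self.mappings[public_port] = (inside_host, inside_port)

    def inbound(self, public_port):
        """Where does a packet arriving on public_port go? None means dropped."""
        return self.mappings.get(public_port)


if __name__ == "__main__":
    nat = ToyNat()
    print(nat.inbound(80))                  # no rule yet, nowhere to send it
    nat.forward(80, "192.168.1.10", 80)     # the friend's PC listens on port 80
    print(nat.inbound(80))
    port = nat.outbound("192.168.1.11", 51000)
    print(port, nat.inbound(port))          # replies find their way back
```

An unsolicited inbound packet has no entry to match until someone adds a forwarding rule, whereas an outbound connection creates its own entry.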

why webservers use port 80 for real applications?

Just curious. When developing with the Cassini development server, one has a practically unlimited choice of ports. But production servers seem to give particular importance to port 80.
Has that to do with a technical requirement, a convention, or both? I've checked the web but haven't been able to find a clear response so far.
Thanks for helping.
Many services have specifically assigned ports. This allows users to type, for example, http://stackoverflow.com and get the website for SO without needing to enter a port as well. This isn't a technical requirement; however, using a different port requires the user to know an extra piece of information, which must be entered into the URL every time.
When you connect to a server via TCP/IP, you specify the particular port you connect to. You do not connect to a server and hope that it guesses which port you would like to talk to.
So in most cases you tell the browser to use the http protocol, say "http://example.com/", and the browser then uses the default port number assigned to that protocol to connect to the server "example.com". In this case the port is 80. If you instead specify "https://example.com/", the browser looks up the default port for https and connects to port 443.
So unless you want to tell every one of your users to specify some non-default port for your service (say "http://example.com:60765/"), you had better use the default one.
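The browser's choice of port can be approximated in a few lines of Python. This is a sketch only: urlsplit is the standard-library URL parser, while the DEFAULT_PORTS table and the function effective_port are assumptions made for the example and cover just the two schemes mentioned here.

```python
from urllib.parse import urlsplit

# Well-known defaults for the two schemes discussed here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port               # an explicit ":port" in the URL wins
    return DEFAULT_PORTS[parts.scheme]  # otherwise use the scheme's default


if __name__ == "__main__":
    for url in ("http://example.com/",
                "https://example.com/",
                "http://example.com:60765/"):
        print(url, "->", effective_port(url))
```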
BTW there is a mechanism for reaching a service by its name rather than by a port number (TCPMUX, RFC 1078, a daemon listening on port 1 of the service's host), but this method seems to be rarely used (if at all).
See also other answers: default port numbers are assigned by IANA.
It's a convention: you can use whatever port you feel like. You can look at the evolution of the RFCs to see when the convention became official (http://www.faqs.org/rfcs/rfc1700.html)
You can see in the RFC 1060 (http://www.faqs.org/rfcs/rfc1060.html ) that it's the ISO Internet Protocol :)
In a production environment your web server is embedded in a server infrastructure (firewalls, proxies) protecting you against attacks from the internet. In such an environment port 80 is normally open for HTTP traffic. If you use this port there is no need to configure your server infrastructure.