Windows Tracert tcp - sockets

I'm trying to implement tracert using tcp on Windows. I'm using Winsock. Socket I use is SOCK_STREAM.
The problem is how do I get the address of the host with the next TTL. As far as I get I cannot use recvfrom function in this case because TCP is a connection based protocol so recv is equal is recvfrom in this context.
I tried to use getpeername but I still get only target node's IP address.
Moreover. Setting even TTL = 0 to the IP packet still results finds its way to the target machine and I get the response.

tracert (or traceroute) doesn't work with TCP but with ICMP (like ping). The TTL should be started by 1 and then incremented by 1 until the destination was reached.
More can be found on http://en.wikipedia.org/wiki/Traceroute#Implementation

Related

How does a TCP server handle multiple Sockets to listening to the same port? [duplicate]

This might be a very basic question but it confuses me.
Can two different connected sockets share a port? I'm writing an application server that should be able to handle more than 100k concurrent connections, and we know that the number of ports available on a system is around 60k (16bit). A connected socket is assigned to a new (dedicated) port, so it means that the number of concurrent connections is limited by the number of ports, unless multiple sockets can share the same port. So the question.
TCP / HTTP Listening On Ports: How Can Many Users Share the Same Port
So, what happens when a server listen for incoming connections on a TCP port? For example, let's say you have a web-server on port 80. Let's assume that your computer has the public IP address of 24.14.181.229 and the person that tries to connect to you has IP address 10.1.2.3. This person can connect to you by opening a TCP socket to 24.14.181.229:80. Simple enough.
Intuitively (and wrongly), most people assume that it looks something like this:
Local Computer | Remote Computer
--------------------------------
<local_ip>:80 | <foreign_ip>:80
^^ not actually what happens, but this is the conceptual model a lot of people have in mind.
This is intuitive, because from the standpoint of the client, he has an IP address, and connects to a server at IP:PORT. Since the client connects to port 80, then his port must be 80 too? This is a sensible thing to think, but actually not what happens. If that were to be correct, we could only serve one user per foreign IP address. Once a remote computer connects, then he would hog the port 80 to port 80 connection, and no one else could connect.
Three things must be understood:
1.) On a server, a process is listening on a port. Once it gets a connection, it hands it off to another thread. The communication never hogs the listening port.
2.) Connections are uniquely identified by the OS by the following 5-tuple: (local-IP, local-port, remote-IP, remote-port, protocol). If any element in the tuple is different, then this is a completely independent connection.
3.) When a client connects to a server, it picks a random, unused high-order source port. This way, a single client can have up to ~64k connections to the server for the same destination port.
So, this is really what gets created when a client connects to a server:
Local Computer | Remote Computer | Role
-----------------------------------------------------------
0.0.0.0:80 | <none> | LISTENING
127.0.0.1:80 | 10.1.2.3:<random_port> | ESTABLISHED
Looking at What Actually Happens
First, let's use netstat to see what is happening on this computer. We will use port 500 instead of 80 (because a whole bunch of stuff is happening on port 80 as it is a common port, but functionally it does not make a difference).
netstat -atnp | grep -i ":500 "
As expected, the output is blank. Now let's start a web server:
sudo python3 -m http.server 500
Now, here is the output of running netstat again:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
So now there is one process that is actively listening (State: LISTEN) on port 500. The local address is 0.0.0.0, which is code for "listening for all ip addresses". An easy mistake to make is to only listen on port 127.0.0.1, which will only accept connections from the current computer. So this is not a connection, this just means that a process requested to bind() to port IP, and that process is responsible for handling all connections to that port. This hints to the limitation that there can only be one process per computer listening on a port (there are ways to get around that using multiplexing, but this is a much more complicated topic). If a web-server is listening on port 80, it cannot share that port with other web-servers.
So now, let's connect a user to our machine:
quicknet -m tcp -t localhost:500 -p Test payload.
This is a simple script (https://github.com/grokit/quickweb) that opens a TCP socket, sends the payload ("Test payload." in this case), waits a few seconds and disconnects. Doing netstat again while this is happening displays the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:54240 ESTABLISHED -
If you connect with another client and do netstat again, you will see the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:26813 ESTABLISHED -
... that is, the client used another random port for the connection. So there is never confusion between the IP addresses.
A server socket listens on a single port. All established client connections on that server are associated with that same listening port on the server side of the connection. An established connection is uniquely identified by the combination of client-side and server-side IP/Port pairs. Multiple connections on the same server can share the same server-side IP/Port pair as long as they are associated with different client-side IP/Port pairs, and the server would be able to handle as many clients as available system resources allow it to.
On the client-side, it is common practice for new outbound connections to use a random client-side port, in which case it is possible to run out of available ports if you make a lot of connections in a short amount of time.
A connected socket is assigned to a new (dedicated) port
That's a common intuition, but it's incorrect. A connected socket is not assigned to a new/dedicated port. The only actual constraint that the TCP stack must satisfy is that the tuple of (local_address, local_port, remote_address, remote_port) must be unique for each socket connection. Thus the server can have many TCP sockets using the same local port, as long as each of the sockets on the port is connected to a different remote location.
See the "Socket Pair" paragraph in the book "UNIX Network Programming: The sockets networking API" by
W. Richard Stevens, Bill Fenner, Andrew M. Rudoff at: http://books.google.com/books?id=ptSC4LpwGA0C&lpg=PA52&dq=socket%20pair%20tuple&pg=PA52#v=onepage&q=socket%20pair%20tuple&f=false
Theoretically, yes. Practice, not. Most kernels (incl. linux) doesn't allow you a second bind() to an already allocated port. It weren't a really big patch to make this allowed.
Conceptionally, we should differentiate between socket and port. Sockets are bidirectional communication endpoints, i.e. "things" where we can send and receive bytes. It is a conceptional thing, there is no such field in a packet header named "socket".
Port is an identifier which is capable to identify a socket. In case of the TCP, a port is a 16 bit integer, but there are other protocols as well (for example, on unix sockets, a "port" is essentially a string).
The main problem is the following: if an incoming packet arrives, the kernel can identify its socket by its destination port number. It is a most common way, but it is not the only possibility:
Sockets can be identified by the destination IP of the incoming packets. This is the case, for example, if we have a server using two IPs simultanously. Then we can run, for example, different webservers on the same ports, but on the different IPs.
Sockets can be identified by their source port and ip as well. This is the case in many load balancing configurations.
Because you are working on an application server, it will be able to do that.
I guess none of the answers tells every detail of the process, so here it goes:
Consider an HTTP server:
It asks the OS to bind the port 80 to one or many IP addresses (if you choose 127.0.0.1, only local connections are accepted. You can choose 0.0.0.0 to bind to all IP addresses (localhost, local network, wide area network, both IP versions)).
When a client connects to that port, it WILL lock it up for a while (that's why the socket has a backlog: it queues a number of connection attempts, because they ARE NOT instantaneous).
The OS then chooses a random port and transfer that connection to that port (think of it as a temporary port that will handle all the traffic from now on).
The port 80 is then released for the next connection (first, it will accept the first one in the backlog).
When client or server disconnects, the random port is held open for a while (CLOSE_WAIT in the remote side, TIME_WAIT in the local side). That allows flushing some lost packets along the path. The default time for that state is 2 * MSL seconds (and it WILL consume memory while is waiting).
After that waiting, that random port is free again to receive other connections.
So, TCP cannot even share a port amongst two IP's!
No. It is not possible to share the same port at a particular instant. But you can make your application such a way that it will make the port access at different instant.
Absolutely not, because even multiple connections may shave same ports but they'll have different IP addresses

Obtaining the source IP and port of an INADDR_ANY client socket before the TCP three-way handshake?

I'm on Windows 7, using bind before connect with SO_REUSEADDR, and setting the local address structure to IP address INADDR _ANY and port 0 (zero), in order to let the operating system select the source details for a client socket.
Firstly, I've read that it's not possible to get the source IP before connecting to the server, since it's being chosen at this point and several addresses can be valid. But the port is selected before the connection, so is there a way to get it? (getsockname() looks like to not work).
Secondly, about the source IP, is there a way to get it before a packet is sent to the server? I need the specific time between the moment the OS selected the source IP and the moment it starts the three-way handshake. The connect() function dominates the both.
I'm on Windows 7, using bind before connect with SO_REUSEADDR
Why are you using SO_REUSEADDR in this situation? You don't need it, and it makes no sense for what you are attempting. SO_REUSEADDR should typically only be used for a listening socket, not a connecting socket.
setting the local address structure to IP address INADDR _ANY and port 0 (zero), in order to let the operating system select the source details for a client socket.
It is meaningless to bind() a client socket to INADDR_ANY:0. You can (and should) omit such a bind() completely and leave the socket unbound until connect() is called. The only time you should ever bind() a client socket is if you want to bind it to a specific local IP and/or Port. But you are not doing that in this situation, so get rid of it.
Firstly, I've read that it's not possible to get the source IP before connecting to the server, since it's being chosen at this point and several addresses can be valid.
Correct, unless you bind() to a specific source IP.
But the port is selected before the connection
Both source IP and source port are selected by connect() if the socket is unbound, or bound to source IP INADDR_ANY and/or source port 0. So you have no opportunity to query either value before connect() has selected them.
so is there a way to get it? (getsockname() looks like to not work).
getsockname() is exactly what you need. Just make sure you are not calling it until connect() has successfully connected to the server first. This is stated in the getsockname() documentation:
This call is especially useful when a connect call has been made without doing a bind first; the getsockname function provides the only way to determine the local association that has been set by the system.
...
The getsockname function does not always return information about the host address when the socket has been bound to an unspecified address, unless the socket has been connected with connect or accept (for example, using ADDR_ANY). A Windows Sockets application must not assume that the address will be specified unless the socket is connected. The address that will be used for the socket is unknown unless the socket is connected when used in a multihomed host. If the socket is using a connectionless protocol, the address may not be available until I/O occurs on the socket.
Secondly, about the source IP, is there a way to get it before a packet is sent to the server?
For TCP, you can retrieve the selected source IP using getsockname() immediately after connect() has successfully connected to the server. Not before.
I need the specific time between the moment the OS selected the source IP and the moment it starts the three-way handshake.
It is not possible to determine that detail. No socket application should ever need that detail. Why do you need it?
I'm on Windows 7, using bind before connect with SO_REUSEADDR, and setting the local address structure to IP address INADDR _ANY and port 0 (zero), in order to let the operating system select the source details for a client socket.
Why? That's exactly what happens if you don't call bind() at all, during connect(). Binding a client socket to INADDR_ANY isn't correct in any case. Setting SO_REUSEADDR doesn't make sense either without specifying a non-zero port number. Just remove all this.
Firstly, I've read that it's not possible to get the source IP before connecting to the server, since it's being chosen at this point and several addresses can be valid.
Correct.
But the port is selected before the connection, so is there a way to get it?
Yes. getsockname().
(getsockname() looks like to not work)
Doesn't work how?
Secondly, about the source IP, is there a way to get it before a packet is sent to the server?
You can get it with getsockname() as soon as connect() has succeeded, but this involves sending packets to the server.
I need the specific time between the moment the OS selected the source IP and the moment it starts the three-way handshake. The connect() function dominates the both.
Bad luck.

SSH session - fixed port on the client side

Is it possible to set the fixed port on the client side of the connection?
I connect to the ssh-server using port 22 and the client socket is getting random port to identify the session. An example (output from netstat -atn)
tcp4 0 0 <server>.22 <client>.54117 ESTABLISHED
In this example, client gets port 54117. For the test purposes, I'd like a fixed port to be assigned for the client, let's say 40185.
So I'd love the following output:
tcp4 0 0 <server>.22 <client>.40185 ESTABLISHED
Is it even possible?
You can do it programmaticaly, but the ssh(1) command doesn't allow to do that. The main reason is that you let the kernel select the client port, so you can open more than one ssh(1) session to the same server from different source ports in the same client machine. If you fix the port number in the client and the server, you cannot distinguish the packets belonging to one connection from the ones belonging to the other (same protocol, tcp, same source address, same dest address, same source port and same destination port)
To do it programmaticaly in a client and fix the local port, just call bind(2) system call to fix it, before doing the connect(2) system call (as the server does just before the accept(2) system call)
Be careful in that you cannot have two connections with the same five parameters (source add, source port, tcp protocol, dest port, dest addr)

Communication protocols in UDP

After many hours, I have discovered that the given udp server needs the following steps for a successful communication:
1- Send "Start Message" on a given port
2- Wait to receive from server on any port
3- Then the port dedicated to you to send further data to the server equals the port you have received on it + 1
So I am asking if this kind is a known protocol/handshaking, or it is only special to this server??
PS: All above communication were in udp sockets in C#
PS: Related to a previous question: About C# UDP Sockets
Thanks
There's no special "handshake" for UDP -- each UDP service, if it needs one, specifies its own. Usually, though, a server doesn't expect the client to be able to listen on all of its ports simultaneously. If you mean that the client expects a message from any port on the server, to the port the client sent the start message from, then that makes a lot more sense -- and is very close to how TFTP works. (The only difference i'm seeing so far, is that TFTP doesn't do the "+ 1".)
The server is, effectively, listening on a 'well known port' and then switching subsequent communications to a dedicated port per client. Requiring the client to send to the port + 1 is a little strange
Client 192.168.0.1 - port 12121 ------------------------> Server 192.168.0.2 - port 5050
Client 192.168.0.1 - port 12121 <------------------------ Server 192.168.0.2 - port 23232
Client 192.168.0.1 - port 12121 ------------------------> Server 192.168.0.2 - port 23232 + 1
<------------------------ Server 192.168.0.2 - port 23232
------------------------> Server 192.168.0.2 - port 23232 + 1
The server probably does this so that it doesn't have to demultiplex the inbound client data based on the client's address/port. Doing it this way is a little more efficient (generally) and also has some advantages, depending on the design of the server, as on the server there's a 'dedicated' socket for you which means that if they're doing overlapped I/O then the socket stays the same for the whole period of communications with you which can make it easier and more efficient to associate data with the socket (this way they can probably avoid any lookups or locking to process each datagram). Anyway, enough of that (see here, if you want to know why I do it that way).
From your point of view as a client (and I'm assuming async sockets here) you need to first Bind() your local socket (just use INADDR_ANY and 0 to allow the OS to pick the port for you) then issue a RecvFrom() on the socket (so there's no race between you sending data to the server on this socket and it sending you data back before you issue a recv). Then issue a SendTo() to the 'well known port' of the server. The server will then send you back some data and your RecvFrom() will return you the data and the address that the server sent to you from. You can then take that address, add one to the port, store that address and from then on issue SendTo()s to that new sending address whilst continuing to issue RecvFrom()s for reading the server's data; or you could do something clever with Connect() to bind the remote end of the socket to the server's 'send to address' and simply use Write() and RecvFrom() from then on.

How is it possible to run a traceroute-like program without needing root privileges?

I have seen another program provide traceroute functionality within it but without needing root (superuser) privileges? I've always assumed that raw sockets need to be root, but is there some other way? (I think somebody mentioned "supertrace" or "tracepath"?) Thanks!
Ping the target, gradually increasing the TTL and watching where the "TTL exceeded" responses originate.
Rather than using raw sockets, some applications use a higher numbered tcp or udp port. By directing that tcp port at port 80 on a known webserver, you could traceroute to that server. The downside is that you need to know what ports are open on a destination device to tcpping it.
ping and traceroute use the ICMP protocol. Like UDP and TCP this is accessible through the normal sockets API. Only UDP and TCP port numbers less than 1024 are protected from use, other than by root. ICMP is freely available to all users.
If you really want to see how ping and traceroute work you can download an example C code implementation for them from CodeProject.
In short, they simple open an ICMP socket, and traceroute alters the increments the TTL using setsockopt until the target is reached.
You don't need to use raw sockets to send and receive ICMP packets. At least not on Windows.
If you have a modern Linux distro you can look at the source for traceroute (or tracepath, which came about before traceroute went no setuid) and tcptraceroute. None of those require RAW sockets -- checked on Fedora 9, they aren't setuid and work with default options for the normal user.
Using the code that tcptraceroute does might be esp. useful, as ICMP packets to an address will not necessarily end up at the same place as a TCP connection to port 80, for example.
Doing an strace of traceroute (as a normal user) shows it doing something like:
int opt_on = 1;
int opt_off = 0;
fd = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP)
setsockopt(fd, SOL_IP, IP_MTU_DISCOVER, &opt_off, sizeof int)
setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &opt_on, sizeof int)
setsockopt(fd, SOL_IP, IP_RECVTTL, &opt_on, sizeof int)
...and then reading the data out of the CMSG results.