Summary:
I am guessing that the issue here is something to do with how Windows and Linux handle TCP connections or sockets, but I have no idea what it is. I'm initiating a TCP connection to a piece of custom hardware that someone else has developed, and I am trying to understand its behaviour. In doing so, I've created a .NET Core 2.2 application; run on a Windows system, I can initiate the connection successfully, but on Linux (latest Raspbian), I cannot.
It appears that it may be because Linux systems do not retry/retransmit a SYN after receiving a RST, whereas Windows ones do - and this behaviour seems key to how this peculiar piece of hardware works.
Background:
We have a black box piece of hardware that can be controlled and queried over a network using a manufacturer-supplied Windows application. Data is unencrypted, no authentication is required to connect, and the application has some other issues. Ultimately, we want to be able to relay data from it to another system, so we decided to make our own application.
I've spent quite a long time trying to understand the packet format and have created a library, which targets .NET Core 2.2, that can be used to successfully communicate with this kit. In doing so, I discovered that the device seems to require a kind of "request to connect" command to be sent via UDP. Straight afterwards, I am able to initiate a TCP connection on port 16000, although the first TCP attempt always results in a RST,ACK being returned - so a second attempt needs to be made.
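Roughly, the sequence looks like this (sketched in Python rather than .NET for brevity; the UDP port and payload here are placeholders, as the real "request to connect" format is device-specific - only TCP port 16000 comes from my observations):

```python
import socket

def request_then_connect(host, udp_port, tcp_port, udp_payload, timeout=2.0):
    """Send the UDP 'request to connect' datagram, then open the TCP session.
    udp_port and udp_payload stand in for the device-specific request."""
    # Step 1: fire the UDP "request to connect" at the device.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.sendto(udp_payload, (host, udp_port))

    # Step 2: straight afterwards, initiate the TCP connection (port 16000
    # on the real device); the first attempt may be answered with a RST.
    return socket.create_connection((host, tcp_port), timeout=timeout)
```

Against the real device the second step is where the Windows/Linux difference shows up.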
What I've developed works absolutely fine on both Windows (x86) and Linux (Raspberry Pi/ARM) systems, and I can send and receive data. However, when run on the Raspbian system, there now seem to be problems when initiating the TCP connection. I could have sworn that we had it working absolutely fine on a previous build, but none of the previous commits seem to work - so it may well be a system/kernel update that has changed something.
The issue:
When initiating a TCP connection to this device, it will - straight away - reset the connection. It does this even with the manufacturer-supplied software, which then immediately re-attempts the connection, and the second attempt succeeds; so this reset-once-then-it-works-the-second-time behaviour in itself isn't a "problem" that I have any control over.
What I am trying to understand is why a Windows system immediately re-attempts the connection through a retransmission...
...but the Linux system just gives up after one attempt (this is the end of the packet capture).
To prove it is not an application-specific issue, I've tried using ncat/netcat on both the Windows system and the Raspbian system, as well as on a Kali system on a separate laptop to prove it isn't an ARM/Raspberry Pi issue. Since the UDP "request" hasn't been sent, the connection will never succeed anyway, but this simply demonstrates the different behaviour between the OSes.
Linux versions look pretty much the same as above, whereby they send a single packet that gets reset - whereas the Windows attempt demonstrates the multiple retransmissions.
So, does anyone have an answer for this behaviour difference? I am guessing it isn't a .NET Core specific issue, but is there any way I can set socket options to attempt a retransmission? Or can it be set at the OS level with sysctl or something? I did try and see if there are any SocketOptionNames in .NET that look like they'd control attempts/retries, as this answer had me wonder, but no luck so far.
If anyone has any suggestions as to how to better align this behaviour across platforms, or can explain the reason for this difference at all, I would very much appreciate it!
Nice find! According to this, Windows' TCP will retry a connection if it receives a RST/ACK from the remote host after sending a SYN:
... Upon receiving the ACK/RST from the target host, the client determines that there is indeed no service listening there. In the Microsoft Winsock implementation of TCP, a pending connection will keep attempting to issue SYN packets until a maximum retry value is reached (set in the registry, this value defaults to 3 extra times)...
The value used to limit those retries is set in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpMaxConnectRetransmissions, according to the same article. At least in Win10 Pro it doesn't seem to be present by default.
Although this is a convenience on Windows machines, an application should still determine its own criteria for handling a failed connect attempt IMO (i.e. number of attempts, timeouts, etc.).
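One way to make the behaviour uniform across platforms is to implement the retry at application level rather than relying on the Winsock quirk. A minimal sketch in Python (the same loop translates directly to a `Socket.Connect`/`TcpClient` call in .NET): on Linux, a RST in answer to the SYN surfaces as a connection-refused error, so we simply catch it and try again. The attempt count and delay below are arbitrary choices:

```python
import socket
import time

def connect_with_retry(host, port, attempts=3, delay=0.2, timeout=2.0):
    """Mimic the Windows SYN-retry behaviour at application level:
    re-attempt the connect when the peer answers our SYN with a RST."""
    last_err = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except ConnectionRefusedError as err:  # RST received in reply to SYN
            last_err = err
            time.sleep(delay)  # brief pause before the next SYN
    raise last_err
```

Against the hardware described above, the first attempt gets the RST and the second one should succeed, on Windows and Linux alike.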
Anyhow, as I said, surprising fact! Living and learning I guess ...
Cristian.
I have a requirement in which server needs to interact with 2 clients, one residing on local machine and one on remote.
So, initially I was thinking of creating a socket using AF_UNIX for communication with the local client (since it's faster than AF_INET), and AF_INET for communication with the remote one, and polling between them.
But in the case of the local client, the channel will only be created once, at the beginning, and will exist until the server stops, i.e. a single accept followed by multiple reads/writes.
So, can I replace this AF_UNIX with AF_INET, since the connection establishment will be done only once?
Where does the performance hit occur in the case of AF_INET? Is it in the three-way handshake, or somewhere else as well?
Quote from Performance: TCP loopback connection vs Unix Domain Socket:
When the server and client benchmark programs run on the same box, both the TCP/IP loopback and unix domain sockets can be used. Depending on the platform, unix domain sockets can achieve around 50% more throughput than the TCP/IP loopback (on Linux for instance). The default behavior of redis-benchmark is to use the TCP/IP loopback.
However, make sure that the performance gain is worth the tradeoff of complicating the network stack of your application (by using various types of sockets depending on client location).
I am working on an Altera FPGA and have written a client application on uClinux. I have also written a server application in Python. My FPGA client is able to connect to the Python server on the local network, but it is unable to connect when the server is at a remote location, more than one router hop away.
Does anyone have an idea where the probable mistake is, or suggestions for ways to debug the issue?
I am trying to run ZeroMQ's local_thr/remote_thr over SDP (InfiniBand), compiled with MSVS2012, but it is not connecting.
Over IPoIB it works properly. The operating system is Windows Server 2008 R2. On further investigation, I found that the select() calls within the ZeroMQ libraries are not working for asynchronous accept() and send().
I also created a test application using the BSD socket API and used select to accept a connection on a non-blocking socket. But select is not receiving the readiness event for accept.
Please let me know what can be done to troubleshoot this issue.
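For comparison, this is roughly what the test application does, sketched here in Python (the function name and timeout are my own): on a working stack, select() reports the non-blocking listening socket as readable once a handshake completes, so the subsequent accept() cannot block; on the SDP provider, that readiness event apparently never arrives:

```python
import select
import socket

def wait_for_accept(listener, timeout=5.0):
    """Block in select() until a pending connection makes the listening
    socket readable, then accept it without blocking."""
    listener.setblocking(False)
    readable, _, _ = select.select([listener], [], [], timeout)
    if listener in readable:
        conn, addr = listener.accept()  # guaranteed not to block now
        return conn, addr
    return None, None  # select timed out: no completed handshake reported
```

If the equivalent C code returns the timeout path on SDP while IPoIB takes the accept path, that isolates the fault to the SDP provider's select/poll support rather than ZeroMQ.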
I have been working on a project involving TCP/IP socket connections and message transfer through these sockets. I am connecting to a UNIX server with a specific IP address and establishing socket connections. So far I have managed roughly 16,000 connections from one host (in this case, my own PC). And when I try establishing connections from other hosts (whether macOS or Windows PCs), I hit the same maximum of 16,000 connections per host.
I can have 65,536 connections on the server side, and I have actually reached that, but only with 16,000 connections from each of four different computers. I wonder why this limit exists, and how I can establish more than 16,000 connections from a single host.
On Windows systems the TCP stack is subject to several registry parameters. They're arcane and poorly documented, and have changed with newer releases (Vista, Win7, Win8); they also vary between desktop OS and server OS flavors.
Some KBs and MSDN articles cover the subject:
Tuning TCP/IP for Performance (a tad dated).
TCP/IP and NBT configuration parameters
TCP/IP Configuration Parameters
But this article is more to the point for your problem: Avoiding TCP/IP Port Exhaustion. Although it is BizTalk-related, the topic and solution are generic: increase MaxUserPort and decrease TcpTimedWaitDelay (careful with the latter, though). The specifics your system ends up supporting vary, so you have to experiment with the settings. Make sure your test machines have a 64-bit processor, a 64-bit OS, and enough RAM (>4 GB).
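The arithmetic behind the ~16,000 ceiling is worth spelling out: each outbound connection from one host to the same server needs its own local (ephemeral) port, and on Vista and later the default dynamic port range is 49152-65535. A quick sanity check (the exact bounds on your machine depend on the MaxUserPort / `netsh int ipv4 set dynamicport` settings):

```python
# Each outbound TCP connection to the same server endpoint consumes one
# local ephemeral port, so the dynamic port range caps per-host connections.

default_range = 65535 - 49152 + 1   # Vista+ default dynamic port range
print(default_range)                # 16384 -- right around the observed limit

# Widening the range down to the first non-reserved user port (1025)
# raises the per-host ceiling:
widened_range = 65535 - 1025 + 1
print(widened_range)                # 64511
```

That is why four client machines were needed to fill the server's 65,536 connections: each one ran out of local ports at roughly 16,000.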
For OS X I hope somebody else will provide the details.