Windows Server 2008 sends [RST, ACK] packets when several clients request TCP connections at the same time (less than 5 ms apart) - sockets

I have a Java Socket Server running on a Windows Server 2008.
When using a multi-threaded client to open several TCP connections at the same time, the client always gets an "Errno 111 connection refused" error after the first connection is established.
Here's the Wireshark capture (10.1.3.136 is the server, 10.34.10.132 is the client): Trace, and the specific red trace is here: Trace2
So, what's the issue?
If I delay launching the threads by more than 5 ms, or use a CentOS machine as the server, the errors disappear. No exceptions appear in the server trace file.

The issue is that you have filled the backlog queue, whereupon Windows starts issuing resets to further incoming connection requests.
This could be because you specified a small backlog value, but the more likely cause is that your server simply isn't accepting connections fast enough: your accept loop is busy doing other things, such as DNS lookups or even I/O with the client, all of which belong in the client's thread. All the accept loop should do is accept sockets and start threads, as in the sketch below.
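For illustration, here is a minimal sketch of such an accept loop in Java (the port, backlog value, and per-client handler are assumptions, not taken from the question):

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class AcceptLoop {
        public static void main(String[] args) throws IOException {
            // The second argument is the backlog: how many pending, not-yet-accepted
            // connections the OS will queue before refusing (or resetting) new ones.
            ServerSocket server = new ServerSocket(9000, 200);
            ExecutorService pool = Executors.newCachedThreadPool();
            while (true) {
                Socket client = server.accept();    // do nothing here but accept...
                pool.execute(() -> handle(client)); // ...and hand the socket to another thread
            }
        }

        private static void handle(Socket client) {
            try (Socket s = client) {
                // All per-client work (reads, writes, DNS lookups) belongs here,
                // off the accept thread.
            } catch (IOException ignored) {
            }
        }
    }

With this structure the accept thread returns to accept() almost immediately, so the backlog only fills if connections genuinely arrive faster than the OS can hand them over.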

Related

Why does the server application send RST after having gone through SYN->SYN,ACK->ACK?

I have a system with server/client applications. The client sends a socket connection request, and when everything is working correctly the server accepts it. However, in some situations (most likely after an ungraceful disconnection, such as a system shutdown or a crash on the client side), the client can no longer reconnect to the server application. A Wireshark capture shows the client keeps trying to connect, but after going through SYN->SYN,ACK->ACK, the server application sends RST. At this point, netstat -an sometimes shows the connection in the CLOSE_WAIT state and sometimes does not show it at all. The capture shows 'Acknowledgment Number: Broken TCP. The acknowledgment field is nonzero while the ACK flag is not set.'
My question is: why would the server application send this RST?

TCP connection issue for unreachable server after connection

I am facing an issue with TCP connections.
I have a number of clients connected to a remote server over TCP.
If, for any reason, the server becomes unreachable after the TCP connection has been successfully established, I do not receive any error on the client side.
If I run netstat on the client, it shows the clients as connected to the remote server, even though I cannot ping the server.
So now I'm in a situation where the server shows no connected clients, while the clients show they are connected to the server.
I have tested this with WebSockets and node.js as well, and the same behavior persists there too.
I have tried googling around, but no luck.
Is there any standard solution for this?
This is by design.
If two endpoints have a successful socket (TCP) connection between each other, but aren't sending any data, then the TCP state machines on both endpoints remain in the CONNECTED state.
Imagine if you had a shell connection open in a terminal window on your PC at work to a remote Unix machine across the Internet. You leave work that evening with the terminal window still logged in and at the shell prompt on the remote server.
Overnight, some router in between your PC and the remote computer goes out. Hours later, the router is fixed. You come into work the next day and start typing at the shell prompt. It's like the loss of connectivity never happened. How is this possible? Because neither socket on either endpoint had anything to send during the outage. Given that, there was no way that the TCP state machine was going to detect a connectivity failure - because no traffic was actually occurring. Now if you had tried to type something at the prompt during the outage, then the socket connection would eventually time out within a minute or two, and the terminal session would end.
One workaround is to enable the SO_KEEPALIVE option on your socket. YMMV with this socket option, as this mode of TCP does not always send keep-alive messages at a rate you control.
A more common approach is to just have your socket send data periodically. Some protocols on top of TCP that I've worked with have their own notion of a "ping" message for this very purpose. That is, the client sends a "ping" message over the TCP socket every minute and the server responds back with "pong" or some equivalent. If neither side gets the expected ping/pong message within N minutes, then the connection, regardless of socket error state, is assumed to be dead. This approach of sending periodic messages also helps with NATs that tend to drop TCP connections for very quiet protocols when it doesn't observe traffic over a period of time.
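As an illustration, here is a minimal Java sketch of both approaches: enabling SO_KEEPALIVE and sending an application-level ping (the host, port, interval, and one-byte ping message are assumptions for the example):

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.Socket;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class Heartbeat {
        public static void main(String[] args) throws IOException {
            Socket socket = new Socket("example.com", 9000);
            socket.setKeepAlive(true); // OS-level keep-alive; probe rate is controlled by the OS

            // Application-level ping: write one byte every 60 seconds. A real protocol
            // would also wait for a "pong" and declare the peer dead after N missed replies.
            OutputStream out = socket.getOutputStream();
            ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
            timer.scheduleAtFixedRate(() -> {
                try {
                    out.write(0x1); // assumed ping byte
                    out.flush();
                } catch (IOException e) {
                    // The write failed, so the connection really is dead: tear it down.
                    timer.shutdown();
                    try { socket.close(); } catch (IOException ignored) {}
                }
            }, 60, 60, TimeUnit.SECONDS);
        }
    }

The key point is that a failed write (or a missed pong) is what finally surfaces the dead connection; an idle socket never will.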

TCP connection between client and server gone wrong

I establish a TCP connection between my server and client, which run on the same host. We continuously gather and read data from the server (the "source" in our case).
We read data on, say, 3 different ports.
Once the source stops publishing data or gets restarted, the server/source cannot publish data again on the same port: it fails saying the port is already bound. The stated reason is that the client still has established connections on those ports.
I wanted to know what the probable reasons for this could be. Could the issue be that the client is still listening on these ports and trying to reconnect again and again (we have such a reconnection mechanism)? I am mostly looking for reasons on the source side, because the same client code works perfectly for us when the source and client are on different hosts.
Edit:-
I found this while going through various articles.
On the question of using SO_LINGER to send a RST on close to avoid the TIME_WAIT state: I've been having some problems with router access servers (names withheld to protect the guilty) that have problems dealing with back-to-back connections on a modem dedicated to a specific channel. What they do is let go of the connection, accept another call, attempt to connect to a well-known socket on a host, and the host refuses the connection because there is a connection in TIME_WAIT state involving the well-known socket. (Stevens' book TCP Illustrated, Vol 1 discusses this problem in more detail.) In order to avoid the connection-refused problem, I've had to install an option to do reset-on-close in the server when the server initiates the disconnection.
Link to source:- http://developerweb.net/viewtopic.php?id=2941
I guess I am facing the same problem: 'attempt to connect to a well-known socket on a host, and the host refuses the connection'. The probable fix mentioned is an 'option to do reset-on-close in the server when the server initiates the disconnection'. Now how do I do that?
Set the SO_REUSEADDR option on the server socket before you bind it and call listen().
EDIT The suggestion to fiddle with the SO_LINGER option is worthless and dangerous to your data in flight. Just use SO_REUSEADDR.
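In Java, for instance, this looks roughly like the following (the port and backlog are placeholders; note that in Java, bind() on a ServerSocket also starts listening, so there is no separate listen() call):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;

    public class ReusableServer {
        public static void main(String[] args) throws IOException {
            ServerSocket server = new ServerSocket();     // create unbound
            server.setReuseAddress(true);                 // must be set before bind()
            server.bind(new InetSocketAddress(9000), 50); // bind; the socket is now listening
            // accept loop as usual ...
        }
    }

With SO_REUSEADDR set, the restarted server can rebind the port even while old connections are still sitting in TIME_WAIT.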
You need to close the socket bound to that port before you restart/shutdown the server!
http://www.gnu.org/software/libc/manual/html_node/Closing-a-Socket.html
Also, there's a timeout (the TIME_WAIT state), which I think is 4 minutes, so after you close a TCP socket you may still have to wait up to 4 minutes before the port is released.
You can use netstat to see all the bound ports on your system. If you shut down your server, or close your server after forking on connect, you may have zombie processes bound to certain ports that never close and remain active, and thus you can't rebind to the same port. Show some code.
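One way to make sure the listening socket is closed on shutdown, sketched in Java with an assumed port, is a shutdown hook:

    import java.io.IOException;
    import java.net.ServerSocket;

    public class CleanShutdown {
        public static void main(String[] args) throws IOException {
            ServerSocket server = new ServerSocket(9000);
            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                try {
                    server.close(); // releases the port; a blocked accept() throws SocketException
                } catch (IOException ignored) {
                }
            }));
            // accept loop as usual ...
        }
    }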

Is TCP Reset (RST) two-way?

I have a client-server (Java) application using persistent TCP connections, but sometimes the server gets a java.io.IOException: Connection reset by peer exception when trying to write to the socket; however, I don't see any error in the client log.
This RST is probably caused by an intermediate proxy/router, but if that's the case, should this be seen on the client as well?
If the RST is sent by the client, it can be seen on the client with a packet sniffer such as Wireshark. However, it won't show up on any user-level socket, since it's sent by the OS in response to various erroneous inputs (such as connection attempts to a closed port).
If the RST is sent by the network, then the network is pretending to be the client in order to sever the connection. It can do so in one direction, or in both. In that case, the client might not see anything, except for a RST sent by the actual server when the client keeps sending data on a connection it perceives as open while the server sees it as closed.
Try capturing the traffic on both the server and the client, and see where the resets are coming from.

How can we remove the CLOSE_WAIT state of a socket without restarting the server?

We have written an application in which client-server communication uses the IOCP model.
Clients connect to the server through wireless access points.
When a temporary disconnection happens in the network, this can leave sockets in the CLOSE_WAIT state. This could indicate that the client properly closed the connection, but the server still has its socket open.
If too many sockets on the port the server and clients were using sit in CLOSE_WAIT, then at peak load the server stops functioning and rejects new connections, which is totally frustrating. In that case the user has to restart the server to wipe out all the CLOSE_WAIT sockets and reclaim the memory. When the server restarts, clients try to connect again and the server calls accept again; but before accepting a new connection, the previous connection should be closed on the server side. How can we do that?
How can we remove the CLOSE_WAIT state of a socket without restarting the server?
Is there any alternative way to avoid a server restart?
We also came to know that if all of the available ephemeral ports are allocated to client applications, the client experiences a condition known as TCP/IP port exhaustion. When TCP/IP port exhaustion occurs, client port reservations cannot be made, and errors occur in client applications that attempt to connect to a server via TCP/IP sockets.
If this is happening, we need to increase the upper range of ephemeral ports that are dynamically allocated to client TCP/IP socket connections.
Reference:
http://msdn.microsoft.com/en-us/library/aa560610%28v=bts.10%29.aspx
Please let us know whether this alternative approach is useful.
Fix the server code.
The server should be reading with a timeout, and if the timeout expires it should close the socket.
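A minimal Java sketch of that fix, with an assumed 30-second timeout:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    public class TimedReader {
        static void serve(Socket client) {
            try (Socket s = client) {
                s.setSoTimeout(30_000); // read() throws SocketTimeoutException after 30 s of silence
                InputStream in = s.getInputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    // process n bytes ...
                }
            } catch (SocketTimeoutException e) {
                // Peer went quiet: falling out of the try block closes the socket,
                // so it never lingers in CLOSE_WAIT.
            } catch (IOException ignored) {
            }
        }
    }

Closing the socket promptly on timeout (or on end-of-stream) is what keeps CLOSE_WAIT sockets from accumulating in the first place.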