Dealing with intermittent Winsock errors - winsock

My client app gets intermittent winsock errors (10060, 10053) against one particular server we interface with. I have it re-trying the request that failed, but sometimes it fails repeatedly, and I give up after 5 re-tries. Would it be likely to help at all if I closed the socket and created a new one? (I know nothing about the server-side.)

Ok, so the errors that you're getting are:
10060 - WSAETIMEDOUT
10053 - WSAECONNABORTED
When do you get them? What are you doing at the time?
You get a WSAECONNABORTED when the remote end of the connection, or possibly an intermediary router, resets the connection and sends an RST. This could simply be the remote end issuing a non lingering close or it could be the remote end aborting or crashing.
You can't continue doing anything with a connection that has had a WSAECONNABORTED on it as the connection has been aborted and is no more; it is a dead connection, it has passed on...
Context matters immensely as to why you might get a WSAETIMEDOUT exception and the context will dictate if retrying is sensible or not.

One thing I would try is- do tracert to your server.
Often when someone is connected through VPN; you may see this error because your local and remote ip addresses overlap.
e.g. if your local ipaddress range is 192.168.1.xxx and vpn remote range is also 192.168.1.xxx you will also see this error.
sharrajesh

Related

How are read and write socket errors defined in the wrk HTTP benchmarking tool?

I am using the wrk HTTP benchmarking tool to test a server. And I am getting READ, WRITE as well as CONNECTION and TIMEOUT errors.
What I understand is:
CONNECTION errors, are caused by the refusal of a TCP connection.
Which could involve every element in the connection chain (Client,
ISP and Server).
TIMEOUT errors, are caused by the host failing to respond to a
request within a certain time.
But what about READ and WRITE errors?
I would really appreciate, if someone could point me in the direction of a good resource?
So what I understood from this code from the WRK repository is that.
WRITE ERROR’s happen when attempting to write on a connection, but it fails because of a closed socket on the server.
READ ERROR’s happen when attempting to read on a connection, but it fails because of a closed socket on the server.
Happy if anybody can confirm or refute that.

TCP connection between client and server gone wrong

I establish a TCP connection between my server and client which runs on the same host. We gather and read from the server or say source in our case continuously.
We read data on say 3 different ports.
Once the source stops publishing data or gets restarted , the server/source is not able to publish data again on the same port saying port is already bind. The reason given is that client still has established connection on those ports.
I wanted to know what could be the probable reasons of this ? Can there be issue since client is already listening on these ports and trying to reconnect again and again because we try this reconnection mechanism. I am more looking for reason on source side as the same code in client sides when source and client are on different host and not the same host works perfectly fine for us.
Edit:-
I found this while going through various article .
On the question of using SO_LINGER to send a RST on close to avoid the TIME_WAIT state: I've been having some problems with router access servers (names withheld to protect the guilty) that have problems dealing with back-to-back connections on a modem dedicated to a specific channel. What they do is let go of the connection, accept another call, attempt to connect to a well-known socket on a host, and the host refuses the connection because there is a connection in TIME_WAIT state involving the well-known socket. (Stevens' book TCP Illustrated, Vol 1 discusses this problem in more detail.) In order to avoid the connection-refused problem, I've had to install an option to do reset-on-close in the server when the server initiates the disconnection.
Link to source:- http://developerweb.net/viewtopic.php?id=2941
I guess i am facing the same problem: 'attempt to connect to a well-known socket on a host, and the host refuses the connection'. Probable fix mention is 'option to do reset-on-close in the server when the server initiates the disconnection'. Now how do I do that ?
Set the SO_REUSEADDR option on the server socket before you bind it and call listen().
EDIT The suggestion to fiddle around with SO_LINGER option is worthless and dangerous to your data in flight. Just use SO_RESUSEADDR.
You need to close the socket bound to that port before you restart/shutdown the server!
http://www.gnu.org/software/libc/manual/html_node/Closing-a-Socket.html
Also, there's a timeout time, which I think is 4 minutes, so if you created a TCP socket and close it, you may still have to wait 4 minutes until it closes.
You can use netstat to see all the bound ports on your system. If you shut down your server, or close your server after forking on connect, you may have zombie processes which are bound to certain ports that do not close and remain active, and thus, you can't rebind to the same port. Show some code.

Is TCP Reset (RST) two way?

I have a client-server (Java) application using persistent TCP connections, but sometimes the Server receives java.io.IOException: Connection reset by peer exception when trying to write on the socket, however I don't see any error in the Client log.
This RST is probably caused by an intermediate proxy/router, but if that's the case, should this be seen on the client as well?
If the RST is sent by the client, it can be seen on it using a packet sniffer such as wireshark. However, it won't show up in any user-level sockets since it's sent by the OS as a response to various erroneous inputs (such as connection attempts to a closed port).
If the RST is sent by the network, then it's pretending to be the client to sever the connection. It can do so in one direction, or in both of them. In that case, the client might not see anything, except for a RST sent by the actual server when the client continues to send data to a connection it perceives as open, while the server sees it as closed.
Try capturing the traffic on both the server and the client, see where the resets are coming from.

My netty TCP/IP server stops listenning few hours after starting

I have written TCP/IP server using Netty4.0 running on a Linux machine listening to small GPS tracking devices. I have been facing weird problem, which is server stops listening to them in a sudden several hours after I starts it. There is any error log I can see and still server is running. It looks like only channel is not working. When I run a client to do health check, the client socket is still alive and keep sending packet to the server but server does not get it.
If you have any idea how to solve it, please tell me about it. It would be appreciated.
It is impossible to tell without more informations. I would check different things like if there was an OOM exception or with telnet if the server really refuse connections etc. Also jstack may show you if there is some deadlock etc.

What does "connection reset by peer" mean?

What is the meaning of the "connection reset by peer" error on a TCP connection? Is it a fatal error or just a notification or related to the network failure?
It's fatal. The remote server has sent you a RST packet, which indicates an immediate dropping of the connection, rather than the usual handshake. This bypasses the normal half-closed state transition. I like this description:
"Connection reset by peer" is the TCP/IP equivalent of slamming the phone back on the hook. It's more polite than merely not replying, leaving one hanging. But it's not the FIN-ACK expected of the truly polite TCP/IP converseur.
This means that a TCP RST was received and the connection is now closed. This occurs when a packet is sent from your end of the connection but the other end does not recognize the connection; it will send back a packet with the RST bit set in order to forcibly close the connection.
This can happen if the other side crashes and then comes back up or if it calls close() on the socket while there is data from you in transit, and is an indication to you that some of the data that you previously sent may not have been received.
It is up to you whether that is an error; if the information you were sending was only for the benefit of the remote client then it may not matter that any final data may have been lost. However you should close the socket and free up any other resources associated with the connection.
one of the reasons for seeing this error and having trouble connecting to the server is that you enabled the firewall in the UNIX machine and forgot to add a rule to accept ssh connection. search in your WPS provider and you will find a way to connect to you machine and add this rules:
ufw allow ssh && ufw allow 22