Proper way to close a socket to avoid "connection reset by peer" - sockets

I have a client/server app that maintains a socket continuously. When the client signs off, it sends a 'signing off' message to the server and then closes the socket and cleans up. The server cleans up and closes the socket when it receives this message - and does not reply to the message.
On a fairly regular basis, I see "connection reset by peer" errors getting logged by the server without any complaints from end users, and I figure this must be an occasional timing issue in my sign-off sequence. I do see the same errors when end users complain about their connections actually being dropped, so I'm wondering how to tell the difference between those scenarios - or even better, how to prevent the bogus 'connection reset' scenario in the normal case.
I'm guessing that in some cases the server's getting hit by the closed socket before (or during) receipt of the "signing off" message. Is this possible? Is there a proper sequence you're supposed to follow for letting a server know that the client is about to terminate before actually closing the socket? Some way to check that the last message was delivered prior to closing?
Thanks,
Rob

The shutdown(s, SHUT_RDWR) function should solve your problem. There's a more complete explanation in this document.

This usually means that you have either written to a connection that had already been closed by the peer, or closed a connection without reading all the pending incoming data. In other words, an application protocol error.
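
As a rough illustration of the two answers above, here is a minimal sketch of an orderly sign-off in Python. It assumes a protocol in which the client half-closes after its last message and the server reads until end-of-file before closing; the function names (`client_sign_off`, `server_handle`, `demo`) and the loopback setup are made up for the example and are not from the original poster's code.

```python
import socket
import threading


def client_sign_off(sock: socket.socket) -> bytes:
    """Send the final message, half-close, then drain until the peer closes."""
    sock.sendall(b"signing off")
    sock.shutdown(socket.SHUT_WR)   # sends FIN: "I have nothing more to send"
    leftover = b""
    while True:
        chunk = sock.recv(4096)     # returns b"" once the peer has closed too
        if not chunk:
            break
        leftover += chunk
    sock.close()                    # nothing unread remains, so no reset is provoked
    return leftover


def server_handle(conn: socket.socket, received: list) -> None:
    """Read until end-of-file (the client's FIN), then close our side."""
    data = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    received.append(data)
    conn.close()


def demo() -> tuple:
    """Run one sign-off over loopback; return (server_received, client_leftover)."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick one
    port = listener.getsockname()[1]
    received = []

    def serve() -> None:
        conn, _ = listener.accept()
        server_handle(conn, received)

    worker = threading.Thread(target=serve)
    worker.start()
    client = socket.create_connection(("127.0.0.1", port))
    leftover = client_sign_off(client)
    worker.join()
    listener.close()
    return received[0], leftover
```

The point of the design is that neither side calls close() while the other may still have data in flight: the client only closes after it has seen the server's end-of-file, and the server only closes after it has seen the client's.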

Related

Difference between closing a socket and closing a network stream (System.Net.Sockets)

I have implemented a proxy server. After sending the final response to the client, if I close the socket directly (System.Net.Sockets TCPClient.Client.Close()), the client end receives a "connection aborted" error, but if I instead use System.Net.Sockets TCPClient.getStream().Close(), it works. I want to understand what the difference is and why the client side receives an error in the first scenario.
I would say that closing a socket is not as trivial an operation as most people think :)
First of all, you should understand how a close should be done correctly. Basically, you have to treat a close as a kind of message, like any other message sent through your socket. In other words, close() tells the other side of the communication that the peer has finished some kind of work.
The important thing to understand is that with a TCP socket you can tell the peer separately that you have finished sending or that you have finished listening.
On this page you can check how it works in the background (note that ACK and FIN are flags on TCP segments that the TCP stack handles for you, so you will never see them through the plain socket API): http://www.tcpipguide.com/free/t_TCPConnectionTermination-2.htm
Now the more practical part. Suppose you have a client and a server. The server needs to receive a message and then close the connection; the client just sends a message and then closes the connection. Networks need some time to deliver your data, so if the client closes quickly, it may close the connection before the server has received the message. If you call TCPClient.Client.Close(), the client stops listening for anything (including the notification that the server has closed the connection). This is where the TCP stack comes into play (Windows does this for you): when the socket is closed this way, the stack has to tell the server side that whatever the server has sent is being discarded. That is why you get an exception.
So the correct way is to:
inform the server that the client has finished sending data (FIN)
wait until the server acknowledges that the client will not send any more data (ACK)
the server then informs the client that it will stop sending data (FIN)
the client can now say "OK, I got it, I will not listen anymore" (ACK)
Anyway, the C# TCPClient seems to hide the logic of the underlying socket closing routine, but if you do not perform the close sequence the correct way, you will end up with errors.
I hope this somewhat long explanation helps you understand how it works in the background and, finally, why you see the error.
If you wish to learn more, it is also worth reading about the TCP protocol in detail: http://www.tcpipguide.com/free/t_TCPIPTransmissionControlProtocolTCP.htm
I suppose that in order to close the connection you need to send some special byte sequence, and it looks like that is implemented only by the TcpClient library and not by the socket library. Probably something like an EOF has to be sent.
You may check it by some net traffic utilities like tcpdump.
Good luck!

Physical disconnection from network and intermittent socket error 10057

A customer of mine has a Windows application where there is a network connection between two machines. The system is supposed to cope with the connection being lost. It does this by keeping a counter on the client position which is reset every time data is received from the server. If the counter reaches 60 seconds (i.e. we haven't heard from the server for 60 seconds) it performs some expected action to cope with the connection being lost.
The customer has a problem, however, where sometimes the connection will be lost but the client doesn't perform the expected action. Upon investigation, it appears that this is an intermittent problem caused by the client's socket to the server sometimes raising error 10057 (WSAENOTCONN / "Socket is not connected") when the connection is lost. Because the client behaves differently when it gets a socket error the customer doesn't get the desired behaviour when they get this socket error. This is not difficult for me to fix, but I am a bit puzzled by the different behaviour.
To reproduce the problem I'm physically pulling the network cable out of the back of my server machine. The majority of the time, the effect on the client side is that we just don't get any data over the socket, and we don't get an error. Some fraction of the time however error 10057 is raised. Can anyone shed any light on why there is this inconsistency? The client socket is a nonblocking STREAM socket.
I would expect you would get an error only if you try to send something. That is when the TCP connection would discover it can't reach the other end point. This will take a variable amount of time to discover the failure, depending on the network round trip time. There might be a "keep alive" option, that forces the socket to periodically send something to detect failure even when app is idle.
WSAENOTCONN is a bug in your application. It isn't a result of a lost connection. The result of a lost connection is WSAECONNRESET. Your code must have got WSAECONNRESET and then proceeded to use the connection as though it were still valid. Then you get WSAENOTCONN.
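
One way to make the two cases behave the same is to fold "no data for N seconds" and "the socket reported an error" into a single code path and stop touching the socket afterwards. The following is only a sketch under assumptions: it uses Python and a blocking socket driven by select() rather than the original Windows nonblocking code, and the `watch` function and its return values are invented for illustration.

```python
import select
import socket
import time


def watch(sock: socket.socket, idle_limit: float) -> str:
    """Block until the connection counts as lost; return 'idle', 'closed' or 'error'."""
    last_heard = time.monotonic()
    while True:
        remaining = idle_limit - (time.monotonic() - last_heard)
        if remaining <= 0:
            return "idle"           # the "60 second counter" expired
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            continue                # timed out; the loop re-checks the counter
        try:
            data = sock.recv(4096)
        except OSError:
            return "error"          # e.g. a reset: treat it like a lost connection
        if not data:
            return "closed"         # orderly shutdown by the peer
        last_heard = time.monotonic()   # any received data resets the counter
```

Whatever `watch` returns, the caller would run the same "connection lost" action, close the socket, and never call send or recv on it again, which is what avoids the follow-up "not connected" error described above.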

Select() is not coming out in client side

I have written a client socket program using plain Linux sockets. Here is an outline of what the program does:
Creating the socket
Making a connection to the server socket
Adding that socket to the read set and exception set for select
Calling select in a separate thread with a timeout value of NULL
The server is running on an external device.
The program works fine for reading and so on. The problem appears when I unplug the power cable of that device.
I assumed that when the device loses power, all of its sockets are closed abruptly and the connected client sockets get a read event; when we then try to read, we receive zero bytes, which means the connection was closed by the server.
But in my program, when I unplug the power cable of the device, select does not return in the client, which means the client socket is not getting any event. I don't understand why.
Any suggestion on how to find out that the connection was closed by the server, or any information on how sockets behave when the power supply is cut, would be appreciated.
I need your help; it's very critical.
thank you.
When a remote machine is suddenly cut off from the network (network cable unplugged or power loss), there is no way it can inform the other side of the connection about that. What is more, a client that only reads from a half-open socket (as in your case) won't be able to detect this either.
The only way to learn about a connection loss is to send a packet. Since all data being sent must be acknowledged by the other side, TCP on the client computer will keep retransmitting the unacknowledged data until the number of attempts is exhausted. Then an ETIMEDOUT error should be returned (via a socket that is expecting read events). You can create one more socket for sending such messages periodically to detect the peer's disappearance (a heartbeat connection) on the client side. But all these retries might still take some time.
Another option is the SO_KEEPALIVE socket option. After a connection has been idle for some time, TCP starts sending probe messages to the server and can detect its disappearance. The default idle time is usually very long (often two hours), so it needs to be lowered. Other parameters that might be relevant are TCP_KEEPCNT, TCP_KEEPINTVL and TCP_KEEPIDLE. It appears that this option may be implemented differently on different systems or may simply be absent.
I've never personally tried to solve this problem, so all this is just a bunch of thoughts that might give you some ideas. Here is one more source of ideas.
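
For reference, turning on keepalive with shorter timers might look like the sketch below. It is written in Python rather than C for brevity; the `enable_keepalive` helper and its default values are assumptions chosen for illustration. The three TCP_KEEP* tunables are looked up by name because they do not exist on every platform.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, count: int = 5) -> None:
    """Enable TCP keepalive and, where the platform allows, shorten its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tunables = (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tunables:
        option = getattr(socket, name, None)  # missing on some platforms
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # the platform rejected this tunable; the system default stays in effect
```

With these example values a dead peer would be noticed after roughly idle + interval * count seconds, at which point a select() on the read set returns and the read fails with an error instead of blocking forever.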

How to detect when socket connection is lost?

I have a script (I don't have the code example here at the moment but I used IO::Async) which connects to socket on a remote server and listens. Client usually just listens for new data.
Problem is that the client is not able to detect if network problems occur and the socket connection is gone.
I used IO::Async and I also tried it with IO::Socket. Handle is always "connected" after the initial connection is established.
If the network connection is established again the socket connection is naturally still lost because the script has no idea that it should reconnect.
I was thinking to create some kind of "keepAlive" which "pings" (syswrite) the socket every X seconds (if nothing new came through socket) to check whether the connection is still there.
Is this the correct way to do it, or is there maybe another, more creative or cleaner solution?
You can set the SO_KEEPALIVE socket option which, for TCP, sends periodic keepalive messages, and may help detect this condition. If this is detected, you will be delivered an EOF condition (most likely causing the containing IO::Async::Stream to fire on_read_eof).
For a better solution you might consider some sort of application-level keepalive message, such as IRC's PING command.
The short answer is there is no default way to automatically detect a dropped socket in perl.
Your approach of pinging would probably work pretty well; you could run a continuous thread in the background that sends ping requests and if it doesn't receive a response the main thread can be notified and a reconnect should be issued.
If you want to get messy you can work with select() to detect keep alive messages; however this may require some OS configuration depending upon your platform.
See this thread for more details: http://www.perlmonks.org/?node_id=566568
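
The application-level ping suggested in the answers above could be sketched roughly as follows. This is Python rather than Perl, and the `PING`/`PONG` wire format and the `peer_alive` helper are invented for the example; a real protocol would define its own keepalive message and would have to match replies to requests.

```python
import socket


def peer_alive(sock: socket.socket, timeout: float = 1.0) -> bool:
    """Send a ping and wait for any reply; False on timeout, EOF or socket error."""
    previous = sock.gettimeout()
    sock.settimeout(timeout)
    try:
        sock.sendall(b"PING\n")
        reply = sock.recv(4096)
        return reply != b""       # b"" means the peer closed the connection
    except OSError:               # covers timeouts, resets and broken pipes
        return False
    finally:
        sock.settimeout(previous)
```

A caller would invoke this only after the connection has been quiet for some interval, and on a False result close the socket and reconnect.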

TCP recover connection after hardware disconnect

I've got a program that continuously writes to a TCP socket. I want to make sure that if the connection between the client and server is disconnected for any amount of time, the connection can be reestablished.
Right now, I can disconnect the wire, and while the write() function loops, it returns one "connection reset by peer" error, and then the value of ULLONG_MAX. Then, once I replug the wire, write() continuously returns "broken pipe" errors. I've tried to close and reopen the connection but I continue to get the "connection reset by peer" error.
Does anyone know how I could either reestablish the connection or keep it alive for a certain amount of time (or indefinitely) in the first place?
You cannot re-use file descriptor here, you have to start from scratch again - create new socket(2) and call connect(2) on it.
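
A minimal sketch of that "start from scratch" step, in Python for brevity (`socket.create_connection` does the socket-plus-connect for you). The `connect_with_retry` helper, the retry count, the doubling delay and the 5 second connect timeout are all assumptions made for the example.

```python
import socket
import time


def connect_with_retry(host: str, port: int, attempts: int = 5,
                       delay: float = 0.5) -> socket.socket:
    """Open a brand-new connection, retrying with a doubling delay between tries."""
    last_error = None
    for attempt in range(attempts):
        try:
            # A fresh socket every time; the old descriptor is never reused.
            return socket.create_connection((host, port), timeout=5)
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay * (2 ** attempt))
    raise ConnectionError(
        f"could not connect to {host}:{port} after {attempts} attempts"
    ) from last_error
```

In the write loop, any error from the send call would mean: close the old socket, call this function, and carry on with the socket it returns.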
I'm afraid you have to establish a new connection, and that can only be initiated by the client program. You might need some way to ensure it's the same client reconnecting, maybe by checking the IP or exchanging a token on the first connection, so that you can handle a first connection and a recovery differently. That solution needs some programming on your part, though.
If TCP is not the only choice for some reason, you might want to think about UDP communication, since there the decision about when a connection is lost is left to you. But you'll need to take care of a lot of other things (though since you are aiming for lose-and-recover communication, that might suit your needs better).