For a TCP connection, is there any way to send an ACK packet to the other side without any other data (only the ACK packet) in Solaris 10?
I know we can do that through TCP Keep alive option, but it's supported in Solaris 10.
The reliable way to detect disconnection is to build a null / ping / echo type message into your application level protocol, and have your application send those at regular intervals. If it doesn't get a timely answer, it can assume the connection has been dropped. Most protocols that are intended to involve long-lived connections include such a message (for example, IRC, IMAP and SSH all do).
(After all, even if you could send bare TCP ACK messages, the other end doesn't have to respond to them, since it has received no more data to ACK itself.)
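A minimal sketch of such an application-level ping, assuming a made-up line protocol: the PING/PONG message names and the helper functions are hypothetical and not taken from IRC, IMAP or SSH. It only illustrates the "send a probe, treat a missing timely answer as a dropped connection" idea, using a local socket pair in place of a real network connection.

```python
import socket
import threading

PING = b"PING\n"  # hypothetical message names, not from any real protocol
PONG = b"PONG\n"

def peer_is_alive(sock, timeout=2.0):
    """Send one PING and report whether a PONG arrives within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        # Simplification: assumes the short reply arrives in a single recv().
        return sock.recv(len(PONG)) == PONG
    except OSError:  # covers timeouts and connection errors
        return False

def answer_pings(sock):
    """Reply PONG to every PING until the peer closes the connection."""
    while True:
        data = sock.recv(len(PING))
        if not data:
            break
        if data == PING:
            sock.sendall(PONG)

if __name__ == "__main__":
    a, b = socket.socketpair()
    threading.Thread(target=answer_pings, args=(b,), daemon=True).start()
    print(peer_is_alive(a))
```

In a real application the probe would be sent at regular intervals, and a False result would trigger whatever reconnect or error handling the application needs.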
If you're just receiving, the TCP stack will send plenty of ACKs without data all by itself. There's no way whatsoever to send an ACK of any kind from an application however.
Your first posting states that Solaris 10 supports TCP keepalives and later that it doesn't ...
Solaris supports setting TCP keepalive globally with the ndd command, e.g.:
ndd -set /dev/tcp tcp_keepalive_interval 120000
OpenSolaris and Solaris 11 Express support per socket keepalive settings.
You can enable it with SO_KEEPALIVE and tune it with TCP_KEEPALIVE_THRESHOLD and TCP_KEEPALIVE_ABORT_THRESHOLD.
http://docs.oracle.com/cd/E19082-01/819-2254/6n4iaov75/index.html
My understanding was that UDP doesn't form connections; it just blindly sends packets. However, when I run
nc -u -l 10002
and
nc -u 127.0.0.1 10002
simultaneously (and so can send messages back and forth between terminals), lsof reports two open UDP connections:
nc ... UDP localhost:10002->localhost:35311
nc ... UDP localhost:35311->localhost:10002
If I open a third terminal and do nc -u 127.0.0.1 10002 again, to send another message to the original listener, the listener does not receive (or acknowledge, at least) the message, suggesting it is indeed tied to a specific connection.
If I implement a UDP echo server in Java like this and do sorta the same thing (on 10001), I get
java ... UDP *:10001
nc ... UDP localhost:52295->localhost:10001
aka, Java is just listening on 10001, but nc has formed a connection.
Based on my understanding of UDP, I'd expect both sides to behave like the Java version. What's going on? Can I make the Java version do whatever nc is doing? Is there a benefit to doing so?
I'm on Ubuntu 20.04.3 LTS.
UDP sockets can be connected (after a call to connect) or they can be unconnected. In the first case the socket can only exchange data with the connected peer, while in the second case it can exchange data with arbitrary peers. What you see in lsof is whether the socket is connected or not.
My understanding was that UDP doesn't form connections; it just blindly sends packets.
That's a different meaning of the term connection here. TCP always has "real" connections, i.e. an association between two endpoints which has a clear start (SYN-based handshake) and end (FIN-based teardown). TCP sockets used for data exchange are therefore always connected.
UDP can have associations between two endpoints too, i.e. it can have connected sockets. There is no explicit setup and teardown of such a connection though. And UDP sockets don't need to be connected. It therefore cannot be determined from the traffic alone whether connected or unconnected UDP sockets are in use.
Can I make the Java version do whatever nc is doing?
Yes, see What does Java's UDP DatagramSocket.connect() do?
Is there a benefit to doing so?
An unconnected UDP socket will receive data from any peer, and the application has to check for each received datagram where it came from and whether it should be accepted. A connected UDP socket will only receive data from the connected peer, i.e. the application does not need to perform this check itself.
Apart from that it might scale better if different sockets are used for communication with different peers. But if only a few packets are exchanged with each peer and/or if one needs to communicate with lots of peers at the same time, then using multiple connected sockets instead of a single unconnected one might mean too much overhead.
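To make the difference concrete, here is a small sketch (in Python rather than Java, purely for brevity) that sends datagrams from two peers to both an unconnected and a connected UDP socket on loopback. The helper names are made up for this example. The connected socket should only see the datagram from the peer it was connected to, while the unconnected one should see both.

```python
import socket

def udp_socket():
    """Create a UDP socket bound to an ephemeral loopback port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(0.5)
    return s

def drain(sock):
    """Collect every datagram that arrives before the socket times out."""
    received = []
    try:
        while True:
            received.append(sock.recv(1024))
    except socket.timeout:
        return received

def demo_udp_filtering():
    """Return (datagrams seen by connected socket, datagrams seen by unconnected socket)."""
    peer_a, peer_b = udp_socket(), udp_socket()
    unconnected, connected = udp_socket(), udp_socket()
    connected.connect(peer_a.getsockname())  # from now on: peer_a only

    for receiver in (unconnected, connected):
        peer_a.sendto(b"from a", receiver.getsockname())
        peer_b.sendto(b"from b", receiver.getsockname())

    result = drain(connected), drain(unconnected)
    for s in (peer_a, peer_b, unconnected, connected):
        s.close()
    return result

if __name__ == "__main__":
    print(demo_udp_filtering())
```

Nothing is exchanged on the wire when connect() is called on a UDP socket; the kernel merely records the peer address and filters on it, which is why the third nc instance in the question gets no response.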
I have the following queries:
1) Does TCP guarantee delivery of packets, and is application-level retransmission thus ever required if the transport protocol used is TCP? Let's say I have established a TCP connection between a client and a server, and the server sends a message to the client. However, the client goes offline and comes back only after, say, 10 hours. Will the TCP stack handle retransmission and deliver the message to the client, or will the application running on the server need to handle it?
2) Related to the above question, is application level ACK needed if transport protocol is TCP. One reason for application ACK would be that without it, the application would not know when the remote end received the message. Is there any reason other than that? Meaning is the delivery of the message itself guaranteed?
Does TCP guarantee delivery of packets, and is application-level retransmission thus ever required if the transport protocol used is TCP?
TCP guarantees delivery of message stream bytes to the TCP layer on the other end of the TCP connection. So an application shouldn't have to bother with the nuances of retransmission. However, read the rest of my answer before taking that as an absolute.
However the client goes offline and comes back only after say 10 hours, so will TCP stack handle re-transmission and delivering message to the client or will the application running on the server need to handle it?
No, not really. Even though TCP has some degree of retry logic for individual TCP packets, it cannot perform reconnections if the remote endpoint is disconnected. In other words, it will "time out" waiting to get a TCP ACK from the remote side and do a few retries, but it will eventually give up and notify the application through the socket interface that the remote endpoint connection is in a dead or closed state. The typical pattern is that when a client application detects that it has lost the socket connection to the server, it either reports an error to the user interface of the application or retries the connection. Either way, it's an application-level decision how to handle a failed TCP connection.
is application level ACK needed if transport protocol is TCP
Yes, absolutely. Most client-server protocols have some notion of a request/response pair of messages. A TCP socket can only indicate to the application whether data "sent" by the application was successfully queued to the kernel's network stack. It provides no guarantee that the application on top of the socket on the remote end actually "got it" or "processed it". Your protocol on top of TCP should provide some sort of response indication whenever a message is processed. Use HTTP as a good example here. Imagine if an application sent an HTTP POST message to the server, but there was no acknowledgement (e.g. 200 OK) from the server. How would the client know the server processed it?
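As a rough illustration of such a response indication, the sketch below uses an invented line protocol ("MSG <id> <payload>" answered by "ACK <id>"). The message format and the function names are assumptions made for this example only. The point is that the sender treats a message as delivered only once the remote application has processed it and said so.

```python
import socket
import threading

def read_line(sock):
    """Read bytes up to and including a newline; return b"" if the peer closed."""
    buf = bytearray()
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            return b""
        buf += chunk
    return bytes(buf)

def send_with_ack(sock, msg_id, payload, timeout=2.0):
    """Send one message; return True only if the peer application acknowledges it.

    The payload must not contain a newline in this toy protocol.
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(f"MSG {msg_id} {payload}\n".encode())
        return read_line(sock) == f"ACK {msg_id}\n".encode()
    except OSError:  # timeout or broken connection: delivery is unconfirmed
        return False

def serve_one(sock, processed):
    """Receive one message, process it, and only then acknowledge it."""
    line = read_line(sock)
    if line.startswith(b"MSG "):
        _, msg_id, payload = line.decode().rstrip("\n").split(" ", 2)
        processed.append(payload)  # stand-in for real processing
        sock.sendall(f"ACK {msg_id}\n".encode())

if __name__ == "__main__":
    a, b = socket.socketpair()
    processed = []
    worker = threading.Thread(target=serve_one, args=(b, processed))
    worker.start()
    print(send_with_ack(a, 7, "hello world"))
    worker.join()
    print(processed)
```

When send_with_ack returns False the sender cannot tell whether the message was lost or only the ACK was, so a retrying sender also needs the receiver to tolerate duplicates (for example by remembering message ids).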
In a world of Network Address Translators (NATs) and proxy servers, TCP connections that are idle (no data flowing in either direction) can fail because the NAT or proxy closes the connection on behalf of the actual endpoint when it perceives a lack of data being sent. The solution is to have some sort of periodic "ping" and "pong" protocol by which the applications can keep the TCP connection alive when they have no data to send.
Is there a reason why I should use application level heartbeating instead of TCP keepalives to detect stale connections, given that only Windows and Linux machines are involved in our setup?
It seems that the TCP keepalive parameters can't be set on a per-socket basis on Windows or OSX, that's why.
Edit: All parameters except the number of keepalive retransmissions can in fact be set on Windows (2000 onwards) too: http://msdn.microsoft.com/en-us/library/windows/desktop/dd877220%28v=vs.85%29.aspx
I was trying to do this with zeromq, but it just seems that zeromq does not support this on Windows?
From John Jefferies' response: ZMQ Pattern Dealer/Router HeartBeating
"Heartbeating isn't necessary to keep the connection alive (there is a ZMQ_TCP_KEEPALIVE socket option for TCP sockets). Instead, heartbeating is required for both sides to know that the other side is still active. If either side does detect that the other is inactive, it can take alternative action."
TCP keepalives serve an entirely different function from application level heartbeating. A keepalive does just that: it keeps the TCP session active rather than allowing it to time out after long periods of silence. This is important and good, and (if appropriate) you should use it in your application. But a TCP session dying due to inactivity is only one way that the connection can be severed between a pair of ZMQ sockets. One endpoint could lose power for 90 minutes and be offline; TCP keepalives wouldn't do squat for you in that scenario.
Application level heartbeating is not intended to keep the TCP session active; you are expected to rely on keepalives for that function if possible. Heartbeating is there to tell your application whether the connection is in fact still active and the peer socket is still functioning properly. If it is not, you know that your peer is unavailable and can behave appropriately, by caching messages, throwing an exception, sending an alert, etc.
In short:
a TCP keepalive is intended to keep the connection alive (but doesn't protect against all disconnection scenarios)
an app-level heartbeat is intended to tell your application if the connection is alive
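One way to sketch the bookkeeping behind an app-level heartbeat is shown below. The class name, the interval, and the "three missed beats" rule are all assumptions chosen for illustration; this is not the ZMQ API. The transport code would call record_heartbeat() whenever anything arrives from the peer, and the application would poll peer_is_alive() to decide when to start caching messages or raising alerts.

```python
import time

class HeartbeatMonitor:
    """Tracks when the peer was last heard from (hypothetical helper)."""

    def __init__(self, interval=5.0, max_missed=3, clock=time.monotonic):
        self.interval = interval      # expected seconds between heartbeats
        self.max_missed = max_missed  # missed beats tolerated before giving up
        self._clock = clock           # injectable so the logic can be tested
        self._last_seen = clock()

    def record_heartbeat(self):
        """Call whenever a heartbeat (or any other message) arrives from the peer."""
        self._last_seen = self._clock()

    def peer_is_alive(self):
        """True while fewer than `max_missed` intervals have passed in silence."""
        silence = self._clock() - self._last_seen
        return silence < self.interval * self.max_missed

if __name__ == "__main__":
    monitor = HeartbeatMonitor(interval=0.1, max_missed=3)
    print(monitor.peer_is_alive())  # just created, so the peer counts as alive
    time.sleep(0.35)
    print(monitor.peer_is_alive())  # more than three intervals of silence
```

Using a monotonic clock avoids false alarms when the wall clock is adjusted.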
Cannot understand the philosophy of setKeepAlive method in Node.js' net sockets. What happens after initialDelay finishes?
This method controls TCP keep-alive functionality on the underlying TCP socket. Check out this article for information on TCP Keepalive. Here's a snippet that explains what initialDelay (the "keepalive timer") does:
2.1. What is TCP keepalive?
The keepalive concept is very simple: when you set up a TCP connection, you associate a set of timers. Some of these timers deal with the keepalive procedure. When the keepalive timer reaches zero, you send your peer a keepalive probe packet with no data in it and the ACK flag turned on. You can do this because of the TCP/IP specifications, as a sort of duplicate ACK, and the remote endpoint will have no arguments, as TCP is a stream-oriented protocol. On the other hand, you will receive a reply from the remote host (which doesn't need to support keepalive at all, just TCP/IP), with no data and the ACK set.
If you receive a reply to your keepalive probe, you can assert that the connection is still up and running without worrying about the user-level implementation. In fact, TCP permits you to handle a stream, not packets, and so a zero-length data packet is not dangerous for the user program.
This procedure is useful because if the other peers lose their connection (for example by rebooting) you will notice that the connection is broken, even if you don't have traffic on it. If the keepalive probes are not replied to by your peer, you can assert that the connection cannot be considered valid and then take the correct action.
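For comparison, here is roughly what the same knobs look like at the socket-option level, sketched in Python. The function name and default values are made up for this example. SO_KEEPALIVE switches the feature on; the idle time before the first probe (which appears to be what Node's initialDelay controls) and the probe interval and count are platform-specific options, so the sketch only sets the ones the running platform exposes.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and, where the platform allows, tune its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # These constants exist on Linux; other platforms expose only some of
    # them (or differently named ones), so each is applied only if present.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s, idle_s=30)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With these settings, a dead peer would be noticed after about idle_s + interval_s * probes seconds of silence, and the failure surfaces as an error on the next socket operation.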
We have a .NET 2.0 desktop application which sends and receives network
packets over UDP.
Several users have reported an occasional socket error 10052 which happens
when the code calls socket.BeginReceiveFrom on the UDP socket.
What does this mean?
The official MS documentation for socket error 10052 says - quote:
"WSAENETRESET (10052) Network dropped connection on reset . The connection
has been broken due to keep-alive activity detecting a failure while the
operation was in progress. It can also be returned by setsockopt if an
attempt is made to set SO_KEEPALIVE on a connection that has already
failed."
This just doesn't make much sense for a UDP socket since UDP is a
connectionless protocol.
I know that another close error code 10054 in connection with UDP sockets
means that an ICMP message "Port Unreachable" was received, and I am
wondering if 10052 might map to another ICMP message?
I have googled this for months, read network books, etc. but can't find
anything.
Please help - what does socket error 10052 on a UDP socket mean?
Thanks in advance
See http://msdn.microsoft.com/en-us/library/ms740120%28v=vs.85%29.aspx, which describes the recvfrom function. It says of WSAENETRESET (which is winsock error 10052):
For a datagram socket, this error indicates that the time to live has expired.
Make sure the TTL value is high enough when sending UDP datagrams.
If you are using the UdpClient class, set the following before sending the datagram:
myUdpClient.Ttl = 255;
Note: 255 is the maximum value for TTL.
There is some network problem if that value is not enough.
WSAE NET RESET suggests that it happens due to a reset of the network interface itself. Your program is sitting there bound to a UDP port, so in a sense it is connected, but to the network interface rather than to a remote peer.
Try starting your program, getting it to the point where this BeginReceiveFrom call is about to be made, then disable your NIC in the Device Manager and re-enable it. Or, with Wi-Fi, drop and reestablish the connection to the WAP. It might even happen by just unplugging the Ethernet cable to your machine, as recent versions of Windows default to killing all sockets connected through that NIC when this happens.
It would explain the rare problem reports from the field. This probably only happens when there is some local networking fault at the hardware level.