What are the smallest SO_SNDBUF and SO_RCVBUF sizes possible on Windows and Linux? - sockets

On Windows and Linux, what are the smallest SO_SNDBUF and SO_RCVBUF sizes possible? Is it 1 byte? Does setting these options to 1 achieve the smallest possible size? Does the OS delay allocating the RAM until the space is needed?
I realize that this will cause terrible performance for transferring data. I am not trying to transfer data. I am trying to check whether a server is listening on a port and, if not, flag a problem.

$ man 7 socket
SO_SNDBUF
Sets or gets the maximum socket send buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/wmem_default file and the maximum allowed value is set by the /proc/sys/net/core/wmem_max file. The minimum (doubled) value for this option is 2048.
SO_RCVBUF
Sets or gets the maximum socket receive buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/rmem_default file, and the maximum allowed value is set by the /proc/sys/net/core/rmem_max file. The minimum (doubled) value for this option is 256.
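A quick way to see what the kernel actually grants is to request 1-byte buffers and read the values back with getsockopt(). This is a minimal sketch, assuming Linux and a TCP socket; the exact floor values depend on the kernel version:

/* Minimal sketch: request 1-byte buffers and see what Linux actually applies.
 * The values printed are the doubled/clamped sizes described in socket(7). */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    int one = 1;
    setsockopt(s, SOL_SOCKET, SO_SNDBUF, &one, sizeof(one));
    setsockopt(s, SOL_SOCKET, SO_RCVBUF, &one, sizeof(one));

    int snd = 0, rcv = 0;
    socklen_t len = sizeof(snd);
    getsockopt(s, SOL_SOCKET, SO_SNDBUF, &snd, &len);
    len = sizeof(rcv);
    getsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcv, &len);

    /* Expect the kernel's minimums (the doubled floors from the man page),
     * not 1 or 2 -- the request is silently raised to the floor value. */
    printf("effective SO_SNDBUF = %d, SO_RCVBUF = %d\n", snd, rcv);

    close(s);
    return 0;
}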

If it's just in the "thousands of ports per hour" as you mentioned in your comment, chances are high your server is already getting an order of magnitude more connections per hour than what your test runner would impose. Just do a "connect", then a "close". Anything else is a micro-optimization.
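As a sketch of the "connect, then close" check (IPv4 only, blocking connect for brevity; a real checker would add a timeout with a non-blocking connect plus poll()):

/* Probe whether anything is listening: connect, then close.
 * Returns 0 if the TCP handshake succeeds, -1 otherwise. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int port_is_listening(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    int rc = connect(s, (struct sockaddr *)&addr, sizeof(addr));
    close(s);    /* all we cared about was whether the handshake succeeded */
    return rc == 0 ? 0 : -1;
}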
And if there's any sort of proxy, port mapper, or load balancer involved, then testing the TCP connection itself may not be sufficient. You would want to actually test the application protocol being hosted on that socket. For example, if there is a web server running on port 8000, you should not only make a TCP connection, but actually make an HTTP request and see if you get any kind of response back.

Related

Linux: checking of incoming UDP datagrams

I'm working with special-purpose hardware that is connected on a 10G Ethernet link. I've got some questions on the handling of incoming datagrams, as follows:
What happens if the NIC discovers an incorrect link-level Ethernet CRC? Some searching shows that errors may not be reliably reported (here, for instance). Can I expect to get better stats from more recent kernels (2.6 - 3.10)?
What does the kernel actually check before deciding whether to return a packet to a recv? I'm guessing that for IPv4, the IPv4 header checksum must be correct, but what about the optional UDP header checksum?
Can recv ever return 0 for a UDP/SOCK_DGRAM?
For a non-blocking SOCK_DGRAM socket, does recv always return the entire packet when data is available? I guess it has to, but it's not obvious from the docs.
Thanks.
My knowledge may be out of date here, but historically, packets with FCS errors were not delivered at all and were not counted toward the interface statistics. The Ethernet layer error counts are usually reported by ethtool -S <interface>. The problem has always been that the interface statistics were maintained above the driver level and there was no standard API internally for network drivers to report those statistics. (Also, of course, in the very old days of 10Mb half duplex, collisions happened pretty frequently and Ethernet layer statistics weren't terribly informative as to your own adapter's behavior.)
You should not receive a packet if its IP header checksum is wrong, or if the UDP checksum is wrong when a checksum is provided (i.e. non-zero).
Yes. If you provide a zero length buffer, you will receive the next incoming datagram but then the entire content will be truncated, resulting in a return value of zero. Also, UDP permits zero-length datagrams: so if you receive a datagram with no content, the return value would also be zero. Aside from those two cases, I don't believe you'll get a return value of zero.
Yes, you should get the entire datagram provided there is space in your buffer. Otherwise, no. If you don't provide enough space to hold the entire datagram, the part that doesn't fit is discarded (i.e. your next recv will get a subsequent packet, not the end of the truncated one).
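To illustrate the truncation behaviour described above, here is a minimal sketch assuming an already-bound Linux SOCK_DGRAM socket; passing MSG_TRUNC makes recv() report the real datagram length even when the buffer was too small:

/* Sketch: read one UDP datagram and detect truncation.
 * Assumes 'sock' is an already-bound SOCK_DGRAM socket.
 * With MSG_TRUNC (Linux), recv() returns the real datagram length,
 * so n > sizeof(buf) means the tail was discarded. A return of 0
 * means a zero-length datagram (or a zero-length buffer was supplied). */
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

void read_one_datagram(int sock)
{
    char buf[2048];
    ssize_t n = recv(sock, buf, sizeof(buf), MSG_TRUNC);
    if (n < 0) {
        perror("recv");
    } else if ((size_t)n > sizeof(buf)) {
        printf("datagram of %zd bytes truncated to %zu bytes\n", n, sizeof(buf));
    } else {
        printf("got complete datagram of %zd bytes\n", n);
    }
}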

jute.maxbuffer affects only incoming traffic

Does this value only affect incoming traffic? If I set this value to, say, 4MB on the zookeeper server as well as the zookeeper client and I start my client, will I still get data > 4MB when I do a request for a path /abc/asyncMultiMap/subs? If /subs has data greater than 4MB, is the server going to break it up into chunks <= 4MB and send it in pieces to the client?
I am using zookeeper 3.4.6 on both client (via vertx-zookeeper) and server. I see errors on the client complaining that the packet length is greater than 4MB.
java.io.IOException: Packet len4194374 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:112) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
"This is a server-side setting"
This statement is incorrect: jute.maxbuffer is evaluated on the client as well, by the Record-implementing classes that receive an InputArchive. Each time a field is read from an InputArchive, the value is checked against jute.maxbuffer. See e.g. ClientCnxnSocket.readConnectResult.
I investigated this in ZK 3.4.
There is no chunking in the response.
This is a server-side setting. You will get this error if the entirety of the response is greater than the jute.maxbuffer setting. This response limit includes the list of children of znodes as well, so even if /subs does not have a lot of data but has enough children that the total length of their paths exceeds the max buffer size, you will get the error.

VxWorks 6.3 active sockets maxes out at 255?

I have an LPD server running on VxWorks 6.3. The client application (over which I have no control) is sending me an LPQ query every tenth of a second. After 235 requests, the client receives an RST when trying to connect. After a time the device will again accept some queries (about 300), until it again starts sending out RSTs.
I have confirmed that it is the TCP stack that is causing the RST. There are some things that I have noticed.
1) I can somewhat change the number of sockets that will be accepted if I change the number of other applications that are running. For example, I freed up 4 sockets, thereby changing the number accepted from 235 to 239.
2) If I send requests to lpr (port 515) and another port (say, port 80), the total number of connections that are accepted before the RSTs start happening stays constant at 235.
3) There are lots of sockets sitting in TIME_WAIT.
4) I have a mock version of the client. If I slow the client down to one request every quarter second, the server doesn't reject the connections.
5) If I slow down the server's responses, I don't have any connections rejected.
So my theory is that there is some shared resource (my top guess is the total number of socket handles) that VxWorks allows to be consumed at a given time. I'm also guessing that this number tops out at 255.
Does anyone know how I can get VxWorks to accept more connections and leave them in TIME_WAIT when closed? I have looked through the kernel configuration and changed all the values that looked remotely likely, but I have not been able to change the number.
We know that we could set SO_LINGER, but this is not an acceptable solution, even though it does prevent the client connections from getting rejected. We have also tried changing the timeout value for SO_LINGER, but this does not appear to be supported in VxWorks; it's either on or off.
Thanks!
Gail
To me it sounds like you are making a new connection for every LPQ query, and after the query is done you aren't closing the connection. In my opinion the correct thing to do is to accept one TCP connection and then use that for all of the LPQ queries; however, that may require mods to the client application. To avoid mods to the client, you should just close the TCP connection after each LPQ query (see the sketch below).
Furthermore, you can set the max number of FDs open in VxWorks by adjusting the #define NUM_FILES in config.h (or configall.h or one of those files), but that will just postpone the error if you have an FD leak, which you probably do.
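A rough sketch of the "close after each query" approach on the server side; handle_lpq_query() is a hypothetical placeholder for whatever the LPD server does with a request, and listen_fd is assumed to be a TCP socket already listening on port 515:

/* Serve one request per connection, then close the descriptor promptly
 * so the fd table does not fill up. */
#include <sys/socket.h>
#include <unistd.h>

static void handle_lpq_query(int fd)
{
    /* hypothetical: read the LPQ request from fd and write the reply */
    (void)fd;
}

void serve_forever(int listen_fd)
{
    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0)
            continue;                /* transient error; try again */

        handle_lpq_query(conn);

        shutdown(conn, SHUT_WR);     /* make sure our reply is flushed */
        close(conn);                 /* release the descriptor right away */
    }
}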

RAW socket send: packet loss

During raw-socket-based packet send testing, I found some very irritating symptoms.
With the default raw socket settings (especially the SO_SNDBUF size), the raw socket sent 100,000 packets without problem, but it took about 8 seconds to send all the packets, and the packets were correctly received by the receiver process. That means about 10,000 pps (packets per second) is achieved by the default setting. (I think that is too small a figure, contrary to my expectation.)
Anyway, to increase the pps value, I increased the packet send buffer size by adjusting /proc/sys/net/core/{wmem_max, wmem_default}.
After increasing the two system parameters, I observed the irritating symptom: the 100,000 packets are sent promptly, but only 3,000 packets are received by the receiver process (located at a remote node).
At the sending Linux box (CentOS 5.2), I ran netstat -a -s and ifconfig. Netstat showed that 100,000 requests were sent out, but ifconfig shows that only 3,000 packets were TXed.
I want to know the reason why this happens, and I also want to know how I can solve this problem (of course, I don't know whether it is really a problem).
Could anybody give me some advice, examples, or references for this problem?
Best regards,
bjlee
You didn't say what size your packets were or any characteristics of your network, NIC, hardware, or anything about the remote machine receiving the data.
I suspect that instead of playing with /proc/sys stuff, you should be using ethtool to adjust the number of ring buffers, but not necessarily the size of those buffers.
Also, this page is a good resource.
I have just been working with essentially the same problem. I accidentally stumbled across an entirely counter-intuitive answer that still doesn't make sense to me, but it seems to work.
I was trying larger and larger SO_SNDBUF buffer sizes, and losing packets like mad. By accidentally overrunning my system-defined maximum, I ended up setting the SO_SNDBUF size to a very small number instead, but oddly enough, I no longer had the packet loss issue. So I intentionally set SO_SNDBUF to 1, which again resulted in a very small effective size (not sure, but I think it actually set it to something like 1k), and amazingly enough, still no packet loss.
If anyone can explain this, I would be most interested in hearing it. In case it matters, my version of Linux is RHEL 5.11 (yes, I know, I'm a bit behind the times).

setsockopt with SO_SNDBUF/SO_RCVBUF is not working

I was using setsockopt with SO_SNDBUF/SO_RCVBUF to set the TCP send/receive buffers to 256*1024 bytes.
But when I look in Wireshark, I can see that the TCP "Window Size" is shown as only 1525.
Also, wmem_max and rmem_max are set to 131071 (~128 KB), so ideally I was expecting the window to be at least 128 KB.
Can anyone please help with this?
Or could this be a problem with Wireshark showing the wrong "Window Size"?
You need to set that size on the listening socket at the server, before any accept(), and at the client you need to set it on the socket before you connect it. That way you are allowing TCP's 'window scaling' option to take effect, which can only happen during the connect handshake. After the connection is established it is too late. That way the TCP receive window can be as large as the receive buffer, assuming various other conditions hold.
However unless you have an extraordinarily high-latency network with extraordinary bandwidth, 256k may be too large a size. There is no point whatsoever in setting it higher than the bandwidth-delay product, which can be calculated as the bandwidth in bytes/second times the delay in seconds, giving a result in bytes.
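A minimal sketch of where those setsockopt() calls have to go, keeping the 256 KB figure from the question. Error handling is omitted, and note that Linux silently clamps requests above rmem_max/wmem_max, which is exactly what the 131071 limit in the question would do:

/* Sketch: buffer sizes must be set before the TCP handshake so the
 * window-scale option can be negotiated. Error handling omitted. */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#define BUF_SIZE (256 * 1024)   /* silently clamped to rmem_max/wmem_max */

/* Server: set SO_RCVBUF on the listening socket before listen()/accept();
 * accepted sockets inherit it and advertise a matching window scale. */
int make_listener(unsigned short port)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int size = BUF_SIZE;
    setsockopt(lfd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 16);
    return lfd;
}

/* Client: set SO_RCVBUF/SO_SNDBUF before connect(), not after. */
int make_client(const struct sockaddr_in *server)
{
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    int size = BUF_SIZE;
    setsockopt(cfd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size));
    setsockopt(cfd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size));
    connect(cfd, (const struct sockaddr *)server, sizeof(*server));
    return cfd;
}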