How does performance of TCP and Unix Domain Sockets scale with number of processes and payload size? - sockets

Say I have a server on the same machine as some workers, each of which talks to it. They could be talking over TCP or Unix Domain Sockets. How does the performance scale with the number of workers and the message size?
When I speak of performance, I'm interested not only in mean latencies, but also in p90 and p99 latencies.

As for TCP, you can measure its performance yourself (like this). Do a few tests.
Run iperf in server mode and a few iperf processes in client mode, setting the length of the buffer to read or write. Then repeat while varying the buffer length and the number of client processes.
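Note that iperf measures throughput rather than latency percentiles, so for the mean/p90/p99 numbers the question asks about you may want a small echo benchmark of your own. The sketch below is illustrative only (the socket path, port, payload size and round count are made up); run several copies of the client part in parallel to see how it scales with the number of workers:

    # latency_bench.py - rough sketch: round-trip latency over TCP loopback vs. a
    # Unix domain socket (Linux/Unix only). All values here are illustrative.
    import os
    import socket
    import statistics
    import time
    from multiprocessing import Process

    UDS_PATH = "/tmp/bench.sock"
    MSG_SIZE = 1024          # payload size; vary this
    N_ROUNDS = 10000

    def serve(family, addr):
        with socket.socket(family, socket.SOCK_STREAM) as srv:
            srv.bind(addr)
            srv.listen(1)
            conn, _ = srv.accept()
            with conn:
                while True:
                    data = conn.recv(MSG_SIZE)
                    if not data:
                        break
                    conn.sendall(data)       # echo the payload back

    def bench(family, addr):
        payload = b"x" * MSG_SIZE
        lat = []
        with socket.socket(family, socket.SOCK_STREAM) as cli:
            cli.connect(addr)
            for _ in range(N_ROUNDS):
                t0 = time.perf_counter()
                cli.sendall(payload)
                got = 0
                while got < MSG_SIZE:        # recv() may return partial reads
                    got += len(cli.recv(MSG_SIZE - got))
                lat.append(time.perf_counter() - t0)
        lat.sort()
        return statistics.mean(lat), lat[int(0.90 * len(lat))], lat[int(0.99 * len(lat))]

    if __name__ == "__main__":
        for label, family, addr in [("tcp", socket.AF_INET, ("127.0.0.1", 5555)),
                                    ("uds", socket.AF_UNIX, UDS_PATH)]:
            if family == socket.AF_UNIX and os.path.exists(UDS_PATH):
                os.unlink(UDS_PATH)
            Process(target=serve, args=(family, addr), daemon=True).start()
            time.sleep(0.2)                  # crude wait for the listener to come up
            mean, p90, p99 = bench(family, addr)
            print(f"{label}: mean={mean*1e6:.1f}us p90={p90*1e6:.1f}us p99={p99*1e6:.1f}us")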

Related

network redirection of 2 physical serial ports

I have a question about the best way to redirect a serial stream over a TCP/IP connection, and I have some restrictions:
The network connection might be unstable
Communication is one way only
Communication has to be as real-time as possible, so that buffers don't keep growing
The serial speeds are different; the RX side is faster than the TX side
I've played with socat, but most of the examples are for pty virtual serial ports and I haven't managed to make them work with a pair of physical serial ports.
The ser2net daemon on LEDE/OpenWrt seems to be unstable.
I had a look at pyserial, but I can only find a server-to-client example, and my Python coding skills are terrible.
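Since pyserial came up, here is a rough one-way forwarder sketch. The device paths, host, port and baud rates are all placeholders, and it only pushes bytes from the fast side to the slow side without ever reading back:

    # One-way serial-over-TCP forwarder sketch (two halves, normally two scripts).
    import socket
    import serial            # pyserial

    def run_tx():
        # Fast side: read the physical serial port, push the bytes out over TCP.
        with serial.Serial("/dev/ttyUSB0", 115200, timeout=1) as port, \
             socket.create_connection(("192.0.2.10", 5331)) as sock:
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # favour latency over batching
            while True:
                data = port.read(4096)          # whatever arrived within the timeout
                if data:
                    sock.sendall(data)

    def run_rx():
        # Slow side: accept one connection and write everything to the serial port.
        srv = socket.create_server(("0.0.0.0", 5331))
        conn, _ = srv.accept()
        with serial.Serial("/dev/ttyUSB1", 9600) as port:
            while True:
                data = conn.recv(4096)
                if not data:
                    break                       # sender went away; a real tool would reconnect
                port.write(data)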

What is connectionsPerHost in MongoDB?

I followed this tutorial and there is a configuration option, connections per host.
What is this?
connectionsPerHost is the number of physical connections a single Mongo client instance (it's a singleton, so you usually have one per application) can establish to a mongod/mongos process. At the time of writing, the Java driver will eventually establish this many connections even if the actual query throughput is low (in other words, you will see the "conn" statistic in mongostat rise until it hits this number per app server).
There is no need to set this higher than 100 in most cases, but this setting is one of those "test it and see" things. Do note that you will have to make sure you set it low enough that the total number of connections to your server does not exceed
Found here: How to configure MongoDB Java driver MongoOptions for production use?
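The answer above is about the Java driver's MongoOptions; purely for illustration, the analogous knob in pymongo is maxPoolSize (the URI and pool size below are example values):

    from pymongo import MongoClient

    # One client per application; the driver keeps at most maxPoolSize
    # physical connections open to each mongod/mongos it talks to.
    client = MongoClient("mongodb://localhost:27017", maxPoolSize=100)
    print(client.mydb.mycoll.estimated_document_count())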

Is transmitting a file over multiple sockets faster than just using one socket?

In this old project (from 2002), it says that if you split a file into multiple chunks and then transmit each chunk using a different socket, it will arrive much faster than transmitting it as a whole over one socket. I also remember reading (many years ago) that some download managers use this technique. How accurate is this?
Given that a single TCP connection with large windows or a small RTT can saturate any network link, I don't see what benefit you would expect from multiple TCP sessions. Each new connection begins with slow start and so has a lower transfer rate than an established connection would have.
TCP already has mechanisms for high-throughput, high-latency connections (the "window scale" option) and for dealing with packet loss. Attempting to improve on this with parallel connections will generally have a negative effect: more failure cases and increased packet loss (due to congestion, which TCP on a single connection can manage).
Multiple TCP sessions are only beneficial if you're doing simultaneous fetches from different peers and the network bottleneck is outside your local network (as with BitTorrent), or if the server applies per-connection bandwidth limits (at which point you're optimizing around the server, not TCP or the network).

httperf for benchmarking web servers

I am using httperf to benchmark web servers. My configuration: i5 processor and 4GB RAM. How do I stress this configuration to get accurate results? I mean, I have to put 100% load on this server (12.04 LTS).
You can use httperf like this:
$ httperf --server <hostname> --port <port> --wsesslog=200,0,urls.log --rate 10
Here urls.log contains the different URIs/paths to be requested. Check the documentation for details.
Now try changing the rate or session values and see how many requests per second you can achieve and what the reply time is. In the meantime, monitor the CPU and memory utilization with mpstat or top to see whether the server is reaching 100%.
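For example, urls.log could just be a plain list of paths, one per line (these paths are made up; see the httperf documentation for the full wsesslog session/burst syntax):

    /index.html
    /images/logo.png
    /api/status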
What's tricky about httperf is that it often saturates the client first, because of 1) the per-process open-files limit, 2) the TCP port number limit (excluding the reserved ports 0-1024, there are only 64512 ports available for TCP connections, so if each connection ties up its port for about a minute, you can only sustain around 1075 new connections per second), and 3) the socket buffer sizes. You will probably need to raise these limits to avoid saturating the client (a quick way to check them is sketched below).
To saturate a server with 4GB of memory, you would probably need multiple physical machines. I tried 6 clients, each issuing 300 req/s to a 4GB VM, and that saturates it.
However, there are still other factors affecting the result, e.g., the pages deployed on your Apache server and the workload access patterns. But the general suggestions are:
1. Test the request workload that is closest to your target scenario.
2. Add more physical clients and watch how the response rate, response time, and error count change, to make sure you are not saturating the clients.
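As a rough, Linux-only check of the client-side limits mentioned above (treat this as an illustrative sketch, not a tuning recipe):

    import resource

    # Per-process open-files limit (each connection costs a descriptor)
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("open files (soft/hard):", soft, hard)

    # Ephemeral port range available for outbound connections
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        print("ephemeral port range:", f.read().strip())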

Massive holding open of TCP/IP sockets

I would like to create a server infrastructure that allows 500 clients to connect at the same time and stay connected indefinitely. The plan is to have clients connect to TCP/IP sockets on the server and keep the connections open, with the server sending data to clients at random times and clients sending data to the server at random times, similar to a small MMOG, but with scarcely any data. I came up with this plan as an alternative to having each client poll over TCP every 15-30 seconds.
My question is, in leaving these connections open, will this cause a massive sum of server bandwidth usage at idle? Is this the best approach without getting into the guts of TCP?
TCP uses no bandwidth when idle, except maybe a few bytes every so often (default is 2 hours) if "keep-alive" is enabled.
500 connections is nothing, though epoll() is a good choice to reduce system overhead. 5000 connections might start to become an issue.
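For reference, a minimal single-threaded sketch that holds many mostly-idle connections looks roughly like this (using Python's selectors module, which picks epoll on Linux; the port, backlog and buffer size are made-up example values):

    import selectors
    import socket

    sel = selectors.DefaultSelector()

    def accept(srv):
        conn, _ = srv.accept()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, handle)

    def handle(conn):
        data = conn.recv(4096)
        if data:
            conn.sendall(data)       # echo; a real server would route the message
        else:
            sel.unregister(conn)     # client went away
            conn.close()

    srv = socket.create_server(("0.0.0.0", 9000), backlog=500)
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ, accept)

    while True:
        for key, _ in sel.select():  # blocks until a socket has data or a new client arrives
            key.data(key.fileobj)    # call the handler registered for that socket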
Bandwidth isn't your major concern, but there's a limit to the number of connections you can have open (though it is quite high).
But if polling every 15 seconds would be fast enough, I'd consider it a waste to keep the connection open.