I am using HAProxy 1.4.18 with balance source to balance a TCP stream across 2 servers. However, from an admittedly very small sample of connections, it appears that they all go to one server: the one listed first in the haproxy config.
listen videos *:1935
    balance source
    mode tcp
    server server1 192.168.0.1:1935
    server server2 192.168.0.2:1935
I have not seen it split the load across the 2 boxes. It does work when I use balance roundrobin; however, for this particular application I cannot use that method.
Any ideas for how to keep sessions persistent while still load balancing clients across these 2 machines?
Cheers
How did you test the balancing? The documentation says:
The source IP address is hashed and divided by the total
weight of the running servers to designate which server will
receive the request. This ensures that the same client IP
address will always reach the same server as long as no
server goes down or up. If the hash result changes due to the
number of running servers changing, many clients will be
directed to a different server. This algorithm is generally
used in TCP mode where no cookie may be inserted. It may also
be used on the Internet to provide a best-effort stickiness
to clients which refuse session cookies. This algorithm is
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type"
If you tested with just 2 different source IPs, you may have hit the particular case where both addresses hash to the same server.
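To illustrate why a tiny sample can land on one server, here is a minimal Python sketch of source hashing. It is an assumption-laden stand-in: CRC32 is used only for illustration and is not HAProxy's actual hash function, pick_server is a hypothetical helper, and both servers are assumed to have equal weight.

```python
import ipaddress
import zlib


def pick_server(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to a server by hashing the address (illustrative only)."""
    # Stand-in hash: CRC32 over the packed address bytes.
    # HAProxy uses its own hash; only the principle is the same.
    digest = zlib.crc32(ipaddress.ip_address(client_ip).packed)
    # Equal weights assumed, so the total weight equals the number of servers.
    return servers[digest % len(servers)]


servers = ["server1", "server2"]
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]:
    # The same IP always maps to the same server; two different IPs may well
    # map to the same one, which is why a sample of two proves little.
    print(ip, "->", pick_server(ip, servers))
```

With only two client addresses there is a sizeable chance that both map to the same server, so testing from a larger set of source IPs gives a fairer picture.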
One of the problems with reverse proxies handling many requests on behalf of clients is that, after a while under heavy load, outgoing connections from the Envoy node to the backend nodes will run out of ephemeral ports.
Assuming that we have assigned multiple IP addresses/hostnames to the Envoy node, is there a way to tell Envoy to use these IP addresses/hostnames in a round-robin fashion when making connections to backends?
References:
https://blog.box.com/blog/ephemeral-port-exhaustion-and-web-services-at-scale/
https://making.pusher.com/ephemeral-port-exhaustion-and-how-to-avoid-it/
https://www.nginx.com/blog/overcoming-ephemeral-port-exhaustion-nginx-plus/
https://github.com/kubernetes/kubernetes/issues/27398
The most promising option is to find a way to enable TCP multiplexing between your proxy/LB and backend servers.
What is TCP Multiplexing?
TCP multiplexing is a technique used primarily by load balancers and application delivery controllers (but also by some stand-alone web application acceleration solutions) that enables the device to "reuse" existing TCP connections. This is similar to the way in which persistent HTTP 1.1 connections work in that a single HTTP connection can be used to retrieve multiple objects, thus reducing the impact of TCP overhead on application performance.
TCP multiplexing allows the same thing to happen for TCP-based applications (usually HTTP / web) except that instead of the reuse being limited to only 1 client, the connections can be reused over many clients, resulting in much greater efficiency of web servers and faster performing applications.
Another good explanation about TCP multiplexing can be found here.
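The reuse idea can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular load balancer implements multiplexing: ConnectionPool is a hypothetical class, and it assumes the backend protocol allows one connection to serve requests from different clients one after another.

```python
import queue
import socket


class ConnectionPool:
    """Keep a bounded set of idle backend connections and hand them out on demand."""

    def __init__(self, backend, max_idle=4):
        self._backend = backend  # (host, port) of the backend server
        self._idle = queue.LifoQueue(maxsize=max_idle)
        self.opened = 0  # number of real TCP connections created so far

    def acquire(self):
        try:
            # Reuse an idle connection: no new ephemeral port is consumed.
            return self._idle.get_nowait()
        except queue.Empty:
            self.opened += 1
            return socket.create_connection(self._backend)

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            conn.close()  # pool is full, so drop the surplus connection
```

Each request handler calls acquire(), talks to the backend, then calls release(). The number of outgoing ports in use is then bounded by the number of concurrent backend requests rather than by the number of clients.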
Another option is to add more proxy instances to the pool behind the L4 network load balancer and set the connection limit for each instance to a reasonable value.
Each proxy would then carry a certain amount of load without a problem. If you need to handle periodic bursts in load, you may want to configure an auto-scaling strategy for the proxy pool.
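The question's idea of rotating over several local addresses can be sketched outside Envoy as follows. This is plain Python rather than Envoy configuration; SourceRotator and the addresses in the usage comment are hypothetical, and the sketch assumes every listed address is actually assigned to the proxy host.

```python
import itertools
import socket


class SourceRotator:
    """Open backend connections from a rotating list of local source addresses."""

    def __init__(self, source_ips):
        self._cycle = itertools.cycle(source_ips)

    def next_source(self):
        # Round robin over the configured local addresses.
        return next(self._cycle)

    def connect(self, backend, timeout=5.0):
        # Port 0 lets the OS choose an ephemeral port on the chosen address,
        # so each source IP contributes its own ephemeral port range.
        source = (self.next_source(), 0)
        return socket.create_connection(backend, timeout=timeout, source_address=source)


# Usage (addresses are placeholders):
# rotator = SourceRotator(["192.0.2.10", "192.0.2.11"])
# conn = rotator.connect(("198.51.100.5", 8080))
```

Because a TCP connection is identified by source address, source port, destination address, and destination port, each additional source address multiplies the number of distinct connections the proxy can hold open to a single backend endpoint.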
I understand why a server would need sockets for incoming data, but I do not understand why a socket connecting to another computer needs a source port.
While others have mentioned the exact reason why, let me illustrate the point by giving you an example:
Say you want to ssh to your server. OK, you ssh in and do some stuff. Then you tail a log file, so now you don't have access to the console anymore. No problem, you think, I'll ssh again...
With one port number, if you ssh again, that second connection will be a mirror of the first, since the server won't know that there are two connections (there is no source port number to tell them apart). So you're out of luck.
With two port numbers you can ssh a second time to get a second console.
Say you browse a website, say Stackoverflow. You're reading a question but you think you've seen it before. You open a new tab in your browser to stackoverflow to do a search.
With only one port number, the server has no way of knowing which packet belongs to which socket on the client, so opening a second page will not be possible (or worse, both pages receive mixed data from each other).
With two port numbers the server will see two different connections from the client and send the correct data to the correct tab.
So you need two port numbers: one for the client to tell which data is coming from which server, and one for the server to tell which data is coming from which socket on the client.
A TCP connection is defined in terms of the source and destination IP addresses and port numbers.
Otherwise for example you could never distinguish between two connections to the same server from the same client host.
Check out this link:
http://compnetworking.about.com/od/basiccomputerarchitecture/g/computer-ports.htm
Ultimately, they allow different applications and services to share the same networking resources. For example, your browser probably connects to port 80, while your email application may use port 25.
TCP communication is two-way. A segment being sent from the server, even if it is in response to a segment from the client, is an incoming segment as seen from the client. If a client opens multiple connections to the same port on the server (such as when you load multiple StackOverflow pages at once), both the server and the client need to be able to tell the TCP segments from the different connections apart; this is done by looking at the combination of source port and destination port.
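The point about telling connections apart can be checked on a single machine. The following minimal sketch (open_two_connections is a hypothetical helper, and it assumes the loopback interface is available) opens two connections to the same listening port and returns both endpoint pairs.

```python
import socket


def open_two_connections():
    """Open two connections to one listener; return (client, server) endpoint pairs."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(2)
    server_addr = listener.getsockname()

    clients = [socket.create_connection(server_addr) for _ in range(2)]
    accepted = [listener.accept() for _ in range(2)]

    # As seen by the server: the peer (client) endpoint and its own local endpoint.
    pairs = [(peer, conn.getsockname()) for conn, peer in accepted]

    for conn, _ in accepted:
        conn.close()
    for client in clients:
        client.close()
    listener.close()
    return pairs


for client_endpoint, server_endpoint in open_two_connections():
    print("client", client_endpoint, "-> server", server_endpoint)
```

Both lines show the same server address and port and the same client IP; only the client's source port differs, and that is what lets each side keep the two streams apart.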
The server consists of several services with which a user interacts: profiles, game logic, physics.
I heard that it's a bad practice to have multiple client connections to the same server.
I'm not sure whether I will use UDP or TCP.
The services are realtime and should reply as fast as possible, so I don't want to add any rerouting unless there is a really important reason. So are there any reasons, in my case, to reroute traffic through one external endpoint service to specific internal services?
This seems to be multiple questions in one package. I will try to answer the ones I can identify as separate...
UDP vs TCP: You're saying "real-time", which usually means UDP is the right choice. However, that means having to deal with lost packets and possible re-ordering of packets. On the other hand, using UDP leaves a couple of delay-reducing tricks open.
Multiple connections from a single client to a single server: This consumes resources (end-points, as it were) on both the client (probably ignorable) and on the server (possibly a problem, possibly ignorable). The advantage of using separate connections for separate concerns (profiles, physics, ...) is that when you need to split these onto separate servers (or server farms), you don't need to update the clients; they just need to connect to other end-points, using code that's already tested.
"Re-router" (or "load balancer") needed: Probably not going to be an issue initially. However, it will probably become an issue later. Depending on your overall design and server OS, using UDP may actually become an asset here. A UDP packet arrives at the load balancer and is dispatched to the right backend, which could then, in theory, send back a reply with the source IP of the load balancer.
An alternative would be to have a "session broker". The client makes an initial connection to a well-known endpoint and says "I am a client, tell me where my profile, physics, what-have-you servers are". The broker considers the current load, possibly the location of the client, and other things that may make sense, and the client then connects to the relevant backends on its own. The downside of this is that it's harder (not impossible, but harder) to silently migrate an ongoing session to a new backend. When there's a load balancer in the way, this can be done essentially transparently.
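The broker's decision step might look roughly like the sketch below. Everything in it is hypothetical: the pick_backends helper, the service names, the endpoints, and the load figures are placeholders, and "lowest reported load" is just one possible selection rule.

```python
def pick_backends(backends_by_service):
    """For each service, return the endpoint with the lowest reported load."""
    return {
        service: min(candidates, key=lambda candidate: candidate[1])[0]
        for service, candidates in backends_by_service.items()
    }


# Hypothetical load table: service name -> list of (endpoint, current load).
load_table = {
    "profiles": [("10.0.1.1:7000", 0.80), ("10.0.1.2:7000", 0.35)],
    "physics": [("10.0.2.1:7100", 0.10), ("10.0.2.2:7100", 0.60)],
}

# The broker would send this mapping back to the client, which then
# connects to each listed endpoint on its own.
print(pick_backends(load_table))
```

A real broker could also weigh client location or other factors mentioned above; the client-side code stays the same because it only consumes the returned mapping.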
When using sockets for IPC, you can get the system to pick a random free port as described in this question here:
On localhost, how to pick a free port number?
There is a convention of putting the process ID in a ".pid" file so that you can, for example, easily find the Apache process ID and kill it.
But what is the best-practice way to exchange the port number when the OS picks a random port for you to listen on?
To communicate the port number you can use any other transport mechanism: a file on a shared disk, pigeon mail, SMS, a third-party server, a dynamically updated DNS entry, etc. The parties must have something in common to share; then they can communicate. I omit port scanning here for the obvious reason.
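For the shared-file variant, a minimal sketch by analogy with a ".pid" file could look like this. The function names and the file path are hypothetical, and the sketch assumes both processes can read the same directory.

```python
import os
import socket


def listen_and_publish(port_file):
    """Listen on an OS-assigned port and record the port number in port_file."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
    sock.listen()
    port = sock.getsockname()[1]

    # Write to a temporary file and rename it, so a reader never sees
    # a partially written port file.
    tmp_path = port_file + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write(f"{port}\n")
    os.replace(tmp_path, port_file)
    return sock


def read_port(port_file):
    """Client side: read the port number the server published."""
    with open(port_file) as handle:
        return int(handle.read().strip())
```

The server calls listen_and_publish("/path/to/server.port") at startup and the client calls read_port on the same path before connecting. Removing the file on shutdown avoids clients reading a stale port.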
There's one interesting variant that uses not a random port but a "floating" port number: if you don't want to keep a constant port but can choose the listening port within a certain range, you can calculate the actual port number from the date, the day of the week, or other periodic or predictable information. This way the client knows where to look for the server.
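A date-based "floating" port could be computed as in the sketch below. The base port, the range size, and the function name are arbitrary assumptions, and it only works if client and server agree on the current date (for example, by both using UTC).

```python
import datetime


def port_for_date(day, base=20000, span=1000):
    """Derive a port in [base, base + span) from a calendar date."""
    # toordinal() is a day counter, so the port advances by one each day
    # and wraps around after `span` days.
    return base + day.toordinal() % span


today = datetime.datetime.now(datetime.timezone.utc).date()
print("listen/connect on port", port_for_date(today))
```

Both sides run the same function, so no port number has to be exchanged at all; the cost is that the chosen port may already be taken by another process on that day.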
One more option is that, during communication started on one port, the server and the client agree on the port where the server will wait for the client's next session.
I have two app servers behind an haproxy load balancer. Is there a configuration that lets me split the traffic between the two as I require, such as sending x% of the requests to server A and the rest to server B?
You have several options, but I do not think that you can directly do what you want to do. Dividing the traffic close to 50/50 is as easy as setting the load balancing algorithm to "round robin". It sounds like what you want is to send, say, 15% of traffic to server A and 85% to server B. To do this, simply set a cookie on the client (some random number between 1 and 100, for example) and then send all traffic with a cookie value of less than 16 to server A and the rest to server B.
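The cookie-bucket logic described above can be sketched as follows. This is an illustration of the routing rule only, not haproxy configuration; the function names are hypothetical, and the bucket value stands for whatever number was stored in the client's cookie.

```python
import random


def assign_bucket(rng=random):
    """Pick the value to store in a new client's cookie: an integer from 1 to 100."""
    return rng.randint(1, 100)


def route(bucket, cutoff=16):
    """Send buckets below the cutoff to server A and the rest to server B."""
    return "A" if bucket < cutoff else "B"


# With cutoff=16, buckets 1..15 go to A, which is 15 of the 100 possible values.
share_a = sum(route(bucket) == "A" for bucket in range(1, 101))
print(f"server A receives {share_a}% of the buckets")
```

Because the bucket is stored per client, each client keeps hitting the same server, and the split converges on 15/85 only as the number of distinct clients grows.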