Our stateful service holds sessions in memory. It takes about a minute to save a session or load it back into memory, so we use Redis to track which node a session is currently loaded on and to lock requests to the other nodes.
But HAProxy sometimes switches sessions between nodes. When that happens, say the session is in memory on the first node and HAProxy switches to the second one, the request has to wait: the first node needs to save its state and the second one needs to restore it. While this is happening, HAProxy apparently decides the node is down, so it starts switching other requests as well, and the same process repeats for them. We increased the HAProxy waiting timeout, but it didn't help. How can we make HAProxy send this request and all subsequent requests from a specific session to a specific node?
Something like a 303 redirect:
Location: 192.168.1.2
Okay, so your application requires session stickiness.
Since you already have cookies in place for session handling, I suggest also using cookie stickiness in HAProxy.
In short, here is a config snippet from this blog post: https://www.haproxy.com/blog/load-balancing-affinity-persistence-sticky-sessions-what-you-need-to-know/
backend bk_web
    balance roundrobin
    # insert a SERVERID cookie so each client keeps hitting the same server
    cookie SERVERID insert indirect nocache
    server s1 192.168.10.11:80 check cookie s1
    server s2 192.168.10.21:80 check cookie s2
Baptiste has a rather detailed explanation of this setup in "HAproxy 1.5.8 How do I configure Cookie based stickiness?".
When you use more than one HAProxy server, you can sync the stickiness state between them via the peers protocol, which is described in this blog post:
https://www.haproxy.com/blog/introduction-to-haproxy-stick-tables/
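A rough sketch of a peers setup, assuming two load balancers (the peer names and addresses are placeholders, not from the post; the local peer name must match the hostname or the name passed with -L):
peers mypeers
    peer haproxy1 192.168.10.1:10000
    peer haproxy2 192.168.10.2:10000

backend bk_web
    balance roundrobin
    # replicate stick-table entries between both HAProxy instances
    stick-table type ip size 1m expire 1h peers mypeers
    stick on src
    server s1 192.168.10.11:80 check
    server s2 192.168.10.21:80 check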
I have a web application spread over multiple servers, and the incoming traffic is handled by HAProxy to balance the load. We deploy at night because there are far fewer users then, so the impact on the service is smaller. To deploy we use the following strategy:
we shut down half of the servers
we deploy on the servers that are turned off
we turn those servers back on
we repeat the same procedure on the other half
The problem is that whenever I turn off the servers, we drop users' connections. Is there a better strategy for doing this? How could I improve this, avoid service disruption, and maybe even be able to deploy during the day?
I hope I was clear. Thanks
I strongly suggest using health checks for the servers.
Using HAProxy as an API Gateway, Part 3 [Health Checks]
You should have a URL ("/health") that can be used to health-check each backend server, and add option redispatch to the config.
When you want to do maintenance on a backend server, just "remove" the "/health" URL and HAProxy automagically routes users to the other available servers, as in the sketch below.
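A minimal sketch of that idea (server names and addresses are placeholders): a server whose /health endpoint stops answering 200 is marked down, and option redispatch re-routes requests that were aimed at it to a healthy server.
backend bk_web
    balance roundrobin
    option redispatch              # re-route a request if its target server is down
    option httpchk GET /health     # health-check endpoint served by the application
    http-check expect status 200
    server s1 192.168.10.11:80 check
    server s2 192.168.10.21:80 check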
I'm load testing a Tomcat web application with 4 nodes. Those nodes are configured through Nginx with ip_hash:
ip_hash;
server example:8888 weight=2 max_fails=3 fail_timeout=10s;
server example:8888 weight=4 max_fails=3 fail_timeout=10s;
server example:8888 weight=2 max_fails=3 fail_timeout=10s;
server example:8888 weight=2 max_fails=3 fail_timeout=10s;
Anyway, I use Gatling for load and performance testing, but every time I start a test all traffic is routed to one node. Only when I change the load balancing method to least_conn or round robin is the traffic divided. But this application needs a persistent node to do the work.
Is there any way to let Gatling route the traffic to all 4 nodes during a run? Maybe with a setup configuration? I'm using this setUp right now:
setUp(
  scenario1.inject(
    atOnceUsers(50),
    rampUsers(300) over (1800 seconds)
  ).protocols(httpConf)
)
Thank you!
ip_hash;
Specifies that a group should use a load balancing method where requests are distributed between servers based on client IP addresses.
You should use sticky:
Enables session affinity, which causes requests from the same client to be passed to the same server in a group of servers.
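For example, a sketch of what that looks like (the upstream and server names are placeholders):
upstream backend {
    sticky cookie srv_id expires=1h;
    server backend1.example.com:8888;
    server backend2.example.com:8888;
}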
Edit:
Right, I didn't see that it's for nginx plus only :(
I found this post (maybe it helps...):
https://serverfault.com/questions/832790/sticky-sessions-with-nginx-proxy
Reference to: https://bitbucket.org/nginx-goodies/nginx-sticky-module-ng
There is also a version of the module for older versions of nginx:
http://dgtool.treitos.com/2013/02/nginx-as-sticky-balancer-for-ha-using.html
Reference to: https://code.google.com/archive/p/nginx-sticky-module/
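With that third-party sticky module compiled into nginx, the upstream from your question would look roughly like this (a sketch; the upstream name, cookie name and expiry are illustrative):
upstream tomcat_nodes {
    sticky name=route expires=1h;
    server example:8888 weight=2 max_fails=3 fail_timeout=10s;
    server example:8888 weight=4 max_fails=3 fail_timeout=10s;
    server example:8888 weight=2 max_fails=3 fail_timeout=10s;
    server example:8888 weight=2 max_fails=3 fail_timeout=10s;
}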
I'm facing a configuration issue with HAProxy (1.8).
Context:
In an HAProxy config, I have several servers in a backend, and an additional backup server in case the others are down.
Once a client gets an answer from a server, it must stick to this server for its next queries.
For some good reasons, I can't use a cookie for this concern, and I had to use a stick-table instead.
Problem:
When every "normal" server is down, clients are redirected to the backup server, as expected.
BUT the stick-table is then filled with an association between the client and the ID of the backup server.
AND when the "normal" servers come back, clients that are in the stick table and associated with the ID of the backup server keep being redirected to the backup server instead of the normal ones!
This is really upsetting me...
So my question is: how can I prevent HAProxy from sticking clients to a backup server in a backend?
Please find below a configuration sample:
defaults
    option redispatch

frontend fe_test
    bind 127.0.0.1:8081
    stick-table type ip size 1m expire 1h
    acl acl_test hdr(host) -i whatever.domain.com
    ...
    use_backend be_test if acl_test
    ...

backend be_test
    mode http
    balance roundrobin
    stick on hdr(X-Real-IP) table fe_test
    option httpchk GET /check
    server test-01 server-01.lan:8080 check
    server test-02 server-02.lan:8080 check
    server maintenance 127.0.0.1:8085 backup
(I've already tried giving the backup server a lower weight, but it didn't solve the issue.)
I read in the documentation that the "stick on" keyword takes "if/unless" conditions, and maybe I could use that to write a condition based on the backend server names, but I have no clue about the syntax to use, or even whether it is possible.
Any idea is welcome!
So silly of me! I was so obsessed with the stick-table configuration that I didn't think to look at the server options...
There is a simple keyword that perfectly solves my problem: non-stick
Never add connections allocated to this server to a stick-table. This
may be used in conjunction with backup to ensure that stick-table
persistence is disabled for backup servers.
So the last line of my configuration sample simply becomes:
server maintenance 127.0.0.1:8085 backup non-stick
...and everything is now working as I expected.
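Putting it together, the backend from my sample above now reads:
backend be_test
    mode http
    balance roundrobin
    stick on hdr(X-Real-IP) table fe_test
    option httpchk GET /check
    server test-01 server-01.lan:8080 check
    server test-02 server-02.lan:8080 check
    server maintenance 127.0.0.1:8085 backup non-stick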
So if this is the backend config:
backend main
    mode http
    balance leastconn
    cookie serverid insert indirect nocache
    stick-table type string len 36 size 1m expire 8h
    stick on cookie(JSESSIONID)
    option httpchk HEAD /web1 HTTP/1.0
    http-check expect ! rstatus ^5
    server monintdevweb 10.333.33.33:443 check cookie check ssl verify none #web1
    server monintdevweb2 10.222.22.122:443 check cookie check ssl verify none #web2
    server localmaint 10.100.00.105:9042 backup #maint
    option log-health-checks
    option redispatch
    timeout connect 1s
    timeout queue 5s
    timeout server 3600s
It seems it always sends all users to web1, i.e. it's not balancing evenly with the leastconn algorithm specified. I tried a stick table keyed on the source IP and that does what I want, i.e. it persists sessions, but each new session gets balanced between servers. Is it not possible to have that using cookies?
Another problem I noticed with cookies: if I bring down the services on web1, all users get redirected to web2 and then redirected back to web1 when web1 is restored!! In what real-life scenario would this behavior be useful?
After scouring the internet for a whole day and going through every link that talks about load balancing and sticky sessions, I found the answer immediately after I posted the question. I need to stick on the application's session ID, which lets the load balancer tell user sessions apart so it can keep balancing new sessions across servers. I am not using IIS/ASP.NET, so that's why it didn't hit me earlier. But here is the config that works:
Change these lines:
cookie serverid insert indirect nocache
stick-table type string len 36 size 1m expire 8h
stick on cookie(JSESSIONID)
to:
stick-table type string len 36 size 1m expire 8h
stick on cookie(DWRSESSION)
where DWRSESSION is my application's session cookie.
We've got a strange little problem that we've been experiencing for months now:
The load on our cluster (HTTP, long-lived keepalive connections with a lot of very short (<100 ms) requests) is distributed very unevenly.
All servers are configured the same way, but some connections that push through thousands of requests per second just end up being sent to only one server.
We tried both load balancing strategies but that does not help.
It seems to be strictly keepalive related.
The misbehaving backend has the following settings:
option tcpka
option http-pretend-keepalive
Is option http-server-close meant to cover this issue?
If I understand it correctly, it will close and re-open a lot of connections, which means extra load on the systems? Isn't there a way to keep the connections open but still balance the traffic evenly?
I tried enabling that option, but it kills all of our backends when they are under load.
HAProxy currently only supports keep-alive HTTP connections towards the client, not the server. If you want to be able to inspect (and balance) each HTTP request, you currently have to use one of the following options:
# enable keepalive to the client
option http-server-close
# or
# disable keepalive completely
option httpclose
The option http-pretend-keepalive doesn't change HAProxy's actual behavior with regard to connection handling. Instead, it is intended as a workaround for servers which don't work well when they see a non-keepalive connection (as generated by HAProxy towards the backend server).
Support for keep-alive towards the backend server is scheduled for the final HAProxy 1.5 release, but the actual scope of that might still vary, and the final release date is somewhere in the future...
Just FYI, it's present in the latest release 1.5-dev20 (but take the fixes with it, as it shipped with a few regressions).
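For reference, a minimal backend sketch (hypothetical names and addresses) using http-server-close, so client connections stay keep-alive while each request is dispatched to a server independently:
backend bk_app
    mode http
    balance roundrobin
    option http-server-close    # keep the client side keep-alive, close the server side after each response
    option tcpka                # TCP keepalive probes, as in the original config
    server app1 10.0.0.1:8080 check
    server app2 10.0.0.2:8080 check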