Getting 502 errors with SH termination code randomly from HAProxy - haproxy

I am randomly getting HAProxy 502 errors: when I send requests to my backend via HAProxy, some of them come back as 502.
I have configured retry-on 502, but that does not solve the problem.
I have installed HAProxy on an Ubuntu 18.04 system with 4 cores.
My backend is a Flask + WSGI server which is able to serve requests directly, but when I send requests via HAProxy I randomly get 502 errors.
I am sending 5 or more simultaneous requests using the Python requests library.
The following is the termination state I get from the HAProxy logs:
502 209 - - SH--
with the timers 0/0/0/-1/0, which correspond to %TR/%Tw/%Tc/%Tr/%Ta.
The following is the configuration.
global
maxconn 256
defaults
mode http
timeout connect 10s
timeout client 30s
timeout server 30s
timeout check 10s
log global
option httplog
frontend mywebapp
bind *:7801
mode http
option httplog
default_backend webservers
backend webservers
mode http
balance roundrobin
retry-on 502
retries 3
server server1 10.92.89.57:7705 check
server server2 10.92.89.58:7705 check
server server3 10.92.89.59:7705 check
I have tried solutions related to HAProxy 502 errors on Stack Overflow, with the following experiments:
I tried experimenting with the connect, client, server and check timeouts, with maxconn, and with the tune.maxrewrite, tune.http.maxhdr and tune.bufsize parameters.
The retry-on parameter does not retry the request that received the 502, and I still get the error.
I disabled health checks.
I was initially working with a Docker installation and later shifted to a native installation of HAProxy, but the error remains.
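For reference, the SH termination state means the server aborted the connection (S) while HAProxy was still waiting for complete response headers (H), so the 502 is generated by HAProxy itself rather than returned by Flask. As a hedged experiment rather than a confirmed fix, the http-server-close suggestion from the related 503 question below can be combined with an empty-response retry condition in the backend (retry-on is HAProxy 2.0+ syntax, and the keyword choice here is an assumption):
backend webservers
    mode http
    balance roundrobin
    # close the server-side connection after each response; client-side keep-alive
    # is preserved (a common workaround for WSGI servers that mishandle persistent
    # connections)
    option http-server-close
    # "empty-response" matches a server that closes before sending anything, which
    # may be a closer match to an SH abort than the 502 status keyword (assumption)
    retry-on empty-response 502
    retries 3
    server server1 10.92.89.57:7705 check
    server server2 10.92.89.58:7705 check
    server server3 10.92.89.59:7705 check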

Related

HAProxy: Random Layer4 timeouts when using httpchk

I'm currently investigating an issue of random "Layer4" timeouts being reported by the health checks used in HAProxy. The actual backend server being checked is proven to be up and responding at the time of these errors, as other traffic to the server is flowing through.
This makes me suspect the issue might be caused by our configuration.
Server health is currently configured as follows:
option httpchk GET /health HTTP/1.1\r\nHost:\ Haproxy\r\nConnection:\ close
http-check expect string OK
server server1 server1.internal.example.com check check-ssl port 443 verify none inter 3s fall 2 backup
Trying to understand the docs, I see the "http-check connect" and "linger" options being mentioned. Would the "connect" directive make any actual difference to how the connection for the health check is set up, compared to our current config?
Any other feedback/observations on the above config are welcome.
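For reference, on recent HAProxy versions (2.2+, if I recall correctly) the check can be written with explicit http-check connect/send/expect steps. A minimal sketch roughly equivalent to the configuration above; the backend name is a placeholder, and the exact behaviour of "default" and "linger" should be double-checked against the docs for your version ("linger" closes the check connection with a normal FIN instead of a reset):
backend be_app
    option httpchk
    # "default" reuses the check settings from the server line (check-ssl, port, verify);
    # "linger" closes the check connection cleanly instead of aborting it
    http-check connect default linger
    http-check send meth GET uri /health ver HTTP/1.1 hdr Host Haproxy hdr Connection close
    http-check expect string OK
    server server1 server1.internal.example.com check check-ssl port 443 verify none inter 3s fall 2 backup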

How to prevent a 502 status code response from HAProxy as a load balancer

I have 3 servers:
server (A) = nginx (port 80) as a reverse proxy to Kestrel (port 5000)
server (B) = nginx (port 80) as a reverse proxy to Kestrel (port 5000)
server (C) = HAProxy as a load balancer for port 80 of servers (A) and (B)
Servers A and B are quite similar.
Everything works very well and HAProxy forwards requests to servers (A) and (B), but if Kestrel on one of the servers (e.g. A) is killed, nginx responds with a 502 Bad Gateway error, and HAProxy does not detect this and still sends requests to that server. This is a mistake! It should redirect requests to server (B) in that case.
global
log 127.0.0.1 local2 info
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
defaults
log global
mode http
option httplog
option dontlognull
option redispatch
retries 3
timeout connect 5s
timeout client 50s
timeout server 50s
stats enable
stats hide-version
stats auth admin:admin
stats refresh 10s
stats uri /stat?stats
frontend http_front
bind *:80
mode http
option httpclose
option forwardfor
reqadd X-Forwarded-Proto:\ http
default_backend http_back
backend http_back
balance roundrobin
mode http
cookie SERVERID insert indirect nocache
server ServerA 192.168.1.2:80 check cookie ServerA
server ServerB 192.168.1.3:80 check cookie ServerB
How can I resolve this issue?
Thanks very much.
You are only checking whether nginx is running, not whether the application is healthy enough to use.
In the backend, add option httpchk.
option httpchk GET /some/path HTTP/1.1\r\nHost:\ example.com
Replace /some/path with a path that will prove whether the application is usable on that server if it returns 200 OK (or any 2xx or 3xx response), and replace example.com with the HTTP Host header value the application expects.
option httpchk
By default, server health checks only consist in trying to establish a TCP connection. When option httpchk is specified, a complete HTTP request is sent once the TCP connection is established, and responses 2xx and 3xx are considered valid, while all other ones indicate a server failure, including the lack of any response.
This will mark the server as unhealthy if the app is not healthy, so HAProxy will stop sending traffic to it. You will want to configure a check interval for each server using the inter, downinter, and fastinter options on each server entry to specify how often HAProxy should perform the check.
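Putting that together with the backend from the question, a sketch could look like the following (the check path, Host value, and intervals are placeholders to adapt):
backend http_back
    balance roundrobin
    mode http
    # active HTTP health check; replace the path and Host as described above
    option httpchk GET /some/path HTTP/1.1\r\nHost:\ example.com
    cookie SERVERID insert indirect nocache
    # inter: normal check interval, fastinter: interval while a server is transitioning
    # up/down, downinter: interval while it is marked DOWN
    server ServerA 192.168.1.2:80 check cookie ServerA inter 2s fastinter 1s downinter 5s fall 2 rise 2
    server ServerB 192.168.1.3:80 check cookie ServerB inter 2s fastinter 1s downinter 5s fall 2 rise 2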

Send requests with self-signed certificates to the backend

The HAProxy documentation (http://cbonte.github.io/haproxy-dconv/1.7/intro.html#3.3.2) lists as a basic feature:
authentication with the backend server: lets the backend server verify that it's really the expected haproxy node that is connecting to it
I have been attempting to do just that and have been unable to. So here's the question:
How do I send a request off to a backend using self-signed certificates for authentication? The front-end request that uses this backend is plain HTTP.
Here's my haproxy.cfg file:
global
maxconn 4096
daemon
log 127.0.0.1 local0
defaults
log global
option dontlognull
retries 3
option redispatch
maxconn 2000
timeout connect 5s
timeout client 15min
timeout server 15min
frontend public
bind *:8213
use_backend api if { path_beg /api/ }
default_backend web
backend web
mode http
server blogweb1 127.0.0.1:4000
backend api
mode tcp
acl clienthello req.ssl_hello_type 1
tcp-request inspect-delay 5s
tcp-request content accept if clienthello
server blogapi 127.0.0.1:8780
I eventually got this to start working. I believe what was throwing me off was the fact that after running haproxy -f <configFile> -st, it didn't actually close the old process like I thought it would, so none of my changes/updates took. I killed (kill -9) the many leftover haproxy processes and reran the command (haproxy -f <configFile>), and now it's working.
Now, this is a hypothesis, albeit one I am very confident in. I will still present my final configuration just in case someone can glean something from it. I used https://www.haproxy.com/doc/aloha/7.0/deployment_guides/tls_layouts.html. That link answers the question I had of "how do you authenticate to the backend using SSL" like the docs say you can.
global
maxconn 4096
daemon
log 127.0.0.1 local0
defaults
log global
option dontlognull
retries 3
option redispatch
maxconn 2000
timeout connect 5s
timeout client 15min
timeout server 15min
frontend public
bind *:443
mode http
use_backend api if { path_beg /api/ }
backend api
mode http
option httplog
server blogapi 127.0.0.1:4430 ssl ca-file <caFile.Pem> crt <clientCert.pem> verify required

HAProxy - Request getting Broadcast to every server

I am hosting two different application versions on the same servers on different ports. In this basic setup I expect the following configuration to send requests in round-robin fashion to the different ports. But what I am observing is that the request gets broadcast to ALL of my server endpoints. Meaning, in the example below, my main request to port 8080 gets forwarded to both www.myappdemo.com:5001 and www.myappdemo.com:5002... although the response sent by the proxy is ALWAYS from www.myappdemo.com:5001.
Can anyone tell me what is wrong here?
global
debug
maxconn 256
defaults
mode http
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend http-in
bind *:8080
default_backend servers
backend servers
balance roundrobin
server svr_50301 www.myappdemo.com:5001 maxconn 32 check
server svr_50302 www.myappdemo.com:5002 maxconn 32 check
I can advise you to enable logging and the stats web interface; after that you can provide us more logs, and you can also check in the web interface whether HAProxy detects your second server (svr_50302) as alive.
References to the HAProxy 1.5 docs:
Web interface - http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-stats%20admin
Good info on how to enable logging - http://webdevwonders.com/haproxy-load-balancer-setup-including-logging-on-debian/
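As a minimal sketch (port, URI and credentials are placeholders), the stats web interface can be enabled with a dedicated listen section:
listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /stats
    # change these credentials before exposing the page
    stats auth admin:changeme
    stats refresh 10s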
Best Regards,
Dani

HAProxy random HTTP 503 errors

We've set up 3 servers:
Server A with Nginx + HAproxy to perform load balancing
backend server B
backend server C
Here is our /etc/haproxy/haproxy.cfg:
global
log /dev/log local0
log 127.0.0.1 local1 notice
maxconn 40096
user haproxy
group haproxy
daemon
defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
contimeout 50000
clitimeout 50000
srvtimeout 50000
stats enable
stats uri /lb?stats
stats realm Haproxy\ Statistics
stats auth admin:admin
listen statslb :5054 # choose different names for the 2 nodes
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /
stats auth admin:admin
listen Server-A 0.0.0.0:80
mode http
balance roundrobin
cookie JSESSIONID prefix
option httpchk HEAD /check.txt HTTP/1.0
server Server-B <server.ip>:80 cookie app1inst2 check inter 1000 rise 2 fall 2
server Server-C <server.ip>:80 cookie app1inst2 check inter 1000 rise 2 fall 3
All three servers have a good amount of RAM and CPU cores to handle requests.
Random HTTP 503 errors are shown when browsing: 503 Service Unavailable - No server is available to handle this request.
And also on the server's console:
Message from syslogd@server-a at Dec 21 18:27:20 ...
haproxy[1650]: proxy Server-A has no server available!
Note that 90% of the time there are no errors. These errors happen randomly.
I had the same issue. After days of pulling my hair out I found the issue.
I had two HAProxy instances running. One was a zombie that somehow never got killed, maybe during an update or a haproxy restart. I noticed this when refreshing the /haproxy stats page: the PID would change between two different numbers, and the page with one of the numbers had absurd connection stats. To confirm I did
netstat -tulpn | grep 80
Or
sudo lsof -i:80
and saw two haproxy processes listening to port 80.
To fix the issue I did a "kill xxxx" where xxxx is the pid with the suspicious statistics.
Adding my answer here for anyone else who encounters this exact same problem but none of the listed solutions above are applicable. Please note that my answer does not apply to the original code listed above.
For anyone else who may have this problem, check your config and see if you might have mistakenly put the same "bind" line in multiple sections of your config. HAProxy does not check this during startup, and I plan to submit this as a recommended validation check to the developers. In my case, I have 3 different sections in the config, and I mistakenly put the same IP binding in two different places. It was about a 50/50 shot whether the correct or the incorrect section would be used. Even when the correct section was used, about half of the requests still got a 503.
It is possible your servers share a common resource that is timing out at certain times, and that your health-check requests are being made at the same time (and thus pulling both backend servers out at the same time).
You can try using the HAProxy option spread-checks to randomize health checks.
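spread-checks takes a percentage (0-50) and is set in the global section; the value below is only an illustration:
global
    # spread each health check by up to +/- 5% of its interval so that
    # checks to the two backends do not fire at the exact same moment
    spread-checks 5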
I had the same issue, due to 2 HAProxy services running on the Linux box, but with different names/PIDs/resources. Until I stopped the unwanted one, the required instance threw 503 errors randomly, say 1 in 5 times.
I was trying to use a single Linux box for multiple URL routing, but it looks like a limitation in HAProxy or in the HAProxy config file I have defined.
Hard to say without more details, but is it possible you are exceeding the configured maxconn for each backend? The Stats UI shows these stats on both the frontend and on individual backends.
I resolved my intermittent 503s with HAProxy by adding option http-server-close to the backend. It looks like uWSGI (which is upstream) does not handle keep-alive well. I'm not sure what's really behind the problem, but after adding this option I haven't seen a single 503 since.
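As a sketch against the listen section from the question (only the added option matters; everything else stays as it was):
listen Server-A 0.0.0.0:80
    mode http
    balance roundrobin
    # close the server-side connection after each response; client-side
    # keep-alive still works
    option http-server-close
    # ... rest of the section unchanged ...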
Don't use the same "bind" line in multiple sections of your haproxy.cfg.
For example, this would be wrong:
frontend stats
bind *:443 ssl crt /etc/ssl/certs/your.pem
frontend Main
bind *:443 ssl crt /etc/ssl/certs/your.pem
Fix it like this:
frontend stats
bind *:8443 ssl crt /etc/ssl/certs/your.pem
frontend Main
bind *:443 ssl crt /etc/ssl/certs/your.pem