HAProxy SSL termination: Layer4 connection problem, info: "Connection refused"

I was trying to implement SSL termination with HAProxy.
This is what my haproxy.cfg looks like:
frontend Local_Server
    bind *:443 ssl crt /home/vagrant/ingress-certificate/k8s.pem
    mode tcp
    reqadd X-Forwarded-Proto:\ https
    default_backend k8s_server

backend k8s_server
    mode tcp
    balance roundrobin
    redirect scheme https if !{ ssl_fc }
    server web1 100.0.0.2:8080 check
I have generated a self-signed certificate, which is k8s.pem.
My normal URL (without HTTPS) is working perfectly fine, i.e. http://100.0.0.2/hello
But when I try to access the same URL with HTTPS, i.e. https://100.0.0.2/hello, I get a 404, and when I check my haproxy logs I can see the following message:
Jul 21 18:10:19 node1 haproxy[10813]: Server k8s_server/web1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Any suggestions for what I can incorporate into my haproxy.cfg?
PS - The microservice I am trying to access is deployed in a Kubernetes cluster, with the Service exposed as ClusterIP.
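For reference, a setup along these lines would normally terminate TLS in HTTP mode, and the backend has to be a port HAProxy can actually reach; a ClusterIP Service is only routable inside the cluster, which would explain the "Connection refused" on 8080. A rough sketch (the port-80 frontend and the 30080 NodePort are hypothetical, shown only to illustrate the shape):

frontend http_plain
    bind *:80
    mode http
    redirect scheme https code 301 if !{ ssl_fc }

frontend Local_Server
    bind *:443 ssl crt /home/vagrant/ingress-certificate/k8s.pem
    mode http
    reqadd X-Forwarded-Proto:\ https    # reqadd and redirect only work in HTTP mode, not mode tcp
    default_backend k8s_server

backend k8s_server
    mode http
    balance roundrobin
    server web1 100.0.0.2:30080 check   # hypothetical NodePort; expose the Service so this port actually listens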

Related

HAProxy Backend Layer7 Invalid Response

I am trying to load balance two servers using HAProxy v1.8, but in my case the backends are domain names instead of IP addresses.
My HAProxy config looks like this:
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    pidfile /var/run/rh-haproxy18-haproxy.pid
    user haproxy
    group haproxy
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    spread-checks 21

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    # https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 10000
    balance roundrobin

frontend https-443
    bind *:443
    mode http
    option httplog
    acl ACL_global.domain.com hdr(host) -i global.domain.com
    use_backend www-443-app if ACL_global.domain.com

backend www-443-app
    balance roundrobin
    mode http
    option httpchk GET /health
    option forwardfor
    http-check expect status 200
    server backendnode1 app1.domain.com:443 check
    server backendnode2 app2.domain.com:443 check

frontend health-443
    bind *:8443
    acl backend_dead nbsrv(www-443-app) lt 1
    monitor-uri /haproxy_status
    monitor fail if backend_dead

listen stats                        # Define a listen section called "stats"
    bind :9000                      # Listen on localhost:9000
    mode http
    stats enable                    # Enable stats page
    stats hide-version              # Hide HAProxy version
    stats realm Haproxy\ Statistics # Title text for popup window
    stats uri /haproxy_stats        # Stats URI
    stats auth haproxy:passwd       # Authentication credentials
However, the health check is not passing. When I checked the stat page it says: Layer7 invalid response.
I checked if I can connect to the backend domains from my HAProxy server and I am successfully able to do so.
curl -X GET -I https://app1.domain.com/health
HTTP/2 200
cache-control: no-cache, private, max-age=0
content-type: application/json
expires: Thu, 01 Jan 1970 00:00:00 UTC
pragma: no-cache
x-accel-expires: 0
x-content-type-options: nosniff
x-xss-protection: 1; mode=block
date: Wed, 28 Jul 2021 12:05:09 GMT
content-length: 18
x-envoy-upstream-service-time: 0
endpoint: health
version: 1.0.0
server: istio-envoy
Is there something that I am missing in my configuration or something that I need to change to make this work?
You're missing the ssl keyword on your server lines. You may also want to set sni:
backend foo
    default-server ssl check verify none
    server backendnode1 app1.domain.com:443 sni str('app1.domain.com')
    server backendnode2 app2.domain.com:443 sni str('app2.domain.com')
You should also decide whether you want to verify the SSL certificates of your backend servers. Can you trust the connection? Is it your network? HAProxy encourages you to verify, but that requires supplying a CA certificate to verify against. You can also add verifyhost and check-sni settings if you verify the certificate:
backend foo
    default-server ssl check verify required
    server backendnode1 app1.domain.com:443 sni str('app1.domain.com') check-sni 'app1.domain.com' verifyhost 'app1.domain.com' ca-file /path/to/CA1.pem
    server backendnode2 app2.domain.com:443 sni str('app2.domain.com') check-sni 'app2.domain.com' verifyhost 'app2.domain.com' ca-file /path/to/CA2.pem
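Applied to the backend from the question (keeping its existing health check; verification is left off here only to keep the sketch short, and verify required with a ca-file as above is the safer choice), it would look roughly like this:

backend www-443-app
    balance roundrobin
    mode http
    option httpchk GET /health
    option forwardfor
    http-check expect status 200
    default-server ssl verify none check
    server backendnode1 app1.domain.com:443 sni str('app1.domain.com') check-sni 'app1.domain.com'
    server backendnode2 app2.domain.com:443 sni str('app2.domain.com') check-sni 'app2.domain.com'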

Haproxy backend stays down and is never brought up again

It works fine up until the moment the remote server becomes unavailable for some time, in which case the server is marked down in the logs and is never brought up again. The config is quite simple:
defaults
    retries 3
    timeout connect 5000
    timeout client 3600000
    timeout server 3600000
    log global
    option log-health-checks

listen amazon_ses
    bind 127.0.0.2:1234
    mode tcp
    no option http-server-close
    default_backend bk_amazon_ses

backend bk_amazon_ses
    mode tcp
    no option http-server-close
    server amazon email-smtp.us-west-2.amazonaws.com:587 check inter 30s fall 120 rise 1
Here are the logs when the problem occurs:
Jul 3 06:45:35 jupiter haproxy[40331]: Health check for server bk_amazon_ses/amazon failed, reason: Layer4 timeout, check duration: 30004ms, status: 119/120 UP.
Jul 3 06:46:35 jupiter haproxy[40331]: Health check for server bk_amazon_ses/amazon failed, reason: Layer4 timeout, check duration: 30003ms, status: 118/120 UP.
Jul 3 06:47:35 jupiter haproxy[40331]: Health check for server bk_amazon_ses/amazon failed, reason: Layer4 timeout, check duration: 30002ms, status: 117/120 UP.
...
Jul 3 08:44:36 jupiter haproxy[40331]: Health check for server bk_amazon_ses/amazon failed, reason: Layer4 timeout, check duration: 30000ms, status: 0/1 DOWN.
Jul 3 08:44:36 jupiter haproxy[40331]: Server bk_amazon_ses/amazon is DOWN. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 3 08:44:36 jupiter haproxy[40331]: backend bk_amazon_ses has no server available!
And that's it. Nothing but operator intervention brings the server back to life. I also tried removing the check part and what follows it - still the same thing happens. Can't haproxy be configured to try indefinitely and not mark a server DOWN? Thanks.
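One thing worth checking beyond the check counters themselves: email-smtp.us-west-2.amazonaws.com resolves to a pool of changing addresses, and by default HAProxy resolves the name only once at startup, so a stale address can look exactly like a permanently dead server. A rough sketch of runtime DNS resolution (HAProxy 1.6 or later; the nameserver address and the relaxed fall count are only examples, not a confirmed fix for this case):

resolvers mydns
    nameserver dns1 8.8.8.8:53   # example resolver; point this at your own DNS
    resolve_retries 3
    timeout retry 1s
    hold valid 10s

backend bk_amazon_ses
    mode tcp
    no option http-server-close
    server amazon email-smtp.us-west-2.amazonaws.com:587 resolvers mydns resolve-prefer ipv4 check inter 30s fall 5 rise 2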

Getting floating IP right for HAProxy/Keepalived

This is all new to me, so please bear with me and sorry if I don't phrase my problem correctly. I'd like to set up a floating IP to use with HAProxy and Keepalived.
I have the following ips:
LB1: 192.168.1.27 #first load balancer
LB2: 192.168.1.32 #second load balancer
www1: 192.168.1.28 #first web server
www2: 192.168.1.29 #second web server
floating ip: 192.168.1.200
When I turn off load balancer 1 (LB1), traffic to the floating IP is not getting redirected to LB2. I don't know what to check so I can make this whole setup run successfully. I suspect that LB2 is the problem (apart from me:)) as the floating IP doesn't do its job when LB1 is down.
I also followed the checking process in section '8. Verify proper failover' of the link below, but to no avail:
How to create Floating IP and use it to configure HAProxy.
Individually, the 5 IPs work fine, but only when HAProxy is on. That's all I can say.
Could you please help me? Thanks.
EDIT
My config:
HAProxy config on LB1:
global
    log 127.0.0.1 local2
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy

defaults
    log global
    mode http
    option httplog
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    retries 3
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen stats
    bind :1936
    stats enable
    stats hide-version
    stats realm Loadbalanced\ Servers
    stats uri /haproxy?stats
    stats auth haproxy:haproxy

frontend http-in
    bind *:80
    default_backend webservers

backend webservers
    balance roundrobin
    option httpchk GET /haproxy_check
    stick-table type ip size 20k peers mypeer
    server www1 192.168.1.28:80 cookie LSW_WWW1 check inter 500 fall 3 rise 2
    server www2 192.168.1.29:80 cookie LSW_WWW2 check inter 500 fall 3 rise 2

peers mypeer
    peer lb1hostname 192.168.1.27:1024
    peer lb2hostname 192.168.1.32:1024 backup
HAProxy config on LB2:
global
    log 127.0.0.1 local2
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy

defaults
    log global
    mode http
    option httplog
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    retries 3
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen stats
    bind :1936
    stats enable
    stats hide-version
    stats realm Loadbalanced\ Servers
    stats uri /haproxy?stats
    stats auth haproxy:haproxy

frontend http-in
    bind *:80
    default_backend webservers

backend webservers
    balance roundrobin
    option httpchk GET /haproxy_check
    server www1 192.168.1.28:80 cookie LSW_WWW1 check inter 500 fall 3 rise 2
    server www2 192.168.1.29:80 cookie LSW_WWW2 check inter 500 fall 3 rise 2
Keepalived config on LB1:
vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 200
    virtual_ipaddress {
        192.168.1.200
    }
    track_script {
        chk_haproxy
    }
}
Keepalived config on LB2:
vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
}

vrrp_instance VI_1 {
    interface eth0
    state BACKUP
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.200
    }
    track_script {
        chk_haproxy
    }
}
On LB1 and LB2 (for Keepalived), I edited /etc/sysctl.conf and added:
net.ipv4.ip_nonlocal_bind=1
then applied it with:
sysctl -p
On LB1:
sudo service keepalived stop
then I checked if LB2 takes over. It does! But when only HAProxy on LB1 is down (with Keepalived still running), traffic still doesn't get redirected to LB2.
It could indicate a problem with the ARP update when the IP fails over; check the ARP table on your devices to see whether the floating IP points to LB2's MAC address.
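If the ARP side looks fine, another common pattern worth trying (the values here are only examples, not a confirmed fix): give the check script a weight so that losing HAProxy drops the MASTER's priority below the BACKUP's, and have the new MASTER announce the address promptly with gratuitous ARP:

vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
    weight -101              # example: 200 - 101 falls below LB2's priority of 100 when haproxy dies
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 200
    garp_master_delay 1      # send gratuitous ARP shortly after becoming MASTER
    virtual_ipaddress {
        192.168.1.200
    }
    track_script {
        chk_haproxy
    }
}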

Haproxy + percona 5.7 xtradb error

I configured HAProxy following the DigitalOcean manual, using roundrobin for Percona 5.7 databases, but on the HAProxy server, when I try to connect to the database, I get an error.
On the haproxy server:
mysql -h 127.0.0.1 -u haproxy_root -p -e "SHOW DATABASES"
And I get this error:
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 2
Haproxy config:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    #log loghost local0 info
    maxconn 1024
    #chroot /usr/share/haproxy
    user haproxy
    group haproxy
    daemon
    #debug
    #quiet

defaults
    log global
    mode http
    option tcplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 1024
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

listen galera_cluster
    bind 127.0.0.1:3306
    mode tcp
    option httpchk
    balance leastconn
    server galera-node01 192.168.0.101:3306 check port 9200
    server galera-node02 192.168.0.102:3306 check port 9200
    server galera-node03 192.168.0.103:3306 check port 9200
If I connect directly to the database at 192.168.0.101, everything works and I get a response from the database, but when I make the request through haproxy at 127.0.0.1 I get this error:
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 2
My xinetd config on mysql:
# default: on
# description: mysqlchk
service mysqlchk
{
    # this is a config for xinetd, place it in /etc/xinetd.d/
    disable         = no
    flags           = REUSE
    socket_type     = stream
    type            = UNLISTED
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    server_args     = percona percona
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    #
    # Passing arguments to clustercheck
    # <user> <pass> <available_when_donor=0|1> <log_file> <available_when_readonly=0|1> <defaults_extra_file>"
    # Recommended: server_args = user pass 1 /var/log/log-file 0 /etc/my.cnf.local"
    # Compatibility: server_args = user pass 1 /var/log/log-file 1 /etc/my.cnf.local"
    # 55-to-56 upgrade: server_args = user pass 1 /var/log/log-file 0 /etc/my.cnf.extra"
    #
    # recommended to put the IPs that need
    # to connect exclusively (security purposes)
    per_source      = UNLIMITED
}
If I telnet to a PXC node on port 9200, I get:
telnet 192.168.0.101 9200
Trying 192.168.0.101...
Connected to 192.168.0.101.
Escape character is '^]'.
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 57
Percona XtraDB Cluster Node is not synced or non-PRIM.
Connection closed by foreign host.
The most common reason for this is that all nodes in the cluster are down. If you have enabled your HAProxy stats, check that all nodes are up. If not, your mysqlchk service is likely not able to connect to the cluster nodes properly.
Check that your mysqlchk xinetd service has the proper server_args configured. Once these are set, restart xinetd and telnet to port 9200 to validate:
[root@node02 log]# cat /etc/xinetd.d/mysqlchk
# default: on
# description: mysqlchk
service mysqlchk
{
    # this is a config for xinetd, place it in /etc/xinetd.d/
    ...
    server          = /usr/bin/clustercheck
    server_args     = percona percona
    ...
    # Passing arguments to clustercheck
    # <user> <pass> <available_when_donor=0|1> <log_file> <available_when_readonly=0|1> <defaults_extra_file>"
    # Recommended: server_args = user pass 1 /var/log/log-file 0 /etc/my.cnf.local"
}
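For comparison, when the node is synced and clustercheck can log in, the same telnet test should come back roughly like this (exact headers and byte count depend on the clustercheck version):

telnet 192.168.0.101 9200
Trying 192.168.0.101...
Connected to 192.168.0.101.
Escape character is '^]'.
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
Connection closed by foreign host.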
UPDATE:
A more thorough procedure, including making sure mysqlchk is configured, can be found here: https://www.percona.com/doc/percona-xtradb-cluster/5.6/howtos/virt_sandbox.html

haproxy as loadbalancer for kube-apiserver

I've set up a Kubernetes cluster with three masters. The kube-apiserver should be stateless. To properly access them from the worker nodes, I've configured an HAProxy instance that exposes the apiserver port (8080).
frontend http_front_8080
    bind *:8080
    stats uri /haproxy?stats
    default_backend http_back_8080

backend http_back_8080
    balance roundrobin
    server m01 192.168.33.21:8080 check
    server m02 192.168.33.22:8080 check
    server m03 192.168.33.23:8080 check
But when I point the nodes at the load balancer's IP as the apiserver address, I receive these errors:
Apr 20 12:35:07 n01 kubelet[3383]: E0420 12:35:07.308337 3383 reflector.go:271] pkg/kubelet/kubelet.go:240: Failed to watch *api.Service: too old resource version: 4001 (4041)
Apr 20 12:36:48 n01 kubelet[3383]: E0420 12:36:48.321021 3383 reflector.go:271] pkg/kubelet/kubelet.go:240: Failed to watch *api.Service: too old resource version: 4011 (4041)
Apr 20 12:37:31 n01 kube-proxy[3408]: E0420 12:37:31.381042 3408 reflector.go:271] pkg/proxy/config/api.go:47: Failed to watch *api.Service: too old resource version: 4011 (4041)
Apr 20 12:41:42 n01 kube-proxy[3408]: E0420 12:41:42.409604 3408 reflector.go:271] pkg/proxy/config/api.go:47: Failed to watch *api.Service: too old resource version: 4011 (4041)
If I change the load balancer's IP to one of the master nodes, it works as expected (without the error messages above).
Am I missing something in my haproxy configuration that is vital for this setup?
I had the same issue as you. I assume the watch requires some sort of state on the API server side.
The solution is to change the configuration so that all requests from a client go to the same server, using balance source. I assume you only have multiple API servers so that Kubernetes is highly available (rather than for load balancing).
frontend http_front_8080
    bind *:8080
    stats uri /haproxy?stats
    default_backend http_back_8080

backend http_back_8080
    balance source
    server m01 192.168.33.21:8080 check
    server m02 192.168.33.22:8080 check
    server m03 192.168.33.23:8080 check
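If you also want HAProxy to drop an unhealthy apiserver from rotation rather than rely on source hashing alone, a possible addition is an HTTP health check against the apiserver's /healthz endpoint (assuming it is served on the insecure 8080 port, as it was on these versions):

backend http_back_8080
    balance source
    option httpchk GET /healthz
    http-check expect status 200
    server m01 192.168.33.21:8080 check
    server m02 192.168.33.22:8080 check
    server m03 192.168.33.23:8080 check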