HAproxy not routing from virtual IP - haproxy

I am currently trying to configure HAProxy to route between two servers using a virtual IP.
For testing I created two instances, 172.16.4.130 and 172.16.4.131. I am then creating a virtual IP address of 172.16.4.99, using keepalived which will be bridging the two servers. Both of these servers are running apache2, which is hosting a simple index.html landing page for testing. All of the above is running.
When I go to 172.16.4.99, the page does not load, nor am I redirected to either one of the index.html pages. I can however, ping this IP address. I feel like this is a simple configuration issue, and since I am not very experienced with HAproxy, I would like some assistance. Below are my haproxy.cfg files, as well as keepalived.
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
#log loghost local0 info
maxconn 4096
#debug
#quiet
user haproxy
group haproxy
defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
contimeout 5000
clitimeout 50000
srvtimeout 50000
listen webfarm 172.16.4.99:80
mode http
stats enable
stats auth user:password
balance roundrobin
cookie JSESSIONID prefix
option httpclose
option forwardfor
option httpchk HEAD /check.txt HTTP/1.0
server webA 172.16.4.130:8080 cookie A check
server webB 172.16.4.131:8080 cookie B check
keepalived.conf on 172.16.4.130
vrrp_script chk_haproxy { # Requires keepalived-1.1.13
script "killall -0 haproxy" # cheaper than pidof
interval 2 # check every 2 seconds
weight 2 # add 2 points of prio if OK
}
vrrp_instance VI_1 {
interface eth0
state MASTER
virtual_router_id 51
priority 101 # 101 on master, 100 on backup
virtual_ipaddress {
172.16.4.99
}
track_script {
chk_haproxy
}
}
keepalived.conf on 172.16.4.131:
vrrp_script chk_haproxy { # Requires keepalived-1.1.13
script "killall -0 haproxy" # cheaper than pidof
interval 2 # check every 2 seconds
weight 2 # add 2 points of prio if OK
}
vrrp_instance VI_1 {
interface eth0
state MASTER
virtual_router_id 51
priority 100 # 101 on master, 100 on backup
virtual_ipaddress {
172.16.4.99
}
track_script {
chk_haproxy
}
}

I have made similar structure to balancing transactions for the MYSQL. I can reach the MYSQL server behind virtual IP. Maybe my config helps you.
https://serverfault.com/questions/857241/haproxy-dont-balancing-requests-between-nodes-of-galera-cluster
It would be greate if it helps you.

Related

haproxy direct syslog data flow with log-forward

I’ve been working on creating a new syslog setup and have run into an issue, that i cannot find a solution for, so i thought maybe someone here could help me out.
I have a setup with 2 syslog servers and 2 haproxy nodes(in HA with keepalived). i have 2 endpoints on configured on the haproxy nodes: “endpoint_X” and “endpoint_Y” for different types of logs. I would like to control the flow of syslog messages, so that when syslog is send to “endpoint_X”:514 its send to syslog01 and when “endpoint_Y”:514 its send to syslog02. this is normally done with the use of ACL’s for normal frontends. But for syslog I use HAproxy’s “log-forward” function, where ACL’s is not supported for.
Below is an example of my config:
ring syslog01
description " "
format rfc3164
maxlen 1200
size 357913941
server syslog01 XXXXX_01:514 source YYYYY check
timeout client 90s
timeout connect 10s
timeout server 90s
timeout check 10s
ring syslog02
description " "
format rfc3164
maxlen 1200
size 357913941
server syslog02 XXXXX_02:514 source YYYYYY check
timeout client 90s
timeout connect 10s
timeout server 90s
timeout check 10s
log-forward syslog
bind 0.0.0.0:514
bind [::]:514
dgram-bind 0.0.0.0:514
dgram-bind [::]:514
log ring#syslog01 local0
log ring#syslog02 local0
does anyone have an idea if there is something i can do to get around this issue , so i can control the data flow in log-forward, other than using differen ports? I use haproxy version 2.6
i have tried some like the following, but as stated ACL does not work with log-forward:
acl acl_endpoint_X hdr(host) -i endpoint_X
acl acl_endpoint_X hdr(host) -i endpoint_Y
log ring#syslog01 local0 if endpoint_X hdr(host)
log ring#syslog02 local0 if endpoint_Y hdr(host)

Haproxy agent-check DRAIN(agent) status still accepts new connections

I am trying to perform a load balancing for my Node.js application which uses websockets. I need haproxy to stop load balance new connections on a server, which has reached its maximum number of connections, keeping the existing ones intact in the same time.
I do this by performing an agent-check for each of my servers. If a server cannot accept new connections it responds with "drain" to an agent-check. If a server is able to respond to a new connection, it responds with "ready" to an agent-check.
Here is my haproxy.cfg configuration file:
global
daemon
maxconn 240000
log /dev/log local0 debug
log /dev/log local1 notice
tune.ssl.default-dh-param 2048
defaults
mode http
log global
option httplog
option dontlognull
option dontlog-normal
option http-server-close
option redispatch
timeout connect 20000ms
timeout http-request 1m
timeout client 2100000ms
timeout server 2100000ms
timeout queue 30s
timeout check 5s
timeout http-keep-alive 180s
timeout tunnel 3600s
timeout tarpit 60s
frontend stats
bind *:8084
stats enable
stats uri /stats
stats refresh 10s
stats admin if TRUE
frontend test
mode http
bind *:5000
default_backend ws
backend ws
mode http
fullconn 100000
balance roundrobin
cookie SERVERID insert indirect nocache
server 1 backend1:9999 check agent-check agent-port 8080 cookie 1 inter 500 fall 1 rise 2
And here is how I respond to haproxy agent-check in my Node.js app:
const healthCheckServer = net.createServer((c) => {
let data = '';
if (currentConn < MAX_CONN) {
data += 'ready';
} else {
data += 'drain';
}
c.write(data + '\r\n');
c.destroy();
});
healthCheckServer.listen(8080, '0.0.0.0');
When number of connections to my application is reaching its maximum, haproxy correctly changes server status to DRAIN (agent) (I can observe this in haproxy web dashboard). The problem is, that new connections are still accepted by the application.
I am new to haproxy, so can someone point me to where I am wrong?
Found out that when server is drained by agent (status is set to DRAIN (agent)) and if server is the only one existing in the backend it will still accept new connections.
When there are multiple servers present and each server is drained the behavior is just like expected: haproxy returns HTTP 503.
UPDATE:
Turned out that I was looking in the wrong direction all the time.
First I had to mark my backend which processes WebSocket connections as a non-http (remove mode http line). My guess is haproxy incorrectly counts current sessions for http backend when using WebSockets. Removing mode http solved my problem.
Second, returning maxconn:<conn> in agent-check looks like much simple and more idiomatic way to limit the number of concurrent connections.
Sources:
https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#5.2-agent-check
https://www.haproxy.com/blog/websockets-load-balancing-with-haproxy/
I hope this will help someone.

HAProxy environment refusing to connect

I have an installation with 2 webservices behind a load balancer with HAProxy. While on service run by 3 servers responds quite fine, the other service with just one server doesn't.
So basically here's what should happen:
loadbalancer --> rancherPlatformAdministration if certain url is used
loadbalancer --> rancherServices for all other requests
Here's my haproxy.cfg:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
frontend http-in
bind *:80
# Define hosts
acl host_rancherAdmin hdr(host) -i admin.mydomain.tech
use_backend rancherPlatformAdministration if host_rancherAdmin
default_backend rancherServices
backend rancherServices
balance roundrobin
server rancherserver91 192.168.20.91:8080 check
server rancherserver92 192.168.20.92:8080 check
server rancherserver93 192.168.20.93:8080 check
backend rancherPlatformAdministration
server rancherapi01 192.168.20.20:8081 check
wget --server-response foo.mydomain.tech answers with a 401 which is respected behaviour as I am not providing a username nor a password. I can also open up foo.mydomain.tech with my browser an log in. So this part works as I said before.
wget --server-response 192.168.20.20:8081 (yes, this Tomcat really is running under 8081) locally from the loadbalancer responds with 200 and thus works just fine, while trying wget --server-response admin.mydomain.tech results in the following:
--2018-06-10 20:51:56-- http://admin.mydomain.tech/
Aufl"osen des Hostnamens admin.mydomain.tech (admin.mydomain.tech)... <PUBLIC IP>
Verbindungsaufbau zu admin.mydomain.tech (admin.mydomain.tech)|<PUBLIC IP>|:80 ... verbunden.
HTTP-Anforderung gesendet, auf Antwort wird gewartet ...
HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Connection: close
Content-Type: text/html
2018-06-10 20:51:56 FEHLER 503: Service Unavailable.
I am pretty sure I am missing something here; I am aware of the differences in forwarding the request as a layer 4 or a layer 7 request – which seems to work just fine. I am providing mode http so I am on layer7...
Any hints on what's happening here or on how I can debug this?
Turns out that in my case the selinux was the showstopper – after putting it to permissive mode by setenforce 0, it just worked...
Since this change is not restart-persistent, I had to follow the instructions found here: https://www.tecmint.com/disable-selinux-temporarily-permanently-in-centos-rhel-fedora/

Why does HAProxy show that a server's check URL is a 404 when running curl on this URL is successful?

I'm setting up HAProxy to load-balance a resource between 3 back-ends. Here is the HAProxy config : (In the following snippets I replaced the actual domain name by example.net)
global
log 127.0.0.1 local2
log-send-hostname
maxconn 2000
pidfile /var/run/haproxy.pid
stats socket /var/run/haproxy.sock mode 600 level admin
stats timeout 30s
daemon
# SSL ciphers
...
defaults
mode http
option forwardfor
option contstats
option http-server-close
option log-health-checks
option redispatch
timeout connect 5000
timeout client 10000
timeout server 10000
...
frontend front
bind *:443 ssl crt /usr/local/etc/haproxy/front.pem
reqadd X-Forwarded-Proto:\ https if { ssl_fc }
stats uri /haproxy?stats
option httpclose
option forwardfor
default_backend back
balance source
backend back
balance roundrobin
option httpchk GET /healthcheck HTTP/1.0
server server1 xxx.xxx.xxx.xxx:80 check inter 5s fall 2 rise 1
server server2 yyy.yyy.yyy.yyy:8003 check backup
server mysite example.net:80 check backup
The issue is the following: even though the first 2 servers respond correctly, the domain-based one always shows as a 404:
What is counter-intuitive to me is that if I use curl to access this same healthcheck, I get an HTTP 200 (like I would expect to see in the HAProxy stats) :
curl -I http://example.net/healthcheck
HTTP/1.1 200 OK
When I ping my site, I get:
# ping example.net
PING example.net (217.160.0.195) 56(84) bytes of data.
64 bytes from 217-160-0-195.elastic-ssl.ui-r.com (217.160.0.195): icmp_seq=1 ttl=50 time=45.7 ms
Is it because the IP of my domain is shared with other domains (1&1 shared hosting) that HAProxy can't access it? Why is that and how to make HAProxy reach it correctly?

Haproxy 503 Service Unavailable . No server is available to handle this request

How does haproxy deal with static file , like .css, .js, .jpeg ? When I use my configure file , my brower says :
503 Service Unavailable
No server is available to handle this request.
This my config :
global
daemon
group root
maxconn 4000
pidfile /var/run/haproxy.pid
user root
defaults
log global
option redispatch
maxconn 65535
contimeout 5000
clitimeout 50000
srvtimeout 50000
retries 3
log 127.0.0.1 local3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
listen dashboard_cluster :8888
mode http
stats refresh 5s
balance roundrobin
option httpclose
option tcplog
#stats realm Haproxy \ statistic
acl url_static path_beg -i /static
acl url_static path_end -i .css .jpg .jpeg .gif .png .js
use_backend static_server if url_static
backend static_server
mode http
balance roundrobin
option httpclose
option tcplog
stats realm Haproxy \ statistic
server controller1 10.0.3.139:80 cookie controller1 check inter 2000 rise 2 fall 5
server controller2 10.0.3.113:80 cookie controller2 check inter 2000 rise 2 fall 5
Does my file wrong ? What should I do to solve this problem ? ths !
What I think is the cause:
There was no default_backend defined. 503 will be sent by HAProxy---this will appear as NOSRV in the logs.
Another Possible Cause
Based on one of my experiences, the HTTP 503 error I receive was due to my 2 bindings I have for the same IP and port x.x.x.x:80.
frontend test_fe
bind x.x.x.x:80
bind x.x.x.x:443 ssl blah
# more config here
frontend conflicting_fe
bind x.x.x.x:80
# more config here
Haproxy configuration check does not warn you about it and netstat doesn't show you 2 LISTEN entries, that's why it took a while to realize what's going on.
This can also happen if you have 2 haproxy services running. Please check the running processes and terminate the older one.
Try making the timers bigger and check that the server is reachable.
From the HAproxy docs:
It can happen from many reasons:
The status code is always 3-digit. The first digit indicates a general status :
- 1xx = informational message to be skipped (eg: 100, 101)
- 2xx = OK, content is following (eg: 200, 206)
- 3xx = OK, no content following (eg: 302, 304)
- 4xx = error caused by the client (eg: 401, 403, 404)
- 5xx = error caused by the server (eg: 500, 502, 503)
503 when no server was available to handle the request, or in response to
monitoring requests which match the "monitor fail" condition
When a server's maxconn is reached, connections are left pending in a queue
which may be server-specific or global to the backend. In order not to wait
indefinitely, a timeout is applied to requests pending in the queue. If the
timeout is reached, it is considered that the request will almost never be
served, so it is dropped and a 503 error is returned to the client.
if you see SC in the logs:
SC The server or an equipment between it and haproxy explicitly refused
the TCP connection (the proxy received a TCP RST or an ICMP message
in return). Under some circumstances, it can also be the network
stack telling the proxy that the server is unreachable (eg: no route,
or no ARP response on local network). When this happens in HTTP mode,
the status code is likely a 502 or 503 here.
Check ACLs, check timeouts... and check the logs, that's the most important...