HAProxy doesn't check backend in TCP mode - haproxy

There is a problem with health checking in HAProxy 1.5.
In sample-backend and sample-backend2, HAProxy doesn't check the servers in TCP mode; it always checks in L7 mode even though I specify TCP mode, and the servers are always reported UP.
Here is my config:
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

frontend haproxy_in
    mode http
    bind 172.25.0.33:80
    option forwardfor header X-Real-IP
    acl host_static hdr_beg(host) -i sample.
    use_backend sample-backend if host_static
    acl host_static hdr_beg(host) -i af-ws.
    use_backend sample-backend if host_static
    default_backend haproxy_http

frontend haproxy_in_htpps
    bind 172.25.0.33:443
    mode tcp
    use_backend haproxy_https

backend haproxy_https
    balance roundrobin
    mode tcp
    option httpchk OPTIONS /manager/html
    http-check expect status 401
    option forwardfor
    server web1 172.25.0.35:443 check addr 172.25.0.35 port 8085 inter 5000
    server web2 172.25.0.36:443 check addr 172.25.0.36 port 8085 inter 5000

backend haproxy_http
    balance roundrobin
    mode http
    option httpchk
    option forwardfor
    server web1 172.25.0.35:80 check
    server web2 172.25.0.36:80 check backup

backend sample-backend
    mode http
    balance roundrobin
    # option httpchk get /?action=Ping
    # option forwardfor
    option tcp-check
    # server web4 172.25.0.38:80 check addr 172.25.0.38 port 8888 inter 5000
    # server web3 172.25.0.37:80 check addr 172.25.0.37 port 8888 inter 5000
    server test 10.41.41.240:8888 check addr 10.41.41.240 port 8888 inter 5000
    server test1 172.25.0.37:9999 check addr 172.25.0.37 port 9999 inter 5000

backend sample-backend2
    mode tcp
    balance roundrobin
    # option httpchk get /?action=Ping
    # option forwardfor
    option tcp-check
    # server web4 172.25.0.38:80 check addr 172.25.0.38 port 8888 inter 5000
    # server web3 172.25.0.37:80 check addr 172.25.0.37 port 8888 inter 5000
    server test2 10.41.41.240:8888 check addr 10.41.41.240 port 8888 inter 5000
    server test3 172.25.0.37:9999 check addr 172.25.0.37 port 9999 inter 5000
Where is my mistake? Thanks!

My version was 1.5.4, which has a bug in tcp-check. After updating to the latest version it works.
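For reference, once on a fixed version, an explicit tcp-check sequence makes the layer-4 check unambiguous. A minimal sketch reusing the addresses from the config above (the explicit connect directive is illustrative, not required):
backend sample-backend2
    mode tcp
    balance roundrobin
    option tcp-check
    # explicit L4 connect against the check port; a healthy server shows up as L4OK in the stats page
    tcp-check connect port 8888
    server test2 10.41.41.240:8888 check inter 5000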

Related

Connection to server java.net.SocketException: Socket closed

I'm trying to connect to my Google VPS server, but I constantly get these errors:
java.net.SocketTimeoutException: timeout
java.net.SocketException: Socket closed
I've created a system service on my server which listens on port 8080. I've forwarded the default HTTP traffic to port 8080 and made sure ports 80 & 8080 are open:
iptables -t nat -I PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 8080 -j ACCEPT
I've also saved the iptables rules:
sudo apt-get install iptables-persistent
I've checked whether the service actually listens on the port with sudo netstat -tunlp:
tcp6 0 0 :::8080 :::* LISTEN 5789/java -> it does
This is the Retrofit builder in my app, which tries to connect to the server on the standard HTTP port:
return Retrofit.Builder()
    .baseUrl("http://34.118.22.134/")
    .addConverterFactory(MoshiConverterFactory.create())
    .build()
    .create()
}
When testing the service locally it works as expected, and the service on the server itself works fine.
When I do sudo ss -ltnp, I see that port 80 is not in a listening state; only port 8080 and several others are. I don't want to use ufw to open it because that would disrupt the SSH connection.
Postman can't reach the server either; it throws a 500 Internal Server Error.
I do not manipulate sockets in code in any way.
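One quick way to verify whether the REDIRECT rule is actually in place (and whether it survived a reboot) is to list the nat table and test from outside the VPS. A generic sketch using the ports and address from above:
# show the nat PREROUTING chain with packet counters and rule numbers
sudo iptables -t nat -L PREROUTING -n -v --line-numbers
# test the public port from a machine other than the VPS itself
# (REDIRECT in PREROUTING is not applied to connections originating on the VPS)
curl -v http://34.118.22.134/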

HAProxy SSL (HTTPS) health checks without terminating SSL

I can't figure out a proper way to do the SSL check. I am not using certificates; I just need to check against an HTTPS website's URL (google.com/ for example). I've been trying multiple combinations, without success. Maybe someone has a similar configuration.
Backends using
> check-sni google.com sni ssl_fc_sni
return: reason: Layer7 wrong status, code: 301, info: "Moved Permanently"
Backends using check port 80 check-ssl return: reason: Layer6 invalid response, info: "SSL handshake failure"
All the others just time out. Here's the complete configuration file:
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    ssl-server-verify none

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    # https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend myfront
    bind *:8000
    mode tcp
    tcp-request inspect-delay 5s
    default_backend backend1

listen stats
    bind :444
    stats enable
    stats uri /
    stats hide-version
    stats auth test:test

backend Backends
    balance roundrobin
    option forwardfor
    option httpchk
    http-check send hdr host google.com meth GET uri /
    http-check expect status 200
    #http-check connect
    #http-check send meth GET uri / ver HTTP/1.1 hdr host haproxy.1wt.eu
    #http-check expect status 200-399
    #http-check connect port 443 ssl sni haproxy.1wt.eu
    #http-check send meth GET uri / ver HTTP/1.1 hdr host haproxy.1wt.eu
    #http-check expect status 200-399
    #http-check connect port 443 ssl sni google.com
    #http-check send meth GET uri / ver HTTP/1.1 hdr host google.com
    default-server fall 10 rise 1
    server Node1011 192.168.0.2:1011 check inter 15s check-ssl check port 443
    server Node1012 192.168.0.2:1012 check inter 15s check-ssl check port 443
    server Node1015 192.168.0.2:1015 check inter 15s check port 443
    server Node1017 192.168.0.2:1017 check inter 15s check-ssl check-sni google.com sni ssl_fc_sni
    server Node1018 192.168.0.2:1018 check inter 15s check-ssl check-sni google.com sni ssl_fc_sni
    server Node1019 192.168.0.2:1019 check inter 15s check-sni google.com sni ssl_fc_sni
    server Node1020 192.168.0.2:1020 check inter 15s check port 443 check-ssl
    server Node1021 192.168.0.2:1021 check inter 15s check port 443 check-ssl
    server Node1027 192.168.0.2:1027 check inter 15s check port 80
    server Node1028 192.168.0.2:1028 check inter 15s check port 80
    server Node1029 192.168.0.2:1029 check inter 15s check port 80
    server Node1030 192.168.0.2:1030 check inter 15s check port 80 check-ssl
    server Node1031 192.168.0.2:1031 check inter 15s check port 80 check-ssl
    server Node1033 192.168.0.2:1033 check inter 15s check port 80 check-ssl verify none
    server Node1034 192.168.0.2:1034 check inter 15s check port 80 check-ssl verify none
    server Node1035 192.168.0.2:1035 check inter 15s check-ssl
    server Node1036 192.168.0.2:1036 check inter 15s check-ssl
    server Node1048 192.168.0.2:1048 check inter 15s check-ssl verify none
    server Node1049 192.168.0.2:1049 check inter 15s check-ssl verify none
P.S. I found a website which explains just what I'm trying to do (https://hodari.be/posts/2020_09_04_configure_sni_for_haproxy_backends/), but that doesn't work either; my HAProxy version is 2.2.3.
P.P.S. I am literally trying to check against www.google.com, just to be clear.
Thank you!
That's really not an error. If you do a curl to https://google.com it does do a 301 redirect to https://www.google.com/. I snipped out some protocol details below for brevity, but you get the idea.
Either change your expect to 301, or use www.google.com.
paul:~ $ curl -vv https://google.com
* Rebuilt URL to: https://google.com/
* Trying 172.217.1.206...
-[snip]-
> GET / HTTP/2
> Host: google.com
> User-Agent: curl/7.58.0
> Accept: */*
>
-[snip]-
< HTTP/2 301
< location: https://www.google.com/
< content-type: text/html; charset=UTF-8
< date: Mon, 18 Jan 2021 03:42:04 GMT
< expires: Wed, 17 Feb 2021 03:42:04 GMT
< cache-control: public, max-age=2592000
< server: gws
< content-length: 220
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
<
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
So, if you want to avoid the 301, use the www.google.com value in your config like this:
http-check send hdr host www.google.com meth GET uri /
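For completeness, combining this with the http-check connect directives you already have commented out gives a sketch of what a full layer-7 HTTPS check against www.google.com could look like in 2.2 (backend and server names are the ones from the question; trim to your own servers):
backend Backends
    option httpchk
    # open the check connection over TLS with the right SNI
    http-check connect port 443 ssl sni www.google.com
    http-check send meth GET uri / ver HTTP/1.1 hdr host www.google.com
    http-check expect status 200-399
    default-server fall 10 rise 1
    server Node1017 192.168.0.2:1017 check inter 15s
With the port/ssl/sni parameters carried by http-check connect, the per-server check-ssl and check port flags should no longer be needed.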

How do GCP load balancers route traffic to GKE services?

I'm relatively new (< 1 year) to GCP, and I'm still in the process of mapping the various services onto my existing networking mental model.
One knowledge gap I'm struggling to fill is how HTTP requests are load-balanced to services running in our GKE clusters.
On a test cluster, I created a service in front of pods that serve HTTP:
apiVersion: v1
kind: Service
metadata:
  name: contour
spec:
  ports:
  - port: 80
    name: http
    protocol: TCP
    targetPort: 8080
  - port: 443
    name: https
    protocol: TCP
    targetPort: 8443
  selector:
    app: contour
  type: LoadBalancer
The service is listening on node ports 30472 and 30816:
$ kubectl get svc contour
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
contour LoadBalancer 10.63.241.69 35.x.y.z 80:30472/TCP,443:30816/TCP 41m
A GCP network load balancer is automatically created for me. It has its own public IP at 35.x.y.z, and is listening on ports 80-443.
Curling the load balancer IP works:
$ curl -q -v 35.x.y.z
* TCP_NODELAY set
* Connected to 35.x.y.z (35.x.y.z) port 80 (#0)
> GET / HTTP/1.1
> Host: 35.x.y.z
> User-Agent: curl/7.62.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Mon, 07 Jan 2019 05:33:44 GMT
< server: envoy
< content-length: 0
<
If I ssh into the GKE node, I can see the kube-proxy is listening on the service nodePorts (30472 and 30816) and nothing has a socket listening on ports 80 or 443:
# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:20256 0.0.0.0:* LISTEN 1022/node-problem-d
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 1221/kubelet
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 1369/kube-proxy
tcp 0 0 0.0.0.0:5355 0.0.0.0:* LISTEN 297/systemd-resolve
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 330/sshd
tcp6 0 0 :::30816 :::* LISTEN 1369/kube-proxy
tcp6 0 0 :::4194 :::* LISTEN 1221/kubelet
tcp6 0 0 :::30472 :::* LISTEN 1369/kube-proxy
tcp6 0 0 :::10250 :::* LISTEN 1221/kubelet
tcp6 0 0 :::5355 :::* LISTEN 297/systemd-resolve
tcp6 0 0 :::10255 :::* LISTEN 1221/kubelet
tcp6 0 0 :::10256 :::* LISTEN 1369/kube-proxy
Two questions:
Given nothing on the node is listening on ports 80 or 443, is the load balancer directing traffic to ports 30472 and 30816?
If the load balancer is accepting traffic on 80/443 and forwarding to 30472/30816, where can I see that configuration? Clicking around the load balancer screens I can't see any mention of ports 30472 and 30816.
I think I found the answer to my own question - can anyone confirm I'm on the right track?
The network load balancer redirects the traffic to a node in the cluster without modifying the packet - packets for port 80/443 still have port 80/443 when they reach the node.
There's nothing listening on ports 80/443 on the nodes. However, kube-proxy has installed iptables rules that match packets addressed to the load balancer IP and rewrite the destination to the appropriate pod IP and port.
You can see the iptables config on the node:
$ iptables-save | grep KUBE-SERVICES | grep loadbalancer
-A KUBE-SERVICES -d 35.x.y.z/32 -p tcp -m comment --comment "default/contour:http loadbalancer IP" -m tcp --dport 80 -j KUBE-FW-D53V3CDHSZT2BLQV
-A KUBE-SERVICES -d 35.x.y.z/32 -p tcp -m comment --comment "default/contour:https loadbalancer IP" -m tcp --dport 443 -j KUBE-FW-J3VGAQUVMYYL5VK6
$ iptables-save | grep KUBE-SEP-ZAA234GWNBHH7FD4
:KUBE-SEP-ZAA234GWNBHH7FD4 - [0:0]
-A KUBE-SEP-ZAA234GWNBHH7FD4 -s 10.60.0.30/32 -m comment --comment "default/contour:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZAA234GWNBHH7FD4 -p tcp -m comment --comment "default/contour:http" -m tcp -j DNAT --to-destination 10.60.0.30:8080
$ iptables-save | grep KUBE-SEP-CXQOVJCC5AE7U6UC
:KUBE-SEP-CXQOVJCC5AE7U6UC - [0:0]
-A KUBE-SEP-CXQOVJCC5AE7U6UC -s 10.60.0.30/32 -m comment --comment "default/contour:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CXQOVJCC5AE7U6UC -p tcp -m comment --comment "default/contour:https" -m tcp -j DNAT --to-destination 10.60.0.30:8443
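Between KUBE-FW-* and KUBE-SEP-* there is also a KUBE-SVC-* chain that load-balances across the endpoints (kube-proxy uses the iptables statistic module to pick one at random). If you want to trace that hop as well, something like this shows it (the chain hashes will differ per cluster):
$ iptables-save -t nat | grep -E 'KUBE-(FW|SVC)-' | grep contour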
An interesting implication is that the nodePort is created but doesn't appear to be used. That matches this comment in the kube docs:
Google Compute Engine does not need to allocate a NodePort to make LoadBalancer work
It also explains why GKE creates an automatic firewall rule that allows traffic from 0.0.0.0/0 towards ports 80/443 on the nodes. The load balancer isn't rewriting the packets, so the firewall needs to allow traffic from anywhere to reach iptables on the node, and it's rewritten there.
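That auto-created rule shows up in the project's firewall list; the names are generated by GKE but conventionally start with k8s-, so a filter along these lines finds them (a sketch, adjust the filter as needed):
$ gcloud compute firewall-rules list --filter="name~k8s"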
To understand LoadBalancer services, you first have to grok NodePort services. The way those work is that there is a proxy (usually implemented with iptables or IPVS these days for performance, but that's an implementation detail) on every node in your cluster, and when you create a NodePort service it picks an unused port and sets every one of those proxies to forward packets to your Kubernetes pod. A LoadBalancer service builds on top of that, so on GCP/GKE it creates a GCLB forwarding rule mapping the requested port to a rotation of all those node-level proxies. So the GCLB listens on port 80, which proxies to some random port on a random node, which proxies to the internal port on your pod.
The process is a bit more customizable than that, but that's the basic defaults.
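As for where the 80/443 mapping is visible on the GCP side: the network load balancer backing a LoadBalancer Service is a forwarding rule (carrying the 35.x.y.z address and the 80-443 port range) pointing at a target pool of the cluster's nodes. A quick way to inspect it, assuming gcloud is configured for the project (the resource names are auto-generated from the Service UID):
$ gcloud compute forwarding-rules list
$ gcloud compute target-pools describe <auto-generated-name> --region <region>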

Network connectivity/DNS issues on a GKE 1.10 kubernetes cluster

I'm running into DNS issues on a GKE 1.10 kubernetes cluster. Occasionally pods start without any network connectivity. Restarting the pod tends to fix the issue.
Here's the result of the same few commands inside a container without network, and one with.
BROKEN:
kc exec -it -n iotest app1-b67598997-p9lqk -c userapp sh
/app $ nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve
/app $ cat /etc/resolv.conf
nameserver 10.63.240.10
search iotest.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal google.internal
options ndots:5
/app $ curl -I 10.63.240.10
curl: (7) Failed to connect to 10.63.240.10 port 80: Connection refused
/app $ netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:8001 0.0.0.0:* LISTEN 1/python
tcp 0 0 ::1:50051 :::* LISTEN 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 :::* LISTEN 1/python
WORKING:
kc exec -it -n iotest app1-7d985bfd7b-h5dbr -c userapp sh
/app $ nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve
Name: www.google.com
Address 1: 74.125.206.147 wk-in-f147.1e100.net
Address 2: 74.125.206.105 wk-in-f105.1e100.net
Address 3: 74.125.206.99 wk-in-f99.1e100.net
Address 4: 74.125.206.104 wk-in-f104.1e100.net
Address 5: 74.125.206.106 wk-in-f106.1e100.net
Address 6: 74.125.206.103 wk-in-f103.1e100.net
Address 7: 2a00:1450:400c:c04::68 wk-in-x68.1e100.net
/app $ cat /etc/resolv.conf
nameserver 10.63.240.10
search iotest.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal google.internal
options ndots:5
/app $ curl -I 10.63.240.10
HTTP/1.1 404 Not Found
date: Sun, 29 Jul 2018 15:13:47 GMT
server: envoy
content-length: 0
/app $ netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:15000 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:15001 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:8001 0.0.0.0:* LISTEN 1/python
tcp 0 0 10.60.2.6:56508 10.60.48.22:9091 ESTABLISHED -
tcp 0 0 127.0.0.1:57768 127.0.0.1:50051 ESTABLISHED -
tcp 0 0 10.60.2.6:43334 10.63.255.44:15011 ESTABLISHED -
tcp 0 0 10.60.2.6:15001 10.60.45.26:57160 ESTABLISHED -
tcp 0 0 10.60.2.6:48946 10.60.45.28:9091 ESTABLISHED -
tcp 0 0 127.0.0.1:49804 127.0.0.1:50051 ESTABLISHED -
tcp 0 0 ::1:50051 :::* LISTEN 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 :::* LISTEN 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 ::ffff:127.0.0.1:49804 ESTABLISHED 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 ::ffff:127.0.0.1:57768 ESTABLISHED 1/python
These pods are identical, just one was restarted.
Does anyone have advice about how to analyse and fix this issue?
Some steps to try:
1) Run ifconfig eth0 (or whatever the primary interface is). Is the interface up? Are the TX and RX packet counts increasing?
2) If the interface is up, try tcpdump while you run the nslookup command you posted, and see whether the DNS request packets are actually being sent out (a concrete capture sketch follows this list).
3) See which node the pod is scheduled on when network connectivity is broken. Is it the same node every time? If so, do other pods on that node run into a similar problem?
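For step 2, a concrete capture could look like this; the interface name and the cluster DNS address 10.63.240.10 are taken from the resolv.conf output above, so adjust them to your environment, and run it while repeating the nslookup:
$ sudo tcpdump -ni eth0 port 53
# or narrow it down to the cluster DNS service address
$ sudo tcpdump -ni eth0 host 10.63.240.10 and port 53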
I also faced the same problem, and I simply worked around it for now by switching to the 1.9.x GKE version (after spending many hours trying to debug why my app wasn't working).
Hope this helps!
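If you take the same workaround, pinning the version when (re)creating the cluster looks roughly like this; the cluster name, zone, and exact 1.9.x patch release below are placeholders, and gcloud container get-server-config lists what is actually available:
$ gcloud container get-server-config --zone europe-west1-b
$ gcloud container clusters create my-cluster --zone europe-west1-b --cluster-version 1.9.7-gke.6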

proxy Memcache_Servers has no server available

I get "proxy Memcache_Servers has no server available" when I start haproxy.service:
[root@ha-node1 log]# systemctl restart haproxy.service
Message from syslogd@localhost at Aug 2 10:49:23 ...
haproxy[81665]: proxy Memcache_Servers has no server available!
The configuration in my haproxy.cfg:
listen Memcache_Servers
    bind 45.117.40.168:11211
    balance roundrobin
    mode tcp
    option tcpka
    server ha-node1 ha-node1:11211 check inter 10s fastinter 2s downinter 2s rise 30 fall 3
    server ha-node2 ha-node2:11211 check inter 10s fastinter 2s downinter 2s rise 30 fall 3
    server ha-node3 ha-node3:11211 check inter 10s fastinter 2s downinter 2s rise 30 fall 3
In the end, I found that the IPs in my /etc/hosts were as below:
[root@ha-node1 sysconfig]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.8.101 ha-node1 ha-node1.aa.com
192.168.8.102 ha-node2 ha-node2.aa.com
192.168.8.103 ha-node3 ha-node3.aa.com
45.117.40.168 ha-vhost devops.aa.com
192.168.8.104 nfs-backend backend.aa.com
But in my /etc/sysconfig/memcached, the listen IP did not match the host IP above, so I changed it to the IP from /etc/hosts.
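For context, on CentOS that file looks something like the following; the key line is OPTIONS, where the -l listen address has to match the address the HAProxy check connects to (the values below are illustrative, using ha-node1's address from /etc/hosts):
# /etc/sysconfig/memcached
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS="-l 192.168.8.101"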
After restarting memcached and haproxy, it works normally now.