HAProxy peering in kubernetes - kubernetes

Background
Due to our application needs to use sticky tables for a custom header, we decided to use HAProxy, our layout looks as follows:
Nginx Ingress -> HAproxy service -> headless services of stateful application
So far stickiness works fine, but there is a scenario where if handled by the other HAproxy replica, it fails. We are trying to use peers to address this problem.
I use bitnami helm chart to deploy it, this is my values file:
metadata:
chartName: bitnami/haproxy
chartVersion: 0.3.7
service:
type: ClusterIP
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080
- name: https
protocol: TCP
port: 443
targetPort: 8080
- name: peers
protocol: TCP
port: 10000
targetPort: 10000
containerPorts:
- name: http
containerPort: 8080
- name: https
containerPort: 8080
- name: peers
containerPort: 10000
configuration: |
global
log stdout format raw local0 debug
defaults
mode http
option httplog
timeout client 10s
timeout connect 5s
timeout server 10s
timeout http-request 10s
log global
resolvers default
nameserver dns1 172.20.0.10:53
hold timeout 30s
hold refused 30s
hold valid 10s
resolve_retries 3
timeout retry 3s
peers hapeers
peer $(MY_POD_IP):10000 # I attempted to do something like this
peer $(REPLICA_2_IP):10000 #
frontend stats
bind *:8404
stats enable
stats uri /
stats refresh 10s
frontend myfrontend
mode http
option httplog
bind *:8080
default_backend webservers
backend webservers
mode http
log stdout local0 debug
stick-table type string len 64 size 1m expire 1d peers hapeers
stick on req.hdr(MyHeader)
server s1 headless-service-1:8080 resolvers default check port 8080 inter 5s rise 2 fall 20
server s2 headless-service-2:8080 resolvers default check port 8080 inter 5s rise 2 fall 20
server s3 headless-service-3:8080 resolvers default check port 8080 inter 5s rise 2 fall 20
replicaCount: 2
extraEnvVars:
- name: LOG_LEVEL
value: debug
- name: MY_POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
From what I read in HAProxy documentation, it requires the peers IP's, which in this case are the replicas IPs. However, the configmap does not allow injecting IPs from the HAProxy replicas.
I also thought of using a initContainer to modify the haproxy.cfg at deployment time with the correct IPs, but the volume is read-only and I would have to alter a fork of the chart to customize it.
If anyone has an idea of a different approach or workaround, I would appreciate the comments. Thanks!

...the configmap does not allow injecting IPs from the HAProxy replicas.
HAProxy's configuration supports environment variables. Eg. peer $(MY_POD_IP):10000 => peer ${MY_POD_IP}:10000

Related

K8S with Traefik and HAP not getting real client IP on pods

I have an external LB using HAProxy in order to have a HA k8s cluster. My cluster is in K3s from Rancher and it's using Traefik LB internally.
I'm currently facing an issue where in my pods I'm getting the Traefik IP instead of the real client IP.
HAP Configuration:
# Ansible managed
defaults
maxconn 1000
mode http
log global
option dontlognull
log stdout local0 debug
option httplog
timeout http-request 5s
timeout connect 5000
timeout client 2000000
timeout server 2000000
frontend k8s
bind *:6443
bind *:80
bind *:443
mode tcp
option tcplog
use_backend masters-k8s
backend masters-k8s
mode tcp
balance roundrobin
server master01 master01.k8s.int.ntw
server master02 master02.k8s.int.ntw
# end Ansible managed
Traefik Service:
apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: traefik
meta.helm.sh/release-namespace: kube-system
service.beta.kubernetes.io/do-loadbalancer-enable-proxy-protocol: "true"
labels:
app: traefik
app.kubernetes.io/managed-by: Helm
chart: traefik-1.81.0
heritage: Helm
release: traefik
spec:
clusterIP: 10.43.250.142
clusterIPs:
- 10.43.250.142
externalTrafficPolicy: Local
ports:
- name: http
nodePort: 32232
port: 80
protocol: TCP
targetPort: http
- name: https
nodePort: 30955
port: 443
protocol: TCP
targetPort: https
selector:
app: traefik
release: traefik
sessionAffinity: None
type: LoadBalancer
status:
loadBalancer:
ingress:
- ip: 10.0.1.1
- ip: 10.0.1.11
- ip: 10.0.1.12
- ip: 10.0.1.2
With this configurations I never get my real IP on the pods, during some research I saw that people recommend to use send-proxy in the HAP like this:
server master01 master01.k8s.int.ntw check send-proxy-v2
server master02 master02.k8s.int.ntw check send-proxy-v2
But when I do so, all my cluster communication returns ERR_CONNECTION_CLOSED.
If I'm looking at it correctly, this means its going from the HAP to the cluster and the cluster is somewhere rejecting the traffic.
Any clues what I'm missing here?
Thanks
Well you have two options.
use proxy protocol
use X-Forwarded-For header
Option 1: proxy protocol
This option requires that both sites HAProxy and Traefik uses the proxy protocol, that's the reason why the people recommend "send-proxy-v2".
This requires also that all other clients which want to connect to Traefik MUST also use the proxy protocol, if the clients does not use the proxy protocol then you will get what you get a communication error.
As you configured HAProxy in TCP mode is this the only option to get the client ip to Traefik.
Option 2: X-Forwarded-For
I personally would use this option because it makes it possible to connect to traefik with any http(s) client.
This requires that you use the HTTP mode in haproxy and some more parameters like option forwardfor.
# Ansible managed
defaults
maxconn 1000
mode http
log global
option dontlognull
log stdout local0 debug
option httplog
timeout http-request 5s
timeout connect 5s
timeout client 200s
timeout server 200s
# send client ip in the x-forwarded-for header
option forwardfor
frontend k8s
bind *:6443 v4v6 alpn h2,http/1.1 ssl ca-file /etc/haproxy/letsencryptauthorityx3.pem crt /etc/ssl/haproxy/
bind *:80 v4v6 alpn h2,http/1.1 ssl ca-file /etc/haproxy/letsencryptauthorityx3.pem crt /etc/ssl/haproxy/
bind *:443 v4v6 alpn h2,http/1.1 ssl ca-file /etc/haproxy/letsencryptauthorityx3.pem crt /etc/ssl/haproxy/
use_backend masters-k8s
backend masters-k8s
balance roundrobin
server master01 master01.k8s.int.ntw check
server master02 master02.k8s.int.ntw check
# end Ansible managed
In the File "/etc/haproxy/letsencryptauthorityx3.pem" are the CA's for the backends, in the directory "/etc/ssl/haproxy/" are the certificates for the frontends.
Please take a look into the documentation about the crt keyword.
You have also to configure traefik to allow the header from haproxy forwarded-headers

kubernetes expose services with Traefik 2.x as ingress with CRD

What do i have working
I have a Kubernetes cluster as follow:
Single control plane (but plan to extend to 3 control plane for HA)
2 worker nodes
On this cluster i deployed (following this doc from traefik https://docs.traefik.io/user-guides/crd-acme/):
A deployment that create two pods :
traefik itself: which will be in charge of routing with exposed port 80, 8080
whoami:a simple http server thats responds to http requests
two services
traefik service:
whoami servic:
One traefik IngressRoute:
What do i want
I have multiple services running in the cluster and i want to expose them to the outside using Ingress.
More precisely i want to use the new Traefik 2.x CDR ingress methods.
My ultimate goal is to use new traefiks 2.x CRD to expose resources on port 80, 443, 8080 using IngressRoute Custom resource definitions
What's the problem
If i understand well, classic Ingress controllers allow exposition of every ports we want to the outside world (including 80, 8080 and 443).
But with the new traefik CDR ingress approach on it's own it does not exports anything at all.
One solution is to define the Traefik service as a loadbalancer typed service and then expose some ports. But you are forced to use the 30000-32767 ports range (same as nodeport), and i don't want to add a reverse proxy in front of the reverse proxy to be able to expose port 80 and 443...
Also i've seed from the doc of the new igress CRD (https://docs.traefik.io/user-guides/crd-acme/) that:
kubectl port-forward --address 0.0.0.0 service/traefik 8000:8000 8080:8080 443:4443 -n default
is required, and i understand that now. You need to map the host port to the service port.
But mapping the ports that way feels clunky and counter intuitive. I don't want to have a part of the service description in a yaml and at the same time have to remember that i need to map port with kubectl.
I'm pretty sure there is a neat and simple solution to this problem, but i can't understand how to keep things simple. Do you guys have an experience in kubernetes with the new traefik 2.x CRD config?
apiVersion: v1
kind: Service
metadata:
name: traefik
spec:
ports:
- protocol: TCP
name: web
port: 80
targetPort: 8000
- protocol: TCP
name: admin
port: 8080
targetPort: 8080
- protocol: TCP
name: websecure
port: 443
targetPort: 4443
selector:
app: traefik
have you tried to use tragetPort where every request comes on 80 redirect to 8000 but when you use port-forward you need to always use service instead of pod
You can try to use LoadBalancer service type for expose the Traefik service on ports 80, 443 and 8080. I've tested the yaml from the link you provided in GKE, and it's works.
You need to change the ports on 'traefik' service and add a 'LoadBalancer' as service type:
kind: Service
metadata:
name: traefik
spec:
ports:
- protocol: TCP
name: web
port: 80 <== Port to receive HTTP connections
- protocol: TCP
name: admin
port: 8080 <== Administration port
- protocol: TCP
name: websecure
port: 443 <== Port to receive HTTPS connections
selector:
app: traefik
type: LoadBalancer <== Define the type load balancer
Kubernetes will create a Loadbalancer for you service and you can access your application using ports 80 and 443.
$ curl https://35.111.XXX.XX/tls -k
Hostname: whoami-5df4df6ff5-xwflt
IP: 127.0.0.1
IP: 10.60.1.11
RemoteAddr: 10.60.1.13:55262
GET /tls HTTP/1.1
Host: 35.111.XXX.XX
User-Agent: curl/7.66.0
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.60.1.1
X-Forwarded-Host: 35.111.XXX.XX
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-66dd84c65c-4c5gp
X-Real-Ip: 10.60.1.1
$ curl http://35.111.XXX.XX/notls
Hostname: whoami-5df4df6ff5-xwflt
IP: 127.0.0.1
IP: 10.60.1.11
RemoteAddr: 10.60.1.13:55262
GET /notls HTTP/1.1
Host: 35.111.XXX.XX
User-Agent: curl/7.66.0
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.60.1.1
X-Forwarded-Host: 35.111.XXX.XX
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-66dd84c65c-4c5gp
X-Real-Ip: 10.60.1.1
Well after some time i've decided to put an haproxy in front of the kubernetes Cluster. It's seems to be the only solution ATM.

Kubernetes network policy egress ports

I have the following network policy for restricting access to a frontend service page:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
namespace: namespace-a
name: allow-frontend-access-from-external-ip
spec:
podSelector:
matchLabels:
app: frontend-service
ingress:
- from:
- ipBlock:
cidr: 0.0.0.0/0
ports:
- protocol: TCP
port: 443
egress:
- to:
- ipBlock:
cidr: 0.0.0.0/0
ports:
- protocol: TCP
port: 443
My question is: can I enforce HTTPS with my egress rule (port restriction on 443) and if so, how does this work? Assuming a client connects to the frontend-service, the client chooses a random port on his machine for this connection, how does Kubernetes know about that port, or is there a kind of port mapping in the cluster so the traffic back to the client is on port 443 and gets mapped back to the clients original port when leaving the cluster?
You might have a wrong understanding of the network policy(NP).
This is how you should interpret this section:
egress:
- to:
- ipBlock:
cidr: 0.0.0.0/0
ports:
- protocol: TCP
port: 443
Open port 443 for outgoing traffic for all pods within 0.0.0.0/0 cidr.
The thing you are asking
how does Kubernetes know about that port, or is there a kind of port
mapping in the cluster so the traffic back to the client is on port
443 and gets mapped back to the clients original port when leaving the
cluster?
is managed by kube-proxy in following way:
For the traffic that goes from pod to external addresses, Kubernetes simply uses SNAT. What it does is replace the pod’s internal source IP:port with the host’s IP:port. When the return packet comes back to the host, it rewrites the pod’s IP:port as the destination and sends it back to the original pod. The whole process is transparent to the original pod, who doesn’t know the address translation at all.
Take a look at Kubernetes networking basics for a better understanding.

gRPC & HTTP servers on GKE Ingress failing healthcheck for gRPC backend

I want to deploy a gRPC + HTTP servers on GKE with HTTP/2 and mutual TLS. My deployment have both a readiness probe and liveness probe with custom path. I expose both the gRPC and HTTP servers via an Ingress.
deployment's probes and exposed ports:
livenessProbe:
failureThreshold: 3
httpGet:
path: /_ah/health
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
readinessProbe:
failureThreshold: 3
httpGet:
path: /_ah/health
port: 8443
scheme: HTTPS
name: grpc-gke
ports:
- containerPort: 8443
protocol: TCP
- containerPort: 50052
protocol: TCP
NodePort service:
apiVersion: v1
kind: Service
metadata:
name: grpc-gke-nodeport
labels:
app: grpc-gke
annotations:
cloud.google.com/app-protocols: '{"grpc":"HTTP2","http":"HTTP2"}'
service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2", "http": "HTTP2"}'
spec:
type: NodePort
ports:
- name: grpc
port: 50052
protocol: TCP
targetPort: 50052
- name: http
port: 443
protocol: TCP
targetPort: 8443
selector:
app: grpc-gke
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: grpc-gke-ingress
annotations:
kubernetes.io/ingress.allow-http: "false"
#kubernetes.io/ingress.global-static-ip-name: "grpc-gke-ip"
labels:
app: grpc-gke
spec:
rules:
- http:
paths:
- path: /_ah/*
backend:
serviceName: grpc-gke-nodeport
servicePort: 443
backend:
serviceName: grpc-gke-nodeport
servicePort: 50052
The pod does exist, and has a "green" status, before creating the liveness and readiness probes. I see regular logs on my server that both the /_ah/live and /_ah/ready are called by the kube-probe and the server responds with the 200 response.
I use a Google managed TLS certificate on the load balancer (LB). My HTTP server creates a self-signed certificate -- inspired by this blog.
I create the Ingress after I start seeing the probes' logs. After that it creates an LB with two backends, one for the HTTP and one for the gRPC. The HTTP backend's health checks are OK and the HTTP server is accessible from the Internet. The gRPC backend's health check fails thus the LB does not route the gRPC protocol and I receive the 502 error response.
This is with GKE master 1.12.7-gke.10. I also tried newer 1.13 and older 1.11 masters. The cluster has HTTP load balancing enabled and VPC-native enabled. There are firewall rules to allow access from LB to my pods (I even tried to allow all ports from all IP addresses). Delaying the probes does not help either.
Funny thing is that I deployed nearly the same setup, just the server's Docker image is different, couple of months ago and it is running without any issues. I can even deploy new Docker images of the server and everything is great. I cannot find any difference between these two.
There is one another issue, the Ingress is stuck on the "Creating Ingress" state for days. It never finishes and never sees the LB. The Ingress' LB never has a front-end and I always have to manually add an HTTP/2 front-end with a static IP and Google managed TLS certificate. This should be happening only for cluster which were created without "HTTP load balancing", but it happens in my case every time for all my "HTTP load balancing enabled" clusters. The working deployment is in this state for months already.
Any ideas why the gRPC backend's health check could be failing even though I see logs that the readiness and liveness endpoints are called by kube-probe?
EDIT:
describe svc grpc-gke-nodeport
Name: grpc-gke-nodeport
Namespace: default
Labels: app=grpc-gke
Annotations: cloud.google.com/app-protocols: {"grpc":"HTTP2","http":"HTTP2"}
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"cloud.google.com/app-protocols":"{\"grpc\":\"HTTP2\",\"http\":\"HTTP2\"}",...
service.alpha.kubernetes.io/app-protocols: {"grpc":"HTTP2", "http": "HTTP2"}
Selector: app=grpc-gke
Type: NodePort
IP: 10.4.8.188
Port: grpc 50052/TCP
TargetPort: 50052/TCP
NodePort: grpc 32148/TCP
Endpoints: 10.0.0.25:50052
Port: http 443/TCP
TargetPort: 8443/TCP
NodePort: http 30863/TCP
Endpoints: 10.0.0.25:8443
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
and the health check for the gRPC backend is an HTTP/2 GET using path / on port 32148. Its description is "Default kubernetes L7 Loadbalancing health check." where as the description of the HTTP's back-end health check is "Kubernetes L7 health check generated with readiness probe settings.". Thus the health check for the gRPC back-end is not created from the readiness probe.
Editing the health check to point to port 30863 an changing the path to readiness probe fixes the issue.
GKE ingress just recently started supporting full gRPC support in beta (whereas HTTP2 ro HTTP1.1 conversion was used in the past). To use gRCP though, you need to add an annotation to the ingress "cloud.google.com/app-protocols: '{"http2-service":"HTTP2"}'".
Refer to this how-to doc for more detais.
Editing the health check to point to the readiness probe's path and changed the port to the one of the HTTP back-end fixed this issue (look for the port in the HTTP back-end's health check. it is the NodePort's.). It runs know without any issues.
Using the same health check for the gRPC back-end as for the HTTP back-end did not work, it was reset back to its own health check. Even deleting the gRPC back-end's health check did not help, it was recreated. Only editing it to use a different port and path has helped.

Accessing service health checks ports after configuring istio

So we're deploying istio 1.0.2 with global mtls and so far it's gone well.
For health checks we've added separate ports to the services and configured them as per the docs:
https://istio.io/docs/tasks/traffic-management/app-health-check/#mutual-tls-is-enabled
Our application ports are now on 8080 and health checks ports are on 8081.
After doing this Kubernetes is able to do health checks and the services appear to be running normally.
However our monitoring solution cannot hit the health check port.
The monitoring application also sits in kubernetes and is currently outside the mesh. The above doc says the following:
Because the Istio proxy only intercepts ports that are explicitly declared in the containerPort field, traffic to 8002 port bypasses the Istio proxy regardless of whether Istio mutual TLS is enabled.e
This is how we have it configured. So in our case 8081 should be outside the mesh:
livenessProbe:
failureThreshold: 3
httpGet:
path: /manage/health
port: 8081
scheme: HTTP
initialDelaySeconds: 180
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: <our-service>
ports:
- containerPort: 8080
name: http
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /manage/health
port: 8081
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
However we can't access 8081 from another pod which is outside the mesh.
For example:
curl http://<our-service>:8081/manage/health
curl: (7) Failed connect to <our-service>:8081; Connection timed out
If we try from another pod inside the mesh istio throws back a 404, which is perhaps expected.
I tried to play around with destination rules like this:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: <our-service>-health
spec:
host: <our-service>.namepspace.svc.cluster.local
trafficPolicy:
portLevelSettings:
- port:
number: 8081
tls:
mode: DISABLE
But that just kills all connectivity to the service, both internally and through the ingress gateway.
According to the official Istio Documentation port 8081 will not get through Istio Envoy, hence won’t be accessible for the other Pods outside your service mesh, because Istio proxy determines only the value of containerPort transmitting through the Pod's service.
In case you build Istio service mesh without TLS authentication between Pods, there is an option to use the same port for the basic network route to the Pod's service and the readiness/liveness probes.
However, if you use port 8001 for both regular traffic and liveness
probes, health check will fail when mutual TLS is enabled because the
HTTP request is sent from Kubelet, which does not send client
certificate to the liveness-http service.
Assuming that Istio Mixer provides a three Prometheus endpoints, you can consider using Prometheus as the main monitoring tool in order to collect and analyze the mesh metrics.