GCP HTTP(S) load balancer ignoring GKE readinessProbe specification - kubernetes

I’ve already seen this question; AFAIK I’m doing everything in the answers there.
Using GKE, I’ve deployed a GCP HTTP(S) load balancer-based ingress for a kubernetes cluster containing two almost identical deployments: production and development instances of the same application.
I set up a dedicated port on each pod template to use for health checks by the load balancer so that they are not impacted by redirects from the root path on the primary HTTP port. However, the health checks are consistently failing.
From these docs I added a readinessProbe parameter to my deployments, which the load balancer seems to be ignoring completely.
I’ve verified that the server on :p-ready (9292; the dedicated health check port) is running correctly using the following (in separate terminals):
➜ kubectl port-forward deployment/d-an-server p-ready
➜ curl http://localhost:9292/ -D -
HTTP/1.1 200 OK
content-length: 0
date: Wed, 26 Feb 2020 01:21:55 GMT
What have I missed?
A couple of notes on the configs below:
The ${...} variables below are filled in by the build script as part of deployment.
The second service (s-an-server-dev) is almost an exact duplicate of the first (with its own deployment), just with -dev suffixes on the names and labels.
Deployment
apiVersion: "apps/v1"
kind: "Deployment"
metadata:
  name: "d-an-server"
  namespace: "default"
  labels:
    app: "a-an-server"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: "a-an-server"
  template:
    metadata:
      labels:
        app: "a-an-server"
    spec:
      containers:
        - name: "c-an-server-app"
          image: "gcr.io/${PROJECT_ID}/an-server-app:${SHORT_SHA}"
          ports:
            - name: "p-http"
              containerPort: 8080
            - name: "p-ready"
              containerPort: 9292
          readinessProbe:
            httpGet:
              path: "/"
              port: "p-ready"
            initialDelaySeconds: 30
Service
apiVersion: "v1"
kind: "Service"
metadata:
  name: "s-an-server"
  namespace: "default"
spec:
  ports:
    - port: 8080
      targetPort: "p-http"
      protocol: "TCP"
      name: "sp-http"
  selector:
    app: "a-an-server"
  type: "NodePort"
Ingress
apiVersion: "networking.k8s.io/v1beta1"
kind: "Ingress"
metadata:
  name: "primary-ingress"
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "primary-static-ipv4"
    networking.gke.io/managed-certificates: "appname-production-cert,appname-development-cert"
spec:
  rules:
    - host: "appname.example.com"
      http:
        paths:
          - backend:
              serviceName: "s-an-server"
              servicePort: "sp-http"
    - host: "dev.appname.example.com"
      http:
        paths:
          - backend:
              serviceName: "s-an-server-dev"
              servicePort: "sp-http-dev"

I think what's happening here is that the GKE Ingress is not informed of port 9292 at all. You are referencing sp-http in the Ingress, which refers to port 8080.
You need to make sure of the following:
1. The Service's targetPort field must point to the Pod port's containerPort value or name.
2. The readiness probe must be exposed on the port that matches the servicePort specified in the Ingress.
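For illustration, a minimal sketch of what point 2 implies for the Deployment above, assuming the application can also serve a non-redirecting health endpoint on the primary port (the /healthz path is a hypothetical example, not something taken from your config):

          readinessProbe:
            httpGet:
              path: "/healthz"   # hypothetical path that returns 200 without redirecting
              port: "p-http"     # the port the Service's targetPort (and thus the Ingress) points at
            initialDelaySeconds: 30

With that in place, the health check the load balancer infers targets the same port and path combination the Ingress actually routes traffic to.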

Related

BackendConfig with Ingress gives UNHEALTHY backend

I am learning how to use an Ingress to expose my application on GKE v1.19.
I followed the tutorial in the GKE docs for Service, Ingress, and BackendConfig to get to the following setup. However, my backend services still become UNHEALTHY after some time. My aim is to override the default "/" health check path for the ingress controller.
I have the same health checks defined in my deployment.yaml file under livenessProbe and readinessProbe, and they seem to work fine since the Pod enters the Running stage. I have also tried to curl the endpoint and it returns a 200 status.
I have no clue why my service is marked as unhealthy despite being accessible directly from the NodePort service I defined. Any advice or help would be appreciated. Thank you.
I will add my yaml files below:
deployment.yaml
....
livenessProbe:
  httpGet:
    path: /api
    port: 3100
  initialDelaySeconds: 180
readinessProbe:
  httpGet:
    path: /api
    port: 3100
  initialDelaySeconds: 180
.....
backendconfig.yaml
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: backend-config
  namespace: ns1
spec:
  healthCheck:
    checkIntervalSec: 30
    port: 3100
    type: HTTP # case-sensitive
    requestPath: /api
service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/backend-config: '{"default": "backend-config"}'
  name: service-ns1
  namespace: ns1
  labels:
    app: service-ns1
spec:
  type: NodePort
  ports:
    - protocol: TCP
      port: 3100
      targetPort: 3100
  selector:
    app: service-ns1
ingress.yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ns1-ingress
  namespace: ns1
  annotations:
    kubernetes.io/ingress.global-static-ip-name: ns1-ip
    networking.gke.io/managed-certificates: ns1-cert
    kubernetes.io/ingress.allow-http: "false"
spec:
  rules:
    - http:
        paths:
          - path: /api/*
            backend:
              serviceName: service-ns1
              servicePort: 3100
The ideal time to use a BackendConfig is when the serving Pods for your Service contain multiple containers, when you're using the Anthos Ingress controller, or when you need control over the port used for the load balancer's health checks; in those cases you should use a BackendConfig CRD to define the health check parameters (see [1]).
When a backend service's health check parameters are inferred from a serving Pod's readiness probe, GKE does not keep the readiness probe and the health check synchronized. Hence, any changes you make to the readiness probe will not be copied to the health check of the corresponding backend service on the load balancer (see [2]).
In your scenario, the backend is healthy when it uses the path '/' but shows as unhealthy when it uses the path '/api', so there might be some misconfiguration in your Ingress.
I would suggest adding the annotation ingress.kubernetes.io/rewrite-target: /api
so that the path mentioned in spec.path is rewritten to /api before the request is sent to the backend service.
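A sketch of that suggestion applied to the ns1-ingress above (assuming the ingress controller in use honors this annotation); only the rewrite-target annotation is new:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ns1-ingress
  namespace: ns1
  annotations:
    ingress.kubernetes.io/rewrite-target: /api
    kubernetes.io/ingress.global-static-ip-name: ns1-ip
    networking.gke.io/managed-certificates: ns1-cert
    kubernetes.io/ingress.allow-http: "false"
spec:
  rules:
    - http:
        paths:
          - path: /api/*
            backend:
              serviceName: service-ns1
              servicePort: 3100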

GKE Ingress with container-native load balancing does not detect health check (Invalid value for field 'resource.httpHealthCheck')

I am running a cluster on Google Kubernetes Engine and I am currently trying to switch from using an Ingress with external load balancing (and NodePort services) to an ingress with container-native load balancing (and ClusterIP services) following this documentation: Container native load balancing
To communicate with my services I am using the following ingress configuration that used to work just fine when using NodePort services instead of ClusterIP:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: mw-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: mw-cluster-ip
    networking.gke.io/managed-certificates: mw-certificate
    kubernetes.io/ingress.allow-http: "false"
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: billing-frontend-service
              servicePort: 80
          - path: /auth/api/*
            backend:
              serviceName: auth-service
              servicePort: 8083
Now, following the documentation, instead of using a readinessProbe as part of the container deployment as a health check, I switched to using ClusterIP services in combination with a BackendConfig. For each deployment I am using a service like this:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: auth
  name: auth-service
  namespace: default
  annotations:
    cloud.google.com/backend-config: '{"default": "auth-hc-config"}'
spec:
  type: ClusterIP
  selector:
    app: auth
  ports:
    - port: 8083
      protocol: TCP
      targetPort: 8083
And a Backend config:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: auth-hc-config
spec:
  healthCheck:
    checkIntervalSec: 10
    port: 8083
    type: http
    requestPath: /auth/health
As a reference, this is what the readinessProbe used to look like before:
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /auth/health
    port: 8083
    scheme: HTTP
  periodSeconds: 10
Now to the actual problem. I deploy the containers and services first and they seem to start up just fine. The Ingress, however, does not seem to pick up the health checks properly and shows this in the Cloud Console:
Error during sync: error running backend syncing routine: error ensuring health check: googleapi: Error 400: Invalid value for field 'resource.httpHealthCheck': ''. HTTP healthCheck missing., invalid
The cluster as well as the node pool are running GKE version 1.17.6-gke.11 so the annotation cloud.google.com/neg: '{"ingress": true}' is not necessary. I have checked and the service is annotated correctly:
Annotations: cloud.google.com/backend-config: {"default": "auth-hc-config"}
cloud.google.com/neg: {"ingress":true}
cloud.google.com/neg-status: {"network_endpoint_groups":{"8083":"k8s1-2078beeb-default-auth-service-8083-16a14039"},"zones":["europe-west3-b"]}
I have already tried to re-create the cluster and the node-pool with no effect. Any ideas on how to resolve this? Am I missing an additional health check somewhere?
I found my issue. Apparently the BackendConfig's type attribute is case-sensitive. Once I changed it from http to HTTP it worked after I recreated the ingress.
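For reference, the only change needed in the BackendConfig above is the casing of the type field:

spec:
  healthCheck:
    checkIntervalSec: 10
    port: 8083
    type: HTTP # was "http"; the value is case-sensitive
    requestPath: /auth/health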

GKE External Load Balancer Configuration, NEG's are empty, health checks are not working

I'm working on a deployment in GKE that is my first one, so I'm pretty new to the concepts, but I understand where they're going with the tools, just need the experience to be confident.
First, I have a cluster with about five services, two of which I want to expose via an external load balancer. I've defined an annotation for gcloud to set these up under load balancing, and that seems to be working. I've also set up an annotation to create network endpoint groups (NEGs) for the services. Here's how one is configured in the deployment and service manifests.
---
# api-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert -f ./docker-compose.yml
    kompose.version: 1.21.0 ()
  creationTimestamp: null
  labels:
    io.kompose.service: api
  name: api
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: api
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kompose.cmd: kompose convert -f ./docker-compose.yml
        kompose.version: 1.21.0 ()
      creationTimestamp: null
      labels:
        io.kompose.service: api
    spec:
      containers:
        - args:
            - bash
            - -c
            - node src/server.js
          env:
            - name: NODE_ENV
              value: production
            - name: TZ
              value: America/New_York
          image: gcr.io/<PROJECT_ID>/api
          imagePullPolicy: Always
          name: api
          ports:
            - containerPort: 8087
          resources: {}
      restartPolicy: Always
      serviceAccountName: ""
status: {}
---
# api-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    cloud.google.com/neg: '{"ingress": true}'
  creationTimestamp: null
  labels:
    io.kompose.service: api
  name: api
spec:
  type: LoadBalancer
  ports:
    - name: "8087"
      port: 8087
      targetPort: 8087
status:
  loadBalancer: {}
I think I may be missing some kind of configuration here, but I'm unsure.
I've also seen that I can define liveness checks in the YAML by adding:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
I also have my ingress configured like this:
---
# master-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: master-application-ingress
  annotations:
    ingress.kubernetes.io/secure-backends: "true"
spec:
  rules:
    - http:
        paths:
          - path: /api
            backend:
              serviceName: api
              servicePort: 8087
    - http:
        paths:
          - path: /ui
            backend:
              serviceName: ui
              servicePort: 80
and I've seen cases where it just needs the port, for TCP checks, but I've already defined these in my application and in the load balancer. I guess I want to know where I should be defining these checks.
Also, I have an issue with the NEGs created by the annotation being empty; or is this normal with manifest-created NEGs?
The health check is created based on your readinessProbe, not the livenessProbe. Make sure you have a readinessProbe configured in your pod spec before creating the Ingress resource.
As for the empty NEG, this might be due to a mismatch in the health check. The NEG relies on the readiness gate feature (explained here); since you only have the livenessProbe defined, it is entirely possible the health check is misconfigured and thus failing.
You should also have an internal IP for the internal LB you created; can you reach the pods that way? If both are failing, the health check is likely the issue, since the NEG will not add pods to the group that it sees as not ready.
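A minimal sketch of what that could look like for the api container above (the /healthz path is an assumption; point it at whatever endpoint your Node server actually answers with a 200):

          readinessProbe:
            httpGet:
              path: /healthz # hypothetical health endpoint on the api container
              port: 8087
            initialDelaySeconds: 10
            periodSeconds: 10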
Now you can also create a BackendConfig as a separate Kubernetes declaration.
My example:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: cms-backend-config
  namespace: prod
spec:
  healthCheck:
    checkIntervalSec: 60
    port: 80
    type: HTTP # case-sensitive
    requestPath: /your-healthcheck-path
  connectionDraining:
    drainingTimeoutSec: 60
I don't have any readiness/liveness probes defined explicitly at all, and everything works. I also noticed there are still glitches between GKE and the rest of GCP sometimes. I remember needing to re-create both my deployments and Ingress from scratch at some point after I had played around with different options for quite a while.
What I also did, and this might have been the main reason I started seeing endpoints in the automatically registered NEGs, was to add a default backend to the Ingress so that a separate default would not be registered with the load balancer:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prod-ingress
  namespace: prod
  annotations:
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.global-static-ip-name: load-balancer-ip
    networking.gke.io/managed-certificates: my-certificate
spec:
  backend:
    serviceName: my-service
    servicePort: 80
  rules:
    - host: "example.com"
      http:
        paths:
          - path: /
            backend:
              serviceName: my-service
              servicePort: 80

GCE Ingress not picking up health check from readiness probe

When I create a GCE ingress, Google Load Balancer does not set the health check from the readiness probe. According to the docs (Ingress GCE health checks) it should pick it up.
Expose an arbitrary URL as a readiness probe on the pods backing the Service.
Any ideas why?
Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: frontend-prod
  labels:
    app: frontend-prod
spec:
  selector:
    matchLabels:
      app: frontend-prod
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: frontend-prod
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - image: app:latest
          readinessProbe:
            httpGet:
              path: /healthcheck
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 5
          name: frontend-prod-app
        - env:
            - name: PASSWORD_PROTECT
              value: "1"
          image: nginx:latest
          readinessProbe:
            httpGet:
              path: /health
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
          name: frontend-prod-nginx
Service:
apiVersion: v1
kind: Service
metadata:
  name: frontend-prod
  labels:
    app: frontend-prod
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
      name: http
  selector:
    app: frontend-prod
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: frontend-prod-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: frontend-prod-ip
spec:
  tls:
    - secretName: testsecret
  backend:
    serviceName: frontend-prod
    servicePort: 80
So apparently, you need to include the container port in the PodSpec.
This does not seem to be documented anywhere.
For example:
spec:
  containers:
    - name: nginx
      image: nginx:1.7.9
      ports:
        - containerPort: 80
Thanks, Brian! https://github.com/kubernetes/ingress-gce/issues/241
This is now possible in the latest GKE (I am on 1.14.10-gke.27, not sure if that matters):
1. Define a readinessProbe on your container in your Deployment.
2. Recreate your Ingress (see the sketch below).
The health check will point to the path in readinessProbe.httpGet.path of the Deployment YAML config.
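A sketch of those two steps with kubectl, assuming the manifests from the question are saved as deployment.yaml and ingress.yaml:

# apply the Deployment that now carries the readinessProbe
kubectl apply -f deployment.yaml
# delete and re-create the Ingress so GKE re-generates the backend health check
kubectl delete ingress frontend-prod-ingress
kubectl apply -f ingress.yaml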
Update by Jonathan Lin below: This has been fixed very recently. Define a readinessProbe on the Deployment. Recreate your Ingress. It will pick up the health check path from the readinessProbe.
GKE Ingress health check path is currently not configurable. You can go to http://console.cloud.google.com (UI) and visit the Load Balancers list to see the health check it uses.
Currently the health check for an Ingress is GET / on each backend: specified on the Ingress. So all your apps behind a GKE Ingress must return HTTP 200 OK to GET / requests.
That said, the health checks you specified on your Pods are still being used by the kubelet to make sure your Pod is actually functioning and healthy.
Google has recently added support for a CRD that can configure your backend services along with health checks:
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: backend-config
  namespace: prod
spec:
  healthCheck:
    checkIntervalSec: 30
    port: 8080
    type: HTTP # case-sensitive
    requestPath: /healthcheck
See here.
Another reason why the Google Cloud load balancer does not pick up the GCE health check configuration from a Kubernetes Pod readiness probe could be that the Service is configured as "selectorless" (the selector attribute is empty and you manage Endpoints directly).
This is the case with e.g. kube-lego: see https://github.com/jetstack/kube-lego/issues/68#issuecomment-303748457 and https://github.com/jetstack/kube-lego/issues/68#issuecomment-327457982.
The original question does have a selector specified in the Service, so this hint doesn't apply there. It serves visitors who have the same problem with a different cause.
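For comparison, a "selectorless" Service has this shape (no spec.selector, so Endpoints are managed manually and GKE has no Pods from which to infer a readiness probe); the name is purely illustrative:

apiVersion: v1
kind: Service
metadata:
  name: my-selectorless-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80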

Kubernetes Ingress Path only works with /

I have configured a Kubernetes Ingress service, but it only works when the path is /.
I have tried all manner of different values for the path including:
/*
/servicea
/servicea/
/servicea/*
This is my ingress configuration (that works)
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    name: boardingservice
    annotations:
      ingress.kubernetes.io/rewrite-target: /
  spec:
    rules:
      - host: my.url.com
        http:
          paths:
            - path: /
              backend:
                serviceName: servicea-nodeport
                servicePort: 80
This is my nodeport service
- apiVersion: v1
  kind: Service
  metadata:
    name: servicea-nodeport
  spec:
    type: NodePort
    ports:
      - port: 80
        targetPort: 8081
        nodePort: 30124
    selector:
      app: servicea
And this is my deployment
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: servicea
  spec:
    replicas: 1
    template:
      metadata:
        name: ervicea
        labels:
          app: servicea
      spec:
        containers:
          - image: 350329402011.dkr.ecr.eu-west-2.amazonaws.com/servicea
            name: servicea
            ports:
              - containerPort: 8080
                protocol: TCP
          - image: 350329402011.dkr.ecr.eu-west-2.amazonaws.com/serviceb
            name: serviceab
            ports:
              - containerPort: 8081
                protocol: TCP
If the path is /, then I can do this: http://my.url.com/api/ping.
But as I will have multiple services, I want to do this: http://my.url.com/servicea/api/ping. However, when I set the path to /servicea, I get a 404.
I am running Kubernetes on AWS with an ingress-nginx ingress controller.
Any ideas?
You are not using Kubernetes Pods as they are intended to be used. A Pod contains one or more application containers which are relatively tightly coupled; in a pre-container world, they would have executed on the same physical or virtual machine.
If you have two applications, servicea and serviceb, they should be running in different Pods: one Pod for servicea and another one for serviceb. This has many benefits: you can deploy them separately, scale them independently, etc.
As the docs say:
"A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources."
These Pods can be created using Deployments, as you were already doing. That's fine and recommended.
Once you have the Deployments running, you'd create a different Service that would balance traffic between all the Pods for a given Deployment.
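For example, a Service for the servicea Deployment could look like this (a sketch; it assumes the servicea Pods keep the app: servicea label and that the container listens on 8080 as in your manifest):

apiVersion: v1
kind: Service
metadata:
  name: servicea
spec:
  type: NodePort
  selector:
    app: servicea
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP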
And finally, you want to hit servicea or serviceb depending on the request URL. That can be done with Ingress, as you were trying, but mapping each path to a different Service. For example:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test
  annotations:
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: my.url.com
      http:
        paths:
          - path: /servicea
            backend:
              serviceName: servicea
              servicePort: 80
          - path: /serviceb
            backend:
              serviceName: serviceb
              servicePort: 80
That way, requests going to your ingress controller using the /servicea path would be served by the Pods behind the servicea Service. And requests going to your ingress controller using the /serviceb path would be served by the Pods behind the serviceb Service.
For anyone reading this: my configuration was correct (even though unorthodox, as pointed out by fiunchinho); the error was in the Spring Boot applications running in the containers. I needed to change the context paths to match the Ingress path. I could, of course, have changed the @GetMapping and @PostMapping methods in my Spring controller, but I opted to change the context path.
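In case it helps anyone doing the same, a sketch of that change for a Spring Boot 2.x application, e.g. in application.yml (the value must match the Ingress path):

server:
  servlet:
    context-path: /servicea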