Accessing service health check ports after configuring Istio - kubernetes

We're deploying Istio 1.0.2 with global mTLS, and so far it has gone well.
For health checks we've added separate ports to the services and configured them as per the docs:
https://istio.io/docs/tasks/traffic-management/app-health-check/#mutual-tls-is-enabled
Our application ports are now on 8080 and the health check ports are on 8081.
After doing this, Kubernetes is able to perform the health checks and the services appear to be running normally.
However our monitoring solution cannot hit the health check port.
The monitoring application also sits in kubernetes and is currently outside the mesh. The above doc says the following:
Because the Istio proxy only intercepts ports that are explicitly declared in the containerPort field, traffic to port 8002 bypasses the Istio proxy regardless of whether Istio mutual TLS is enabled.
This is how we have it configured. So in our case 8081 should be outside the mesh:
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /manage/health
    port: 8081
    scheme: HTTP
  initialDelaySeconds: 180
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
name: <our-service>
ports:
- containerPort: 8080
  name: http
  protocol: TCP
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /manage/health
    port: 8081
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
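For completeness, the corresponding Service is not shown in full above; we assume it declares both ports, roughly like this sketch (the port names are illustrative):

# Assumed Service shape (sketch only): the question states that separate ports
# were added to the services, so both 8080 and 8081 are declared here.
apiVersion: v1
kind: Service
metadata:
  name: <our-service>
spec:
  selector:
    app: <our-service>
  ports:
  - name: http
    port: 8080
    targetPort: 8080
    protocol: TCP
  - name: health
    port: 8081
    targetPort: 8081
    protocol: TCP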
However we can't access 8081 from another pod which is outside the mesh.
For example:
curl http://<our-service>:8081/manage/health
curl: (7) Failed connect to <our-service>:8081; Connection timed out
If we try from another pod inside the mesh, Istio throws back a 404, which is perhaps expected.
I tried to play around with destination rules like this:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: <our-service>-health
spec:
  host: <our-service>.namespace.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 8081
      tls:
        mode: DISABLE
But that just kills all connectivity to the service, both internally and through the ingress gateway.
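If I understand the Istio 1.0 model correctly, a DestinationRule only changes the client-side TLS behaviour and is normally paired with a server-side authentication Policy for the same port. A rough, untested sketch of what I believe that companion Policy would look like (names mirror the placeholders above):

# Untested sketch: port-level authentication Policy (Istio 1.0 API) that drops
# the peer (mTLS) requirement for port 8081 only.
apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: <our-service>-health
spec:
  targets:
  - name: <our-service>
    ports:
    - number: 8081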

According to the official Istio documentation, port 8081 will not go through the Istio Envoy proxy, and hence won't be reachable from the other Pods outside your service mesh, because the Istio proxy only considers the ports declared in the Pod's containerPort field.
If you build the Istio service mesh without mutual TLS authentication between Pods, there is an option to use the same port for regular traffic to the Pod's service and for the readiness/liveness probes.
However, if you use port 8001 for both regular traffic and liveness probes, health check will fail when mutual TLS is enabled because the HTTP request is sent from Kubelet, which does not send client certificate to the liveness-http service.
Since Istio Mixer provides three Prometheus endpoints, you can also consider using Prometheus as the main monitoring tool to collect and analyze the mesh metrics.
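To illustrate the same-port option above (only applicable when mutual TLS is not enforced), here is a minimal sketch of a container fragment where the probes reuse the regular application port; the port and path are taken from the question's setup:

# Sketch only: probes share the declared application port, so no separate
# health check port is needed when mTLS is not enforced.
ports:
- containerPort: 8080
  name: http
  protocol: TCP
readinessProbe:
  httpGet:
    path: /manage/health
    port: 8080
    scheme: HTTP
livenessProbe:
  httpGet:
    path: /manage/health
    port: 8080
    scheme: HTTP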

Related

istio failing with "failed checking application ports"

Using Istio 1.0.2 and Kubernetes 1.12 on GKE.
When deploying a web application, the pod never reaches the healthy status.
My main pod spits out healthy logs.
However, my sidecar, i.e. the istio-proxy container, reads:
* failed checking application ports. listeners="0.0.0.0:15090","10.8.48.10:53","10.8.63.194:15443","10.8.63.194:443","10.8.58.47:15011","10.8.54.249:42422","10.8.48.44:443","10.8.58.10:44134","10.8.54.34:443","10.8.63.194:15020","10.8.49.250:8080","10.8.63.194:31400","10.8.63.194:15029","10.8.63.194:15030","10.8.60.185:11211","10.8.49.0:53","10.8.61.194:443","10.8.48.1:443","10.8.48.180:80","10.8.51.133:443","10.8.63.194:15031","10.8.63.194:15032","0.0.0.0:9901","0.0.0.0:9090","0.0.0.0:80","0.0.0.0:3000","0.0.0.0:8060","0.0.0.0:15010","0.0.0.0:8080","0.0.0.0:20001","0.0.0.0:7979","0.0.0.0:9091","0.0.0.0:9411","0.0.0.0:15004","0.0.0.0:15014","0.0.0.0:3030","10.8.33.8:15020","0.0.0.0:15001"
* envoy missing listener for inbound application port: 5000
5000 is indeed the port my web app is listening on.
Any suggestions?
If there is a mismatch between the deployment port and the service port, this can cause issues in combination with the readiness check of the sidecar.
Add the annotation readiness.status.sidecar.istio.io/applicationPorts to your deployment like this:
annotations:
  readiness.status.sidecar.istio.io/applicationPorts: "5000"
You can add multiple ports by using comma separation.
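For context, a hedged sketch of where that annotation lives; it belongs on the Pod template metadata of the Deployment, which is where the sidecar reads it (all names and the image are placeholders):

# Placeholder Deployment: the annotation sits on the Pod template metadata,
# not on the Deployment object itself.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-web-app
spec:
  selector:
    matchLabels:
      app: my-web-app
  template:
    metadata:
      labels:
        app: my-web-app
      annotations:
        readiness.status.sidecar.istio.io/applicationPorts: "5000"
    spec:
      containers:
      - name: web
        image: my-web-app:latest   # placeholder image
        ports:
        - containerPort: 5000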
@mkrobi I got this working as suggested in this post by adding the following:
readinessProbe:
  httpGet:
    path: /
    port: 8080
    scheme: HTTP
to the containers in my deployment. Make sure to change port 8080 to 5000.
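For completeness, a rough sketch of how that container section could end up looking, with the probe pointing at the declared application port (the container name and image are placeholders):

# Sketch only: the readiness probe targets the same port as the containerPort.
containers:
- name: web
  image: my-web-app:latest   # placeholder image
  ports:
  - containerPort: 5000
  readinessProbe:
    httpGet:
      path: /
      port: 5000
      scheme: HTTP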

gRPC & HTTP servers on GKE Ingress failing healthcheck for gRPC backend

I want to deploy gRPC + HTTP servers on GKE with HTTP/2 and mutual TLS. My deployment has both a readiness probe and a liveness probe with a custom path. I expose both the gRPC and HTTP servers via an Ingress.
deployment's probes and exposed ports:
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /_ah/health
    port: 8443
    scheme: HTTPS
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /_ah/health
    port: 8443
    scheme: HTTPS
name: grpc-gke
ports:
- containerPort: 8443
  protocol: TCP
- containerPort: 50052
  protocol: TCP
NodePort service:
apiVersion: v1
kind: Service
metadata:
  name: grpc-gke-nodeport
  labels:
    app: grpc-gke
  annotations:
    cloud.google.com/app-protocols: '{"grpc":"HTTP2","http":"HTTP2"}'
    service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2", "http": "HTTP2"}'
spec:
  type: NodePort
  ports:
  - name: grpc
    port: 50052
    protocol: TCP
    targetPort: 50052
  - name: http
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: grpc-gke
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grpc-gke-ingress
  annotations:
    kubernetes.io/ingress.allow-http: "false"
    #kubernetes.io/ingress.global-static-ip-name: "grpc-gke-ip"
  labels:
    app: grpc-gke
spec:
  rules:
  - http:
      paths:
      - path: /_ah/*
        backend:
          serviceName: grpc-gke-nodeport
          servicePort: 443
  backend:
    serviceName: grpc-gke-nodeport
    servicePort: 50052
The pod does exist, and has a "green" status, before I create the liveness and readiness probes. I see regular logs on my server showing that both /_ah/live and /_ah/ready are called by the kube-probe, and the server responds with a 200 response.
I use a Google managed TLS certificate on the load balancer (LB). My HTTP server creates a self-signed certificate -- inspired by this blog.
I create the Ingress after I start seeing the probes' logs. After that it creates an LB with two backends, one for the HTTP and one for the gRPC. The HTTP backend's health checks are OK and the HTTP server is accessible from the Internet. The gRPC backend's health check fails thus the LB does not route the gRPC protocol and I receive the 502 error response.
This is with GKE master 1.12.7-gke.10. I also tried newer 1.13 and older 1.11 masters. The cluster has HTTP load balancing enabled and VPC-native enabled. There are firewall rules to allow access from LB to my pods (I even tried to allow all ports from all IP addresses). Delaying the probes does not help either.
The funny thing is that I deployed nearly the same setup, with only the server's Docker image being different, a couple of months ago, and it is running without any issues. I can even deploy new Docker images of the server and everything is great. I cannot find any difference between the two.
There is one other issue: the Ingress is stuck in the "Creating Ingress" state for days. It never finishes and never sees the LB. The Ingress's LB never gets a front-end, and I always have to manually add an HTTP/2 front-end with a static IP and a Google-managed TLS certificate. This should only happen for clusters that were created without "HTTP load balancing", but it happens in my case every time, for all my "HTTP load balancing enabled" clusters. The working deployment has been in this state for months already.
Any ideas why the gRPC backend's health check could be failing even though I see logs that the readiness and liveness endpoints are called by kube-probe?
EDIT:
describe svc grpc-gke-nodeport
Name: grpc-gke-nodeport
Namespace: default
Labels: app=grpc-gke
Annotations: cloud.google.com/app-protocols: {"grpc":"HTTP2","http":"HTTP2"}
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"cloud.google.com/app-protocols":"{\"grpc\":\"HTTP2\",\"http\":\"HTTP2\"}",...
service.alpha.kubernetes.io/app-protocols: {"grpc":"HTTP2", "http": "HTTP2"}
Selector: app=grpc-gke
Type: NodePort
IP: 10.4.8.188
Port: grpc 50052/TCP
TargetPort: 50052/TCP
NodePort: grpc 32148/TCP
Endpoints: 10.0.0.25:50052
Port: http 443/TCP
TargetPort: 8443/TCP
NodePort: http 30863/TCP
Endpoints: 10.0.0.25:8443
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
and the health check for the gRPC backend is an HTTP/2 GET using path / on port 32148. Its description is "Default kubernetes L7 Loadbalancing health check.", whereas the description of the HTTP back-end's health check is "Kubernetes L7 health check generated with readiness probe settings.". Thus the health check for the gRPC back-end is not created from the readiness probe.
Editing the health check to point to port 30863 and changing the path to the readiness probe's path fixes the issue.
GKE Ingress just recently started offering full gRPC support in beta (whereas HTTP/2-to-HTTP/1.1 conversion was used in the past). To use gRPC, though, you need to add the annotation cloud.google.com/app-protocols: '{"http2-service":"HTTP2"}' to the backing Service.
Refer to this how-to doc for more details.
Editing the health check to point to the readiness probe's path and changing the port to the one of the HTTP back-end fixed this issue (look for the port in the HTTP back-end's health check; it is the NodePort). It now runs without any issues.
Using the same health check for the gRPC back-end as for the HTTP back-end did not work; it was reset back to its own health check. Even deleting the gRPC back-end's health check did not help, as it was recreated. Only editing it to use a different port and path helped.

What is the "port" used for for a Kubernetes Service

Considering a very simple service.yaml file:
kind: Service
apiVersion: v1
metadata:
  name: gateway-service
spec:
  type: NodePort
  selector:
    app: gateway-app
  ports:
  - name: gateway-service
    protocol: TCP
    port: 80
    targetPort: 8080
    nodePort: 30080
We know that the service will route all requests to pods with the label app=gateway-app at port 8080 (a.k.a. the targetPort). There is another port field in the service definition, which is 80 in this case. What is this port used for? When should we use it?
From the documentation, there is also this line:
By default the targetPort will be set to the same value as the port field.
Reference: https://kubernetes.io/docs/concepts/services-networking/service/
In other words, when should we keep targetPort and port the same and when not?
In a NodePort service you can have three types of ports defined:
TargetPort:
As you mentioned in your question, this is the corresponding port on your pod, essentially the containerPorts you have defined in your replica manifest.
Port (servicePort):
This defines the port that other local resources can refer to. Quoting from the Kubernetes docs:
this Service will be visible [locally] as .spec.clusterIP:spec.ports[*].port
Meaning, this is not accessible publicly; however, you can refer to your service (from other resources within the cluster) on this port. An example is when you are creating an Ingress for this service: in your Ingress you will be required to present this port in the servicePort field:
...
backend:
  serviceName: test
  servicePort: 80
NodePort:
This is the port on your node which publicly exposes your service. Again quoting from the docs:
this Service will be visible [publicly] as [NodeIP]:spec.ports[*].nodePort
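To tie the three together, here is a small illustrative NodePort Service (the name, selector, and numbers are made up for the example):

# Illustrative only: all three port fields on one NodePort Service.
apiVersion: v1
kind: Service
metadata:
  name: example-service       # placeholder name
spec:
  type: NodePort
  selector:
    app: example-app          # placeholder selector
  ports:
  - name: http
    protocol: TCP
    port: 80          # servicePort: reachable inside the cluster as example-service:80
    targetPort: 8080  # the containerPort the pod actually listens on
    nodePort: 30080   # reachable from outside as <NodeIP>:30080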
Port is what clients will connect to. TargetPort is the port the container is listening on. One use case where they are not equal is when you run the container under a non-root user and cannot normally bind to ports below 1024. In this case you can listen on 8080, but clients will still connect to 80, which might be simpler for them.
Service: This directs the traffic to a pod.
TargetPort: This is the actual port on which your application is running in the container.
Port: Sometimes your application inside the container serves different services on different ports. For example, the actual application can run on 8080 and the health checks for this application can run on port 8089 of the container.
So if you hit the service without a port, it doesn't know which port of the container it should redirect the request to. The service needs a mapping so that it can hit the specific port of the container.
kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
  - name: http
    nodePort: 32111
    port: 8089
    protocol: TCP
    targetPort: 8080
  - name: metrics
    nodePort: 32353
    port: 5555
    protocol: TCP
    targetPort: 5555
  - name: health
    nodePort: 31002
    port: 8443
    protocol: TCP
    targetPort: 8085
If you hit my-service:8089, the traffic is routed to port 8080 of the container (the targetPort). Similarly, if you hit my-service:8443, it is redirected to port 8085 of the container (the targetPort).
But my-service:8089 is internal to the Kubernetes cluster and can be used when one application wants to communicate with another application. To hit the service from outside the cluster, someone needs to expose a port on the host machine on which Kubernetes is running, so that the traffic is redirected to a port of the container. For that you can use a nodePort.
From the above example, you can hit the service from outside the cluster (with Postman or any REST client) via host_ip:nodePort.
Say your host machine IP is 10.10.20.20; you can hit the http, metrics, and health services via 10.10.20.20:32111, 10.10.20.20:32353, and 10.10.20.20:31002.
Let's take an example and try to understand with the help of a diagram.
Consider a cluster with 2 nodes and one service. Each node has 2 pods, and each pod has 2 containers: say an app container and a web container.
NodePort: 3001 (cluster-level exposed port for each node)
Port: 80 (service port)
targetPort: 8080 (app container port; the same port should be exposed in the Docker image)
targetPort: 80 (web container port; the same port should be exposed in the Docker image)
The diagram in the referenced article should help us understand this better.
Reference: https://theithollow.com/2019/02/05/kubernetes-service-publishing/

Should I create an API for readinessProbe to work in Kubernetes?

I am trying to create a RollingUpdate and am using the code below to see whether the pod came up or not. Should I create an explicit API path like /healthz in my application so that Kubernetes pings it and gets a 200 status back, or is it an internal URL handled by Kubernetes?
spec:
  containers:
  - name: liveness
    readinessProbe:
      httpGet:
        path: /healthz
        port: 80
As @Thomas answered regarding the HTTP probe, if the application does not provide an endpoint to validate a success response, you can use a TCP probe:
kubelet tries to establish a TCP connection on the container's port. If it can establish a connection, the container is considered healthy; if it can't it is considered unhealthy
For example, in your case it would be like this:
ports:
- containerPort: 80
readinessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 15
  periodSeconds: 20
You can get further information here: configure-liveness-readiness-probes/
Kubernetes will make a request to the container on port 80 and path /healthz and expects a status code in the range of 2xx-3xx to be considered successful.
If your application does not provide a mapping for the path and returns a 404, Kubernetes assumes that the health check has failed.
Depending on your application, you may need to provide the endpoint manually if your framework does not do it for you. (You can check by running curl or wget against the path from another pod and verifying the result.)

Custom Health check with GCP

Hi, I am trying to use a custom health check with a GCP load balancer.
I have added a readinessProbe and a livenessProbe like this:
readinessProbe:
  httpGet:
    path: /health
    port: dash
  initialDelaySeconds: 5
  periodSeconds: 1
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 10
livenessProbe:
  httpGet:
    path: /health
    port: dash
  initialDelaySeconds: 5
  periodSeconds: 1
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 10
But when I create my Ingress, the load balancer does not use my custom health check.
[screenshot: LB health check path]
I finally got an answer. What I was trying to do was impossible. My GCE Ingress used a backend on port 80, but in my readinessProbe I told it to check port 8080 and the /health path. This is impossible!
The port of the service declared in the Ingress backend must be the same as the one declared in the readinessProbe. Only the path can be different. If we do not respect this pattern, it is / that gets associated with the GCP health check path.
From a network point of view this is logical: the GCP health check lives outside the Kubernetes cluster. If we tell it to route traffic to port 80 while the readinessProbe checks a different port, a successful probe on that other port says nothing about whether port 80 (the port the traffic must actually be routed to) will also respond.
In summary, the port of the backend declared in Ingress must have a readinessProbe on the same port. The only thing we can customize is the path.
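To make the constraint concrete, here is a hedged sketch (all names are placeholders) where the Ingress backend, the Service targetPort, and the readinessProbe all line up on the same container port, with only the path customized:

# Sketch only: the readinessProbe checks the same container port (8080) that the
# Ingress backend's Service routes to; only the path (/health) is custom.
apiVersion: v1
kind: Service
metadata:
  name: dash-service            # placeholder name
spec:
  type: NodePort
  selector:
    app: dash                   # placeholder selector
  ports:
  - name: http
    port: 80
    targetPort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dash-ingress            # placeholder name
spec:
  backend:
    serviceName: dash-service
    servicePort: 80

The container would then declare containerPort: 8080 and a readinessProbe with httpGet on port 8080 and path /health.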
I think you are confused between resources in GCP.
The code you posted is not related to a load balancer resource at all; it is a Kubernetes health check for pod state. If you want to know whether the probes are working, check your pod's state; if it is not running, describe your pod and look at the logs, which should indicate an issue with the probes.
I'm going to guess that you have an Ingress resource somewhere in your Kubernetes config which creates the LB and all the resources around it, such as the health check (still guessing that the image you posted relates to that).
If you are using GKE, you should leave the Google-automated resource configuration generated from the k8s config you deployed as it is, because you may break things that Google is already maintaining for you.