Readiness probe failure, Kubernetes expected behavior - kubernetes

According to the Kubernetes documentation,
If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod.
So I understand that Kubernetes won't redirect requests to the pod when the readiness probe fails.
In addition, does Kubernetes kill the pod? Or does it keep calling the readiness probe until the response is successful?

What are readiness probes for?
Containers can use readiness probes to know whether the container being probed is ready to start receiving network traffic. If your container enters a state where it is still alive but cannot handle incoming network traffic (a common scenario during startup), you want the readiness probe to fail. That way, Kubernetes will not send network traffic to a container that isn't ready for it. If Kubernetes did prematurely send network traffic to the container, it could cause the load balancer (or router) to return a 502 error to the client and terminate the request; either that or the client would get a "connection refused" error message.
The name of the readiness probe conveys a semantic meaning. In effect, this probe answers the true-or-false question: "Is this container ready to receive network traffic?"
A readiness probe failing will not kill or restart the container.
What are liveness probes for?
A liveness probe sends a signal to Kubernetes that the container is either alive (passing) or dead (failing). If the container is alive, then Kubernetes does nothing because the current state is good. If the container is dead, then Kubernetes attempts to heal the application by restarting it.
The name liveness probe also expresses a semantic meaning. In effect, the probe answers the true-or-false question: "Is this container alive?"
A liveness probe failing will kill/restart a failed container.
What are startup probes for?
Kubernetes has a newer probe called startup probes. This probe is useful for applications that are slow to start. It is a better alternative to increasing initialDelaySeconds on readiness or liveness probes. A startup probe allows an application to become ready, joined with readiness and liveness probes, it can increase the applications' availability.
Once the startup probe has succeeded once, the liveness probe takes over to provide a fast response to container deadlocks. If the startup probe never succeeds, the container is killed after failureThreshold * periodSeconds (the total startup timeout) and will be killed and restarted, subject to the pod's restartPolicy.

Related

In Kubernetes, why does readinessProbe have the option periodSeconds?

I understand what a readinessProbe does, but I don't see why it should have a periodSeconds. Once it's determined that the pod is ready, it should stop checking. Wouldn't checking periodically then be up to the livenessProbe? Or am I missing something?
ReadinesProbe and livenessProbe serve for different purposes.
ReadinessProbe checks if a service is ready to serve requests. If the readinessProbe fails, the container will be taken out of service for as long as the probe fails. If the readinessProbe reports up again, the container will be taken back into service again and receive requests.
In contrast, if the livenessProbe fails, it will be considered as not recoverable and the container will be terminated and restarted.
For both, periodSeconds makes sense. Even for the livenessProbe, when failure is considered only after X consecutive failed checks.
Readiness probes determine whether or not a container is ready to serve requests. If the readiness probe returns a failed state, then Kubernetes removes the IP address for the container from the endpoints of all Services.
We use readiness probes to instruct Kubernetes that a running container should not receive any traffic. This is useful when waiting for an application to perform time-consuming initial tasks, such as establishing network connections, loading files, and warming caches.
The readiness probe is configured in the spec.containers.readinessprobe attribute of the pod configuration.
This periodSeconds field specifies that the kubelet should perform a readiness probe for every “x” seconds that is mentioned in the yaml. This Specifies the frequency of the checks to the readiness probe to check. Default to 10 seconds. Minimum value is 1.

I am using 8888 for kubernetes health probes and 8887 for normal HTTP requests. So if readiness probe fails, should i still expect traffic on 8887?

I am using 8888 for liveness & readiness probes, 8887 for normal HTTP requests, readiness probe is failing and pods are in 0/1, not ready state. ButI still see normal POST requests being served by the pod. Is this expected. should health probes and normal requests be received on the same port?
Liveness and readyness probes have different purposes. In short the liveness probe controls whether Kubernetes will restart the pod. But the readyness probe controls whether a pod is included in the endpoints of a service. Unless a pod has indicated it's ready through the readyness probe, it should not receive traffic through a service. That doesn't mean it can't be sent requests, it just means it won't be sent traffic through the service. So in your case the question is, where are those POST requests coming from.
#pst and #Harsh are right but I would like to expand on it a bit.
As the official docs say:
If you'd like to start sending traffic to a Pod only when a probe
succeeds, specify a readiness probe. In this case, the readiness probe
might be the same as the liveness probe, but the existence of the
readiness probe in the spec means that the Pod will start without
receiving any traffic and only start receiving traffic after the
probe starts succeeding.
and:
The kubelet uses readiness probes to know when a container is ready to
start accepting traffic. A Pod is considered ready when all of its
containers are ready. One use of this signal is to control which Pods
are used as backends for Services. When a Pod is not ready, it is
removed from Service load balancers.
Answering your question:
So if readiness probe fails, should i still expect traffic on 8887?
No, the pod should not start receiving traffic if the readiness probe fails.
It can also depend on your app. By using a readiness probe, Kubernetes waits until the app is fully started before it allows the service to send traffic to the new copy.
Also, it is very important to make sure your probes are configured properly:
Probes have a number of fields that you can use to more precisely
control the behavior of liveness and readiness checks:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to
0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1
for liveness. Minimum value is 1.
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of liveness
probe means restarting the container. In case of readiness probe the
Pod will be marked Unready. Defaults to 3. Minimum value is 1.
If you wish to expand your knowledge regarding the liveness, readiness and startup probes please refer to the official docs. You will wind some examples there that can be compared with your setup in order to see if you understand and configured it right.

Kubernetes: Readiness Check with httpGet

I am quite confused about readiness probe. Suppose I use httpGet with /health as the probing endpoint. Once the readiness check returns 500, the server will stop serving traffic. Then how can the /health endpoint work? In other words, once a readiness check fails, how can it ever work again since it can no longer answer to future /health checks?
I guess one valid explanation is that the path is invoked locally? (i.e. not through the https:${ip and port}/health)
You have typo.. you said :
Once the readiness check returns 500, the server will stop serving traffic.
However, it should be :
Once the readiness check returns 500, the k8s service will stop serving traffic.
k8s service behaves like a load balancer for multi-pods.
If pod is ready, an endpoint will be created for the ready pod, and the traffic will be received.
If pod is not ready, its endpoint will be removed and it will not more receive traffic.
While Readiness Probe decides to forward traffic or not, Liveness Probe decides to restart the Pod or not.
If you want to get rid off unhealthy Pod, you have to specify also Liveness Probe.
So let's summarize:
To get full HA deployment you need 3 things together:
Pod are managed by Deployment which will maintain a number of replicas.
Liveness Probe will help to remove/restart the unlheathy pod.. After somtime ( 6 restarts), the Pod will be unhealthy and the Deployment will take care to bring new one.
Readiness Probe will help forward traffic to only ready pods : Either at beginning of run, or at the end of run ( graceful shutdown).

why both liveness is needed with readiness

While doing health check for kubernetes pods, why liveness probe is needed even though we already maintain readiness probe ?
Readiness probe already keeps on checking if the application within pod is ready to serve requests or not, which means that the pod is live. But still, why liveness probe is needed ?
The probes have different meaning with different results:
failing liveness probes -> restart container
failing readiness probes -> do not send traffic to that pod
You can not determine liveness from readiness and vice-versa. Just because pod cannot accept traffic right know, doesn't mean restart is needed, it can mean that it just needs time to finish some work.
If you are deploying e.g. php app, those two will probably be the same, but k8s is a generic system, that supports many types of workloads.
From: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
The kubelet uses liveness probes to know when to restart a Container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a Container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
Sidenote: Actually readiness should be a subset of liveness, that means readiness implies liveness (and failing liveness implies failing readiness). But that doesn't change the explanation above, because if you have only readiness, you can only imply when restart is NOT needed, which is the same as not having any probe for restarting at all. Also because of probes are defined separately there is no guarantee for k8s, that one is subset of the other
The readiness probe checks that your application is ready to serve the requests or not and it will not add that particular pod to ready pod pull until it satisfy readiness checks. The main difference is that it doesnt restart the pod if a pod is not ready.
Liveness probe checks if the pod is not ready(not satisfying specific condition) it will restart the pod in a hope that this pod will recover and becomes ready.
Kubernetes allows you to define few things to make the app available.
1: Liveness probes for your Container.
2: Readiness probes for your Pod.
1- Liveness probes: They help keep your apps healthy by ensuring unhealthy containers are restarted automatically.
2: Readiness probes They help by invoking periodically and determines whether the specific Pod should receive client requests or not.
Operation of Readiness Probes
When a container is started, Kubernetes can be configured to wait for a configurable amount of time to pass before performing the first readiness check. After that, it invokes the probe periodically and acts based on the result of the readiness probe.
If a pod reports that it’s not ready, it’s removed from the service. Just think like- The effect is the same as when the pod doesn’t match the service’s label selector at all.
If the pod then becomes ready again, it’s re-added.
Important difference between liveness and readiness probes
1: Unlike liveness probes(i:e as mentioned above "If a pod reports that it’s not ready, it’s removed from the service"), if a container fails the readiness check, it won’t be killed or restarted.
2: Liveness probes keep pods healthy by killing off unhealthy containers and replacing them with new, healthy ones, whereas readiness probes make sure that only pods that are ready to serve requests receive them
I hope this helps you better.

Pods readiness check fails after running, why are new pods not started?

I'm trying to understand how to best use the Kubernetes readiness and liveness probes. What I have observed when testing is that if a readiness probe fails then the pod is marked as unready and removed from the load balancer. However I then expected a new pod to be started and moved into the load balancer until the original pod became ready again (or it's liveness probe failed and it is killed) and then one of them could be terminated.
We might want to temporarily remove a pod from service if the readiness probe fails but this seems to run the risk that all pods could go un-ready which would result in no pods in the load balancer and no new pods being started.
I assume what I am observing with no new pods being started to cover those that are un-ready is expected behavior? In which case what exactly is the use case for the readiness probe and how is the risk of all pods becoming unready mitigated?
No matter how many pods , if the readiness probe fails , it will be restarted regardless. And traffic will not be sent to it unless it passes the readiness probe. Restarting the same container makes more sense rather than creating a new one.