I am quite confused about the readiness probe. Suppose I use httpGet with /health as the probing endpoint. Once the readiness check returns 500, the server will stop serving traffic. Then how can the /health endpoint work? In other words, once a readiness check fails, how can it ever succeed again, since the server can no longer answer future /health checks?
I guess one valid explanation is that the path is invoked locally? (i.e. not through https:${ip and port}/health)
You have a typo. You said:
Once the readiness check returns 500, the server will stop serving traffic.
However, it should be:
Once the readiness check returns 500, the k8s service will stop serving traffic.
A k8s Service behaves like a load balancer in front of multiple pods.
If a pod is ready, an endpoint is created for it and it receives traffic.
If a pod is not ready, its endpoint is removed and it no longer receives traffic.
Note that the kubelet runs the probe against the container directly (on the pod's IP), not through the Service, so a not-ready pod still keeps getting probed and becomes ready again as soon as /health returns 200.
While the readiness probe decides whether to forward traffic or not, the liveness probe decides whether to restart the container or not.
If you want to get rid of an unhealthy pod, you also have to specify a liveness probe.
So let's summarize:
To get a fully HA deployment you need 3 things together:
Pods are managed by a Deployment, which maintains the desired number of replicas.
The liveness probe helps remove/restart an unhealthy container; after repeated failures the container keeps being restarted with back-off, while the Deployment keeps the replica count up.
The readiness probe makes sure traffic is forwarded only to ready pods, either at the beginning of a pod's life or at the end (graceful shutdown). See the sketch below.
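A minimal sketch of those three pieces together (the name, image, port and probe path are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # placeholder name
spec:
  replicas: 3                 # the Deployment maintains 3 replicas
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:1.0     # placeholder image
        ports:
        - containerPort: 8080
        livenessProbe:        # failing -> the kubelet restarts the container
          httpGet:
            path: /health
            port: 8080
        readinessProbe:       # failing -> the Service stops sending traffic
          httpGet:
            path: /health
            port: 8080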
Related
I understand what a readinessProbe does, but I don't see why it should have a periodSeconds. Once it's determined that the pod is ready, it should stop checking. Wouldn't checking periodically then be up to the livenessProbe? Or am I missing something?
readinessProbe and livenessProbe serve different purposes.
ReadinessProbe checks if a service is ready to serve requests. If the readinessProbe fails, the container will be taken out of service for as long as the probe fails. If the readinessProbe reports up again, the container will be taken back into service again and receive requests.
In contrast, if the livenessProbe fails, it will be considered as not recoverable and the container will be terminated and restarted.
For both, periodSeconds makes sense. Even for the livenessProbe, since failure is only declared after several consecutive failed checks (failureThreshold).
Readiness probes determine whether or not a container is ready to serve requests. If the readiness probe returns a failed state, then Kubernetes removes the IP address for the container from the endpoints of all Services.
We use readiness probes to instruct Kubernetes that a running container should not receive any traffic. This is useful when waiting for an application to perform time-consuming initial tasks, such as establishing network connections, loading files, and warming caches.
The readiness probe is configured in the spec.containers[].readinessProbe attribute of the pod configuration.
The periodSeconds field specifies how often (in seconds) the kubelet should perform the readiness probe. It defaults to 10 seconds; the minimum value is 1.
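For illustration, a minimal readiness probe overriding that default might look like this (path and port are placeholders):

readinessProbe:
  httpGet:
    path: /health
    port: 8080        # placeholder port
  periodSeconds: 5    # the kubelet probes every 5 seconds instead of the default 10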
I've got an application with 10 pods and traffic is load balanced between all of them. There was an issue that caused transactions to queue up, and a few pods could not recover properly, or took a long time to process the queue once the issue was fixed. The new traffic was still too much for some of the pods.
I'm wondering if I can block new traffic to particular pod(s) in a ReplicaSet, let them process the queue, and once the queue is processed let the new traffic come in again?
You can use a probe to handle this scenario.
A readiness probe is one way to do it.
What a probe does is continuously check, at the configured time interval, whether the process inside the container or pod is up.
Example
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
You can also create an endpoint in the application that K8s will check automatically: if K8s gets a 200 back, it marks the pod as Ready to handle traffic; otherwise it marks it as Unready and stops sending it traffic.
Note:
Readiness and liveness probes can be used in parallel for the same container. Using both can ensure that traffic does not reach a container that is not ready for it, and that containers are restarted when they fail.
The readiness probe won't restart your pod if it is failing, while the liveness probe will restart your pod or container if it keeps failing (e.g. returning an error status).
In your scenario it's better to use the readiness probe, so the process keeps running and never gets restarted. Once the application is ready to handle traffic again, K8s will get 200 responses on the endpoint and resume sending requests.
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
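As a sketch for the queue-draining scenario, assuming you add a hypothetical /ready endpoint that returns 200 only once the pod's backlog is processed, the httpGet variant would be:

readinessProbe:
  httpGet:
    path: /ready          # hypothetical endpoint: 200 when the queue is drained, 503 while catching up
    port: 8080            # placeholder port
  periodSeconds: 10       # re-check every 10 seconds
  failureThreshold: 1     # a single failed check takes the pod out of the Service rotation

While the endpoint returns a non-200 status the pod keeps running and processing its queue, but it receives no new traffic from the Service; once it returns 200 again, the pod is re-added.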
I am using port 8888 for the liveness & readiness probes and 8887 for normal HTTP requests. The readiness probe is failing and the pods are in 0/1, not-ready state, but I still see normal POST requests being served by the pod. Is this expected? Should health probes and normal requests be received on the same port?
Liveness and readiness probes have different purposes. In short, the liveness probe controls whether Kubernetes will restart the pod, but the readiness probe controls whether a pod is included in the endpoints of a Service. Unless a pod has indicated it's ready through the readiness probe, it should not receive traffic through a Service. That doesn't mean it can't be sent requests; it just means it won't be sent traffic through the Service. So in your case the question is: where are those POST requests coming from?
@pst and @Harsh are right, but I would like to expand on it a bit.
As the official docs say:
If you'd like to start sending traffic to a Pod only when a probe succeeds, specify a readiness probe. In this case, the readiness probe might be the same as the liveness probe, but the existence of the readiness probe in the spec means that the Pod will start without receiving any traffic and only start receiving traffic after the probe starts succeeding.
and:
The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
Answering your question:
So if the readiness probe fails, should I still expect traffic on 8887?
No, the pod should not receive traffic through the Service while the readiness probe is failing.
It can also depend on your app. By using a readiness probe, Kubernetes waits until the app is fully started before it allows the service to send traffic to the new copy.
Also, it is very important to make sure your probes are configured properly:
Probes have a number of fields that you can use to more precisely control the behavior of liveness and readiness checks:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of a liveness probe means restarting the container. In case of a readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
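Putting those fields together on the setup from the question (probe port 8888; the values themselves are only illustrative):

readinessProbe:
  httpGet:
    path: /health            # placeholder path
    port: 8888
  initialDelaySeconds: 10    # wait 10s after container start before the first check
  periodSeconds: 5           # check every 5s
  timeoutSeconds: 2          # each check must answer within 2s
  successThreshold: 1        # one success marks the pod Ready again
  failureThreshold: 3        # three consecutive failures mark the pod Unready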
If you wish to expand your knowledge regarding the liveness, readiness and startup probes, please refer to the official docs. You will find some examples there that can be compared with your setup in order to see if you understand and configured it right.
Consider a pod which has a health check set up via an HTTP endpoint /health on port 80, and which takes almost 60 seconds to be actually ready and serve traffic.
readinessProbe:
  httpGet:
    path: /health
    port: 80
  initialDelaySeconds: 60
livenessProbe:
  httpGet:
    path: /health
    port: 80
Questions:
Is my above config correct for the given requirement?
Does the liveness probe start working only after the pod becomes ready? In other words, I assume the readiness probe's job is complete once the pod is ready, and after that the livenessProbe takes care of the health check. In that case, I can ignore initialDelaySeconds for the livenessProbe. If they are independent, what is the point of doing a livenessProbe check when the pod itself is not ready?
Check this documentation. What do they mean by:
If you want your Container to be able to take itself down for maintenance, you can specify a readiness probe that checks an endpoint specific to readiness that is different from the liveness probe.
I was assuming the running pod would take itself down only if the livenessProbe fails, not the readinessProbe. The doc says otherwise.
Clarify!
I'll start by answering the second question:
Does the liveness probe start working only after the pod becomes ready? In other words, I assume the readiness probe's job is complete once the pod is ready. After that the livenessProbe takes care of the health check.
A common initial understanding is that the liveness probe only starts checking after the readiness probe has succeeded, but it turns out not to work like that: the probes run independently. An issue was opened about this challenge (you can look it up here), and the problem was eventually solved by adding startup probes.
To sum up:
livenessProbe
livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.
readinessProbe
readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.
startupProbe
startupProbe: Indicates whether the application within the Container is started. All other probes are disabled if a startup probe is provided, until it succeeds. If the startup probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a startup probe, the default state is Success.
Look it up here.
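As a sketch of that solution for the slow-starting pod from the question (port 80 and /health come from the question; the thresholds are assumptions): the startup probe gives the app up to 30 × 10 = 300 seconds to come up, and the liveness probe only begins once it has succeeded, so its initialDelaySeconds can indeed be dropped:

startupProbe:
  httpGet:
    path: /health
    port: 80
  periodSeconds: 10
  failureThreshold: 30   # up to 300s allowed for startup before the container is killed
livenessProbe:
  httpGet:
    path: /health
    port: 80
  periodSeconds: 10      # disabled until the startup probe succeeds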
The liveness probes check whether the container is started and alive. If that isn't the case, Kubernetes will eventually restart the container.
The readiness probes in turn also check dependencies like database connections or other services your container depends on to fulfill its work. As a developer you have to invest more time into their implementation than into the liveness probes: you have to expose an endpoint that also checks the mentioned dependencies when queried.
Your current configuration uses a health endpoint of the kind usually used by liveness probes. It probably doesn't check whether your service is really ready to take traffic.
Kubernetes relies on the readiness probes: during a rolling update, it keeps the old container up and running until the new service declares that it is ready to take traffic. Therefore the readiness probes have to be implemented correctly; see the sketch below.
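For example, a rollout that leans entirely on readiness might use a Deployment strategy like this (values are illustrative):

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1           # bring up one new pod at a time
    maxUnavailable: 0     # never remove an old pod until a new one reports Ready

With maxUnavailable: 0, Kubernetes will not terminate an old pod until the replacement's readiness probe passes, so a readiness endpoint that lies about being ready breaks the zero-downtime guarantee.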
I will show the difference between them in a couple of simple points:
livenessProbe
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
It is used to indicate whether the container has started and is alive, i.e. proof of being available.
In the given example, if the request fails, Kubernetes will restart the container.
If not provided, the default state is Success.
readinessProbe
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
It is used to indicate whether the container is ready to serve traffic, i.e. proof of being ready to use.
It checks dependencies like database connections or other services your container depends on to fulfill its work.
In the given example, until the request returns Success, the container won't serve any traffic (the Pod's IP address is removed from the endpoints of all Services that match the Pod).
Kubernetes relies on the readiness probes during rolling updates: it keeps the old container up and running until the new service declares that it is ready to take traffic.
If not provided, the default state is Success.
Summary
Liveness Probes: Used to check if the container is available and alive.
Readiness Probes: Used to check if the application is ready to be used and serve the traffic.
Both the readiness probe and the liveness probe seem to have the same behavior: they do the same type of checks. But the action they take in case of failure is different.
A failing readiness probe shuts off traffic from the Service, so the Service only sends requests to healthy pods, whereas a failing liveness probe restarts the pod. The liveness probe does not do anything to the Service; the Service continues to send requests to the pod as usual while it is in 'available' status.
It is recommended to use both probes!!
Check here for detailed explanation with code samples.
The Kubernetes platform has capabilities for validating container applications, called health checks. Liveness is proof that the container is available and alive, and readiness is proof that the pod is ready to use.
The features are designed to prevent service downtime and inconsistent images by enabling restarts when needed. Kubernetes uses liveness to know when to restart the container, so it can solve most problems. Kubernetes uses readiness to know when the container is available to accept requests. The pod is considered ready when all of its containers are ready. Therefore, when the pod takes too long to initialize (mounting a cache, migrating a DB schema, etc.) it is recommended to increase initialDelaySeconds.
I'd post it as a comment but it's too long, So let's make it a full answer.
Is my above config correct for the given requirement?
IMHO no: you are missing initialDelaySeconds for the liveness probe, and the liveness and readiness probes probably should not call the same endpoint. I'd use the suggestions from @fgul.
Does the liveness probe start working only after the pod becomes ready? In other words, I assume the readiness probe's job is complete once the pod is ready. After that the livenessProbe takes care of the health check. In that case, I can ignore initialDelaySeconds for the livenessProbe. If they are independent, what is the point of doing a livenessProbe check when the pod itself is not ready?
I think you were thinking about the startupProbe; again, @fgul described what does what, so there is no point in me repeating it.
I was assuming the running pod would take itself down only if the livenessProbe fails, not the readinessProbe. The doc says otherwise.
The pod can be restarted only based on the livenessProbe, not the readinessProbe.
I'd think twice before binding a readiness probe to external services (as @randy advised), especially in high-load services:
Let's assume you have defined a deployment with lots of pods that are connecting to a database and processing lots of requests.
Now the database goes down.
The readiness probe also checks the db connection, so it marks all of the pods as "out of service".
Now the db comes back up.
The pods' readiness probes will start to pass, but not instantly and not on all pods at once; the pods will be marked as "Ready" one after another.
But it might be too slow: the second the first pod is marked as ready, ALL of the traffic will be sent to that one pod alone. It might end in a situation where the "waking up" pods are killed by the traffic one after another.
For that kind of situation I'd say the readiness probe should check only pod-internal stuff and not care about the external services. The Kubernetes endpoint will return an error, and either the clients can tolerate a failing service (it's called "designed for failure") or the load balancer/ingress can cover it. One way to express that split is sketched below.
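A sketch of that split in the manifest (both paths are hypothetical): readiness hits an endpoint that checks only in-pod state, liveness hits one that only confirms the process is up:

readinessProbe:
  httpGet:
    path: /ready          # hypothetical: checks local queue, worker threads - NOT the database
    port: 8080            # placeholder port
livenessProbe:
  httpGet:
    path: /live           # hypothetical: process is up and not deadlocked
    port: 8080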
Liveness probes are a relatively specialized tool, and you probably don't want one at all. However they run totally independently AFAIK.
While doing health checks for Kubernetes pods, why is a liveness probe needed even though we already maintain a readiness probe?
The readiness probe already keeps checking whether the application within the pod is ready to serve requests, which implies the pod is live. So why is a liveness probe still needed?
The probes have different meanings with different results:
failing liveness probes -> restart container
failing readiness probes -> do not send traffic to that pod
You cannot determine liveness from readiness and vice versa. Just because a pod cannot accept traffic right now doesn't mean a restart is needed; it can mean that it just needs time to finish some work.
If you are deploying e.g. a PHP app, those two will probably be the same, but k8s is a generic system that supports many types of workloads.
From: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
The kubelet uses liveness probes to know when to restart a Container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a Container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
Sidenote: actually readiness should be a subset of liveness, which means readiness implies liveness (and failing liveness implies failing readiness). But that doesn't change the explanation above, because if you only have readiness, you can only infer when a restart is NOT needed, which is the same as not having any restart probe at all. Also, because the probes are defined separately, k8s has no guarantee that one is a subset of the other.
The readiness probe checks whether your application is ready to serve requests, and the pod will not be added to the pool of ready pods until it satisfies the readiness check. The main difference is that it doesn't restart the pod when the pod is not ready.
The liveness probe checks a specific condition, and if the pod does not satisfy it, the pod is restarted in the hope that it will recover and become ready.
Kubernetes allows you to define a few things to make the app available:
1: Liveness probes for your container.
2: Readiness probes for your pod.
1: Liveness probes help keep your apps healthy by ensuring unhealthy containers are restarted automatically.
2: Readiness probes are invoked periodically and determine whether the specific pod should receive client requests or not.
Operation of Readiness Probes
When a container is started, Kubernetes can be configured to wait for a configurable amount of time to pass before performing the first readiness check. After that, it invokes the probe periodically and acts based on the result of the readiness probe.
If a pod reports that it's not ready, it's removed from the service. Think of it like this: the effect is the same as when the pod doesn't match the Service's label selector at all.
If the pod then becomes ready again, it’s re-added.
Important difference between liveness and readiness probes
1: Unlike liveness probes (i.e. as mentioned above, "if a pod reports that it's not ready, it's removed from the service"), if a container fails the readiness check, it won't be killed or restarted.
2: Liveness probes keep pods healthy by killing off unhealthy containers and replacing them with new, healthy ones, whereas readiness probes make sure that only pods that are ready to serve requests receive them.
I hope this helps you better.