In my K8s workloads, I use readiness and liveness probes for pod health checks.
I'm wondering whether I should set the interval (periodSeconds) as low as 1 second, since that will consume more resources, right?
Are there best practices for pod health checks?
Firstly, it is important to understand the difference between Liveness and Readiness. The tl;dr is: Liveness is about whether K8s should kill and restart the container, Readiness is about whether the container is able to accept requests. It is likely that you want different parameters for both.
Whether K8s takes any action based on the outcome of the probe depends on the failureThreshold. This is the number of times in a row the probe has to fail before K8s does something. If you combine this with periodSeconds you can tune the sensitivity of your probes.
In general you want to balance:
the time it takes K8s to take action with how quickly your service can be expected to recover based on the probe
the "cost" of the probes. For example, if your Readiness probe connects to a database and periodSeconds is 1, then you are adding 1 query per second (QPS) of load to your database per replica (with 100 replicas, you would be generating 100 QPS just through probes!)
the reliability of your probe, also known as "flakiness". What is the false negative rate, i.e. what proportion of the time does the probe report failure while the service is actually running within expected performance rates?
Here is one way of thinking about it:
Work out how long your service can be in the failed state before K8s should take action. This should be based on how long it would take to recover (e.g. restart in the case of Liveness)
If a probe is "expensive", have a longer periodSeconds and smaller failureThreshold
If a probe is "flaky" (i.e. occasionally reports failed and then reports working very quickly afterwards), have a shorter periodSeconds and larger failureThreshold.
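As a rough sketch of that trade-off, the two hypothetical containers below both allow about 30 seconds of consecutive failures before K8s acts, but they get there differently. The container names, images, paths and ports are placeholders, not taken from any real workload.

```yaml
containers:
  - name: expensive-check          # hypothetical: the probe handler is costly to run
    image: example/app:1.0         # placeholder image
    readinessProbe:
      httpGet:
        path: /ready               # assumed endpoint
        port: 8080                 # assumed port
      periodSeconds: 15            # probe rarely to keep the probe load low
      failureThreshold: 2          # 15 s x 2 = about 30 s before marked unready
  - name: flaky-check              # hypothetical: the probe occasionally fails spuriously
    image: example/worker:1.0      # placeholder image
    readinessProbe:
      httpGet:
        path: /ready               # assumed endpoint
        port: 8081                 # assumed port
      periodSeconds: 3             # probe often so a single blip is quickly outvoted
      failureThreshold: 10         # 3 s x 10 = about 30 s before marked unready
```

The 30-second budget itself is only an example; it should come from how long the service can stay in the failed state.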
I am struggling with defining a sane implementation strategy for Kubernetes probes for my product. Digging into the available guidelines, both official docs and field reports, I am not able to identify a common approach with consensus in the ecosystem; and most probably that's expected.
Here's what I plan to implement as probe definition rules:
Readiness probe:
Enable it for services handling incoming network traffic.
Question: Any other use case where I should consider a readiness probe for a service?
Liveness probe:
This is the most difficult one for me. What I have in mind as a rule is: don't define it by default, and only enable it manually for services where scenarios such as deadlocks are detected, and only until these bugs are fixed.
I don't see it as a healthy approach to just assume that a service will deadlock and leave the liveness probe to handle it. First, because it is very hard to identify a service deadlock from the probe, and second, because it would leave bugs unaddressed.
Question: Any other use case where I should consider a liveness probe?
Startup probe:
Enable it only when there is a liveness probe enabled on that service.
Question: Without a liveness probe defined, is there any advantage in defining a startup probe?
Question: Without a liveness probe defined, is there any advantage in defining a startup probe?
I don't see any advantage in defining just a startup probe without a liveness probe. The startup probe is there to protect slow-starting pods.
Question: Any other use case where I should consider a liveness probe?
If the service depends on a DB or any other service, and the backend service code has a function or endpoint that checks that dependency, you can use it as a liveness probe. You are right about the risk of leaving the bug unaddressed.
The selling point of keeping the liveness probe is that it automatically restarts the container if it is failing, while the readiness probe only flips the pod's ready status (e.g. 0/1) to control whether it receives traffic.
If you don't specify a liveness probe, Kubernetes decides the container status based only on whether the container's main process (PID 1) is still running.
If your Docker image starts the application through bash/sh or dumb-init, PID 1 is that wrapper; if you want to track the child process running your application instead, a liveness probe is required.
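One possible way to track the child process is an exec liveness probe. This is only a sketch: it assumes the application process is named my-server (a hypothetical name) and that pgrep is installed in the image, which is not the case for many minimal images.

```yaml
containers:
  - name: app
    image: example/app:1.0         # placeholder image; assumed to start my-server via a wrapper
    livenessProbe:
      exec:
        # pgrep exits non-zero when no process with exactly this name exists,
        # which the kubelet counts as a probe failure.
        command: ["pgrep", "-x", "my-server"]
      periodSeconds: 10
      failureThreshold: 3          # restart after about 30 s without the child process
```

If the application exposes an HTTP health endpoint, probing that endpoint is usually a simpler alternative than inspecting the process table.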
According to the documentation, initialDelaySeconds adds a delay before the first readiness probe checks the pod. But the only effect of a readiness probe failure is that the pod is marked unready: it does not get traffic from Services, and the deployment state is affected.
So giving readiness checks some delay is not really effective for the majority of applications: we want the pod to become ready as early as it can, meaning we need to check whether it's ready as soon as possible.
Things to consider:
No reason to set it earlier than the earliest possible startup time.
If you set it late, you are wasting resources (pods not receiving traffic): a 1-minute delay across 60 pods adds up to 1 hour of idle pod time.
How much resource does the readiness Probe consume? Does it make external calls (Database, REST, etc - IMHO this should be avoided)?
Can the pod serve the traffic? Are all the necessary connections (DB, REST) established, caches populated? - No reason to accept traffic if the pod cannot connect to the database/backend service.
To summarize:
You want to minimize startup time
You want to minimize readiness calls
Measure it, set it to the earliest possible time a pod can start.
If your pods regularly fail the readiness check right after startup, increase it.
So giving readiness checks some delay is not really effective for the majority of applications: we want the pod to become ready as early as it can, meaning we need to check whether it's ready as soon as possible.
It depends on the application you run. Here is how the documentation describes readiness probes:
Sometimes, applications are temporarily unable to serve traffic. For example, an application might need to load large data or configuration files during startup, or depend on external services after startup. In such cases, you don't want to kill the application, but you don't want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.
If your application requires loading of large data files or configuration files at startup and it always takes e.g. 30 seconds, it makes no sense to start checking immediately after starting if the application is ready to run, because it will not be ready.
Therefore the initialDelaySeconds option is suitable here: we can have the checks start after, e.g., 20 seconds instead of immediately.
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
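For the 30-second loading example above, a readiness probe could look roughly like this. The endpoint path, port, period and threshold are assumptions for illustration.

```yaml
readinessProbe:
  httpGet:
    path: /ready               # assumed endpoint
    port: 8080                 # assumed port
  initialDelaySeconds: 20      # skip the checks that would certainly fail while data is loading
  periodSeconds: 5             # after the delay, check every 5 s
  failureThreshold: 3
```

With these values the first check runs around the 20-second mark, and the pod is marked ready on the first check that succeeds after loading finishes.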
In the official Kubernetes docs, I was reading the page about container probes and why we should use a startup probe.
Under "When should you use a startup probe?", they state:
If your container usually starts in more than initialDelaySeconds + failureThreshold × periodSeconds, you should specify a startup probe that checks the same endpoint as the liveness probe. The default for periodSeconds is 10s. You should then set its failureThreshold high enough to allow the container to start, without changing the default values of the liveness probe. This helps to protect against deadlocks.
I understand why we need a startup probe: startup probes are useful for Pods whose containers take a long time to come into service. All other probes are disabled while a startup probe is configured and has not yet succeeded, so for a slow-starting container the startup probe keeps the other two probes disabled until the container has started.
But I did not get the deadlock scenario: where and why does the deadlock happen? Can anyone explain the deadlock they are talking about? Which deadlock are we preventing by using a startup probe?
The startup probe runs only during container startup: once it has succeeded, it is not performed again.
Readiness and liveness probes are performed for the whole life of the container, not only at startup.
If a startup probe exceeds the configured failureThreshold without succeeding, the container is killed and restarted, subject to the pod's restartPolicy, a behavior analogous to the liveness probe.
A readiness probe may be used by a load balancer to determine when it can send traffic.
Startup probe use-cases
A typical reason to use a startup probe:
Your application takes a long time to start. You can increase the initial delays of the readiness and liveness probes, but then you do not know when your container becomes ready, because those probes are not performed during the delay.
So a startup probe is commonly used together with readiness and liveness probes. It is performed until your container has started (until the startup probe returns the Success status), so you do not need the delays anymore.
External dependencies
Let's say your application takes 1-3 minutes to start (it may depend on an external API, resources, a slow network, etc.). You can set the delays to 190 seconds, but then you waste more than 2 minutes whenever your container is ready after 60 seconds. The startup probe was designed to solve that issue.
First initialization
Sometimes, you have to deal with legacy applications that might require an additional startup time on their first initialization. In such cases, it can be tricky to set up liveness probe parameters without compromising the fast response to deadlocks that motivated such a probe. The trick is to set up a startup probe with the same command, HTTP or TCP check, with a failureThreshold * periodSeconds long enough to cover the worst-case startup time.
Your question
The deadlock is the situation where your container is not ready yet but the liveness probe is already running and exceeds its failure threshold because the delay is too short. In this situation your container keeps restarting. To prevent that, use a startup probe and set its threshold high enough.
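A minimal sketch of such a setup is shown here, with the startup probe checking the same endpoint as the liveness probe. The /healthz path, port 8080 and the 300-second startup budget are assumptions; pick the budget from your own worst-case startup time.

```yaml
livenessProbe:
  httpGet:
    path: /healthz             # assumed endpoint
    port: 8080                 # assumed port
  periodSeconds: 10
  failureThreshold: 3          # once started: restart after about 30 s of failures
startupProbe:
  httpGet:
    path: /healthz             # same endpoint as the liveness probe
    port: 8080
  periodSeconds: 10
  failureThreshold: 30         # 30 x 10 s = up to 300 s allowed for startup
```

The liveness and readiness probes only begin once the startup probe has succeeded, so the liveness values can stay tight without risking a restart during startup.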
My question is now fully answered, so I would like to explain the full scenario as I understand it (hopefully it will help others in the future). The answer from @Daniel is correct; I just want to explain it more comprehensively.
Explanation of the Terms:
initialDelaySeconds: Number of seconds after the container has started before the probe is first scheduled.
failureThreshold: The number of times in a row that the probe is allowed to fail before Kubernetes takes action (a liveness probe restarts the container; a readiness probe marks the pod as unavailable).
periodSeconds: How often, in seconds, the kubelet performs the probe.
initialDelaySeconds + failureThreshold × periodSeconds: The total time after which the probe takes its action (a liveness probe restarts the container; a readiness probe marks the pod as unavailable).
As @Daniel's comment points out, remember also that each probe has its own failureThreshold and periodSeconds. For the liveness probe those values can be small, to kill the container as soon as it stops working properly; for the startup probe they can be higher, to wait long enough for startup.
How deadlock is happening
If no startup probe is used and the container takes longer than initialDelaySeconds + failureThreshold × periodSeconds to start, the liveness probe fails before the container has finished starting. When the liveness probe fails, the kubelet kills the container, and the container is subjected to its restart policy.
The same thing happens again after the restart, and again after every following restart, so the container never gets to finish starting up. This is the deadlock.
So the deadlock happens because the container takes longer than initialDelaySeconds + failureThreshold × periodSeconds to start up and we did not use a startup probe.
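To make the numbers concrete, here is a sketch of a configuration that would loop this way, using the Kubernetes default values written out explicitly. The endpoint, port and the 60-second startup time are assumptions for illustration.

```yaml
# Liveness probe only, no startup probe.
livenessProbe:
  httpGet:
    path: /healthz             # assumed endpoint
    port: 8080                 # assumed port
  initialDelaySeconds: 0       # default
  periodSeconds: 10            # default
  failureThreshold: 3          # default
# Time budget by the formula above: 0 + 3 x 10 = 30 s.
# A container that needs about 60 s to start is killed at roughly 30 s,
# restarts, needs about 60 s again, is killed again, and so on.
```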
Preventing the deadlock
To prevent the deadlock we can do one of two things:
We can give the liveness probe a long interval, but since the time the container takes to start is not fixed, this is not the better approach.
We can use a startup probe. All other probes are disabled while a startup probe is configured and has not yet succeeded, so with a startup probe we no longer need to think about the deadlock described above.
One thing is left: the startup probe needs a high failureThreshold, because it can also fail if the container takes longer than initialDelaySeconds + failureThreshold × periodSeconds to start up (to be clear, this is a general formula, calculated separately for each probe with that probe's own values). This way we solve the deadlock problem and also guarantee that the container gets enough time to start up.
I have readiness probes configured on several pods (which are members of deployment-managed replica sets). They work as expected -- readiness is required as part of the deployment's rollout strategy, and if a healthy pod becomes NotReady, the associated Service will remove it from the pool of endpoints until it becomes Ready again.
Furthermore, I have external health checking (using Sensu) that alerts me when a pod becomes NotReady.
Sometimes, a pod will report NotReady for an extended period of time, showing no sign of recovery. I would like to configure things such that, if a pod stays in NotReady for an extended period of time, it gets evicted from the node and rescheduled elsewhere. I'll settle for a mechanism that simply kills the container (leading it to be restarted in the same pod), but what I really want is for the pod to be evicted and rescheduled.
I can't seem to find anything that does this. Does it exist? Most of my searching turns up things about evicting pods from NotReady nodes, which is not what I'm looking for at all.
If this doesn't exist, why? Is there some other mechanism I should be using to accomplish something equivalent?
EDIT: I should have specified that I also have liveness probes configured and working the way I want. In the scenario I’m talking about, the pods are “alive.” My liveness probe is configured to detect more severe failures and restart accordingly and is working as desired.
I’m really looking for the ability to evict based on a pod being live but not ready for an extended period of time.
I guess what I really want is the ability to configure an arbitrary number of probes, each with different expectations it checks for, and each with different actions it will take if a certain failure threshold is reached. As it is now, liveness failures have only one method of recourse (restart the container), and readiness failures also have just one (just wait). I want to be able to configure any action.
As of Kubernetes v1.15, you might want to use a combination of readiness and liveness probes to achieve the outcome that you want. See configure liveness and readiness probes.
A new feature to start the liveness probe after the pod is ready is likely to be introduced in v1.16. There will be a new probe called startupProbe that can handle this in a more intuitive manner.
You can configure an HTTP or TCP liveness probe with periodSeconds, depending on the type of container image.
livenessProbe:
  .....
  initialDelaySeconds: 3  # kubelet waits 3 seconds before performing the first probe
  periodSeconds: 5  # kubelet performs the liveness probe every 5 seconds
You could also use Prometheus metrics for this purpose and create an alert on pods that stay NotReady. Based on that alert you can configure a webhook in Alertmanager that reacts appropriately (action: kill the Pod); the Pod is then recreated by its controller and scheduled again.
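A rough sketch of such an alert as a Prometheus rule file follows. It assumes kube-state-metrics is installed (it exports kube_pod_status_ready); the group name, alert name, severity label and the 15-minute window are made up for illustration. The webhook receiver that actually deletes the Pod is not shown and would have to be written separately.

```yaml
groups:
  - name: pod-readiness                  # hypothetical group name
    rules:
      - alert: PodNotReadyTooLong        # hypothetical alert name
        # kube-state-metrics sets this series to 1 while the pod's Ready condition is false.
        expr: kube_pod_status_ready{condition="false"} == 1
        for: 15m                         # fire only after 15 minutes of continuous NotReady
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been NotReady for 15 minutes"
```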
For OpenShift health checks (liveness and readiness probes), does the liveness check run only after the container is ready? And should the readiness initial delay be less than the liveness initial delay?
Please advise.
Thanks
B.
The delay specified for both the readiness and the liveness check is counted from the start of the container. The start of the delay for the liveness check does not depend on the pod first being ready. Once they start, both run for the life of the pod.
You need to evaluate what you set the delays to based on the role of each check and how you implement the checks.
A readiness probe checks if an application is ready to service requests. It is used initially to determine if the pod has started up correctly and becomes ready, but also subsequently, to determine if the pod IP should be removed from the set of endpoints for any period, with it possibly being added back later if the check is set to pass again, with the application again being ready to handle requests.
A liveness probe checks if an application is still working. It is used to check that the application in a pod is still running and also working correctly. If the probe keeps failing, the container is killed and restarted according to the pod's restart policy.
So having the delay for the liveness check be larger than that for the readiness check is quite reasonable, especially if during the initial startup phase the liveness check would fail. You don't want the pod to be killed off when startup time can be quite long.
You may also want to look at the period and success/failure thresholds.
Overall it is hard to give a set rule as it depends on your application.
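As one illustration only, a configuration with a short readiness delay and a longer liveness delay might look like this. The endpoints, port and all timing values are assumptions and need to be measured for your own application.

```yaml
readinessProbe:
  httpGet:
    path: /ready               # assumed endpoint
    port: 8080                 # assumed port
  initialDelaySeconds: 5       # start checking early so the pod gets traffic as soon as it can
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /healthz             # assumed endpoint
    port: 8080
  initialDelaySeconds: 60      # assumed to be longer than the slowest expected startup
  periodSeconds: 10
  failureThreshold: 3
```

Both delays are counted independently of each other, so the liveness check starts at about 60 seconds whether or not the pod has become ready by then.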