Kubernetes Deployment restartPolicy alternatives - kubernetes

I have a Deployment configuration that keeps a certain number of Pods alive. However, due to some strange circumstances, these Pods sometimes fail their readiness probes and do not recover after a restart, requiring me to manually delete the Pod from the ReplicaSet.
A solution to this would be to set the Pod's restartPolicy to Never, but that is not supported for Deployments: https://github.com/kubernetes/kubernetes/issues/24725.
My question is: what alternatives are there so that if a Pod has failed its readiness probe, the Pod gets deleted?

You could change the liveness probe so that it fails whenever the readiness probe fails. This would kill the container and start a new one.
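One way to do that is to point both probes at the same endpoint, so any condition that makes the Pod unready also eventually triggers a restart. A minimal sketch, where the name, image, port, and /healthz path are all assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest   # assumed image
        ports:
        - containerPort: 8080
        readinessProbe:        # gates Service traffic
          httpGet:
            path: /healthz     # assumed health endpoint
            port: 8080
        livenessProbe:         # same check: stuck-unready => restarted
          httpGet:
            path: /healthz
            port: 8080
          failureThreshold: 6  # tolerate more failures before killing
```

Giving the liveness probe a higher failureThreshold than the readiness probe leaves a grace window in which the Pod is merely unready before the kubelet restarts it.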

Related

Kubernetes StatefulSets and livenessProbes

Liveness probes are supposed to trigger a restart of failed containers. Do they respect the default StatefulSet deployment and scaling guarantees? E.g., if the liveness probe fails at the same time for multiple Pods within one and the same StatefulSet, would K8s attempt to restart one container at a time, or all in parallel?
According to https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ the liveness probes are a feature implemented in the kubelet:
The kubelet uses liveness probes to know when to restart a container.
This means any decision about scheduling that requires knowledge of multiple pods is not taken into account.
Therefore, if all of your StatefulSet's Pods have failing liveness probes at the same time, their containers will be restarted at about the same time, without respecting any deployment-level ordering guarantees.

k8s - Keep pod up even if sidecar crashed

I have a pod with a sidecar. The sidecar does file synchronisation and is optional. However it seems that if the sidecar crashes, the whole pod becomes unavailable. I want the pod to continue serving requests even if its sidecar crashed. Is this doable?
Set the Pod's restartPolicy to Never. It will prevent the kubelet from restarting your containers even if one of them fails.
If a Pod is running and has two containers, and Container 1 exits with failure, then with restartPolicy set to Never the kubelet will not restart the container and the Pod's phase stays Running.
Reference
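As a sketch, the answer above corresponds to a Pod spec like this (container names and images are placeholders); note that restartPolicy applies to all containers in the Pod, so a failed main container would not be restarted either:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar      # hypothetical name
spec:
  restartPolicy: Never        # kubelet never restarts failed containers
  containers:
  - name: main
    image: my-app:latest      # assumed main image
  - name: file-sync-sidecar
    image: my-sync:latest     # assumed optional sidecar image
```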

Is the readiness probe in Kubernetes run even after the Pod is in RUNNING state?

Is the readiness probe run even after the Pod is ready?
Will it run even after the Pod is in RUNNING state?
Is the readiness probe run even after the Pod is ready?
Yes.
Will it run even after the Pod is in RUNNING state?
Yes.
As the official documentation says, there are some cases when:
applications are temporarily unable to serve traffic... an application might depend on external services ... In such cases, you don't want to kill the application, but you don’t want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.
So, the liveness probe is there to detect and remedy situations when the app cannot recover except by being restarted.
The readiness probe is used to detect situations when traffic should not be sent to the app.
Both probes share the same set of settings, such as initialDelaySeconds, periodSeconds, etc.
The readiness probe checks whether the container is available for incoming traffic. It keeps being executed periodically even after the container becomes ready.
Here are the docs:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
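For illustration, here is a minimal readiness probe snippet showing those shared settings; the endpoint, port, and timings are assumptions:

```yaml
# Container snippet: the readiness probe keeps running for the Pod's lifetime.
readinessProbe:
  httpGet:
    path: /ready            # assumed endpoint
    port: 8080
  initialDelaySeconds: 5    # wait before the first check
  periodSeconds: 10         # then re-check every 10s, even after ready
  failureThreshold: 3       # marked unready after 3 consecutive failures
```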

Why is a liveness probe needed in addition to a readiness probe?

When doing health checks for Kubernetes Pods, why is a liveness probe needed even though we already maintain a readiness probe?
The readiness probe already keeps checking whether the application within the Pod is ready to serve requests, which means that the Pod is live. So why is a liveness probe still needed?
The probes have different meaning with different results:
failing liveness probes -> restart container
failing readiness probes -> do not send traffic to that pod
You cannot determine liveness from readiness, or vice versa. Just because a Pod cannot accept traffic right now doesn't mean a restart is needed; it may just need time to finish some work.
If you are deploying, e.g., a PHP app, those two will probably be the same, but K8s is a generic system that supports many types of workloads.
From: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
The kubelet uses liveness probes to know when to restart a Container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a Container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
Sidenote: readiness should actually be a subset of liveness, i.e. readiness implies liveness (and failing liveness implies failing readiness). But that doesn't change the explanation above, because if you only have readiness you can only infer when a restart is NOT needed, which is the same as having no restart probe at all. Also, because the probes are defined separately, K8s has no guarantee that one is a subset of the other.
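The distinction between "restart me" and "don't route traffic to me" is usually expressed by pointing the two probes at different checks; a sketch, with the paths and port being assumptions:

```yaml
# Container snippet: separate checks for restart vs. traffic routing.
livenessProbe:
  httpGet:
    path: /live             # assumed: fails only on deadlock/fatal state
    port: 8080
readinessProbe:
  httpGet:
    path: /ready            # assumed: also fails while, e.g., a dependency is down
    port: 8080
```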
The readiness probe checks whether your application is ready to serve requests, and the Pod will not be added to the pool of ready Pods until it satisfies the readiness checks. The main difference is that a failing readiness probe does not restart the Pod.
The liveness probe checks a specific health condition; if the container fails it, the kubelet restarts the container in the hope that it will recover and become ready.
Kubernetes lets you define a few things to keep the app available:
1: Liveness probes for your containers.
2: Readiness probes for your Pod.
1: Liveness probes help keep your apps healthy by ensuring unhealthy containers are restarted automatically.
2: Readiness probes are invoked periodically and determine whether the specific Pod should receive client requests or not.
Operation of Readiness Probes
When a container is started, Kubernetes can be configured to wait for a configurable amount of time before performing the first readiness check. After that, it invokes the probe periodically and acts based on the result.
If a Pod reports that it's not ready, it's removed from the Service; the effect is the same as if the Pod didn't match the Service's label selector at all.
If the Pod then becomes ready again, it's re-added.
Important difference between liveness and readiness probes
1: Unlike with liveness probes, if a container fails the readiness check, it won't be killed or restarted; as mentioned above, the Pod is only removed from the Service.
2: Liveness probes keep Pods healthy by killing off unhealthy containers and replacing them with new, healthy ones, whereas readiness probes make sure that only Pods that are ready to serve requests receive them.
I hope this helps you better.
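As a sketch of the behaviour described above, a readiness probe can also be an exec command; here a hypothetical marker file controls whether the Pod receives traffic (the file path and image are assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-exec-demo   # hypothetical name
spec:
  containers:
  - name: app
    image: my-app:latest      # assumed image
    readinessProbe:
      exec:
        command: ["cat", "/tmp/ready"]  # assumed marker file
      initialDelaySeconds: 5  # wait before the first check
      periodSeconds: 5
# While /tmp/ready is missing, the probe fails: the Pod stays Running
# but is removed from Service endpoints; nothing is killed or restarted.
```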

Pods readiness check fails after running, why are new pods not started?

I'm trying to understand how best to use the Kubernetes readiness and liveness probes. What I have observed when testing is that if a readiness probe fails, the Pod is marked unready and removed from the load balancer. However, I then expected a new Pod to be started and added to the load balancer until the original Pod became ready again (or its liveness probe failed and it was killed), after which one of them could be terminated.
We might want to temporarily remove a Pod from service if the readiness probe fails, but this seems to run the risk that all Pods could become unready, which would result in no Pods in the load balancer and no new Pods being started.
I assume that what I am observing, with no new Pods being started to cover those that are unready, is expected behavior? In which case, what exactly is the use case for the readiness probe, and how is the risk of all Pods becoming unready mitigated?
No matter how many Pods there are, a failing readiness probe by itself only stops traffic from being sent to that Pod; it does not create a replacement Pod. If you also want the container restarted when it cannot recover, add a liveness probe: restarting the same container makes more sense than creating a new one.