Kubernetes StatefulSets and livenessProbes - kubernetes

Liveness probes are supposed to trigger a restart of failed containers. Do they respect the default StatefulSet deployment and scaling guarantees? For example, if the liveness probe fails at the same time for multiple pods within the same StatefulSet, would Kubernetes restart one container at a time or all of them in parallel?

According to https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ the liveness probes are a feature implemented in the kubelet:
The kubelet uses liveness probes to know when to restart a container.
This means any decision about scheduling that requires knowledge of multiple pods is not taken into account.
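Therefore, if all of your StatefulSet's pods have failing liveness probes at the same time, their containers will be restarted in place at about the same time, without respecting any StatefulSet-level ordering guarantees. The pods are not rescheduled; each kubelet simply restarts the failing container on its own node.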
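To illustrate the point above, here is a minimal sketch of a StatefulSet with a liveness probe. The names, image, and port are hypothetical placeholders, not taken from the question. The ordering policy governs pod creation, scaling, and deletion by the StatefulSet controller; the probe is evaluated by each node's kubelet independently.

```yaml
# Hypothetical StatefulSet; names, image and port are placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db
spec:
  serviceName: example-db
  replicas: 3
  # Ordering applies to pod creation, scaling and deletion only.
  podManagementPolicy: OrderedReady
  selector:
    matchLabels:
      app: example-db
  template:
    metadata:
      labels:
        app: example-db
    spec:
      containers:
        - name: db
          image: example/db:1.0
          livenessProbe:
            # Evaluated by the kubelet on each node, with no knowledge
            # of the other replicas, so restarts can happen in parallel.
            tcpSocket:
              port: 5432
            periodSeconds: 10
            failureThreshold: 3
```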

Related

Hazelcast fails liveness probe in OpenShift when loading a large map

We have Hazelcast 4.2 on OpenShift deployed as a standalone cluster and stateful set. We use Mongo as a backing data store (it shouldn't matter) and the docker image is created with a copy of dockerfile from Github Hazelcast project with all the package repositories replaced with our internal company servers.
We have a MapLoader which takes a long time (30 minutes) to load all the data. During this load time the cluster fails to respond to liveness and readiness probes:
Readiness probe failed: Get http://xxxx:5701/hazelcast/health/node-state: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
and so OpenShift kills the nodes that are loading the data.
Is there any way to fix it, so Hazelcast responds with "alive" and "not ready" instead of timing out the connection?
If the data is loaded during startup, you can give the pod more time before its health is checked.
https://github.com/hazelcast/charts/blob/d3b8d118da400255effc81e67a13c1863fee5f41/stable/hazelcast/values.yaml#L120
The line above, from the Hazelcast Helm chart, shows the readiness and liveness configuration.
You can raise initialDelaySeconds so that the kubelet only starts probing Hazelcast after that many seconds.
You can also adjust the other probe settings, such as failureThreshold: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
If initialDelaySeconds alone is not enough, you can still relax the readiness and liveness settings and increase timeoutSeconds to cover the time the MapLoader needs to load the data.
Also note that only a failing liveness probe causes the container to be restarted. If the readiness probe fails, Kubernetes just marks the pod as not ready to accept traffic. So for the loading phase you can relax the liveness settings far enough that the probe does not fail before the load completes.
The kubelet uses liveness probes to know when to restart a container.
For example, liveness probes could catch a deadlock, where an
application is running, but unable to make progress. Restarting a
container in such a state can help to make the application more
available despite bugs.
The kubelet uses readiness probes to know when a container is ready to
start accepting traffic. A Pod is considered ready when all of its
containers are ready. One use of this signal is to control which Pods
are used as backends for Services. When a Pod is not ready, it is
removed from Service load balancers.
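You can also set the restartPolicy to Never so the container will not be restarted. Note that this only applies to bare Pods and Jobs; the pod template of a StatefulSet or Deployment only accepts restartPolicy: Always.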
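On clusters that support startup probes (the documentation page linked above covers them), a startupProbe is another option: liveness and readiness checks are held back until the startup probe succeeds once. The fragment below is a sketch under assumptions. The endpoint and port come from the error message in the question; the image name and all timing values are placeholders chosen to leave roughly 37 minutes for a 30 minute load.

```yaml
# Container fragment of a pod template; image and timings are assumptions.
containers:
  - name: hazelcast
    image: registry.example.com/hazelcast:4.2
    ports:
      - containerPort: 5701
    startupProbe:
      httpGet:
        path: /hazelcast/health/node-state
        port: 5701
      periodSeconds: 15
      timeoutSeconds: 5
      # 150 failures * 15 s = 2250 s before the container is killed.
      failureThreshold: 150
    livenessProbe:
      # Only starts running after the startup probe has succeeded.
      httpGet:
        path: /hazelcast/health/node-state
        port: 5701
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
```

The pod stays not ready while the startup probe is still failing, so it receives no Service traffic during the load.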

Are Kubernetes liveness probe failures voluntary or involuntary disruptions?

I have an application deployed to Kubernetes that depends on an outside application. Sometimes the connection between these 2 goes to an invalid state, and that can only be fixed by restarting my application.
To do automatic restarts, I have configured a liveness probe that will verify the connection.
This has been working great, however, I'm afraid that if that outside application goes down (such that the connection error isn't just due to an invalid pod state), all of my pods will immediately restart, and my application will become completely unavailable. I want it to remain running so that functionality not depending on the bad service can continue.
I'm wondering if a pod disruption budget would prevent this scenario, as it limits the number of pods down due to a "voluntary" disruption. However, the K8s docs don't state whether liveness probe failures are a voluntary disruption. Are they?
I would say, according to the documentation:
Voluntary and involuntary disruptions
Pods do not disappear until someone (a person or a controller) destroys them, or there is an unavoidable hardware or system software error.
We call these unavoidable cases involuntary disruptions to an application. Examples are:
a hardware failure of the physical machine backing the node
cluster administrator deletes VM (instance) by mistake
cloud provider or hypervisor failure makes VM disappear
a kernel panic
the node disappears from the cluster due to cluster network partition
eviction of a pod due to the node being out-of-resources.
Except for the out-of-resources condition, all these conditions should be familiar to most users; they are not specific to Kubernetes.
We call other cases voluntary disruptions. These include both actions initiated by the application owner and those initiated by a Cluster Administrator. Typical application owner actions include:
deleting the deployment or other controller that manages the pod
updating a deployment's pod template causing a restart
directly deleting a pod (e.g. by accident)
Cluster administrator actions include:
Draining a node for repair or upgrade.
Draining a node from a cluster to scale the cluster down (learn about Cluster Autoscaling ).
Removing a pod from a node to permit something else to fit on that node.
-- Kubernetes.io: Docs: Concepts: Workloads: Pods: Disruptions
So your example is quite different and, to my knowledge, it is neither a voluntary nor an involuntary disruption.
Also, taking a look at another part of the Kubernetes documentation:
Pod lifetime
Like individual application containers, Pods are considered to be relatively ephemeral (rather than durable) entities. Pods are created, assigned a unique ID (UID), and scheduled to nodes where they remain until termination (according to restart policy) or deletion. If a Node dies, the Pods scheduled to that node are scheduled for deletion after a timeout period.
Pods do not, by themselves, self-heal. If a Pod is scheduled to a node that then fails, the Pod is deleted; likewise, a Pod won't survive an eviction due to a lack of resources or Node maintenance. Kubernetes uses a higher-level abstraction, called a controller, that handles the work of managing the relatively disposable Pod instances.
-- Kubernetes.io: Docs: Concepts: Workloads: Pods: Pod lifecycle: Pod lifetime
Container probes
The kubelet can optionally perform and react to three kinds of probes on running containers (focusing on a livenessProbe):
livenessProbe: Indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.
-- Kubernetes.io: Docs: Concepts: Workloads: Pods: Pod lifecycle: Container probes
When should you use a liveness probe?
If the process in your container is able to crash on its own whenever it encounters an issue or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet will automatically perform the correct action in accordance with the Pod's restartPolicy.
If you'd like your container to be killed and restarted if a probe fails, then specify a liveness probe, and specify a restartPolicy of Always or OnFailure.
-- Kubernetes.io: Docs: Concepts: Workloads: Pods: Pod lifecycle: When should you use a liveness probe
Based on this information, it would be better to create a custom liveness probe that distinguishes between internal process health checks and the health of the external dependency. In the first case (an internal failure) the probe should fail so that the container is terminated and restarted; in the second case (the external dependency is down) it should not.
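One possible reading of this advice, sketched under assumptions: keep the liveness probe limited to the application's internal health and move the external dependency check to a readiness probe, so an outage of the outside application marks pods as not ready instead of restarting them. The endpoint paths, port, and image below are hypothetical and would have to be implemented by the application.

```yaml
# Container fragment; paths, port and image are hypothetical.
containers:
  - name: app
    image: example/app:1.0
    livenessProbe:
      # Should fail only when this process itself is in a bad state,
      # for example when its connection object is stuck and needs a restart.
      httpGet:
        path: /healthz/internal
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      # May fail while the external dependency is unreachable;
      # the pod is removed from Service endpoints but not restarted.
      httpGet:
        path: /healthz/dependencies
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
```

Whether the dependency check belongs in readiness at all depends on whether the pod can still serve useful requests without it, as the asker wants here.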
Answering following question:
I'm wondering if a pod disruption budget would prevent this scenario.
In this particular scenario PDB will not help.
I think it is worth giving more visibility to a comment I made with additional resources on the matter, as it could prove useful to other community members:
Blog.risingstack.com: Designing microservices architecture for failure
Loft.sh: Blog: Kubernetes readiness probes examples common pitfalls: External dependencies
Cloud.google.com: Architecture: Scalable and resilient apps: Resilience designing to withstand failures
I tested this with a PodDisruptionBudget.
The pods still restart at the same time.
Example:
https://github.com/AlphaWong/PodDisruptionBudgetAndPodProbe
So yes, as @Dawid Kruk suggested, you should create a customized probe script, like the following:
# something like this
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      # generate a random number for sleep
      - 'SLEEP_TIME=$(shuf -i 2-40 -n 1);sleep $SLEEP_TIME; curl -L --max-time 5 -f nginx2.default.svc.cluster.local'
  initialDelaySeconds: 10
  # think about the gap between each call
  periodSeconds: 30
  # it is required after k8s v1.12
  timeoutSeconds: 90
I'm wondering if a pod disruption budget would prevent this scenario.
Yes, it will prevent.
As you stated, when a pod goes down (or a node fails), nothing can prevent pods from becoming unavailable. However, certain services require that a minimum number of pods always keeps running.
There could be another way (a stateful resource), but a PodDisruptionBudget is one of the simplest Kubernetes resources available.
Note: You can also use a percentage instead of an absolute number in the minAvailable field. For example, you could state that 60% of all pods with the app=run-always label need to be running at all times.
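For reference, a PodDisruptionBudget matching the note above could look like the sketch below. The resource name is a placeholder; the label and percentage are the ones from the note. Keep in mind what other answers in this thread observed: the budget is enforced for evictions (for example during a node drain), while container restarts triggered by a liveness probe are performed by the kubelet and do not go through it.

```yaml
# Hypothetical name; label and percentage taken from the note above.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: run-always-pdb
spec:
  # At least 60% of the matching pods must stay available
  # when pods are evicted voluntarily.
  minAvailable: "60%"
  selector:
    matchLabels:
      app: run-always
```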

How does Kubernetes handle livenessProbe failure?

I want to understand what happens behind the scenes if a liveness probe fails in Kubernetes.
Here is the context:
We are using a Helm chart to deploy our application in a Kubernetes cluster.
We have a StatefulSet and a headless service. To initialize mTLS, we have created a resource of kind 'Job', and in its 'command' we pass shell and Python scripts as arguments.
We have written a 'docker-entrypoint.sh' inside 'docker image' for some initialization work.
Inside statefulSet, we are passing a shell script as a command in 'livenessProbe' which runs every 30 seconds.
I want to know, if my livenessProbe fails for any reason:
1. Does the Helm chart monitor this probe and restart the container, or is that Kubernetes' responsibility?
2. Will my 'docker-entrypoint.sh' execute if the container is restarted?
3. Will the 'Job' execute when the container restarts?
How Kubernetes handles livenessProbe failure and what steps it takes?
It's not Helm's responsibility. It is Kubernetes' responsibility (specifically the kubelet's) to restart the container when the liveness probe fails.
Yes, docker-entrypoint.sh is executed every time the container starts, including after a restart.
No, the Job needs to be applied to the cluster again for it to execute. Alternatively, you could use an init container, which is guaranteed to run before the main container starts.
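A minimal sketch of the init container alternative, assuming the mTLS setup can run inside the pod. Image names and script paths are hypothetical. Init containers run to completion before the application containers start when the Pod starts; they are generally not re-run when only the main container is restarted after a liveness failure.

```yaml
# Pod template fragment; images and script paths are hypothetical.
spec:
  initContainers:
    - name: init-mtls
      image: example/mtls-init:1.0
      # Must exit successfully before the app container is started.
      command: ["/bin/sh", "-c", "/scripts/init-mtls.sh"]
  containers:
    - name: app
      image: example/app:1.0
      livenessProbe:
        exec:
          # A non-zero exit code counts as a probe failure.
          command: ["/bin/sh", "/scripts/liveness.sh"]
        periodSeconds: 30
```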
Kubelet kills the container and restarts it if liveness probe fails.
To answer your question: liveness and readiness probes are checks (for example HTTP GET calls) that the kubelet runs against your application container to see whether it is healthy.
This is not related to Helm charts.
Once the liveness probe fails, a container restart takes place. A failing readiness probe does not restart anything; it only stops traffic from being sent to the pod.
I would say liveness probe failures can affect your app's uptime, so use rolling deployments and autoscale your pod count to maintain availability.

Does Liveness Probes restart or kill the Pod?

I read in the documentation that liveness probes create a new pod and stop the old one. But with my TCP liveness probe, the Kubernetes dashboard only shows restarts. I was wondering what Kubernetes does when a liveness probe fails. Can I control it?
The kubelet uses liveness probes to know when to restart a Container, not recreate the pods.
Probes have a number of fields that you can use to control the behavior of the checks more precisely (initialDelaySeconds, periodSeconds, timeoutSeconds, successThreshold and failureThreshold). Details are in the Kubernetes documentation on configuring probes.
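As an illustration of those fields, here is a sketch of a TCP liveness probe. The port and all values are placeholders, not recommendations.

```yaml
# Probe fragment of a container spec; port and values are placeholders.
livenessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 15  # wait before the first probe
  periodSeconds: 10        # how often to probe
  timeoutSeconds: 1        # per-attempt timeout
  successThreshold: 1      # must be 1 for liveness probes
  failureThreshold: 3      # consecutive failures before a restart
```

With these values the container is restarted after roughly 30 seconds of consecutive failures.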
For a container restart, SIGTERM is sent first, Kubernetes then waits for a configurable grace period, and finally it sends SIGKILL. You can control some of this behavior by tweaking the terminationGracePeriodSeconds value and/or attaching handlers to container lifecycle events.
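A sketch of both knobs together. The image, the grace period, and the preStop command are placeholders; a real handler would typically do something more useful than sleep, such as draining connections.

```yaml
# Pod spec fragment; image, timings and command are placeholders.
spec:
  # Time allowed between the start of termination and SIGKILL.
  terminationGracePeriodSeconds: 60
  containers:
    - name: app
      image: example/app:1.0
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent; its run time counts
            # against the grace period above.
            command: ["/bin/sh", "-c", "sleep 5"]
```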

Why is a liveness probe needed in addition to a readiness probe?

When health-checking Kubernetes pods, why is a liveness probe needed even though we already maintain a readiness probe?
The readiness probe already keeps checking whether the application within the pod is ready to serve requests, which means that the pod is live. So why is a liveness probe still needed?
The probes have different meaning with different results:
failing liveness probes -> restart container
failing readiness probes -> do not send traffic to that pod
You cannot determine liveness from readiness and vice versa. Just because a pod cannot accept traffic right now doesn't mean a restart is needed; it can mean that it just needs time to finish some work.
If you are deploying, e.g., a PHP app, those two will probably be the same, but k8s is a generic system that supports many types of workloads.
From: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
The kubelet uses liveness probes to know when to restart a Container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a Container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
Sidenote: Actually, readiness should be a subset of liveness, meaning that readiness implies liveness (and failing liveness implies failing readiness). But that doesn't change the explanation above: if you have only a readiness probe, you can only infer when a restart is NOT needed, which is the same as not having any probe for restarting at all. Also, because the probes are defined separately, k8s has no guarantee that one is a subset of the other.
The readiness probe checks whether your application is ready to serve requests, and the pod is not added to the pool of ready pods until it satisfies the readiness checks. The main difference is that a pod is not restarted when it is not ready.
The liveness probe checks whether the container is still healthy (that is, whether it satisfies a specific condition). If it is not, the container is restarted in the hope that it recovers and becomes ready.
Kubernetes allows you to define a few things to keep the app available:
1: Liveness probes for your containers. They help keep your apps healthy by ensuring unhealthy containers are restarted automatically.
2: Readiness probes for your containers. They are invoked periodically and determine whether the specific Pod should receive client requests or not.
Operation of Readiness Probes
When a container is started, Kubernetes can be configured to wait for a configurable amount of time to pass before performing the first readiness check. After that, it invokes the probe periodically and acts based on the result of the readiness probe.
If a pod reports that it's not ready, it's removed from the service. The effect is the same as when the pod doesn't match the service's label selector at all.
If the pod then becomes ready again, it’s re-added.
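The behavior described above maps onto probe fields roughly as in the sketch below. The path, port, and values are hypothetical placeholders.

```yaml
# Probe fragment of a container spec; path, port and values are placeholders.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 20  # wait after container start before the first check
  periodSeconds: 5         # then check periodically
  failureThreshold: 3      # failures before the pod is marked not ready
  successThreshold: 1      # successes before it is marked ready again
```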
Important difference between liveness and readiness probes
1: Unlike with liveness probes, a container that fails the readiness check is not killed or restarted. As mentioned above, the pod is only removed from the service.
2: Liveness probes keep pods healthy by killing unhealthy containers and restarting them, whereas readiness probes make sure that only pods that are ready to serve requests receive them.
I hope this helps you better.