Kubernetes custom action when readiness probe fails

I am currently trying to scale a deployment automatically when the readiness probe has failed for the pods currently running.
A pod is IDLE until a POST request is sent to it, and while it is processing the request it does not answer any other request.
To know when processing is in progress, I created an endpoint returning TRUE if the pod is IDLE and FALSE otherwise.
I configured the readiness probe to query this endpoint, so a pod is marked as unavailable while processing is in progress (and marked available again when it is done).
By default I have a limited pool of pods (say 5) that can answer requests.
But I still want to be able to send another POST with other parameters to trigger another processing job when all 5 pods are unavailable.
So, when the readiness probe fails for all pods, I want to scale the deployment in order to have other pods available to answer requests.
The issue is that I did not find how to do such a thing with K8s, or whether it is even possible. Could someone help me with this?
An alternative would be to create a 'watcher' pod that watches the readiness of all pods in a given deployment and, when it fails for all of them, scales the deployment.
But this alternative implies development that I would like to avoid if this is natively possible in K8s.
Thank you :)

A readiness probe by itself can't scale a deployment. By default, the only thing it does is remove the Pod's IP from the endpoints of all the Services that match the Pod.
The only solution that comes to my mind is what you said: a Horizontal Pod Autoscaler with custom metrics, pointing to a pod that keeps track of all the readiness probes.
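As a rough sketch of that approach: if the watcher (or each pod itself) exposes a per-pod "busy" gauge through a custom metrics adapter such as prometheus-adapter, an HPA can scale on it. The metric name pod_busy and the 0.8 target below are assumptions, not anything Kubernetes provides out of the box, and depending on your cluster version the API may be autoscaling/v2beta2 rather than autoscaling/v2.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker                   # the deployment whose pods run the POST jobs
  minReplicas: 5
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: pod_busy             # hypothetical gauge: 1 while processing, 0 while idle
      target:
        type: AverageValue
        averageValue: "800m"       # scale up when ~80% of pods report themselves busy

With a target of 0.8, the HPA adds replicas roughly when more than 80% of the pods report themselves busy, which approximates "scale when (almost) all pods are unready".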

Related

Kubernetes HPA: settings for correct scale-down

I use Kubernetes in my project, especially HPA. Every minute, the project issues a check-status request to verify that all microservices are available. Availability is determined by a simple response from one replica (not all) of each microservice.
But I have one issue related to HPA. When HPA automatically decides to remove some pods from the cluster and my check-status request reaches the server at the same time, it very often happens that my API gateway routes the request to a pod that is being deleted and gets no response. That makes the microservice look unavailable to our server.
My question is: what is the best way to configure the autoscaler to avoid these cases?
This is not really related to HPA; it is more about how you gracefully shut down your pods.
In short, your Service/LB is not aware whether your pod is ready to accept new requests, so on a SIGTERM signal your pod should start failing its readiness probe and give the app some time to shut down. If the readiness probe is failing, the Service won't send new requests to your pod.
Then you can shut it down once all in-flight requests have been handled AND the pod is no longer receiving new ones.
I would advise reading these sources:
https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-best-practices-terminating-with-grace
https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html
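Building on those sources, a minimal sketch of a graceful shutdown setup could look like the following. The /healthz path, port, image, and sleep duration are assumptions you would adapt to your app; the idea is simply to keep the pod alive (but unready) long enough for the endpoint controller to stop routing traffic to it.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-microservice
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-microservice
  template:
    metadata:
      labels:
        app: my-microservice
    spec:
      terminationGracePeriodSeconds: 60      # total time before the pod is force-killed
      containers:
      - name: app
        image: my-microservice:latest        # placeholder image
        readinessProbe:
          httpGet:
            path: /healthz                   # hypothetical readiness endpoint
            port: 8080
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              # Delay SIGTERM handling so the Service has time to drop this pod
              # from its endpoints before the app starts shutting down.
              command: ["sh", "-c", "sleep 15"]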

What happens to traffic to a temporarily unavailable pod in a StatefulSet?

I've recently been reading up on Kubernetes and want to create a StatefulSet for a service of mine.
As far as I understood, a StatefulSet with, let's say, 5 replicas offers certain DNS entries to reach it.
E.g. myservice1.internaldns.net, myservice2.internaldns.net
What would happen if one of the pods behind the DNS entries goes down, even if it's just for a short amount of time?
I had a hard time finding information on this.
Is the request held until the pod is back? Will it be routed to another pod, possibly losing the respective state? Will it just straight-up fail?
If your Pod is not ready, then traffic is not forwarded to that Pod. So your Service will not load-balance traffic to Pods that are not ready.
To decide whether a given Pod is ready or not, you should define a readinessProbe. I recommend reading the Kubernetes documentation on "Configure Liveness, Readiness and Startup Probes".
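Purely as an illustration, a readiness probe on a StatefulSet container might look like this; the /ready path, port 8080, and image are assumptions for your service:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myservice
spec:
  serviceName: myservice                 # headless Service providing the per-pod DNS names
  replicas: 5
  selector:
    matchLabels:
      app: myservice
  template:
    metadata:
      labels:
        app: myservice
    spec:
      containers:
      - name: myservice
        image: myservice:latest          # placeholder image
        readinessProbe:
          httpGet:
            path: /ready                 # hypothetical readiness endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10

With this in place, a pod that fails the probe is removed from the Service's endpoints until it passes again.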

Is the container where liveness or readiness probes are configured a "pod check" container?

I'm following the task "Configure Liveness, Readiness and Startup Probes"
and it's unclear to me whether the container where the check is made is a container used only to check the availability of a pod. That would make sense: if the pod-check container fails, the API won't let any traffic into the pod.
Or must the health-check signal come from the container where the actual image or app runs? (sorry, another question)
From the link you provided, it seems they are speaking about Containers and not Pods, so the probes are meant to be per container. When all containers are ready, the Pod is considered ready too, as written in the doc you provided:
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
So yes, every container that runs an image or app is supposed to expose those checks.
Liveness and readiness probes, as described by Ko2r, are additional checks inside your containers, verified by the kubelet according to the settings of the particular probe:
If the command (defined by health-check) succeeds, it returns 0, and the kubelet considers the Container to be alive and healthy. If the command returns a non-zero value, the kubelet kills the Container and restarts it.
In addition:
The kubelet uses liveness probes to know when to restart a Container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a Container in such a state can help to make the application more available despite bugs.
From another point of view:
Pod is a top-level resource in the Kubernetes REST API.
As per docs:
Pods are ephemeral. They are not designed to run forever, and when a Pod is terminated it cannot be brought back. In general, Pods do not disappear until they are deleted by a user or by a controller.
Information about controllers can be found in the Kubernetes documentation.
So the best practice is to use controllers as described above. You'll rarely create individual Pods directly in Kubernetes, even singleton Pods. This is because Pods are designed as relatively ephemeral, disposable entities. When a Pod gets created (directly by you, or indirectly by a Controller), it is scheduled to run on a Node in your cluster. The Pod remains on that Node until the process is terminated, the Pod object is deleted, the Pod is evicted for lack of resources, or the Node fails.
Note:
Restarting a container in a Pod should not be confused with restarting the Pod. The Pod itself does not run, but is an environment the containers run in and persists until it is deleted
Because Pods represent running processes on nodes in the cluster, it is important to allow those processes to gracefully terminate when they are no longer needed (vs being violently killed with a KILL signal and having no chance to clean up). Users should be able to request deletion and know when processes terminate, but also be able to ensure that deletes eventually complete. When a user requests deletion of a Pod, the system records the intended grace period before the Pod is allowed to be forcefully killed, and a TERM signal is sent to the main process in each container. Once the grace period has expired, the KILL signal is sent to those processes, and the Pod is then deleted from the API server. If the Kubelet or the container manager is restarted while waiting for processes to terminate, the termination will be retried with the full grace period.
The Kubernetes API server validates and configures data for the api objects which include pods, services, replicationcontrollers, and others. The API Server services REST operations and provides the frontend to the cluster’s shared state through which all other components interact.
For example, when you use the Kubernetes API to create a Deployment, you provide a new desired state for the system. The Kubernetes Control Plane records that object creation, and carries out your instructions by starting the required applications and scheduling them to cluster nodes–thus making the cluster’s actual state match the desired state.
Here you can find information about processing pod termination.
There are different types of probes.
HTTP: even if your app isn't an HTTP server, you can create a lightweight HTTP server inside your app to respond to the liveness probe.
Command: Kubernetes runs a command inside your container. If the command returns with exit code 0, the container is marked as healthy.
More about probes and best practices can be found in the documentation.
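As a small sketch mirroring the docs task (the busybox image and the /tmp/healthy file are just the example used there), a command probe on the application container could be declared like this:

apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
spec:
  containers:
  - name: app
    image: busybox
    args: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 3600"]
    livenessProbe:
      exec:
        # Exit code 0 (the file exists) means the container is considered alive.
        command: ["cat", "/tmp/healthy"]
      initialDelaySeconds: 5
      periodSeconds: 5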
Hope this helps.

How do I make Kubernetes evict a pod that has not been ready for a period of time?

I have readiness probes configured on several pods (which are members of deployment-managed replica sets). They work as expected -- readiness is required as part of the deployment's rollout strategy, and if a healthy pod becomes NotReady, the associated Service will remove it from the pool of endpoints until it becomes Ready again.
Furthermore, I have external health checking (using Sensu) that alerts me when a pod becomes NotReady.
Sometimes, a pod will report NotReady for an extended period of time, showing no sign of recovery. I would like to configure things such that, if a pod stays in NotReady for an extended period of time, it gets evicted from the node and rescheduled elsewhere. I'll settle for a mechanism that simply kills the container (leading it to be restarted in the same pod), but what I really want is for the pod to be evicted and rescheduled.
I can't seem to find anything that does this. Does it exist? Most of my searching turns up things about evicting pods from NotReady nodes, which is not what I'm looking for at all.
If this doesn't exist, why? Is there some other mechanism I should be using to accomplish something equivalent?
EDIT: I should have specified that I also have liveness probes configured and working the way I want. In the scenario I’m talking about, the pods are “alive.” My liveness probe is configured to detect more severe failures and restart accordingly and is working as desired.
I’m really looking for the ability to evict based on a pod being live but not ready for an extended period of time.
I guess what I really want is the ability to configure an arbitrary number of probes, each with different expectations it checks for, and each with different actions it will take if a certain failure threshold is reached. As it is now, liveness failures have only one method of recourse (restart the container), and readiness failures also have just one (just wait). I want to be able to configure any action.
As of Kubernetes v1.15, you might want to use a combination of a readiness probe and a liveness probe to achieve the outcome that you want. See "Configure Liveness, Readiness and Startup Probes".
A new feature that holds off the liveness probe until the container has started is likely to be introduced in v1.16: a new probe called startupProbe that handles this in a more intuitive manner.
You can configure an HTTP liveness probe or a TCP liveness probe with periodSeconds, depending on the type of the container image.
livenessProbe:
  .....
  initialDelaySeconds: 3
  periodSeconds: 5     # the kubelet should perform the liveness probe every 5 seconds
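For completeness, once startupProbe is available it can be combined with a liveness probe roughly as below; the /healthz endpoint, port, and thresholds are assumptions:

livenessProbe:
  httpGet:
    path: /healthz               # hypothetical health endpoint
    port: 8080
  periodSeconds: 5
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30           # allow up to ~5 minutes of startup before liveness takes over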
You may also try to use Prometheus metrics for that purpose and create an alert like here. Based on that you can configure a webhook in Alertmanager which reacts appropriately (action: kill the Pod), and the Pod will then be recreated by its controller.
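A sketch of such an alert, assuming kube-state-metrics is installed so the kube_pod_status_ready metric exists (the rule name and the 15-minute threshold are made up):

groups:
- name: pod-readiness
  rules:
  - alert: PodNotReadyTooLong
    # kube_pod_status_ready is exported by kube-state-metrics
    expr: kube_pod_status_ready{condition="false"} == 1
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been NotReady for 15 minutes"

The Alertmanager webhook receiver would then run something like kubectl delete pod on the offending pod, and its controller recreates it elsewhere.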

Kubernetes default liveness and readiness probe

I wanted to know what Kubernetes does by default to check the liveness and readiness of a pod and its containers.
I could find the documentation that describes how to add custom probes and change probe parameters like the initial delay, etc. However, I could not find the default probe method used by K8s.
By default, Kubernetes starts sending traffic to a pod when all the containers inside the pod have started, and restarts containers when they crash. While this can be good enough when you are starting out, you can make your deployment more robust by creating custom health checks.
By default, Kubernetes just checks that the container inside the pod is up and then starts sending traffic. There is no default readiness or liveness check provided by Kubernetes.
Readiness Probe
Let’s imagine that your app takes a minute to warm up and start. Your service won’t work until it is up and running, even though the process has started. You will also have issues if you want to scale up this deployment to have multiple copies. A new copy shouldn’t receive traffic until it is fully ready, but by default Kubernetes starts sending it traffic as soon as the process inside the container starts. By using a readiness probe, Kubernetes waits until the app is fully started before it allows the service to send traffic to the new copy.
Liveness Probe
Let’s imagine another scenario where your app has a nasty case of deadlock, causing it to hang indefinitely and stop serving requests. Because the process continues to run, by default Kubernetes thinks that everything is fine and continues to send requests to the broken pod. By using a liveness probe, Kubernetes detects that the app is no longer serving requests and restarts the offending pod.
TL;DR: there is no default readiness probe ("should I send this pod traffic?"), and the default liveness behavior ("should I kill this pod?") is just whether the container is still running.
Kubernetes will not do anything on its own. You have to decide what liveness and readiness mean to you. There are multiple things you can do, for example an HTTP GET request, running a command, or connecting to a port. It is up to you to decide how you want to make sure your users are happy and everything is running correctly.
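As a hedged sketch of what that can look like in practice (image names, paths, ports, and the command are placeholders, not defaults):

apiVersion: v1
kind: Pod
metadata:
  name: probe-examples
spec:
  containers:
  - name: app
    image: my-app:latest                  # placeholder image
    readinessProbe:
      httpGet:                            # HTTP GET: ready only while /ready returns a 2xx/3xx
        path: /ready
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
    livenessProbe:
      tcpSocket:                          # TCP: alive as long as the port accepts connections
        port: 8080
      periodSeconds: 10
  - name: sidecar
    image: my-sidecar:latest              # placeholder image
    livenessProbe:
      exec:                               # command: alive while the command exits with code 0
        command: ["cat", "/tmp/healthy"]
      periodSeconds: 5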