Kubernetes Update Policy Wait Until Actually Ready

I have a K8s cluster running with a Deployment that has an update strategy of RollingUpdate. How do I get Kubernetes to wait an extra number of seconds, or until some condition is met, before marking a container as ready after it starts?
An example would be an API server with no downtime when deploying an update. After the container starts it still needs X seconds before it is ready to start serving HTTP requests. If Kubernetes marks it as ready immediately once the container starts, while the API server isn't actually ready, some HTTP requests will fail for a brief time window.

Posting David Maze's comment as community wiki for better visibility:
You need a readiness probe; the pod won't show as "ready" (and the deployment won't proceed) until the probe passes.
Example:
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
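For an HTTP API server like the one in the question, an httpGet check is usually a more direct fit than the exec example above. A minimal sketch, assuming the server exposes a health endpoint at a hypothetical /healthz path on port 8080:
readinessProbe:
  httpGet:
    path: /healthz          # hypothetical endpoint exposed by the API server
    port: 8080
  initialDelaySeconds: 10   # rough allowance for warm-up; tune to the app
  periodSeconds: 5          # keep re-checking every 5 seconds until it returns 2xx
  failureThreshold: 3       # allow a few misses before marking the pod unready
Until this probe succeeds, the pod is not added to the Service endpoints, so the rolling update will not shift traffic to it prematurely.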

Related

Is readinessProbe used amidst rolling deployment?

In the below YAML syntax:
readinessProbe:
  httpGet:
    path: /index.html
    port: 80
  initialDelaySeconds: 3
  timeoutSeconds: 3
  periodSeconds: 10
  failureThreshold: 3
The readiness probe is used during the initial deployment of a Pod.
When rolling out a new version of the application using the rolling deployment strategy, is the readiness probe used for the rolling deployment?
The path & port fields allow specifying the URL and port number of a specific service, but not of a dependent service. How can I verify whether a dependent service is also ready?
Using rolling deployment strategy, is readiness probe used for rolling deployment?
Yes; the new version of the Pods is rolled out, and the older Pods are not terminated until the new version has Pods in a ready state.
E.g. if you roll out a new version that has a bug so that its Pods do not become ready, the old Pods will keep running and traffic is only routed to the ready old Pods.
Also, if you don't specify a readinessProbe, the process status is used instead, e.g. a process that has terminated will not be seen as ready.
How to verify if a dependent service is also ready?
You can configure a custom readinessProbe, e.g. an HTTP endpoint on /healthz, and it is up to you what logic you want to use in the implementation of that endpoint. An HTTP response code of 2xx is seen as ready.
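A minimal sketch of such a probe, assuming a hypothetical /healthz endpoint on port 8080 whose handler only returns 2xx once the dependent service (e.g. the database) is reachable:
readinessProbe:
  httpGet:
    path: /healthz       # hypothetical endpoint; its handler decides what "ready" means,
    port: 8080           # e.g. it can return 503 until the dependent service answers
  periodSeconds: 10
  failureThreshold: 3    # after 3 failed checks the pod is removed from the Service endpoints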

Why do I need 3 different kinds of probes in Kubernetes: startupProbe, readinessProbe, livenessProbe

Why do I need 3 different kinds of probes in Kubernetes:
startupProbe
readinessProbe
livenessProbe
There are some questions (k8s - livenessProbe vs readinessProbe, Setting up a readiness, liveness or startup probe) and articles about this topic. But this is not so clear:
Why do I need 3 different kinds of probes?
What are the use cases?
What are the best practices?
These 3 kinds of probes have 3 different use cases. That's why we need all 3 kinds of probes.
Liveness Probe
If the Liveness Probe fails, the container will be restarted (read more about failureThreshold).
Use case: Restart the pod's container if it is dead.
Best practices: Only include basic checks in the liveness probe. Never include checks on connections to other services (e.g. the database). The check shouldn't take too long to complete.
Always specify a light Liveness Probe to make sure that the container will be restarted if it is really dead.
Startup Probe
Startup Probes check whether the pod is available after startup.
Use case: Send traffic to the pod as soon as it is available after startup. Startup probes may take longer to complete because they are only called during initialization. They might call a warmup task (but also consider init containers for initialization). After the Startup Probe succeeds, the Liveness Probe is called.
Best practices: Specify a Startup Probe if the pod takes a long time to start. The Startup and Liveness Probe can use the same endpoint, but the Startup Probe can have a less strict failure threshold, which prevents a failure on startup (see Kubernetes in Action).
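As a rough sketch of that pattern (the path and port are placeholders, not from the original answer): both probes hit the same endpoint, but the startup probe tolerates a much longer start-up window before the kubelet gives up.
startupProbe:
  httpGet:
    path: /healthz       # same hypothetical endpoint for both probes
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # up to ~300 seconds allowed for startup
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 3    # once started, 3 consecutive misses trigger a container restart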
Readiness Probe
In contrast to Startup Probes, Readiness Probes check whether the pod is available during its complete lifecycle.
In contrast to Liveness Probes, only the traffic to the pod is stopped if the Readiness Probe fails; there is no restart.
Use case: Stop sending traffic to the pod if the pod temporarily cannot serve because a connection to another service (e.g. the database) has failed and the pod will recover later.
Best practices: Include all necessary checks, including connections to vital services. Nevertheless, the check shouldn't take too long to complete.
Always specify a Readiness Probe to make sure that the pod only gets traffic if it can properly handle incoming requests.
Documentation
This article explains the differences between the 3 kinds of probes very well.
The official Kubernetes documentation gives a good overview of all configuration options.
Best practices for probes.
The book Kubernetes in Action gives the most detailed insights into best practices.
The difference between livenessProbe, readinessProbe, and startupProbe
livenessProbe:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
It is used to indicate if the container has started and is alive or not, i.e. proof of being available.
In the given example, if the request fails, it will restart the container.
If not provided the default state is Success.
readinessProbe:
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
It is used to indicate if the container is ready to serve traffic or not, i.e. proof of being ready to use.
It checks dependencies like database connections or other services your container depends on to fulfill its work.
In the given example, until the request returns Success, the Pod won't serve any traffic (its IP address is removed from the endpoints of all Services that match the Pod).
Kubernetes relies on readiness probes during rolling updates: it keeps the old container up and running until the new service declares that it is ready to take traffic.
If not provided the default state is Success.
startupProbe:
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
It is used to indicate if the application inside the Container has started.
If a startup probe is provided, all other probes are disabled until it succeeds.
In the given example, if the request fails, it will restart the container.
Once the startup probe has succeeded once, the liveness probe takes over to provide a fast response to container deadlocks.
If not provided the default state is Success.
Check the K8s documentation for more.
I think the summary below describes the use cases for each.
Readiness Probe
  Examines: whether the container is ready to service requests.
  On failure: the endpoints controller removes the pod's IP address from the endpoints of all services that match the pod.
  Default case: the default state of readiness before the initial delay is Failure; if a container does not provide a readiness probe, the default state is Success.
Liveness Probe
  Examines: whether the container is running.
  On failure: the kubelet kills the container, and the container is subjected to its restart policy.
  Default case: if a container does not provide a liveness probe, the default state is Success.
Startup Probe
  Examines: whether the application within the container has started.
  On failure: the kubelet kills the container, and the container is subjected to its restart policy.
  Default case: if a container does not provide a startup probe, the default state is Success.
Sources:
Kubernetes in Action
Here's a concrete example of one we're using in our app. It has a single crude HTTP healthcheck, accessible on http://hostname:8080/management/health.
ports:
- containerPort: 8080
  name: web-traffic
App Initialization (startup)
Spring app that is slow to start - anywhere between 30-120 seconds.
Don't want other probes to run until app is started.
Check it every 10 seconds for up to 180s before k8s gets into a crash loop.
startupProbe:
  successThreshold: 1
  failureThreshold: 18
  periodSeconds: 10
  timeoutSeconds: 5
  httpGet:
    path: /management/health
    port: web-traffic
Healthcheck (readiness)
Ping the app every 10 seconds to make sure it's healthy (i.e. accepting HTTP requests).
If it fails two consecutive pings, cordon it off (prevents cascades).
It must pass two consecutive health checks before it can accept traffic again.
readinessProbe:
  successThreshold: 2
  failureThreshold: 2
  periodSeconds: 10
  timeoutSeconds: 5
  httpGet:
    path: /management/health
    port: web-traffic
App has Died (liveness)
If the app fails 3 consecutive health checks, 30 seconds apart, reboot the container. Maybe the app got into an unrecoverable state, like Java running out of heap memory.
livenessProbe:
  successThreshold: 1
  failureThreshold: 3
  periodSeconds: 30
  timeoutSeconds: 5
  httpGet:
    path: /management/health
    port: web-traffic

how to quickly fail the Kubernetes Readiness probe?

In case a pod goes down in my cluster, it takes around 15 seconds or more for the readiness probe logic to detect the failure, which is not acceptable because calls fail in the meantime (Kubernetes has not yet identified the pod failure, so it keeps sending traffic to the failed pod; the failed pod is still among the Service's endpoints).
Please suggest how to fail the readiness probe immediately, or how to remove the endpoint immediately in case of failure, without reducing periodSeconds to below 5 seconds.
Below is my configuration:
initialDelaySeconds: 90
periodSeconds: 5
timeoutSeconds: 2
successThreshold: <default>
failureThreshold: <default>
Thanks in advance.
What you can do is adjust your probe's configuration in order to meet your requirements:
Probes have a number of fields that you can use to more precisely control the behavior of liveness and readiness checks:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of a liveness probe means restarting the container. In case of a readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
You haven't specified the failureThreshold, so it defaults to 3. With the values you are currently using it takes ~15-20 seconds to consider the pod as failed and mark it Unready.
If you set the minimal values for periodSeconds, timeoutSeconds, successThreshold and failureThreshold you can expect more frequent checks and faster removal of the failed pod from the Service endpoints.
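As a hedged sketch, a readiness probe tuned toward faster failure detection while keeping periodSeconds at 5 might look like this (the endpoint and values are illustrative, not a recommendation):
readinessProbe:
  httpGet:
    path: /healthz        # placeholder path and port
    port: 8080
  initialDelaySeconds: 90
  periodSeconds: 5
  timeoutSeconds: 1       # give up quickly on a slow response
  successThreshold: 1
  failureThreshold: 1     # a single failed check marks the pod unready (~5-6s worst case)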

Does readinessProbe keep pinging the url

I am using a readinessProbe for rolling updates. It works fine, but even after the pods come up it keeps pinging healthz. I was assuming it would stop pinging once the pods are up and running. Is that right?
spec:
  containers:
  - name: ready
    readinessProbe:
      httpGet:
        path: /healthz
        port: 80
The kubelet will continue to run this check every 10 seconds, which is the default value. You can customize it according to your needs.
It is important data for the kubelet to understand whether the container is healthy or not; if it is not healthy, traffic is no longer routed to it (and a failing liveness probe triggers a restart). Therefore, it is a continuous process; that's how application availability is achieved.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
For further detail, see configure-probes.
readinessProbe and livenessProbe will continue to do the check depending on the periodSeconds you have set, or the default value, which is 10 seconds.
readinessProbe and livenessProbe perform the same kind of check. The difference is the action taken in case of failure.
readinessProbe will shut down communication with the Service in case of failure, so that the Service does not send any requests to the Pod.
livenessProbe will restart the container in case of failure.
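As an illustrative sketch (reusing the /healthz endpoint and port from the question above; everything else is hedged), both probes can be set on the same container with different consequences on failure:
containers:
- name: ready
  readinessProbe:          # on failure: the Pod is removed from the Service endpoints, no restart
    httpGet:
      path: /healthz
      port: 80
    periodSeconds: 10
  livenessProbe:           # on failure: the kubelet kills the container and the restart policy applies
    httpGet:
      path: /healthz
      port: 80
    periodSeconds: 10
    failureThreshold: 3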

Kubernetes livenessProbe: restarting vs destroying of the pod

Is there a way to tell Kubernetes to just destroy a pod and create a new one if the liveness probe fails? What I see from the logs now: my Node.js application is just restarted and runs in the same pod.
The liveness probe is defined in my YAML specification as follows:
livenessProbe:
  httpGet:
    path: /app/check/status
    port: 3000
    httpHeaders:
    - name: Accept
      value: application/x-www-form-urlencoded
  initialDelaySeconds: 60
  periodSeconds: 60
Disclaimer:
I am fully aware that recreating a pod if a liveness probe fails is probably not the best idea and that the right way would be to get a notification that something is going on.
Liveness and readiness probes are defined on containers, not pods. So if you have 1 container in your pod and you set restartPolicy to Never, your pod will go into a Failed state when the probe fails and will be scrapped at some point based on the terminated-pod-gc-threshold value.
If you have more than one container in your pod it becomes trickier, because the other container(s) keep running, which keeps the pod in Running status. You can build your own automation or try Pod Readiness, which was still in alpha as of this writing.
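A minimal sketch of the single-container case described above (the pod name and image are placeholders; a bare Pod is used because a Deployment only allows restartPolicy: Always): with restartPolicy set to Never, a liveness failure leaves the pod in the Failed state instead of restarting the container in place.
apiVersion: v1
kind: Pod
metadata:
  name: status-check-demo      # hypothetical name
spec:
  restartPolicy: Never         # the container is not restarted in place when the liveness probe fails
  containers:
  - name: app
    image: my-node-app:latest  # placeholder image
    livenessProbe:
      httpGet:
        path: /app/check/status
        port: 3000
      initialDelaySeconds: 60
      periodSeconds: 60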