How long do Kubernetes Pods persist?

How long does a pod persist without a replication controller?
I have run some pods that have a very simple purpose: they execute and then terminate. Other pods, like a database server pod, persist for quite a long time. However, after a day or so, the pod terminates. I know Docker containers exit once their process has finished running, but why would my database pods continue running for a while and then randomly exit?
What controls the termination of a pod?

The easiest way for you to find a definitive answer to that question is to run kubectl describe pod <podName> or kubectl get events. Any pod termination will have an associated event that you can use to diagnose the reason.
Pods may die for several reasons, ranging from errors within the container to a node going down for maintenance. You can usually set the appropriate restartPolicy, which will restart the pod if it fails (except in the case of node failure). If you have multiple nodes and would like the pod to be restarted on a different node, you should use a higher-level controller like a ReplicaSet or Deployment.
For pods that are expected to terminate on their own, a Job is better suited.
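For illustration (this example is not from the original answer; the name and image are placeholders), a minimal Deployment that keeps a long-running pod such as a database scheduled even after container or node failures might look roughly like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: db                  # placeholder name
spec:
  replicas: 1               # the Deployment recreates the pod if it disappears
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: postgres:15  # placeholder image

The pod template's restartPolicy defaults to Always here, so the kubelet also restarts the container in place if its process exits.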

Related

How to prevent or fix Kubernetes pod getting stuck in containerCreating occasionally

I'm running AWS EKS on Fargate and using Kubernetes to orchestrate multiple cron jobs. I spin roughly 1000 pods up and down over the course of a day.
Very seldom (once every 3 weeks) one of the pods gets stuck in ContainerCreating and just hangs there, and because I have concurrency disabled, that particular job will never run. The fix is simply terminating the job or the pod and having it restart, but this is a manual intervention.
Is there a way to get a pod to terminate or restart if it takes too long to create?
The reason for the pod getting stuck varies quite a bit, so a solution would need to be general. It can be a time-based solution, as all the pods run the same code with different configurations, so the startup time is relatively consistent.
Sadly, there is no mechanism to stop a Job if it fails at image pulling or container creation. I also tried to achieve what you are trying to do.
You can set a backoffLimit inside your Job template, but it only counts retries of pods that fail while running, not pods stuck in ContainerCreating.
What you can do is write a script that runs kubectl describe on each pod in the namespace, parses the output, and restarts the pod if it is stuck in ContainerCreating.
Or try to debug/trace what is causing this: run kubectl describe pod while the pod is in ContainerCreating to get more information.
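For reference, a rough sketch of where backoffLimit sits in a Job spec (the name, image and command are placeholders); as noted above, it only counts pods that fail after they have started running:

apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-task                      # placeholder name
spec:
  backoffLimit: 2                         # give up after 2 failed pod runs
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker
        image: my-cron-image:latest       # placeholder image
        command: ["sh", "-c", "run-task"] # placeholder command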

What are the conditions for a pod to be treated as failed

I am aware of how a ReplicaSet works and how it reconciles the state from its specification.
However, I am not completely aware of all the criteria a ReplicaSet uses to reconcile the state.
I took a look at the documentation to understand the scenarios:
https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
One scenario is when the pod is down for any reason, such as an application issue.
Another is when the node is down.
What are all the other scenarios? If the pod is stuck and not making progress, will the ReplicaSet take care of that? Or does it just check whether the pod is alive or not?
If the pod is stuck and not making progress, will the ReplicaSet take care of that?
As long as the main process inside a container is running, the container is considered healthy by default and will be treated as such. If there is an application issue which prevents your application from working correctly but the main process is still running, you will be stuck with a pod that is effectively unhealthy but still treated as healthy.
That is the reason why you want to implement a livenessProbe for your containers and specify what behavior represents a healthy state of the container. In such a scenario, failing the health check multiple times (the threshold is configurable) will result in the container being treated as failed, and your ReplicaSet will take action.
An example might be a simple HTTP GET request to some predefined path if you are running a web application in your pod (e.g. /api/health). Even if the main process is running, your application needs to respond to this periodic health-check query, otherwise the container will be restarted.
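A minimal sketch of that idea, assuming a hypothetical /api/health endpoint on port 8080 (the pod name, image, and timings are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: web                       # placeholder name
spec:
  containers:
  - name: web
    image: my-web-app:latest      # placeholder image
    livenessProbe:
      httpGet:
        path: /api/health         # hypothetical health endpoint
        port: 8080
      initialDelaySeconds: 10     # wait before the first probe
      periodSeconds: 5            # probe every 5 seconds
      failureThreshold: 3         # 3 consecutive failures -> container restart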
If neither the Pod nor the Node is down, the Pod will only fail and a new one will only be created if you have a liveness probe defined.
If you don't have one implemented, k8s will never know that your Pod is not up and running.
Take a look at this doc page for more info.
OOMKilled issue: this causes the pod to be killed and restarted.
CPU limit issue: this causes errors (e.g. 404s) but does not restart the pod.

Can a pod be forcefully marked as "terminating" using liveness probe or any other way

We have a use case in our application where we need the pod to terminate after it processes a request. The corresponding deployment will take care of spinning up a new pod to maintain the replica count.
I was exploring using liveness probes, but they only restart the containers, not the pods.
Is there any other way to terminate the pod, from service level or deployment level?
You need to get familiar with Pod lifetime:
In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by terminated-pod-gc-threshold in the master) will expire and be automatically destroyed.
In your case consider using Jobs.
Use a Job for Pods that are expected to terminate, for example, batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.
Please let me know if that helped.

Is the container where liveness or readiness probes are configured a dedicated "pod check" container?

I'm following this task: Configure Liveness, Readiness and Startup Probes,
and it's unclear to me whether the container where the check is made is a container used only to check the availability of a pod. That would make sense: if the pod-check container fails, the API won't let any traffic in to the pod.
So must the health-check signal come from the container where the actual image or app runs? (sorry, another question)
From the link you provided, it seems like they are speaking about Containers and not Pods, so the probes are meant to be per container. When all containers are ready, the pod is described as ready too, as written in the doc you provided:
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
So yes, every container that runs some image or app is supposed to expose those checks.
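As an illustrative sketch (the name, image, path and port are placeholders), a per-container readiness check could look like this; while the probe fails, the pod is simply removed from Service endpoints rather than restarted:

apiVersion: v1
kind: Pod
metadata:
  name: api                      # placeholder name
spec:
  containers:
  - name: api
    image: my-api:latest         # placeholder image
    readinessProbe:
      httpGet:
        path: /ready             # hypothetical readiness endpoint
        port: 8080
      periodSeconds: 5
      failureThreshold: 1        # one failure marks the container not ready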
Liveness and readiness probes, as described by Ko2r, are additional checks inside your containers, verified by the kubelet according to the settings for the particular probe:
If the command (defined by health-check) succeeds, it returns 0, and the kubelet considers the Container to be alive and healthy. If the command returns a non-zero value, the kubelet kills the Container and restarts it.
In addition:
The kubelet uses liveness probes to know when to restart a Container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a Container in such a state can help to make the application more available despite bugs.
From another point of view:
Pod is a top-level resource in the Kubernetes REST API.
As per docs:
Pods are ephemeral. They are not designed to run forever, and when a Pod is terminated it cannot be brought back. In general, Pods do not disappear until they are deleted by a user or by a controller.
Information about controllers can be found here:
So the best practice is to use controllers like those described above. You'll rarely create individual Pods directly in Kubernetes, even singleton Pods. This is because Pods are designed as relatively ephemeral, disposable entities. When a Pod gets created (directly by you, or indirectly by a Controller), it is scheduled to run on a Node in your cluster. The Pod remains on that Node until the process is terminated, the Pod object is deleted, the Pod is evicted for lack of resources, or the Node fails.
Note:
Restarting a container in a Pod should not be confused with restarting the Pod. The Pod itself does not run, but is an environment the containers run in; it persists until it is deleted.
Because Pods represent running processes on nodes in the cluster, it is important to allow those processes to gracefully terminate when they are no longer needed (vs being violently killed with a KILL signal and having no chance to clean up). Users should be able to request deletion and know when processes terminate, but also be able to ensure that deletes eventually complete. When a user requests deletion of a Pod, the system records the intended grace period before the Pod is allowed to be forcefully killed, and a TERM signal is sent to the main process in each container. Once the grace period has expired, the KILL signal is sent to those processes, and the Pod is then deleted from the API server. If the Kubelet or the container manager is restarted while waiting for processes to terminate, the termination will be retried with the full grace period.
The Kubernetes API server validates and configures data for the api objects which include pods, services, replicationcontrollers, and others. The API Server services REST operations and provides the frontend to the cluster’s shared state through which all other components interact.
For example, when you use the Kubernetes API to create a Deployment, you provide a new desired state for the system. The Kubernetes Control Plane records that object creation, and carries out your instructions by starting the required applications and scheduling them to cluster nodes–thus making the cluster’s actual state match the desired state.
Here you can find information about processing pod termination.
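As a small sketch of how that grace period maps to a manifest (the name and image are placeholders), terminationGracePeriodSeconds, which defaults to 30, is set at the pod level:

apiVersion: v1
kind: Pod
metadata:
  name: graceful-shutdown-demo        # placeholder name
spec:
  terminationGracePeriodSeconds: 60   # time between TERM and KILL on deletion
  containers:
  - name: app
    image: my-app:latest              # placeholder image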
There are different kinds of probes.
For example, for an HTTP probe:
even if your app isn't an HTTP server, you can create a lightweight HTTP server inside your app to respond to the liveness probe.
For command probes, Kubernetes runs a command inside your container. If the command returns with exit code 0, the container is marked as healthy.
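A minimal sketch of such a command probe, assuming the container maintains a /tmp/healthy marker file (the name, image and file path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: exec-liveness-demo                # placeholder name
spec:
  containers:
  - name: app
    image: busybox                        # placeholder image
    command: ["sh", "-c", "touch /tmp/healthy; sleep 3600"]
    livenessProbe:
      exec:
        command: ["cat", "/tmp/healthy"]  # exit code 0 -> healthy
      initialDelaySeconds: 5
      periodSeconds: 10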
More about probes and best practices can be found in the documentation.
Hope this helps.

Configure Kubernetes StatefulSet to start all pods first, then restart failed containers after start?

Basic info
Hi, I'm encountering a problem with Kubernetes StatefulSets. I'm trying to spin up a set with 3 replicas.
These replicas/pods each have a container which pings a container in the other pods based on their network-id.
The container requires a response from all the pods. If it does not get a response the container will fail. In my situation I need 3 pods/replicas for my setup to work.
Problem description
What happens is the following: Kubernetes starts 2 pods rather fast. However, since I need 3 pods for a fully functional cluster, the first 2 pods keep crashing as the 3rd is not up yet.
For some reason Kubernetes opts to keep restarting both pods instead of adding the 3rd pod so that my cluster can function.
I've seen my setup run properly after about 15 minutes because Kubernetes added the 3rd pod by then.
Question
So, my question.
Does anyone know a way to delay restarting failed containers until the desired amount of pods/replicas have been booted?
I've since found out the cause of this.
StatefulSets launch pods in a specific order. If one of the pods fails to launch, it does not launch the next one.
You can add podManagementPolicy: "Parallel" to your StatefulSet spec to launch the pods without waiting for previous pods to be Running.
See this documentation.
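A rough sketch of where podManagementPolicy goes in the StatefulSet spec (the names and image are placeholders, and the headless Service is assumed to exist):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-cluster                       # placeholder name
spec:
  serviceName: my-cluster                # assumed headless Service
  replicas: 3
  podManagementPolicy: Parallel          # create all pods at once, not one by one
  selector:
    matchLabels:
      app: my-cluster
  template:
    metadata:
      labels:
        app: my-cluster
    spec:
      containers:
      - name: node
        image: my-cluster-image:latest   # placeholder image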
I think a better way to deal with your problem is to leverage a liveness probe, as described in the documentation, rather than delaying the restart time (which is not configurable in the YAML).
Your pods respond to the liveness probe right after they start, to let Kubernetes know they are alive, which prevents them from being restarted. Meanwhile, each pod keeps pinging the others until they are all up; only when all your pods are started will they serve external requests. This is similar to bootstrapping a ZooKeeper ensemble.