How can a pod have status ready and terminating? - kubernetes

Curiously, I saw that a pod I had had both ready 1/1 status and status terminating when I ran kubectl get pods. Are these states not mutually exclusive? Why or why not?
For context, this was noticed immediately after I had killed skaffold so these pods were in the middle of shutting down.

A pod in the Terminating state can still be functioning. Termination can be delayed for many reasons (e.g. a PVC is attached, or other pods are being terminated at the same time). You can test this by running the following against a pod that has a PVC attached or some other reason for a delayed termination:
$ kubectl delete pod mypod-xxxxx-xxxxxx
pod mypod-xxxxx-xxxxxx deleted
$ kubectl delete pod mypod-xxxxx-xxxxxx
pod mypod-xxxxx-xxxxxx deleted
$ kubectl apply -f mypod.yaml
pod mypod-xxxxx-xxxxxx configured
Sometimes this happens because the pod is still within its termination period and functioning normally, so it is treated as an existing pod that gets configured (setting aside the fact that you usually can't configure pods like this, but you get the point).

The READY column shows how many of the pod's containers are ready, out of the total (1/1 means one of one).
The status Terminating means no more traffic is being sent to that pod by the controllers. From the Kubernetes docs:
When a user requests deletion of a pod, the system records the
intended grace period before the pod is allowed to be forcefully
killed, and a TERM signal is sent to the main process in each
container. Once the grace period has expired, the KILL signal is sent
to those processes, and the pod is then deleted from the API server.
That is the state the pod is in: a TERM signal has been sent, and the containers are still up, finishing whatever work they already had.
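To see why the two columns are not mutually exclusive, it can help to read the underlying fields separately. The following is a minimal sketch, assuming kubectl is configured for the cluster and assuming (as kubectl appears to do) that STATUS reads Terminating once metadata.deletionTimestamp is set, while READY is derived from each container's ready flag. The helper names pod_shutdown_view and is_terminating are made up for illustration.

```shell
# Sketch only: pod_shutdown_view and is_terminating are hypothetical helpers.

# Print the pod's deletionTimestamp and its per-container ready flags,
# separated by a tab. An empty first field means no deletion was requested.
pod_shutdown_view() {
  kubectl get pod "$1" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\t"}{.status.containerStatuses[*].ready}{"\n"}'
}

# Succeed (exit status 0) when deletion has been requested for the pod.
is_terminating() {
  ts=$(kubectl get pod "$1" -o jsonpath='{.metadata.deletionTimestamp}')
  [ -n "$ts" ]
}
```

During the grace period, pod_shutdown_view mypod-xxxxx-xxxxxx would be expected to print a timestamp next to "true", which is the same situation that shows up as 1/1 and Terminating in kubectl get pods.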

I want to add to @nrxr's answer:
The status terminating means no more traffic is being sent to that pod by the controllers.
That is what we want, but in reality it is not always so. The pod may terminate completely while traffic is still being forwarded to it.
For details, please read this blog post: https://learnk8s.io/graceful-shutdown.

Related

Auto delete CrashLoopBackOff pods in a deployment

In my kubernetes cluster, there are multiple deployments in a namespace.
For a specific deployment, there is a need to not allow "CrashLoopBackOff" pods to exist.
So basically, when any pod gets to this state, I would want it to be deleted and later a new pod to be created which is already handled by the ReplicaSet.
I tried with custom controllers, with the thought that the SharedInformer would alert about the state of Pod and then I would delete it from that loop.
However, this brings dependency on the pod on which the custom controller would run.
I also tried searching for any option to be configured in the manifest itself, but could not find any.
I am pretty new to Kubernetes, so I need help implementing this behaviour.
Firstly, you should address the reason why the pod has entered the CrashLoopBackOff state rather than just delete it. If you do this, you'll potentially just recreate the problem again and you'll be deleting pods repeatedly. For example, if your pod is trying to access an external DB and that DB is down, it'll CrashLoop, and deleting and restarting the pod won't help fix that.
Secondly, if you want to do this deleting in an automated manner, an easy way would be to run a CronJob resource that goes through your deployment and deletes the CrashLooped pods. You could set the cronjob to run once an hour or whatever schedule you wish.
Deleting the pod and waiting for a new one is like restarting the deployment or the pod.
Kubernetes automatically restarts a failing pod that is in CrashLoopBackOff; you can check the restart count:
NAME       READY   STATUS             RESTARTS   AGE
te-pod-1   0/1     CrashLoopBackOff   2          1m44s
These restarts are similar to what you described:
when any pod gets to this state, I would want it to be deleted and
later a new pod to be created which is already handled by the
ReplicaSet.
If you want to get rid of the crashing pod entirely and not wait for a new pod to come up, you have to roll back the deployment.
If there is an issue with your ReplicaSet and the pod is crashing, deleting it is useless: no matter how many times you delete and restart the pod, it will keep crashing until you check the logs and debug the real issue in the ReplicaSet (Deployment).
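If you do go the automated route, the restart count can be used to pick out only the pods that have been crash-looping for a while. This is a rough sketch, not a definitive implementation: crashlooping_over is a hypothetical helper that only filters text in the column layout shown above, and the threshold is an arbitrary choice.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the names of pods whose STATUS is CrashLoopBackOff and whose
# RESTARTS column is at least the threshold given as the first argument.
crashlooping_over() {
  awk -v min="$1" '$3 == "CrashLoopBackOff" && $4 + 0 >= min { print $1 }'
}

# Example against a live cluster (not executed here):
#   kubectl get pods --no-headers | crashlooping_over 5
```

Looking at the logs of the pods it reports is still the first step; deleting them only makes sense once the underlying cause is understood.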

How to rollout without killing processes in K8s?

I'm using:
kubectl rollout restart deployment my_cool_workers
This terminates the workers and starts new ones.
However, I want to roll out in a way where, if something is running on a specific worker, the task is allowed to finish. I don't want to kill the tasks (so the worker should finish its tasks but not accept new ones).
Meaning: roll out new workers -> old workers no longer accept traffic -> when an old worker is no longer running anything, terminate it.
How can this be done?
If a Pod gets killed, manually via kubectl or by any k8s controller like during a deployment, it will instantly change from Running into Terminating state. At the same time, the SIGTERM signal will be sent to all containers inside that Pod.
Starting from Kubernetes 1.19 you can debug running pods using Ephemeral Containers and kubectl debug command.
While in Terminating state, containers of a Pod are not restarted if they end. Whenever a container inside a Pod stops while in Running state, the container is restarted. This is done because a Pod should always be running unless an error occurred.
For more information refer to this document.
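Since the containers receive SIGTERM when the pod enters Terminating, one common approach is to have the worker itself react to that signal: stop picking up new tasks, finish the current one, then exit. The sketch below assumes a worker that pulls tasks in a loop; process_task, worker_loop and stop_requested are hypothetical names, and the task counter stands in for a real queue. The pod's terminationGracePeriodSeconds would also need to be at least as long as the longest task, because the KILL signal follows once the grace period expires.

```shell
# Sketch of a worker that drains on SIGTERM instead of dying mid-task.
stop_requested=0
# On TERM, only record the request; the running task is not interrupted.
trap 'stop_requested=1' TERM

# Hypothetical stand-in for one unit of real work.
process_task() {
  echo "finished task $1"
}

# Process up to $1 tasks, checking for a stop request between tasks.
worker_loop() {
  task=1
  while [ "$stop_requested" -eq 0 ] && [ "$task" -le "$1" ]; do
    process_task "$task"
    task=$((task + 1))
  done
  echo "worker exiting cleanly"
}
```

The stop flag is only checked between tasks, which is what lets an in-flight task run to completion before the process exits.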

Kubernetes limit number of retry

For some context, I'm creating an API in python that creates K8s Jobs with user input in ENV variables.
Sometimes the selected image does not exist or has been deleted, a Secret does not exist, or a Volume hasn't been created. This puts the Job's Pod in a CrashLoopBackOff or ImagePullBackOff state.
First, I am wondering whether resources are allocated to the Job while it is in this state?
If yes, I don't want the Job to loop forever and lock resources for a Job that never starts.
I've set backoffLimit to 0, but that only applies when the Job detects a failed Pod and tries to launch another Pod as a retry. In my case, I know that if a Pod fails for a Job, it is mostly due to OOM or to code that fails, and will always fail, because of user input. So retrying will always fail.
But it doesn't limit the number of tries in CrashLoopBackOff or ImagePullBackOff. Is there a way to terminate or fail the Job? I don't want to kill it, just free the resources and keep the events in (status.container.state.waiting.reason + status.container.state.waiting.message) or (status.container.state.terminated.reason + status.container.state.terminated.exit_code)
Is there an option to limit the number of retries at creation time, so that I can free the resources without removing the Job, in order to keep the logs?
I have tested your first question, and yes: even if a pod is in the CrashLoopBackOff state, the resources are still allocated to it. Here is my test: Are the Kubernetes requested resources by a pod still allocated to it when it is in crashLoopBackOff state?
Thanks for your question!
Long story short, unfortunately there is no such option in Kubernetes.
However, you can do this manually by checking whether the pod is in CrashLoopBackOff and then freeing its resources, or simply deleting the pod itself.
The following script deletes any pod in the CrashLoopBackOff state from a specified namespace:
#!/bin/bash
# Check the given namespace and delete pods in the 'CrashLoopBackOff' state.
NAMESPACE="test"
# Collect the names (first column) of every pod whose row mentions CrashLoopBackOff.
delpods=$(sudo kubectl get pods -n "${NAMESPACE}" |
  grep -i 'CrashLoopBackOff' |
  awk '{ print $1 }')
# Pod names contain no whitespace, so word-splitting the list is safe here.
for i in ${delpods}; do
  sudo kubectl delete pod "$i" --force=true --wait=false \
    --grace-period=0 -n "${NAMESPACE}"
done
Since we passed --force together with --grace-period=0, the pod is deleted immediately instead of going through a graceful shutdown. That alone does not prevent a replacement pod from being created and crashing again.
If, after using this script or assigning it to a job, you notice that the pod still restarts and falls back into CrashLoopBackOff, there is a workaround: change the pod's restart policy.
A PodSpec has a restartPolicy field with possible values Always,
OnFailure, and Never. The default value is Always. restartPolicy
applies to all Containers in the Pod. restartPolicy only refers to
restarts of the Containers by the kubelet on the same node. Exited
Containers that are restarted by the kubelet are restarted with an
exponential back-off delay (10s, 20s, 40s …) capped at five minutes,
and is reset after ten minutes of successful execution. As discussed
in the Pods document, once bound to a node, a Pod will never be
rebound to another node.
See more details in the documentation or from here.
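To get a feel for how quickly those restart delays grow, the sequence from the quoted passage (start at 10s, double each time, cap at five minutes) can be computed directly. This is only an illustration of the arithmetic; backoff_delays is a hypothetical helper, not something the kubelet exposes.

```shell
# Print the first N restart delays in seconds: start at 10, double after
# each restart, and cap at 300 (five minutes), as in the quoted documentation.
backoff_delays() {
  delay=10
  i=0
  line=""
  while [ "$i" -lt "$1" ]; do
    line="${line}${line:+ }${delay}"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then delay=300; fi
    i=$((i + 1))
  done
  echo "$line"
}

backoff_delays 7   # prints: 10 20 40 80 160 300 300
```

So from the sixth restart onwards, a crash-looping container is retried roughly once every five minutes while its requested resources remain reserved.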
And that is it! Happy hacking.
Regarding the first question, it is already answered by bguess here.

Job still running even when deleting nodes

I created a two-node cluster and a new job using the busybox image that sleeps for 300 seconds. I checked which node this job was running on using
kubectl get pods -o wide
I deleted the node, but surprisingly the job still ran to completion on the same node. Is this normal behavior? If not, how can I fix it?
Jobs aren't scheduled or running on nodes. The role of a job is just to define a policy by making sure that a pod with certain specifications exists and ensure that it runs till the completion of the task whether it completed successfully or not.
When you create a job, you are declaring a policy that the built-in job-controller will see and create a pod for. The built-in kube-scheduler will then see this pod without a node and patch it with a node's identity. The kubelet will see a pod with a node matching its own identity, and so a container will be started. As long as the container is still running, the control plane knows that the node and the pod still exist.
There are two ways of breaking a node: with a drain and without one. Breaking a node without draining it is identical to a network cut or a server crash. The api-server will keep the node resource for a while, but it'll cease being Ready. The pods will then be terminated slowly. When you drain a node, however, it is as if you were preventing new pods from being scheduled onto the node and deleting its pods using kubectl delete pod.
In both cases, the pods are deleted, leaving a job that hasn't run to completion and has no pod. The job-controller therefore creates a new pod for the job, the job's failed-attempts count is increased by 1, and the loop starts over.
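The behaviour described above can be observed from the outside by watching which pods belong to the job and how its failure counter changes. The sketch below assumes the job controller labels its pods with job-name=<job name> and records failures in .status.failed; job_pods and job_failed_count are hypothetical helper names, and "sleeper" in the usage comment is a made-up job name.

```shell
# Hypothetical helpers for observing a Job while its node goes away.

# List the pods created for a Job, including the node each one runs on.
job_pods() {
  kubectl get pods -l "job-name=$1" -o wide
}

# Print the Job's failed-pod counter (empty output means none recorded yet).
job_failed_count() {
  kubectl get job "$1" -o jsonpath='{.status.failed}'
}

# Example against a live cluster (not executed here):
#   job_pods sleeper; job_failed_count sleeper
```

Running both before and after removing the node should show whether a replacement pod was created on the other node and whether the counter moved.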

AdmissionController holding back a Terminated Pod from getting completely removed

I have an AdmissionController which is running successfully and prevents some pods from getting instantiated, checking on the prescribed conditions.
But the Pod gets stuck in Terminated Status and never goes away. I also have a process that monitors for stuck pods and cleans up. It tries to delete these Terminated Pods using deleteNamespacedPod. The Api call works fine, but the Pod lingers on without getting deleted. Is the AdmissionController denial a finalizer that is holding back the Pod from getting deleted ?
When I took down the Admission Controller, the clean up process was successfully able to delete the Pod.
Any insights or things I am missing in the AdmissionController ?
I appreciate any help/insights in this issue.
Thanks a lot,
-Sreeni
Run the command below against the stuck pod; it clears the pod's finalizers so that the pending deletion can complete:
kubectl patch pod <pod-name> -p '{"metadata":{"finalizers":null}}'
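Before clearing finalizers it may be worth checking whether the pod has any at all, since that indicates whether a finalizer is actually what holds the pod back. A minimal sketch, assuming a configured kubectl; pod_finalizers and has_finalizers are hypothetical helper names.

```shell
# Print the pod's finalizers list (empty output means none are set).
pod_finalizers() {
  kubectl get pod "$1" -o jsonpath='{.metadata.finalizers}'
}

# Succeed (exit status 0) when the pod has at least one finalizer.
has_finalizers() {
  [ -n "$(pod_finalizers "$1")" ]
}

# Example against a live cluster (not executed here):
#   has_finalizers <pod-name> && pod_finalizers <pod-name>
```

If the list is empty, the patch above will not change anything, and the cause is more likely to be found elsewhere, for example in how the admission controller handles requests for that pod.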