Kubernetes CronJob/pod terminated by occasional SIGTERM - kubernetes

I have a series of cron jobs running on Google Container Engine, and one of the pods triggered by my CronJob is occasionally hit with a SIGTERM for an unknown reason.
I only found this out because Sentry logged my tasks receiving a SIGTERM. I've been unable to find the terminated pod itself, even with
kubectl get po -a
I've looked at the Kubernetes nodes (which have autoscaling enabled) and none of them have been drained and deleted; the ages of all the nodes are unchanged.
I've also set concurrencyPolicy: Forbid (does Kubernetes check for concurrency before or after it has initiated a pod?)
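For reference, Forbid is evaluated before a new Job is created: if the previous run is still active, the new run is skipped, and a running pod is never SIGTERMed for concurrency reasons. A minimal sketch of where the field sits (name, schedule, and image are hypothetical):

```yaml
apiVersion: batch/v1        # batch/v1beta1 on older clusters
kind: CronJob
metadata:
  name: my-task             # hypothetical name
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid # skip a new run if the previous Job is still active
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: my-task
            image: example/my-task:v1   # hypothetical image
          restartPolicy: OnFailure
```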

Related

Why kubernetes keeps pods in Error/Completed status - preemptible nodes, GKE

I have an issue with my GKE cluster. I am using two node pools: secondary - with standard n1-highmem nodes, and primary - with preemptible n1-highmem nodes. The issue is that I have many pods in Error/Completed status which are not cleared by k8s, all of which ran on the preemptible pool. THESE PODS ARE NOT JOBS.
GKE documentation says that:
"Preemptible VMs are Compute Engine VM instances that are priced lower than standard VMs and provide no guarantee of availability. Preemptible VMs offer similar functionality to Spot VMs, but only last up to 24 hours after creation."
"When Compute Engine needs to reclaim the resources used by preemptible VMs, a preemption notice is sent to GKE. Preemptible VMs terminate 30 seconds after receiving a termination notice."
Ref: https://cloud.google.com/kubernetes-engine/docs/how-to/preemptible-vms
And from the kubernetes documentation:
"For failed Pods, the API objects remain in the cluster's API until a human or controller process explicitly removes them.
The Pod garbage collector (PodGC), which is a controller in the control plane, cleans up terminated Pods (with a phase of Succeeded or Failed), when the number of Pods exceeds the configured threshold (determined by terminated-pod-gc-threshold in the kube-controller-manager). This avoids a resource leak as Pods are created and terminated over time."
Ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-garbage-collection
So, from my understanding, this set of nodes is replaced every 24 hours, which kills all the pods running on them, and depending on graceful shutdown the pods end up in Completed or Error state. Nevertheless, Kubernetes is not clearing or removing them, so I have tons of pods in these statuses in my cluster, which is not expected at all.
Example kubectl describe pod output:
Status: Failed
Reason: Terminated
Message: Pod was terminated in response to imminent node shutdown.
Apart from that, no events, logs, etc.
GKE version:
1.24.7-gke.900
Both Node pools versions:
1.24.5-gke.600
Has anyone encountered such an issue or know what's going on there? Is there a solution to clear them other than creating a script and running it periodically?
I tried digging into the GKE logs, but I couldn't find anything. I also tried to find the answer in the docs, without success.
On clusters running GKE version 1.20 and later with preemptible node pools, the kubelet graceful node shutdown feature is enabled by default. The kubelet notices the termination notice and gracefully terminates the Pods running on the node. If the Pods are part of a Deployment, the controller creates and schedules new Pods to replace them.
During graceful Pod termination, the kubelet updates the status of each terminated Pod, assigning it a Failed phase and a Terminated reason. When the number of terminated Pods reaches a threshold, garbage collection cleans them up.
On GKE version 1.21.3-gke.1200 and later, you can also delete shutdown Pods manually:
kubectl get pods --all-namespaces | grep -i NodeShutdown | awk '{print $1, $2}' | xargs -n2 kubectl delete pod -n
kubectl get pods --all-namespaces | grep -i Terminated | awk '{print $1, $2}' | xargs -n2 kubectl delete pod -n
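Before running the destructive xargs step, the grep/awk filter can be sanity-checked offline; a minimal sketch against canned `kubectl get pods` output (namespaces and pod names are made up):

```shell
# Canned output in the same column layout as `kubectl get pods --all-namespaces`
sample='NAMESPACE   NAME        READY   STATUS         RESTARTS   AGE
default     web-abc12   0/1     NodeShutdown   0          2d
batch       job-xyz99   0/1     Running        0          5m'

# Same filter as the one-liner above, minus the delete step: each surviving
# line is "<namespace> <pod>", ready for `xargs -n2 kubectl delete pod -n`
victims=$(printf '%s\n' "$sample" | grep -i NodeShutdown | awk '{print $1, $2}')
echo "$victims"
```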

GKE: Pods remain in Error state after node scaledown or replacement

I'm running a GKE cluster in version 1.22, on Spot nodes.
Every time a node is scaled down or replaced, the pods which were running on that node stay in Error state and are not removed, even after the new pods are up and running.
There are no related events in the namespace, and in the pod description I see the following:
Status: Failed
Reason: Terminated
Message: Pod was terminated in response to imminent node shutdown.
Any idea how to solve this?
Expected outcome: Old pods should be completely removed.

Terminate k8s pods faster

I have a simple Node.js server running on some Kubernetes pods. When I delete a deployment with:
kubectl delete deployment <deployment-name>
sometimes pods have a status of Terminating for 5-8 seconds. That seems excessive - is there a way to kill them faster somehow?
If you really want to kill them instead of shutting them down gracefully, then:
kubectl delete deployment <deployment-name> --grace-period=0 --force
Also, if you have any preStop handlers configured, you could investigate if they are causing any unnecessary delay.
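A cleaner option than force-deleting is to shorten the grace period in the pod template itself; a sketch (the name, image, and 5-second value are illustrative):

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 5   # default is 30
      containers:
      - name: server                     # hypothetical name
        image: example/node-server:v1    # hypothetical image
```

Also make sure the server actually exits on SIGTERM (e.g. closing the HTTP server in a process.on('SIGTERM') handler); otherwise the kubelet waits out the full grace period before sending SIGKILL.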

Reasons of Pod Status Failed

When a Pod's status is Failed, its controller creates replacement Pods, and the failed Pods accumulate until terminated-pod-gc-threshold in kube-controller-manager is reached. This leaves many Failed Pods in a cluster that need to be cleaned up.
Are there reasons other than Evicted that can cause a Pod to be Failed?
There can be many causes for a Pod's status to be Failed. You can check for problems (if any exist) by running:
kubectl -n <namespace> describe pod <pod-name>
Carefully check the Events section, where all the events that occurred during Pod creation are listed. Hopefully you can pinpoint the cause of failure from there.
Some of the common reasons for Pod failure are:
Wrong image used for the Pod.
Wrong command/arguments passed to the Pod.
Kubelet failed the Pod's liveness probe.
Pod failed a health check.
Problem with the network CNI plugin (misconfiguration of the CNI plugin used for networking).
For example:
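A manifest that reproduces this failure (the pod name is hypothetical; the image name is deliberately nonexistent):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: image-test
spec:
  containers:
  - name: main
    image: not-so-busybox   # does not exist, so the image pull fails
```

Running `kubectl describe pod image-test` on this pod shows "Failed to pull image" events with ErrImagePull, followed by ImagePullBackOff.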
In the above example, the image "not-so-busybox" couldn't be pulled because it doesn't exist, so the pod failed to run. The pod's status and events clearly describe the problem.
Simply do this:
kubectl get pods <pod_name> -o yaml
And in the output, towards the end, you can see something like this:
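For a pod killed by node shutdown, the status section near the end of that YAML looks roughly like the following (the container name and exit code are illustrative):

```yaml
status:
  phase: Failed
  reason: Terminated
  message: Pod was terminated in response to imminent node shutdown.
  containerStatuses:
  - name: main                # hypothetical container name
    state:
      terminated:
        exitCode: 137         # illustrative
        reason: Error
```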
This will give you a good idea of where exactly did the pod fail and what happened.
Pods will not survive scheduling failures, node failures, or other evictions, such as those caused by lack of resources or node maintenance.
Pods should almost never be created manually, but via controllers like Deployments (self-healing, replication, etc.).
The reason why a pod failed or was terminated can be obtained with
kubectl describe pod <pod_name>
Other situations where I have encountered a Failed pod:
Issues with the image (e.g. it no longer exists)
The pod attempts to access a ConfigMap or Secret that is not found in the namespace
Liveness probe failure
A PersistentVolume fails to mount
Validation error
In addition, eviction is based on resources (the kubelet's eviction policy).
It can also be caused by draining the node; you can read about kubectl drain in the Kubernetes documentation.
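Draining looks like this from the CLI; these commands need a live cluster and a real node name:

```shell
# Cordon the node and evict its pods (DaemonSet-managed pods are skipped)
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# Allow scheduling on the node again once maintenance is done
kubectl uncordon <node-name>
```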

Deploying container as a CronJob to (Google) Kubernetes Engine - How to stop Pod after completing task

I have a container that runs some data fetching from a MySQL database and simply displays the result in console.log(), and want to run this as a cron job in GKE. So far I have the container working on my local machine, and have successfully deployed this to GKE (in terms of there being no errors thrown so far as I can see).
However, the pods that were created were just left as Running instead of stopping after completion of the task. Are the pods supposed to stop automatically after executing all the code, or do they require explicit instruction to stop and if so what is the command to terminate a pod after creation (by the Cron Job)?
I've read that there is supposedly a termination grace period of ~30s by default, but after running a minutely-executed cronjob for ~20 minutes, all the pods were still running. I'm not sure if there's a way to terminate the pods from inside the code; otherwise it would be a little silly to have a cronjob generating lots of pods left running idly. My cronjob.yaml is below:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: test
spec:
  schedule: "5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: test
            image: gcr.io/project/test:v1
            # env:
            # - name: "DELAY"
            #   value: 15
          restartPolicy: OnFailure
A CronJob is essentially a cookie cutter for jobs. That is, it knows how to create jobs and execute them at a certain time. Now, that being said, when looking at garbage collection and clean up behaviour of a CronJob, we can simply look at what the Kubernetes docs have to say about this topic in the context of jobs:
When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml).
Adding a process.exit() call in the code to explicitly end the process after the work has finished allowed the pod to stop automatically after execution.
A job in Kubernetes is intended to run a single instance of a pod and ensure it runs to completion. As another answer specifies, a CronJob is a factory for Jobs which knows how and when to spawn a job according to the specified schedule.
Accordingly, and unlike a service, which is intended to run forever, the container(s) in the pod created by the job must exit upon completion. There is a notable problem with the sidecar pattern, which often requires manual lifecycle handling: if your main container requires additional containers to provide logging or database access, you must arrange for these to exit when the main container completes, otherwise they will remain running and k8s will not consider the job complete. In such circumstances, the pod associated with the job will never terminate.
The termination grace period is not applicable here: this timer applies after Kubernetes has requested that your pod terminate (e.g. if you delete it). It specifies the maximum time the pod is afforded to shutdown gracefully before the kubelet will summarily terminate it. If Kubernetes never considers your job to be complete, this phase of the pod lifecycle will not be entered.
Furthermore, old pods are kept around after completion for some time to allow perusal of logs and such. You may see pods listed which are not actively running and so not consuming compute resources on your worker nodes.
If your pods are not completing, please provide more information regarding the code they are running so we can assist in determining why the process never exits.