Kubernetes CronJob that runs every 10 minutes and deletes the pods in "Terminating" state in all namespaces in the cluster

I need a Kubernetes CronJob that runs every 10 minutes and deletes the pods that are in the "Terminating" state in all namespaces in the cluster. Please help me out; I am struggling with the Bash one-liner shell script.
apiVersion: batch/v1
kind: Job
metadata:
  name: process-item-$ITEM
  labels:
    jobgroup: jobexample
spec:
  template:
    metadata:
      name: jobexample
      labels:
        jobgroup: jobexample
    spec:
      containers:
      - name: c
        image: busybox
        command: ["sh", "-c", "echo Processing item $ITEM && sleep 5"]
      restartPolicy: Never

"Terminating" is not a pod phase, so --field-selector=status.phase=Terminating matches nothing. A terminating pod is one whose metadata.deletionTimestamp is set, so filter on that field instead.
List all terminating pods in all namespaces in the format {namespace}.{name}:
kubectl get pods --all-namespaces --output=jsonpath='{range .items[?(@.metadata.deletionTimestamp)]}{.metadata.namespace}{"."}{.metadata.name}{"\n"}{end}'
Given a pod's name and its namespace, the pod can be force deleted with:
kubectl delete pods <pod> --grace-period=0 --force --namespace=<namespace>
In one line (a namespace cannot contain a dot, so everything before the first dot is the namespace and the rest is the pod name):
for i in $(kubectl get pods --all-namespaces --output=jsonpath='{range .items[?(@.metadata.deletionTimestamp)]}{.metadata.namespace}{"."}{.metadata.name}{"\n"}{end}'); do kubectl delete pods "${i#*.}" --grace-period=0 --force --namespace="${i%%.*}"; done
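The one-liner can also be unpacked into small functions, which makes the string splitting easier to check on its own. This is only a minimal sketch: the names namespace_of, pod_of, and force_delete_refs are hypothetical, and it assumes kubectl is on the PATH with permission to delete pods in every namespace.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original answer.

# The namespace is everything before the first dot (namespaces cannot contain dots).
namespace_of() { printf '%s\n' "${1%%.*}"; }

# The pod name is everything after the first dot (pod names may contain dots).
pod_of() { printf '%s\n' "${1#*.}"; }

# Reads "{namespace}.{name}" lines on stdin and force deletes each pod.
force_delete_refs() {
  local ref
  while IFS= read -r ref; do
    [ -n "$ref" ] || continue   # skip blank lines
    kubectl delete pod "$(pod_of "$ref")" \
      --namespace="$(namespace_of "$ref")" --grace-period=0 --force
  done
}
```

Piping the output of the listing command into force_delete_refs would reproduce the one-liner. To run it every 10 minutes, a CronJob with the schedule "*/10 * * * *" could execute the script, provided its ServiceAccount is allowed to list and delete pods cluster-wide.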


How to achieve Automatic Rollback in Kubernetes?

Let's say I have a Deployment. For some reason it stops responding after some time. Is there any way to tell Kubernetes to roll back to the previous version automatically on failure?
You mentioned that:
I have a Deployment. For some reason it stops responding after some time.
In this case, you can use liveness and readiness probes:
The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
The above probes may prevent you from deploying a corrupted version; however, liveness and readiness probes cannot roll back your Deployment to the previous version. There was a similar issue on GitHub, but I am not sure there will be any progress on this matter in the near future.
If you really want to automate the rollback process, below I will describe a solution that you may find helpful.
This solution requires running kubectl commands from within the Pod.
In short, you can use a script to continuously monitor your Deployments, and when errors occur you can run kubectl rollout undo deployment DEPLOYMENT_NAME.
First, you need to decide how to find failed Deployments. As an example, I'll check Deployments that perform the update for more than 10s with the following command:
NOTE: You can use a different command depending on your need.
kubectl rollout status deployment ${deployment} --timeout=10s
To constantly monitor all Deployments in the default Namespace, we can create a Bash script:
while true; do
    sleep 60
    deployments=$(kubectl get deployments --no-headers -o custom-columns=":metadata.name" | grep -v "deployment-checker")
    echo "====== $(date) ======"
    for deployment in ${deployments}; do
        if ! kubectl rollout status deployment ${deployment} --timeout=10s 1>/dev/null 2>&1; then
            echo "Error: ${deployment} - rolling back!"
            kubectl rollout undo deployment ${deployment}
        else
            echo "Ok: ${deployment}"
        fi
    done
done
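The check for a single Deployment could also be pulled out into a function, so it can be exercised with a stubbed kubectl before it runs in a cluster. This is a sketch under assumptions: check_and_rollback is a hypothetical name, and the optional timeout argument is an addition that defaults to the 10s used above.

```shell
#!/usr/bin/env bash
# check_and_rollback is a hypothetical helper, not part of the original script.
# Usage: check_and_rollback <deployment> [timeout]
check_and_rollback() {
  local deployment="$1" timeout="${2:-10s}"
  # "kubectl rollout status" exits non-zero if the rollout does not finish in time.
  if kubectl rollout status deployment "$deployment" --timeout="$timeout" >/dev/null 2>&1; then
    echo "Ok: ${deployment}"
  else
    echo "Error: ${deployment} - rolling back!"
    kubectl rollout undo deployment "$deployment"
  fi
}
```

The monitoring loop would then call check_and_rollback "${deployment}" for each name it lists.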
We want to run this script from inside the Pod, so I converted it to ConfigMap which will allow us to mount this script in a volume (see: Using ConfigMaps as files from a Pod):
$ cat check-script-configmap.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    while true; do
        sleep 60
        deployments=$(kubectl get deployments --no-headers -o custom-columns=":metadata.name" | grep -v "deployment-checker")
        echo "====== $(date) ======"
        for deployment in ${deployments}; do
            if ! kubectl rollout status deployment ${deployment} --timeout=10s 1>/dev/null 2>&1; then
                echo "Error: ${deployment} - rolling back!"
                kubectl rollout undo deployment ${deployment}
            else
                echo "Ok: ${deployment}"
            fi
        done
    done
$ kubectl apply -f check-script-configmap.yml
configmap/check-script created
I've created a separate deployment-checker ServiceAccount bound to the built-in edit ClusterRole, and our Pod will run under this ServiceAccount:
NOTE: I've created a Deployment instead of a single Pod.
$ cat all-in-one.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deployment-checker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: deployment-checker-binding
subjects:
- kind: ServiceAccount
  name: deployment-checker
  namespace: default
roleRef:
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: deployment-checker
  name: deployment-checker
spec:
  selector:
    matchLabels:
      app: deployment-checker
  template:
    metadata:
      labels:
        app: deployment-checker
    spec:
      serviceAccountName: deployment-checker
      volumes:
      - name: check-script
        configMap:
          name: check-script
      containers:
      - image: bitnami/kubectl
        name: test
        command: ["bash", "/mnt/checkScript.sh"]
        volumeMounts:
        - name: check-script
          mountPath: /mnt
After applying the above manifest, the deployment-checker Deployment was created and started monitoring Deployment resources in the default Namespace:
$ kubectl apply -f all-in-one.yaml
serviceaccount/deployment-checker created
clusterrolebinding.rbac.authorization.k8s.io/deployment-checker-binding created
deployment.apps/deployment-checker created
$ kubectl get deploy,pod | grep "deployment-checker"
deployment.apps/deployment-checker 1/1 1
pod/deployment-checker-69c8896676-pqg9h 1/1 Running
Finally, we can check how it works. I've created three Deployments (app-1, app-2, app-3):
$ kubectl create deploy app-1 --image=nginx
deployment.apps/app-1 created
$ kubectl create deploy app-2 --image=nginx
deployment.apps/app-2 created
$ kubectl create deploy app-3 --image=nginx
deployment.apps/app-3 created
Then I changed the image for the app-1 to the incorrect one (nnnginx):
$ kubectl set image deployment/app-1 nginx=nnnginx
deployment.apps/app-1 image updated
In the deployment-checker logs we can see that the app-1 has been rolled back to the previous version:
$ kubectl logs -f deployment-checker-69c8896676-pqg9h
====== Thu Oct 7 09:20:15 UTC 2021 ======
Ok: app-1
Ok: app-2
Ok: app-3
====== Thu Oct 7 09:21:16 UTC 2021 ======
Error: app-1 - rolling back!
deployment.apps/app-1 rolled back
Ok: app-2
Ok: app-3
I stumbled upon Argo Rollouts, which addresses the lack of automatic rollback and many other deployment-related concerns.

Redeploy statefulset with CrashLoopBackOff status in kubernetes

That's what I do:
Deploy a stateful set. The pod will always exit with an error to provoke a failing pod in status CrashLoopBackOff: kubectl apply -f error.yaml
Change error.yaml (echo a => echo b) and redeploy stateful set: kubectl apply -f error.yaml
Pod keeps the error status and will not immediately redeploy but wait until the pod is restarted after some time.
Requesting pod status:
$ kubectl get pod errordemo-0
errordemo-0 0/1 CrashLoopBackOff 15 59m
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: errordemo
  labels:
    app.kubernetes.io/name: errordemo
spec:
  serviceName: errordemo
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: errordemo
  template:
    metadata:
      labels:
        app.kubernetes.io/name: errordemo
    spec:
      containers:
      - name: demox
        image: busybox:1.28.2
        command: ['sh', '-c', 'echo a; sleep 5; exit 1']
      terminationGracePeriodSeconds: 1
How can I achieve an immediate redeploy even if the pod has an error status?
I found out these solutions but I would like to have a single command to achieve that (In real life I'm using helm and I just want to call helm upgrade for my deployments):
Kill the pod before the redeploy
Scale down before the redeploy
Delete the statefulset before the redeploy
Why doesn't kubernetes redeploy the pod at once?
In my demo example I have to wait until kubernetes tries to restart the pod after waiting some time.
A pod with no error (e.g. echo a; sleep 10000;) will be restarted immediately. That's why I set terminationGracePeriodSeconds: 1
But in my real deployments (where I use helm) I also encountered the case that the pods are never redeployed. Unfortunately I cannot reproduce this behaviour in a simple example.
You could set spec.podManagementPolicy: "Parallel":
Parallel pod management tells the StatefulSet controller to launch or terminate all Pods in parallel, and not to wait for Pods to become Running and Ready or completely terminated prior to launching or terminating another Pod.
Remember that the default podManagementPolicy is OrderedReady:
OrderedReady pod management is the default for StatefulSets. It tells the StatefulSet controller to respect the ordering guarantees demonstrated above.
If your application requires ordered updates, then there is nothing you can do.
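For the single-command wish in the question, the "kill the pod" workaround could be wrapped in a small function. This is a sketch, not a verified fix: redeploy_statefulset is a hypothetical name, and it applies the manifest first and deletes the pod afterwards on the assumption that the StatefulSet controller then recreates the pod from the updated template.

```shell
#!/usr/bin/env bash
# redeploy_statefulset is a hypothetical helper.
# Usage: redeploy_statefulset <manifest> <pod>
redeploy_statefulset() {
  local manifest="$1" pod="$2"
  # Apply the new template, then remove the crashing pod so it is recreated at once.
  kubectl apply -f "$manifest" &&
    kubectl delete pod "$pod" --ignore-not-found
}
```

With the demo above this would be called as redeploy_statefulset error.yaml errordemo-0; with Helm, the kubectl apply line would be replaced by the helm upgrade call.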

Check logs for a Kubernetes resource CronJob

I created a CronJob resource in Kubernetes.
I want to check the logs to validate that my cron jobs have run, but I cannot find a way to do that. I have gone through the commands, but they all seem to be for the pod resource type.
Also tried following
$ kubectl logs cronjob/<resource_name>
error: cannot get the logs from *v1beta1.CronJob: selector for *v1beta1.CronJob not implemented
How to check logs of CronJob Resource type?
If I want this resource to be in a specific namespace, how do I do that?
You need to check the logs of the pods that are created by the CronJob. The pods will be in the Completed state, but you can still read their logs.
# here you can get the pod_name from the stdout of the cmd `kubectl get pods`
$ kubectl logs -f -n default <pod_name>
To create a CronJob in a namespace, just add the namespace in the metadata section. The pods will be created in that namespace.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: default
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Ideally you should be sending the logs to a log aggregator system such as ELK or Splunk.
If a Job has been created from the CronJob, you can read its logs like this:
kubectl -n "namespace" logs jobs.batch/<resource_name> --tail 4
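Because each run of a CronJob creates a new Job, it can help to look up the most recent one before asking for logs. The following is a rough sketch: latest_job_of and cronjob_logs are hypothetical names, and it assumes the Jobs are named "<cronjob>-<number>", which is how the CronJob controller names them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original answers.

# Prints the newest Job whose name is "<cronjob>-<digits>".
latest_job_of() {
  local cronjob="$1" namespace="${2:-default}"
  kubectl get jobs --namespace="$namespace" \
    --sort-by=.metadata.creationTimestamp --output=name |
    grep "/${cronjob}-[0-9]*$" | tail -n 1
}

# Prints the logs of the newest Job created by the given CronJob.
cronjob_logs() {
  local job
  job="$(latest_job_of "$1" "${2:-default}")"
  [ -n "$job" ] || { echo "no Job found for CronJob $1" >&2; return 1; }
  kubectl logs --namespace="${2:-default}" "$job"
}
```

For the example above, cronjob_logs hello default would print the output of the latest "hello" run.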

ReplicaSet does not update pods when the pod image is modified

I have created a replicaset with wrong container image with below configuration.
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: rs-d33393
  namespace: default
spec:
  replicas: 4
  selector:
    matchLabels:
      name: busybox-pod
  template:
    metadata:
      labels:
        name: busybox-pod
    spec:
      containers:
      - command:
        - sh
        - -c
        - echo Hello Kubernetes! && sleep 3600
        image: busyboxXXXXXXX
        name: busybox-container
Pods Information:
$ kubectl get pods
rs-d33393-5hnfx 0/1 InvalidImageName 0 11m
rs-d33393-5rt5m 0/1 InvalidImageName 0 11m
rs-d33393-ngw78 0/1 InvalidImageName 0 11m
rs-d33393-vnpdh 0/1 InvalidImageName 0 11m
After this, I tried to edit the image inside the ReplicaSet using kubectl edit replicasets.extensions rs-d33393 and changed the image to busybox.
I expected the pods to be recreated with the proper image as part of the ReplicaSet.
That is not what happened.
Can someone please explain why?
Thanks :)
With ReplicaSets used directly, you have to kill the old pods so that new ones are created with the right image.
If you were using a Deployment, and you should, changing the image would force the pods to be re-created.
A ReplicaSet does not support updates. As long as the required number of pods matching the selector labels exists, the ReplicaSet's job is done. You should use a Deployment instead.
From the docs:
To update Pods to a new spec in a controlled way, use a Deployment, as
ReplicaSets do not support a rolling update directly.
Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods. Therefore, it is recommended to use Deployments instead of directly using ReplicaSets unless you don't require updates at all (i.e. one may never need to manipulate ReplicaSet objects when using a Deployment).
It's easy to perform rolling updates and rollbacks when you deploy with Deployments.
$ kubectl create deployment busybox --image=busyboxxxxxxx --dry-run -o yaml > busybox.yaml
$ cat busybox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: busybox
  name: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: busybox
    spec:
      containers:
      - image: busyboxxxxxxx
        name: busyboxxxxxxx
$ kubectl create -f busybox.yaml --record=true
deployment.apps/busybox created
Check rollout history:
$ kubectl rollout history deployment busybox
1 kubectl create --filename=busybox.yaml --record=true
Update image on deployment:
$ kubectl set image deployment.app/busybox *=busybox --record
deployment.apps/busybox image updated
$ kubectl rollout history deployment busybox
1 kubectl create --filename=busybox.yaml --record=true
2 kubectl set image deployment.app/busybox *=busybox --record=true
Rollback Deployment:
$ kubectl rollout undo deployment busybox
deployment.apps/busybox rolled back
$ kubectl rollout history deployment busybox
2 kubectl set image deployment.app/busybox *=busybox --record=true
3 kubectl create --filename=busybox.yaml --record=true
You could use
kubectl scale rs new-replica-set --replicas=0
and then
kubectl scale rs new-replica-set --replicas=<Your number of replicas>
Edit the ReplicaSet (here assumed to be named new-replica-set) with the command:
kubectl edit rs new-replica-set
Edit the image name in the editor, save the file, and exit the editor.
Now you will need to either delete the ReplicaSet (and recreate it from the updated manifest) or delete the existing pods:
kubectl delete rs new-replica-set
kubectl delete pod pod_1 pod_2 pod_3 pod_4
The ReplicaSet should spin up new pods with the new image.

Imperative command for creating job and cronjob in Kubernetes

Is this a valid imperative command for creating job?
kubectl create job my-job --image=busybox
I see this in https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands. But the command is not working. I am getting the error below:
Error: unknown flag: --image
What is the correct imperative command for creating job?
Try this one
kubectl create cronjob my-job --schedule="0,15,30,45 * * * *" --image=busybox
What you have should work, though it is not recommended as an approach anymore. I would check which version of kubectl you have, and possibly upgrade it if you aren't using the latest.
That said, the more common approach these days is to write a YAML file containing the Job definition and then run kubectl apply -f myjob.yaml or similar. This file-driven approach allows for more natural version control, editing, review, etc.
On kubectl versions before 1.18, the value of the --restart flag on "kubectl run" determines whether the command creates a Deployment, a Job, or a plain Pod (from 1.18 onward, kubectl run only creates Pods):
--restart='Always': The restart policy for this Pod. Legal values [Always, OnFailure, Never]. If set to 'Always'
a deployment is created, if set to 'OnFailure' a job is created, if set to 'Never', a regular pod is created. For the
latter two --replicas must be 1. Default 'Always', for CronJobs `Never`.
Use "kubectl run" to create a basic Kubernetes Job imperatively, as below:
master $ kubectl run nginx --image=nginx --restart=OnFailure --dry-run -o yaml > output.yaml
The command above should produce an "output.yaml" like the example below. You can edit this YAML for more advanced configuration as needed and create the Job with "kubectl create -f output.yaml", or, if you only need a basic Job, remove the --dry-run option from the command above and the Job will be created directly.
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        resources: {}
      restartPolicy: OnFailure
status: {}
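On recent kubectl releases, the same kind of manifest can be produced with "kubectl create job", which accepts --image and an optional command after "--". The wrapper below is only a sketch: job_manifest is a hypothetical name, and it assumes a kubectl version that supports --dry-run=client (1.18 or newer).

```shell
#!/usr/bin/env bash
# job_manifest is a hypothetical helper; it prints Job YAML without creating anything.
# Usage: job_manifest <name> <image> [command...]
job_manifest() {
  local name="$1" image="$2"
  shift 2
  if [ "$#" -gt 0 ]; then
    # Everything after "--" becomes the container command.
    kubectl create job "$name" --image="$image" --dry-run=client --output=yaml -- "$@"
  else
    kubectl create job "$name" --image="$image" --dry-run=client --output=yaml
  fi
}
```

For example, job_manifest my-job busybox sh -c 'echo hi' > myjob.yaml writes a manifest that can be reviewed and then created with kubectl apply -f myjob.yaml.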