Tell when Job is Complete - kubernetes

I'm looking for a way to tell (from within a script) when a Kubernetes Job has completed. I want to then get the logs out of the containers and perform cleanup.
What would be a good way to do this? Would the best way be to run kubectl describe job <job_name> and grep for 1 Succeeded or something of the sort?

Since version 1.11, you can do:
kubectl wait --for=condition=complete job/myjob
and you can also set a timeout:
kubectl wait --for=condition=complete --timeout=30s job/myjob
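To cover the rest of the question (collect the logs, then clean up), the wait can be wrapped in a small function. This is only a sketch: the name wait_logs_cleanup, the 300s default timeout and the log file path are assumptions made for illustration. Keep in mind that waiting on condition=complete keeps blocking until the timeout if the Job fails, so the failure branch is reached only once the timeout expires.

```shell
#!/bin/sh
# Sketch: wait for a Job to complete, save its logs, then delete it.
# wait_logs_cleanup, the 300s default and the log path are illustrative assumptions.
wait_logs_cleanup() {
  local job="$1" timeout="${2:-300s}" logfile="${3:-$1.log}"

  if kubectl wait --for=condition=complete --timeout="$timeout" "job/$job"; then
    # "kubectl logs job/NAME" reads the logs of a pod that belongs to the Job.
    kubectl logs "job/$job" > "$logfile"
    kubectl delete job "$job"
  else
    echo "job/$job did not complete within $timeout" >&2
    return 1
  fi
}
```

A call would look like: wait_logs_cleanup myjob 120s myjob.log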

You can visually watch a job's status with this command:
kubectl get jobs myjob -w
The -w option watches for changes. You are looking for the SUCCESSFUL column to show 1 (newer kubectl versions print a COMPLETIONS column instead, which should read 1/1).
For waiting in a shell script, I'd use this command:
until kubectl get jobs myjob -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}' | grep True ; do sleep 1 ; done
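The until loop above never gives up if the job fails or hangs. A sketch of the same poll with an overall deadline follows; the function name poll_job_complete and the default timeout and interval are assumptions, not part of kubectl.

```shell
#!/bin/sh
# Sketch: poll a Job's "Complete" condition until it is True or a deadline passes.
# poll_job_complete and its defaults (300s timeout, 2s interval) are illustrative assumptions.
poll_job_complete() {
  local job="$1" timeout="${2:-300}" interval="${3:-2}"
  local deadline status
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    # A failed query is treated as "not complete yet" and retried.
    status=$(kubectl get job "$job" \
      -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}') || status=""
    if [ "$status" = "True" ]; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}
```

The return code can then drive the script, for example: poll_job_complete myjob 120 || echo "gave up waiting"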

You can use the official Python kubernetes-client:
https://github.com/kubernetes-client/python
Create a new Python virtualenv:
virtualenv -p python3 kubernetes_venv
Activate it with:
source kubernetes_venv/bin/activate
and install the kubernetes client with:
pip install kubernetes
Create a new Python script and run:
from kubernetes import client, config
config.load_kube_config()
v1 = client.BatchV1Api()
ret = v1.list_namespaced_job(namespace='<YOUR-JOB-NAMESPACE>', watch=False)
for i in ret.items:
    # succeeded is the number of pods that finished successfully (None until then)
    print(i.status.succeeded)
Remember to set up your kubeconfig in ~/.kube/config and to replace '<YOUR-JOB-NAMESPACE>' with the namespace of your job.

I would use -w or --watch:
$ kubectl get jobs.batch --watch
NAME COMPLETIONS DURATION AGE
python 0/1 3m4s 3m4s

Adding the best answer, from a comment by @Coo: if you add the -f or --follow option when getting logs, it keeps tailing the log and terminates when the job completes or fails. The $? exit status is even non-zero when the job fails.
kubectl logs -l job-name=myjob --follow
One downside of this approach, that I'm aware of, is that there's no timeout option.
Another downside is the logs call may fail while the pod is in Pending (while the containers are being started). You can fix this by waiting for the pod:
# Wait for pod to be available; logs will fail if the pod is "Pending"
while [[ "$(kubectl get pod -l job-name=myjob -o json | jq -rc '.items | .[].status.phase')" == 'Pending' ]]; do
    # Avoid flooding k8s with polls (seconds)
    sleep 0.25
done
# Tail logs
kubectl logs -l job-name=myjob --tail=400 -f
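For the missing timeout, one possible workaround is to wrap the call in the timeout utility from GNU coreutils. That is an assumption about the environment: timeout is not part of kubectl and may be absent on some systems. The function name follow_job_logs is hypothetical. timeout exits with status 124 when the limit is hit, which the sketch uses to tell a timeout apart from other failures.

```shell
#!/bin/sh
# Sketch: follow a Job's logs, but give up after a time limit (in seconds).
# Assumes GNU coreutils "timeout" is installed; follow_job_logs is a hypothetical name.
follow_job_logs() {
  local job="$1" limit="${2:-600}" rc=0
  timeout "$limit" kubectl logs -l "job-name=$job" --follow || rc=$?
  if [ "$rc" -eq 124 ]; then
    # 124 is the exit status "timeout" uses when the time limit was reached.
    echo "gave up following logs of $job after ${limit}s" >&2
  fi
  return "$rc"
}
```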

Either one of these kubectl queries works:
kubectl get job test-job -o jsonpath='{.status.succeeded}'
or
kubectl get job test-job -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'

Although kubectl wait --for=condition=complete job/myjob and kubectl wait --for=condition=failed job/myjob let us check whether the job completed or failed, neither one tells us when the job has simply finished executing (irrespective of success or failure). If this is what you are looking for, a simple bash while loop with a kubectl status check did the trick for me.
#!/bin/bash
while true; do
    status=$(kubectl get job jobname -o jsonpath='{.status.conditions[0].type}')
    echo "$status" | grep -qi 'Complete' && echo "0" && exit 0
    echo "$status" | grep -qi 'Failed' && echo "1" && exit 1
    sleep 1  # pause between polls instead of busy-looping against the API server
done
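A variant of the loop above, as a sketch: instead of reading conditions[0], which is not guaranteed to be the terminal condition, it looks up the Complete and Failed conditions by type and checks that their status is True. The function names job_condition and wait_job_finished and the 2 second interval are assumptions.

```shell
#!/bin/sh
# Sketch: block until a Job has finished, whether it succeeded or failed.
# Returns 0 on Complete, 1 on Failed. Both function names are hypothetical.
job_condition() {
  # Print the status ("True", "False" or empty) of one condition type of a Job.
  kubectl get job "$1" \
    -o jsonpath="{.status.conditions[?(@.type==\"$2\")].status}"
}

wait_job_finished() {
  local job="$1" interval="${2:-2}"
  while :; do
    if [ "$(job_condition "$job" Complete)" = "True" ]; then
      echo "complete"
      return 0
    fi
    if [ "$(job_condition "$job" Failed)" = "True" ]; then
      echo "failed"
      return 1
    fi
    sleep "$interval"
  done
}
```

This loop has no deadline of its own, so a job that never reaches either condition keeps it running.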

Related

Searching for a keyword in all the pods/replicas of a Kubernetes deployment

I am running a deployment called mydeployment that manages several pods/replicas for a certain service. I want to search all the service pods/instances/replicas of that deployment for a certain keyword. The command below defaults to one replica and returns the keyword matching in this replica only.
kubectl logs -f deploy/mydeployment | grep "keyword"
Is it possible to customize the above command to return all the matching keywords out of all instances/pods of the deployment mydeployment? Any hint?
Save this to a file named fetchLogs.sh and, if you are on a Linux box, run it with sh fetchLogs.sh
#!/bin/sh
podPattern="key-word-from-pod-name"
keyWord="actual-log-search-keyword"
nameSpace="name-space-where-the-pods-are-running"
echo "Running Script.."
# List the pods in the namespace and keep those whose name matches the pattern
for podName in $(kubectl get pods -n "${nameSpace}" -o name | grep -i "${podPattern}" | cut -d'/' -f2);
do
    echo "searching pod ${podName}"
    kubectl -n "${nameSpace}" logs "pod/${podName}" | grep -i "${keyWord}"
done
I used pods here; if you want to use the deployment instead, the idea is the same: change the kubectl command accordingly.
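A tighter sketch of the same idea selects pods by label instead of matching on the pod name, and prefixes each hit with the pod it came from. It assumes the deployment's pods carry a common label (app=mydeployment is a made-up example); the function name grep_pod_logs is hypothetical.

```shell
#!/bin/sh
# Sketch: search the logs of every pod that matches a label selector.
# grep_pod_logs and the example selector are illustrative assumptions.
grep_pod_logs() {
  local namespace="$1" selector="$2" keyword="$3" pod
  # "-o name" prints one "pod/<name>" per line.
  kubectl get pods -n "$namespace" -l "$selector" -o name |
  while read -r pod; do
    # Prefix every matching line with the pod it came from.
    kubectl logs -n "$namespace" "$pod" < /dev/null |
      grep -i -- "$keyword" | sed "s|^|$pod: |"
  done
}
```

For example: grep_pod_logs my-namespace app=mydeployment keyword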

Running script from Linux shell inside a Kubernetes pod

Team,
I need to execute a shell script that is within a kubernetes pod. However the call needs to come from outside the pod. Below is the script for your reference:
echo 'Enter Namespace: '; read namespace; echo $namespace;
kubectl exec -it `kubectl get po -n $namespace|grep -i podName|awk '{print $1}'` -n $namespace --- {scriptWhichNeedToExecute.sh}
Can anyone suggest how to do this?
There isn't really a good way. A simple option might be cat script.sh | kubectl exec -i -- bash but that can have weird side effects. The more correct solution would be to use a debug container, but that feature was still in alpha at the time of writing.
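The piping approach from the answer, written out as a sketch. It assumes the container image provides sh; the function name run_script_in_pod is hypothetical. The script is streamed over stdin, so it never has to exist inside the pod. One likely example of the side effects: any command in the script that reads stdin will consume the remainder of the script.

```shell
#!/bin/sh
# Sketch: run a local script inside a pod by streaming it over stdin.
# run_script_in_pod is a hypothetical name; assumes the container provides "sh".
run_script_in_pod() {
  local namespace="$1" pod="$2" script="$3"
  # -i keeps stdin open; -t is omitted because stdin is a file, not a terminal.
  kubectl exec -i -n "$namespace" "$pod" -- sh -s < "$script"
}
```

For example: run_script_in_pod my-namespace my-pod ./scriptWhichNeedToExecute.sh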

Get count of Kubernetes pods that aren't running

I've this command to list the Kubernetes pods that are not running:
sudo kubectl get pods -n my-name-space | grep -v Running
Is there a command that will return a count of pods that are not running?
If you add ... | wc -l to the end of that command, it will print the number of lines that the grep command outputs. This will probably include the title line, but you can suppress that.
kubectl get pods -n my-name-space --no-headers \
| grep -v Running \
| wc -l
If you have a JSON-processing tool like jq available, you can get more reliable output (the grep invocation will get an incorrect answer if an Evicted pod happens to have the string Running in its name). You should be able to do something like (untested)
kubectl get pods -n my-namespace -o json \
| jq '.items | map(select(.status.phase != "Running")) | length'
If you'll be doing a lot of this, writing a non-shell program using the Kubernetes API will be more robust; you will generally be able to do an operation like "get pods" using an SDK call and get back a list of pod objects that you can filter.
You can do it without any external tool:
kubectl get po \
--field-selector=status.phase!=Running \
-o go-template='{{len .items}}'
The filtering is done with field selectors.
The counting is done with the go-template {{ len .items }}.
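If jq is not available, a middle ground is to compare only the STATUS column instead of grepping the whole line, which avoids the false match on pod names. A sketch, assuming the default column layout (NAME, READY, STATUS, RESTARTS, AGE); the function name count_not_running is hypothetical. Note that the STATUS column and .status.phase are not the same thing (a pod shown as CrashLoopBackOff can still be in phase Running), so this count can differ from the field-selector result.

```shell
#!/bin/sh
# Sketch: count pods whose STATUS column (third field) is not exactly "Running".
# count_not_running is a hypothetical name; assumes the default "kubectl get pods" columns.
count_not_running() {
  local namespace="$1"
  kubectl get pods -n "$namespace" --no-headers |
    awk '$3 != "Running" { n++ } END { print n + 0 }'
}
```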

How to kubectl wait for crd creation?

What is the best method for checking to see if a custom resource definition exists before running a script, using only kubectl command line?
We have a yaml file that contains definitions for a NATS cluster ServiceAccount, Role, ClusterRoleBinding and Deployment. The image used in the Deployment creates the crd, and the second script uses that crd to deploy a set of pods. At the moment our CI pipeline needs to run the second script a few times, only completing successfully once the crd has been fully created. I've tried to use kubectl wait but cannot figure out what condition to use that applies to the completion of a crd.
Below is my most recent, albeit completely wrong, attempt, however this illustrates the general sequence we'd like.
kubectl wait --for=condition=complete kubectl apply -f 1.nats-cluster-operator.yaml kubectl apply -f 2.nats-cluster.yaml
The condition for a CRD would be established:
kubectl -n <namespace-here> wait --for condition=established --timeout=60s crd/<crd-name-here>
You may want to adjust --timeout appropriately.
If you want to wait for a resource that may not exist yet, you can try something like this:
{ grep -q -m 1 "crontabs.stable.example.com"; kill $!; } < <(kubectl get crd -w)
or
{ sed -n /crontabs.stable.example.com/q; kill $!; } < <(kubectl get crd -w)
I understand the question would prefer to only use kubectl, however this answer helped in my case. The downside to this method is that the timeout will have to be set in a different way and that the condition itself is not actually checked.
In order to check the condition more thoroughly, I made the following:
#!/bin/bash
condition-established() {
local name="crontabs.stable.example.com"
local condition="Established"
jq --arg NAME $name --arg CONDITION $condition -n \
'first(inputs | if (.metadata.name==$NAME) and (.status.conditions[]?.type==$CONDITION) then
null | halt_error else empty end)'
# This is similar to the first, but the full condition is sent to stdout
#jq --arg NAME $name --arg CONDITION $condition -n \
# 'first(inputs | if (.metadata.name==$NAME) and (.status.conditions[]?.type==$CONDITION) then
# .status.conditions[] | select(.type==$CONDITION) else empty end)'
}
{ condition-established; kill $!; } < <(kubectl get crd -w -o json)
echo Complete
To explain what is happening, $! refers to the command run by bash's process substitution. I'm not sure how well this might work in other shells.
I tested with the CRD from the official kubernetes documentation.
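A polling alternative that avoids process substitution and combines both answers, as a sketch: first poll until the CRD exists (kubectl wait reports an error for a named resource that is not there yet), then let kubectl wait check the Established condition. The function name wait_for_crd and the 60 second default are assumptions.

```shell
#!/bin/sh
# Sketch: wait for a CRD that may not exist yet, then for its Established condition.
# wait_for_crd and the default timeout are illustrative assumptions.
wait_for_crd() {
  local name="$1" timeout="${2:-60}" deadline
  deadline=$(( $(date +%s) + timeout ))
  # Poll for existence first, because waiting on a missing resource fails immediately.
  until kubectl get crd "$name" > /dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out waiting for crd/$name to appear" >&2
      return 1
    fi
    sleep 1
  done
  kubectl wait --for condition=established --timeout="${timeout}s" "crd/$name"
}
```

In the CI sequence from the question this would sit between the two apply steps: kubectl apply -f 1.nats-cluster-operator.yaml && wait_for_crd <crd-name-here> 120 && kubectl apply -f 2.nats-cluster.yaml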

Kubernetes - delete all jobs in bulk

I can delete all jobs inside a cluster by running
kubectl delete jobs --all
However, jobs are deleted one after another which is pretty slow (for ~200 jobs I had the time to write this question and it was not even done).
Is there a faster approach?
It's a little easier to set up an alias for this bash command:
kubectl delete jobs `kubectl get jobs -o custom-columns=:.metadata.name`
I have a script that deletes the jobs considerably faster by running the deletions in the background:
$ cat deljobs.sh
set -x
for j in $(kubectl get jobs -o custom-columns=:.metadata.name)
do
    kubectl delete jobs $j &
done
To create 200 jobs for the test, I used the following script with the command for i in {1..200}; do ./jobs.sh; done
$ cat jobs.sh
kubectl run memhog-$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 8 | head -n 1) --restart=OnFailure --record --image=derekwaynecarr/memhog --command -- memhog -r100 20m
If you are using CronJob and those are piling up quickly, you can let Kubernetes delete them automatically by configuring the job history limits described in the documentation. That is valid starting from version 1.6.
...
spec:
  ...
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
This works really well for me:
kubectl delete jobs $(kubectl get jobs -o custom-columns=:.metadata.name)
There is an easier way to do it:
To delete successful jobs:
kubectl delete jobs --field-selector status.successful=1
To delete failed or long-running jobs:
kubectl delete jobs --field-selector status.successful=0
I use this script. It's fast, but it can thrash the CPU (one process per job); you can always adjust the sleep parameter:
#!/usr/bin/env bash
echo "Deleting all jobs (in parallel - it can thrash CPU)"
kubectl get jobs --all-namespaces | sed '1d' | awk '{ print $2, "--namespace", $1 }' | while read line; do
    echo "Running with: ${line}"
    kubectl delete jobs ${line} &
    sleep 0.05
done
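To keep the parallelism bounded instead of starting one background process per job, xargs -P can be used. A sketch, assuming an xargs that supports -P (GNU and BSD xargs do); the function name delete_jobs_parallel and the default of 8 workers are assumptions.

```shell
#!/bin/sh
# Sketch: delete all Jobs in a namespace with a bounded number of parallel kubectl processes.
# delete_jobs_parallel and the default of 8 workers are illustrative assumptions.
delete_jobs_parallel() {
  local namespace="$1" workers="${2:-8}" names
  # "-o name" prints one "job.batch/<name>" per line.
  names=$(kubectl get jobs -n "$namespace" -o name) || return 1
  # Nothing to delete: avoid calling "kubectl delete" without a resource.
  [ -n "$names" ] || return 0
  printf '%s\n' "$names" |
    xargs -n 1 -P "$workers" kubectl delete -n "$namespace"
}
```

For example: delete_jobs_parallel my-namespace 16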
The best way for me is (for completed jobs older than a day):
kubectl get jobs | grep 1/1 | gawk 'match($0, / ([0-9]*)h/, ary) { if(ary[1]>24) print $1}' | parallel -r --bar -P 32 kubectl delete jobs
grep 1/1 for completed jobs
gawk 'match($0, / ([0-9]*)h/, ary) { if(ary[1]>24) print $1}' for jobs older than a day
-P number of parallel processes
It is faster than kubectl delete jobs --all, has a progress bar and you can use it when some jobs are still running.
kubectl delete jobs --all --cascade=false is fast, but won't delete associated resources, such as Pods (newer kubectl versions spell this option --cascade=orphan).
https://github.com/kubernetes/kubernetes/issues/8598
Parallelize using GNU parallel:
parallel --jobs=5 "echo {}; kubectl delete jobs {} -n core-services;" ::: $(kubectl get job -o=jsonpath='{.items[?(@.status.succeeded==1)].metadata.name}' -n core-services)
kubectl get jobs -o custom-columns=:.metadata.name | grep specific* | xargs kubectl delete jobs
kubectl get jobs -o custom-columns=:.metadata.name gives you the list of job names; you then grep for the ones you need with a regexp, and xargs passes the resulting names to kubectl delete jobs.
Probably there's no other way to delete all jobs at once, because even kubectl delete jobs queries one job at a time. What Norbert van Nobelen is suggesting might get a faster result, but it won't make much difference.
The kubectl bulk plugin (bulk-action on krew) may be useful for you; it gives you bulk operations on selected resources.
This is the command for deleting jobs:
kubectl bulk jobs delete
You can check the details at
https://github.com/emreodabas/kubectl-plugins/blob/master/README.md#kubectl-bulk-aka-bulk-action