Kubernetes Job is not getting terminated even after specifying "activeDeadlineSeconds" - kubernetes

My YAML file:
apiVersion: batch/v1
kind: Job
metadata:
  name: auto
  labels:
    app: auto
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    metadata:
      labels:
        app: auto
    spec:
      containers:
      - name: auto
        image: busybox
        imagePullPolicy: Always
        ports:
        - containerPort: 9080
      imagePullSecrets:
      - name: imageregistery
      restartPolicy: Never
The pods are killed appropriately, but the Job itself is not terminated after 100 seconds.
Is there anything we can do to delete the Job once the container/pod has finished its work?
kubectl version --short
Client Version: v1.6.1
Server Version: v1.13.10+IKS
kubectl get jobs --namespace abc
NAME   DESIRED   SUCCESSFUL   AGE
auto   1         1            26m
Thank you,

The default way to delete Jobs after they are done is to use the kubectl delete command.
As mentioned by @Erez:
Kubernetes is keeping pods around so you can get the
logs, configuration, etc. from them.
If you don't want to do that manually, you could write a script running in your cluster that checks for Jobs with completed status and then deletes them, as sketched below.
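A minimal sketch of such a cleanup, assuming the abc namespace from your example and GNU/BusyBox xargs; a CronJob with a small service account could run this periodically:
# List Jobs in namespace "abc" whose status reports a successful completion, then delete them.
kubectl get jobs --namespace abc \
  -o jsonpath='{range .items[?(@.status.succeeded==1)]}{.metadata.name}{"\n"}{end}' \
  | xargs -r kubectl delete jobs --namespace abc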
Another way is to use the TTL mechanism, which deletes Jobs automatically a specified number of seconds after they finish. Note that if you set it to zero, it will clean them up immediately. For more details on how to set it up, look here.
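A minimal sketch of the TTL approach applied to your Job (the field is ttlSecondsAfterFinished; the 120-second value is illustrative, and on older clusters this requires the TTLAfterFinished feature gate):
apiVersion: batch/v1
kind: Job
metadata:
  name: auto
spec:
  ttlSecondsAfterFinished: 120  # delete the Job and its pods ~2 minutes after it finishes
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: auto
        image: busybox
      restartPolicy: Never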
Please let me know if that helped.

Related

How can I run a cli app in a pod inside a Kubernetes cluster?

I have a cli app written in NodeJS [not by me].
I want to deploy this on a k8s cluster like I have done many times with web servers.
I have not deployed something like this before, so I am at a bit of a loss.
I have worked with dockerized CLI apps [like Terraform] before, and I know how to use them in a CI/CD pipeline.
But how should I deploy them in a pod so they are always available for usage from another app in the cluster?
Or is there a completely different approach that I need to consider?
#EDIT#
I am using this at the end of my Dockerfile:
# the main executable
ENTRYPOINT ["sleep", "infinity"]
# a default command
CMD ["mycli help"]
That way the pod does not exit, and the CLI inside is waiting for commands like mycli do this.
Is this a hacky way that is frowned upon, or a legit solution?
Your edit is one solution. Another, if you do not want to or cannot change the Docker image, is to define a command for the container that loops infinitely; this achieves the same as the Dockerfile ENTRYPOINT but without having to rebuild the image.
Here's an example of such implementation:
apiVersion: v1
kind: Pod
metadata:
  name: command-demo
  labels:
    purpose: demonstrate-command
spec:
  containers:
  - name: command-demo-container
    image: debian
    command: ["/bin/sh", "-ec", "while :; do echo '.'; sleep 5 ; done"]
  restartPolicy: OnFailure
As for your question about whether this is a legitimate solution, that is hard to answer; I would say it depends on what your application is designed to do. Kubernetes Pods are designed to be ephemeral, so a good solution is one that runs until the job is completed; for a web server, for example, the job is never completed because it should constantly be listening for requests.
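With a pod kept alive like that, you (or another tool) can invoke the CLI on demand with kubectl exec; command-demo is the pod from the example above, and mycli is the hypothetical binary from the question, which would need to be present in the image:
# Run a one-off CLI command inside the long-running pod.
kubectl exec command-demo -- mycli do this
# Or open an interactive shell and run the CLI from there.
kubectl exec -it command-demo -- /bin/sh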
If your pods are in the same cluster, they are already reachable from other pods through CoreDNS, the internal DNS service that lets you access them by their internal DNS name, something like my-cli-app.my-namespace.svc.cluster.local (see DNS for Services and Pods).
You would then create a deployment file for your apps. Note that this does not need ports to work and does not involve communication over the internet.
#deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
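The DNS name mentioned above is in Service form, so the CLI pods would typically sit behind a Service; a minimal sketch, assuming the pods carry the label app: my-cli-app and the app listens on port 8080 (both illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-cli-app
  namespace: my-namespace
spec:
  selector:
    app: my-cli-app   # must match the pod labels in your deployment
  ports:
  - port: 80          # port exposed via my-cli-app.my-namespace.svc.cluster.local
    targetPort: 8080  # port the container actually listens on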

Kubernetes Pod with Sleep command takes time to get deleted

Currently it takes quite a long time for the pod to terminate after a kubectl delete command. I have the feeling that this could be because of the sleep command.
How can I make the container stop faster?
What best practices should I use here?
apiVersion: apps/v1
kind: Deployment
...
spec:
  template:
    spec:
      containers:
      - image: alpine
        ..
        command:
        - /bin/sh
        - -c
        - |
          trap : TERM INT
          while true; do
            # some code to check something
            sleep 10
          done
Is my approach with trap : TERM INT correct? At the moment I don't see any positive effect...
When I terminate the pod, it takes several seconds for the command to return.
kubectl delete pod my-pod
Adding terminationGracePeriodSeconds to your spec will do:
...
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10  # <-- default is 30; can go as low as 0. This is how long Kubernetes waits after sending SIGTERM before force-killing (SIGKILL) the container.
      containers:
      - image: alpine
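Separately, if you want the loop from the question to react to SIGTERM right away instead of finishing the current sleep, a common shell pattern (a sketch, not tied to any particular image) is to run sleep in the background and wait on it, since wait is interrupted by the trap:
command:
- /bin/sh
- -c
- |
  # Exit as soon as SIGTERM/SIGINT arrives instead of finishing the current sleep.
  trap 'exit 0' TERM INT
  while true; do
    # some code to check something
    sleep 10 &
    wait $!
  done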

Kubernetes doesn't remove completed jobs for a Cronjob

Kubernetes doesn't delete a manually created, completed job when a history limit is set and a newer version of the kubernetes client is used.
mycron.yaml:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: myjob
spec:
  schedule: "* * 10 * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Creating cronjob:
kubectl create -f mycron.yaml
Creating job manually:
kubectl create job -n myjob --from=cronjob/hello hello-job
Result:
Job is completed but not removed
NAME        COMPLETIONS   DURATION   AGE
hello-job   1/1           2s         6m
Tested with kubernetes server+client versions of 1.19.3 and 1.20.0
However when I used an older client version (1.15.5) against the server's 1.19/1.20 it worked well.
Comparing the differences while using different client versions:
kubernetes-controller log:
Using client v1.15.5 I have this line in the log (But missing when using client v1.19/1.20):
1 event.go:291] "Event occurred" object="myjob/hello" kind="CronJob" apiVersion="batch/v1beta1" type="Normal" reason="SuccessfulDelete" message="Deleted job hello-job"
Job yaml:
Exactly the same, except the ownerReference part:
For client v1.19/1.20
ownerReferences:
- apiVersion: batch/v1beta1
  kind: CronJob
  name: hello
  uid: bb567067-3bd4-4e5f-9ca2-071010013727
For client v1.15
ownerReferences:
- apiVersion: apps/v1
  blockOwnerDeletion: true
  controller: true
  kind: CronJob
  name: hello
  uid: bb567067-3bd4-4e5f-9ca2-071010013727
And that is it. No other information in the logs, no errors, no warnings, nothing (I checked all the pod logs in kube-system).
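For reference, the ownerReferences above can be pulled straight from the job object, which is how the difference shows up:
kubectl get job hello-job -n myjob -o jsonpath='{.metadata.ownerReferences}'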
Summary:
It seems to be a bug in the kubectl client itself, not in the kubernetes server, but I don't know how to proceed further.
edit:
When I let the cronjob itself run the job (i.e. when the schedule expression fires), it removes the completed job successfully.

Kubernetes - how to run job only once

I have a job definition based on the example from the kubernetes website.
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout-6
spec:
  activeDeadlineSeconds: 30
  completions: 1
  parallelism: 1
  template:
    metadata:
      name: pi
    spec:
      containers:
      - name: pi
        image: perl
        command: ["exit", "1"]
      restartPolicy: Never
I would like to run this job once and not restart it if it fails. With the command exit 1, kubernetes keeps starting new pods to get exit code 0 until the activeDeadlineSeconds timeout is reached. How can I avoid that? I would like to run build commands in kubernetes to check compilation, and if compilation fails I'll get an exit code different from 0. I don't want to run the compilation again.
Is it possible? How?
By now this is possible by setting backoffLimit: 0, which tells the controller to do 0 retries (the default is 6).
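A sketch of where the field goes, adapted from the Job in the question (the command is wrapped in a shell here, since exit is a shell builtin rather than an executable):
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout-6
spec:
  backoffLimit: 0              # do not retry failed pods
  activeDeadlineSeconds: 30
  completions: 1
  parallelism: 1
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["sh", "-c", "exit 1"]
      restartPolicy: Never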
If you want a one-try command runner, you should probably create a bare pod, because the job will try to execute the command until it's successful or the active deadline is met.
Just create the pod from your template:
apiVersion: v1
kind: Pod
metadata:
  name: pi
spec:
  containers:
  - name: pi
    image: perl
    command: ["exit", "1"]
  restartPolicy: Never
Sadly, there is currently no way to prevent the job controller from respawning new pods when they fail, but the kubernetes community is working on a solution, see:
"Backoff policy and failed pod limit" https://github.com/kubernetes/community/pull/583

Restart a Successful/Failed pod manually

Running kubernetes v1.2.2 on CoreOS on VMware:
I have a pod with the restart policy set to Never. Is it possible to manually start the same pod back up?
In my use case we will have a postgres instance in this pod. If it were to crash, I would like to leave the pod in a failed state until we can look at it more closely to see why it failed, and then start it manually, rather than restarting it with a restartPolicy of Always.
Looking through kubectl, it doesn't seem like there is a manual start option. I could delete and recreate the pod, but I think this would remove the data from my container. Maybe I should be mounting a local volume on my host, and then I would not need to worry about losing data?
This is my sample pod YAML. I don't seem to be able to restart the 'health' pod.
apiVersion: v1
kind: Pod
metadata:
  name: health
  labels:
    environment: dev
    app: health
spec:
  containers:
  - image: busybox
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Never
One simple method that might address your needs is to add a unique instance label, maybe a simple counter. If each pod is labelled differently you can start as many as you like and keep around as many failed instances as you like.
e.g. first pod
apiVersion: v1
kind: Pod
metadata:
  name: health-0
  labels:
    environment: dev
    app: health
    instance: "0"
spec:
  containers: ...
second pod
apiVersion: v1
kind: Pod
metadata:
  name: health-1
  labels:
    environment: dev
    app: health
    instance: "1"
spec:
  containers: ...
Based on your question and comments, it sounds like you want to restart a failed container to retain its state and data. However, application containers and pods are considered to be relatively ephemeral (rather than durable) entities: when a container crashes, its files are lost and the kubelet will restart it with a clean state.
To retain your data and logs, use persistent volume types in your deployment. This lets you preserve data across container restarts.
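A minimal sketch of that idea for the postgres case mentioned in the question, assuming the cluster has a default StorageClass; the claim name, size, image tag, and mount path are illustrative:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: health
spec:
  containers:
  - name: postgres
    image: postgres:13
    env:
    - name: POSTGRES_PASSWORD
      value: example            # illustrative only; use a Secret in practice
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data   # postgres stores its data here
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: postgres-data
  restartPolicy: Never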