How do I manually trigger a Kubernetes job (not a cron) in k8s

I have a sample k8s job; as soon as you do kubectl apply, the job gets triggered and the pods are created. How do I control the pod creation?
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

If you want to manually control the pod creation, you can achieve it through parallelism.
Documentation says:
The requested parallelism (.spec.parallelism) can be set to any non-negative value. If it is unspecified, it defaults to 1. If it is specified as 0, then the Job is effectively paused until it is increased.
You can set it to 0 when doing the kubectl apply. The configuration looks something like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
  backoffLimit: 5
  parallelism: 0
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
You can set it to 1 whenever you decide to run the job.
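For example, assuming the Job above was created with parallelism: 0, you could start it later by patching that field (illustrative command; the Job name comes from the manifest above):
kubectl patch job pi-with-timeout -p '{"spec":{"parallelism":1}}'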

The trigger is running kubectl apply. When you create the Job, it runs. You might be looking for a more fully featured background task system like Airflow or Argo.
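If the goal is simply to run the Job on demand, a common approach (a sketch; the file name pi-with-timeout.yaml is assumed) is to create the Job when you need it and delete it afterwards, since a completed Job will not run again:
# create (and thereby start) the job
kubectl apply -f pi-with-timeout.yaml
# remove it once finished so the same manifest can be applied again later
kubectl delete job pi-with-timeout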

Related

Kubernetes Cronjobs are not removed

I'm running the following cronjob in my minikube:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Allow
  suspend: false
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - somefailure
          restartPolicy: OnFailure
I've added the "somefailure" to force the job to fail. My problem is that my minikube installation (running v1.23.3) seems to ignore successfulJobsHistoryLimit and failedJobsHistoryLimit. I've checked the documentation at https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/ and it says that both parameters are available, but in the end Kubernetes generates up to 10 jobs. When I add ttlSecondsAfterFinished: 1, it removes the container after 1 second, but the other parameters are completely ignored.
So I wonder if I need to enable something in minikube, if these parameters are deprecated, or what else could be the reason it doesn't work. Any idea?
It seems it's a Kubernetes bug: https://github.com/kubernetes/kubernetes/issues/53331.
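Until that is fixed, the ttlSecondsAfterFinished field you already tried is the practical workaround. It belongs on the job template spec, so the relevant fragment would look roughly like this (a sketch; the 60-second value is chosen arbitrarily):
spec:
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 60  # finished Jobs (and their pods) are deleted a minute after completion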

restart policy in Kubernetes deployment

Is it possible to create a deployment on Kubernetes using a restart policy of on-failure?
I did some research but didn't find anything that enables a restart policy of on-failure for a Deployment.
restartPolicy for kind: Deployment supports Always only.
One of the requirements is "restart policy on failure max 3 times"
Try:
apiVersion: batch/v1
kind: Job
metadata:
  name: busybox
spec:
  backoffLimit: 3 # <-- max fail 3 times
  template:
    spec:
      restartPolicy: OnFailure # <-- You can do this with a Job
      containers:
      - name: busybox
        image: busybox
        command: ["ash","-c","sleep 15"] # <-- replace with an invalid command to see backoffLimit in action

suspend kubernetes cronjob on job failure to avoid subsequent job runs

Whenever a job run fails, I want to suspend the cronjob so that no further jobs are started. Is there any possible way?
k8s version: 1.10
You can configure it simply by using suspend: true:
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  suspend: true
  parallelism: 2
  completions: 10
  template:
    spec:
      containers:
      - name: my-container
        image: busybox
        command: ["sleep", "5"]
      restartPolicy: Never
Any currently running jobs will complete but future jobs will be suspended.
Read more at : https://kubernetes.io/blog/2021/04/12/introducing-suspended-jobs/
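The CronJob spec has the same suspend field, so to stop further runs after a failure you could patch the CronJob itself from whatever detects the failure (illustrative command; the CronJob name hello is assumed):
kubectl patch cronjob hello -p '{"spec":{"suspend":true}}'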
If you are on an older version, you can use backoffLimit: 1:
apiVersion: batch/v1
kind: Job
metadata:
  name: error
spec:
  backoffLimit: 1
  template:
.spec.backoffLimit limits the number of times a pod is restarted when running inside a job.
However, if you can't suspend it, you can still make sure the job won't get re-run by using:
backoffLimit means the number of times it will retry before it is
considered failed. The default is 6.
concurrencyPolicy: Forbid means at most one job runs at a time; a new run is not started while a previous one is still active.
restartPolicy: Never means it won't restart on failure.
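Putting those three settings together in a CronJob might look roughly like this (a minimal sketch, not the asker's actual manifest; the name, image, and schedule are made up):
apiVersion: batch/v1beta1  # batch/v1 on Kubernetes 1.21+
kind: CronJob
metadata:
  name: my-cron
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid        # do not start a new run while one is still active
  jobTemplate:
    spec:
      backoffLimit: 1              # retry a failed pod at most once
      template:
        spec:
          restartPolicy: Never     # do not restart failed containers in place
          containers:
          - name: my-container
            image: busybox
            command: ["sleep", "5"]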

run initContainer only once

I have a YAML manifest with parallelism: 2, including one initContainer. The command in the initContainer therefore runs twice and causes problems for the main command. How can I make it run only once?
Here are the important parts of the YAML:
kind: Job
apiVersion: batch/v1
metadata:
  name: bankruptcy
spec:
  parallelism: 2
  template:
    metadata:
      labels:
        app: bankruptcy
    spec:
      restartPolicy: Never
      containers:
      - name: bankruptcy
        image: "myimage"
        workingDir: /mount/
        command: ["bash","./sweep.sh"]
        resources:
          limits:
            nvidia.com/gpu: 1
      initContainers:
      - name: dev-init-sweep
        image: 'myimage'
        workingDir: /mount/
        command: ['/bin/bash']
        args:
        - '--login'
        - '-c'
        - 'wandb sweep ./sweep.yaml 2>&1 | tee ./wandb/sweep-output.txt; echo `expr "$(cat ./wandb/sweep-output.txt)" : ".*\(wandb agent.*\)"` > ./sweep.sh;'
An initContainer runs once per Pod.
You can't make the initContainer run only once for a given number of pods. But you could implement a guard as part of your initContainer that detects that another one has already started and either returns without doing its own work or waits until a condition is met.
You have to implement it yourself, though; there is no support from Kubernetes for this.
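A rough sketch of such a guard for the initContainer's args, assuming /mount/ is storage shared between both pods (the lock-directory name .sweep-lock is made up for illustration): whichever init container wins the mkdir creates the sweep, the other waits for sweep.sh to appear:
args:
- '--login'
- '-c'
- |
  if mkdir /mount/.sweep-lock 2>/dev/null; then
    # we grabbed the lock: create the sweep and write the agent command to sweep.sh
    wandb sweep ./sweep.yaml 2>&1 | tee ./wandb/sweep-output.txt
    echo `expr "$(cat ./wandb/sweep-output.txt)" : ".*\(wandb agent.*\)"` > ./sweep.sh
  else
    # another pod is creating the sweep: wait until sweep.sh has been written
    until [ -s ./sweep.sh ]; do sleep 1; done
  fi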

How to set minimum-container-ttl-duration in yml

I'm trying to set the minimum-container-ttl-duration property on a Kubernetes CronJob. I see a bunch of properties like this that appear to be configurable, but the documentation doesn't appear to show where, in the yml file, they can actually be set.
In this example yml, where would I put this property?
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
minimum-container-ttl-duration is not a property on CronJob but is a Node-level property set via a command line parameter: kubelet ... --minimum-container-ttl-duration=x.
https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/#user-configuration:
minimum-container-ttl-duration, minimum age for a finished container before it is garbage collected. Default is 0 minute, which means every finished container will be garbage collected.
The usage of this flag is deprecated.