Kubernetes doesn't remove completed jobs for a CronJob

Kubernetes does not delete a manually created, completed Job even though a history limit is set, when the Job is created with newer versions of the Kubernetes client.
mycron.yaml:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: myjob
spec:
  schedule: "* * 10 * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Creating cronjob:
kubectl create -f mycron.yaml
Creating job manually:
kubectl create job -n myjob --from=cronjob/hello hello-job
Result:
Job is completed but not removed
NAME        COMPLETIONS   DURATION   AGE
hello-job   1/1           2s         6m
Tested with Kubernetes server and client versions 1.19.3 and 1.20.0.
However, when I used an older client version (1.15.5) against the same 1.19/1.20 servers, it worked as expected.
Comparing the differences when using different client versions:
kube-controller-manager log:
Using client v1.15.5 I get this line in the log (it is missing when using client v1.19/1.20):
1 event.go:291] "Event occurred" object="myjob/hello" kind="CronJob" apiVersion="batch/v1beta1" type="Normal" reason="SuccessfulDelete" message="Deleted job hello-job"
Job yaml:
Exactly the same, except for the ownerReferences section:
For client v1.19/1.20
ownerReferences:
- apiVersion: batch/v1beta1
  kind: CronJob
  name: hello
  uid: bb567067-3bd4-4e5f-9ca2-071010013727
For client v1.15
ownerReferences:
- apiVersion: apps/v1
  blockOwnerDeletion: true
  controller: true
  kind: CronJob
  name: hello
  uid: bb567067-3bd4-4e5f-9ca2-071010013727
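To reproduce this comparison, the ownerReferences that a given client version attaches can be inspected directly on the manually created Job; a quick check along these lines (using the names from the example above) should show the difference:
# inspect the ownerReferences set on the manually created Job
kubectl -n myjob get job hello-job -o jsonpath='{.metadata.ownerReferences}'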
And that is it. No other information in the logs, no errors, no warnings, nothing (I checked the logs of all pods in kube-system).
Summary:
It seems to be a bug in the kubectl client itself rather than in the Kubernetes server, but I don't know how to proceed further.
edit:
When I let the CronJob create the Job itself (i.e. when the scheduled time in the cron expression is hit), the completed Job is removed successfully.
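Until the root cause is found, a manual workaround is simply to delete the completed Job by hand (a sketch, using the names from the example above):
# remove the manually triggered Job (and its pods) once it has completed
kubectl -n myjob delete job hello-job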

Related

Predictable pod name in kubernetes cron job

I have a cron job on kubernetes that I trigger like so for testing purposes:
kubectl create -f src/cronjob.yaml
kubectl create job --from=cronjob/analysis analysis-test
This creates a pod with the name analysis-test-<random-string>. I was wondering if it's possible to omit or make the suffix predictable?
Filtered cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: analysis
  labels:
    job: analysis
spec:
  schedule: "0 0 * * 0"
  concurrencyPolicy: "Forbid"
  suspend: true
  failedJobsHistoryLimit: 3
  successfulJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: container-name
            image: myimage
            env:
            - name: ENVIRONMENT
              value: "DEV"
            imagePullPolicy: IfNotPresent
            command: [ "/bin/bash", "-c", "--" ]
            args: [ "while true; do sleep 30; done;" ]
As of v1beta1, no, you can't. Here is the documentation regarding CronJob:
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Here's an excerpt from the docs:
When creating the manifest for a CronJob resource, make sure the name you provide is a valid DNS subdomain name. The name must be no longer than 52 characters. This is because the CronJob controller will automatically append 11 characters to the job name provided and there is a constraint that the maximum length of a Job name is no more than 63 characters.
Also here's a reference page to CronJob v1beta1 spec to view the available options config:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#cronjobspec-v1beta1-batch
Digging through the source code a little, you can see how the CronJob controller creates the Job resource:
https://github.com/kubernetes/kubernetes/blob/v1.20.1/pkg/controller/cronjob/cronjob_controller.go#L327
https://github.com/kubernetes/kubernetes/blob/v1.20.1/pkg/controller/cronjob/utils.go#L219
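If the goal is just to locate the pod that the manually triggered Job created, one workaround is to query by the job-name label that the Job controller adds to its pods instead of relying on a predictable pod name (a sketch, assuming the Job was created as analysis-test as above):
# list the pod(s) belonging to the manually created Job, regardless of the random suffix
kubectl get pods -l job-name=analysis-test
# read their logs without knowing the exact pod name
kubectl logs -l job-name=analysis-test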

Check logs for a Kubernetes resource CronJob

I created a CronJob resource in Kubernetes.
I want to check the logs to validate that my crons are run, but I am not able to find a way to do that. I have gone through the kubectl commands, but it looks like they are all for the Pod resource type.
Also tried following
$ kubectl logs cronjob/<resource_name>
error: cannot get the logs from *v1beta1.CronJob: selector for *v1beta1.CronJob not implemented
Questions:
How can I check the logs of the CronJob resource type?
If I want this resource to be in a specific namespace, how do I do that?
You need to check the logs of the pods which are created by the CronJob. The pods will be in Completed state, but you can still check their logs.
# here you can get the pod_name from the stdout of the cmd `kubectl get pods`
$ kubectl logs -f -n default <pod_name>
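If you don't want to look up the pod name first, the pods created by a Job can also be selected via the job-name label that the Job controller adds (a sketch; substitute your own job name):
# logs of every pod that belongs to the given Job
kubectl logs -n default -l job-name=<job_name>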
For creating a CronJob in a namespace, just add the namespace to the metadata section. The pods will be created in that namespace.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: default
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Ideally you should be sending the logs to a log aggregator system such as ELK or Splunk.
In case you create a Job from the CronJob, it works like this:
kubectl -n "namespace" logs jobs.batch/<resource_name> --tail 4

Kubernetes Job is not getting terminated even after specifying "activeDeadlineSeconds"

My yaml file
apiVersion: batch/v1
kind: Job
metadata:
  name: auto
  labels:
    app: auto
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    metadata:
      labels:
        app: auto
    spec:
      containers:
      - name: auto
        image: busybox
        imagePullPolicy: Always
        ports:
        - containerPort: 9080
      imagePullSecrets:
      - name: imageregistery
      restartPolicy: Never
The pods are killed appropriately, but the Job itself is not terminated after 100 seconds.
Is there anything we can do to terminate the Job once the container/pod has completed its work?
kubectl version --short
Client Version: v1.6.1
Server Version: v1.13.10+IKS
kubectl get jobs --namespace abc
NAME   DESIRED   SUCCESSFUL   AGE
auto   1         1            26m
Thank you,
The default way to delete Jobs after they are done is to use the kubectl delete command.
As mentioned by Erez:
Kubernetes is keeping pods around so you can get the
logs, configuration etc. from them.
If you don't want to do that manually, you could write a script running in your cluster that checks for Jobs with Completed status and then deletes them.
Another way would be to use the TTL feature, which deletes Jobs automatically a specified number of seconds after they finish. If you set it to zero, they will be cleaned up immediately. For more details on how to set it up, look here.
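As a sketch of that approach (assuming a cluster where the ttlSecondsAfterFinished field / TTLAfterFinished feature is available), the TTL is set directly in the Job spec:
apiVersion: batch/v1
kind: Job
metadata:
  name: auto
spec:
  ttlSecondsAfterFinished: 100   # assumption: delete the Job (and its pods) 100s after it finishes
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: auto
        image: busybox
      restartPolicy: Never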
Please let me know if that helped.

Kubernetes 1.7.7: how to enable Cron Jobs?

I've added —-runtime-config=batch/v2alpha1=true to the kube-apiserver config like so:
... other stuff
command:
- "/hyperkube"
- "apiserver"
- "--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota"
- "--address=0.0.0.0"
- "--allow-privileged"
- "--insecure-port=8080"
- "--secure-port=443"
- "--cloud-provider=azure"
- "--cloud-config=/etc/kubernetes/azure.json"
- "--service-cluster-ip-range=10.0.0.0/16"
- "--etcd-servers=http://127.0.0.1:2379"
- "--etcd-quorum-read=true"
- "--advertise-address=10.240.255.15"
- "--tls-cert-file=/etc/kubernetes/certs/apiserver.crt"
- "--tls-private-key-file=/etc/kubernetes/certs/apiserver.key"
- "--client-ca-file=/etc/kubernetes/certs/ca.crt"
- "--service-account-key-file=/etc/kubernetes/certs/apiserver.key"
- "--storage-backend=etcd2"
- "--v=4"
- "—-runtime-config=batch/v2alpha1=true"
... etc
but after restarting the master, kubectl api-versions still shows only batch/v1; no v2alpha1 to be seen.
$ kubectl api-versions
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1beta1
apps/v1beta1
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
batch/v1
certificates.k8s.io/v1beta1
extensions/v1beta1
networking.k8s.io/v1
policy/v1beta1
rbac.authorization.k8s.io/v1alpha1
rbac.authorization.k8s.io/v1beta1
settings.k8s.io/v1alpha1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
Here's my job definition:
kind: CronJob
apiVersion: batch/v2alpha1
metadata:
  name: mongo-backup
spec:
  schedule: "* */1 * * *"
  jobTemplate:
    spec:
      ... etc
And the error I get when I try to create the job:
$ kubectl create -f backup-job.yaml
error: error validating "backup-job.yaml": error validating data: unknown object type schema.GroupVersionKind{Group:"batch", Version:"v2alpha1", Kind:"CronJob"}; if you choose to ignore these errors, turn validation off with --validate=false
$ kubectl create -f backup-job.yaml --validate=false
error: unable to recognize "backup-job.yaml": no matches for batch/, Kind=CronJob
What else do I need to do?
PS. this is on Azure ACS, I don't think it makes a difference though.
You may use the newer API version apiVersion: batch/v1beta1 here; that should fix the issue.
The Kubernetes v1.21 release notes state that:
The batch/v2alpha1 CronJob type definitions and clients are deprecated and removed. (#96987, #soltysh) [SIG API
Machinery, Apps, CLI and Testing]
In that version you should use apiVersion: batch/v1, see the example below:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Notice that:
CronJobs was promoted to general availability in Kubernetes v1.21. If
you are using an older version of Kubernetes, please refer to the
documentation for the version of Kubernetes that you are using, so
that you see accurate information. Older Kubernetes versions do not
support the batch/v1 CronJob API.
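To check which CronJob API versions your own cluster actually serves before picking an apiVersion, something like the following should work on reasonably recent clusters (a sketch; very old clusters such as 1.7 may not have kubectl api-resources):
# list the API versions of the batch group that the server serves
kubectl api-versions | grep batch
# list the resources in the batch group, including cronjobs
kubectl api-resources --api-group=batch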
There is an open issue for that on their GitHub:
https://github.com/kubernetes/kubernetes/issues/51939
I believe there is no other option but to wait; right now I'm actually stuck on the same issue.

Kubernetes - how to run job only once

I have a Job definition based on an example from the Kubernetes website.
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout-6
spec:
  activeDeadlineSeconds: 30
  completions: 1
  parallelism: 1
  template:
    metadata:
      name: pi
    spec:
      containers:
      - name: pi
        image: perl
        command: ["exit", "1"]
      restartPolicy: Never
I would like to run this job once and not restart it if it fails. With the command exit 1, Kubernetes keeps starting new pods trying to get exit code 0 until the activeDeadlineSeconds timeout is reached. How can I avoid that? I would like to run build commands in Kubernetes to check compilation, and if compilation fails I'll get an exit code different from 0. I don't want the compilation to run again.
Is it possible? How?
By now this is possible by setting backoffLimit: 0, which tells the controller to do 0 retries; the default is 6.
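Applied to the example above, a minimal sketch would combine backoffLimit: 0 with restartPolicy: Never (the perl -e invocation here is an assumption, just to produce a non-zero exit code):
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-once        # hypothetical name
spec:
  backoffLimit: 0               # no retries: the Job is marked failed after the first pod failure
  activeDeadlineSeconds: 30
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-e", "exit 1"]   # assumption: fail once with exit code 1
      restartPolicy: Never      # do not restart the container in place either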
If you want a one-try command runner, you should probably create a bare Pod, because the Job will try to execute the command until it succeeds or the active deadline is met.
Just create the pod from your template:
apiVersion: v1
kind: Pod
metadata:
  name: pi
spec:
  containers:
  - name: pi
    image: perl
    command: ["exit", "1"]
  restartPolicy: Never
Sadly there is currently no way to prevent the Job controller from just respawning new pods when they fail, but the Kubernetes community is working on a solution, see:
"Backoff policy and failed pod limit" https://github.com/kubernetes/community/pull/583