Predictable pod name in kubernetes cron job

I have a cron job on kubernetes that I trigger like so for testing purposes:
kubectl create -f src/cronjob.yaml
kubectl create job --from=cronjob/analysis analysis-test
This creates a pod named analysis-test-<random-string>. I was wondering if it's possible to omit the suffix or make it predictable?
Filtered cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: analysis
  labels:
    job: analysis
spec:
  schedule: "0 0 * * 0"
  concurrencyPolicy: "Forbid"
  suspend: true
  failedJobsHistoryLimit: 3
  successfulJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: container-name
              image: myimage
              env:
                - name: ENVIRONMENT
                  value: "DEV"
              imagePullPolicy: IfNotPresent
              command: [ "/bin/bash", "-c", "--" ]
              args: [ "while true; do sleep 30; done;" ]

As of batch/v1beta1, no, you can't. Here's the doc regarding CronJob:
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Here's an excerpt from the docs:
When creating the manifest for a CronJob resource, make sure the name you provide is a valid DNS subdomain name. The name must be no longer than 52 characters. This is because the CronJob controller will automatically append 11 characters to the job name provided and there is a constraint that the maximum length of a Job name is no more than 63 characters.
Also here's a reference page to CronJob v1beta1 spec to view the available options config:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#cronjobspec-v1beta1-batch
Digging through the source code a little bit, you can see how the CronJob controller creates the Job resource:
https://github.com/kubernetes/kubernetes/blob/v1.20.1/pkg/controller/cronjob/cronjob_controller.go#L327
https://github.com/kubernetes/kubernetes/blob/v1.20.1/pkg/controller/cronjob/utils.go#L219
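While the suffix on the pod itself can't be controlled, the Job controller labels every pod it creates with job-name=<job-name>, so the pod can still be looked up deterministically. A small sketch, using the manually created job from the question:

kubectl get pods --selector=job-name=analysis-test \
  --output=jsonpath='{.items[0].metadata.name}'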

Related

How to do log rotation for NebulaGraph k8s Operator deployed cluster?

When deploying NebulaGraph from binary packages (RPM/DEB), I could leverage logrotate from the OS, which is the basic expectation/solution for cleaning up the generated logs.
In a K8s deployment there is no such layer at the OS level anymore, so what is the state-of-the-art thing to do here? Or is it a missing piece in Nebula-Operator?
I think we could attach the log dir to a pod running logrotate, too, but that looks inelegant to me (or am I wrong?).
After some study, I think the best way could be to leverage what the K8s CronJob API provides.
We could create it like:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: log-cleanup
spec:
  schedule: "0 0 * * *" # run the job every day at midnight
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: log-cleanup
              image: your-log-cleanup-image:latest
              command: ["/bin/sh", "-c", "./cleanup.sh /path/to/log"]
          restartPolicy: OnFailure
And in cleanup.sh we could put either the log-removal logic or the log-archiving logic (say, moving the logs to S3).
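A minimal sketch of what cleanup.sh could look like for the removal case (the file pattern and the 7-day retention are assumptions, not part of the original answer):

#!/bin/sh
# cleanup.sh <log-dir>: delete rotated log files older than 7 days
LOG_DIR="${1:?usage: cleanup.sh <log-dir>}"
find "$LOG_DIR" -type f -name '*.log.*' -mtime +7 -print -exec rm -f {} \;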

Kubernetes doesn't remove completed jobs for a Cronjob

Kubernetes doesn't delete a manually created completed job when a history limit is set, when using newer versions of the Kubernetes client.
mycron.yaml:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: myjob
spec:
  schedule: "* * 10 * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Creating cronjob:
kubectl create -f mycron.yaml
Creating job manually:
kubectl create job -n myjob --from=cronjob/hello hello-job
Result:
Job is completed but not removed
NAME        COMPLETIONS   DURATION   AGE
hello-job   1/1           2s         6m
Tested with Kubernetes server and client versions 1.19.3 and 1.20.0.
However, when I used an older client version (1.15.5) against the 1.19/1.20 servers, it worked well.
Comparing the differences while using different client versions:
kubernetes-controller log:
Using client v1.15.5 I have this line in the log (but it is missing when using client v1.19/1.20):
1 event.go:291] "Event occurred" object="myjob/hello" kind="CronJob" apiVersion="batch/v1beta1" type="Normal" reason="SuccessfulDelete" message="Deleted job hello-job"
Job yaml:
Exactly the same, except the ownerReference part:
For client v1.19/1.20
ownerReferences:
  - apiVersion: batch/v1beta1
    kind: CronJob
    name: hello
    uid: bb567067-3bd4-4e5f-9ca2-071010013727
For client v1.15
ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: CronJob
    name: hello
    uid: bb567067-3bd4-4e5f-9ca2-071010013727
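For reference, the ownerReferences of a job can be inspected directly, e.g.:

kubectl get job hello-job -n myjob -o jsonpath='{.metadata.ownerReferences}'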
And that is it. No other information in the logs, no errors, no warnings, nothing (I checked all the pod logs in kube-system).
Summary:
It seems to be a bug in the kubectl client itself, not in the Kubernetes server, but I don't know how to proceed further.
edit:
When I let the cronjob itself do the job (i.e. when the schedule expression fires), it removes the completed job successfully.
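Since the only visible difference is the two missing fields, one experiment (just a sketch, not a confirmed fix) would be to patch them onto the manually created job and check whether the controller then cleans it up:

kubectl patch job hello-job -n myjob --type=json -p '[
  {"op": "add", "path": "/metadata/ownerReferences/0/controller", "value": true},
  {"op": "add", "path": "/metadata/ownerReferences/0/blockOwnerDeletion", "value": true}
]'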

Want to parameterize cronjob schedule on Kubernetes

I have a YAML manifest, and I want to parameterize the schedule of this Kubernetes CronJob. In my environment ConfigMap I declared JobFrequencyInMinutes: "10".
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: scheduled-mongo-cronjob
spec:
  schedule: "*/$(JobFrequencyInMinutes) * * * *"
  concurrencyPolicy: "Forbid"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: scheduled-mongo-cronjob
              image: xxxx
              env:
                - name: JobFrequencyInMinutes
                  valueFrom:
                    configMapKeyRef:
                      key: JobFrequencyInMinutes
                      name: env-conf
When I apply the above YAML I get an error:
The CronJob "scheduled-mongo-cronjob" is invalid: spec.schedule: Invalid value: "*/$(JobFrequencyInMinutes) * * * *": Failed to parse int from $(JobFrequencyInMinutes): strconv.Atoi: parsing "$(JobFrequencyInMinutes)": invalid syntax
Please guide me if there is any alternative way to achieve this.
The issue here is that the environment variable only becomes available inside the job's containers once they run; nothing substitutes $(JobFrequencyInMinutes) in spec.schedule, so the API server tries to parse it literally and rejects the manifest.
I would say that to achieve what you are trying to do, you would need the variable in the shell where you run kubectl. Whenever you want to update your schedule, you would set a new value and then re-create your CronJob.
Since the declarative way (via your YAML) does not work here, you would need to create it the imperative way:
kubectl run scheduled-mongo-cronjob --schedule="*/$JobFrequencyInMinutes * * * *" --restart=OnFailure --image=xxxx
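An alternative workaround (my suggestion, not part of the original answer) is to keep the manifest declarative but substitute the value client-side before applying it, for example with envsubst from gettext. The template file name here is hypothetical, and its schedule line would use shell syntax, i.e. schedule: "*/${JobFrequencyInMinutes} * * * *":

export JobFrequencyInMinutes=10
envsubst '${JobFrequencyInMinutes}' < scheduled-mongo-cronjob.tpl.yaml | kubectl apply -f -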

Is it possible to schedule a CronJob to execute on each of the Kubernetes nodes?

What I would like to do is run some backup scripts on each of the Kubernetes nodes periodically. I want it to run inside the Kubernetes cluster, in contrast to just adding the script to each node's crontab, because the backup will be stored on a volume mounted to the node by Kubernetes. It differs per configuration, but it could be a CIFS filesystem mounted by a Flex plugin, or awsElasticBlockStore.
It would be perfect if CronJob could template a DaemonSet (instead of fixing it as jobTemplate) and if it were possible to set the DaemonSet restart policy to OnFailure.
I would like to avoid defining n different CronJobs, one for each of the n nodes, and then associating them together via nodeSelectors, since that is not so convenient to maintain in an environment where the node count changes dynamically.
From what I can see, the problem was discussed here without any clear conclusion: https://github.com/kubernetes/kubernetes/issues/36601
Maybe you have some hacks or tricks to achieve this?
You can use a DaemonSet with the following bash script (note that $? must be checked right after the backup command, not after sleep, and that the loop has to skip past the window once the backup has run):

while :; do
  currenttime=$(date +%H:%M)
  if [[ "$currenttime" > "23:00" ]] && [[ "$currenttime" < "23:05" ]]; then
    do_something
    test "$?" -gt 0 && notify_failed_job
    sleep 300   # move past the window so the backup runs only once per night
  else
    sleep 60
  fi
done
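To run that loop on every node, it can be wrapped in a DaemonSet along these lines (a sketch: the image, the script path, and the backup-scripts ConfigMap are assumptions; the script needs bash for the [[ ]] tests):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nightly-backup
spec:
  selector:
    matchLabels:
      app: nightly-backup
  template:
    metadata:
      labels:
        app: nightly-backup
    spec:
      containers:
        - name: backup
          image: bash:5 # assumed image providing bash
          command: ["bash", "/scripts/backup-loop.sh"]
          volumeMounts:
            - name: scripts
              mountPath: /scripts
      volumes:
        - name: scripts
          configMap:
            name: backup-scripts # hypothetical ConfigMap holding the loop script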
I know I am late to the party.
First option:
Use parallelism to run multiple job pods, with topologySpreadConstraints to spread/schedule the pods across all the nodes:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mycronjob
  labels:
    jobgroup: parallel
spec:
  schedule: "*/5 * * * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 0
  concurrencyPolicy: Allow
  jobTemplate:
    spec:
      parallelism: 5
      template:
        metadata:
          name: kubejob
          labels:
            jobgroup: parallel
        spec:
          topologySpreadConstraints:
            - maxSkew: 2
              topologyKey: kubernetes.io/hostname
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  jobgroup: parallel
          containers:
            - name: mycron-container
              image: alpine
              imagePullPolicy: IfNotPresent
              command: ['sh', '-c', 'echo Job Pod is Running ; sleep 10']
          restartPolicy: OnFailure
          terminationGracePeriodSeconds: 0
Option two:
Using a CronJob, you can apply the YAML of a DaemonSet and delete it after a certain duration, which effectively works as a job on all nodes; see the sketch after this paragraph. Also, if a custom Docker image runs inside the DaemonSet, it can simply stop working once done with its execution.
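A rough sketch of option two (the bitnami/kubectl image, the daemonset-runner ServiceAccount with RBAC to create/delete DaemonSets, the sleep duration, and the manifest path are all assumptions):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: daemonset-runner
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: daemonset-runner # needs RBAC to create/delete DaemonSets
          containers:
            - name: runner
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  kubectl apply -f /manifests/backup-daemonset.yaml
                  sleep 600 # give the DaemonSet pods time to finish
                  kubectl delete -f /manifests/backup-daemonset.yaml
          restartPolicy: OnFailure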
Extra:
I would suggest checking out this CRD as well: https://github.com/AmitKumarDas/metac/tree/master/examples/daemonjob
Read more about CRDs here: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
You can write your own custom resource and add it to Kubernetes.

Kubernetes - how to run job only once

I have a job definition based on the example from the Kubernetes website.
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout-6
spec:
  activeDeadlineSeconds: 30
  completions: 1
  parallelism: 1
  template:
    metadata:
      name: pi
    spec:
      containers:
        - name: pi
          image: perl
          command: ["sh", "-c", "exit 1"] # exit is a shell builtin, so it must run via a shell
      restartPolicy: Never
I would like to run this job once and not restart it if it fails. With command exit 1, Kubernetes keeps creating new pods trying to get exit code 0 until it reaches the activeDeadlineSeconds timeout. How can I avoid that? I would like to run build commands in Kubernetes to check compilation, and if compilation fails I'll get an exit code different from 0. I don't want the compilation to run again.
Is it possible? How?
By now this is possible by setting backoffLimit: 0, which tells the controller to do 0 retries; the default is 6.
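Applied to the job from the question, the whole manifest would look like this sketch:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout-6
spec:
  backoffLimit: 0 # no retries; the first failed pod marks the job as failed
  activeDeadlineSeconds: 30
  completions: 1
  parallelism: 1
  template:
    spec:
      containers:
        - name: pi
          image: perl
          command: ["sh", "-c", "exit 1"]
      restartPolicy: Never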
If you want a one-try command runner, you probably should create a bare pod, because a job will try to execute the command until it succeeds or the active deadline is met.
Just create the pod from your template:
apiVersion: v1
kind: Pod
metadata:
  name: pi
spec:
  containers:
    - name: pi
      image: perl
      command: ["sh", "-c", "exit 1"]
  restartPolicy: Never
Sadly, there is currently no way to prevent the job controller from just respawning new pods when they fail, but the Kubernetes community is working on a solution; see:
"Backoff policy and failed pod limit" https://github.com/kubernetes/community/pull/583