Why does a Kubernetes CronJob pause?

I have a CronJob that is defined by this manifest:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: trigger
spec:
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 5
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      activeDeadlineSeconds: 50
      backoffLimit: 1
      parallelism: 1
      template:
        spec:
          containers:
            - env:
                - name: ApiKey
                  valueFrom:
                    secretKeyRef:
                      key: apiKey
                      name: something
              name: trigger
              image: curlimages/curl:7.71.1
              args:
                - -H
                - "Content-Type: application/json"
                - -H
                - "Authorization: $(ApiKey)"
                - -d
                - '{}'
                - http://url
          restartPolicy: Never
It sort of works, but not 100%. For some reason it runs 10 jobs, then pauses for 5-10 minutes or so, and then runs 10 new jobs. No errors are reported, but we don't understand why it pauses.
Any ideas on what might cause a CronJob in Kubernetes to pause?

The most common problem when running CronJobs on k8s is
spawning too many pods, which consume all cluster resources.
It is very important to set proper CronJob limitations, so try to set memory limits for the pods.
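For example, a minimal sketch of adding resource requests and limits to the pod template of the manifest above (the values below are placeholders, not recommendations):
jobTemplate:
  spec:
    template:
      spec:
        containers:
          - name: trigger
            image: curlimages/curl:7.71.1
            resources:
              requests:
                memory: "64Mi"   # placeholder value
                cpu: "50m"       # placeholder value
              limits:
                memory: "128Mi"  # placeholder value
                cpu: "100m"      # placeholder value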
Also, speaking about concurrencyPolicy: you set the concurrencyPolicy param to Forbid, which means the cron job does not allow concurrent runs; if it is time for a new job run and the previous job run hasn't finished yet, the cron job skips the new job run.
The .spec.concurrencyPolicy field is optional. It specifies how to treat concurrent executions of a job that is created by this cron job. The following concurrency policies are allowed:
Allow (default): The cron job allows concurrently running jobs
Forbid: explained above
Replace: If it is time for a new job run and the previous job run hasn't finished yet, the cron job replaces the currently running job run with a new job run
Try changing the policy to Allow or Replace according to your needs, as in the sketch below.
Speaking about a non-parallel Job, you can leave .spec.parallelism unset; when it is unset, it defaults to 1.
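A minimal sketch of the relevant fields, assuming Replace is the behavior you want (only the affected parts of the manifest above are shown):
spec:
  concurrencyPolicy: Replace   # or Allow; Forbid skips a run while the previous one is still active
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      activeDeadlineSeconds: 50
      # parallelism omitted: for a non-parallel Job it defaults to 1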
Take a look: cron-jobs-running-for-one-cron-execution-point-in-kubernetes, cron-job-limitations, cron-jobs.

Related

kubernetes cronjob unexpected scheduling behavior

I'm using a Kubernetes 1.21 CronJob to schedule a few jobs to run at a certain time every day.
I scheduled a job to run at 4pm via kubectl apply -f <name of yaml file>. Subsequently, I updated the yaml to schedule: "0 22 * * *" to trigger the job at 10pm, using the same command kubectl apply -f <name of yaml file>.
However, after applying the configuration at around 1pm, the job still triggers at 4pm (shouldn't have happened), and then triggers again at 10pm (intended trigger time).
Is there an explanation as to why this happens, and can I prevent it?
Sample yaml for the cronjob below:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: job-name-1
spec:
  schedule: "0 16 * * *" # 4pm
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - image: sample-image
              name: job-name-1
              args:
                - node
                - ./built/script.js
              env:
                - name: NODE_OPTIONS
                  value: "--max-old-space-size=5000"
          restartPolicy: Never
          nodeSelector:
            app: cronjob
I'm expecting the job to only trigger at 10pm.
Deleting the cronjob and reapplying it seems to eliminate such issues, but there are scenarios where I cannot delete the job (because it's still running).
Since you used kubectl apply -f <name of yaml file> to schedule a second Job at 10pm, it scheduled a new Job but did not replace the existing one, so the Job at 4pm was still scheduled and it ran.
Instead, you need to use the command below to replace the Job with another scheduled Job:
kubectl patch cronjob my-cronjob -p '{"spec":{"schedule": "0 22 * * *"}}'
This will run the Job only at 10pm.
In order to delete the running job, use the process below.
Run in the console:
crontab -e
The crontab then opens in an editor; simply delete the line there, save the file and quit the editor - that's it.
If you are running as the root user, use the command below and proceed as in the step above:
sudo crontab -e
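If the job you need to remove is the Kubernetes Job created by the CronJob itself (rather than an entry in a system crontab), a sketch of the kubectl equivalent, assuming a hypothetical generated Job name job-name-1-27890000:
kubectl get jobs                          # list Jobs created by the CronJob
kubectl delete job job-name-1-27890000   # hypothetical generated Job name; deleting the Job also cleans up its pods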

Can manually triggered cron jobs respect the concurrencyPolicy?

So I have a cron job like this:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-cron-job
spec:
  schedule: "0 0 31 2 *"
  failedJobsHistoryLimit: 3
  successfulJobsHistoryLimit: 1
  concurrencyPolicy: "Forbid"
  startingDeadlineSeconds: 30
  jobTemplate:
    spec:
      backoffLimit: 0
      activeDeadlineSeconds: 120
      ...
Then I trigger the job manually like so:
kubectl create job my-job --namespace precompile --from=cronjob/my-cron-job
But it seems like I can trigger the job as often as I want and the concurrencyPolicy: "Forbid" is ignored.
Is there a way so that manually triggered jobs will respect this or do I have to check this manually?
Note that concurrency policy only applies to the jobs created by the same cron job.
The concurrencyPolicy field only applies to jobs created by the same cron job, as stated in the documentation: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#concurrency-policy
When executing $ kubectl create job my-job --namespace precompile --from=cronjob/my-cron-job you are essentially creating a one-time job on its own that uses the spec.jobTemplate field as a reference to create it. Since concurrencyPolicy is a cronjob field, it is not even being evaluated.
TL;DR
This actually is the expected behavior. Manually created jobs are not affected by concurrencyPolicy. There is no flag you could pass to change this behavior.
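If you want to guard manual triggers yourself, one rough workaround (a sketch only, and subject to race conditions) is to check the cron job's .status.active field before creating the job:
# Create the manual job only if the cron job currently has no active jobs.
if [ -z "$(kubectl get cronjob my-cron-job --namespace precompile -o jsonpath='{.status.active}')" ]; then
  kubectl create job my-job --namespace precompile --from=cronjob/my-cron-job
else
  echo "my-cron-job already has an active job; skipping manual run"
fi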

How to run kubernetes cronjob immediately

I'm very new to Kubernetes. Here I tried a CronJob YAML in which the pods are created every 1 minute.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              args:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
But the pods are created only after 1 minute. Is it possible to run the job immediately and after that every 1 minute?
As already stated in the comments, a CronJob is backed by a Job. What you can do is literally launch CronJob and Job resources using the same spec at the same time. You can do that conveniently using a Helm chart or Kustomize.
Alternatively, you can place both manifests in the same file, or in two files in the same directory, and then use:
kubectl apply -f <file/dir>
With this workaround the initial Job is started and then, after some time, the CronJob.
The downside of this solution is that the first Job is standalone and not included in the CronJob's history. Another possible side effect is that the standalone Job and the first CronJob-created Job can run in parallel if the Job cannot finish its tasks fast enough; concurrencyPolicy does not take that standalone Job into consideration.
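A minimal sketch of such a combined manifest, reusing the hello example from the question (the Job name hello-now is made up for illustration):
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-now          # hypothetical name for the immediate one-off run
spec:
  template:
    spec:
      containers:
        - name: hello
          image: busybox
          args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
      restartPolicy: OnFailure
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              args:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure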
From the documentation:
A cron job creates a job object about once per execution time of its
schedule. We say "about" because there are certain circumstances where
two jobs might be created, or no job might be created. We attempt to
make these rare, but do not completely prevent them.
So if you want to keep the task execution stricter, it may be better to use a Bash wrapper script that sleeps between task executions, or design an app that forks subprocesses after a specified interval, build a container image and run it as a Deployment.
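For illustration, a rough sketch of that Deployment approach, assuming a simple shell loop stands in for the real application (the name hello-loop is made up):
# Sketch: a Deployment whose single pod re-runs the task itself every 60 seconds,
# instead of relying on the CronJob controller to create a Job each minute.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-loop          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-loop
  template:
    metadata:
      labels:
        app: hello-loop
    spec:
      containers:
        - name: hello-loop
          image: busybox
          command: ["/bin/sh", "-c"]
          args:
            - while true; do date; echo Hello from the Kubernetes cluster; sleep 60; done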

Avoid multiple cron jobs running for one cron execution point in Kubernetes

EDIT: The question is solved; it was my mistake, I simply used the wrong cron settings. I assumed "* 2 * * *" would only run once per day at 2, but in fact it runs every minute during hour 2 (a once-per-day run at 2 would be "0 2 * * *"). So Kubernetes behaves correctly.
I keep having multiple jobs running at one cron execution point, but seemingly only if those jobs have a very short runtime. Any idea why this happens and how I can prevent it? I use concurrencyPolicy: Forbid, backoffLimit: 0 and restartPolicy: Never.
Example for a cron job that is supposed to run once per day, but runs multiple times just after its scheduled run time:
job-1554346620 1/1 11s 4h42m
job-1554346680 1/1 11s 4h41m
job-1554346740 1/1 10s 4h40m
Relevant config:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: job
spec:
  schedule: "* 2 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: job
              image: job_image:latest
              command: ["rake", "run_job"]
          restartPolicy: Never
          imagePullSecrets:
            - name: regcred
      backoffLimit: 0
The most common problem when running CronJobs on k8s is:
spawning too many pods, which consume all cluster resources.
It is very important to set proper CronJob limitations.
If you are not sure what you need, just take this example as a template:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-first-conjob
  namespace: devenv-admitriev
spec:
  schedule: "*/10 * * * *" # MM HH DD MM WKD -- Minutes, Hour, Day, Month, Weekday (eg. Sun, Mon)
  successfulJobsHistoryLimit: 3 # how many completed jobs should be kept
  failedJobsHistoryLimit: 1 # how many failed jobs should be kept
  suspend: false # Here you can suspend the cronjob without deleting it
  concurrencyPolicy: Forbid # Choose Forbid if you don't want concurrent executions of your Job
  # The amount of time that Kubernetes can miss and still start a job.
  # If Kubernetes missed too many job starts (100)
  # then Kubernetes logs an error and doesn't start any future jobs.
  startingDeadlineSeconds: 300 # if a job hasn't started in this many seconds, skip
  jobTemplate:
    spec:
      parallelism: 1 # How many pods will be instantiated at once.
      completions: 1 # How many pods of the job must finish successfully, one after the other (sequentially).
      backoffLimit: 3 # Maximum pod restarts in case of failure
      activeDeadlineSeconds: 1800 # Limit the time for which a Job can continue to run
      template:
        spec:
          restartPolicy: Never # If you want restarts, use OnFailure
          terminationGracePeriodSeconds: 30
          containers:
            - name: my-first-conjob
              image: busybox
              command:
                - /bin/sh
              args:
                - -c
                - date; echo sleeping....; sleep 90s; echo exiting...;
              resources:
                requests:
                  memory: '128Mi'
                limits:
                  memory: '1Gi'
Hi, it's not clear what you expected - looking into the question, if I understand correctly you mean not running all cron jobs at the same time:
1. First option: change their schedule times.
2. Second option: try to use other options in your spec template, like Parallel Jobs, described here: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
"For a work queue Job, you must leave .spec.completions unset, and set .spec.parallelism to a non-negative integer"
jobTemplate:
  spec:
    parallelism: 1
    template:
To recreate this issue, please provide more details.
In addition, regarding "Jobs History":
by default, successfulJobsHistoryLimit and failedJobsHistoryLimit are set to 3 and 1 respectively.
Please take a look at: https://kubernetes.io/docs/tasks/job/
If you are interested, you can set up the limits in the "spec" section:
successfulJobsHistoryLimit: 1
failedJobsHistoryLimit: 1
Hope this helps.

How to fail a (cron) job after a certain number of retries?

We have a Kubernetes cluster of web scraping cron jobs set up. All seems to go well until a cron job starts to fail (e.g., when a site structure changes and our scraper no longer works). It looks like every now and then a few failing cron jobs will continue to retry to the point it brings down our cluster. Running kubectl get cronjobs (prior to a cluster failure) will show too many jobs running for a failing job.
I've attempted following the note described here regarding a known issue with the pod backoff failure policy; however, that does not seem to work.
Here is our config for reference:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: scrape-al
spec:
  schedule: '*/15 * * * *'
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 0
  successfulJobsHistoryLimit: 0
  jobTemplate:
    metadata:
      labels:
        app: scrape
        scrape: al
    spec:
      template:
        spec:
          containers:
            - name: scrape-al
              image: 'govhawk/openstates:1.3.1-beta'
              command:
                - /opt/openstates/openstates/pupa-scrape.sh
              args:
                - al bills --scrape
          restartPolicy: Never
      backoffLimit: 3
Ideally we would prefer that a cron job would be terminated after N retries (e.g., something like kubectl delete cronjob my-cron-job after my-cron-job has failed 5 times). Any ideas or suggestions would be much appreciated. Thanks!
You can tell your Job to stop retrying using backoffLimit.
Specifies the number of retries before marking this job failed.
In your case
spec:
  template:
    spec:
      containers:
        - name: scrape-al
          image: 'govhawk/openstates:1.3.1-beta'
          command:
            - /opt/openstates/openstates/pupa-scrape.sh
          args:
            - al bills --scrape
      restartPolicy: Never
  backoffLimit: 3
You set 3 as the backoffLimit of your Job. That means when a Job is created by the CronJob, it will retry 3 times if it fails. This controls the Job, not the CronJob.
When a Job fails, another Job will still be created at the next scheduled time.
You want:
If I am not wrong, you want to stop scheduling new Jobs when your scheduled Jobs have failed 5 times. Right?
Answer:
In that case, this is not possible automatically.
Possible solution:
You need to suspend the CronJob so that it stops scheduling new Jobs:
suspend: true
You can do this manually. If you do not want to do it manually, you need to set up a watcher that watches your CronJob status and updates the CronJob to suspended if necessary.
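For illustration, a sketch of both options for the scrape-al CronJob above; the failure threshold of 5 and the polling loop are assumptions, and counting failed Jobs this way only works if failedJobsHistoryLimit is raised above 0 so that failed Jobs are kept:
# Manual suspension:
kubectl patch cronjob scrape-al -p '{"spec":{"suspend":true}}'

# Very rough watcher sketch: suspend the CronJob once 5 or more Jobs report failed pods.
while true; do
  failed=$(kubectl get jobs -l app=scrape,scrape=al -o jsonpath='{.items[*].status.failed}' | wc -w)
  if [ "$failed" -ge 5 ]; then
    kubectl patch cronjob scrape-al -p '{"spec":{"suspend":true}}'
    break
  fi
  sleep 60
done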