Kubernetes CronJob Only Runs Half the Time

I want a job to trigger every 15 minutes but it is consistently triggering every 30 minutes.
UPDATE:
I've simplified the problem by just running:
kubectl run hello --schedule="*/1 * * * *" --restart=OnFailure --image=busybox -- /bin/sh -c "date; echo Hello from the Kubernetes cluster"
As specified in the docs here: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/
and yet the job still refuses to run on time.
$ kubectl get cronjobs
NAME     SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello    */1 * * * *   False     1        5m              30m
hello2   */1 * * * *   False     1        5m              12m
It took 25 minutes for the command-line-created cronjob to run and 7 minutes for the cronjob created from YAML. They were both finally scheduled at the same time, so it's almost as if etcd finally woke up and did something?
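For what it's worth, the controller's scheduling bookkeeping can be inspected with standard kubectl commands (cronjob name from above):

kubectl describe cronjob hello
kubectl get events --sort-by=.metadata.creationTimestamp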
ORIGINAL ISSUE:
When I drill into an active job I see Status: Terminated: Completed, but an Age of 25 minutes or something else greater than 15.
In the logs I see that the Python script meant to run has completed its final print statement. The script takes about ~2 min to complete, based on its output file in S3. Then no new job is scheduled for 28 more minutes.
I have tried with different configurations:
Schedule: */15 * * * * AND Schedule: 0,15,30,45 * * * *
As well as
Concurrency Policy: Forbid AND Concurrency Policy: Replace
What else could be going wrong here?
Full config with identifying lines modified:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
    type: f-c
  name: f-c-p
  namespace: extract
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
          labels:
            type: f-c
        spec:
          containers:
          - args:
            - /f_c.sh
            image: identifier.amazonaws.com/extract_transform:latest
            imagePullPolicy: Always
            env:
            - name: ENV
              value: prod
            - name: SLACK_TOKEN
              valueFrom:
                secretKeyRef:
                  key: slack_token
                  name: api-tokens
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  key: aws_access_key_id
                  name: api-tokens
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  key: aws_secret_access_key
                  name: api-tokens
            - name: F_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  key: f_access_token
                  name: api-tokens
            name: s-f-c
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: '*/15 * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
status: {}

After running these jobs in a test cluster, I discovered that external circumstances prevented them from running as intended.
On the original cluster there were ~20k scheduled jobs. The built-in Kubernetes scheduler is not yet capable of handling this volume consistently.
The maximum number of jobs that can be reliably run (within a minute of the intended time) may depend on the size of your master nodes.
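A quick way to gauge how many CronJobs and spawned Jobs a cluster is juggling (standard kubectl; each count includes one header line):

kubectl get cronjobs --all-namespaces | wc -l
kubectl get jobs --all-namespaces | wc -l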

Isn't that by design?
A cron job creates a job object about once per execution time of its schedule. We say “about” because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.
Ref. https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#cron-job-limitations
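The same page also documents startingDeadlineSeconds, which bounds how late a run may still be started before it counts as missed. A fragment like this (the 200 is illustrative) makes that trade-off explicit:

spec:
  schedule: '*/15 * * * *'
  # If the controller cannot start the job within 200 seconds of its
  # scheduled time, that run is counted as missed and skipped.
  startingDeadlineSeconds: 200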

Related

Script in a pod is not getting executed

I have an EKS cluster and an RDS (MariaDB). I am trying to make a backup of given databases through a script in a CronJob. The CronJob object looks like this:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysqldump
  namespace: mysqldump
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: mysql-backup
            image: viejo/debian-mysqldump:latest
            envFrom:
            - configMapRef:
                name: mysqldump-config
            args:
            - /bin/bash
            - -c
            - /root/mysqldump.sh "(${MYSQLDUMP_DATABASES})" > /proc/1/fd/1 2>/proc/1/fd/2 || echo KO > /tmp/healthcheck
            resources:
              limits:
                cpu: "0.5"
                memory: "0.5Gi"
          restartPolicy: OnFailure
The script is called mysqldump.sh and gets all the necessary details from a ConfigMap object. It dumps the databases listed in the environment variable MYSQLDUMP_DATABASES and moves the dump to an S3 bucket.
Note: I am going to move some variables to a Secret, but first I need this to work.
What happens is NOTHING. The script never gets executed. I tried putting an "echo starting the backup" before the script and an "echo backup ended" after it, but I see neither of them. If I access the container and execute the exact same command manually, it works:
root@mysqldump-27550908-sjwfm:/# /root/mysqldump.sh "(${MYSQLDUMP_DATABASES})" > /proc/1/fd/1 2>/proc/1/fd/2 || echo KO > /tmp/healthcheck
root@mysqldump-27550908-sjwfm:/#
Can anyone point out a possible issue?
Try changing args to command. If the image defines its own ENTRYPOINT, anything in args is passed to that entrypoint as arguments rather than executed directly, while command replaces the entrypoint:
...
command:
- /bin/bash
- -c
- /root/mysqldump.sh "(${MYSQLDUMP_DATABASES})" > /proc/1/fd/1 2>/proc/1/fd/2 || echo KO > /tmp/healthcheck
...
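For comparison, a minimal sketch of that container spec with the shell moved into command and the pipeline left in args (image and script names taken from the question):

containers:
- name: mysql-backup
  image: viejo/debian-mysqldump:latest
  # `command` overrides the image's ENTRYPOINT; `args` are passed to it.
  command: ["/bin/bash", "-c"]
  args:
  - /root/mysqldump.sh "(${MYSQLDUMP_DATABASES})" > /proc/1/fd/1 2>/proc/1/fd/2 || echo KO > /tmp/healthcheck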

The active user count doesn't match when executing through 2 containers in Kubernetes

We are running JMX test plans through Taurus using 2 containers in Kubernetes.
We are seeing only 50 users in the results instead of 100 (50 * 2 containers).
Can anyone please throw some light on whether we are missing something here?
We get two .jtl files, and whether we check them individually or combined, the total user count is the same 50. Is it related to the same thread names being generated and logged in the .jtl file, or something else?
Here are the YAML details:
apiVersion: v1
kind: ConfigMap
metadata:
  name: joba
  namespace: AAA
data:
  protocol: "https"
  serverUrl: "testurl"
  users: "50"
  duration: "1m"
  nodeName: "Nodename"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: perftest
  namespace: dev
spec:
  template:
    spec:
      containers:
      - args: ["split -l ${users} --numeric-suffixes Test.csv Test-; /bin/bash ./Shellscripttoread_assignvariables.sh;"]
        command: ["/bin/bash", "-c"]
        env:
        - name: JobNumber
          value: "00"
        envFrom:
        - configMapRef:
            name: job-multi
        image: imagepath
        name: ubuntu-00
        resources:
          limits:
            memory: "8000Mi"
            cpu: "2880m"
      - args: ["split -l ${users} --numeric-suffixes Test.csv Test-; /bin/bash ./Shellscripttoread_assignvariables.sh;"]
        command: ["/bin/bash", "-c"]
        env:
        - name: JobNumber
          value: "01"
        envFrom:
        - configMapRef:
            name: job-multi
        image: imagepath
        name: ubuntu-01
        resources:
          limits:
            memory: "8000Mi"
            cpu: "2880m"
Your YAML is very nice, but it doesn't say anything about how you launch JMeter or what the shell scripts you invoke are doing.
If you just kick off 2 separate JMeter instances by means of k8s, JMeter will look at the number of active threads from the .jtl file, and given that the Sampler/Transaction names are the same, JMeter "thinks" the tests were executed on one engine.
The workaround is to add i.e. the __machineName() or __machineIP() function to sampler/transaction labels (e.g. a label like Login-${__machineName()}); this way JMeter will distinguish the results coming from different instances and you will see the real number of active threads.
The better solution would be running your JMeter test in Distributed Mode, so the master runs in one pod and the slaves in their own pods; the master is then responsible for transferring the .jmx script to the slaves and collecting results from them.

how to run a cronjob every 10 seconds in kubernetes?

"I just want to run a cronjob in Kubernetes in every 10 seconds. what would be the imperative command for that?"
You can't use the Kubernetes CronJob object to run anything more often than once per minute. You might be using the wrong tool for a process that has to run so often. https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Create an infinite loop in a Deployment (daemonize it)
You'll need a bash loop (or whatever programming language you like best: Go, Java, Python or Ruby) that runs forever and sleeps 10 seconds between executions, inside a Deployment. Here is an example with bash/sh:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cronjob-deployment
  labels:
    app: cronjob
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cronjob
  template:
    metadata:
      labels:
        app: cronjob
    spec:
      containers:
      - name: cronjob
        image: busybox
        args:
        - /bin/sh
        - -c
        - while true; do echo call ./script.sh here; sleep 10; done
Create 1 CronJob with several containers
If you still want to use CronJobs, you can do it with 6 containers inside the definition: one without delay, and the others with 10, 20, 30, 40 and 50 seconds of delay.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: no-delay
            image: busybox
            args:
            - /bin/sh
            - -c
            - echo call ./script.sh here
          - name: 10-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 10; echo call ./script.sh here
          - name: 20-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 20; echo call ./script.sh here
          - name: 30-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 30; echo call ./script.sh here
          - name: 40-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 40; echo call ./script.sh here
          - name: 50-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 50; echo call ./script.sh here
          restartPolicy: OnFailure
Of course, one of the problems you might encounter is that your processes overlap (run concurrently at the same time). This will depend on the number of seconds your process needs to run, and the time Kubernetes needs to schedule and create a container.
If your task needs to run that frequently, cron is the wrong tool.
Aside from the fact that it simply won't launch jobs that frequently, you also risk some serious problems if the job takes longer to run than the interval between launches. Rewrite your task to daemonize and run persistently, then launch it from cron if necessary (while making sure that it won't relaunch if it's already running).
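As a sketch of the "make sure it won't relaunch" guard, assuming flock(1) is available in the image and mytask.sh is a hypothetical stand-in for your task:

# Take a non-blocking lock; if a previous run still holds it, skip this one.
flock -n /tmp/mytask.lock ./mytask.sh || echo "previous run still active, skipping"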
You can write a script that executes 6 times at an interval of 10 seconds, and set the Kubernetes cron job to run every minute.
In that manner, your script starts running every minute and in turn executes the task every 10 seconds.
The script below runs the logic every 10 seconds, 6 times, each time the cron job fires after one minute.
It will print hello world every 10 seconds, 6 times:
#!/bin/bash -x
a=0
until [ $a -gt 5 ]
do
  echo "hello world"
  a=$((a + 1))
  sleep 10
done
cronjob sample :
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image:
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - ./sample.sh
          restartPolicy: OnFailure
So in that way your cron job executes every minute, which in turn starts your script, which runs the business logic every 10 seconds, 6 times.
This is the idea you can follow to make a cron job work in seconds, as Kubernetes does not support schedules below 1 minute.
With this approach you still need a strategy to keep executions from overlapping. For example, if your business logic takes 15 seconds to execute and you run it every 10 seconds, 6 times a minute, it should ideally run 4 times rather than 6 times in that minute. You need to tweak the repetition inside the script accordingly.
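A sketch of such a tweak, where the loop accounts for the task's own runtime (task.sh is a hypothetical stand-in for the business logic):

#!/bin/bash
# Run for roughly one minute, starting a new iteration every 10 seconds,
# but only sleep for whatever is left of the interval after the task ran.
end=$((SECONDS + 60))
while [ $SECONDS -lt $end ]; do
  start=$SECONDS
  ./task.sh                     # hypothetical business-logic script
  elapsed=$((SECONDS - start))
  [ $elapsed -lt 10 ] && sleep $((10 - elapsed))
done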

Handling cronjobs in a Pod with multiple containers

I have a requirement in which I need to create a CronJob in Kubernetes, but the Pod has multiple containers (with a single container it's working fine).
Is it possible?
The requirement is something like this:
1. First container: run the shell script to do a job.
2. Second container: run the fluentbit conf to parse the logs and send them.
Previously I thought to have a Deployment in place, and that was working fine, but since that Deployment was used only for 10-minute jobs I thought to make it a CronJob.
Any help is really appreciated.
Also, about the CronJob, I am not sure if a Pod can support multiple containers to do the same.
Thank you,
Sunny
Yes, you can create a CronJob with multiple containers. A CronJob is an abstraction on top of a Pod, so in the Pod spec you can have multiple containers just like in a normal Pod. As an example:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: default
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          - name: app
            image: alpine
            command:
            - echo
            - Hello World!
          restartPolicy: OnFailure
I need to agree with the answer provided by @Arghya Sadhu. It shows how you can run a multi-container Pod with a CronJob. Before that answer, I would like to give more attention to the comment provided by @Chris Stryczynski:
It's not clear whether the containers are run in parallel or sequentially
It is not entirely clear if the workload that you are trying to run:
The requirement is something like this:
First container: Run the shell script to do a job.
Second container: run fluentbit conf to parse the log and send it.
could be run in parallel (both running at the same time) or requires a sequential approach (after X completes successfully, run Y).
If the workloads can run in parallel, the answer provided by @Arghya Sadhu is correct; however, if one workload depends on another, I'd reckon you should use initContainers instead of a multi-container Pod.
An example of a CronJob that uses an initContainer could be the following:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: ubuntu
            image: ubuntu
            command: [/bin/bash]
            args: ["-c", "cat /data/hello_there.txt"]
            volumeMounts:
            - name: data-dir
              mountPath: /data
          initContainers:
          - name: echo
            image: busybox
            command: ["/bin/sh"]
            args: ["-c", "echo 'General Kenobi!' > /data/hello_there.txt"]
            volumeMounts:
            - name: data-dir
              mountPath: "/data"
          volumes:
          - name: data-dir
            emptyDir: {}
This CronJob writes a specific text to a file in an initContainer, and then the "main" container displays the result. It's worth mentioning that the main container will not start if the initContainer does not succeed:
$ kubectl logs hello-1234567890-abcde
General Kenobi!
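If the Pod hangs in the Init state instead, the init container's own output can be checked by name (pod name as above, container name from the spec):

kubectl logs hello-1234567890-abcde -c echo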
Additional resources:
Linchpiner.github.io: K8S multi container pods
What about a sidecar container for logging as the second container, one that keeps running and never exits? Even though the job might run, the state of the Job would still be failed.

How to use concurrencyPolicy for GKE cron job correctly?

I set concurrencyPolicy to Allow, here is my cronjob.yaml:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: gke-cron-job
spec:
  schedule: '*/1 * * * *'
  startingDeadlineSeconds: 10
  concurrencyPolicy: Allow
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            run: gke-cron-job
        spec:
          restartPolicy: Never
          containers:
          - name: gke-cron-job-solution-2
            image: docker.io/novaline/gke-cron-job-solution-2:1.3
            env:
            - name: NODE_ENV
              value: 'production'
            - name: EMAIL_TO
              value: 'novaline.dulin@gmail.com'
            - name: K8S_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            ports:
            - containerPort: 8080
              protocol: TCP
After reading docs: https://cloud.google.com/kubernetes-engine/docs/how-to/cronjobs
I still don't understand how to use concurrencyPolicy.
How can I run my cron job concurrently?
Here are the logs of the cron job:
☁ nodejs-gcp [master] ⚡ kubectl logs -l run=gke-cron-job
> gke-cron-job-solution-2@1.0.2 start /app
> node ./src/index.js
config: { ENV: 'production',
  EMAIL_TO: 'novaline.dulin@gmail.com',
  K8S_POD_NAME: 'gke-cron-job-1548660540-gmwvc',
  VERSION: '1.0.2' }
[2019-01-28T07:29:10.593Z] Start daily report
send email: { to: 'novaline.dulin@gmail.com', text: { test: 'test data' } }
> gke-cron-job-solution-2@1.0.2 start /app
> node ./src/index.js
config: { ENV: 'production',
  EMAIL_TO: 'novaline.dulin@gmail.com',
  K8S_POD_NAME: 'gke-cron-job-1548660600-wbl5g',
  VERSION: '1.0.2' }
[2019-01-28T07:30:11.405Z] Start daily report
send email: { to: 'novaline.dulin@gmail.com', text: { test: 'test data' } }
> gke-cron-job-solution-2@1.0.2 start /app
> node ./src/index.js
config: { ENV: 'production',
  EMAIL_TO: 'novaline.dulin@gmail.com',
  K8S_POD_NAME: 'gke-cron-job-1548660660-8mn4r',
  VERSION: '1.0.2' }
[2019-01-28T07:31:11.099Z] Start daily report
send email: { to: 'novaline.dulin@gmail.com', text: { test: 'test data' } }
As you can see from the timestamps, the jobs are not running concurrently.
It's because you're reading the wrong documentation. CronJobs aren't a GKE-specific feature. For the full documentation on CronJob API, refer to the Kubernetes documentation: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#concurrency-policy (quoted below).
Concurrency policy decides whether a new container can be started while the previous CronJob is still running. If you have a CronJob that runs every 5 minutes, and sometimes the Job takes 8 minutes, then you may run into a case where multiple jobs are running at a time. This policy decides what to do in that case.
Concurrency Policy
The .spec.concurrencyPolicy field is also optional. It specifies how to treat concurrent executions of a job that is created by this cron job. The spec may specify only one of the following concurrency policies:
Allow (default): The cron job allows concurrently running jobs
Forbid: The cron job does not allow concurrent runs; if it is time for a new job run and the previous job run hasn’t finished yet, the cron job skips the new job run
Replace: If it is time for a new job run and the previous job run hasn’t finished yet, the cron job replaces the currently running job run with a new job run
Note that concurrency policy only applies to the jobs created by the same cron job. If there are multiple cron jobs, their respective jobs are always allowed to run concurrently.
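To actually observe concurrent runs, the job has to outlive its own schedule interval. A minimal sketch (illustrative name, busybox image): each run sleeps 90 seconds against a 1-minute schedule, so with Allow two Jobs will overlap, which you can watch with kubectl get jobs --watch:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: overlap-demo
spec:
  schedule: "*/1 * * * *"
  concurrencyPolicy: Allow
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: sleeper
            image: busybox
            # Sleeping longer than the schedule interval forces overlap.
            args: ["/bin/sh", "-c", "date; sleep 90"]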