K8S - Use "Last Schedule" of a CronJob as parameters in container command - kubernetes

I created a Cronjob that is scheduled for every morning, from Monday to Friday.
schedule: "30 7 * * 1-5"
The job synchronizes a database with a data warehouse. As I do not want to sync the whole database every day, I want to pass two parameters in the container's command section of my CronJob: the last time the job started, and now:
containers:
- image: xxx
  name: dwh
  command: ["/bin/bash", "-c", "rake etl:sync LAST_TIME_SCHEDULED NOW"]
  resources: {}
How can I get the "Last Schedule" timestamp of the CronJob in this situation, and the timestamp of now?
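One possible approach (a sketch, not a verified solution): the CronJob object carries a .status.lastScheduleTime field, so a container whose ServiceAccount is allowed to read the CronJob could fetch it with kubectl and pass it to rake along with the current time. Note that lastScheduleTime may already have been updated to the current run by the time the container executes, so you may prefer to persist the previous sync time yourself. The CronJob name dwh-sync and the availability of kubectl in the image are assumptions:
containers:
- image: xxx
  name: dwh
  command:
  - /bin/bash
  - -c
  - |
    # Assumes kubectl is in the image and RBAC allows "get" on the CronJob (name is hypothetical).
    LAST_TIME_SCHEDULED=$(kubectl get cronjob dwh-sync -o jsonpath='{.status.lastScheduleTime}')
    NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    rake etl:sync "$LAST_TIME_SCHEDULED" "$NOW"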

Related

k8s check wait for pod existence

In a K8s Job file declared in YAML (with Helm), I need the Job to run only if the database pod exists and is ready.
For the readiness, it works fine, since I added the following:
initContainers:
- name: wait-mysql-ready
  image: "bitnami/kubectl:latest"
  imagePullPolicy: IfNotPresent
  command:
  - kubectl
  args:
  - wait
  - pod/mysql-pod
  - --for=condition=ready
  - --timeout=120s
It works fine, but I need the Job to run only once (without duplicates, and with a long period until it ends).
The Job does not, by default, run just once, and its pod name, when running kubectl get pods, is <jobname-hash>.
If it does not run just once, only the last attempt will succeed (because earlier the pod is not created yet); the other attempts may fail.
So I added the following lines to the main spec:
spec:
  parallelism: 1
  completions: 1
  backoffLimit: 0
and in the initContainers section (before the previous one), I added:
initContainers:
- name: wait-mysql-exist-pod
  image: "bitnami/kubectl:latest"
  imagePullPolicy: IfNotPresent
  command:
  - /bin/sh
  args:
  - -c
  - "while !(kubectl get pod mysql-pod); do echo 'Waiting for mysql pod to be existed...'; sleep 5; done"
(I could not find another good syntax for the ! in a multiline string; I would be glad to know how.)
I also need another job to wait for the current job.
How can I create a job that run once, and check in the job that the pod exists before checking the ready state?
You can use this script as an init container to wait for other resources.
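As a sketch (reusing the bitnami/kubectl image and the mysql-pod name from the question), one init container can do both checks; a YAML block scalar (|) together with until also sidesteps the quoting trouble around ! that the question mentions:
initContainers:
- name: wait-mysql
  image: "bitnami/kubectl:latest"
  imagePullPolicy: IfNotPresent
  command:
  - /bin/sh
  - -c
  - |
    # "until" loops while the command fails, so no "!" is needed in the string
    until kubectl get pod mysql-pod; do
      echo "Waiting for the mysql pod to exist..."
      sleep 5
    done
    kubectl wait pod/mysql-pod --for=condition=ready --timeout=120s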

cronjob yml file with wget command

Hi, I'm new to Kubernetes. I'm trying to run a wget command in a cronjob.yml file to fetch data from a URL each day. For now I'm testing it with the schedule set to one minute. I also added an echo command just to get some response from the job. Below is my yml file. I change directory to the folder where I want to save the data and pass the URL of the site I'm fetching from. I tried the URL in a terminal with wget and it works, downloading the JSON file behind the URL.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: reference
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: reference
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
            - cd /mnt/c/Users/path_to_folder
            - wget {url}
          restartPolicy: OnFailure
When I create the job and watch the pod logs, nothing happens with the URL; I don't get any response.
Commands I run are:
kubectl create -f cronjob.yml
kubectl get pods
kubectl logs <pod_name>
In return I only get the output of the date command (image above).
When I leave just the wget command, nothing happens. In the pods I can see the STATUS CrashLoopBackOff, so the command has a problem running.
command:
- cd /mnt/c/Users/path_to_folder
- wget {url}
What should the wget command in cronjob.yml look like?
The command field in Kubernetes is the equivalent of the entrypoint in Docker. For any container, there should be only one process as the entry point: either the default entrypoint of the image, or the one supplied via command.
Here you are using /bin/sh as that single process and everything else as its arguments. The way you were invoking /bin/sh -c means only date; echo Hello from the Kubernetes cluster is passed as the command string, NOT the cd and wget commands. Change your manifest to the following to feed everything as one block to /bin/sh. Note that all of the commands fit into a single argument.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: reference
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: reference
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster; cd /mnt/c/Users/path_to_folder; wget {url}
          restartPolicy: OnFailure
To illustrate the problem, check the following examples. Note that only the first argument after -c is executed as the command string.
/bin/sh -c date
Tue 24 Aug 2021 12:28:30 PM CDT
/bin/sh -c echo hi
/bin/sh -c 'echo hi'
hi
/bin/sh -c 'echo hi && date'
hi
Tue 24 Aug 2021 12:28:45 PM CDT
/bin/sh -c 'echo hi' date #<-----your case is similar to this, no date printed.
hi
-c    Read commands from the command_string operand instead of from the standard input. Special parameter 0 will be set from the command_name operand and the positional parameters ($1, $2, etc.) set from the remaining argument operands.
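A quick way to see the $0 / positional-parameter behaviour the manual describes (operands after the -c string become parameters, not extra commands):
/bin/sh -c 'echo "name=$0 first=$1"' mytask hello
name=mytask first=hello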

how to run a cronjob every 10 seconds in kubernetes?

"I just want to run a cronjob in Kubernetes in every 10 seconds. what would be the imperative command for that?"
You can't use the CronJob Kubernetes object to run at intervals shorter than 1 minute. You might be using the wrong tool for a process that has to run so often. https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
Create an infinite loop on a Deployment (daemonize it)
You'll need to use a bash formula (or whatever programming language you like best: Go, Java, Python or Ruby) to make an infinite loop that sleeps 10 seconds between executions, inside a Deployment. Here is an example with bash/sh:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cronjob-deployment
  labels:
    app: cronjob
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cronjob
  template:
    metadata:
      labels:
        app: cronjob
    spec:
      containers:
      - name: cronjob
        image: busybox
        args:
        - /bin/sh
        - -c
        - while true; do echo call ./script.sh here; sleep 10; done
Create 1 CronJob with several containers
If you still want to use CronJobs you can do it with 6 containers inside the definition. One without delay, and the others with 10, 20, 30, 40 and 50 seconds of delay.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: no-delay
            image: busybox
            args:
            - /bin/sh
            - -c
            - echo call ./script.sh here
          - name: 10-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 10; echo call ./script.sh here
          - name: 20-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 20; echo call ./script.sh here
          - name: 30-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 30; echo call ./script.sh here
          - name: 40-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 40; echo call ./script.sh here
          - name: 50-seconds
            image: busybox
            args:
            - /bin/sh
            - -c
            - sleep 50; echo call ./script.sh here
          restartPolicy: OnFailure
Of course, one of the problems you might encounter is that your processes might overlap (run concurrently at the same time). This will depend on how many seconds your process needs to run, and on the time Kubernetes needs to schedule and create a container.
If your task needs to run that frequently, cron is the wrong tool.
Aside from the fact that it simply won't launch jobs that frequently, you also risk some serious problems if the job takes longer to run than the interval between launches. Rewrite your task to daemonize and run persistently, then launch it from cron if necessary (while making sure that it won't relaunch if it's already running).
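If you do still launch it from cron, one common way to make sure it won't relaunch while a previous run is still going is to wrap it with flock from util-linux (a sketch; the lock path and script name are placeholders):
# crontab entry: -n makes flock exit immediately if /tmp/mytask.lock is already held,
# so a still-running previous invocation is never duplicated.
* * * * * flock -n /tmp/mytask.lock /usr/local/bin/mytask.sh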
You can write a script that executes 6 times with an interval of 10 seconds, and set the Kubernetes cron job to run every minute.
That way, every minute your script starts running, and it in turn executes the task every 10 seconds.
Here is a script that runs the logic every 10 seconds, 6 times, each time the cron job fires.
This will print hello world every 10 seconds, 6 times:
#!/bin/bash -x
a=0
until [ $a -gt 5 ]
do
  echo "hello world"
  a=$(expr $a + 1)
  sleep 10
done
cronjob sample:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image:
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - ./sample.sh
          restartPolicy: OnFailure
So in that way your cron job executes every minute, which in turn starts your script, which runs every 10 seconds and executes the business logic 6 times.
This is the idea you can follow to make a cron job effectively work at the seconds level, as Kubernetes does not allow schedules shorter than 1 minute.
With this approach, though, you need a strategy to keep one execution of the cron job from overlapping the next.
For example, suppose your business logic takes 15 seconds to execute and you are running it every 10 seconds, 6 times a minute.
Since the business logic takes 15 seconds, it should ideally run 4 times rather than 6 times a minute. Accordingly, you need to tweak the repetition inside the script, as sketched below.
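A sketch of that tweak, under the assumption that ./business_logic.sh (a placeholder name) takes about 15 seconds, so 4 sequential runs roughly fill the minute:
#!/bin/bash -x
a=0
until [ $a -gt 3 ]
do
  # placeholder: business logic that takes roughly 15 seconds per run
  ./business_logic.sh
  a=$(expr $a + 1)
done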

Handling cronjobs in a Pod with multiple containers

I have a requirement where I need to create a cronjob in Kubernetes, but the pod has multiple containers (with a single container it works fine).
Is it possible?
The requirement is something like this:
1. First container: Run the shell script to do a job.
2. Second container: run fluentbit conf to parse the log and send it.
Previously I thought of having a deployment in place, and that works fine, but since that deployment was used only for 10-minute jobs I thought of making it a cron job.
Any help is really appreciated.
Also, regarding the cronjob, I am not sure whether a pod can support multiple containers for the same purpose.
Thank you,
Sunny
Yes, you can create a cronjob with multiple containers. A CronJob is an abstraction on top of a pod, so in the pod spec you can have multiple containers just like in a normal pod. As an example:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
  namespace: default
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          - name: app
            image: alpine
            command:
            - echo
            - Hello World!
          restartPolicy: OnFailure
I agree with the answer provided by @Arghya Sadhu. It shows how you can run a multi-container Pod with a CronJob. Before the answer, I would like to give more attention to the comment provided by @Chris Stryczynski:
It's not clear whether the containers are run in parallel or sequentially
It is not entirely clear if the workload that you are trying to run:
The requirement is something like this:
First container: Run the shell script to do a job.
Second container: run fluentbit conf to parse the log and send it.
could be run in parallel (both running at the same time) or requires a sequential approach (after X completes successfully, run Y).
If the workload can be run in parallel, the answer provided by @Arghya Sadhu is correct; however, if one workload depends on another, I'd reckon you should be using initContainers instead of a multi-container Pod.
An example of a CronJob that uses an initContainer could look like the following:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: ubuntu
            image: ubuntu
            command: [/bin/bash]
            args: ["-c", "cat /data/hello_there.txt"]
            volumeMounts:
            - name: data-dir
              mountPath: /data
          initContainers:
          - name: echo
            image: busybox
            command: ["/bin/sh"]
            args: ["-c", "echo 'General Kenobi!' > /data/hello_there.txt"]
            volumeMounts:
            - name: data-dir
              mountPath: "/data"
          volumes:
          - name: data-dir
            emptyDir: {}
This CronJob writes a specific text to a file with an initContainer, and then a "main" container displays the result. It's worth mentioning that the main container will not start if the initContainer does not succeed.
$ kubectl logs hello-1234567890-abcde
General Kenobi!
Additional resources:
Linchpiner.github.io: K8S multi container pods
What about a sidecar container for logging as the second container, one that keeps running and never exits? Even if the job itself runs, the state of the job would still end up failed.

How to use concurrencyPolicy for GKE cron job correctly?

I set concurrencyPolicy to Allow, here is my cronjob.yaml:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: gke-cron-job
spec:
  schedule: '*/1 * * * *'
  startingDeadlineSeconds: 10
  concurrencyPolicy: Allow
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            run: gke-cron-job
        spec:
          restartPolicy: Never
          containers:
          - name: gke-cron-job-solution-2
            image: docker.io/novaline/gke-cron-job-solution-2:1.3
            env:
            - name: NODE_ENV
              value: 'production'
            - name: EMAIL_TO
              value: 'novaline.dulin@gmail.com'
            - name: K8S_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            ports:
            - containerPort: 8080
              protocol: TCP
After reading docs: https://cloud.google.com/kubernetes-engine/docs/how-to/cronjobs
I still don't understand how to use concurrencyPolicy.
How can I run my cron job concurrently?
Here are the logs of the cron job:
☁ nodejs-gcp [master] ⚡ kubectl logs -l run=gke-cron-job
> gke-cron-job-solution-2@1.0.2 start /app
> node ./src/index.js
config: { ENV: 'production',
  EMAIL_TO: 'novaline.dulin@gmail.com',
  K8S_POD_NAME: 'gke-cron-job-1548660540-gmwvc',
  VERSION: '1.0.2' }
[2019-01-28T07:29:10.593Z] Start daily report
send email: { to: 'novaline.dulin@gmail.com', text: { test: 'test data' } }
> gke-cron-job-solution-2@1.0.2 start /app
> node ./src/index.js
config: { ENV: 'production',
  EMAIL_TO: 'novaline.dulin@gmail.com',
  K8S_POD_NAME: 'gke-cron-job-1548660600-wbl5g',
  VERSION: '1.0.2' }
[2019-01-28T07:30:11.405Z] Start daily report
send email: { to: 'novaline.dulin@gmail.com', text: { test: 'test data' } }
> gke-cron-job-solution-2@1.0.2 start /app
> node ./src/index.js
config: { ENV: 'production',
  EMAIL_TO: 'novaline.dulin@gmail.com',
  K8S_POD_NAME: 'gke-cron-job-1548660660-8mn4r',
  VERSION: '1.0.2' }
[2019-01-28T07:31:11.099Z] Start daily report
send email: { to: 'novaline.dulin@gmail.com', text: { test: 'test data' } }
As you can see, the timestamps indicate that the cron job runs are not concurrent.
It's because you're reading the wrong documentation. CronJobs aren't a GKE-specific feature. For the full documentation on CronJob API, refer to the Kubernetes documentation: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#concurrency-policy (quoted below).
Concurrency policy decides whether a new Job can be started while the previous Job created by the CronJob is still running. If you have a CronJob that runs every 5 minutes, and sometimes the Job takes 8 minutes, then you may run into a case where multiple jobs are running at a time. This policy decides what to do in that case.
Concurrency Policy
The .spec.concurrencyPolicy field is also optional. It specifies how to treat concurrent executions of a job that is created by this cron job. The spec may specify only one of the following concurrency policies:
Allow (default): The cron job allows concurrently running jobs
Forbid: The cron job does not allow concurrent runs; if it is time for a new job run and the previous job run hasn’t finished yet, the cron job skips the new job run
Replace: If it is time for a new job run and the previous job run hasn’t finished yet, the cron job replaces the currently running job run with a new job run
Note that concurrency policy only applies to the jobs created by the same cron job. If there are multiple cron jobs, their respective jobs are always allowed to run concurrently.
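As a minimal sketch against the question's manifest (only the relevant fields shown), changing the policy is a one-line edit; with the job above it makes no visible difference until a single run takes longer than the one-minute schedule interval:
spec:
  schedule: '*/1 * * * *'
  # Allow (the default, used in the question) lets job runs overlap;
  # Forbid would skip a new run while the previous one is still going.
  concurrencyPolicy: Forbid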