Run scheduled task inside Pod in Kubernetes - kubernetes

I have a small instance of influxdb running in my kubernetes cluster.
The data of that instance is stored in a persistent storage.
But I also want to run the backup command from influx at scheduled interval.
influxd backup -portable /backuppath
What I do now is exec into the pod and run it manually.
Is there a way that I can do this automatically?

You can consider running a CronJob that uses the bitnami/kubectl image to execute the backup command. This is the same as exec-ing into the pod and running the command manually, except now you automate it with a CronJob.
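A minimal sketch of that approach, assuming the InfluxDB instance runs as a Deployment named influxdb and that a service account with pods/exec permissions exists (a fuller example of this idea appears later in this thread):
apiVersion: batch/v1
kind: CronJob
metadata:
  name: influxdb-backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: backup-sa   # assumed; needs get/list pods and pods/exec
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            # assumed deployment name and backup path; adjust to your setup
            - kubectl exec deploy/influxdb -- influxd backup -portable /backuppath
          restartPolicy: OnFailure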

CronJob is the way to go here. It acts more or less like a crontab, but for Kubernetes.
As an example you could use this
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup
spec:
  schedule: "0 8 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: influxdb-backup
            image: influxdb
            imagePullPolicy: IfNotPresent
            command: ["/bin/sh"]
            args:
            - "-c"
            - "influxd backup -portable /backuppath"
          restartPolicy: Never
This will create a Job every day at 08:00, executing influxd backup -portable /backuppath. Of course, you have to edit it accordingly to work in your environment.
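Assuming the manifest above is saved as backup-cronjob.yaml, applying and verifying it could look like this:
kubectl apply -f backup-cronjob.yaml
kubectl get cronjob backup
# after 08:00 has passed at least once:
kubectl get jobs
kubectl logs job/<job-name>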

This is the solution I used for this question:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cm-backupscript
  namespace: influx
data:
  backupscript.sh: |
    #!/bin/bash
    echo 'getting pod name'
    podName=$(kubectl get pods -n influx --field-selector=status.phase==Running --output=jsonpath={.items..metadata.name})
    echo $podName
    #echo 'create backup'
    kubectl exec -it $podName -n influx -- /mnt/influxBackupScript/influxbackup.sh
    echo 'done'
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: backup-cron
  namespace: influx
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                    - amd64
          volumes:
          - name: backup-script
            configMap:
              name: cm-backupscript
              defaultMode: 0777
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - /mnt/scripts/backupscript.sh
            volumeMounts:
            - name: backup-script
              mountPath: "/mnt/scripts"
          restartPolicy: Never
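Note that the bitnami/kubectl container in this CronJob calls kubectl get pods and kubectl exec against the API server, so the service account it runs under needs RBAC permissions for those verbs. A minimal sketch of what that could look like (the ServiceAccount name backup-sa is an assumption, and you would add serviceAccountName: backup-sa to the pod template above):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: backup-sa
  namespace: influx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-exec
  namespace: influx
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: backup-sa-pod-exec
  namespace: influx
subjects:
- kind: ServiceAccount
  name: backup-sa
  namespace: influx
roleRef:
  kind: Role
  name: pod-exec
  apiGroup: rbac.authorization.k8s.io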

You can either run it as a CronJob and set up the image to be able to connect to the DB, or you can run it as a sidecar alongside your DB pod, set to run the cron image (i.e. it will run as a mostly-idle container in the same pod as your DB).
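A minimal sketch of the sidecar variant, assuming the DB runs as a Deployment and the backups go to a PVC named influxdb-backup-claim (the names, schedule loop, and paths are all assumptions):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: influxdb
spec:
  selector:
    matchLabels:
      app: influxdb
  template:
    metadata:
      labels:
        app: influxdb
    spec:
      containers:
      - name: influxdb
        image: influxdb
        volumeMounts:
        - name: backup
          mountPath: /backuppath
      # sidecar: same image, so it has the influxd CLI and can reach the DB on localhost
      - name: backup-cron
        image: influxdb
        command:
        - /bin/sh
        - -c
        - |
          # assumed schedule; run the backup once a day in a simple loop
          while true; do
            influxd backup -portable -host 127.0.0.1:8088 /backuppath/$(date +%F)
            sleep 86400
          done
        volumeMounts:
        - name: backup
          mountPath: /backuppath
      volumes:
      - name: backup
        persistentVolumeClaim:
          claimName: influxdb-backup-claim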

Related

Executing a Script using a Cronjob Kubernetes Cluster

I have a 3-node K8s v1.21 cluster in AWS and I'm looking for a solid config to run a script using a CronJob. I have seen many documents here and on Google using CronJobs with everything from hostPath to Persistent Volumes/Claims to ConfigMaps; the list goes on.
I keep getting "Back-off restarting failed container/CrashLoopBackOff" errors.
Any help is much appreciated.
cronjob.yaml
The script I am trying to run is basic for testing only
#!/bin/bash
kubectl create deployment nginx --image=nginx
Still getting the same error.
kubectl describe pod/xxxx
This hostPath in an AWS cluster created using eksctl works:
apiVersion: v1
kind: Pod
metadata:
  name: redis-hostpath
spec:
  containers:
  - image: redis
    name: redis-container
    volumeMounts:
    - mountPath: /test-mnt
      name: test-vol
  volumes:
  - name: test-vol
    hostPath:
      path: /test-vol
UPDATE
Tried running your config in GCP on a fresh cluster. Only thing I changed was the /home/script.sh to /home/admin/script.sh
Did you test this on your cluster?
Warning FailedPostStartHook 5m27s kubelet Exec lifecycle hook ([/home/mchung/script.sh]) for Container "busybox" in Pod "dumb-job-1635012900-qphqr_default(305c4ed4-08d1-4585-83e0-37a2bc008487)" failed - error: rpc error: code = Unknown desc = failed to exec in container: failed to create exec "0f9f72ccc6279542f18ebe77f497e8c2a8fd52f8dfad118c723a1ba025b05771": cannot exec in a deleted state: unknown, message: ""
Normal Killing 5m27s kubelet FailedPostStartHook
Assuming you're running in a remote multi-node cluster (since you mentioned AWS in your question), hostPath is NOT an option there for mounting a volume. Your best choice would be to use a ConfigMap and mount it as a volume.
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-script
data:
  script.sh: |
    # write down your script here
And then:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: redis-job
spec:
  schedule: '*/5 * * * *'
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: redis-container
            image: redis
            args:
            - /bin/sh
            - -c
            - /home/user/script.sh
            volumeMounts:
            - name: redis-data
              mountPath: /home/user/script.sh
              subPath: script.sh
          volumes:
          - name: redis-data
            configMap:
              name: redis-script
          restartPolicy: OnFailure   # required for Job pod templates
Hope this helps. Let me know if you face any difficulties.
Update:
I think you're doing something wrong. kubectl isn't something you should run from another container/pod, because it requires the necessary binary to exist in that container and an appropriate context to be set. I'm putting a working manifest below so you can understand the whole concept of running a script as part of a cron job:
apiVersion: v1
kind: ConfigMap
metadata:
  name: script-config
data:
  script.sh: |-
    name=StackOverflow
    echo "I love $name <3"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dumb-job
spec:
  schedule: '*/1 * * * *' # every minute
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: busybox
            image: busybox:stable
            lifecycle:
              postStart:
                exec:
                  command:
                  - /home/script.sh
            volumeMounts:
            - name: some-volume
              mountPath: /home/script.sh
          volumes:
          - name: some-volume
            configMap:
              name: script-config
          restartPolicy: OnFailure
It will print some text to STDOUT every minute. Please note that I have used only commands the container is capable of executing, and kubectl is certainly not one of the binaries that exists in that container out of the box. I hope that answers your question.
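To verify the CronJob is firing, the usual checks would look something like this (pod names will differ on your cluster):
kubectl get cronjob dumb-job
kubectl get jobs
kubectl get pods                  # pods created by the job appear as dumb-job-<id>-<hash>
kubectl describe pod <pod-name>   # shows whether the postStart hook ran or failed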

How to execute script shell in Kubernetes cronjob

I would like to run a shell script inside Kubernetes using a CronJob; here is my CronJob.yaml file:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - /home/admin_/test.sh
          restartPolicy: OnFailure
The CronJob has been created (kubectl apply -f CronJob.yaml).
When I list the cronjobs I can see it (kubectl get cj), and when I run kubectl get pods I can see the pod being created, but then the pod crashes.
Can anyone help me learn how to create a CronJob inside Kubernetes, please?
As correctly pointed out in the comments, you need to provide the script file in order to execute it via your CronJob. You can do that by mounting the file within a volume. For example, your CronJob could look like this:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - /myscript/test.sh
            volumeMounts:
            - name: script-dir
              mountPath: /myscript
          restartPolicy: OnFailure
          volumes:
          - name: script-dir
            hostPath:
              path: /path/to/my/script/dir
              type: Directory
The example above shows how to use the hostPath type of volume in order to mount the script file.
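Keep in mind that with hostPath the script must already exist, and be executable, at that path on the node where the pod gets scheduled, for example:
# on the node that will run the pod
sudo mkdir -p /path/to/my/script/dir
sudo cp test.sh /path/to/my/script/dir/
sudo chmod +x /path/to/my/script/dir/test.sh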

access logs in cron jobs kubernetes

I'm running a cron job in Kubernetes. The job completes successfully and I log output to a log file inside the container (path: storage/logs), but I cannot access that file because the container is in the Completed state. Here is my job yaml:
apiVersion: v1
items:
- apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    labels:
      chart: cronjobs-0.1.0
    name: cron-cronjob1
    namespace: default
  spec:
    concurrencyPolicy: Forbid
    failedJobsHistoryLimit: 1
    jobTemplate:
      spec:
        template:
          metadata:
            labels:
              app: cron
              cron: cronjob1
          spec:
            containers:
            - args:
              - /usr/local/bin/php
              - -c
              - /var/www/html/artisan bulk:import
              env:
              - name: DB_CONNECTION
                value: postgres
              - name: DB_HOST
                value: postgres
              - name: DB_PORT
                value: "5432"
              - name: DB_DATABASE
                value: xxx
              - name: DB_USERNAME
                value: xxx
              - name: DB_PASSWORD
                value: xxxx
              - name: APP_KEY
                value: xxxxx
              image: registry.xxxxx.com/xxxx:2ecb785-e927977
              imagePullPolicy: IfNotPresent
              name: cronjob1
              ports:
              - containerPort: 80
                name: http
                protocol: TCP
            imagePullSecrets:
            - name: xxxxx
            restartPolicy: OnFailure
            terminationGracePeriodSeconds: 30
    schedule: '* * * * *'
    successfulJobsHistoryLimit: 3
Is there any way I can get my log file content displayed by the kubectl logs command, or are there other alternatives?
A CronJob runs pods according to spec.schedule. After completing the task, the pod's status is set to Completed, but the CronJob controller doesn't delete the pod, and the log file content is still there in the pod's container filesystem. So you need to do:
# here you can get the pod_name from the stdout of the cmd `kubectl get pods`
$ kubectl logs -f -n default <pod_name>
I guess you know that the pod is kept around since you have successfulJobsHistoryLimit: 3. Presumably your point is that your logging goes to a file and not to stdout, so you don't see it with kubectl logs. If so, maybe you could also log to stdout, or put something into the job to log the content of the file at the end, for example in a preStop hook.
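A minimal sketch of the "also log to stdout" idea, wrapping the artisan command in a shell so its output goes both to the file and to the container's stdout (the log file name here is an assumption); only the containers section of the CronJob above changes:
containers:
- name: cronjob1
  image: registry.xxxxx.com/xxxx:2ecb785-e927977
  args:
  - /bin/sh
  - -c
  # append to the log file and also emit to stdout so `kubectl logs` shows it
  - /usr/local/bin/php /var/www/html/artisan bulk:import 2>&1 | tee -a /var/www/html/storage/logs/bulk-import.log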

How do I retain/access the custom log files from a completed cronjob?

I have a cronjob that is completing and outputting several log files.
I want to persist these files and be able access them after the pod has succeeded.
I've found I can access the stdout with oc logs -f <pod>, but I really need to access the log files.
I'm aware Openshift 2 apparently had an environment variable location OPENSHIFT_LOG_DIR that log files were written to, but Openshift 3.5 doesn't appear to have this.
What's my best way of logging and accessing the logs from the CronJob after the pod has succeeded and finished?
After a Job runs to completion, the Pod terminates, but it is not automatically deleted. Since it has completed, you need to use -a to see it. Once you have the Pod name, kubectl logs works as you would expect.
$ kubectl get pods -a
NAME                       READY     STATUS      RESTARTS   AGE
curator-1499817660-6rzmf   0/1       Completed   0          28d
$ kubectl logs curator-1499817660-6rzmf
2017-07-12 00:01:10,409 INFO ...
A bit late, but I hope this answer helps someone facing a similar situation. In my case I needed to access some files generated by a CronJob, but because the pod (and its logs) are no longer accessible once the job completes, I could not do so and was getting an error:
kubectl logs mongodump-backup-29087654-hj89
Error from server (NotFound): pods "mongodump-backup-27640120-n8p7z" not found
My solution was to deploy a Pod that could access the PVC. The Pod runs a busybox image as below :
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
  namespace: demos
spec:
  containers:
  - image: busybox
    imagePullPolicy: "IfNotPresent"
    name: pvc-inspector
    command: ["tail"]
    args: ["-f", "/dev/null"]
    volumeMounts:
    - mountPath: "/tmp"
      name: pvc-mount
  volumes:
  - name: pvc-mount
    persistentVolumeClaim:
      claimName: mongo-backup-toolset-claim
After deploying this pod side by side with the CronJob, I can exec into the pvc-inspector pod and then access the files generated by the CronJob:
kubectl exec -it pvc-inspector -n demos -- sh
cd tmp
ls
The pvc-inspector has to use the same persistentVolumeClaim as the CronJob and it also has to mount to the same directory as the CronJob.
The CronJob is a simple utility that is doing database backups of Mongo instances :
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mongodump-backup
spec:
  schedule: "*/5 * * * *" # cron job every 5 minutes
  startingDeadlineSeconds: 60
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 2
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: mongodump-backup
            image: golide/mongo-backup-toolset:0.0.2
            imagePullPolicy: "IfNotPresent"
            env:
            - name: DATABASE_NAME
              value: "claims"
            - name: MONGODB_URI
              value: mongodb://root:mypasswordhere#mongodb-af1/claims
            volumeMounts:
            - mountPath: "/tmp"
              name: mongodump-volume
            command: ['sh', '-c', "./dumpp.sh"]
          restartPolicy: OnFailure
          volumes:
          - name: mongodump-volume
            persistentVolumeClaim:
              claimName: mongo-backup-toolset-claim

Cron Jobs in Kubernetes - connect to existing Pod, execute script

I'm certain I'm missing something obvious. I have looked through the documentation for ScheduledJobs / CronJobs on Kubernetes, but I cannot find a way to do the following on a schedule:
Connect to an existing Pod
Execute a script
Disconnect
I have alternative methods of doing this, but they don't feel right.
Schedule a cron task for: kubectl exec -it $(kubectl get pods --selector=some-selector | head -1) /path/to/script
Create one deployment that has a "Cron Pod" which also houses the application, and many "Non Cron Pods" which are just the application. The Cron Pod would use a different image (one with cron tasks scheduled).
I would prefer to use the Kubernetes ScheduledJobs if possible to prevent the same Job running multiple times at once and also because it strikes me as the more appropriate way of doing it.
Is there a way to do this by ScheduledJobs / CronJobs?
http://kubernetes.io/docs/user-guide/cron-jobs/
As far as I'm aware there is no "official" way to do this the way you want, and that is, I believe, by design. Pods are supposed to be ephemeral and horizontally scalable, and Jobs are designed to exit. Having a cron job "attach" to an existing pod doesn't fit that model. The scheduler would have no idea if the job completed.
Instead, a Job can bring up an instance of your application specifically for running the Job and then take it down once the Job is complete. To do this you can use the same image for the Job as for your Deployment, but use a different "Entrypoint" by setting command:.
If the job needs access to data created by your application, then that data will need to be persisted outside the application/Pod. You could do this a few ways, but the obvious ones would be a database or a persistent volume.
For example, using a database would look something like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: APP
spec:
  template:
    metadata:
      labels:
        name: THIS
        app: THAT
    spec:
      containers:
      - image: APP:IMAGE
        name: APP
        command:
        - app-start
        env:
        - name: DB_HOST
          value: "127.0.0.1"
        - name: DB_DATABASE
          value: "app_db"
And a job that connects to the same database, but with a different "Entrypoint" :
apiVersion: batch/v1
kind: Job
metadata:
  name: APP-JOB
spec:
  template:
    metadata:
      name: APP-JOB
      labels:
        app: THAT
    spec:
      containers:
      - image: APP:IMAGE
        name: APP-JOB
        command:
        - app-job
        env:
        - name: DB_HOST
          value: "127.0.0.1"
        - name: DB_DATABASE
          value: "app_db"
Or the persistent volume approach would look something like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: APP
spec:
  template:
    metadata:
      labels:
        name: THIS
        app: THAT
    spec:
      containers:
      - image: APP:IMAGE
        name: APP
        command:
        - app-start
        volumeMounts:
        - mountPath: "/var/www/html"
          name: APP-VOLUME
      volumes:
      - name: APP-VOLUME
        persistentVolumeClaim:
          claimName: APP-CLAIM
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: APP-VOLUME
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /app
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: APP-CLAIM
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      service: app
With a job like this, attaching to the same volume:
apiVersion: batch/v1
kind: Job
metadata:
  name: APP-JOB
spec:
  template:
    metadata:
      name: APP-JOB
      labels:
        app: THAT
    spec:
      containers:
      - image: APP:IMAGE
        name: APP-JOB
        command:
        - app-job
        volumeMounts:
        - mountPath: "/var/www/html"
          name: APP-VOLUME
      volumes:
      - name: APP-VOLUME
        persistentVolumeClaim:
          claimName: APP-CLAIM
Create a scheduled pod that uses the Kubernetes API to run the command you want on the target pods, via the exec function. The pod image should contain the client libraries to access the API -- many of these are available or you can build your own.
For example, here is a solution using the Python client that execs to each ZooKeeper pod and runs a database maintenance command:
import time
from kubernetes import config
from kubernetes.client import Configuration
from kubernetes.client.apis import core_v1_api
from kubernetes.client.rest import ApiException
from kubernetes.stream import stream
import urllib3

config.load_incluster_config()
configuration = Configuration()
configuration.verify_ssl = False
configuration.assert_hostname = False
urllib3.disable_warnings()
Configuration.set_default(configuration)

api = core_v1_api.CoreV1Api()
label_selector = 'app=zk,tier=backend'
namespace = 'default'

resp = api.list_namespaced_pod(namespace=namespace,
                               label_selector=label_selector)
for x in resp.items:
    name = x.spec.hostname
    resp = api.read_namespaced_pod(name=name,
                                   namespace=namespace)
    exec_command = [
        '/bin/sh',
        '-c',
        'opt/zookeeper/bin/zkCleanup.sh -n 10'
    ]
    resp = stream(api.connect_get_namespaced_pod_exec, name, namespace,
                  command=exec_command,
                  stderr=True, stdin=False,
                  stdout=True, tty=False)
    print("============================ Cleanup %s: ============================\n%s\n" % (name, resp if resp else "<no output>"))
and the associated Dockerfile:
FROM ubuntu:18.04
ADD ./cleanupZk.py /
RUN apt-get update \
&& apt-get install -y python-pip \
&& pip install kubernetes \
&& chmod +x /cleanupZk.py
CMD /cleanupZk.py
Note that if you have an RBAC-enabled cluster, you may need to create a service account and appropriate roles to make this API call possible. A role such as the following is sufficient to list pods and to run exec, such as the example script above requires:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-list-exec
  namespace: default
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods/exec"]
  verbs: ["create", "get"]
An example of the associated cron job:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: zk-maint
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: zk-maint-pod-list-exec
  namespace: default
subjects:
- kind: ServiceAccount
  name: zk-maint
  namespace: default
roleRef:
  kind: Role
  name: pod-list-exec
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: zk-maint
  namespace: default
  labels:
    app: zk-maint
    tier: jobs
spec:
  schedule: "45 3 * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: zk-maint
            image: myorg/zkmaint:latest
          serviceAccountName: zk-maint
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: azure-container-registry
This seems like an anti-pattern. Why can't you just run your worker pod as a job pod?
Regardless you seem pretty convinced you need to do this. Here is what I would do.
Take your worker pod and wrap your shell execution in a simple webservice; it's 10 minutes of work with just about any language. Expose the port and put a Service in front of that worker (or workers). Then your job pods can simply curl ..svc.cluster.local:/ (unless you've futzed with DNS).
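A minimal sketch of such a wrapper in Python, running inside the worker container (the script path, port, and service name below are assumptions):
# tiny HTTP wrapper around an existing shell script
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

SCRIPT = "/opt/app/do-the-thing.sh"  # assumed path to the existing maintenance script

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # run the script and return its combined output plus an HTTP status reflecting the exit code
        result = subprocess.run(["/bin/sh", SCRIPT], capture_output=True, text=True)
        body = (result.stdout + result.stderr).encode()
        self.send_response(200 if result.returncode == 0 else 500)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
The scheduled Job side then only needs an image with curl and something like curl -X POST http://worker.default.svc.cluster.local:8080/ as its command (service name worker assumed).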
It sounds as though you might want to run scheduled work within the pod itself rather than doing this at the Kubernetes level. I would approach this as a cronjob within the container, using traditional Linux crontab. Consider:
kind: Pod
apiVersion: v1
metadata:
  name: shell
spec:
  containers:
  - name: shell
    image: "nicolaka/netshoot"
    command:
    - /bin/sh
    - -c
    - |
      echo "0 */5 * * * /opt/whatever/bin/do-the-thing" | crontab -
      sleep infinity
If you want to track logs from those processes, that will require a fluentd type of mechanism to track those log files.
I managed to do this by creating a custom image with doctl (DigitalOcean's command line interface) and kubectl. The CronJob object would use these two commands to download the cluster configuration and run a command against a container.
Here is a sample CronJob:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: drupal-cron
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: drupal-cron
            image: juampynr/digital-ocean-cronjob:latest
            env:
            - name: DIGITALOCEAN_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  name: api
                  key: key
            command: ["/bin/bash","-c"]
            args:
            - doctl kubernetes cluster kubeconfig save drupster;
              POD_NAME=$(kubectl get pods -l tier=frontend -o=jsonpath='{.items[0].metadata.name}');
              kubectl exec $POD_NAME -c drupal -- vendor/bin/drush core:cron;
          restartPolicy: OnFailure
Here is the Docker image that the CronJob uses: https://hub.docker.com/repository/docker/juampynr/digital-ocean-cronjob
If you are not using DigitalOcean, figure out how to download the cluster configuration so kubectl can use it. For example, with Google Cloud, you would have to download gcloud.
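For example, on GKE that step would look roughly like this inside the cron container (cluster name and zone are placeholders):
gcloud container clusters get-credentials <cluster-name> --zone <zone>
POD_NAME=$(kubectl get pods -l tier=frontend -o=jsonpath='{.items[0].metadata.name}')
kubectl exec $POD_NAME -c drupal -- vendor/bin/drush core:cron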
Here is the project repository where I implemented this https://github.com/juampynr/drupal8-do.
This one should help.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - kubectl exec <podname> -- sh script.sh
          restartPolicy: OnFailure