run initContainer only once - kubernetes

I have a YAML manifest with parallelism: 2 that includes one initContainer. The command in the initContainer therefore runs twice and causes problems for the main command. How can I make it run only once?
Here are the important parts of the YAML:
kind: Job
apiVersion: batch/v1
metadata:
  name: bankruptcy
spec:
  parallelism: 2
  template:
    metadata:
      labels:
        app: bankruptcy
    spec:
      restartPolicy: Never
      containers:
      - name: bankruptcy
        image: "myimage"
        workingDir: /mount/
        command: ["bash","./sweep.sh"]
        resources:
          limits:
            nvidia.com/gpu: 1
      initContainers:
      - name: dev-init-sweep
        image: 'myimage'
        workingDir: /mount/
        command: ['/bin/bash']
        args:
          - '--login'
          - '-c'
          - 'wandb sweep ./sweep.yaml 2>&1 | tee ./wandb/sweep-output.txt; echo `expr "$(cat ./wandb/sweep-output.txt)" : ".*\(wandb agent.*\)"` > ./sweep.sh;'

An initContainer runs once per Pod.
You can't make the initContainer run only once across a set of Pods. But you could implement a guard as part of your initContainer that detects that another one has already started, and either returns without doing its own work or waits until a condition is met.
You have to implement this yourself, though; there is no built-in support in Kubernetes for it.
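A minimal sketch of such a guard, assuming /mount/ is a volume shared by both pods (e.g. a ReadWriteMany mount) and using made-up marker files (init-sweep.lock, init-sweep.done): the first initContainer to create the lock file does the real work, the other one just waits for the result.
initContainers:
- name: dev-init-sweep
  image: 'myimage'
  workingDir: /mount/
  command: ['/bin/bash']
  args:
    - '--login'
    - '-c'
    - |
      # try to take the lock atomically; with noclobber the redirect fails if the file already exists
      if ( set -o noclobber; echo "$HOSTNAME" > ./init-sweep.lock ) 2>/dev/null; then
        wandb sweep ./sweep.yaml 2>&1 | tee ./wandb/sweep-output.txt
        echo `expr "$(cat ./wandb/sweep-output.txt)" : ".*\(wandb agent.*\)"` > ./sweep.sh
        touch ./init-sweep.done
      else
        # another pod's initContainer holds the lock; wait until it has produced sweep.sh
        until [ -f ./init-sweep.done ]; do sleep 2; done
      fi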

Related

Executing a Script using a Cronjob Kubernetes Cluster

I have a 3-node K8s v1.21 cluster in AWS and I am looking for a solid config to run a script using a CronJob. I have seen many documents here and on Google using CronJobs with everything from hostPath to Persistent Volumes/Claims to ConfigMaps; the list goes on.
I keep getting "Back-off restarting failed container/CrashLoopBackOff" errors.
Any help is much appreciated.
cronjob.yaml
The script I am trying to run is basic and for testing only:
#!/bin/bash
kubectl create deployment nginx --image=nginx
Still getting the same error.
kubectl describe pod/xxxx
This hostPath Pod, in an AWS cluster created using eksctl, works:
apiVersion: v1
kind: Pod
metadata:
  name: redis-hostpath
spec:
  containers:
  - image: redis
    name: redis-container
    volumeMounts:
    - mountPath: /test-mnt
      name: test-vol
  volumes:
  - name: test-vol
    hostPath:
      path: /test-vol
UPDATE
Tried running your config in GCP on a fresh cluster. The only thing I changed was /home/script.sh to /home/admin/script.sh.
Did you test this on your cluster?
Warning FailedPostStartHook 5m27s kubelet Exec lifecycle hook ([/home/mchung/script.sh]) for Container "busybox" in Pod "dumb-job-1635012900-qphqr_default(305c4ed4-08d1-4585-83e0-37a2bc008487)" failed - error: rpc error: code = Unknown desc = failed to exec in container: failed to create exec "0f9f72ccc6279542f18ebe77f497e8c2a8fd52f8dfad118c723a1ba025b05771": cannot exec in a deleted state: unknown, message: ""
Normal Killing 5m27s kubelet FailedPostStartHook
Assuming you're running it in a remote multi-node cluster (since you mentioned AWS in your question), hostPath is NOT an option there for the volume mount. Your best choice would be to use a ConfigMap and mount it as a volume.
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-script
data:
  script.sh: |
    # write down your script here
And then:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: redis-job
spec:
  schedule: '*/5 * * * *'
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: redis-container
            image: redis
            args:
            - /bin/sh
            - -c
            - /home/user/script.sh
            volumeMounts:
            - name: redis-data
              mountPath: /home/user/script.sh
              subPath: script.sh
          volumes:
          - name: redis-data
            configMap:
              name: redis-script
          restartPolicy: OnFailure # Job pod templates require Never or OnFailure
Hope this helps. Let me know if you face any difficulties.
Update:
I think you're doing something wrong. kubectl isn't something you should run from another container/pod, because it requires the kubectl binary to exist in that container and an appropriate context (credentials and cluster access) to be set. I'm putting a working manifest below so you can see the whole concept of running a script as part of a CronJob:
apiVersion: v1
kind: ConfigMap
metadata:
  name: script-config
data:
  script.sh: |-
    name=StackOverflow
    echo "I love $name <3"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dumb-job
spec:
  schedule: '*/1 * * * *' # every minute
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: busybox
            image: busybox:stable
            lifecycle:
              postStart:
                exec:
                  command:
                  - /home/script.sh
            volumeMounts:
            - name: some-volume
              mountPath: /home/script.sh
          volumes:
          - name: some-volume
            configMap:
              name: script-config
          restartPolicy: OnFailure
What it'll do is print some text to STDOUT every minute. Please note that I have only put commands that the container is capable of executing, and kubectl is certainly not one of the binaries that exists in that container out of the box. I hope that is enough to answer your question.
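To see what the CronJob is actually doing, a few standard commands help (a sketch; the names match the manifest above, and dumb-job-<timestamp> stands for whatever Job name the CronJob generates):
kubectl get cronjob dumb-job
kubectl get jobs
# inspect the pods of a generated Job for logs and events such as FailedPostStartHook
kubectl logs -l job-name=dumb-job-<timestamp>
kubectl describe pod -l job-name=dumb-job-<timestamp>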

How do I ensure all nodes are running at the same time in K8S for jobs with parallelism

I need to run a job with parallelism, but I need all nodes/pods running at the same time. If I have only 4 nodes available but need 5, then they should all remain pending, or the submission should fail as a whole. How do I enforce this in the manifest file? Currently what I see is that it takes up as many nodes as it can and leaves the rest in a pending state.
Here's my manifest:
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  parallelism: 93 # expecting the job to stay pending or fail
  activeDeadlineSeconds: 30
  template:
    metadata:
      labels:
        app: myapp
    spec:
      volumes:
      - name: indatapt
        hostPath:
          path: /data # folder path in node, external to container
      containers:
      - name: myapp-container
        image: busybox
        imagePullPolicy: IfNotPresent
        env:
        - name: DEMO_GREETING
          value: "Hello from the environment"
        command: ['sh', '-c', 'echo "b" >> /indata/b.txt && /indata/test.sh && sleep 10s']
        volumeMounts:
        - name: indatapt
          mountPath: /indata # path in the container
      restartPolicy: Never

How do I manually trigger a kubernetes job (not a cron) in k8s

I have a sample k8s Job; as soon as you do kubectl apply, the Job gets triggered and the pods are created. How do I control the pod creation?
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
If you want to manually control the pod creation, you can achieve it through parallelism.
Documentation says:
The requested parallelism (.spec.parallelism) can be set to any non-negative value. If it is unspecified, it defaults to 1. If it is specified as 0, then the Job is effectively paused until it is increased.
You can set it to 0 when doing the kubectl apply. The configuration looks something like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
  backoffLimit: 5
  parallelism: 0
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
You can set it to 1 whenever you decide to run the Job.
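Since spec.parallelism is one of the mutable fields of a Job, a minimal sketch of un-pausing it without re-applying the whole manifest looks like this:
kubectl patch job pi-with-timeout -p '{"spec":{"parallelism":1}}'
Keep in mind that activeDeadlineSeconds counts from when the Job starts, so a Job left at parallelism: 0 for too long may be terminated by its deadline before you un-pause it.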
The trigger is running kubectl apply. When you create the Job, it runs. You might be looking for a more fully featured background task system like Airflow or Argo.
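If you just want to re-run the same Job by hand, a common pattern is to delete the completed Job and apply its manifest again (a sketch, assuming the manifest is saved as job.yaml):
kubectl delete job pi-with-timeout --ignore-not-found
kubectl apply -f job.yaml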

Does helm upgrade run a Job post-install even if there is no update to values.yaml?

I have scripts that are mounted to a shared persistent volume. Part of the main Deployment chart is to run some bash scripts in the initContainers that clone the scripts repository and copy/mount it to the shared persistent volume. My issue is that sometimes there are no changes in the main app and no update to the values.yaml file, so no helm upgrade actually happens. I think this is fine, but what I want is a task that will still clone the scripts repository and copy/mount it to the persistent volume.
I am reading about the k8s Job (post-install hook), but I am not sure if this will accomplish what I need.
Since you haven't changed anything on the Helm side, like values or spec/templates, Helm will not perform any change. In this case your scripts are an external source, and from Helm's perspective this behaviour is correct.
I can propose some alternatives to achieve what you want:
Use Helm with the --force flag
Use helm upgrade --force to upgrade your deployment.
From the Helm docs:
--force - force resource updates through a replacement strategy
In this case Helm will recreate all resources of your chart, and consequently the pods, and then re-run the initContainers, executing your script again.
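A usage sketch (my-release and ./my-chart are placeholders for your release name and chart path):
helm upgrade --force my-release ./my-chart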
Use a Kubernetes CronJob
In this case you will spawn a pod that will mount your volume and run a script/command you want.
Example of a Kubernetes CronJob:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: nice-count
spec:
  schedule: "*/2 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: nice-job
            image: alpine
            command: ['sh', '-c', 'echo "HelloWorld" > /usr/share/nginx/html/index.html']
            volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: task-pv-storage
          volumes:
          - name: task-pv-storage
            persistentVolumeClaim:
              claimName: task-pv-claim
          restartPolicy: Never
In this example, the CronJob runs every 2 minutes, mounting the volume task-pv-storage on /usr/share/nginx/html and executing the command echo "HelloWorld" > /usr/share/nginx/html/index.html.
You can trigger the CronJob manually by creating a Job from it with the command:
kubectl create job --from=cronjob/<CRON_JOB_NAME> <JOB_NAME>
In the example above, the command looks like this:
kubectl create job --from=cronjob/nice-count nice-count-job
Execute a Job manually or using CI/CD
You can execute the Job directly, or if you have a CI/CD solution you can create a Job that runs once instead of using a CronJob. In that case you can use this template:
apiVersion: batch/v1
kind: Job
metadata:
  name: nice-count-job
spec:
  template:
    spec:
      containers:
      - image: alpine
        name: my-job
        volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: task-pv-storage
        command:
        - sh
        - -c
        - echo "hello" > /usr/share/nginx/html/index.html
      restartPolicy: Never
      volumes:
      - name: task-pv-storage
        persistentVolumeClaim:
          claimName: task-pv-claim
I've tested these examples and they work in both cases.
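To run the one-off Job and check that it completed, something like this should work (a sketch, assuming the manifest above is saved as nice-count-job.yaml; the result ends up in index.html on the PVC, not in the pod logs):
kubectl apply -f nice-count-job.yaml
kubectl wait --for=condition=complete job/nice-count-job --timeout=60s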
Please let me know if that helped!

How to run shell script using CronJobs in Kubernetes?

I am trying to run a shell script at a regular interval of 1 minute using a CronJob.
I have created the following CronJob in my OpenShift template:
- kind: CronJob
  apiVersion: batch/v2alpha1
  metadata:
    name: "${APPLICATION_NAME}"
  spec:
    schedule: "*/1 * * * *"
    jobTemplate:
      spec:
        template:
          spec:
            containers:
            - name: mycron-container
              image: alpine:3
              imagePullPolicy: IfNotPresent
              command: [ "/bin/sh" ]
              args: [ "/var/httpd-init/croyscript.sh" ]
              volumeMounts:
              - name: script
                mountPath: "/var/httpd-init/"
            volumes:
            - name: script
              configMap:
                name: ${APPLICATION_NAME}-croyscript
            restartPolicy: OnFailure
            terminationGracePeriodSeconds: 0
    concurrencyPolicy: Replace
The following is the ConfigMap mounted as a volume in this job:
- kind: ConfigMap
  apiVersion: v1
  metadata:
    name: ${APPLICATION_NAME}-croyscript
    labels:
      app: "${APPLICATION_NAME}"
  data:
    croyscript.sh: |
      #!/bin/sh
      if [ "${APPLICATION_PATH}" != "" ]; then
        mkdir -p /var/httpd-resources/${APPLICATION_PATH}
      fi
      mkdir temp
      cd temp
      ###### SOME CODE ######
This CronJob is running; I can see the name of the job being replaced every minute (as scheduled in my job). But it is not executing the shell script croyscript.sh.
Am I doing anything wrong here? (Maybe I have mounted the ConfigMap in the wrong way, so the Job is not able to access the shell script.)
Try the approach below.
Update the permissions on the ConfigMap volume:
volumes:
- name: script
  configMap:
    name: ${APPLICATION_NAME}-croyscript
    defaultMode: 0777
If this one doesn't work, most likely the script in the mounted volume has read-only permissions. In that case, use an initContainer to copy the script to a different location, set appropriate permissions there, and use that location in the command parameter, as sketched below.
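A rough sketch of that second option, as the pod spec inside the jobTemplate (the emptyDir volume name scripts-rw and the /opt/scripts/ path are made up for illustration):
spec:
  initContainers:
  - name: copy-script
    image: alpine:3
    # copy the ConfigMap-mounted script into a writable emptyDir and make it executable
    command: [ "/bin/sh", "-c" ]
    args: [ "cp /var/httpd-init/croyscript.sh /opt/scripts/ && chmod 0755 /opt/scripts/croyscript.sh" ]
    volumeMounts:
    - name: script
      mountPath: "/var/httpd-init/"
    - name: scripts-rw
      mountPath: "/opt/scripts/"
  containers:
  - name: mycron-container
    image: alpine:3
    command: [ "/bin/sh" ]
    args: [ "/opt/scripts/croyscript.sh" ]
    volumeMounts:
    - name: scripts-rw
      mountPath: "/opt/scripts/"
  volumes:
  - name: script
    configMap:
      name: ${APPLICATION_NAME}-croyscript
  - name: scripts-rw
    emptyDir: {}
  restartPolicy: OnFailure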