How to mount PVC to a (katib) Job specification? - kubernetes

I'd like to mount a PVC to a (katib) Job specification but can't find anything in the documentation nor any example?
I'm pretty sure that this should be possible as a Job is orchestrating pods and pods can do so. Or am I missing something?
Please find below the respective (katib) job specification
apiVersion: batch/v1
kind: Job
spec:
template:
spec:
containers:
- env:
- name: training-container
image: docker.io/romeokienzler/claimed-train-mobilenet_v2:0.3
command:
- "ipython"
- "/train-mobilenet_v2.ipynb"
- "optimizer=${trialParameters.optimizer}"
restartPolicy: Never

You can add the volume and volume mount to your Katib job template so that all the HPO jobs on Katib can share the same volumes. e.g.
apiVersion: batch/v1
kind: Job
spec:
template:
spec:
containers:
- name: training-container
image: docker.io/romeokienzler/claimed-train-mobilenet_v2:0.4
command:
- "ipython"
- "/train-mobilenet_v2.ipynb"
- "optimizer=${trialParameters.optimizer}"
volumeMounts:
- mountPath: /data/
name: data-volume
restartPolicy: Never
volumes:
- name: data-volume
persistentVolumeClaim:
claimName: data-pvc
Also make sure your pvc is read write many so the pod on different nodes can mount on the same volume at the same time.

Related

Executing a Script using a Cronjob Kubernetes Cluster

I have a 3 node K8 v1.21 cluster in AWS and looking for SOLID config to run a script using a cronjob. I have seen many documents on here and Google using cronjob and hostPath to Persistent Volumes/Claims to using ConfigMaps, the list goes one.
I keep getting "Back-off restarting failed container/CrashLoopBackOff" errors.
Any help is much appreciated.
cronjob.yaml
The script I am trying to run is basic for testing only
#! /bin/<br/>
kubectl create deployment nginx --image=nginx
Still getting the same error.
kubectl describe pod/xxxx
This hostPath in AWS cluster created using eksctl works.
apiVersion: v1
kind: Pod
metadata:
name: redis-hostpath
spec:
containers:
- image: redis
name: redis-container
volumeMounts:
- mountPath: /test-mnt
name: test-vol
volumes:
- name: test-vol
hostPath:
path: /test-vol
UPDATE
Tried running your config in GCP on a fresh cluster. Only thing I changed was the /home/script.sh to /home/admin/script.sh
Did you test this on your cluster?
Warning FailedPostStartHook 5m27s kubelet Exec lifecycle hook ([/home/mchung/script.sh]) for Container "busybox" in Pod "dumb-job-1635012900-qphqr_default(305c4ed4-08d1-4585-83e0-37a2bc008487)" failed - error: rpc error: code = Unknown desc = failed to exec in container: failed to create exec "0f9f72ccc6279542f18ebe77f497e8c2a8fd52f8dfad118c723a1ba025b05771": cannot exec in a deleted state: unknown, message: ""
Normal Killing 5m27s kubelet FailedPostStartHook
Assuming you're running it in a remote multi-node cluster (since you mentioned AWS in your question), hostPath is NOT an option there for volume mount. Your best choice would be to use a ConfigMap and use it as volume mount.
apiVersion: v1
kind: ConfigMap
metadata:
name: redis-script
data:
script.sh: |
# write down your script here
And then:
apiVersion: batch/v1
kind: CronJob
metadata:
name: redis-job
spec:
schedule: '*/5 * * * *'
jobTemplate:
spec:
template:
spec:
containers:
- name: redis-container
image: redis
args:
- /bin/sh
- -c
- /home/user/script.sh
volumeMounts:
- name: redis-data
mountPath: /home/user/script.sh
subPath: script.sh
volumes:
- name: redis-data
configMap:
name: redis-script
Hope this helps. Let me know if you face any difficulties.
Update:
I think you're doing something wrong. kubectl isn't something you should run from another container / pod. Because it requires the necessary binary to be existed into that container and an appropriate context set. I'm putting a working manifest below for you to understand the whole concept of running a script as a part of cron job:
apiVersion: v1
kind: ConfigMap
metadata:
name: script-config
data:
script.sh: |-
name=StackOverflow
echo "I love $name <3"
---
apiVersion: batch/v1
kind: CronJob
metadata:
name: dumb-job
spec:
schedule: '*/1 * * * *' # every minute
jobTemplate:
spec:
template:
spec:
containers:
- name: busybox
image: busybox:stable
lifecycle:
postStart:
exec:
command:
- /home/script.sh
volumeMounts:
- name: some-volume
mountPath: /home/script.sh
volumes:
- name: some-volume
configMap:
name: script-config
restartPolicy: OnFailure
What it'll do is it'll print some texts in the STDOUT in every minute. Please note that I have put only the commands that container is capable to execute, and kubectl is certainly not one of them which exists in that container out-of-the-box. I hope that is enough to answer your question.

k8s initContainer mountPath does not exist after kubectl pod deployment

Below is deployment yaml, after deployment, I could access the pod
and I can see the mountPath "/usr/share/nginx/html", but I could not find
"/work-dir" which should be created by initContainer.
Could someone explain me the reason?
Thanks and Rgds
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- name: workdir
mountPath: /usr/share/nginx/html
# These containers are run during pod initialization
initContainers:
- name: install
image: busybox
command:
- wget
- "-O"
- "/work-dir/index.html"
- http://kubernetes.io
volumeMounts:
- name: workdir
mountPath: "/work-dir"
dnsPolicy: Default
volumes:
- name: workdir
emptyDir: {}
The volume at "/work-dir" is mounted by the init container and the "/work-dir" location only exists in the init container. When the init container completes, its file system is gone so the "/work-dir" directory in that init container is "gone". The application (nginx) container mounts the same volume, too, (albeit at a different location) providing mechanism for the two containers to share its content.
Per the docs:
Init containers can run with a different view of the filesystem than
app containers in the same Pod.
The volume mount with a PVC allows you to share the contents of /work-dir/ and /use/share/nginx/html/ but it does not mean the nginx container will have the /work-dir folder. Given this, you may think that you could just mount the path / which would allow you to share all folders underneath. However, a mountPath does not work for /.
So, how do you solve your problem? You could have another pod mount /work-dir/ in case you actually need the folder. Here is an example (pvc and deployment with mounts):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: shared-fs-pvc
namespace: default
labels:
mojix.service: default-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: default
name: shared-fs
labels:
mojix.service: shared-fs
spec:
replicas: 1
selector:
matchLabels:
mojix.service: shared-fs
template:
metadata:
creationTimestamp: null
labels:
mojix.service: shared-fs
spec:
terminationGracePeriodSeconds: 3
containers:
- name: nginx-c
image: nginx:latest
volumeMounts:
- name: shared-fs-volume
mountPath: /var/www/static/
- name: alpine-c
image: alpine:latest
command: ["/bin/sleep", "10000s"]
lifecycle:
postStart:
exec:
command: ["/bin/mkdir", "-p", "/work-dir"]
volumeMounts:
- name: shared-fs-volume
mountPath: /work-dir/
volumes:
- name: shared-fs-volume
persistentVolumeClaim:
claimName: shared-fs-pvc

How to allow a Kubernetes Job access to a file on host

I've been though the Kubernetes documentation thoroughly but am still having problems interacting with a file on the host filesystem with an application running inside a K8 job launched pod. This happens with even the simplest utility so I have included an stripped down example of my yaml config. The local file, 'hello.txt', referenced here does exist in /tmp on the host (ie. outside the Kubernetes environment) and I have even chmod 777'd it. I've also tried different places in the hosts filesystem than /tmp.
The pod that is launched by the Kubernetes Job terminates with Status=Error and generates the log ls: /testing/hello.txt: No such file or directory
Because I ultimately want to use this programmatically as part of a much more sophisticated workflow it really needs to be a Job not a Deployment. I hope that is possible. My current config file which I am launching with kubectl just for testing is:
apiVersion: batch/v1
kind: Job
metadata:
name: kio
namespace: kmlflow
spec:
# ttlSecondsAfterFinished: 5
template:
spec:
containers:
- name: kio-ingester
image: busybox
volumeMounts:
- name: test-volume
mountPath: /testing
imagePullPolicy: IfNotPresent
command: ["ls"]
args: ["-l", "/testing/hello.txt"]
volumes:
- name: test-volume
hostPath:
# directory location on host
path: /tmp
# this field is optional
# type: Directory
restartPolicy: Never
backoffLimit: 4
Thanks in advance for any assistance.
Looks like when the volume is mounted , the existing data can't be accessed.
You will need to make use of init container to pre-populate the data in the volume.
apiVersion: v1
kind: Pod
metadata:
name: my-app
spec:
containers:
- name: my-app
image: my-app:latest
volumeMounts:
- name: config-data
mountPath: /data
initContainers:
- name: config-data
image: busybox
command: ["echo","-n","{'address':'10.0.1.192:2379/db'}", ">","/data/config"]
volumeMounts:
- name: config-data
mountPath: /data
volumes:
- name: config-data
hostPath: {}
Reference:
https://medium.com/#jmarhee/using-initcontainers-to-pre-populate-volume-data-in-kubernetes-99f628cd4519

can i use a configmap created from an init container in the pod

I am trying to "pass" a value from the init container to a container. Since values in a configmap are shared across the namespace, I figured I can use it for this purpose. Here is my job.yaml (with faked-out info):
apiVersion: batch/v1
kind: Job
metadata:
name: installer-test
spec:
template:
spec:
containers:
- name: installer-test
image: installer-test:latest
env:
- name: clusterId
value: "some_cluster_id"
- name: in_artifactoryUrl
valueFrom:
configMapKeyRef:
name: test-config
key: artifactorySnapshotUrl
initContainers:
- name: artifactory-snapshot
image: busybox
command: ['kubectl', 'create configmap test-config --from-literal=artifactorySnapshotUrl=http://artifactory.com/some/url']
restartPolicy: Never
backoffLimit: 0
This does not seem to work (EDIT: although the statements following this edit note may still be correct, this is not working because kubectl is not a recognizable command in the busybox image), and I am assuming that the pod can only read values from a configmap created BEFORE the pod is created. Has anyone else come across the difficulty of passing values between containers, and what did you do to solve this?
Should I deploy the configmap in another pod and wait to deploy this one until the configmap exists?
(I know I can write files to a volume, but I'd rather not go that route unless it's absolutely necessary, since it essentially means our docker images must be coupled to an environment where some specific files exist)
You can create an EmptyDir volume, and mount this volume onto both containers. Unlike persistent volume, EmptyDir has no portability issue.
apiVersion: batch/v1
kind: Job
metadata:
name: installer-test
spec:
template:
spec:
containers:
- name: installer-test
image: installer-test:latest
env:
- name: clusterId
value: "some_cluster_id"
volumeMounts:
- name: tmp
mountPath: /tmp/artifact
initContainers:
- name: artifactory-snapshot
image: busybox
command: ['/bin/sh', '-c', 'cp x /tmp/artifact/x']
volumeMounts:
- name: tmp
mountPath: /tmp/artifact
restartPolicy: Never
volumes:
- name: tmp
emptyDir: {}
backoffLimit: 0
If for various reasons, you don't want to use share volume. And you want to create a configmap or a secret, here is a solution.
First you need to use a docker image which contains kubectl : gcr.io/cloud-builders/kubectl:latest for example. (docker image which contains kubectl manage by Google).
Then this (init)container needs enough rights to create resource on Kubernetes cluster. Ok by default, kubernetes inject a token of default service account named : "default" in container, but I prefer to make more explicit, then add this line :
...
initContainers:
- # Already true by default but if use it, prefer to make it explicit
automountServiceAccountToken: true
name: artifactory-snapshot
And add "edit" role to "default" service account:
kubectl create rolebinding default-edit-rb --clusterrole=edit --serviceaccount=default:myapp --namespace=default
Then complete example :
apiVersion: batch/v1
kind: Job
metadata:
name: installer-test
spec:
template:
spec:
initContainers:
- # Already true by default but if use it, prefer to make it explicit.
automountServiceAccountToken: true
name: artifactory-snapshot
# You need to use docker image which contains kubectl
image: gcr.io/cloud-builders/kubectl:latest
command:
- sh
- -c
# the "--dry-run -o yaml | kubectl apply -f -" is to make command idempotent
- kubectl create configmap test-config --from-literal=artifactorySnapshotUrl=http://artifactory.com/some/url --dry-run -o yaml | kubectl apply -f -
containers:
- name: installer-test
image: installer-test:latest
env:
- name: clusterId
value: "some_cluster_id"
- name: in_artifactoryUrl
valueFrom:
configMapKeyRef:
name: test-config
key: artifactorySnapshotUrl
First of all, kubectl is a binary. It was downloaded in your machine before you could use the command. But, In your POD, the kubectl binary doesn't exist. So, you can't use kubectl command from a busybox image.
Furthermore, kubectl uses some credential that is saved in your machine (probably in ~/.kube path). So, If you try to use kubectl from inside an image, this will fail because of missing credentials.
For your scenario, I will suggest the same as #ccshih, use volume sharing.
Here is the official doc about volume sharing between init-container and container.
The yaml that is used here is ,
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- name: workdir
mountPath: /usr/share/nginx/html
# These containers are run during pod initialization
initContainers:
- name: install
image: busybox
command:
- wget
- "-O"
- "/work-dir/index.html"
- http://kubernetes.io
volumeMounts:
- name: workdir
mountPath: "/work-dir"
dnsPolicy: Default
volumes:
- name: workdir
emptyDir: {}
Here init-containers saves a file in the volume and later the file was available in inside the container. Try the tutorial by yourself for better understanding.

hostPath as volume in kubernetes

I am trying to configure hostPath as the volume in kubernetes. I have logged into VM server, from where I usually use kubernetes commands such as kubectl.
Below is the pod yaml:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: helloworldanilhostpath
spec:
replicas: 1
template:
metadata:
labels:
run: helloworldanilhostpath
spec:
volumes:
- name: task-pv-storage
hostPath:
path: /home/openapianil/samplePV
type: Directory
containers:
- name: helloworldv1
image: ***/helloworldv1:v1
ports:
- containerPort: 9123
volumeMounts:
- name: task-pv-storage
mountPath: /mnt/sample
In VM server, I have created "/home/openapianil/samplePV" folder and I have a file present in it. It has a sample.txt file.
once I try to create this deployment. it does not happen with error -
Warning FailedMount 28s (x7 over 59s) kubelet, aks-nodepool1-39499429-1 MountVolume.SetUp failed for volume "task-pv-storage" : hostPath type check failed: /home/openapianil/samplePV is not a directory.
Can anyone please help me in understanding the problem here.
hostPath type volumes refer to directories on the Node (VM/machine) where your Pod is scheduled for running (aks-nodepool1-39499429-1 in this case). So you'd need to create this directory at least on that Node.
To make sure your Pod is consistently scheduled on that specific Node you need to set spec.nodeSelector in the PodTemplate:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: helloworldanilhostpath
spec:
replicas: 1
template:
metadata:
labels:
run: helloworldanilhostpath
spec:
nodeSelector:
kubernetes.io/hostname: aks-nodepool1-39499429-1
volumes:
- name: task-pv-storage
hostPath:
path: /home/openapianil/samplePV
type: Directory
containers:
- name: helloworldv1
image: ***/helloworldv1:v1
ports:
- containerPort: 9123
volumeMounts:
- name: task-pv-storage
mountPath: /mnt/sample
In most cases it's a bad idea to use this type of volume; there are some special use cases, but chance are yours is not one them!
If you need local storage for some reason then a slightly better solution is to use local PersistentVolumes.