OpenShift-Job to copy data from sftp to persistent volume - kubernetes

I would like to deploy a job which copies multiple files from sftp to a persistent volume and then completes.
My current version of this job looks like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: job
spec:
  template:
    spec:
      containers:
      - name: init-pv
        image: w0wka91/ubuntu-sshpass
        command: ["sshpass -p $PASSWORD scp -o StrictHostKeyChecking=no -P 22 -r user@sftp.mydomain.com:/RESSOURCES/* /mnt/myvolume"]
        volumeMounts:
        - mountPath: /mnt/myvolume
          name: myvolume
        envFrom:
        - secretRef:
            name: ftp-secrets
      restartPolicy: Never
      volumes:
      - name: myvolume
        persistentVolumeClaim:
          claimName: myvolume
  backoffLimit: 3
When I deploy the job, the pod starts but it always fails to create the container:
sshpass -p $PASSWORD scp -o StrictHostKeyChecking=no -P 22 -r user@sftp.mydomain.com:/RESSOURCES/* /mnt/myvolume: no such file or directory
It seems like the command gets executed before the volume is mounted, but I couldn't find any documentation about it.
When I debug the pod and execute the command manually it all works fine, so the command is definitely working.
Any ideas how to overcome this issue?

The volume mount is incorrect, change it to:
volumeMounts:
- mountPath: /mnt/myvolume
  name: myvolume
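For reference, here is a sketch of the same container with the command wrapped in a shell. The quoted error is the typical symptom of the exec form treating the whole string as a single executable path, so this is an assumption about a second change that may be needed alongside the mount fix:
containers:
- name: init-pv
  image: w0wka91/ubuntu-sshpass
  # Run through a shell so $PASSWORD is expanded and the glob is evaluated
  command: ["/bin/sh", "-c"]
  args: ["sshpass -p $PASSWORD scp -o StrictHostKeyChecking=no -P 22 -r user@sftp.mydomain.com:/RESSOURCES/* /mnt/myvolume"]
  envFrom:
  - secretRef:
      name: ftp-secrets
  volumeMounts:
  - mountPath: /mnt/myvolume
    name: myvolume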


Kubernetes initContainers to copy file and execute [duplicate]

This question already has an answer here: How to share a file from initContainer to base container in Kubernetes (1 answer)
I have a situation where we will have a list of IP addresses (coming from a config map). We need to validate these IP addresses (i.e. check whether they are reachable from this machine) and return the first accessible one, so that the application can use it for further actions.
I got to know that we can use initContainers for this. My question is: how can we run a shell script in the init container to identify the accessible IP address and set it in an environment variable so that the application can process it further?
InitContainers can communicate with other, normal containers through volumes.
You can use the emptyDir volume type, which is a directory that allows the pod to store data for the duration of its life cycle.
apiVersion: v1
kind: Pod
metadata:
  name: pod-name
spec:
  volumes:
  - name: addresses
    emptyDir: {}
  initContainers:
  - name: ip-selector
    image: your-image
    volumeMounts:
    - name: addresses
      mountPath: /path/to/ip/addresses
  containers:
  - name: ip-handler
    image: your-image
    volumeMounts:
    - name: addresses
      mountPath: /path/to/ip/addresses/handler
      readOnly: true # optional
Your initContainer can now save a .env file with addresses under /path/to/ip/addresses, and your normal container can then read this file from /path/to/ip/addresses/handler.
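A hedged sketch of what the two sides might look like; the file name ip.env, the literal address, and the my-app command are placeholders, not part of the answer above:
initContainers:
- name: ip-selector
  image: your-image
  # Hypothetical: write the selected address into a shared env file
  command: ["/bin/sh", "-c", "echo SELECTED_IP=10.0.1.5 > /path/to/ip/addresses/ip.env"]
  volumeMounts:
  - name: addresses
    mountPath: /path/to/ip/addresses
containers:
- name: ip-handler
  image: your-image
  # Hypothetical: load the file into the environment before starting the app
  command: ["/bin/sh", "-c", ". /path/to/ip/addresses/handler/ip.env && exec my-app"]
  volumeMounts:
  - name: addresses
    mountPath: /path/to/ip/addresses/handler
    readOnly: true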
Option 1:
Once you get the IP inside the init container, you can create a secret with that value and use it.
initContainers:
- name: secret
  image: gcr.io/cloud-builders/kubectl:latest
  command:
  - sh
  - -c
  - kubectl create secret mysecret ... -o yaml | kubectl apply -f -
containers:
- name: test-container
  image: image-uri:latest
  envFrom:
  - secretRef:
      name: mysecret
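A more concrete (hypothetical) form of that init command; the secret name, the key, and the way the address ends up in the IP variable are assumptions, and the service account also needs permission to create secrets:
initContainers:
- name: secret
  image: gcr.io/cloud-builders/kubectl:latest
  command:
  - sh
  - -c
  # IP is assumed to already hold the selected address (placeholder value); apply makes the step idempotent
  - IP=10.0.1.5; kubectl create secret generic mysecret --from-literal=ip=$IP --dry-run=client -o yaml | kubectl apply -f -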
Option 2:
You can also use a shared volume mount to write the IP into a file; inside the main container you then run a command that loads the file contents into the environment, so when your main container starts, it starts with the IP set as an environment variable.
command: [sh, -c, ". /tmp/env && service command"]
Example
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: [sh, -c, ". /tmp/env && service command"]
    volumeMounts:
    - name: workdir
      mountPath: /tmp
  initContainers:
  - name: ip-check
    image: busybox:1.28
    command:
    - "command or script that checks the IPs and writes the first reachable one to /tmp/env"
    volumeMounts:
    - name: workdir
      mountPath: "/tmp"
  dnsPolicy: Default
  volumes:
  - name: workdir
    emptyDir: {}
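A hedged sketch of what the ip-check init container could actually run, assuming the candidate addresses arrive in an IP_LIST environment variable taken from a ConfigMap; the ConfigMap name, key, and the ping-based check are assumptions, not from the answer above:
initContainers:
- name: ip-check
  image: busybox:1.28
  env:
  - name: IP_LIST
    valueFrom:
      configMapKeyRef:
        name: ip-config   # hypothetical ConfigMap holding e.g. "10.0.1.5 10.0.1.6"
        key: addresses
  command:
  - sh
  - -c
  - |
    # Probe each candidate and write the first reachable one to the shared file
    for ip in $IP_LIST; do
      if ping -c 1 -W 1 "$ip" > /dev/null 2>&1; then
        echo "SELECTED_IP=$ip" > /tmp/env
        exit 0
      fi
    done
    echo "no reachable IP found" >&2
    exit 1
  volumeMounts:
  - name: workdir
    mountPath: /tmp
The main container then picks the value up with ". /tmp/env" as shown above.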

The data is not being shared across containers

I am trying to create two containers within a pod, with one container being an init container. The job of the init container is to download a jar and make it available for the app container. I am able to create everything and the logs look good, but when I check, I do not see the jar in my app container. Below is my deployment yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-service-test
  labels:
    app: web-service-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-service-test
  template:
    metadata:
      labels:
        app: web-service-test
    spec:
      volumes:
      - name: shared-data
        emptyDir: {}
      containers:
      - name: web-service-test
        image: some image
        ports:
        - containerPort: 8081
        volumeMounts:
        - name: shared-data
          mountPath: /tmp/jar
      initContainers:
      - name: init-container
        image: busybox
        volumeMounts:
        - name: shared-data
          mountPath: /jdbc-jar
        command:
        - wget
        - "https://repo1.maven.org/maven2/com/oracle/ojdbc/ojdbc8/19.3.0.0/ojdbc8-19.3.0.0.jar"
You need to save the jar in the /jdbc-jar folder (the path where the shared volume is mounted in the init container).
Try updating your yaml to the following:
command: ["/bin/sh"]
args: ["-c", "wget -O /jdbc-jar/ojdbc8-19.3.0.0.jar https://repo1.maven.org/maven2/com/oracle/ojdbc/ojdbc8/19.3.0.0/ojdbc8-19.3.0.0.jar"]
Add the following block of code to your init container section:
command: ["/bin/sh","-c"]
args: ["wget -O /jdbc-jar/ojdbc8-19.3.0.0.jar https://repo1.maven.org/maven2/com/oracle/ojdbc/ojdbc8/19.3.0.0/ojdbc8-19.3.0.0.jar"]
The command ["/bin/sh", "-c"] says "run a shell, and execute the following instructions". The args are then passed as commands to the shell. In shell scripting a semicolon separates commands. In the wget command I have added the -O flag to download the jar from the specified URL and save it as /jdbc-jar/ojdbc8-19.3.0.0.jar.
To check whether the jar is present in the container, simply execute:
$ kubectl exec -it web-service-test -- /bin/bash
Then go to the folder where the shared volume is mounted in the app container, /tmp/jar ($ cd /tmp/jar), and list the files in it ($ ls -al). You should see your jar there.
See examples: commands-in-containers, initcontainers-running.
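Putting the two snippets together, the init container section of the deployment might look like this (a sketch; only the init container block changes, the rest of the YAML above stays the same):
initContainers:
- name: init-container
  image: busybox
  # Download the driver into the shared emptyDir; the app container sees it under /tmp/jar
  command: ["/bin/sh", "-c"]
  args: ["wget -O /jdbc-jar/ojdbc8-19.3.0.0.jar https://repo1.maven.org/maven2/com/oracle/ojdbc/ojdbc8/19.3.0.0/ojdbc8-19.3.0.0.jar"]
  volumeMounts:
  - name: shared-data
    mountPath: /jdbc-jar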

how do scripts/files get mounted to kubernetes pods

I'd like to create a cronjob that runs a Python script from a mounted PVC, but I don't understand how to get test.py into the container from my local file system.
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: update_db
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: update-fingerprints
            image: python:3.6.2-slim
            command: ["/bin/bash"]
            args: ["-c", "python /client/test.py"]
            volumeMounts:
            - name: application-code
              mountPath: /where/ever
          restartPolicy: OnFailure
          volumes:
          - name: application-code
            persistentVolumeClaim:
              claimName: application-code-pv-claim
You have a volume called application-code. In there lies the test.py file. Now you mount the volume, but you are not setting the mountPath according to your shell command.
The argument is python /client/test.py, so you expect the file to be placed in the /client directory. You just have to mount the volume with this path:
volumeMounts:
- name: application-code
  mountPath: /client
Update
If you don't need the file outside the cluster, it would be much easier to integrate it into your Docker image. Here is an example Dockerfile:
FROM python:3.6.2-slim
WORKDIR /data
COPY test.py .
ENTRYPOINT ["/bin/bash", "-c", "python /data/test.py"]
Push the image to your docker registry and reference it from your yml.
containers:
- name: update-fingerprints
  image: <your-container-registry>:<image-name>
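If the script really does need to live on the PVC rather than in the image, one common pattern (an assumption on my part, not covered by the answers above) is to mount the claim in a temporary helper pod, copy the file in once with kubectl cp test.py pvc-helper:/client/test.py, and then delete the pod:
apiVersion: v1
kind: Pod
metadata:
  name: pvc-helper   # hypothetical throwaway pod
spec:
  containers:
  - name: helper
    image: busybox
    # Keep the pod alive long enough to copy the file in
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: application-code
      mountPath: /client
  volumes:
  - name: application-code
    persistentVolumeClaim:
      claimName: application-code-pv-claim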

How to allow a Kubernetes Job access to a file on host

I've been through the Kubernetes documentation thoroughly but am still having problems interacting with a file on the host filesystem from an application running inside a pod launched by a Kubernetes Job. This happens with even the simplest utility, so I have included a stripped-down example of my yaml config. The local file, 'hello.txt', referenced here does exist in /tmp on the host (i.e. outside the Kubernetes environment), and I have even chmod 777'd it. I've also tried places in the host filesystem other than /tmp.
The pod that is launched by the Kubernetes Job terminates with Status=Error and generates the log ls: /testing/hello.txt: No such file or directory
Because I ultimately want to use this programmatically as part of a much more sophisticated workflow it really needs to be a Job not a Deployment. I hope that is possible. My current config file which I am launching with kubectl just for testing is:
apiVersion: batch/v1
kind: Job
metadata:
  name: kio
  namespace: kmlflow
spec:
  # ttlSecondsAfterFinished: 5
  template:
    spec:
      containers:
      - name: kio-ingester
        image: busybox
        volumeMounts:
        - name: test-volume
          mountPath: /testing
        imagePullPolicy: IfNotPresent
        command: ["ls"]
        args: ["-l", "/testing/hello.txt"]
      volumes:
      - name: test-volume
        hostPath:
          # directory location on host
          path: /tmp
          # this field is optional
          # type: Directory
      restartPolicy: Never
  backoffLimit: 4
Thanks in advance for any assistance.
Looks like when the volume is mounted, the existing data can't be accessed.
You will need to make use of an init container to pre-populate the data in the volume.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: my-app:latest
    volumeMounts:
    - name: config-data
      mountPath: /data
  initContainers:
  - name: config-data
    image: busybox
    # Run through a shell so the redirection into /data/config actually happens
    command: ["sh", "-c", "echo -n \"{'address':'10.0.1.192:2379/db'}\" > /data/config"]
    volumeMounts:
    - name: config-data
      mountPath: /data
  volumes:
  - name: config-data
    hostPath: {}
Reference:
https://medium.com/@jmarhee/using-initcontainers-to-pre-populate-volume-data-in-kubernetes-99f628cd4519
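Applied to the Job from the question, that pre-population pattern might look like the sketch below (the file content and the switch from hostPath to emptyDir are assumptions, just to keep the example self-contained):
apiVersion: batch/v1
kind: Job
metadata:
  name: kio
  namespace: kmlflow
spec:
  template:
    spec:
      initContainers:
      - name: seed-file
        image: busybox
        # Hypothetical: write the expected file into the shared volume first
        command: ["sh", "-c", "echo hello > /testing/hello.txt"]
        volumeMounts:
        - name: test-volume
          mountPath: /testing
      containers:
      - name: kio-ingester
        image: busybox
        command: ["ls"]
        args: ["-l", "/testing/hello.txt"]
        volumeMounts:
        - name: test-volume
          mountPath: /testing
      volumes:
      - name: test-volume
        emptyDir: {}
      restartPolicy: Never
  backoffLimit: 4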

Can I use a configmap created from an init container in the pod

I am trying to "pass" a value from the init container to a container. Since values in a configmap are shared across the namespace, I figured I can use it for this purpose. Here is my job.yaml (with faked-out info):
apiVersion: batch/v1
kind: Job
metadata:
  name: installer-test
spec:
  template:
    spec:
      containers:
      - name: installer-test
        image: installer-test:latest
        env:
        - name: clusterId
          value: "some_cluster_id"
        - name: in_artifactoryUrl
          valueFrom:
            configMapKeyRef:
              name: test-config
              key: artifactorySnapshotUrl
      initContainers:
      - name: artifactory-snapshot
        image: busybox
        command: ['kubectl', 'create configmap test-config --from-literal=artifactorySnapshotUrl=http://artifactory.com/some/url']
      restartPolicy: Never
  backoffLimit: 0
This does not seem to work (EDIT: although the statements following this edit note may still be correct, this is not working because kubectl is not a recognizable command in the busybox image), and I am assuming that the pod can only read values from a configmap created BEFORE the pod is created. Has anyone else come across the difficulty of passing values between containers, and what did you do to solve this?
Should I deploy the configmap in another pod and wait to deploy this one until the configmap exists?
(I know I can write files to a volume, but I'd rather not go that route unless it's absolutely necessary, since it essentially means our docker images must be coupled to an environment where some specific files exist)
You can create an emptyDir volume and mount it in both containers. Unlike a persistent volume, emptyDir has no portability issues.
apiVersion: batch/v1
kind: Job
metadata:
  name: installer-test
spec:
  template:
    spec:
      containers:
      - name: installer-test
        image: installer-test:latest
        env:
        - name: clusterId
          value: "some_cluster_id"
        volumeMounts:
        - name: tmp
          mountPath: /tmp/artifact
      initContainers:
      - name: artifactory-snapshot
        image: busybox
        command: ['/bin/sh', '-c', 'cp x /tmp/artifact/x']
        volumeMounts:
        - name: tmp
          mountPath: /tmp/artifact
      restartPolicy: Never
      volumes:
      - name: tmp
        emptyDir: {}
  backoffLimit: 0
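To actually consume the value in the main container, you would read the shared file at startup. A minimal sketch, assuming the init container wrote the URL into /tmp/artifact/x and that /entrypoint.sh is the (hypothetical) real entrypoint of the image:
containers:
- name: installer-test
  image: installer-test:latest
  # Load the value produced by the init container, then hand over to the real entrypoint
  command: ["/bin/sh", "-c", "export in_artifactoryUrl=$(cat /tmp/artifact/x) && exec /entrypoint.sh"]
  volumeMounts:
  - name: tmp
    mountPath: /tmp/artifact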
If, for various reasons, you don't want to use a shared volume and you want to create a configmap or a secret instead, here is a solution.
First you need to use a docker image which contains kubectl, for example gcr.io/cloud-builders/kubectl:latest (a docker image containing kubectl, maintained by Google).
Then this (init) container needs enough rights to create resources on the Kubernetes cluster. By default, Kubernetes injects the token of the default service account (named "default") into the container, but I prefer to make it explicit, so add this line (it is a field of the pod spec):
...
spec:
  # Already true by default but if you rely on it, prefer to make it explicit
  automountServiceAccountToken: true
  initContainers:
  - name: artifactory-snapshot
And add "edit" role to "default" service account:
kubectl create rolebinding default-edit-rb --clusterrole=edit --serviceaccount=default:myapp --namespace=default
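The same binding expressed as a manifest, if you prefer to keep it in YAML (a sketch for the default namespace):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: default-edit-rb
  namespace: default
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
roleRef:
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io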
Then the complete example:
apiVersion: batch/v1
kind: Job
metadata:
  name: installer-test
spec:
  template:
    spec:
      # Already true by default but if you rely on it, prefer to make it explicit (pod-level field)
      automountServiceAccountToken: true
      initContainers:
      - name: artifactory-snapshot
        # You need to use a docker image which contains kubectl
        image: gcr.io/cloud-builders/kubectl:latest
        command:
        - sh
        - -c
        # the "--dry-run -o yaml | kubectl apply -f -" is to make the command idempotent
        - kubectl create configmap test-config --from-literal=artifactorySnapshotUrl=http://artifactory.com/some/url --dry-run -o yaml | kubectl apply -f -
      containers:
      - name: installer-test
        image: installer-test:latest
        env:
        - name: clusterId
          value: "some_cluster_id"
        - name: in_artifactoryUrl
          valueFrom:
            configMapKeyRef:
              name: test-config
              key: artifactorySnapshotUrl
First of all, kubectl is a binary. It was downloaded onto your machine before you could use the command. But in your pod the kubectl binary doesn't exist, so you can't use the kubectl command from a busybox image.
Furthermore, kubectl uses credentials that are saved on your machine (probably under ~/.kube). So if you try to use kubectl from inside an image, it will fail because of missing credentials.
For your scenario, I suggest the same as @ccshih: use volume sharing.
Here is the official doc about volume sharing between init containers and containers.
The YAML used there is:
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    volumeMounts:
    - name: workdir
      mountPath: /usr/share/nginx/html
  # These containers are run during pod initialization
  initContainers:
  - name: install
    image: busybox
    command:
    - wget
    - "-O"
    - "/work-dir/index.html"
    - http://kubernetes.io
    volumeMounts:
    - name: workdir
      mountPath: "/work-dir"
  dnsPolicy: Default
  volumes:
  - name: workdir
    emptyDir: {}
Here the init container saves a file in the volume, and later the file is available inside the main container. Try the tutorial yourself for a better understanding.