Kubernetes PersistentVolume on local machine, share data - kubernetes

I would like to spin up a Pod on my local machine. Inside the pod is a single container with a .jar file in it. That jar file can take in files, process them, and then output them. I would like to create a PersistentVolume and attach it to the Pod, so the container can access the files.
My Dockerfile:
FROM openjdk:11
WORKDIR /usr/local/dat
COPY . .
ENTRYPOINT ["java", "-jar", "./tool/DAT.jar"]
(Please note that the folder used inside the container is /usr/local/dat)
My PersistentVolume.yml file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dat-volume
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 150Mi
  storageClassName: hostpath
  hostPath:
    path: /home/zoltanvilaghy/WORK/ctp/shared
My PersistentVolumeClaim.yml file:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dat-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
  storageClassName: hostpath
  volumeName: dat-volume
My Pod.yml file:
apiVersion: v1
kind: Pod
metadata:
  name: dat-tool-pod
  labels:
    name: dat-tool-pod
spec:
  containers:
    - name: dat-tool
      image: dat_docker
      imagePullPolicy: Never
      args: ["-in", "/usr/local/dat/shared/input/Archive", "-out", "/usr/local/dat/shared/output/Archive2", "-da"]
      volumeMounts:
        - mountPath: /usr/local/dat/shared
          name: dat-volume
  restartPolicy: Never
  volumes:
    - name: dat-volume
      persistentVolumeClaim:
        claimName: dat-pvc
If all worked well, after attaching the PersistentVolume (and putting the Archive folder inside the shared/input folder), by giving the arguments to the jar file it would be able to process the files and output them to the shared/output folder.
Instead, I get an error saying that the folder cannot be found. Unfortunately, after the error the container exits, so I can't look around inside the container to check the file structure. Can somebody help me identify the problem?
Edit: Output of kubectl get sc, pvc, pv :
NAME                                             PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/hostpath (default)   docker.io/hostpath   Delete          Immediate           false                  20d

NAME                            STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/dat-pvc   Bound    dat-volume   150Mi      RWO            hostpath       4m52s

NAME                          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
persistentvolume/dat-volume   150Mi      RWO            Retain           Bound    default/dat-pvc   hostpath                4m55s

Assuming your sc/pvc/pv are all correct, here's how you can test:
apiVersion: v1
kind: Pod
metadata:
  name: dat-tool-pod
  labels:
    name: dat-tool-pod
spec:
  containers:
    - name: dat-tool
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ["ash","-c","sleep 7200"]
      volumeMounts:
        - mountPath: /usr/local/dat/shared
          name: dat-volume
  restartPolicy: Never
  volumes:
    - name: dat-volume
      persistentVolumeClaim:
        claimName: dat-pvc
After the pod is created, you can run kubectl exec -it dat-tool-pod -- ash and cd /usr/local/dat/shared. There you can check the directories and files (including permissions) to understand why your program complains about a missing directory or files.

For anyone else experiencing this problem, here is what helped me find a solution:
https://github.com/docker/for-win/issues/7023
(And actually the link inside the first comment in this issue.)
My setup was a Windows 10 machine, using WSL2 to run Docker containers and a Kubernetes cluster on my machine. No matter where I put the folder I wanted to share with my Pod, it didn't appear inside the pod. So, based on the link above, I created my folder under /mnt/wsl, as /mnt/wsl/shared, because supposedly /mnt/wsl is where Docker Desktop starts looking for the folder that you want to share. I changed my PersistentVolume.yml to the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dat-volume
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 150Mi
  storageClassName: hostpath
  hostPath:
    path: /run/desktop/mnt/host/wsl/shared
My understanding is that /run/desktop/mnt/host/wsl is the same as /mnt/wsl, and so I could finally pass files between my Pod and my machine.

Related

Why local persistent volumes not visible in EKS?

In order to test whether I can get self-written software deployed on Amazon using Docker images, I have a test EKS cluster.
I have written a small test script that reads and writes a file to see if I understand how to deploy. I have successfully deployed it in minikube, using three replicas. The replicas all use a shared directory on my local file system, and in minikube that is mounted into the pods with a volume.
The next step was to deploy that in the EKS cluster. However, I cannot get it working in EKS. The problem is that the pods don't see the contents of the mounted directory.
This does not completely surprise me, since in minikube I first had to create a mount to a local directory on the server. I have not done anything similar on the EKS server.
My question is what I should do to make this work (if it is possible at all).
I use this yaml file to create a pod in eks:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "pv-volume"
spec:
  storageClassName: local-storage
  capacity:
    storage: "1Gi"
  accessModes:
    - "ReadWriteOnce"
  hostPath:
    path: /data/k8s
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pv-claim"
spec:
  storageClassName: local-storage
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      persistentVolumeClaim:
        claimName: pv-claim
So what I expect is that I have a local directory, /data/k8s, that is visible in the pods as path /config.
When I apply this yaml, I get a pod that gives an error message that makes clear the data in the /data/k8s directory is not visible to the pod.
Kubectl gives me this info after creation of the volume and claim
[rdgon@NL013-PPDAPP015 probeer]$ kubectl get pv,pvc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM              STORAGECLASS   REASON   AGE
persistentvolume/pv-volume                                  1Gi        RWO            Retain           Available                                              15s
persistentvolume/pvc-156edfef-d272-4df6-ae16-09b12e1c2f03   1Gi        RWO            Delete           Bound       default/pv-claim   gp2                     9s

NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/pv-claim   Bound    pvc-156edfef-d272-4df6-ae16-09b12e1c2f03   1Gi        RWO            gp2            15s
Which seems to indicate everything is OK. But it seems that the filesystem of the master node, on which I run the yaml file to create the volume, is not the location where the pods look when they access the /config dir.
On EKS, there's no storage class named 'local-storage' by default.
There is only a 'gp2' storage class, which is also used when you don't specify a storageClassName.
The 'gp2' storage class creates a dedicated EBS volume and attaches it to your Kubernetes Node when required, so it doesn't use a local folder. You also don't need to create the pv manually, just the pvc:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pv-claim"
spec:
  storageClassName: gp2
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      persistentVolumeClaim:
        claimName: pv-claim
If you want a folder on the Node itself, you can use a 'hostPath' volume, and you don't need a pv or pvc for that:
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      hostPath:
        path: /data/k8s
This is usually a bad idea: the data lives only on that one node, so the pod loses access to it if another node starts up and the pod is moved to the new node.
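If a hostPath volume is used anyway, one way to reduce that risk is to pin the pod to a single node. The manifest below is only a sketch, not part of the original answer: it reuses the names from the question, and the node name my-node-1 is a placeholder that has to be replaced with a real value from kubectl get nodes. It relies on the well-known kubernetes.io/hostname node label.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  # Assumption: "my-node-1" is a placeholder, not a real node name.
  # Pinning keeps the pod on the node that actually holds /data/k8s.
  nodeSelector:
    kubernetes.io/hostname: my-node-1
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: /config
  volumes:
    - name: cmount
      hostPath:
        path: /data/k8s
        type: DirectoryOrCreate
```

Pinning does not make the data durable: if that node is replaced, the directory is gone with it.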
If it's for configuration only, you can also use a configMap, and put the files directly in your kubernetes manifest files.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ruud-config
data:
  ruud.properties: |
    my ruud.properties file content...
---
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      configMap:
        name: ruud-config
Please check whether the pv got created and whether it is "Bound" to the PVC by running the commands below:
kubectl get pv
kubectl get pvc
These show whether the objects were created properly.
The local path you refer to is not valid. Try:
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: /config
  volumes:
    - name: cmount
      hostPath:
        path: /data/k8s
        type: DirectoryOrCreate # <-- You need this since the directory may not exist on the node.

Kubernetes fsGroup not changing file ownership on PersistentVolume

On the host, everything in the mounted directory (/opt/testpod) is owned by uid=0 gid=0. I need those files to be owned by whatever the container decides, i.e. a different gid, to be able to write there. Resources I'm testing with:
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv
  labels:
    name: pv
spec:
  storageClassName: manual
  capacity:
    storage: 10Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/testpod"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
spec:
  storageClassName: manual
  selector:
    matchLabels:
      name: pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: testpod
spec:
  nodeSelector:
    foo: bar
  securityContext:
    runAsUser: 500
    runAsGroup: 500
    fsGroup: 500
  volumes:
    - name: vol
      persistentVolumeClaim:
        claimName: pvc
  containers:
    - name: testpod
      image: busybox
      command: [ "sh", "-c", "sleep 1h" ]
      volumeMounts:
        - name: vol
          mountPath: /data
After the pod is running, I kubectl exec into it, and ls -la /data shows everything still owned by gid=0. According to some Kubernetes docs, fsGroup is supposed to chown everything on pod start, but that doesn't happen. What am I doing wrong?
A hostPath PV doesn't support the security context. You have to be root to write to the volume. This is described well in this GitHub issue and in the docs about hostPath:
"The directories created on the underlying hosts are only writable by root. You either need to run your process as root in a privileged container or modify the file permissions on the host to be able to write to a hostPath volume."
You may also want to check this GitHub request describing why changing the permissions of a host directory is dangerous.
The workaround that people report as working is to grant your user sudo privileges, but that defeats the purpose of running the container as a non-root user.
The security context does appear to work well with an emptyDir volume (described well in the k8s docs here).
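As a rough illustration of that last point, the sketch below takes the Pod from the question and swaps the PVC-backed volume for an emptyDir. The pod name testpod-emptydir is made up for this example, and the nodeSelector is dropped because an emptyDir does not depend on a particular node's directory. With this kind of volume, fsGroup is expected to be applied to /data, so ls -la /data inside the container should show group 500 instead of 0.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: testpod-emptydir   # hypothetical name, not from the question
spec:
  securityContext:
    runAsUser: 500
    runAsGroup: 500
    fsGroup: 500            # applied to volumes that support ownership management
  volumes:
    - name: vol
      emptyDir: {}          # replaces the hostPath-backed PVC from the question
  containers:
    - name: testpod
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      volumeMounts:
        - name: vol
          mountPath: /data
```

Note that an emptyDir is deleted together with the pod, so this only helps when the data does not have to outlive the pod.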

PV file not saved on host

Hi all, a quick question on host paths for persistent volumes.
I created a PV and PVC here:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
and I ran a sample pod
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
I exec'd into the pod and created a file:
root@task-pv-pod:/# cd /usr/share/nginx/html
root@task-pv-pod:/usr/share/nginx/html# ls
tst.txt
However, when I go back to my host and try to ls the file, it's not appearing. Any idea why? My PV and PVC are correct, as I can see that the claim has been bound.
ubuntu@ip-172-31-24-21:/home$ cd /mnt/data
ubuntu@ip-172-31-24-21:/mnt/data$ ls -lrt
total 0
A persistent volume (PV) is a Kubernetes resource which has its own lifecycle, independent of the pod (see the PV documentation). A PV consumed through a PVC is usually backed by some other storage system, for example Azure Files, EBS, or a server with NFS. My point here is that there is no reason why the PV's data should exist on the node.
If you want your data to be saved on the node, use the hostPath option for PVs. Check this link. This is not a good production practice, though.
First of all, you don't necessarily need to create a PV if you are creating a PVC. With a storageClass that supports dynamic provisioning, the PVC causes a PV to be created for it.
Second, hostPath is a delicate case in the Kubernetes world. It can be mounted in a Pod directly as a volume, without a PV being created first. So you could have created neither a PV nor a PVC, and a hostPath volume would work just fine.
To make a test, delete your PV and PVC, and create your Pod like this:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-volume
  labels:
    app: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      securityContext:
        privileged: true
      ports:
        - containerPort: 80
          name: nginx-http
      volumeMounts:
        - name: nginx
          mountPath: /root/nginx-volume # path in the pod
  volumes:
    - name: nginx
      hostPath:
        path: /var/test # path in the host machine
I know this is a confusing concept, but that's how it is.
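To illustrate the first point (a PVC on its own, with the PV created for it), here is a minimal sketch. It assumes the cluster has a StorageClass with a dynamic provisioner; the class name standard is an assumption for illustration, so check kubectl get storageclass for what your cluster actually offers.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  # Assumption: a StorageClass named "standard" with a dynamic provisioner exists.
  # No PersistentVolume manifest is written by hand; the provisioner creates one.
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
```

Where the data ends up then depends on the provisioner, which is typically not a directory you can browse on the node.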

How to have multiple pods access an existing NFS folder in Kubernetes?

I have a folder of TFRecords on a network that I want to expose to multiple pods. The folder has been exported via NFS.
I have tried creating a Persistent Volume, followed by a Persistent Volume Claim. However, that just creates a folder inside the NFS mount, which I don't want. Instead, I want the Pod to access the folder with the TFRecords.
I have listed the manifests for the PV and PVC.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-tfrecord-pv
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /media/veracrypt1/
    server: 1.2.3.4
    readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-tfrecord-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-tfrecord
  resources:
    requests:
      storage: 1Gi
I figured it out. The issue was that I was looking at the problem the wrong way. I didn't need any provisioning. Instead, what was needed was to simply mount the NFS volume within the container:
kind: Pod
apiVersion: v1
metadata:
  name: pod-using-nfs
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
        - name: data
          mountPath: /mnt/data
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
    - name: data
      nfs:
        server: 1.2.3.4
        path: /media/foo/DATA
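For readers who would rather keep a PV/PVC pair, for example so that several pods can share one claim, a possible sketch follows. It rests on an assumption that the thread does not confirm: that the original claim never bound to the hand-made PV because the PVC asked for storageClassName nfs-tfrecord while the PV declared no class. Setting storageClassName to the empty string on both objects and naming the PV in volumeName requests static binding without any provisioner. The server and path are copied from the question.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-tfrecord-pv
spec:
  storageClassName: ""            # no class: this PV is only bound statically
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 1.2.3.4
    path: /media/veracrypt1/
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-tfrecord-pvc
  namespace: default
spec:
  storageClassName: ""            # empty string disables dynamic provisioning
  volumeName: nfs-tfrecord-pv     # bind to the PV above by name
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
```

Each pod would then reference the claim through persistentVolumeClaim.claimName: nfs-tfrecord-pvc instead of repeating the NFS server and path.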

Setup with Kubernetes hostPath but file doesn't show up in container

I'm trying to set up hostPath to share a file between pods.
I'm following this guide Configure a Pod to Use a PersistentVolume for Storage.
Here are the configuration files for pv,pvc,pod.
PV:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data"
PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
POD:
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
I'm adding a file to /tmp/data, but I can't see it in the container.
When I check the status of pv,pvc,pod, the result is as follows:
Can someone give me a clue on why I can't see the file? Any suggestion or command on how to debug this kind of issue is welcome.
MINIKUBE specific
I was getting the same error when running a local cluster on minikube for Mac.
Minikube actually creates a VM and then runs your containers on it. So the hostPath actually refers to paths inside that VM and not on your local machine. That is why all mounts show up as empty folders.
Solution:
Map your local path into minikube's VM under the same name. That way you can refer to it as is in your Kubernetes manifests.
minikube mount <source directory>:<target directory>
In this case:
minikube mount /tmp/data:/tmp/data
This should do the trick.
SOURCE: https://minikube.sigs.k8s.io/docs/handbook/mount/
I think I figured it out.
HostPath is only suitable for a one-node cluster. My cluster has 2 nodes, so the physical storage the PV uses is on another computer.
When I first went through the documentation, I didn't pay attention to this:
"You need to have a Kubernetes cluster that has only one Node"
Did you put a local file in /tmp/data in the first place?
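On the two-node situation described above, one option is a local PersistentVolume with node affinity, which tells the scheduler which node holds the directory so that pods using the claim are placed there. This is a sketch under assumptions rather than something taken from the thread: node-1 is a placeholder node name, and /tmp/data must already exist on that node, since a local volume does not create it.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual        # matches the PVC from the question
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /tmp/data               # must already exist on the chosen node
  nodeAffinity:                   # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1          # assumption: placeholder node name
```

The PVC and Pod from the question can stay as they are. The file then has to be placed in /tmp/data on that specific node, not on the machine where kubectl is run.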