Setup with Kubernetes hostPath but file doesn't show up in container

I'm trying to set up hostPath to share a file between pods.
I'm following this guide: Configure a Pod to Use a PersistentVolume for Storage.
Here are the configuration files for the PV, PVC, and Pod.
PV:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data"
PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
POD:
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
I'm adding a file to /tmp/data but I can't see it in the container.
When I check the status of the PV, PVC, and Pod, the result is as follows:
Can someone give me a clue on why I can't see the file? Any suggestion or command on how to debug this kind of issue is welcome.

MINIKUBE specific
I was getting the same error when running a local cluster on minikube for Mac.
Minikube actually creates a VM and then runs your containers on it. So the hostPath actually refers to paths inside that VM and not on your local machine. That is why all mounts show up as empty folders.
Solution:
Map your local path into minikube's VM under the same name. That way you can refer to it as-is in your Kubernetes manifests.
minikube mount <source directory>:<target directory>
In this case:
minikube mount /tmp/data:/tmp/data
This should do the trick.
SOURCE: https://minikube.sigs.k8s.io/docs/handbook/mount/
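To verify the mount is working, you can check both sides of it (a quick sanity check; note that minikube mount must stay running in its own terminal, and task-pv-pod is the pod name from the question):
# leave the mount running in one terminal
minikube mount /tmp/data:/tmp/data

# in a second terminal, confirm the file is visible inside the minikube VM
minikube ssh "ls -la /tmp/data"

# and then inside the pod itself
kubectl exec task-pv-pod -- ls -la /usr/share/nginx/html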

I think I figured it out.
hostPath is only suitable for a one-node cluster. My cluster has 2 nodes, so the physical storage the PV uses is on another machine.
When I first went through the documentation, I didn't pay attention to this:
"You need to have a Kubernetes cluster that has only one Node"

Did you put a local file in /tmp/data in the first place?

Related

Kubernetes PersistentVolume on local machine, share data

I would like to spin up a Pod on my local machine. Inside the pod is a single container with a .jar file in it. That jar file can take in files, process them, and then output them. I would like to create a PersistentVolume and attach it to the Pod, so the container can access the files.
My Dockerfile:
FROM openjdk:11
WORKDIR /usr/local/dat
COPY . .
ENTRYPOINT ["java", "-jar", "./tool/DAT.jar"]
(Please note that the folder used inside the container is /usr/local/dat)
My PersistentVolume.yml file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dat-volume
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 150Mi
  storageClassName: hostpath
  hostPath:
    path: /home/zoltanvilaghy/WORK/ctp/shared
My PersistentVolumeClaim.yml file:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dat-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
  storageClassName: hostpath
  volumeName: dat-volume
My Pod.yml file:
apiVersion: v1
kind: Pod
metadata:
  name: dat-tool-pod
  labels:
    name: dat-tool-pod
spec:
  containers:
    - name: dat-tool
      image: dat_docker
      imagePullPolicy: Never
      args: ["-in", "/usr/local/dat/shared/input/Archive", "-out", "/usr/local/dat/shared/output/Archive2", "-da"]
      volumeMounts:
        - mountPath: /usr/local/dat/shared
          name: dat-volume
  restartPolicy: Never
  volumes:
    - name: dat-volume
      persistentVolumeClaim:
        claimName: dat-pvc
If all worked well, after attaching the PersistentVolume (and putting the Archive folder inside the shared/input folder), by giving the arguments to the jar file it would be able to process the files and output them to the shared/output folder.
Instead, I get an error saying that the folder cannot be found. Unfortunately, after the error the container exits, so I can't look around inside the container to check the file structure. Can somebody help me identify the problem?
Edit: Output of kubectl get sc, pvc, pv :
NAME                                             PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/hostpath (default)   docker.io/hostpath   Delete          Immediate           false                  20d

NAME                            STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/dat-pvc   Bound    dat-volume   150Mi      RWO            hostpath       4m52s

NAME                          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
persistentvolume/dat-volume   150Mi      RWO            Retain           Bound    default/dat-pvc   hostpath                4m55s
Assuming your sc/pvc/pv are all correct, here's how you can test:
apiVersion: v1
kind: Pod
metadata:
  name: dat-tool-pod
  labels:
    name: dat-tool-pod
spec:
  containers:
    - name: dat-tool
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ["ash","-c","sleep 7200"]
      volumeMounts:
        - mountPath: /usr/local/dat/shared
          name: dat-volume
  restartPolicy: Never
  volumes:
    - name: dat-volume
      persistentVolumeClaim:
        claimName: dat-pvc
After the pod is created, you can kubectl exec -it dat-tool-pod -- ash and cd /usr/local/dat/shared. There you can check the directories/files (incl. permissions) to understand why your program complained about missing directories/files.
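If you only want a one-shot listing without an interactive shell, something like this should also work:
kubectl exec dat-tool-pod -- ls -la /usr/local/dat/shared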
For anyone else experiencing this problem, here is what helped me find a solution:
https://github.com/docker/for-win/issues/7023
(And actually the link inside the first comment in this issue.)
So my setup was a Windows 10 machine, using WSL2 to run Docker containers and a Kubernetes cluster on my machine. No matter where I put the folder I wanted to share with my Pod, it didn't appear inside the pod. So based on the link above, I created my folder in /mnt/wsl, called /mnt/wsl/shared.
Because supposedly, this /mnt/wsl folder is where Docker Desktop will start to look for the folder that you want to share. I changed my PersistentVolume.yml to the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dat-volume
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 150Mi
  storageClassName: hostpath
  hostPath:
    path: /run/desktop/mnt/host/wsl/shared
My understanding is that /run/desktop/mnt/host/wsl is the same as /mnt/wsl, and so I could finally pass files between my Pod and my machine.
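As a sanity check of that wiring, you can drop a file in from the WSL2 side and list it from inside a pod that mounts the claim (using the sleeping busybox pod from the earlier answer, since the jar container exits on error):
# inside the WSL2 distro
mkdir -p /mnt/wsl/shared/input
echo hello > /mnt/wsl/shared/input/test.txt

# from the pod's point of view
kubectl exec dat-tool-pod -- ls -la /usr/local/dat/shared/input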

how to find my persistent volume location

I tried creating a persistent volume using a host path. I could have bound it to a specific node using node affinity, but I didn't provide that. My persistent volume YAML looks like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins
  labels:
    type: fast
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: /mnt/data
After this I created a PVC:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
And finally attached it to the pod:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: thinkingmonster/nettools
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
Now the describe command for the PV or PVC does not tell me on which node the volume /mnt/data has actually been kept,
and I had to ssh into all the nodes to locate it.
And the pod is smart enough to be created only on the node where Kubernetes mapped the host directory to the PV.
How can I know on which node Kubernetes has created the persistent volume, without having to ssh into the nodes or check where the pod is running?
It's only when a volume is bound to a claim that it's associated with a particular node. HostPath volumes are a bit different than the regular sort, making it a little less clear. When you get the volume claim, the annotations on it should give you a bunch of information, including what you're looking for. In particular, look for the:
volume.kubernetes.io/selected-node: ${NODE_NAME}
annotation on the PVC. You can see the annotations, along with the other computed configuration, by asking the Kubernetes api server for that info:
kubectl get pvc -o yaml -n ${NAMESPACE} ${PVC_NAME}
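If you only want the node name itself, a jsonpath query can pull out just that annotation (the dots inside the key have to be escaped):
kubectl get pvc -n ${NAMESPACE} ${PVC_NAME} \
  -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}'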

Kubernetes - Generate files on all the pods

I have a Java API which exports data to an Excel file and generates that file on the pod where the request is served.
Now the next request (to download the file) might go to a different POD and the download fails.
How do I get around this?
How do I generate files on all the POD? Or how do I make sure the subsequent request goes to the same POD where file was generated?
I can't give the direct pod URL as it will not be accessible to clients.
Thanks.
You need to use persistent volumes to share the same files between your containers. You could use the node storage mounted on containers (easiest way) or other distributed file systems like NFS, EFS (AWS), GlusterFS, etc...
If you need the simplest way to share the file and your pods are on the same node, you could use hostPath to store the file and share the volume with the other containers.
Assuming you have a Kubernetes cluster that has only one node, and you want to share the path /mnt/data of your node with your pods:
Create a PersistentVolume:
A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Create a PersistentVolumeClaim:
Pods use PersistentVolumeClaims to request physical storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Look at the PersistentVolumeClaim:
kubectl get pvc task-pv-claim
The output shows that the PersistentVolumeClaim is bound to your PersistentVolume, task-pv-volume.
NAME            STATUS   VOLUME           CAPACITY   ACCESSMODES   STORAGECLASS   AGE
task-pv-claim   Bound    task-pv-volume   10Gi       RWO           manual         30s
Create a deployment with 2 replicas for example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: task-pv-storage
          persistentVolumeClaim:
            claimName: task-pv-claim
      containers:
        - name: task-pv-container
          image: nginx
          ports:
            - containerPort: 80
              name: "http-server"
          volumeMounts:
            - mountPath: "/mnt/data"
              name: task-pv-storage
Now you can check that inside both containers the path /mnt/data has the same files.
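One way to check, looping over the pods via the app=nginx label from the manifest above:
# list the shared path in every replica; the output should match
for p in $(kubectl get pods -l app=nginx -o name); do
  kubectl exec "$p" -- ls -la /mnt/data
done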
If you have a cluster with more than 1 node, I recommend thinking about the other types of persistent volumes.
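For example, an NFS-backed PV can be reached from every node and supports ReadWriteMany. A minimal sketch, assuming you already have an NFS server (the server address and export path below are placeholders):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume-nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany   # NFS allows many pods/nodes to mount read-write
  nfs:
    server: 10.0.0.10      # placeholder: your NFS server address
    path: /exports/data    # placeholder: your NFS export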
References:
Configure persistent volumes
Persistent volumes
Volume Types

PV file not saved on host

Hi all, quick question on host paths for persistent volumes.
I created a PV and PVC here:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
and I ran a sample pod
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
I exec'd into the pod and created a file:
root@task-pv-pod:/# cd /usr/share/nginx/html
root@task-pv-pod:/usr/share/nginx/html# ls
tst.txt
However, when I go back to my host and try to ls the file, it's not appearing. Any idea why? My PV and PVC are correct, as I can see that they have been bound.
ubuntu@ip-172-31-24-21:/home$ cd /mnt/data
ubuntu@ip-172-31-24-21:/mnt/data$ ls -lrt
total 0
A persistent volume (PV) is a Kubernetes resource which has its own lifecycle, independent of the pod (see the PV documentation). The storage a PV exposes usually lives in some other system, for example Azure Files, EBS, a server with NFS, etc. My point here is that there is no reason why the PV's data should exist on the node.
If you want your data to be saved on the node, use the hostPath option for PVs. Check this link. Though this is not a good production practice.
First of all, you don't need to create a PV if you are creating a PVC. PVCs create PVs, if you have the right storageClass.
Second, hostPath is one delicate volume in the Kubernetes world. It's the only one that doesn't need a PV to be created before being mounted in a Pod. So you could have created neither the PV nor the PVC and a hostPath volume would work just fine.
To make a test, delete your PV and PVC, and create your Pod like this:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-volume
  labels:
    app: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      securityContext:
        privileged: true
      ports:
        - containerPort: 80
          name: nginx-http
      volumeMounts:
        - name: nginx
          mountPath: /root/nginx-volume # path in the pod
  volumes:
    - name: nginx
      hostPath:
        path: /var/test # path in the host machine
I know this is a confusing concept, but that's how it is.
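To convince yourself the mount works, write a file from inside the pod and look for it on the host side (on Docker Desktop or minikube, "the host machine" here means the VM, as noted elsewhere on this page):
# write through the mount from inside the pod
kubectl exec nginx-volume -- sh -c 'echo hello > /root/nginx-volume/test.txt'

# then, on the node itself
ls -la /var/test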

How do I create a persistent volume on an in-house kubernetes cluster

I have a 3-node Kubernetes cluster running on Vagrant, using the Oracle Kubernetes Vagrant boxes from http://github.com/oracle/vagrant-boxes.git.
I want to add a pod including an Oracle database and persist the data so that in case all nodes go down, I don't lose my data.
According to how I read the Kubernetes documentation, persistent volumes cannot be created on a local filesystem, only on a cloud-backed device. I want to configure the persistent volume and persistent volume claim on my Vagrant boxes as a proof of concept and training exercise for my Kubernetes learning.
Are there any examples of how I might go about creating the PV and PVC in this configuration?
As a complete Kubernetes newbie, any code samples would be greatly appreciated.
Use hostPath.
Create a PV:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Create a PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Use it in a pod:
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
documentation
This is just an example, for testing only.
For production use cases, you will need dynamic provisioning using a StorageClass for the PVC, so that the volume/data is available when the pod moves across the cluster.
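A minimal sketch of the dynamic route, assuming your cluster ships a StorageClass (the class name standard below is an assumption; substitute whatever kubectl get storageclass shows). No PV is written by hand; the provisioner creates one when the claim is used:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim-dynamic
spec:
  storageClassName: standard   # assumption: pick a class from `kubectl get storageclass`
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi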