Multiple Persistent Volumes with the same mount path in Kubernetes

I have created 3 CronJobs in Kubernetes. They are identical except for their names. This is the spec:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: test-job-1 # for others it's test-job-2 and test-job-3
  namespace: cron-test
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: test-job-1 # for others it's test-job-2 and test-job-3
              image: busybox
              imagePullPolicy: IfNotPresent
              command:
                - "/bin/sh"
                - "-c"
              args:
                - cd database-backup && touch $(date +%Y-%m-%d:%H:%M).test-job-1 && ls -la # for others the filename includes test-job-2 and test-job-3 respectively
              volumeMounts:
                - mountPath: "/database-backup"
                  name: test-job-1-pv # for others it's test-job-2-pv and test-job-3-pv
          volumes:
            - name: test-job-1-pv # for others it's test-job-2-pv and test-job-3-pv
              persistentVolumeClaim:
                claimName: test-job-1-pvc # for others it's test-job-2-pvc and test-job-3-pvc
And also the following Persistent Volume Claims and Persistent Volume:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-job-1-pvc # for others it's test-job-2-pvc or test-job-3-pvc
  namespace: cron-test
spec:
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  resources:
    requests:
      storage: 1Gi
  volumeName: test-job-1-pv # depending on the name it's test-job-2-pv or test-job-3-pv
  storageClassName: manual
  volumeMode: Filesystem
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-job-1-pv # for others it's test-job-2-pv and test-job-3-pv
  namespace: cron-test
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/database-backup"
So all in all there are 3 CronJobs, 3 PersistentVolumes and 3 PersistentVolumeClaims. I can see that the PersistentVolumeClaims and PersistentVolumes are bound correctly to each other: test-job-1-pvc <--> test-job-1-pv, test-job-2-pvc <--> test-job-2-pv and so on. The pods associated with each PVC are also the corresponding pods created by each CronJob, for example test-job-1-1609066800-95d4m <--> test-job-1-pvc. After letting the CronJobs run for a bit, I created another pod with the following spec to inspect test-job-1-pvc:
apiVersion: v1
kind: Pod
metadata:
  name: data-access
  namespace: cron-test
spec:
  containers:
    - name: data-access
      image: busybox
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data-access-volume
          mountPath: /database-backup
  volumes:
    - name: data-access-volume
      persistentVolumeClaim:
        claimName: test-job-1-pvc
Just a simple pod that keeps running all the time. When I exec into that pod and look inside the /database-backup directory, I see all the files created by all the pods of the 3 CronJobs.
What did I expect to see?
I expected to see only the files created by test-job-1.
Is this expected behavior? And if so, how can you separate the PersistentVolumes to avoid it?

I suspect this is caused by the PersistentVolume definition: if you really only changed the name, all volumes are mapped to the same folder on the host.
hostPath:
  path: "/database-backup"
Try giving each volume a unique folder, e.g.
hostPath:
  path: "/database-backup/volume1"
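As a rough sketch of that suggestion, the PersistentVolume for the second job could point at its own subdirectory. The object names come from the question; the subdirectory path /database-backup/test-job-2 and the type: DirectoryOrCreate setting are assumptions added for illustration and are not part of the original manifests. The namespace field is left out because PersistentVolumes are cluster-scoped.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-job-2-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    # Assumed path: one subdirectory per volume, so the jobs no longer share files.
    path: "/database-backup/test-job-2"
    # Assumption: create the directory on the node if it does not exist yet.
    type: DirectoryOrCreate
```

The mountPath inside the containers can stay /database-backup for all three jobs; only the host-side path needs to differ per volume.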

Related

Why are local persistent volumes not visible in EKS?

To test whether I can deploy self-written software on Amazon using Docker images,
I have a test EKS cluster.
I have written a small test script that reads and writes a file to see if I understand how to deploy. I have successfully deployed it in minikube, using three replicas. The replicas all use a shared directory on my local file system, and in minikube that is mounted into the pods with a volume.
The next step was to deploy that in the EKS cluster. However, I cannot get it working in EKS. The problem is that the pods don't see the contents of the mounted directory.
This does not completely surprise me, since in minikube I first had to create a mount to a local directory on the server. I have not done anything similar on the EKS server.
My question is what I should do to make this work (if it is possible at all).
I use this YAML file to create a pod in EKS:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "pv-volume"
spec:
  storageClassName: local-storage
  capacity:
    storage: "1Gi"
  accessModes:
    - "ReadWriteOnce"
  hostPath:
    path: /data/k8s
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pv-claim"
spec:
  storageClassName: local-storage
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      persistentVolumeClaim:
        claimName: pv-claim
So what I expect is that I have a local directory, /data/k8s, that is visible in the pods as path /config.
When I apply this YAML, I get a pod whose error message makes clear that the data in the /data/k8s directory is not visible to the pod.
kubectl gives me this info after creation of the volume and claim:
[rdgon@NL013-PPDAPP015 probeer]$ kubectl get pv,pvc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM              STORAGECLASS   REASON   AGE
persistentvolume/pv-volume                                  1Gi        RWO            Retain           Available                                              15s
persistentvolume/pvc-156edfef-d272-4df6-ae16-09b12e1c2f03   1Gi        RWO            Delete           Bound       default/pv-claim   gp2                     9s

NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/pv-claim   Bound    pvc-156edfef-d272-4df6-ae16-09b12e1c2f03   1Gi        RWO            gp2            15s
Which seems to indicate everything is OK. But it seems that the filesystem of the master node, on which I run the yaml file to create the volume, is not the location where the pods look when they access the /config dir.
On EKS, there's no storage class named 'local-storage' by default.
There is only a 'gp2' storage class, which is also used when you don't specify a storageClassName.
The 'gp2' storage class creates a dedicated EBS volume and attaches it to your Kubernetes Node when required, so it doesn't use a local folder. You also don't need to create the PV manually, just the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pv-claim"
spec:
  storageClassName: gp2
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      persistentVolumeClaim:
        claimName: pv-claim
If you want a folder on the Node itself, you can use a 'hostPath' volume, and you don't need a pv or pvc for that:
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      hostPath:
        path: /data/k8s
This is a bad idea, since the data stays on the node where it was written: if another node starts up and your pod is moved to it, the pod no longer sees that data.
If it's for configuration only, you can also use a ConfigMap and put the files directly in your Kubernetes manifest files.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ruud-config
data:
  ruud.properties: |
    my ruud.properties file content...
---
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: "/config"
  volumes:
    - name: cmount
      configMap:
        name: ruud-config
Please check whether the PV was created and is bound to the PVC by running the commands below:
kubectl get pv
kubectl get pvc
These show whether the objects were created properly.
The local path you refer to is not valid. Try:
apiVersion: v1
kind: Pod
metadata:
  name: ruudtest
spec:
  containers:
    - name: ruud
      image: MYIMAGE
      volumeMounts:
        - name: cmount
          mountPath: /config
  volumes:
    - name: cmount
      hostPath:
        path: /data/k8s
        type: DirectoryOrCreate # <-- You need this since the directory may not exist on the node.

Kubernetes Persistent Volume: MountPath directory created but empty

I have 2 pods, one that is writing files to a persistent volume and the other one supposedly reads those files to make some calculations.
The first pod writes the files successfully and when I display the content of the persistent volume using print(os.listdir(persistent_volume_path)) I get all the expected files. However, the same command on the second pod shows an empty directory. (The mountPath directory /data is created but empty.)
This is the TFJob yaml file:
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: pod1
  namespace: my-namespace
spec:
  cleanPodPolicy: None
  tfReplicaSpecs:
    Worker:
      replicas: 1
      restartPolicy: Never
      template:
        spec:
          containers:
            - name: tensorflow
              image: my-image:latest
              imagePullPolicy: Always
              command:
                - "python"
                - "./program1.py"
                - "--data_path=./dataset.csv"
                - "--persistent_volume_path=/data"
              volumeMounts:
                - mountPath: "/data"
                  name: my-pv
          volumes:
            - name: my-pv
              persistentVolumeClaim:
                claimName: my-pvc
(respectively pod2 and program2.py for the second pod)
And this is the volume configuration:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: my-namespace
  labels:
    type: local
    app: tfjob
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  namespace: my-namespace
  labels:
    type: local
    app: tfjob
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data"
Does anyone have any idea where's the problem exactly and how to fix it?
When two pods access a shared PersistentVolume with access mode ReadWriteOnce concurrently, they must run on the same node, since with this access mode the volume can only be mounted on a single node at a time.
To achieve this, some form of Pod Affinity must be applied so that they are scheduled to the same node.
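One possible shape for that, sketched under assumptions: the writer pod's template carries a label, and the reader pod's template requires co-location with any pod that has it. The label app: tfjob-writer is hypothetical and would have to be added to the first TFJob's template metadata; the fragment below would sit under the second TFJob's Worker entry and is not a complete manifest.

```yaml
# Fragment for the reader pod's template (pod2); not a complete TFJob.
template:
  spec:
    affinity:
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: tfjob-writer # hypothetical label set on pod1's template
            # Schedule onto the same node as the matching pod.
            topologyKey: kubernetes.io/hostname
    containers:
      - name: tensorflow
        image: my-image:latest
        volumeMounts:
          - mountPath: "/data"
            name: my-pv
    volumes:
      - name: my-pv
        persistentVolumeClaim:
          claimName: my-pvc
```

Because this is a required rule, the reader pod would most likely stay Pending if the scheduler finds no pod with that label to match against.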

PV file not saved on host

Hi all, quick question on host paths for persistent volumes.
I created a PV and PVC here:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
and I ran a sample pod
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
I exec'd into the pod and created a file:
root@task-pv-pod:/# cd /usr/share/nginx/html
root@task-pv-pod:/usr/share/nginx/html# ls
tst.txt
However, when I go back to my host and try to ls the file, it's not appearing. Any idea why? My PV and PVC are correct, as I can see that they have been bound.
ubuntu@ip-172-31-24-21:/home$ cd /mnt/data
ubuntu@ip-172-31-24-21:/mnt/data$ ls -lrt
total 0
A persistent volume (PV) is a Kubernetes resource with its own lifecycle, independent of the pod (see the PV documentation). Consuming a PV through a PVC means the data lives in whatever backs that PV, for example Azure Files, EBS, or an NFS server. My point here is that there is no reason why the PV's data should exist on the node.
If you want the data to be saved on the node, use the hostPath option for PVs, though this is not a good production practice.
First of all, you don't need to create a PV if you are creating a PVC. With dynamic provisioning, a PVC gets a PV created for it, provided you have the right storageClass.
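A minimal sketch of what that looks like, assuming the cluster has a StorageClass backed by a dynamic provisioner. The class name standard is a placeholder rather than something taken from the question; kubectl get storageclass shows what a given cluster actually offers.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  # Placeholder class name; it must match a StorageClass that exists in the cluster.
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
```

With such a claim no PersistentVolume manifest is written by hand; the provisioner behind the class is expected to create one and bind it to the claim.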
Second, hostPath is a special case in the Kubernetes world. It can be mounted in a Pod directly, without a PV being created for it. So you could have created neither a PV nor a PVC, and a hostPath volume would work just fine.
To make a test, delete your PV and PVC, and create your Pod like this:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-volume
  labels:
    app: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      securityContext:
        privileged: true
      ports:
        - containerPort: 80
          name: nginx-http
      volumeMounts:
        - name: nginx
          mountPath: /root/nginx-volume # path in the pod
  volumes:
    - name: nginx
      hostPath:
        path: /var/test # path in the host machine
I know this is a confusing concept, but that's how it is.

sharing data between a cronjob and pod

Right now I have a CronJob that downloads data, and I want to share it with another container that processes the data as new files are uploaded. Is there a way, without any external services, to share this data between the CronJob pod and my main pod?
I've tried creating a persistent volume and persistent volume claim to share the data, but when the CronJob downloads the data it doesn't appear in the other pod even though the volume is mounted.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: download
spec:
  concurrencyPolicy: Forbid
  suspend: false
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
            - name: downloaded-data-claim
              persistentVolumeClaim:
                claimName: downloaded-data-claim
          #container and image is here where it downloads
---
kind: PersistentVolume
metadata:
  name: downloaded-data
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    name: downloaded-data-claim
    namespace: default
  hostPath:
    path: "/tmp/"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: downloaded-data-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  volumeName: downloaded-data
and then the pod mounts the volume
volumes:
  - name: downloaded-data-claim
    presistentVolumeClaim:
      claimName: downloaded-data-claim
  - name: output
    emptyDir: {}
containers:
  - name: "rand"
    image: <filler>
    imagePullPolicy: <filler>
    volumeMounts:
      - name: downloaded-data-claim
        mountPath: /input
      - name: output
        mountPath: /output
    resources:
Make sure you have created the CronJob in the right namespace, the one where your pod and PVC are.
Check that you have access to the directory where you want your data to be stored.
Actually, I don't think there is any possibility other than using external services.
The most useful are NFS volumes, but they rely on external NFS servers.
NFS stands for Network File System: it's a shared filesystem that can be accessed over the network.
The NFS server must already exist. Kubernetes doesn't run the NFS server; pods just access it.
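As an illustration of the NFS route, the PersistentVolume from the question could be backed by an NFS export instead of hostPath. This is only a sketch: the server address nfs.example.internal and the export path /exports/downloaded-data are hypothetical and stand in for an NFS server that already exists outside the cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: downloaded-data
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal # hypothetical NFS server address
    path: /exports/downloaded-data # hypothetical exported directory
```

The PersistentVolumeClaim from the question would still point at it through volumeName: downloaded-data, and both the CronJob pod and the main pod would mount that claim.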

Multiple Volume mounts with Kubernetes: one works, one doesn't

I am trying to create a Kubernetes pod with a single container which has two external volumes mounted on it. My .yml pod file is:
apiVersion: v1
kind: Pod
metadata:
  name: my-project
  labels:
    name: my-project
spec:
  containers:
    - image: my-username/my-project
      name: my-project
      ports:
        - containerPort: 80
          name: nginx-http
        - containerPort: 443
          name: nginx-ssl-https
      imagePullPolicy: Always
      volumeMounts:
        - mountPath: /home/projects/my-project/media/upload
          name: pd-data
        - mountPath: /home/projects/my-project/backups
          name: pd2-data
  imagePullSecrets:
    - name: vpregistrykey
  volumes:
    - name: pd-data
      persistentVolumeClaim:
        claimName: pd-claim
    - name: pd2-data
      persistentVolumeClaim:
        claimName: pd2-claim
I am using Persistent Volumes and Persistent Volume Claims, as such:
PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pd-disk
  labels:
    name: pd-disk
spec:
  capacity:
    storage: 250Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: "pd-disk"
    fsType: "ext4"
PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pd-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 250Gi
I have initially created my disks using the command:
$ gcloud compute disks create --size 250GB pd-disk
The same goes for the second disk and the second PV and PVC. Everything seems to work OK when I create the pod; no errors are thrown. Now comes the weird part: one of the paths is being mounted correctly (and is therefore persistent) and the other one is being erased every time I restart the pod...
I have tried re-creating everything from scratch, but nothing changes. Also, from the pod description, both volumes seem to be correctly mounted:
$ kubectl describe pod my-project
Name:        my-project
...
Volumes:
  pd-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pd-claim
    ReadOnly:   false
  pd2-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pd2-claim
    ReadOnly:   false
Any help is appreciated. Thanks.
The Kubernetes documentation states:
Volumes can not mount onto other volumes or have hard links to other volumes
I had the same issue and in my case the problem was that both volume mounts had overlapping mountPaths, i.e. both started with /var/.
They mounted without issues after fixing that.
I do not see any direct cause for the behavior described above. What I would suggest trying is a Deployment instead of a bare Pod, as many here recommend, especially when using PVs and PVCs. A Deployment takes care of many things to maintain the desired state. I have attached my configuration below for reference; it works, and both volumes stay persistent even after deleting, terminating, or restarting pods, as this is managed by the Deployment's desired state.
The two differences you will find between my configuration and yours are:
I have a Deployment object instead of a Pod.
I am using GlusterFS for my volumes.
Deployment yml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
  namespace: platform
  labels:
    component: nginx
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        component: nginx
    spec:
      nodeSelector:
        role: app-1
      containers:
        - name: nginx
          image: vip-intOAM:5001/nginx:1.15.3
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: "/etc/nginx/conf.d/"
              name: nginx-confd
            - mountPath: "/var/www/"
              name: nginx-web-content
      volumes:
        - name: nginx-confd
          persistentVolumeClaim:
            claimName: glusterfsvol-nginx-confd-pvc
        - name: nginx-web-content
          persistentVolumeClaim:
            claimName: glusterfsvol-nginx-web-content-pvc
One of my PVs:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfsvol-nginx-confd-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  glusterfs:
    endpoints: gluster-cluster
    path: nginx-confd
    readOnly: false
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    name: glusterfsvol-nginx-confd-pvc
    namespace: platform
PVC for the above
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: glusterfsvol-nginx-confd-pvc
  namespace: platform
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi