I created a deployment YAML for a microservice.
I am using the hostPath volume type for the PersistentVolume, so I have to copy data to a path on the host. What I actually want is to mount a directory from the container onto the host, because the data is in the container and I need it on the host.
My deployment YAML:
#create persistent volume
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-vol
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /opt/storage/app
#create persistent volume claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
#create Deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      deploy: app
  template:
    metadata:
      labels:
        deploy: app
    spec:
      hostname: app
      hostNetwork: false
      containers:
        - name: app
          image: 192.168.10.10:2021/project/app:latest
          volumeMounts:
            - mountPath: /opt/app
              name: project-volume
      volumes:
        - name: project-volume
          persistentVolumeClaim:
            claimName: app-pv-claim
Due to information gaps, I am writing a general answer.
First of all you should know:
HostPath volumes present many security risks, and it is a best practice to avoid the use of HostPaths when possible. When a HostPath volume must be used, it should be scoped to only the required file or directory, and mounted as ReadOnly.
But the use of hostPath also offers a powerful escape hatch for some applications.
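The advice about scoping a hostPath to a single directory and mounting it read-only could look roughly like the sketch below. The pod name and image are hypothetical; the host path is taken from the question. This is only an illustration of the two settings, not a drop-in manifest.

```yaml
# Sketch only: pod name and image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-readonly-example
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: app-data
          mountPath: /opt/app
          readOnly: true          # the container cannot write to the host directory
  volumes:
    - name: app-data
      hostPath:
        path: /opt/storage/app    # only the one directory that is needed
        type: Directory           # fail if the directory does not already exist
```

A read-only mount would not fit the case in the question, where the container has to write data to the host; there the mount has to stay writable and only the path can be narrowed.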
If you still want to use it, first check whether both pods (the one that creates the data and the one that wants to access it) are on the same node. The following command shows that:
kubectl get pods -o wide
All data created by any of the pods stays in the hostPath directory and is available to every pod, as long as they are running on the same node.
See also this documentation about hostPath.
I have 2 pods, one that is writing files to a persistent volume and the other one supposedly reads those files to make some calculations.
The first pod writes the files successfully and when I display the content of the persistent volume using print(os.listdir(persistent_volume_path)) I get all the expected files. However, the same command on the second pod shows an empty directory. (The mountPath directory /data is created but empty.)
This is the TFJob yaml file:
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: pod1
  namespace: my-namespace
spec:
  cleanPodPolicy: None
  tfReplicaSpecs:
    Worker:
      replicas: 1
      restartPolicy: Never
      template:
        spec:
          containers:
            - name: tensorflow
              image: my-image:latest
              imagePullPolicy: Always
              command:
                - "python"
                - "./program1.py"
                - "--data_path=./dataset.csv"
                - "--persistent_volume_path=/data"
              volumeMounts:
                - mountPath: "/data"
                  name: my-pv
          volumes:
            - name: my-pv
              persistentVolumeClaim:
                claimName: my-pvc
(respectively pod2 and program2.py for the second pod)
And this is the volume configuration:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: my-namespace
  labels:
    type: local
    app: tfjob
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  namespace: my-namespace
  labels:
    type: local
    app: tfjob
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data"
Does anyone have any idea where exactly the problem is and how to fix it?
When two pods need to access a shared PersistentVolume with access mode ReadWriteOnce concurrently, they must run on the same node, because with this access mode the volume can only be mounted on a single node at a time.
To achieve this, some form of pod affinity must be applied so that they are scheduled to the same node.
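One way to express "schedule the second pod on the node where the first one runs" is a required podAffinity rule with the hostname as topology key. The fragments below are a minimal sketch; the label `role: writer` is a hypothetical name introduced for illustration, and each fragment would go into the pod template of the corresponding job.

```yaml
# Pod template of the first (writer) pod: add a label to match on.
# The label "role: writer" is hypothetical.
metadata:
  labels:
    role: writer
---
# Pod template spec of the second (reader) pod.
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              role: writer
          topologyKey: kubernetes.io/hostname   # "same node as the matched pod"
```

With a required rule, the second pod stays Pending until a pod carrying the label is running, so the writer has to be started first.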
I am using kubeadm locally on two physical machines. I don't have any cloud resources, and I want to build MongoDB auto-scaling (locally to start, maybe later in the cloud), so I have to use the local storage of my two physical machines. I suppose I have to create a local storage class and volumes. I am very new to Kubernetes, so please don't judge me too hard. As I read here https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/ local persistent volumes are only for one node? Is there any way to take advantage of the storage on both physical machines and build simple MongoDB scaling using the Kubernetes MongoDB operator and Ops Manager? I made a few tests here, but I couldn't achieve my goal: pod has unbound immediate PersistentVolumeClaims ops manager
What I was thinking in the first place was to "break" my two hard drives into many pieces and use sharding for MongoDB scaling.
Thanks in advance.
Well, you can use an NFS server with the same volume mounted on both nodes at the same mount point.
Please be aware that this approach is not recommended for production.
There are plenty of how-tos on configuring an NFS server, for example:
https://www.tecmint.com/install-nfs-server-on-ubuntu/
https://www.tecmint.com/how-to-setup-nfs-server-in-linux/
With NFS working, you can use a hostPath volume to expose the NFS mount to your pods.
Create the PV and the PVC:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/nfs/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
And use the volume in your deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-pv
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-pv
  template:
    metadata:
      labels:
        app: test-pv
    spec:
      containers:
        - image: nginx
          name: nginx
          volumeMounts:
            - mountPath: /data
              name: pv-storage
      volumes:
        - name: pv-storage
          persistentVolumeClaim:
            claimName: pv-claim
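As a different technique from the hostPath approach in this answer, Kubernetes also has a built-in `nfs` volume type, which lets the PersistentVolume point at the NFS export directly instead of at a host mount point. A minimal sketch, assuming the export is /nfs/data; the server address and the PV name are hypothetical:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume-nfs          # hypothetical name
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany            # NFS can be mounted from several nodes at once
  nfs:
    server: 192.168.1.100      # hypothetical NFS server address
    path: /nfs/data
```

A claim meant to bind to this volume would request ReadWriteMany as well, and the nodes need an NFS client installed for the mount to succeed.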
I have a Java API that exports data to an Excel file, generating the file on the pod where the request is served.
The next request (to download the file) might go to a different pod, and then the download fails.
How do I get around this?
How do I generate the file on all the pods? Or how do I make sure the subsequent request goes to the same pod where the file was generated?
I can't give out the direct pod URL, as it will not be accessible to clients.
Thanks.
You need to use persistent volumes to share the same files between your containers. You could use the node's storage mounted into the containers (the easiest way) or a distributed file system such as NFS, EFS (AWS), GlusterFS, etc.
If you need the simplest way to share the file and your pods are on the same node, you could use hostPath to store the file and share the volume with the other containers.
Assuming you have a Kubernetes cluster with only one node, and you want to share the path /mnt/data of your node with your pods:
Create a PersistentVolume:
A hostPath PersistentVolume uses a file or directory on the node to emulate network-attached storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Create a PersistentVolumeClaim:
Pods use PersistentVolumeClaims to request physical storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Look at the PersistentVolumeClaim:
kubectl get pvc task-pv-claim
The output shows that the PersistentVolumeClaim is bound to your PersistentVolume, task-pv-volume.
NAME            STATUS    VOLUME           CAPACITY   ACCESSMODES   STORAGECLASS   AGE
task-pv-claim   Bound     task-pv-volume   10Gi       RWO           manual         30s
Create a deployment with 2 replicas, for example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: task-pv-storage
          persistentVolumeClaim:
            claimName: task-pv-claim
      containers:
        - name: task-pv-container
          image: nginx
          ports:
            - containerPort: 80
              name: "http-server"
          volumeMounts:
            - mountPath: "/mnt/data"
              name: task-pv-storage
Now you can check inside both containers that the path /mnt/data has the same files.
If you have a cluster with more than one node, I recommend considering the other types of persistent volumes.
I have configured a Postgres pod with static provisioning of a persistent volume in my local environment. It works fine the first time, but when I delete the namespace and rerun the pod, its status is Pending and I get the error:
pod has unbound immediate persistentvolumeclaims
I tried removing the storageClassName from the PersistentVolumeClaim, but that did not work.
I also tried changing the storage class from manual to block storage, but I got the same problem.
My YAML file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  namespace: manhattan
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/manhattan/current/pgdata"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
  namespace: manhattan
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: postgres
  namespace: manhattan
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: dbr-postgres
      image: postgres-custome
      tty: true
      volumeMounts:
        - mountPath: "/var/lib/pgsql/9.3/data"
          name: task-pv-storage
  nodeSelector:
    kubernetes.io/hostname: k8s-master
I want my pod to run even after I delete the namespace and rerun the pod.yaml file.
Data will be kept on the Kubernetes node, because hostPath uses the node's filesystem to store the data. The problem is that if you have multiple nodes, your pod can start on any other node. To solve this, you can either specify the node where you want your pod to start or set up NFS or GlusterFS on your Kubernetes nodes. This might be the cause of your problem.
There is one more thing I can think of that might be your issue. When you remove a namespace, all the Kubernetes resources inside it are removed as well, and there is no easy way to recover them. This means you have to recreate the PVC and the pod in the new namespace (the PV itself is cluster-scoped, so it is not deleted together with the namespace).
I solved this issue by setting persistentVolumeReclaimPolicy to Recycle. Now I can rebind the persistent volume even after deleting and recreating the namespace:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: "/opt/manhattan/current/pgdata"
I'm starting with GKE (and kubernetes in general) and I want to mount a persistent volume on a pod using a gcePersistentDisk.
I first created a Persistent Disk (project-data) in Compute Engine, then created a PersistentVolume and a PersistentVolumeClaim like so:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: project-data
spec:
  storageClassName: standard
  capacity:
    storage: 20G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: project-data
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: project-data-claim
spec:
  storageClassName: standard
  volumeName: project-data
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20G
  selector:
    matchLabels:
      app: myapp
After applying this config, I see in GKE/Storage that my PVC is "Bound", but I can't find a way to access my volume in myapp.
I tried to edit the deployment yaml in the console by adding:
volumeMounts:
  - mountPath: /data
    name: project-data
...but this modification is refused by the console (it seems that this kind of edit is forbidden).
How can I finally see my PersistentVolume as a filesystem in my app?
First of all, the PVC should be referenced in the volumes section of the pod spec:
volumes:
  - name: project-data
    persistentVolumeClaim:
      claimName: project-data-claim
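Put together with the volumeMounts entry from the question, the pod template of the deployment might look roughly like this. It is a sketch only; the container name and image are hypothetical, while the volume and claim names come from the question.

```yaml
# Pod template fragment (spec.template.spec of the Deployment).
# Container name and image are hypothetical.
spec:
  containers:
    - name: myapp
      image: myapp:latest
      volumeMounts:
        - mountPath: /data
          name: project-data          # must match the volume name below
  volumes:
    - name: project-data
      persistentVolumeClaim:
        claimName: project-data-claim
```

The `name` under volumeMounts refers to the entry in `volumes`, not to the PersistentVolume or the claim; those happen to share the name project-data here.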
And if editing the pod directly is refused, you can edit the YAML file and then apply it:
$ kubectl apply -f your.yaml
Also, since you have a selector defined in your PVC configuration, I think you should have the matching label defined in your PV configuration.
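Following that suggestion, the PV from the question would carry the label that the claim's selector asks for. A minimal sketch, changing only the metadata:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: project-data
  labels:
    app: myapp              # matches selector.matchLabels in the PVC
spec:
  storageClassName: standard
  capacity:
    storage: 20G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: project-data
    fsType: ext4
```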