My GKE deployment consists of N pods (possibly on different nodes) and a shared volume, which is dynamically provisioned by pd.csi.storage.gke.io and is a Persistent Disk in GCP. I need to initialize this disk with data before the pods go live.
My problem is I need to set accessModes to ReadOnlyMany and be able to mount it to all pods across different nodes in read-only mode, which I assume effectively would make it impossible to mount it in write mode to the initContainer.
Is there a solution to this issue? Answer to this question suggests a good solution for a case when each pod has their own disk mounted, but I need to have one disk shared among all pods since my data is quite large.
With GKE 1.21 and later, you can enable the managed Filestore CSI driver in your clusters. You can enable the driver for new clusters
gcloud container clusters create CLUSTER_NAME \
--addons=GcpFilestoreCsiDriver ...
or update existing clusters:
gcloud container clusters update CLUSTER_NAME \
--update-addons=GcpFilestoreCsiDriver=ENABLED
Once you've done that, create a storage class (or have or platform admin do it):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: filestore-example
provisioner: filestore.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
tier: standard
network: default
After that, you can use PVCs and dynamic provisioning:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: podpvc
spec:
accessModes:
- ReadWriteMany
storageClassName: filestore-example
resources:
requests:
storage: 1Ti
...I need to have one disk shared among all pods
You can try Filestore. First your create a FileStore instance and save your data on a FileStore volume. Then you install FileStore driver on your cluster. Finally you share the data with pods that needs to read the data using a PersistentVolume referring the FileStore instance and volume above.
Related
I have k8tes cluster in which I am facing issues while mounting the existing volume to the pod in the new deployment. I have the existing deployments where I am mounting the same existing PV and PVCs. But facing issues only new deployment.
What could be the reason? How can I mount(NFS) volume to the new deployments because both PV and PVC statuses are bound and claimed respectively?
you can not ideally If your mount mode is set to ReadWriteOnce.
If you are planning to use the NFS and want to attach multiple PODs to a single mount you have to use the ReadWriteMany.
Example :
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-data
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 2Gi
storageClassName: nfs
A PersistentVolumeClaim (PVC) is a request for storage by a user. It
is similar to a Pod. Pods consume node resources and PVCs consume PV
resources. Pods can request specific levels of resources (CPU and
Memory). Claims can request specific size and access modes (e.g., they
can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see
AccessModes).
Acces mode : https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
GKE example : https://medium.com/platformer-blog/nfs-persistent-volumes-with-kubernetes-a-case-study-ce1ed6e2c266
Hi i have this problem in kubernetes. I install a deployment with helm that consists of 2 pods i have to put them in two nodes but i want that the persistent volume used by the pods are in the same nodes as the pod are deployed. Is feasible in helm? Thanks
I think you can use Local Persistent Volume
See more: local-pv, local-pv-comparision.
Usage of Local Persistent Volumes:
The local volumes must still first be set up and mounted on the local
node by an administrator. The administrator needs to mount the local
volume into a configurable “discovery directory” that the local volume
manager recognizes. Directories on a shared file system are supported,
but they must be bind-mounted into the discovery directory.
This local volume manager monitors the discovery directory, looking
for any new mount points. The manager creates a PersistentVolume
object with the appropriate storageClassName, path, nodeAffinity, and
capacity for any new mount point that it detects. These
PersistentVolume objects can eventually be claimed by
PersistentVolumeClaims, and then mounted in Pods.
After a Pod is done using the volume and deletes the
PersistentVolumeClaim for it, the local volume manager cleans up the
local mount by deleting all files from it, then deleting the
PersistentVolume object. This triggers the discovery cycle: a new
PersistentVolume is created for the volume and can be reused by a new
PersistentVolumeClaim.
Local volume can be requested in exactly the same way as any other
PersistentVolume type: through a PersistentVolumeClaim. Just specify
the appropriate StorageClassName for local volumes in the
PersistentVolumeClaim object, and the system takes care of the rest!
In your case I will create manually storageclass and use it during chart installation.
apiVersion: v1
kind: PersistentVolume
metadata:
name: example-local-pv
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: example-local-storage
local:
path: /mnt/disks/v1
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: example-local-storage
provisioner: kubernetes.io/no-provisioner
and override default PVC storageClassName configuration during helm install like this:
$ helm install --name chart-name --set persistence.storageClass=example-local-storage
Take a look: using-local-pv, pod-local-pv, kubernetes-1.19-lv.
I have a docker image that when created should check if the volume is empty, in case it should initialize it with some data.
This saved data must remain available for other pods with the same or different image.
What do you recommend me to do?
You have 2 options:
First option is to mount the pod into the node and save the data in the node so when new pod will create in the same node it will have an access to the same volume (persistent storage location).
Potential problem: 2 pods on the same node can create deadlock for the same resource (so you have to manage the resource).
Shared storage meaning create one storage and every pod will claim storage in the same storage.
I strongly suggest that you will take the next 55 minutes and see the webinar below:
https://www.youtube.com/watch?v=n06kKYS6LZE
I assume you create your pods using Deployment object in Kubernetes. What you want to look into is a StatefulSet, which, in opposite to deployments, retains some identity aspects for recreated pods including to some extent network and storage.
It was introduced specifically as a means to run services that need to keep their state in kube cluster (ie. running databases queues etc.)
Looking at the answers, would it not be simpler to create an NFS Persistent Volume and then allow the pods to mount the PV's?
You can use the writemany which should alleviate a deadlock.
apiVersion: v1
kind: PersistentVolume
metadata:
name: shared-volume
spec:
capacity:
storage: 1Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: ""
mountOptions:
- hard
- nfsvers=4.1
nfs:
path: /tmp
server: 172.17.0.2
Persistent Volumes
I want to setup a PVC on AWS, where I need ReadWriteMany as access mode. Unfortunately, EBS only supports ReadWriteOnce.
How could I solve this?
I have seen that there is a beta provider for AWS EFS which supports ReadWriteMany, but as said, this is still beta, and its installation looks somewhat flaky.
I could use node affinity to force all pods that rely on the EBS volume to a single node, and stay with ReadWriteOnce, but this limits scalability.
Are there any other ways of how to solve this? Basically, what I need is a way to store data in a persistent way to share it across pods that are independent of each other.
Using EFS without automatic provisioning
The EFS provisioner may be beta, but EFS itself is not. Since EFS volumes can be mounted via NFS, you can simply create a PersistentVolume with a NFS volume source manually -- assuming that automatic provisioning is not a hard requirement on your side:
apiVersion: v1
kind: PersistentVolume
metadata:
name: my-efs-volume
spec:
capacity:
storage: 100Gi # Doesn't really matter, as EFS does not enforce it anyway
volumeMode: Filesystem
accessModes:
- ReadWriteMany
mountOptions:
- hard
- nfsvers=4.1
- rsize=1048576
- wsize=1048576
- timeo=600
- retrans=2
nfs:
path: /
server: fs-XXXXXXXX.efs.eu-central-1.amazonaws.com
You can then claim this volume using a PersistentVolumeClaim and use it in a Pod (or multiple Pods) as usual.
Alternative solutions
If automatic provisioning is a hard requirement for you, there are alternative solutions you might look at: There are several distributed filesystems that you can roll out on yourcluster that offer ReadWriteMany storage on top of Kubernetes and/or AWS. For example, you might take a look at Rook (which is basically a Kubernetes operator for Ceph). It's also officially still in a pre-release phase, but I've already worked with it a bit and it runs reasonably well.
There's also the GlusterFS operator, which already seems to have a few stable releases.
You can use Amazon EFS to create PersistentVolume with ReadWriteMany access mode.
Amazon EKS Announced support for the Amazon EFS CSI Driver on Sep 19 2019, which makes it simple to configure elastic file storage for both EKS and self-managed Kubernetes clusters running on AWS using standard Kubernetes interfaces.
Applications running in Kubernetes can
use EFS file systems to share data between pods in a scale-out group,
or with other applications running within or outside of Kubernetes.
EFS can also help Kubernetes applications be highly available because
all data written to EFS is written to multiple AWS Availability zones.
If a Kubernetes pod is terminated and relaunched, the CSI driver will
reconnect the EFS file system, even if the pod is relaunched in a
different AWS Availability Zone.
You can deploy the Amazon EFS CSI Driver to an Amazon EKS cluster following the EKS-EFS-CSI user guide, basically like this:
Step 1: Deploy the Amazon EFS CSI Driver
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"
Note: This command requires version 1.14 or greater of kubectl.
Step 2: Create an Amazon EFS file system for your Amazon EKS cluster
Step 2.1: Create a security group that allows inbound NFS traffic for your Amazon EFS mount points.
Step 2.2: Add a rule to your security group to allow inbound NFS traffic from your VPC CIDR range.
Step 2.3: Create the Amazon EFS file system configured with the security group you just created.
Now you are good to use EFS with ReadWriteMany access mode in your EKS Kubernetes project with the following sample manifest files:
1. efs-storage-class.yaml: Create the storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: efs-sc
provisioner: efs.csi.aws.com
kubectl apply -f efs-storage-class.yaml
2. efs-pv.yaml: Create PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
name: ftp-efs-pv
spec:
storageClassName: efs-sc
persistentVolumeReclaimPolicy: Retain
capacity:
storage: 10Gi # Doesn't really matter, as EFS does not enforce it anyway
volumeMode: Filesystem
accessModes:
- ReadWriteMany
csi:
driver: efs.csi.aws.com
volumeHandle: fs-642da695
Note: you need to replace the volumeHandle value with your Amazon EFS file system ID.
3. efs-pvc.yaml: Create PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ftp-pv-claim
labels:
app: ftp-storage-claim
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
storageClassName: efs-sc
That should be it. You need to refer to the aforementioned official user guide for detailed explanation, where your can also find an example app to verify your setup.
As you mention EBS volume with affinity & node selector will stop scalability however with EBS only ReadWriteOnce will work.
Sharing my experience, if you are doing many operations on the file system and frequently pushing & fetching files it might could be slow with EFS which can degrade application performance. operation rate on EFS is slow.
However, you can use GlusterFs in back it will be provisioning EBS volume. GlusterFS also support ReadWriteMany and it will be faster compared to EFS as it's block storage (SSD).
Use case:
I have a NFS directory available and I want to use it to persist data for multiple deployments & pods.
I have created a PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
nfs:
server: http://mynfs.com
path: /server/mount/point
I want multiple deployments to be able to use this PersistentVolume, so my understanding of what is needed is that I need to create multiple PersistentVolumeClaims which will all point at this PersistentVolume.
kind: PersistentVolumeClaim
apiVersion: v1
metaData:
name: nfs-pvc-1
namespace: default
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50Mi
I believe this to create a 50MB claim on the PersistentVolume. When I run kubectl get pvc, I see:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
nfs-pvc-1 Bound nfs-pv 10Gi RWX 35s
I don't understand why I see 10Gi capacity, not 50Mi.
When I then change the PersistentVolumeClaim deployment yaml to create a PVC named nfs-pvc-2 I get this:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
nfs-pvc-1 Bound nfs-pv 10Gi RWX 35s
nfs-pvc-2 Pending 10s
PVC2 never binds to the PV. Is this expected behaviour? Can I have multiple PVCs pointing at the same PV?
When I delete nfs-pvc-1, I see the same thing:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
nfs-pvc-2 Pending 10s
Again, is this normal?
What is the appropriate way to use/re-use a shared NFS resource between multiple deployments / pods?
Basically you can't do what you want, as the relationship PVC <--> PV is one-on-one.
If NFS is the only storage you have available and would like multiple PV/PVC on one nfs export, use Dynamic Provisioning and a default storage class.
It's not in official K8s yet, but this one is in the incubator and I've tried it and it works well: https://github.com/kubernetes-incubator/external-storage/tree/master/nfs-client
This will enormously simplify your volume provisioning as you only need to take care of the PVC, and the PV will be created as a directory on the nfs export / server that you have defined.
From: https://docs.openshift.org/latest/install_config/storage_examples/shared_storage.html
As Baroudi Safwen mentioned, you cannot bind two pvc to the same pv, but you can use the same pvc in two different pods.
volumes:
- name: nfsvol-2
persistentVolumeClaim:
claimName: nfs-pvc-1 <-- USE THIS ONE IN BOTH PODS
A persistent volume claim is exclusively bound to a persistent volume.
You cannot bind 2 pvc to the same pv. I guess you are interested in the dynamic provisioning. I faced this issue when I was deploying statefulsets, which require dynamic provisioning for pods. So you need to deploy an NFS provisioner in your cluster, the NFS provisioner(pod) will have access to the NFS folder(hostpath), and each time a pod requests a volume, the NFS provisioner will mount it in the NFS directory on behalf of the pod. Here is the github repository to deploy it:
https://github.com/kubernetes-incubator/external-storage/tree/master/nfs/deploy/kubernetes
You have to be careful though, you must ensure the nfs provisioner always runs on the same machine where you have the NFS folder by making use of the node selector since you the volume is of type hostpath.
For my future-self and everyone else looking for the official documentation:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#binding
Once bound, PersistentVolumeClaim binds are exclusive, regardless of
how they were bound. A PVC to PV binding is a one-to-one mapping,
using a ClaimRef which is a bi-directional binding between the
PersistentVolume and the PersistentVolumeClaim.
a few points on dynamic provisioning..
using dynamic provisioning of nfs prevents you for changing any of the default nfs mount options. On my platform this uses rsize/wsize of 1M. this can cause huge problems in some applications using small files or block reading. (I've just hit this issue in a big way)
dynamic is a great option if it suits your needs. I'm now stuck with creating 250 pv/pvc pairs for my application that was being handled by dynamic due to the 1-1 relationship.