I am using Azure Kubernetes and I have created Persistent Volume, Claims and Storage Class.
I want to deploy pods on the Persistent Volume so we can increase the Volume anytime as per the requirement. Right now our Pods are deployed in the Virtual Machines OS Disk. Since we are using the default Pods deployment on the VM Disk when we run out of the disk space the whole cluster will be destroyed and created again.
Please let me know how I can configure Pods to deploy on an Azure (Managed) Disk.
Thanks,
Mrugesh
You don't have to create a Persistent Volume manually. If you want an Azure Disk, it can be provisioned dynamically for you.
From Azure Built-in storage classes:
The default storage class provisions a standard SSD Azure disk.
Standard storage is backed by Standard SSDs and delivers cost-effective storage while still delivering reliable performance.
You only have to create the PersistentVolumeClaim with the storage class you want to use, e.g.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 5Gi
and then refer to that PVC in your Deployment or Pods.
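As a minimal sketch of what "refer to that PVC" can look like, the Pod below mounts the claim at a path inside the container. The Pod name, image, and mount path are placeholders chosen for illustration, not values from the question.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-disk            # hypothetical Pod name
spec:
  containers:
    - name: app
      image: nginx:1.25          # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data       # assumed path where the app stores its files
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: azure-managed-disk   # name of the PVC shown above
```

Anything the container writes under /data then lands on the dynamically provisioned Azure disk rather than on the node's OS disk.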
My GKE deployment consists of N pods (possibly on different nodes) and a shared volume, which is dynamically provisioned by pd.csi.storage.gke.io and is a Persistent Disk in GCP. I need to initialize this disk with data before the pods go live.
My problem is I need to set accessModes to ReadOnlyMany and be able to mount it to all pods across different nodes in read-only mode, which I assume effectively would make it impossible to mount it in write mode to the initContainer.
Is there a solution to this issue? Answer to this question suggests a good solution for a case when each pod has their own disk mounted, but I need to have one disk shared among all pods since my data is quite large.
With GKE 1.21 and later, you can enable the managed Filestore CSI driver in your clusters. You can enable the driver for new clusters
gcloud container clusters create CLUSTER_NAME \
--addons=GcpFilestoreCsiDriver ...
or update existing clusters:
gcloud container clusters update CLUSTER_NAME \
--update-addons=GcpFilestoreCsiDriver=ENABLED
Once you've done that, create a storage class (or have your platform admin do it):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-example
provisioner: filestore.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  tier: standard
  network: default
After that, you can use PVCs and dynamic provisioning:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: podpvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: filestore-example
  resources:
    requests:
      storage: 1Ti
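One way to get the "initialize once, then read everywhere" behaviour with this claim is sketched below: a one-off Job writes the data, and the application Pods mount the same claim read-only. The Job, Deployment, label, image, and path names are all hypothetical, and the seed command is only a stand-in for a real data load.

```yaml
# One-off Job that writes the initial data to the shared volume.
apiVersion: batch/v1
kind: Job
metadata:
  name: seed-shared-data         # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: seed
          image: busybox:1.36    # placeholder image
          command: ["sh", "-c", "echo 'replace with real data load' > /data/README"]
          volumeMounts:
            - name: shared
              mountPath: /data
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: podpvc
---
# Application Pods mount the same claim, but read-only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reader                   # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: reader
  template:
    metadata:
      labels:
        app: reader
    spec:
      containers:
        - name: app
          image: busybox:1.36    # placeholder image
          command: ["sh", "-c", "ls /data && sleep 3600"]
          volumeMounts:
            - name: shared
              mountPath: /data
              readOnly: true
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: podpvc
            readOnly: true
```

The Deployment would normally be applied only after the Job has completed, so the Pods never see a half-written volume.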
...I need to have one disk shared among all pods
You can try Filestore. First, create a Filestore instance and save your data on a Filestore volume. Then install the Filestore driver on your cluster. Finally, share the data with the pods that need to read it by using a PersistentVolume that refers to the Filestore instance and volume above.
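A rough sketch of the last step follows. Because a Filestore share is exported over NFS, this example uses a plain `nfs` volume source rather than the Filestore CSI driver's own volume format, which is a substitution on my part. The IP address, share name, and object names are placeholders you would replace with the values of your instance.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: filestore-shared-pv      # hypothetical name
spec:
  capacity:
    storage: 1Ti
  accessModes:
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # static binding, no dynamic provisioning
  nfs:
    server: 10.0.0.2             # placeholder: IP of the Filestore instance
    path: /share1                # placeholder: name of the file share
    readOnly: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: filestore-shared-pvc     # hypothetical name
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""
  volumeName: filestore-shared-pv   # bind to the PV above
  resources:
    requests:
      storage: 1Ti
```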
I have got a deployment.yaml and it uses a persistentvolumeclaim like so
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mautic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard
I am trying to scale my deployment horizontally using the Horizontal Pod Autoscaler, but when I scale it, the remaining pods are stuck in ContainerCreating, and this is the error I get when I describe the pod:
Unable to attach or mount volumes: unmounted volume
What am I doing wrong here?
Using Deployment is great if your app can scale horizontally. However, using a Persistent Volume with a PersistentVolumeClaim can be challenging when scaling horizontally.
Persistent Volume Claim - Access Modes
A PersistentVolumeClaim can be requested for a few different Access Modes:
ReadWriteOnce (most common)
ReadOnlyMany
ReadWriteMany
ReadWriteOnce is the most commonly available mode and is the typical behavior of a local disk. But to scale your app horizontally, you need a volume that is available from multiple nodes at the same time, so only ReadOnlyMany and ReadWriteMany are viable options. You need to check which access modes are available for your storage system.
In addition, if you use a regional cluster from a cloud provider, it spans three Availability Zones, and a volume typically lives in only one Availability Zone. So even if you use the ReadOnlyMany or ReadWriteMany access mode, your volume is available on multiple nodes in the same AZ, but not in all three AZs of your cluster. You might consider using a storage class from your cloud provider that is replicated to multiple Availability Zones, but it typically costs more and is slower.
Alternatives
Since only ReadWriteOnce is commonly available, you might look for better alternatives for your app.
Object Storage
Object Storage, or Buckets, is a common way to handle file storage in the cloud instead of using filesystem volumes. With Object Storage you access your files via an API over HTTP. See e.g. AWS S3 or Google Cloud Storage.
StatefulSet
You could also consider a StatefulSet, where each instance of your app gets its own volume. This makes your app distributed but typically not horizontally scalable. Here, your app typically needs to implement replication of data itself, often using Raft, which makes this a more advanced alternative.
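To illustrate the per-instance volume idea, here is a minimal StatefulSet sketch that uses `volumeClaimTemplates`, so each replica gets its own ReadWriteOnce claim. The names, image, mount path, and the headless Service referenced by `serviceName` are assumptions for the example; only the storage size and class are taken from the question.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mautic                     # hypothetical name
spec:
  serviceName: mautic              # assumes a headless Service with this name exists
  replicas: 3
  selector:
    matchLabels:
      app: mautic
  template:
    metadata:
      labels:
        app: mautic
    spec:
      containers:
        - name: mautic
          image: mautic/mautic     # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/www/html   # assumed data path
  volumeClaimTemplates:            # one PVC is created per replica
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: standard
        resources:
          requests:
            storage: 5Gi
```

Each replica (mautic-0, mautic-1, ...) then gets its own claim (data-mautic-0, data-mautic-1, ...), so the volumes are not shared between Pods.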
I am trying to mount a GCE persistent disk that was created by a Kubernetes PersistentVolumeClaim resource (on GKE) to my local machine.
I created a PersistentVolumeClaim (that creates a persistent volume in GCE):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: profiler-disk
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
I tried to use gcsfuse to mount the disk as is written in the documentation:
You can use the Google Cloud Storage FUSE tool to mount a Cloud Storage bucket to your Compute Engine instance. The mounted bucket behaves similarly to a persistent disk even though Cloud Storage buckets are object storage.
with the command:
gcsfuse profiler-disk hello
but I am getting:
daemonize.Run: readFromProcess: sub-process: mountWithArgs: mountWithConn: setUpBucket: OpenBucket: Unknown bucket "profiler-disk"
I was able to load an actual bucket, so this is not an authorization/authentication issue.
Does anyone know how to achieve this?
I was able to copy the data using kubectl cp:
kubectl cp <pod-name>:/path <local-path> -c <container-name>
I have a Docker image that, when started, should check whether the volume is empty and, if so, initialize it with some data.
This saved data must remain available to other pods with the same or a different image.
What do you recommend?
You have 2 options:
The first option is to mount a directory of the node into the pod and save the data on the node, so that when a new pod is created on the same node it has access to the same volume (persistent storage location).
Potential problem: 2 pods on the same node can deadlock on the same resource (so you have to manage access to it).
The second option is shared storage: create one storage backend and have every pod claim storage from it.
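The first option above could look roughly like the following sketch, which uses a `hostPath` volume. The Pod name, image, and both paths are made up for illustration. Data stored this way stays on that one node, so it only helps Pods scheduled onto the same node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-local-data            # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36          # placeholder image
      command: ["sh", "-c", "ls /data && sleep 3600"]
      volumeMounts:
        - name: node-data
          mountPath: /data         # assumed path inside the container
  volumes:
    - name: node-data
      hostPath:
        path: /var/lib/app-data    # assumed directory on the node
        type: DirectoryOrCreate    # create the directory if it does not exist
```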
I strongly suggest that you take the next 55 minutes to watch the webinar below:
https://www.youtube.com/watch?v=n06kKYS6LZE
I assume you create your pods using a Deployment object in Kubernetes. What you want to look into is a StatefulSet, which, in contrast to Deployments, retains some identity aspects for recreated pods, including, to some extent, network and storage.
It was introduced specifically as a means to run services that need to keep their state in a Kubernetes cluster (e.g. databases, queues, etc.).
Looking at the answers, would it not be simpler to create an NFS Persistent Volume and then allow the pods to mount the PV?
You can use the ReadWriteMany access mode, which should alleviate the deadlock.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-volume
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2
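To tie this back to the question, the sketch below claims the PV above and uses an init container to seed the volume only when it is empty. The claim, Pod, image, and file names are hypothetical, and the seed step is a stand-in for whatever initialization the real image performs.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-volume-claim        # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""             # must match the PV for static binding
  volumeName: shared-volume        # bind to the PV defined above
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-seed              # hypothetical name
spec:
  initContainers:
    - name: seed-if-empty
      image: busybox:1.36          # placeholder image
      command:
        - sh
        - -c
        - |
          # Seed the volume only when it contains no files yet.
          if [ -z "$(ls -A /data)" ]; then
            echo "initial data" > /data/seed.txt
          fi
      volumeMounts:
        - name: shared
          mountPath: /data
  containers:
    - name: app
      image: busybox:1.36          # placeholder image
      command: ["sh", "-c", "cat /data/seed.txt && sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-volume-claim
```

If several Pods can start at the same moment, two init containers may both see an empty volume, so the seed step should be safe to run more than once.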
Persistent Volumes
I want to setup a PVC on AWS, where I need ReadWriteMany as access mode. Unfortunately, EBS only supports ReadWriteOnce.
How could I solve this?
I have seen that there is a beta provider for AWS EFS which supports ReadWriteMany, but as said, this is still beta, and its installation looks somewhat flaky.
I could use node affinity to force all pods that rely on the EBS volume to a single node, and stay with ReadWriteOnce, but this limits scalability.
Are there any other ways to solve this? Basically, what I need is a way to store data persistently and share it across pods that are independent of each other.
Using EFS without automatic provisioning
The EFS provisioner may be beta, but EFS itself is not. Since EFS volumes can be mounted via NFS, you can simply create a PersistentVolume with an NFS volume source manually -- assuming that automatic provisioning is not a hard requirement on your side:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-efs-volume
spec:
  capacity:
    storage: 100Gi # Doesn't really matter, as EFS does not enforce it anyway
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  mountOptions:
    - hard
    - nfsvers=4.1
    - rsize=1048576
    - wsize=1048576
    - timeo=600
    - retrans=2
  nfs:
    path: /
    server: fs-XXXXXXXX.efs.eu-central-1.amazonaws.com
You can then claim this volume using a PersistentVolumeClaim and use it in a Pod (or multiple Pods) as usual.
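A possible shape for that claim and its use is sketched here, with several replicas sharing the same volume. The claim, Deployment, label, image, and path names are assumptions for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-efs-claim               # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""             # the PV above has no storage class
  volumeName: my-efs-volume        # bind explicitly to the PV above
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-writer              # hypothetical name
spec:
  replicas: 2                      # both replicas mount the same EFS volume
  selector:
    matchLabels:
      app: shared-writer
  template:
    metadata:
      labels:
        app: shared-writer
    spec:
      containers:
        - name: app
          image: busybox:1.36      # placeholder image
          command: ["sh", "-c", "date >> /shared/$(hostname).log && sleep 3600"]
          volumeMounts:
            - name: efs
              mountPath: /shared   # assumed mount path
      volumes:
        - name: efs
          persistentVolumeClaim:
            claimName: my-efs-claim
```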
Alternative solutions
If automatic provisioning is a hard requirement for you, there are alternative solutions you might look at: there are several distributed filesystems that you can roll out on your cluster that offer ReadWriteMany storage on top of Kubernetes and/or AWS. For example, you might take a look at Rook (which is basically a Kubernetes operator for Ceph). It's also officially still in a pre-release phase, but I've already worked with it a bit and it runs reasonably well.
There's also the GlusterFS operator, which already seems to have a few stable releases.
You can use Amazon EFS to create a PersistentVolume with the ReadWriteMany access mode.
Amazon EKS announced support for the Amazon EFS CSI Driver on Sep 19 2019, which makes it simple to configure elastic file storage for both EKS and self-managed Kubernetes clusters running on AWS using standard Kubernetes interfaces.
Applications running in Kubernetes can
use EFS file systems to share data between pods in a scale-out group,
or with other applications running within or outside of Kubernetes.
EFS can also help Kubernetes applications be highly available because
all data written to EFS is written to multiple AWS Availability zones.
If a Kubernetes pod is terminated and relaunched, the CSI driver will
reconnect the EFS file system, even if the pod is relaunched in a
different AWS Availability Zone.
You can deploy the Amazon EFS CSI Driver to an Amazon EKS cluster following the EKS-EFS-CSI user guide, basically like this:
Step 1: Deploy the Amazon EFS CSI Driver
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"
Note: This command requires version 1.14 or greater of kubectl.
Step 2: Create an Amazon EFS file system for your Amazon EKS cluster
Step 2.1: Create a security group that allows inbound NFS traffic for your Amazon EFS mount points.
Step 2.2: Add a rule to your security group to allow inbound NFS traffic from your VPC CIDR range.
Step 2.3: Create the Amazon EFS file system configured with the security group you just created.
Now you are ready to use EFS with the ReadWriteMany access mode in your EKS Kubernetes project, using the following sample manifest files:
1. efs-storage-class.yaml: Create the storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
kubectl apply -f efs-storage-class.yaml
2. efs-pv.yaml: Create PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ftp-efs-pv
spec:
  storageClassName: efs-sc
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi # Doesn't really matter, as EFS does not enforce it anyway
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-642da695
Note: you need to replace the volumeHandle value with your Amazon EFS file system ID.
3. efs-pvc.yaml: Create PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ftp-pv-claim
  labels:
    app: ftp-storage-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: efs-sc
That should be it. Refer to the aforementioned official user guide for a detailed explanation, where you can also find an example app to verify your setup.
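If you just want a quick check of the shared access, something like the following pair of Pods may be enough: one appends timestamps to a file on the claim, the other follows the same file. This is not the example app from the user guide. The Pod names, image, and file path are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: efs-writer                 # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox:1.36          # placeholder image
      command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: ftp-pv-claim
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-reader                 # hypothetical name
spec:
  containers:
    - name: reader
      image: busybox:1.36          # placeholder image
      # Wait until the writer has created the file, then follow it.
      command: ["sh", "-c", "until [ -f /data/out.txt ]; do sleep 1; done; tail -f /data/out.txt"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: ftp-pv-claim
```

If the setup works, the logs of efs-reader should show the timestamps written by efs-writer, even when the two Pods run on different nodes.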
As you mention, pinning pods to an EBS volume with affinity and a node selector will limit scalability; however, with EBS only ReadWriteOnce will work.
Sharing my experience: if you do many file system operations and frequently push and fetch files, EFS can be slow, which can degrade application performance. The operation rate on EFS is low.
However, you can use GlusterFS with EBS volumes provisioned as its backing storage. GlusterFS also supports ReadWriteMany, and it will be faster than EFS because it sits on block storage (SSD).