Does the storage class dynamically provision persistent volume per pod? - kubernetes

Kubernetes newbie here, so my question might not make sense. Please bear with me.
Suppose I have set up a Storage Class in my cluster and created a PVC that uses that Storage Class. If I reference that PVC in my Deployment, and that Deployment has 5 replicas, will the Storage Class create 5 PVs, one per Pod? Or only 1 PV shared by all Pods under that Deployment?
Edit: I also have 3 nodes in this cluster.
Thanks in advance.

The PersistentVolumeClaim resource is defined separately from the Deployment. No matter how many replicas the Deployment has, Kubernetes will only have the PVC resources that you define.
If you are looking for multiple stateful pods that each create their own PVC, use a StatefulSet instead. Its spec includes a volumeClaimTemplates definition.
If you want all Deployment replicas to share one PVC, the volume plugin behind the storage class must support the ReadOnlyMany or ReadWriteMany access mode.
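As a minimal sketch of the shared-PVC case described above: one claim, referenced by every replica of a Deployment. All names here (shared-data, web, nfs-example) are hypothetical, and the storage class is assumed to be backed by a plugin that supports ReadWriteMany.

```yaml
# One PVC, defined once, independent of the Deployment's replica count.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data              # hypothetical name
spec:
  storageClassName: nfs-example  # assumption: a class whose plugin supports ReadWriteMany
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # hypothetical name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: shared-data   # all 5 replicas reference the same claim, hence the same PV
```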

To answer my question directly:
The Storage Class in this case will provision only one PV, which is shared by all pods of the Deployment that uses that PVC.
The accessModes of the PVC do not determine whether one PV is created per pod. Whether you set accessModes to ReadWriteOnce, ReadOnlyMany or ReadWriteMany, a single PVC always results in 1 PV.
If you want each Pod to have its own PV, you cannot do that with a Deployment.
You will need to use a StatefulSet with volumeClaimTemplates.
It is important that the StatefulSet uses volumeClaimTemplates; otherwise it behaves the same as the Deployment, that is, the Storage Class provisions just one PV that is shared by all pods of that StatefulSet.
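For illustration, a StatefulSet with volumeClaimTemplates might look like the sketch below. The names (web, data) and the storage class standard are assumptions, and a headless Service named web is assumed to exist already.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                    # hypothetical name
spec:
  serviceName: web             # assumption: a headless Service with this name exists
  replicas: 5
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: data       # must match the volumeClaimTemplates entry below
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        storageClassName: standard   # assumption: an existing class with a dynamic provisioner
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

With this manifest, Kubernetes creates one PVC per pod, named after the template, the StatefulSet and the pod ordinal (data-web-0 through data-web-4), and the storage class provisions one PV for each of them.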
References:
Kubernetes Deployments vs StatefulSets
Is there a way to create a persistent volume per pod in a kubernetes deployment (or statefulset)?

Related

Understanding Persistent Volumes vs. Persistent Volume Claims and how they bind

I understand that a PV is the physical storage for a k8s cluster and that a PVC is just a request for storage tied to a deployment/pod that will look at available PVs and claim one.
Where I'm confused is how/if a mount will rebind to the PV if the deployment is started up. Are there cases when, if I restart my pod, that a PVC will bind to a DIFFERENT PV? Will I lose my data that's mounted in the deployment or pod? Or does that bind happen when I deploy my PVC and then just remain static regardless of the state of the pod?
I haven't really found any documentation that spells this out so any clarification would be helpful!
...PVC will bind to a DIFFERENT PV?
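A PVC-to-PV binding is exclusive and lasts for the lifetime of the PVC, so restarting the pod does not rebind the claim to a different PV. To ensure your PVC binds to one particular PV in the first place, you can pre-bind the PVC and PV; the instructions are here.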
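A rough sketch of pre-binding, assuming hypothetical names (foo-pv, foo-pvc), the default namespace and a hostPath-backed volume: the PV reserves itself for the claim through claimRef, and the claim names the PV through volumeName.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: foo-pv                 # hypothetical name
spec:
  storageClassName: ""         # empty string: no class, so no dynamic provisioning is involved
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data            # assumption: illustrative path on the node
  claimRef:                    # reserve this PV for one specific claim
    name: foo-pvc
    namespace: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: foo-pvc                # hypothetical name
  namespace: default
spec:
  storageClassName: ""
  volumeName: foo-pv           # ask for this PV by name
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```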

Does the Storage class need to be created in Kubernetes before referring them in PV/PVC?

I have a PV alpha-pv in the Kubernetes cluster and have created a PVC matching the PV specs. The PV uses the Storage Class slow. However, when I check for Storage Classes in the cluster, none exists, and still my PVC was bound to the PV.
How is this possible when the Storage Class referred to in the PV/PVC does not exist in the cluster?
If I don't mention the Storage Class in the PVC, I get an error message stating Storage Class Set. There is already an existing PV in the cluster which has the RWO access mode, 1Gi storage size and the Storage Class named slow. But on checking the Storage Class details, there is no Storage Class resource in the cluster.
If I add the Storage Class name slow to my PVC mysql-alpha-pvc, then the PVC binds to the PV. But I'm not clear how this happens when the Storage Class named slow, referred to in the PV/PVC, doesn't exist in the cluster.
Short answer
It depends.
Theory
One of the main purposes of a storageClass is dynamic provisioning. That means a persistent volume is provisioned automatically when a persistent volume claim requests storage: either immediately, or once a pod using this PVC is created. See Volume binding mode.
Also:
A StorageClass provides a way for administrators to describe the
"classes" of storage they offer. Different classes might map to
quality-of-service levels, or to backup policies, or to arbitrary
policies determined by the cluster administrators. Kubernetes itself
is unopinionated about what classes represent. This concept is
sometimes called "profiles" in other storage systems.
Reference.
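To make this concrete, a StorageClass definition could look roughly like the sketch below. The name fast-example is hypothetical; the provisioner is the one that appears in the GKE listing in this answer, and the type parameter is an assumption specific to that provisioner.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-example                # hypothetical name
provisioner: pd.csi.storage.gke.io  # provisioner from the GKE listing in this answer
parameters:
  type: pd-ssd                      # assumption: provisioner-specific disk type
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # provision only once a pod using the PVC is scheduled
allowVolumeExpansion: true
```

Setting volumeBindingMode to Immediate instead would provision the volume as soon as the PVC is created.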
How it works
If, for instance, Kubernetes runs in a cloud (Google GKE, Azure AKS or AWS EKS), the cluster already has predefined storage classes. For example, this is from Google GKE:
$ kubectl get storageclasses
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
premium-rwo          pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   27d
standard (default)   kubernetes.io/gce-pd    Delete          Immediate              true                   27d
standard-rwo         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   27d
So you can create PVCs that refer to one of these storage classes, and the PV will be created for you.
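A minimal sketch of such a claim, using the standard-rwo class from the listing above (the claim name and size are hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim            # hypothetical name
spec:
  storageClassName: standard-rwo # one of the predefined GKE classes listed above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi              # assumption: illustrative size
```

Because standard-rwo uses WaitForFirstConsumer, the PV would only be provisioned once a pod that uses this claim is scheduled.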
The other scenario, which is the one you faced, is that you can create a PVC and a PV with any custom storageClassName purely for binding purposes. This is usually used for testing something locally. It is also called static provisioning.
In this case you can use a "fake" storage class name that does not exist as a StorageClass resource on the Kubernetes server.
Please see an example of this type of binding:
It defines the StorageClass name manual for the PersistentVolume,
which will be used to bind PersistentVolumeClaim requests to this
PersistentVolume.
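Applied to the question, static binding could be sketched as follows. The names alpha-pv, mysql-alpha-pvc and slow, the RWO access mode and the 1Gi size come from the question; the hostPath backing and its path are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alpha-pv
spec:
  storageClassName: slow     # only a matching label here; no StorageClass object named "slow" is required
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data          # assumption: illustrative path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-alpha-pvc
spec:
  storageClassName: slow     # must equal the PV's storageClassName for the two to bind
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```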
Useful links:
Kubernetes storage classes
Kubernetes dynamic provisioning
Kubernetes persistent volumes
Hello, I faced the same challenge and solved it.
Please make sure that:
Your PVC configuration (access mode, size, name) matches what is in the PV configuration.
The claim name in your Deployment equals the name of your PVC.
Then scale your Deployment to 0 and back to 1; you should find that it works smoothly.
If you still face any challenges, you can run kubectl get events to find out what the blocker is.

Kubernetes: hostPath Static Storage with PV vs hard coded hostPath in Pod Volume

I'm learning Kubernetes and there is something I don't get well.
There are 3 ways of setting up static storage:
Pods with volumes that you attach the storage to directly
Pods with a PVC attached to its volume
StatefulSets with also PVC inside
I can understand the power of PVC when working together with StorageClass, but not when working with static storage and local storage like hostPath
To me, it sounds very similar:
In the first case I have a volume directly attached to a pod.
In the second case I have a volume statically attached to a PVC, which is also manually attached to a Pod. In the end, the volume will be statically attached to the Pod.
In both cases, the data will remain when the Pod is terminated and will be adopted by the next Pod with the corresponding definition, right?
The only benefit I see in using PVCs over a plain Pod volume is that you can define the access mode. Apart from that, is there a difference when working with hostPath?
On the other hand, the advantage of using a StatefulSet instead of a PVC is (if I understood properly) that it gets a headless service, and that the rollout and rollback mechanism works differently. Is that the point?
Thank you in advance!
Extracted from this blog:
The biggest difference is that the Kubernetes scheduler understands
which node a Local Persistent Volume belongs to. With HostPath
volumes, a pod referencing a HostPath volume may be moved by the
scheduler to a different node resulting in data loss. But with Local
Persistent Volumes, the Kubernetes scheduler ensures that a pod using
a Local Persistent Volume is always scheduled to the same node.
Using hostPath does not guarantee that a pod will restart on the same node. So your pod can attach /tmp/storage on k8s-node-1, and then, if you delete and re-create the pod, it may attach /tmp/storage on k8s-node-[2-n].
By contrast, if you use a PVC/PV with a local persistent storage class and then delete and re-create the pod, it will stick to the node that holds the local persistent storage.
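A minimal sketch of a Local Persistent Volume, assuming a hypothetical disk path and the node name k8s-node-1 used above. The nodeAffinity section is what tells the scheduler which node the data lives on.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage                      # hypothetical name
provisioner: kubernetes.io/no-provisioner  # local volumes are created statically, not provisioned
volumeBindingMode: WaitForFirstConsumer    # delay binding until a pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example                   # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1                  # assumption: illustrative path on the node
  nodeAffinity:                            # pods using this PV are scheduled onto this node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node-1
```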
A StatefulSet creates pods and has a volumeClaimTemplates field, which creates a dedicated PVC for each pod. So each pod created by the StatefulSet has its own dedicated storage, linked as Pod -> PVC -> PV -> Storage. The StatefulSet therefore also uses the PVC/PV mechanism.
More details are available here.

How to share persistent volume of a StatefulSet with another StatefulSet?

I have StatefulSet-1 running with 3 replicas, each pod writing logs to its own persistent volume, say pv1, pv2, pv3 (achieved using volumeClaimTemplates).
I have another StatefulSet-2 running with 3 replicas, and I want each pod of StatefulSet-2 to access the volumes already created for StatefulSet-1, i.e. pv1, pv2 and pv3, to process the separate logs written by each pod of StatefulSet-1.
So pv1, pv2, pv3 should be used by both StatefulSet-1 and StatefulSet-2, since they are created as part of the StatefulSet-1 deployment. The volumes will of course take the pod names of StatefulSet-1, which is OK for StatefulSet-2.
How do I configure StatefulSet-2 to achieve the above scenario? Please help!
Thanks & Regards,
Sudhir
This won't work.
1. PVs backed by GCE disks are in ReadWriteOnce mode, so 1 PVC per pod.
2. Your StatefulSet pods get their PVCs from PVC templates, which rely on dynamic volume provisioning to create the appropriate PVs and PVCs.
If you need these pods to share a PVC, your best bet is to use a ReadWriteMany PV, such as one backed by NFS. You will also need to create the pods of StatefulSet-2 manually so that they mount the appropriate PVCs. You could achieve this by creating a single-pod Deployment for each one.
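One of those single-pod Deployments could be sketched as below. It assumes the volumes support ReadWriteMany, that StatefulSet-1 is named statefulset-1 and that its volumeClaimTemplates entry is named logs, so its first PVC is logs-statefulset-1-0. The Deployment name and image are hypothetical as well.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-processor-0               # hypothetical: one Deployment per StatefulSet-1 pod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: log-processor-0
  template:
    metadata:
      labels:
        app: log-processor-0
    spec:
      containers:
        - name: processor
          image: busybox              # assumption: placeholder image
          command: ["sh", "-c", "tail -F /logs/*.log"]
          volumeMounts:
            - name: logs
              mountPath: /logs
              readOnly: true          # the processor only reads what StatefulSet-1 writes
      volumes:
        - name: logs
          persistentVolumeClaim:
            # StatefulSet PVCs are named <template name>-<statefulset name>-<ordinal>
            claimName: logs-statefulset-1-0
```

Two more Deployments of the same shape, pointing at logs-statefulset-1-1 and logs-statefulset-1-2, would cover the remaining volumes.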
Something else to consider: can the containers of both StatefulSets run together in the same pods? Normally this is not recommended, but it would allow them to share the same volumes (as long as they do not use the same ports).

Kubernetes Volume, PersistentVolume, PersistentVolumeClaim

I've been working with Kubernetes for quite a while, but I still often get confused about Volume, PersistentVolume and PersistentVolumeClaim. It would be nice if someone could briefly summarize the differences between them.
Volume - For a pod to reference storage that is external, it needs a volume spec. This volume can come from a ConfigMap, from a Secret, from a PersistentVolumeClaim, from a hostPath, etc.
PersistentVolume - It is the representation of storage that is made available. Cloud provider plugins make it possible to create this resource.
PersistentVolumeClaim - This claims specific resources. If an available PersistentVolume matches the claim's requirements, the claim is bound to that PersistentVolume.
At this point the PVC/PV is not yet in use. Then, in the Pod spec, the pod uses the claim as a volume, and the storage is attached to the Pod.
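The last step, a pod using a claim as a volume, can be sketched like this (the pod name, image and claim name my-claim are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app                      # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: storage          # refers to the volume declared below
          mountPath: /usr/share/nginx/html
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: my-claim      # assumption: an existing PVC in the pod's namespace
```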
These are all in a Kubernetes application context. To keep applications portable between different Kubernetes platforms, it is good to abstract the infrastructure away from the application. Here I will explain which Kubernetes objects belong to the application config and which to the platform config. If your application runs on both, e.g., GCP and AWS, you will need two sets of platform configs, one for GCP and one for AWS.
Application config
Volume
A pod may mount volumes. The source for volumes can be different things, e.g. a ConfigMap, Secret or a PersistentVolumeClaim
PersistentVolumeClaim
A PersistentVolumeClaim represents a claim of a specific PersistentVolume instance. For portability this claim can be for a specific StorageClass, e.g. SSD.
Platform config
StorageClass
A StorageClass represents a PersistentVolume type with specific properties, e.g. SSD. But the StorageClass is defined differently on each platform, e.g. one definition on AWS or Azure, another on GCP or on Minikube.
PersistentVolume
This is a specific volume on the platform. And it may be different on platforms, e.g. awsElasticBlockStore or gcePersistentDisk. This is the instance that holds the actual data.
Minikube example
See Configure a Pod to Use a PersistentVolume for Storage for a full example on how to use PersistentVolume, StorageClass and Volume for a Pod using Minikube and a hostPath.