Understanding Persistent Volumes vs. Persistent Volume Claims and how they bind

I understand that a PV is the physical storage for a k8s cluster and that a PVC is just a request for storage tied to a deployment/pod that will look at available PVs and claim one.
Where I'm confused is how/if a mount will rebind to the PV when the deployment is restarted. Are there cases where, if I restart my pod, the PVC will bind to a DIFFERENT PV? Will I lose the data that's mounted in the deployment or pod? Or does that binding happen when I deploy my PVC and then just remain static regardless of the state of the pod?
I haven't really found any documentation that spells this out so any clarification would be helpful!

...PVC will bind to a DIFFERENT PV?
Once a PVC is bound to a PV, that binding is exclusive and stays in place for the lifetime of the claim; restarting or rescheduling the pod does not rebind it, and your data stays put. The binding only goes away if you delete the PVC itself. To ensure your PVC binds to one specific PV from the start, you can also pre-bind the PVC/PV, and the instructions are here.
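For illustration, a minimal sketch of pre-binding (the names my-pv and my-pvc are placeholders): setting spec.volumeName on the claim pins it to one specific PV, and an empty storageClassName keeps dynamic provisioning from kicking in.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc                # hypothetical claim name
spec:
  volumeName: my-pv           # pin this claim to the PV named my-pv
  storageClassName: ""        # empty string disables dynamic provisioning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi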

Related

Does the Storage Class need to be created in Kubernetes before referring to it in a PV/PVC?

I have a PV alpha-pv in the kubernetes cluster and have created a PVC matching the PV specs. The PV uses the Storage Class: slow. However, when I check for the Storage Class in the cluster, no such Storage Class exists, and still my PVC was Bound to the PV.
How is this possible when the Storage Class referred to in the PV/PVC does not exist in the cluster?
If I don't mention the Storage Class in the PVC, I get an error message stating "Storage Class Set". There is already an existing PV in the cluster which has the RWO access mode, 1Gi storage size, and the Storage Class named slow. But on checking the Storage Class details, there is no Storage Class resource in the cluster.
If I add the Storage Class name slow to my PVC mysql-alpha-pvc, then the PVC binds to the PV. But I'm not clear how this happens when the Storage Class named slow, referred to in the PV/PVC, doesn't exist in the cluster.
Short answer
It depends.
Theory
One of the main purposes of using a storageClass is dynamic provisioning. That means that persistent volumes will be automatically provisioned once a persistent volume claim requests the storage: either immediately, or after a pod using this PVC is created. See Volume binding mode.
Also:
A StorageClass provides a way for administrators to describe the
"classes" of storage they offer. Different classes might map to
quality-of-service levels, or to backup policies, or to arbitrary
policies determined by the cluster administrators. Kubernetes itself
is unopinionated about what classes represent. This concept is
sometimes called "profiles" in other storage systems.
Reference.
How it works
If Kubernetes is used in the cloud (Google GKE, Azure AKS or AWS EKS), the providers ship predefined storageClasses. For example, this is from Google GKE:
$ kubectl get storageclasses
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
premium-rwo          pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   27d
standard (default)   kubernetes.io/gce-pd    Delete          Immediate              true                   27d
standard-rwo         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   27d
So you can create PVCs that refer to a storageClass, and a PV will be created for you.
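For example, a claim like the following sketch (the name data-pvc is a placeholder; standard-rwo is taken from the GKE listing above) is enough to have a PV provisioned automatically:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                   # hypothetical name
spec:
  storageClassName: standard-rwo   # the provisioner creates a matching PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi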
Another scenario, which is the one you faced, is static provisioning: you can create a PVC and a PV with any custom storageClassName, used purely for binding purposes. This is usually done for testing something locally.
In this case you can use a "fake" storage class that doesn't exist in the Kubernetes cluster at all.
Please see an example of this type of binding:
It defines the StorageClass name manual for the PersistentVolume,
which will be used to bind PersistentVolumeClaim requests to this
PersistentVolume.
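Based on that docs example, a minimal sketch of the pattern (all names, the size, and the /mnt/data path are illustrative) could look like this; no StorageClass object named manual ever needs to exist:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv                 # hypothetical name
spec:
  storageClassName: manual      # "fake" class used only for matching
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pvc                # hypothetical name
spec:
  storageClassName: manual      # same class name binds the claim to the PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi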
Useful links:
Kubernetes storage classes
Kubernetes dynamic provisioning
Kubernetes persistent volumes
Hello, I already faced the same challenge but solved it.
Please make sure:
Your PVC configuration (RW mode, size, name) matches what is in the PV configuration
The claim name in your Deployment equals your PVC's name (see the sketch below)
Scale your deployment to 0 and then back to 1; you will find that it works smoothly
If you are facing any challenges, you can run kubectl get events to find out what the blocker is.
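As a rough illustration of the second point (all names and the image are placeholders), the claimName in the Deployment's volume must match the PVC's metadata.name:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: nginx          # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-pvc   # must equal the PVC's metadata.name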

Does a PV/PVC in Kubernetes guarantee sticky mounting of pods?

I would like to understand whether, through PVC/PV, a pod that is using a volume will always be re-attached to the same volume after a failure. Essentially I know that this is the case for StatefulSets, but I am trying to understand if this can also be achieved with plain PVCs and PVs. Essentially, assume that Pod_A is attached to Volume_X, then Pod_A fails, but in the meantime a Volume_Y was added to the cluster that could potentially fulfil the PVC requirements. So what happens when Pod_A is re-created: does it always get mounted to Volume_X, or is there any chance that it gets mounted to the new Volume_Y?
a pod that is using a volume will always be re-attached to the same volume after a failure
Yes, the Pod will be re-attached to the same volume, because it still has the same PVC declared in its manifest.
Essentially, assume that Pod_A is attached to Volume_X, then Pod_A fails, but in the meantime a Volume_Y was added to the cluster that could potentially fulfil the PVC requirements.
The Pod still has the same PVC in its manifest, so it will use the same volume. But if you create a new PVC, it might be bound to the new volume.
So what happens when Pod_A is re-created: does it always get mounted to Volume_X, or is there any chance that it gets mounted to the new Volume_Y?
The Pod still has the same PVC in its manifest, so it will use the volume that is bound by that claim. Only if you create a new PVC can that claim be bound to the new volume.
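You can see this stickiness on the claim object itself: at bind time the control plane records the chosen PV's name in the claim's spec.volumeName, and it stays there until the PVC is deleted. Roughly (all names here are illustrative), a bound claim looks like:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: volume-x   # set when the claim binds; the Pod keeps using this PV
status:
  phase: Bound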

Kubernetes: hostPath Static Storage with PV vs hard coded hostPath in Pod Volume

I'm learning Kubernetes and there is something I don't get well.
There are 3 ways of setting up static storage:
Pods with volumes you attach the storage to directly
Pods with a PVC attached to its volume
StatefulSets with also PVC inside
I can understand the power of a PVC when working together with a StorageClass, but not when working with static storage and local storage like hostPath.
To me, it sounds very similar:
In the first case I have a volume directly attached to a pod.
In the second case I have a volume statically attached to a PVC, which is also manually attached to a Pod. In the end, the volume will be statically attached to the Pod.
In both cases, the data will remain when the Pod terminates and will be adopted by the next Pod with the corresponding definition, right?
The only benefit I see from using PVCs over plain Pod volumes is that you can define the access mode. Apart from that, is there a difference when working with hostPath?
On the other hand, the advantage of using a StatefulSet instead of a PVC is (if I understood properly) that it gets a headless service, and that the rollout and rollback mechanisms work differently. Is that the point?
Thank you in advance!
Extracted from this blog:
The biggest difference is that the Kubernetes scheduler understands
which node a Local Persistent Volume belongs to. With HostPath
volumes, a pod referencing a HostPath volume may be moved by the
scheduler to a different node resulting in data loss. But with Local
Persistent Volumes, the Kubernetes scheduler ensures that a pod using
a Local Persistent Volume is always scheduled to the same node.
Using hostPath does not guarantee that a pod will restart on the same node. So your pod can attach /tmp/storage on k8s-node-1; then, if you delete and re-create the pod, it may attach /tmp/storage on k8s-node-[2-n].
On the contrary, if you use a PVC/PV with the local persistent storage class, then if you delete and re-create the pod, it will stick to the node that holds the local persistent storage.
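As an illustration (the node name, path, and class name below are made up), a local PV carries a required nodeAffinity, which is what lets the scheduler pin every consuming pod to that node:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage     # hypothetical class name
  local:
    path: /mnt/disks/ssd1             # directory or disk on that node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node-1          # pods using this PV always land here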
StatefulSet creates pods and has a volumeClaimTemplates field, which creates a dedicated PVC for each pod. So each pod created by the StatefulSet will have its own dedicated storage, linked as Pod -> PVC -> PV -> Storage. So StatefulSets also use the PVC/PV mechanism.
More details are available here.

Does the storage class dynamically provision a persistent volume per pod?

Kubernetes newbie here, so my question might not make sense. Please bear with me.
So my question is: given that I have set up a Storage Class in my cluster and a PVC which uses that Storage Class, if I use that PVC in my Deployment and that Deployment has 5 replicas, will the Storage Class create 5 PVs, one per Pod? Or only 1 PV shared by all Pods under that Deployment?
Edit: Also I have 3 Nodes in this cluster
Thanks in advance.
The Persistent Volume Claim resource is specified separately from a deployment. It doesn't matter how many replicas the deployment has; Kubernetes will only have the number of PVC resources that you define.
If you are looking for multiple stateful containers that each create their own PVC, use a StatefulSet instead, which includes a volumeClaimTemplates definition.
If you want all deployment replicas to share one PVC, the storage provider will need to support the ReadOnlyMany or ReadWriteMany access mode.
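For instance, a shared claim could be sketched like this (nfs-client is a hypothetical RWX-capable class; the name and size are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc
spec:
  storageClassName: nfs-client   # hypothetical class whose provisioner supports RWX
  accessModes:
    - ReadWriteMany              # lets all replicas mount the same volume read/write
  resources:
    requests:
      storage: 5Gi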
To answer my question directly.
The Storage Class in this case will only provision one PV, which is shared across all pods under the Deployment that uses that PVC.
The accessModes of the PVC do not dictate whether to create one PV per pod. You can set the accessModes to ReadWriteOnce, ReadOnlyMany, or ReadWriteMany and it will still create just 1 PV.
If you want each Pod to have its own PV, you cannot do that under a Deployment.
You will need to use a StatefulSet with volumeClaimTemplates.
It is important that the StatefulSet uses volumeClaimTemplates; otherwise it will still act the same as a Deployment, i.e. the Storage Class will just provision one PV that is shared across all pods under that StatefulSet.
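A rough sketch of the volumeClaimTemplates approach (names, image, size, and class are placeholders); each of the 5 replicas gets its own PVC, and the Storage Class provisions a PV for each:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web                   # headless Service that governs the set
  replicas: 5
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx               # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:              # one PVC (data-web-0 ... data-web-4) per replica
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: standard   # hypothetical class
        resources:
          requests:
            storage: 1Gi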
References:
Kubernetes Deployments vs StatefulSets
Is there a way to create a persistent volume per pod in a kubernetes deployment (or statefulset)?

Where can I find a Kubernetes PV on the host filesystem?

I am trying to understand how Kubernetes handles the persistent volumes on the node's filesystem.
For example, if I have minikube as my Kubernetes cluster node, and I create multiple PVs with PVCs for my pods, and I ssh into minikube, where can I find the PVs on minikube's filesystem?
If I type
lsblk
I get
sda 8:0 0 19.5G 0 disk
but no PV disks are listed.
Thank you for your answers.
You will not see it there, because a PV is an API object that lives inside the Kubernetes API; it is not necessarily a separate block device.
I recommend reading Kubernetes documentation regarding Persistent Volumes.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes that differ in more ways than just size and access modes, without exposing users to the details of how those volumes are implemented. For these needs there is the StorageClass resource.
Please see the detailed walkthrough with working examples.
You can also have a look at the Kubernetes Volumes Guide which explains the types of storage, how long do they last and how to use them in examples.
Because they are hostPath volumes, you will not see them in lsblk. Use "kubectl describe pv PV_NAME" to find out where they are located on the node's filesystem.
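For example, a hostPath-backed PV (the name and path below are illustrative; minikube's dynamic provisioner typically uses a path under /tmp/hostpath-provisioner) records the node path directly in its spec, which is also what kubectl describe pv reports as the volume source:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-0001                # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/hostpath-provisioner/default/my-pvc   # files live here on the node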