Kubernetes persistent volume claim - Not sure what is the meaning of "capacity" - kubernetes

I am wondering what "capacity" means for a Kubernetes persistent volume claim. Does it mean Kubernetes will schedule a pod with the volume you specified (e.g. 20 GB) on your cluster and that 20 GB of space will be isolated, or does it automatically scale down if you don't fully use the 20 GB?

From the docs:
A PersistentVolumeClaim (PVC) is a request for storage by a user. It
is similar to a Pod. Pods consume node resources and PVCs consume PV
resources.
So, a PVC is not the actual storage but a request for storage. A PersistentVolume (PV) is the actual storage:
A PersistentVolume (PV) is a piece of storage in the cluster that has
been provisioned by an administrator or dynamically provisioned using
Storage Classes.
A PVC resource gets bound to an actual PV resource if a matching PV is available. The physical storage can then be used by pods by referencing the PVC in the pod spec's persistentVolumeClaim field under volumes.
I am wondering what does "capacity" mean for Kubernetes persistent
volume claim.
The PVC resource does not have a capacity field but a storage request field; with static provisioning, the PVC will be bound to a PV object whose capacity is at least the amount of storage requested by the PVC.
A PVC can also be configured for dynamic provisioning, which means the actual storage will be provisioned as part of PVC creation if there is no matching PV already in the cluster.
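As a rough sketch, a PVC requesting 20 GB might look like the following (the claim name and storage class here are hypothetical, not from the question):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                 # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi              # the "capacity" being requested; a fixed allocation, not a shrinking quota
  storageClassName: standard     # assumed class; triggers dynamic provisioning if no matching PV exists
```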
Does that mean Kubernetes will schedule a pod with the volume you
specified (e.g. 20 GB) on your cluster and that 20 GB space will be
isolated or does it automatically scale down if you didn't fully use
the 20 GB.
If you have configured dynamic provisioning for the cluster and create a PVC object, a matching PV object is created internally and the PVC gets bound to it. A pod needs to reference the PVC object to actually use it. Further, storage resources do not get resized automatically based on actual consumption. Once created, they persist until deleted explicitly, and they have a lifecycle independent of the pods.
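To illustrate how a pod references a claim, a minimal sketch (pod, image, and claim names are examples, not from the question):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod                  # hypothetical pod name
spec:
  containers:
    - name: app
      image: nginx               # example image
      volumeMounts:
        - name: data
          mountPath: /var/data   # where the volume appears inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc      # must match an existing PVC in the same namespace
```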

Related

Does the Storage class need to be created in Kubernetes before referring them in PV/PVC?

I have a PV alpha-pv in the Kubernetes cluster and have created a PVC matching the PV specs. The PV uses the storage class slow. However, when I check for that storage class in the cluster, no StorageClass resource exists, and still my PVC was bound to the PV.
How is this possible when the storage class referred to in the PV/PVC does not exist in the cluster?
If I don't mention the storage class in the PVC, I get an error message stating the storage class is set. There is already an existing PV in the cluster with RWO access mode, 1Gi storage size, and the storage class named slow. But on checking the storage class details, there is no StorageClass resource in the cluster.
If I add the storage class name slow to my PVC mysql-alpha-pvc, then the PVC binds to the PV. But I'm not clear how this happens when the storage class slow referred to in the PV/PVC doesn't exist in the cluster.
Short answer
It depends.
Theory
One of the main purposes of using a StorageClass is dynamic provisioning. That means persistent volumes are automatically provisioned once a PersistentVolumeClaim requests storage: either immediately or after a pod using the PVC is created. See Volume binding mode.
Also:
A StorageClass provides a way for administrators to describe the
"classes" of storage they offer. Different classes might map to
quality-of-service levels, or to backup policies, or to arbitrary
policies determined by the cluster administrators. Kubernetes itself
is unopinionated about what classes represent. This concept is
sometimes called "profiles" in other storage systems.
Reference.
How it works
If, for instance, Kubernetes is used in the cloud (Google GKE, Azure AKS, or AWS EKS), the provider ships predefined StorageClasses. For example, this is from Google GKE:
$ kubectl get storageclasses
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
premium-rwo          pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   27d
standard (default)   kubernetes.io/gce-pd    Delete          Immediate              true                   27d
standard-rwo         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   27d
So you can create PVCs that refer to one of these StorageClasses, and a PV will be created for you.
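For reference, an administrator-defined StorageClass might look like this sketch (the class name and GCE PD parameters are just an example, not something from the question):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                        # hypothetical class name
provisioner: pd.csi.storage.gke.io  # CSI driver that creates the backing disks
parameters:
  type: pd-ssd                      # provider-specific disk type
reclaimPolicy: Delete               # delete the disk when the PV is released
volumeBindingMode: WaitForFirstConsumer   # provision only once a pod uses the PVC
```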
Another scenario, which is the one you hit: you can create a PVC and a PV with an arbitrary custom storageClassName used only for binding purposes. Usually this is done for testing something locally. This is also called static provisioning.
In this case you can use a "fake" storage class name that doesn't exist as a resource in the cluster.
Please see an example of this type of binding:
It defines the StorageClass name manual for the PersistentVolume,
which will be used to bind PersistentVolumeClaim requests to this
PersistentVolume.
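A minimal sketch of such a PV/PVC pair, assuming a hostPath directory for local testing (names and path are examples):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv                   # hypothetical PV name
spec:
  storageClassName: manual        # no StorageClass object needs to exist with this name
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data               # example directory on the node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pvc                  # hypothetical PVC name
spec:
  storageClassName: manual        # the matching name is what binds this PVC to the PV above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```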
Useful links:
Kubernetes storage classes
Kubernetes dynamic provisioning
Kubernetes persistent volumes
Hello, I already faced the same challenge and solved it.
Please make sure:
Your PVC configuration (access mode, size, name) matches what is in the PV configuration
The claim name in your Deployment equals your PVC's name
Scale your Deployment to 0 and then back to 1; you should find that it works smoothly
If you are still facing problems, run kubectl get events to find out what the blocker is.

Kubernetes: hostPath Static Storage with PV vs hard coded hostPath in Pod Volume

I'm learning Kubernetes and there is something I don't get well.
There are 3 ways of setting up static storage:
Pods with volumes you attach the storage to directly
Pods with a PVC attached to its volume
StatefulSets with also PVC inside
I can understand the power of a PVC when working together with a StorageClass, but not when working with static storage and local storage like hostPath.
To me, it sounds very similar:
In the first case I have a volume directly attached to a pod.
In the second case I have a volume statically attached to a PVC, which is also manually attached to a Pod. In the end, the volume will be statically attached to the Pod.
In both cases, the data will remain when the Pod is terminated and will be adopted by the next Pod with the corresponding definition, right?
The only benefit I see from using PVCs over plain Pod volumes is that you can define the access mode. Apart from that, is there a difference when working with hostPath?
On the other hand, the advantage of using a StatefulSet instead of a PVC is (if I understood properly) that it gets a headless service, and that the rollout and rollback mechanisms work differently. Is that the point?
Thank you in advance!
Extracted from this blog:
The biggest difference is that the Kubernetes scheduler understands
which node a Local Persistent Volume belongs to. With HostPath
volumes, a pod referencing a HostPath volume may be moved by the
scheduler to a different node resulting in data loss. But with Local
Persistent Volumes, the Kubernetes scheduler ensures that a pod using
a Local Persistent Volume is always scheduled to the same node.
Using hostPath does not guarantee that a pod will restart on the same node. So your pod can attach /tmp/storage on k8s-node-1; then, if you delete and re-create the pod, it may attach /tmp/storage on k8s-node-[2-n].
On the contrary, if you use a PVC/PV with the local persistent storage class, then if you delete and re-create a pod, it will stick to the node that holds the local persistent storage.
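What makes this work is that a local PersistentVolume carries a nodeAffinity section pinning it to one node, which the scheduler honors. A sketch, where the node name, path, and class name are all examples:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv                     # hypothetical PV name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage    # assumed class backed by the no-provisioner local plugin
  local:
    path: /mnt/disks/ssd1            # example path on the node
  nodeAffinity:                      # required for local volumes; keeps consuming pods on this node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node-1         # example node name
```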
A StatefulSet creates pods and has a volumeClaimTemplates field, which creates a dedicated PVC for each pod. So each pod created by the StatefulSet has its own dedicated storage, linked via Pod -> PVC -> PV -> Storage. So StatefulSets also use the PVC/PV mechanism.
More details are available here.

Does the storage class dynamically provision persistent volume per pod?

Kubernetes newbie here, so my question might not make sense. Please bear with me.
So my question is: given I have set up a Storage Class in my cluster and have a PVC which uses that Storage Class, if I use that PVC in my Deployment, and that Deployment has 5 replicas, will the Storage Class create 5 PVs, one per Pod? Or only 1 PV shared by all Pods under that Deployment?
Edit: Also I have 3 Nodes in this cluster
Thanks in advance.
The PersistentVolumeClaim resource is specified separately from a Deployment. It doesn't matter how many replicas the Deployment has; Kubernetes will only have the number of PVC resources that you define.
If you are looking for multiple stateful containers that each create their own PVC, use a StatefulSet instead. This includes a volumeClaimTemplates definition.
If you want all Deployment replicas to share one PVC, the storage class's provisioner will need to support the ReadOnlyMany or ReadWriteMany access mode.
To answer my question directly.
The Storage Class in this case will only provision one PV, which is shared across all pods under the Deployment that uses that PVC.
The accessModes of the PVC do not dictate whether to create one PV per pod. You can set accessModes to ReadWriteOnce/ReadOnlyMany/ReadWriteMany and it will always create 1 PV.
If you want each Pod to have its own PV, you cannot do that under a Deployment.
You will need to use StatefulSet using volumeClaimTemplates.
It is important that the StatefulSet uses volumeClaimTemplates; otherwise it will still behave like the Deployment, that is, the Storage Class will just provision one PV shared across all pods under that StatefulSet.
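A hedged sketch of that StatefulSet pattern (names, image, and sizes are examples); each replica gets its own PVC named after the template, the set, and the ordinal:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                     # example name
spec:
  serviceName: web              # headless Service, assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx          # example image
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:         # one PVC per replica: data-web-0, data-web-1, data-web-2
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: standard   # assumed class
        resources:
          requests:
            storage: 1Gi
```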
References:
Kubernetes Deployments vs StatefulSets
Is there a way to create a persistent volume per pod in a kubernetes deployment (or statefulset)?

How can I copy data from a bound persistent volume?

I've got a persistent volume currently bound to a deployment. I set the replica count to 0, which I was hoping would unbind the volume so I could mount it on another pod, but it remains in a Bound status.
How can I copy the data from this?
I'd like to transfer it via scp to another location.
I've got a persistent volume currently bound to a deployment. I set the replica count to 0, which I was hoping would unbind the volume so I could mount it on another pod, but it remains in a Bound status.
"Bound" does not mean it is attached to a Node, nor to a Pod (which is pragmatically the same thing); Bound means that the cloud provider has created a PersistentVolume (out of thin air) in order to fulfill a PersistentVolumeClaim for some or all of its allocatable storage. "Bound" relates to its cloud status, not its Pod status. That term exists because Kubernetes supports reclaiming volumes, to avoid creating a new cloud storage object if there are existing ones that no one is claiming and that fulfill the resource request.
There's nothing, at least in your question, that prevents you from launching another Pod (in the same Namespace as the Deployment) with a volumes: that points to the persistentVolumeClaim and it will launch that Pod with the volume just as it did in your Deployment. You can then do whatever you'd like in that Pod to egress the data.
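One way to do that, as a sketch: launch a throwaway pod mounting the same claim (the claim name here is hypothetical; use your actual PVC's name), then copy the data out.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: data-copier               # throwaway pod, example name
spec:
  containers:
    - name: shell
      image: busybox              # small image with basic shell tools
      command: ["sleep", "3600"]  # keep the pod alive while you copy
      volumeMounts:
        - name: data
          mountPath: /data        # the volume's contents appear here
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc         # hypothetical: the PVC that is still Bound
```

With the pod running, something like kubectl cp data-copier:/data ./backup pulls the files to your machine, after which you can scp them anywhere. Delete the pod when done.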

who can create a persistent volume in kubernetes?

Its mentioned in kubernetes official website as below for PV and PVC.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
Who is the administrator here, when mentioned from the persistent volume perspective?
An administrator in this context is the admin of the cluster: whoever is deploying the PV/PVC (an operations engineer, system engineer, or sysadmin).
For example - an engineer can configure AWS Elastic File System to have space available in the Kubernetes cluster, then use a PV/PVC to make that available to a specific pod container in the cluster. This means that if the pod is destroyed for whatever reason, the data in the PVC persists and is available to other resources.