I'm studying k8s and have a question about PV and PVC binding.
A PVC defines the specs it wants (capacity, access mode, etc.) in its YAML file
and finds an appropriate PV in the cluster to bind to.
Here, let's say our PVC wants at least 5GB capacity and RWO (ReadWriteOnce) mode.
And there are two PVs
PV1: 5GB, RWO
PV2: 10GB, RWO
Which one would bind to the PVC? Both of them meet the spec of the PVC.
Plus, what if the pod fails and is recreated?
If the PV works as we want (in Retain mode), I think the same PV should be bound to the PVC (pod) again to preserve the data. Does k8s guarantee this?
If there's something ambiguous in my question, please let me know.
Thank you.
Which one would bind to the PVC? Both of them meet the spec of the PVC.
A PVC cannot express a range such as "at least 5 GiB"; the value in the PVC specification is always one concrete number. Kubernetes treats that number as a minimum and binds the PV that fits the request most closely, so in this case it will be PV1: 5 GiB, RWO.
If PV works as we want(in retain mode), I think the same PV should be bound to the PVC(pod) again to preserve the data. Does k8s guarantees this work
Yes, it is guaranteed. However, you will first need to ensure that you manually 'bind' the PVC to the PV using reservation.
Also, understand that a pod dying/restarting has no effect on a PVC->PV mapping. That is the entire point of having PersistentVolumes in the first place, they should be isolated from crashes in the pods that mount them. As soon as the pod comes back up, the PVC will be mounted as a volume again, and everything will be restored.
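The reservation mentioned above can be sketched roughly like this. It is a minimal example, not the asker's actual setup: the names pv1, my-claim, the default namespace, and the hostPath backing are all assumptions made so the manifest is self-contained.

```yaml
# Hypothetical names throughout (pv1, my-claim, default namespace).
# hostPath is used only to keep the sketch self-contained.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv1
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""        # empty class: no dynamic provisioning involved
  claimRef:                   # reserve this PV for one specific claim
    namespace: default
    name: my-claim
  hostPath:
    path: /mnt/data/pv1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
  namespace: default
spec:
  storageClassName: ""
  volumeName: pv1             # ask for this exact PV instead of "any match"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

With volumeName on the claim and claimRef on the volume, neither side can be matched to a different partner, which is what makes the PVC-to-PV mapping predictable.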
You can always learn more from the official documentation.
I have created several PersistentVolumes through PersistentVolumeClaims on the "azure-file" StorageClass in AKS. The mount options of the StorageClass as delivered by Azure didn't fit our needs, so I had to reconfigure it with different mountOptions.
Do I have to manually destroy bound PersistentVolumes now to force a recreation and a reconfiguration (different mount) or is the provisioner taking care of that?
What would be the best way of forcing that?
Delete the PersistentVolume itself?
Delete the Claim?
Delete the Pods where the volumes are bound (I guess not)
Delete and recreate the whole StatefulSet?
#SahadatHossain is right with his answer but I would like to expand it a bit with more details and sources.
It is important to understand the Lifecycle of a volume and claim. The interaction between PVs and PVCs follows this lifecycle:
Provisioning - which can be static or dynamic.
Binding
Using
Reclaiming
The Reclaiming step brings us to your actual use case:
When a user is done with their volume, they can delete the PVC objects
from the API that allows reclamation of the resource. The reclaim
policy for a PersistentVolume tells the cluster what to do with the
volume after it has been released of its claim. Currently, volumes can
either be Retained, Recycled, or Deleted.
Retain - The Retain reclaim policy allows for manual reclamation of the resource.
Delete - For volume plugins that support the Delete reclaim policy, deletion removes both the PersistentVolume object from Kubernetes, as well as the associated storage asset in the external infrastructure.
Recycle - If supported by the underlying volume plugin, the Recycle reclaim policy performs a basic scrub (rm -rf /thevolume/*) on the volume and makes it available again for a new claim. Warning: The Recycle reclaim policy is deprecated. Instead, the recommended approach is to use dynamic provisioning.
When it comes to updating Pod specs, you can consider updating a Deployment (if possible) with one of the available update strategies, for example a rolling update:
The Deployment updates Pods in a rolling update fashion when
.spec.strategy.type==RollingUpdate. You can specify maxUnavailable
and maxSurge to control the rolling update process.
Basically, if you delete a PVC, the state of the PV will depend on its reclaim policy. A PV can have one of three reclaim policies: Retain, Recycle, and Delete.
With Delete, the PV is deleted automatically when the respective PVC is deleted. But remember that a PV cannot be deleted without deleting its bound PVC. Also, for dynamic provisioning the default policy is Delete. Again, a PVC cannot be deleted while any pod is using it.
Now it depends on what you need.
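For the azure-file case in the question, a custom StorageClass might look roughly like the sketch below. The class name azurefile-custom, the SKU, and the specific mount options are assumptions for illustration, not the asker's configuration. As far as I understand, mountOptions from a class are copied onto PVs when they are dynamically provisioned, so volumes that are already bound keep the options they were created with.

```yaml
# Hypothetical class name and option values; adjust to your cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-custom
provisioner: file.csi.azure.com   # Azure Files CSI driver
reclaimPolicy: Retain             # dynamic provisioning defaults to Delete
allowVolumeExpansion: true
mountOptions:                     # applied to PVs provisioned from this class
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict
parameters:
  skuName: Standard_LRS
```

New claims that reference this class get the new options; whether you then delete and recreate existing claims depends on the reclaim policy described above.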
I have a persistent volume.
I want to force Kubernetes to recreate it, as its contents are corrupted. Alternatively, if there's a way to fix that, it would be a solution.
I have checked that the persistent volume is working as expected using:
kubectl describe pv -n
And my pod was previously using it. However, my pod is now failing due to a corrupted file within the persistent volume.
I would like to recreate the persistent volume.
If I delete the persistent volume, will Kubernetes create a new one, or will I have to manually create a new one to attach?
If you delete a persistent volume, Kubernetes will not create a new one for you; you have to create a new one manually. That is the simple answer to your question.
However, when you are done with your PV you can delete the PVC object, and what happens next depends on the PV's reclaim policy, which is one of three options: Delete, Retain, or Recycle.
As kubernetes official docs stated:
When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted.
For more, you can look at the persistent volume docs of Kubernetes.
You should be able to check if the volume is in usable state by accessing it from the host on which the volume is present. Simply try creating and reading a file in it to check.
You could also do a fsck on the block device to check if the Filesystem can be fixed. For example: # fsck /dev/sda3
If it is corrupted for good, the only way would be to recover from backup if available. Or else the data is gone, and you'd need to create a new volume.
Volume creation in Kubernetes can be done manually. When you use options such as hostPath, awsElasticBlockStore, etc., under the volumes section of the pod definition, volume creation is static. In this case the volume, which must already exist, is assigned to the pod; Kubernetes will not create a new volume for the pod.
If you want volumes to be created dynamically, you must use a PersistentVolumeClaim under the volumes section of the pod definition, combined with Storage Classes. Storage Classes let you use provisioners such as AWSElasticBlockStore, AzureFile, AzureDisk, CephFS, GlusterFS, etc., which provision volumes on demand.
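A minimal sketch of the dynamic case: a claim that names a storage class, and a pod that mounts the claim. The class name standard, the object names, and the busybox image are assumptions; substitute whatever class `kubectl get storageclass` lists in your cluster.

```yaml
# Hypothetical names; "standard" is an assumed StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  storageClassName: standard   # the class's provisioner creates the PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim  # the pod refers to the claim, never to the PV
```

If the corrupted volume was provisioned this way with the Delete policy, deleting and recreating the claim is what yields a fresh volume.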
I have a Kubernetes Deployment app with 3 replicas, which needs a 7GB storage for each replica, I want to be able to attach a new empty azureDisk storage to be mounted into each pod/replica created in this deployment.
Basically I have the following restrictions:
I must use Deployment, not a Statefulset
Each time a pod dies and a new pod is up, it shouldn't have a state, and it will have a new empty azureDisk attached to it.
the pods do not share their storage, each pod has its own 7GB storage.
the pods need to use azureDisk because I need a 7GB storage on demand, which means, dynamically creating azureStorage when I scale my deployment replicas.
When using azureDisk, I need to use it with access mode ReadWriteOnce (as the docs say), and it will attach the disk to only 1 pod. That's fine, but it only works if I have 1 pod; if I have more than 1 pod, I can't use the same claim... Is there any way to dynamically ask for more storage like the one in the first claim?
NOTE 1: I know there is a volumeClaimTemplates, but that's only related to a Statefulset.
NOTE 2: I don't care if a pod restarts 100 times, and this in turn creates 100 PV which only 1 is used, that is fine.
I'm not sure why you can't use a StatefulSet, but the only way I see to do this is to create your own operator for your application. The operator would have a controller that manages your pods much like a ReplicaSet does, except that a new PVC is created for every new pod that is instantiated.
It might just be better to figure out how to run your application in a StatefulSet and use VolumeClaimTemplates
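For reference, the StatefulSet route could look roughly like this. It is a sketch under assumptions: the names, the busybox image, and the managed-csi class (a storage class name commonly present on AKS) are placeholders, and a headless Service named app is assumed to exist.

```yaml
# Hypothetical names; managed-csi is an assumed Azure Disk storage class.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app
spec:
  serviceName: app             # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: busybox:1.36
          command: ["sh", "-c", "sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:        # one PVC per replica: data-app-0, data-app-1, ...
    - metadata:
        name: data
      spec:
        storageClassName: managed-csi
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 7Gi
```

Note the trade-off against the stated requirements: each replica gets its own 7Gi disk on demand, but a restarted pod reattaches to its previous claim rather than receiving an empty disk.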
✌️
The main question is - Why? "if I have an application which doesn't have state, still I need a large volume for each pod"
Given this explanation, you should treat this as a stateful application. From my point of view, it looks like you are forcing a Deployment where a StatefulSet would fit.
In your example you probably need a PV that supports different access modes.
The main problem you have run into is that a PV with access mode ReadWriteOnce can be mounted read-write by only a single node at a time, so pods on other nodes will fail to start because the volume cannot be mounted. Sharing one claim across replicas only works in a ReadOnlyMany/ReadWriteMany scenario.
Please refer to other providers which have different capabilities for access modes like: filestore(gcp), AzureFile(azure), Glusterfs, NFS
Deployments vs. StatefulSets
I'm trying to get my head around Persistent Volumes & Persistent Volume Claims and how it should be done in Helm...
The TLDR version of the question is: How do I create a PVC in helm that I can attach future releases (whether upgrades or brand new installs) to?
My current understanding:
PV is an interface to a piece of physical storage.
PVC is how a pod claims the existence of a PV for its own use. When the pod is deleted, the PVC is also deleted, but the PV is maintained - and is therefore persisted. But then how I do use it again?
I know it is possible to dynamically provision PVs. Like with Google Cloud as an example if you create ONLY a PVC, it will automatically create a PV for you.
Now this is the part I'm stuck on...
I've created a helm chart that explicitly creates the PVC & thus has a dynamically created PV as part of a release. I then later delete the release, which will then also remove the PVC. The cloud provider will maintain the PV. On a subsequent install of the same chart with a new release... How do I reuse the old PV? Is there a way to actually do that?
I did find this question which kind of answers it... However, it implies that you need to pre-create PVs for each PVC you're going to need, and the whole point of the replicas & auto-scaling is that all of those should be generated on demand.
The use case is - as always - for test/dev environments where I want my data to be persisted, but I don't always want the servers running.
Thank you in advance! My brain's hurting a bit cause I just can't figure it out... >.<
It will be a headache indeed.
Let's start with how you should do it to achieve scalable deployments with RWO storage attached to individual pods when they are brought up. This is where volumeClaimTemplates come into play: PVCs are created dynamically as the workload scales (note that volumeClaimTemplates is a StatefulSet field, not a Deployment one). This suits the situation where a pod needs storage attached to it, but the storage is no longer needed once the pod goes away (the volume can be reused according to its reclaim policy).
If you need the data reattached when a pod fails, you should think of StatefulSets, which solve that part at least.
Now, if you precreate PVC explicitly, you have more control over what happens, but dynamic scalability will have problems with this for RWO. This and manual PV management as in the response you linked can actually achieve volume reuse, and it's the only mechanism that would allow it that I can think of.
After you hit a wall like this, it's time to think about alternatives. For example, why not use a StatefulSet, which gives you storage retention in a running cluster, and instead of deleting the chart, set all its replicas to 0? That keeps the non-compute resources in place while scaling the workload down to nothing. Then, when you scale up, the still-bound PVCs should get reattached to the rescaled pods.
I'm running a MySQL deployment on Kubernetes however seems like my allocated space was not enough, initially I added a persistent volume of 50GB and now I'd like to expand that to 100GB.
I already saw that a persistent volume claim is immutable after creation, but can I somehow just resize the persistent volume and then recreate my claim?
Yes, as of 1.11, persistent volumes can be resized on certain cloud providers. To increase volume size:
Edit the PVC (kubectl edit pvc $your_pvc) to specify the new size. The key to edit is spec.resources.requests.storage:
Terminate the pod using the volume.
Once the pod using the volume is terminated, the filesystem is expanded and the size of the PV is increased. See the above link for details.
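As a sketch of the edit described above, the only field that changes in the claim is the storage request. The claim name mysql-data is an assumption; the sizes come from the question.

```yaml
# Hypothetical claim name; only the storage request is changed.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi   # previously 50Gi; a request can be raised, not lowered
```

After saving, `kubectl describe pvc` on the claim shows whether the resize is in progress or waiting on a pod restart.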
It is possible in Kubernetes 1.9 (alpha in 1.8) for some volume types: gcePersistentDisk, awsElasticBlockStore, Cinder, glusterfs, rbd
It requires enabling the PersistentVolumeClaimResize admission plug-in and storage classes whose allowVolumeExpansion field is set to true.
See official docs at https://kubernetes.io/docs/concepts/storage/persistent-volumes/#expanding-persistent-volumes-claims
Update: volume expansion is available as a beta feature starting Kubernetes v1.11 for in-tree volume plugins. It is also available as a beta feature for volumes backed by CSI drivers as of Kubernetes v1.16.
If the volume plugin or CSI driver for your volume support volume expansion, you can resize a volume via the Kubernetes API:
Ensure volume expansion is enabled for the StorageClass (allowVolumeExpansion: true is set on the StorageClass) associated with your PVC.
Request a change in volume capacity by editing your PVC (spec.resources.requests).
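The StorageClass precondition from the first step can be sketched as follows. The class name, the GCE PD CSI provisioner, and the pd-ssd type are assumptions chosen for illustration; the relevant line is allowVolumeExpansion.

```yaml
# Hypothetical class; provisioner and parameters depend on your platform.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
allowVolumeExpansion: true   # without this, PVC size increases are rejected
```

You can check an existing class with `kubectl get storageclass`, which lists an ALLOWVOLUMEEXPANSION column.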
For more information, see:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#expanding-persistent-volumes-claims
https://kubernetes-csi.github.io/docs/volume-expansion.html
No, Kubernetes does not support automatic volume resizing yet.
Disk resizing is an entirely manual process at the moment.
Assuming that you created a Kubernetes PV object with a given capacity and the PV is bound to a PVC, and then attached/mounted to a node for use by a pod. If you increase the volume size, pods would continue to be able to use the disk without issue, however they would not have access to the additional space.
To enable the additional space on the volume, you must manually resize the partitions. You can do that by following the instructions here. You'd have to delete the pods referencing the volume first, wait for it to detach, then manually attach/mount the volume to some VM instance you have access to, and run through the required steps to resize it.
Opened issue #35941 to track the feature request.
There is some support for this in 1.8 and above, for some volume types, including gcePersistentDisk and awsElasticBlockStore, if certain experimental features are enabled on the cluster.
For other volume types, it must be done manually for now. In addition, support for doing this automatically while pods are online (nice!) is coming in a future version (currently slated for 1.11):
For now, these are the steps I followed to do this manually with an AzureDisk volume type (for managed disks) which currently does not support persistent disk resize (but support is coming for this too):
Ensure PVs have reclaim policy "Retain" set.
Delete the stateful set and related pods. Kubernetes should release the PVs, even though the PV and PVC statuses will remain Bound. Take special care for stateful sets that are managed by an operator, such as Prometheus -- the operator may need to be disabled temporarily. It may also be possible to use Scale to do one pod at a time. This may take a few minutes, be patient.
Resize the underlying storage for the PV(s) using the Azure API or portal.
Mount the underlying storage on a VM (such as the Kubernetes master) by adding them as a "Disk" in the VM settings. In the VM, use e2fsck and resize2fs to resize the filesystem on the PV (assuming an ext3/4 FS). Unmount the disks.
Save the JSON/YAML configuration of the associated PVC.
Delete the associated PVC. The PV should change to status Released.
Edit the YAML config of the PV, after which the PV status should be Available:
specify the new volume size in spec.capacity.storage,
remove the spec.claimRef uid and resourceVersion fields, and
remove status.phase.
Edit the saved PVC configuration:
remove the metadata.resourceVersion field,
remove the metadata pv.kubernetes.io/bind-completed and pv.kubernetes.io/bound-by-controller annotations, and
change the spec.resources.requests.storage field to the updated PV size, and
remove all fields inside status.
Create a new resource using the edited PVC configuration. The PVC should start in Pending state, but both the PV and PVC should transition relatively quickly to Bound.
Recreate the StatefulSet and/or change the stateful set configuration to restart pods.
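After the edits in the steps above, the two objects might look roughly like this. Every name, the namespace, the storage class, the sizes, and the quoted placeholders are assumptions; the point is which fields remain and which are removed.

```yaml
# Hypothetical names and sizes; quoted placeholders must be filled in.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data-0
spec:
  capacity:
    storage: 200Gi                 # new size, matching the resized disk
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: managed-premium
  claimRef:                        # uid and resourceVersion removed
    namespace: monitoring
    name: data-app-0
  azureDisk:
    kind: Managed
    diskName: "<disk-name>"
    diskURI: "<disk-resource-id>"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-app-0                 # resourceVersion and pv.kubernetes.io/*
  namespace: monitoring            # annotations removed from the saved copy
spec:
  storageClassName: managed-premium
  volumeName: pv-data-0
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi               # updated to the new PV size
```

Keeping the claimRef name and namespace while dropping the uid is what lets the recreated claim bind to the same volume again.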
In terms of PVC/PV 'resizing', that's still not supported in k8s, though I believe it could potentially arrive in 1.9
It's possible to achieve the same end result by dealing with PVC/PV and (e.g.) GCE PD though..
For example, I had a gitlab deployment, with a PVC and a dynamically provisioned PV via a StorageClass resource. Here are the steps I ran through:
Take a snapshot of the PD (provided you care about the data)
Ensure the ReclaimPolicy of the PV is "Retain", patch if necessary as detailed here: https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/
kubectl describe pv <name-of-pv> (useful when creating the PV manifest later)
Delete the deployment/pod (probably not essential, but seems cleaner)
Delete PVC and PV
Ensure PD is recognised as being not in use by anything (e.g. google console, compute/disks page)
Resize PD with cloud provider (with GCE, for example, this can actually be done at an earlier stage, even if the disk is in use)
Create a k8s PersistentVolume manifest (this had previously been done dynamically via the StorageClass resource). In the PersistentVolume YAML spec, I had "gcePersistentDisk: pdName: <name-of-pd>" defined, along with other details that I'd grabbed at step 3. Make sure you update spec.capacity.storage to the new capacity you want the PV to have (although it is not essential and has no effect here, you may want to update the storage value in your PVC manifest too, for posterity).
kubectl apply (or equivalent) to recreate your deployment/pod, PVC and PV
note: some steps may not be essential, such as deleting some of the existing deployment/pod.. resources, though I personally prefer to remove them, seeing as I know the ReclaimPolicy is Retain, and I have a snapshot.
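The hand-written PersistentVolume from the manifest step could look something like the sketch below. The PV name, capacity, storage class, and fsType are assumptions; pdName keeps the placeholder from the answer and must be set to the real disk name.

```yaml
# Hypothetical values; copy the real ones from `kubectl describe pv`.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gitlab-data
spec:
  capacity:
    storage: 100Gi               # the new, larger capacity
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard     # must match the class on the PVC
  gcePersistentDisk:
    pdName: "<name-of-pd>"
    fsType: ext4
```

Because this PV points at the existing disk rather than asking a provisioner for a new one, the data on the disk is preserved.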
The first thing you can do is check the storage class you are using and see whether allowVolumeExpansion is set to true. If it is, simply update the PVC with the requested volume size and check the status in the PVCs.
If this doesn't work for you, then try this (for AWS users).
Check for the attached volume id in the PV (under awsElasticBlockStore -> volume).
Go to Volumes in AWS, and modify the volume to whatever is required
SSH into the node to which the volume is currently attached (to find the node name, describe the pod and check the node key)
use lsblk to list the volume attached
Run resize2fs or xfs_growfs based on what type of volume you have.
exec into the pod run df -h and check the volume.
Note: You can only modify a volume once in 6 hours.
Edit the PVC (kubectl edit pvc $your_pvc) to specify the new size. The key to edit is spec.resources.requests.storage:
Even though this answer worked quite well for one PVC of my StatefulSet, the others didn't manage to resize. I guess it's because the pods restarted too quickly, leaving no time for the resizing process to start due to the backoff. In fact, the pods started fast but took some time to be considered ready (increasing backoff).
Here's my workaround:
Update the pvc
Backup the sts spec
kubectl get sts <sts-name> -o yaml > sts.yaml
Then delete the sts with cascade=orphan. Thus, the pods will still run
kubectl delete sts --cascade=orphan <sts-name>
Then delete one of the pods whose PVC wouldn't resize
kubectl delete pod <pod-name>
Wait for the pvc to resize
kubectl get pvc -w
Reapply the sts so the pod comes back
kubectl apply -f sts.yaml
Wait for the pod to come back
Repeat until all pvc are resized!
Below is how we can expand the volume size of azure disks mounted on statefulset(STS) pod when storage class is used.(AWS EBS and GCP Persistent volumes should be similar).
Summary:
Delete the statefulset.
Update the volume size on the PVC. Wait till the condition message prompts to start up the pods.
Apply new statefulset with updated volume size and you should see the volume getting resized when the pod starts up.
Complete Steps:
Check if volume resize is enabled in the storage class.
kubectl get storageclass
First, delete the statefulset. This is required because
The volumes should be unmounted and detached from the node before it can be resized.
The volume size in the STS YAML is immutable (it can't be updated).
We will have to create a new STS with a higher volume size later on. Don't forget to back up the STS YAML if you don't have it in your repos.
After deleting the STS, wait for some time so that k8s can detach the volume from the node.
Next, modify the PVC with higher value for volume size.
At this point, if the volume is still attached, you will see below warning message in the PVC events.
Either the volume is still mounted to the pod or you just have to wait and give some time to k8s.
Next, run the describe command on the PVC, you should now see a message(in conditions) prompting you to start up the pod.
kubectl describe pvc app-master-volume-app-master-0
In the earlier step, we had deleted the statefulset. Now we need to create and apply a new STS with higher volume size. This should match the value earlier modified in the PVC spec.
When the new pod gets created, you will see pod event like shown below which indicates that the volume resize is successful.
Yes, it can be, after version 1.8, have a look at volume expansion here
Volume expansion was introduced in v1.8 as an Alpha feature
I have persistent volume with self created StorageClass (allowVolumeExpansion: true).
PV spec: accessMode: readWriteOnce
PVC spec: same
When I upgrade PV, changes are not reflected in PVC.