How do you reuse a volume in Kubernetes?

Let's say that you wanted to create a Jenkins Deployment. As Jenkins uses a local XML file for configuration and state, you would want to create a PersistentVolume so that your data could be saved across Pod evictions and Deployment deletions. I know that the Retain reclaimPolicy will result in the data persisting on the detached PersistentVolume, but the documentation says this is just so that you can manually reclaim the data on it later on, and seems to say nothing about the volume being automatically reused if its mounting Pods are ever brought back up.
It is difficult to articulate what I am even trying to ask, so forgive me if this seems like a nebulous question, but:
If you delete the Jenkins deployment, then later decide to recreate it where you left off, how do you get it to re-mount that exact PersistentVolume on which that specific XML configuration is still stored?
Is this a case where you would want to use a StatefulSet? It seems like, in this case, Jenkins would be considered "stateful."
Is the PersistentVolumeClaim the basis of a volume's "identity"? In other words, is the expectation for the PersistentVolumeClaim to be the stable identifier by which an application can bind to a specific volume with specific data on it?

You can use StatefulSets. Scaling down deletes the Pods but leaves the claims alone; PersistentVolumeClaims can only be deleted manually, in order to release the underlying PersistentVolume.
A scale-up can then reattach the same claim, along with the bound PersistentVolume and its contents, to the newly created Pod instance.
If you have accidentally scaled down a StatefulSet, you can scale up again and the new Pod will have the same persisted state.
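For illustration, here is a minimal StatefulSet sketch (the image tag, storage size, and the assumption of an existing headless Service named jenkins are placeholders, not from the question). The volumeClaimTemplates section gives each replica its own PVC, which survives scale-downs:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jenkins
spec:
  serviceName: jenkins            # assumes a headless Service called "jenkins" exists
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
  volumeClaimTemplates:           # creates one PVC per replica, e.g. jenkins-home-jenkins-0
  - metadata:
      name: jenkins-home
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

Scaling the replicas down and back up reuses the same jenkins-home-jenkins-0 claim, so the XML configuration comes back with it.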

If you delete the Jenkins deployment, then later decide to recreate it where you left off, how do you get it to re-mount that exact PersistentVolume on which that specific XML configuration is still stored?
By using the PersistentVolumeClaim that was bound to that PersistentVolume, assuming the PersistentVolumeClaim and its PersistentVolume haven't been deleted. You should be able to try it :-)
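As a sketch (the claim name jenkins-home and the mount path are assumptions), the recreated Deployment simply references the surviving PVC by name in its Pod template, which re-mounts the same data:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
      volumes:
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: jenkins-home   # the pre-existing PVC still bound to the old PV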
Is this a case where you would want to use a StatefulSet? It seems like, in this case, Jenkins would be considered "stateful."
Yes, you could use StatefulSet for its stable storage. With no need for persistent identities and stable hostnames, though, I'm not sure of the benefits compared to a master and dynamic slaves Deployment. Unless the idea is to partition the work (e.g. "areas" of the source control repo) across several Jenkins masters and their slaves...
Is the PersistentVolumeClaim the basis of a volume's "identity"? In other words, is the expectation for the PersistentVolumeClaim to be the stable identifier by which an application can bind to a specific volume with specific data on it?
Yes (see my answer to the first question) - the PersistentVolumeClaim is like a stable identifier by which an application can mount the specific volume the claim is bound to.

Related

Restore a single StatefulSet pod's PersistentVolume using a VolumeSnapshot

Hi, we're looking for a way to restore a VolumeSnapshot to a single Pod of a StatefulSet without scaling down the StatefulSet.
I think it might be doable by deleting the PersistentVolumeClaim, the PersistentVolume, and then the Pod, and making sure a new PVC is available with the right name.
Is there a more direct way to do this? In Kubernetes, is it safe to reprovision things like PVs by destroying them and expecting the controller to recreate the resources (is this approach fundamentally safe)?
Thanks.
The answer: yes, it is. There are a couple of approaches:
The approach above: prepare the PVC ahead of time as described, or
Set the dataSource field in the StatefulSet's volumeClaimTemplates to a VolumeSnapshot, as sketched below. To restore a replica, simply delete the PVC and the Pod; when the Pod is reinitialized it will restore from the snapshot. You will need to delete and recreate this VolumeSnapshot to update the "active" snapshot, however, since it's not possible to modify volumeClaimTemplates after the StatefulSet is created.
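A rough sketch of the second approach (the snapshot name, StorageClass, and size are assumptions, and the class must be backed by a CSI driver with snapshot support); this is just the volumeClaimTemplates portion of the StatefulSet spec:

volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: ["ReadWriteOnce"]
    storageClassName: standard          # assumed snapshot-capable CSI StorageClass
    dataSource:
      apiGroup: snapshot.storage.k8s.io
      kind: VolumeSnapshot
      name: my-active-snapshot          # hypothetical "active" snapshot to restore from
    resources:
      requests:
        storage: 10Gi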

Do I have to recreate already bound PersistentVolumes after reconfiguring StorageClass

I have created several PersistentVolumes through PersistentVolumeClaims on the "azure-file" StorageClass in AKS. The mount options of the StorageClass as delivered by Azure didn't fit our needs, so I had to update/reconfigure it with different mountOptions.
Do I have to manually destroy bound PersistentVolumes now to force a recreation and a reconfiguration (different mount) or is the provisioner taking care of that?
What would be the best way of forcing that?
Delete the PersistentVolume itself?
Delete the Claim?
Delete the Pods where the volumes are bound (I guess not)
Delete and recreate the whole StatefulSet?
#SahadatHossain is right with his answer but I would like to expand it a bit with more details and sources.
It is important to understand the Lifecycle of a volume and claim. The interaction between PVs and PVCs follows this lifecycle:
Provisioning - which can be static or dynamic.
Binding
Using
Reclaiming
The Reclaiming step brings us to your actual use case:
When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted.
Retain - The Retain reclaim policy allows for manual reclamation of the resource.
Delete - For volume plugins that support the Delete reclaim policy, deletion removes both the PersistentVolume object from Kubernetes, as well as the associated storage asset in the external infrastructure.
Recycle - If supported by the underlying volume plugin, the Recycle reclaim policy performs a basic scrub (rm -rf /thevolume/*) on the volume and makes it available again for a new claim. Warning: The Recycle reclaim policy is deprecated. Instead, the recommended approach is to use dynamic provisioning.
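As a hedged illustration (the class name, provisioner, and mount options are assumptions), the policy for dynamically provisioned volumes comes from the StorageClass, and the policy of an already provisioned PV can also be changed in place:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-file-retain
provisioner: file.csi.azure.com     # assumed Azure Files CSI provisioner
mountOptions:
- dir_mode=0777                     # placeholder mount options
reclaimPolicy: Retain               # PVs provisioned from this class keep their data after PVC deletion

# change the reclaim policy of an existing PV without recreating it
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'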
When it comes to updating Pod specs, you can consider updating the Deployment (if possible) with one of the available update strategies, for example a Rolling Update:
The Deployment updates Pods in a rolling update fashion when .spec.strategy.type==RollingUpdate. You can specify maxUnavailable and maxSurge to control the rolling update process.
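For example, a minimal strategy block (the numbers are illustrative, not a recommendation):

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # keep every old Pod running until its replacement is ready
      maxSurge: 1         # allow one extra Pod during the rollout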
Basically, if you delete a PVC, what happens to the PV is determined by its reclaimPolicy. A PV can have three reclaim policies: Retain, Recycle, and Delete.
With Delete, the PV is deleted automatically when its PVC is deleted. But remember that a PV cannot be deleted before its bound PVC is deleted, and for dynamic provisioning the default policy is Delete. Also, a PVC cannot be deleted while any Pod is still using it.
From there, it depends on what you need.

Attach new azure disk volume per pod in Kubernetes deployment

I have a Kubernetes Deployment with 3 replicas, each of which needs 7GB of storage. I want a new, empty azureDisk volume to be attached and mounted into each pod/replica created in this deployment.
Basically I have the following restrictions:
I must use Deployment, not a Statefulset
Each time a pod dies and a new pod comes up, it shouldn't carry any state, and it will get a new, empty azureDisk attached to it.
The pods do not share their storage; each pod has its own 7GB of storage.
The pods need to use azureDisk because I need 7GB of storage on demand, which means dynamically creating Azure disks when I scale my deployment's replicas.
When using azureDisk, I need to use it with access mode ReadWriteOnce (as the docs say), and it will attach only 1 pod to the disk. That's fine, but it only works if I have 1 pod; if I have more than 1 pod, I can't use the same claim... Is there any way to dynamically ask for more storage like the one in the first claim?
NOTE 1: I know there is volumeClaimTemplates, but that's only available in a StatefulSet.
NOTE 2: I don't care if a pod restarts 100 times and this in turn creates 100 PVs of which only 1 is used; that is fine.
I'm not sure why you can't use a StatefulSet, but the only way I see to do this is to create your own operator for your application. The operator would have a controller that manages your Pods similarly to what a ReplicaSet does, with the exception that for every new Pod that is instantiated a new PVC is created.
It might just be better to figure out how to run your application in a StatefulSet and use volumeClaimTemplates, as sketched below.
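A rough sketch of that StatefulSet alternative (the app name, image, and the managed-csi StorageClass are assumptions; any Azure disk class with a dynamic provisioner works the same way):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: myapp                 # assumes a headless Service named "myapp"
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:latest          # hypothetical image
        volumeMounts:
        - name: scratch
          mountPath: /data
  volumeClaimTemplates:              # each replica gets its own dynamically provisioned 7GB Azure disk
  - metadata:
      name: scratch
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: managed-csi  # assumed AKS managed-disk StorageClass
      resources:
        requests:
          storage: 7Gi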
✌️
The main question is: why? "If I have an application which doesn't have state, I still need a large volume for each pod."
Looking at this explanation, you should focus on stateful applications. From my point of view, it looks like you are forcing yourself to use a Deployment instead of a StatefulSet for a stateful application.
In your example you probably need a PV which supports different access modes.
The main problem you have experienced is that a PV with access mode ReadWriteOnce can be mounted by only a single node at a time, so your pods on other nodes will not start because the volume mount fails. You can share one volume across pods only in a ReadOnlyMany/ReadWriteMany scenario.
Please refer to other providers which have different capabilities for access modes, like: Filestore (GCP), AzureFile (Azure), GlusterFS, NFS.
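For instance, a ReadWriteMany claim on Azure Files (the azurefile class name is an assumption) could be mounted by all replicas of a plain Deployment:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes: ["ReadWriteMany"]   # can be mounted read-write by many nodes/pods at once
  storageClassName: azurefile      # assumed built-in AKS Azure Files class
  resources:
    requests:
      storage: 7Gi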
Deployments vs. StatefulSets

What happens with recreate strategy for database deployment and Kubernetes?

The setup is on GCP GKE. I deploy a Postgres database with a persistent volume (retain reclaim policy), and:
strategy:
  type: Recreate
Will the data be retained or re-initialized if the database pod gets deleted?
The update strategy has nothing to do with the on-delete behavior. It's used when a change to the pod template triggers an update: basically, does it nuke the old ReplicaSet all at once, or gradually scale things up/down. You almost always want RollingUpdate, unless you are working with software that requires all nodes to be on exactly the same version and you understand this will cause downtime on any change.
As for the Retain reclaim policy, this is mostly a safety net for admins. Assuming you used a PVC, deleting the pod will have no effect on the data, since the volume is tied to the claim rather than the pod itself (obviously things will go down while the pod restarts, but that's unrelated). If you delete the PVC, a Retain volume will be kept on the backend, but if you wanted to do anything with it you would have to go in and do it manually. It's like "oops" protection: it requires two steps to actually delete the data.
The update strategy has nothing to do with the on-delete behavior. <...>
deleting the pod will have no effect on the data since the volume is tied to the claim rather than the pod itself
I totally agree with coderanger, you should consider the data from Postgres independently. Usually, people create a separate volume (with the PVC) mounted on /usr/local/pgsql/data. When you delete/re-create a new Postgres pod, you still claim the same volume to mount it back without affecting your data.
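A minimal sketch of that layout (the claim name, image tag, and password handling are placeholders; the official postgres image keeps its data under /var/lib/postgresql/data):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  strategy:
    type: Recreate                   # stop the old pod before starting the new one
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15           # assumed image tag
        env:
        - name: POSTGRES_PASSWORD
          value: example             # placeholder; use a Secret in practice
        volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: pgdata
        persistentVolumeClaim:
          claimName: postgres-data   # hypothetical PVC bound to a Retain-policy volume

Deleting and recreating the pod or the Deployment leaves postgres-data and its bound PV untouched.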

Persistent Volumes & Claims & Replicas in Helm recommended approach

I'm trying to get my head around Persistent Volumes & Persistent Volume Claims and how it should be done in Helm...
The TLDR version of the question is: How do I create a PVC in helm that I can attach future releases (whether upgrades or brand new installs) to?
My current understanding:
PV is an interface to a piece of physical storage.
PVC is how a pod claims the existence of a PV for its own use. When the pod is deleted, the PVC is also deleted, but the PV is maintained - and is therefore persisted. But then how do I use it again?
I know it is possible to dynamically provision PVs. With Google Cloud, for example, if you create ONLY a PVC, it will automatically create a PV for you.
Now this is the part I'm stuck on...
I've created a helm chart that explicitly creates the PVC & thus has a dynamically created PV as part of a release. I then later delete the release, which will then also remove the PVC. The cloud provider will maintain the PV. On a subsequent install of the same chart with a new release... How do I reuse the old PV? Is there a way to actually do that?
I did find this question which kind of answers it... However, it implies that you need to pre-create PVs for each PVC you're going to need, and the whole point of the replicas & auto-scaling is that all of those should be generated on demand.
The use case is - as always - for test/dev environments where I want my data to be persisted, but I don't always want the servers running.
Thank you in advance! My brain's hurting a bit cause I just can't figure it out... >.<
It will be a headache indeed.
Let's start with how you would achieve scalable deployments with RWO storage that gets attached to your individual pods when they are brought up. This is where volumeClaimTemplates come into play: you can have a PVC created dynamically for each replica as you scale. This suits the situation where a pod needs storage attached while it runs, but the data is not really needed any longer once the pod goes away (the volume can be reused following the reclaim policy).
If you need that data reattached when a pod fails, you should think of StatefulSets, which solve at least that part.
Now, if you pre-create the PVC explicitly, you have more control over what happens, but dynamic scalability will have problems with this for RWO volumes. This, combined with manual PV management as in the answer you linked, can actually achieve volume reuse, and it's the only mechanism I can think of that would allow it.
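One hedged sketch of that manual route (the PV name, claim name, size, and class are placeholders): clear the claimRef of the Released PV so it becomes Available again, then create a PVC that targets it by name:

kubectl patch pv <pv-name> -p '{"spec":{"claimRef":null}}'   # Released -> Available

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data                  # the name the next Helm release will reference
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard     # must match the PV's own storageClassName
  volumeName: <pv-name>          # bind explicitly to the retained PV
  resources:
    requests:
      storage: 10Gi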
After you hit a wall like this, it's time to think about alternatives. For example, why not use a StatefulSet that gives you storage retention in the running cluster, and instead of deleting the chart, set all its replicas to 0, retaining the non-compute resources in place while scaling the workload down to nothing. Then, when you scale up again, the still-bound PVCs get reattached to the recreated pods.
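In practice that is just a pair of commands (the StatefulSet name is a placeholder):

kubectl scale statefulset my-release-app --replicas=0   # pods go away; PVCs and PVs stay bound
kubectl scale statefulset my-release-app --replicas=3   # pods come back and re-mount the same PVCs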