How to manually recover a PV - kubernetes

According to the official docs https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/, with the “Retain” policy a PV can be manually recovered. What does that actually mean? Is there a tool with which I can read the data from that "retained" PV and write it into another PV, or does it mean you can mount that volume manually in order to gain access?

The process to manually recover the volume is as below.
You can mount the same PV, with its data, into a different pod even after the PVC is deleted (the PV must still exist, which it typically will if the reclaim policy of the StorageClass is Retain).
Verify that the PV is in the Released state (i.e. no PVC currently claims it).
➜ ~ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-eae6acda-59c7-11e9-ab12-06151ee9837e 16Gi RWO Retain Released default/dhanvi-test-pvc gp2 52m
Edit the PV (kubectl edit pv pvc-eae6acda-59c7-11e9-ab12-06151ee9837e) and remove the spec.claimRef section. The claim on the PV will then be unset, as shown below.
➜ ~ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-eae6acda-59c7-11e9-ab12-06151ee9837e 16Gi RWO Retain Available gp2 57m
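If you prefer not to open an editor, the same claimRef removal can, as far as I know, be done with a single patch (the PV name below is the example one from above):

kubectl patch pv pvc-eae6acda-59c7-11e9-ab12-06151ee9837e \
  --type=merge -p '{"spec":{"claimRef":null}}'   # drops claimRef so the PV becomes Available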
Then claim the PV using PVC as below.
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: dhanvi-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 16Gi
  volumeName: "pvc-eae6acda-59c7-11e9-ab12-06151ee9837e"
Can be used in the pods as below.
volumes:
  - name: dhanvi-test-volume
    persistentVolumeClaim:
      claimName: dhanvi-test-pvc
Update: Volume cloning might help https://kubernetes.io/blog/2019/06/21/introducing-volume-cloning-alpha-for-kubernetes/
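For reference, a clone is requested by creating a new PVC whose dataSource points at an existing PVC in the same namespace; this needs a CSI driver that supports cloning, and the names below are only placeholders, a sketch:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: dhanvi-test-pvc-clone        # hypothetical name for the clone
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2              # must match the source PVC's class (and the driver must support cloning)
  resources:
    requests:
      storage: 16Gi
  dataSource:
    kind: PersistentVolumeClaim
    name: dhanvi-test-pvc            # existing PVC to clone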

There are three reclaim policies which define what happens with the persistent volume after deletion of the bound volume claim:
Retain
Delete
Recycle
Delete means the persistent volume, as well as the associated storage asset in the external infrastructure, is deleted.
Recycle will clean up the volume (rm -rf /thevolume/*), after which it becomes available for new persistent volume claims.
Retain leaves the persistent volume in the Released state, which does not allow new persistent volume claims to reclaim it. The whole reclaim process is manual: you need to delete the persistent volume yourself. You can back up the data from the storage asset and delete the data afterwards. Then you can either delete the storage asset or create a new persistent volume for this asset.
If you want to write the data to another persistent volume using Kubernetes you could use a Job to copy the data.
In that case, make sure the persistent volume access mode is ROX (ReadOnlyMany) or RWX (ReadWriteMany), then start a Job running a container that claims the persistent volume to be backed up and also claims a destination backup volume, and copy the data inside that container, as sketched below.
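A minimal sketch of such a copy Job, assuming a source claim named old-pvc and a destination claim named new-pvc (both names are placeholders) that can be mounted by the same pod:

apiVersion: batch/v1
kind: Job
metadata:
  name: pv-copy
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: copy
          image: busybox
          # copy everything, preserving ownership and permissions
          command: ["sh", "-c", "cp -a /source/. /destination/"]
          volumeMounts:
            - name: source
              mountPath: /source
            - name: destination
              mountPath: /destination
      volumes:
        - name: source
          persistentVolumeClaim:
            claimName: old-pvc
            readOnly: true
        - name: destination
          persistentVolumeClaim:
            claimName: new-pvc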
Alternatively, you can do the backup outside Kubernetes. The method then depends on the type of storage asset you are using: e.g., if you are using NFS, you could mount source and destination and copy the data on the command line.
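For the NFS case, the copy could look roughly like this from any machine that can reach both exports (hosts and paths are placeholders):

mkdir -p /mnt/source /mnt/destination
mount -t nfs old-nfs-server:/export/source /mnt/source
mount -t nfs new-nfs-server:/export/destination /mnt/destination
cp -a /mnt/source/. /mnt/destination/
umount /mnt/source /mnt/destination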
Both options described are more or less manual backup strategies. If you aim for a more sophisticated backup strategy for production workloads, you might have a look at Stash - Backup for your disks for production workloads in Kubernetes.

As stated in the answer by Tummala Dhanvi, the spec.claimRef section has to be dealt with. While removing the whole spec.claimRef can work if you have only one PV, this will prove very nasty if you have multiple PVs to be "rescued".
The first step is to ensure the PV has the Retain reclaim policy before deleting the PVC. You can edit or patch the PV to achieve that
kubectl edit pv pvc-73e9252b-67ed-4350-bed0-7f27c92ce826
find the spec.persistentVolumeReclaimPolicy key
set its value to Retain
save & exit
or, in one command kubectl patch pv pvc-73e9252b-67ed-4350-bed0-7f27c92ce826 -p "{\"spec\":{\"persistentVolumeReclaimPolicy\":\"Retain\"}}"
Now you can delete the PVC(s) (either by using helm or otherwise) and the PV(s) will not be deleted.
To successfully re-mount a PV to the desired pod you have to edit the PV configuration once again, this time the spec.claimRef section. But do not delete the whole section. Instead, delete only the resourceVersion and uid keys. The resulting section would then look something like this
...
capacity:
  storage: 16Gi
claimRef:
  apiVersion: v1
  kind: PersistentVolumeClaim
  name: database
  namespace: staging
nodeAffinity:
...
Repeat this for all of your PVs and their status in the kubectl get pv output will be Available afterwards. By leaving the spec.claimRef.name and spec.claimRef.namespace keys intact, we ensured that a new PVC with the corresponding spec (staging/database in my case), will be bound to the exact PV it is supposed to.
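If you have many PVs, the same change can be scripted with a JSON patch instead of kubectl edit (the PV name is a placeholder):

kubectl patch pv pvc-73e9252b-67ed-4350-bed0-7f27c92ce826 --type=json \
  -p '[{"op":"remove","path":"/spec/claimRef/resourceVersion"},{"op":"remove","path":"/spec/claimRef/uid"}]'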
Also, make sure your new claim does not specify a larger storage capacity than the PV actually has (it seems though that the new claims' capacity may be less than the existing PV's). If the new PVC claims a larger storage, a new PV will be created instead. Best to keep it the same.
To digress: If the storageClass you're using allows for volume resizing, you can resize it later; here it's explained how: https://kubernetes.io/blog/2018/07/12/resizing-persistent-volumes-using-kubernetes/
My experience with this was pretty stressful. I had 6 PVs, thankfully in Retain mode. For some reason a new deployment rollout got stuck and two pods just wouldn't terminate. I ended up deleting the whole deployment (using helm), restarting the cluster nodes, and then redeploying anew. This caused 6 new PVs to be created!
I found this thread, and went on to delete the spec.claimRef of all the PVs. Deleting and deploying the installation once again resulted in the PVs being reused, but they were not mounted where they were supposed to be, and the data was not there. After a good amount of digging, I figured out that the database volume was mounted into a RabbitMQ pod, MongoDB was mounted into Elasticsearch, etc.
It took me about a dozen attempts to get this right. Luckily, the mixed-up mounting of volumes did not destroy any of the original data; the pods' initialization did not clean out the volumes, they just wrote their own stuff there.
Hope this saves some serious headaches out there!

I found this question for slightly different reasons than the existing answers address, so I'll weigh in.
In my case I had a self-managed Microk8s cluster for dev and testing purposes with
manual, local storage (mounted path on one of the Nodes), like so:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-1
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data-1"
There was also a PVC and a Deployment using this PV. Then I nuked the namespace where these resources lived by accident.
What I wanted to achieve is to recreate the whole namespace and have it use this PersistentVolume with the same data that was still present on the server.
In my case of local, manual storage all it took was just to delete the PV and create it again. Deleting the PV does NOT delete the actual data on the server (at least with the Retain policy). Recreating the PV with a mount into a path where data already exists also works fine - the data is just used.
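In shell terms that recovery was roughly the following, assuming the PV manifest above is saved as pv-1.yaml (the filename is mine):

kubectl delete pv pv-1          # the hostPath data at /mnt/data-1 on the node is left untouched
kubectl apply -f pv-1.yaml      # recreate the PV pointing at the same path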

Related

GKE can't scale up nodes because of a PersistentVolume

I'm getting a strange problem on my Terraformed GKE cluster.
I have a deployment that requests a GCE PersistentVolume through a PVC; when it gets created, I get a "Can't scale up nodes" notification in my GCloud console.
If I inspect the log, it says:
reason: {
  messageId: "no.scale.up.mig.failing.predicate"
  parameters: [
    0: ""
    1: "pod has unbound immediate PersistentVolumeClaims"
Without creating this deployment, I have no Scale UP error at all.
The PVC in question :
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
    - kubernetes.io/pvc-protection
  name: nfs
  namespace: nfs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
  volumeMode: Filesystem
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound
My deployment is running fine and the PV is directly created and bound to my PVC.
So I find this "Can't scale up nodes" message really strange.
(It's a single zone cluster, with a single NodePool).
Any idea for me ?
Thanks a lot
I'm having the same problem. It is weird because when you create a PVC in GKE, the PV is created dynamically (and indeed it is), so you check with kubectl get pv,pvc --all-namespaces and everything seems normal. But it appears that when a deployment that uses a PVC is created, this error shows up while the PVC is still being provisioned, and the cluster records it and displays the alert (creating some false-positive alerts). It seems like a timing issue.
One workaround is to change the value of the storageClassName definition. If instead of standard you use standard-rwo (both appear as defaults in the Storage Classes tab under Storage), the problem seems to disappear. The consequence is that the type of the underlying disk changes from Standard persistent disk to Balanced persistent disk; the latter performs better anyway.
EDIT:
It is about Storage Classes. The volumeBindingMode of the default standard class is Immediate. According to the documentation:
The Immediate mode indicates that volume binding and dynamic
provisioning occurs once the PersistentVolumeClaim is created. For
storage backends that are topology-constrained and not globally
accessible from all Nodes in the cluster, PersistentVolumes will be
bound or provisioned without knowledge of the Pod's scheduling
requirements. This may result in unschedulable Pods.
A cluster administrator can address this issue by specifying the
WaitForFirstConsumer mode which will delay the binding and
provisioning of a PersistentVolume until a Pod using the
PersistentVolumeClaim is created. PersistentVolumes will be selected
or provisioned conforming to the topology that is specified by the
Pod's scheduling constraints. These include, but are not limited to,
resource requirements, node selectors, pod affinity and anti-affinity,
and taints and tolerations.
So, if all the properties of the standard StorageClass need to be kept, another solution is to create another StorageClass (a sketch of such a class follows these steps):
Download the YAML of the standard Storage class
Change the name definition
Change the property from volumeBindingMode: Immediate to volumeBindingMode: WaitForFirstConsumer.
Apply it (kubectl apply -f <file path> )
Then, in the storageClassName definition of the PVC, use the name from step 2.
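A minimal sketch of such a class, assuming the in-tree GCE PD provisioner (newer GKE clusters may use the pd.csi.storage.gke.io CSI provisioner instead; the name standard-delayed is just a placeholder):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-delayed
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # binding waits until a pod actually uses the PVC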

How to create a volume in kubernetes that is not destroyed when the pods die?

I have a Docker image that, when its container starts, should check whether the volume is empty and, if it is, initialize it with some data.
This saved data must remain available to other pods using the same or a different image.
What do you recommend me to do?
You have 2 options:
The first option is to mount a path from the node into the pod and save the data on the node, so that when a new pod is created on the same node it will have access to the same volume (persistent storage location).
Potential problem: two pods on the same node can deadlock over the same resource (so you have to manage access to it).
The second option is shared storage: create one storage backend and have every pod claim space from it.
I strongly suggest you take the next 55 minutes and watch the webinar below:
https://www.youtube.com/watch?v=n06kKYS6LZE
I assume you create your pods using a Deployment object in Kubernetes. What you want to look into is a StatefulSet which, unlike Deployments, retains some identity aspects for recreated pods, including, to some extent, network and storage.
It was introduced specifically as a means to run services that need to keep their state in a kube cluster (i.e. databases, queues, etc.).
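A minimal sketch of a StatefulSet that gets its own PVC per replica via volumeClaimTemplates (all names, the image, and the sizes are placeholders):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app
spec:
  serviceName: my-app
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "while true; do sleep 3600; done"]
          volumeMounts:
            - name: data
              mountPath: /data          # per-replica persistent data lives here
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi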
Looking at the answers, would it not be simpler to create an NFS Persistent Volume and then allow the pods to mount the PV's?
You can use the ReadWriteMany access mode, which should alleviate a deadlock.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-volume
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2
Persistent Volumes

Cancel or undo deletion of Persistent Volumes in kubernetes cluster

I accidentally tried to delete all PVs in the cluster, but thankfully they still have PVCs bound to them, so all PVs are stuck in Status: Terminating.
How can I get the PV's out of the "terminating" status and back to a healthy state where it is "bound" to the pvc and is fully working?
The key here is that I don't want to lose any data and I want to make sure the volumes are functional and not at risk of being terminated if claim goes away.
Here are some details from a kubectl describe on the PV.
$ kubectl describe pv persistent-vol-1
Finalizers: [kubernetes.io/pv-protection foregroundDeletion]
Status: Terminating (lasts 1h)
Claim: ns/application
Reclaim Policy: Delete
Here is the describe on the claim.
$ kubectl describe pvc application
Name: application
Namespace: ns
StorageClass: standard
Status: Bound
Volume: persistent-vol-1
It is, in fact, possible to save data from your PersistentVolume with Status: Terminating and RetainPolicy set to the default (delete). We have done so on GKE; not sure about AWS or Azure, but I guess they are similar.
We had the same problem and I will post our solution here in case somebody else has an issue like this.
Your PersistentVolumes will not be terminated while there is a pod, deployment or, to be more specific, a PersistentVolumeClaim using them.
The steps we took to remedy our broken state:
Once you are in a situation like the OP's, the first thing you want to do is create a snapshot of your PersistentVolumes.
In GKE console, go to Compute Engine -> Disks and find your volume there (use kubectl get pv | grep pvc-name) and create a snapshot of your volume.
Use the snapshot to create a disk: gcloud compute disks create name-of-disk --size=10 --source-snapshot=name-of-snapshot --type=pd-standard --zone=your-zone
At this point, stop the services using the volume and delete the volume and volume claim.
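In kubectl terms that step is roughly the following (all names are placeholders; only do this after the snapshot and the new disk exist):

kubectl scale deployment my-app --replicas=0 -n my-namespace   # stop the services using the volume
kubectl delete pvc my-pvc -n my-namespace
kubectl delete pv name-of-pv                                   # the new disk created from the snapshot is untouched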
Recreate the volume manually with the data from the disk:
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: name-of-pv
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  gcePersistentDisk:
    fsType: ext4
    pdName: name-of-disk
  persistentVolumeReclaimPolicy: Retain
Now just update your volume claim to target a specific volume, the last line of the yaml file:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: my-namespace
  labels:
    app: my-app
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: name-of-pv
Edit: This only applies if you deleted the PVC and not the PV. Do not follow these instructions if you deleted the PV itself or the disk may be deleted!
I found myself in this same situation due to a careless mistake. It was with a statefulset on Google Cloud/GKE. My PVC said terminating because the pod referencing it was still running and the PV was configured with a reclaim policy of Delete. I ended up finding a simpler method to get everything straightened out that also preserved all of the extra Google/Kubernetes metadata and names.
First, I would make a snapshot of your disk as suggested by another answer. You won't need it, but if something goes wrong, the other answer here can then be used to re-create a disk from it.
The short version is that you just need to reconfigure the PV to "Retain", allow the PVC to be deleted, and then remove the previous claim from the PV. A new PVC can then be bound to it and all is well.
Details:
Find the full name of the PV:
kubectl get pv
Reconfigure your PV to set the reclaim policy to "Retain": (I'm doing this on Windows so you may need to handle the quotes differently depending on OS)
kubectl patch pv <your-pv-name-goes-here> -p "{\"spec\":{\"persistentVolumeReclaimPolicy\":\"Retain\"}}"
Verify that the reclaim policy of the PV is now Retain.
Shutdown your pod/statefulset (and don't allow it to restart). Once that's finished, your PVC will get removed and the PV (and the disk it references!) will be left intact.
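For a StatefulSet this shutdown is typically just the following (the name my-app is a placeholder):

kubectl scale statefulset my-app --replicas=0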
Edit the PV:
kubectl edit pv <your-pv-name-goes-here>
In the editor, remove the entire "claimRef" section. Remove all of the lines from (and including) "claimRef:" until the next tag with the same indentation level. The lines to remove should look more or less like this:
claimRef:
  apiVersion: v1
  kind: PersistentVolumeClaim
  name: my-app-pvc-my-app-0
  namespace: default
  resourceVersion: "1234567"
  uid: 12345678-1234-1234-1234-1234567890ab
Save the changes and close the editor. Check the status of the PV and it should now show "Available".
Now you can re-create your PVC exactly as you originally did. That should then find the now "Available" PV and bind itself to it. In my case, I have the PVC defined with my statefulset as a volumeClaimTemplate so all I had to do was "kubectl apply" my statefulset.
You can check out this tool; it will update the Terminating PV's status in etcd back to Bound.
The way it works has been mentioned by Anirudh Ramanathan in his answer.
Be sure to back up your PV first.
Do not attempt this if you don't know what you're doing
There is another fairly hacky way of undeleting PVs: directly editing the objects in etcd. Note that the following steps work only if you have control over etcd - this may not be true on certain cloud providers or managed offerings. Also note that you can easily make things much worse, since objects in etcd were never meant to be edited directly - so please approach this with caution.
We had a situation wherein our PVs had a policy of delete and I accidentally ran a command deleting a majority of them, on k8s 1.11. Thanks to storage-object-in-use protection, they did not immediately disappear, but they hung around in a dangerous state. Any deletion or restarts of the pods that were binding the PVCs would have caused the kubernetes.io/pvc-protection finalizer to get removed and thereby deletion of the underlying volume (in our case, EBS). New finalizers also cannot be added when the resource is in terminating state - From a k8s design standpoint, this is necessary in order to prevent race conditions.
Below are the steps I followed:
Back up the storage volumes you care about. This is just to cover yourself against possible deletion - AWS, GCP, Azure all provide mechanisms to do this and create a new snapshot.
Access etcd directly - if it's running as a static pod, you can ssh into it and check the http serving port. By default, this is 4001. If you're running multiple etcd nodes, use any one.
Port-forward 4001 to your machine from the pod.
kubectl -n=kube-system port-forward etcd-server-ip-x.y.z.w-compute.internal 4001:4001
Use the REST API, or a tool like etcdkeeper to connect to the cluster.
Navigate to /registry/persistentvolumes/ and find the corresponding PVs. Deletion of a resource in k8s is marked by setting the .metadata.deletionTimestamp field on the object. Delete this field in order to have the controllers stop trying to delete the PV. This will revert them to the Bound state, which is probably where they were before you ran the delete.
You can also carefully edit the reclaimPolicy to Retain and then save the objects back to etcd. The controllers will re-read the state soon and you should see it reflected in kubectl get pv output as well shortly.
Your PVs should go back to the old undeleted state:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-b5adexxx 5Gi RWO Retain Bound zookeeper/datadir-zoo-0 gp2 287d
pvc-b5ae9xxx 5Gi RWO Retain Bound zookeeper/datalogdir-zoo-0 gp2 287d
As a general best practice, it is best to use RBAC and the right persistent volume reclaim policy to prevent accidental deletion of PVs or the underlying storage.
Unfortunately, you can't save your PV's and data in this case.
All you can do is recreate the PV with Reclaim Policy: Retain - this will prevent data loss in the future.
You can read more about reclaim policies here and here.
What happens if I delete a PersistentVolumeClaim (PVC)? If the volume
was dynamically provisioned, then the default reclaim policy is set to
“delete”. This means that, by default, when the PVC is deleted, the
underlying PV and storage asset will also be deleted. If you want to
retain the data stored on the volume, then you must change the reclaim
policy from “delete” to “retain” after the PV is provisioned.

How to enable storage size parameter in persistent volume claim?

The storage size I specified in the persistent volume claim is ignored when using NFS as the storage backend.
I want to attach a persistent volume of the specified size to my container.
The following is the YAML file I used to create the PVC.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
The following is the resulting PVC.
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
test-claim Bound pvc-bd0fdb84-f73c-11e7-bdd5-0050569b0869 1Mi RWX managed-nfs-storage 6m
Finally, I attached the created volume to a container and checked the size of the mounted file system: it shows the total size of the disk I export over NFS.
Does anybody know how to make the requested storage size parameter take effect?
In other words, is there any way to limit the size of volumes when using NFS as the backend storage?
Simply put, no, it's not possible.
The storage parameter is used for matching a PVC to a PV, and for auto-provisioning PVs when supported (i.e. adding an EBS volume on AWS). Kubernetes itself has no means of managing filesystem quotas whatsoever.
One thing that could help is to automatically provision the NFS share from a mount point on the server that is created with this size limit (i.e. as a separate LVM LV, or a btrfs or zfs volume). You can also think about switching to something like GlusterFS with its provisioning API, Heketi.
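As a rough sketch of that idea, each claim could be backed by its own size-limited logical volume exported over NFS; the volume group, names and sizes below are placeholders, run on the NFS server:

lvcreate -L 1G -n pvc-test-claim vg_nfs                 # size-limited logical volume
mkfs.ext4 /dev/vg_nfs/pvc-test-claim
mkdir -p /exports/pvc-test-claim
mount /dev/vg_nfs/pvc-test-claim /exports/pvc-test-claim
exportfs -o rw,no_root_squash '*:/exports/pvc-test-claim'   # clients then see a 1G filesystem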

Kubernetes NFS Persistent Volumes - multiple claims on same volume? Claim stuck in pending?

Use case:
I have an NFS directory available and I want to use it to persist data for multiple deployments & pods.
I have created a PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: http://mynfs.com
    path: /server/mount/point
I want multiple deployments to be able to use this PersistentVolume, so my understanding of what is needed is that I need to create multiple PersistentVolumeClaims which will all point at this PersistentVolume.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-pvc-1
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Mi
I believe this creates a 50MB claim on the PersistentVolume. When I run kubectl get pvc, I see:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
nfs-pvc-1 Bound nfs-pv 10Gi RWX 35s
I don't understand why I see 10Gi capacity, not 50Mi.
When I then change the PersistentVolumeClaim deployment yaml to create a PVC named nfs-pvc-2 I get this:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
nfs-pvc-1 Bound nfs-pv 10Gi RWX 35s
nfs-pvc-2 Pending 10s
PVC2 never binds to the PV. Is this expected behaviour? Can I have multiple PVCs pointing at the same PV?
When I delete nfs-pvc-1, I see the same thing:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
nfs-pvc-2 Pending 10s
Again, is this normal?
What is the appropriate way to use/re-use a shared NFS resource between multiple deployments / pods?
Basically you can't do what you want, as the relationship between PVC and PV is one-to-one.
If NFS is the only storage you have available and you would like multiple PVs/PVCs on one NFS export, use dynamic provisioning and a default storage class.
It's not in official K8s yet, but this one is in the incubator and I've tried it and it works well: https://github.com/kubernetes-incubator/external-storage/tree/master/nfs-client
This will enormously simplify your volume provisioning as you only need to take care of the PVC, and the PV will be created as a directory on the nfs export / server that you have defined.
From: https://docs.openshift.org/latest/install_config/storage_examples/shared_storage.html
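With that provisioner installed, each workload just asks for its own claim and gets its own subdirectory on the export; a sketch, assuming the provisioner's StorageClass is named managed-nfs-storage (the actual class name depends on how you deploy it):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-pvc-1
spec:
  storageClassName: managed-nfs-storage   # class created by the NFS provisioner
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Mi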
As Baroudi Safwen mentioned, you cannot bind two pvc to the same pv, but you can use the same pvc in two different pods.
volumes:
  - name: nfsvol-2
    persistentVolumeClaim:
      claimName: nfs-pvc-1   # use this same claim in both pods
A persistent volume claim is exclusively bound to a persistent volume.
You cannot bind two PVCs to the same PV. I guess you are interested in dynamic provisioning. I faced this issue when deploying StatefulSets, which require dynamic provisioning for their pods. You need to deploy an NFS provisioner in your cluster; the NFS provisioner (pod) will have access to the NFS folder (hostPath), and each time a pod requests a volume, the NFS provisioner will mount it in the NFS directory on behalf of the pod. Here is the GitHub repository to deploy it:
https://github.com/kubernetes-incubator/external-storage/tree/master/nfs/deploy/kubernetes
You have to be careful though: you must ensure the NFS provisioner always runs on the same machine where the NFS folder lives, by making use of a node selector, since the volume is of type hostPath.
For my future-self and everyone else looking for the official documentation:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#binding
Once bound, PersistentVolumeClaim binds are exclusive, regardless of
how they were bound. A PVC to PV binding is a one-to-one mapping,
using a ClaimRef which is a bi-directional binding between the
PersistentVolume and the PersistentVolumeClaim.
A few points on dynamic provisioning:
Using dynamic provisioning of NFS prevents you from changing any of the default NFS mount options. On my platform this uses an rsize/wsize of 1M, which can cause huge problems in some applications using small files or block reading. (I've just hit this issue in a big way.)
Dynamic provisioning is a great option if it suits your needs. I'm now stuck creating 250 PV/PVC pairs for my application that used to be handled dynamically, due to the one-to-one relationship.