How can I copy data from a bound persistent volume? - kubernetes

I've got a persistent volume currently bound to a deployment. I set the replica count to 0, hoping this would unbind the volume so I could mount it on another pod, but it remains in a Bound status.
How can I copy the data from this?
I'd like to transfer it via scp to another location.

I've got a persistent volume currently bound to a deployment. I set the replica count to 0, hoping this would unbind the volume so I could mount it on another pod, but it remains in a Bound status.
"Bound" does not mean the volume is attached to a Node or mounted in a Pod (which is pragmatically the same thing); Bound means that a PersistentVolume has been matched to a PersistentVolumeClaim, typically because the cloud provider created the PersistentVolume (out of thin air) in order to fulfill that claim. "Bound" describes the relationship between the volume and its claim, not its Pod status. The term exists because Kubernetes supports reclaiming volumes: rather than always creating a new cloud storage object, it can bind a claim to an existing volume that no one is claiming and that satisfies the requested resources.
There's nothing, at least in your question, that prevents you from launching another Pod (in the same Namespace as the Deployment) with a volumes: entry that points to the persistentVolumeClaim; Kubernetes will launch that Pod with the volume just as it did in your Deployment. You can then do whatever you'd like in that Pod to egress the data.
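A minimal sketch of such a helper Pod follows. The claim name `my-claim`, the Pod name `pvc-inspector`, the mount path and the scp destination are all placeholders that do not come from the question, so substitute your own. The script only writes the manifest to a file; the cluster commands are left as comments because they depend on your environment.

```shell
#!/bin/sh
# Write a throwaway Pod manifest that mounts the existing claim.
# "my-claim" and "pvc-inspector" are hypothetical names.
cat > pvc-inspector.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
spec:
  restartPolicy: Never
  containers:
    - name: shell
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-claim
EOF

# Then, in the same namespace as the Deployment:
#   kubectl apply -f pvc-inspector.yaml
#   kubectl cp pvc-inspector:/data ./data-copy    # kubectl cp needs tar in the container
#   scp -r ./data-copy user@remote-host:/path/to/destination
#   kubectl delete pod pvc-inspector
```

Copying to the local machine first and then using scp keeps the helper image trivial; running scp from inside the Pod would also work if the image ships an SSH client.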

Related

PersistentVolumeClaim used by multiple pods: one for writing and another for backup

In a Kubernetes cluster on Oracle Cloud, I have a pod with an Apache server.
This pod needs a persistent volume so I used a persistentVolumeClaim and the cloud provider is able to automatically create an associated volume (Oracle Block Volume).
The access mode used by the PVC is ReadWriteOnce and therefore the volume created has the same access mode.
Everything works great.
Now I want to back up this volume using borg backup and borgmatic by regularly starting a new pod with a CronJob.
This backup pod needs to mount the volume read-only.
Question:
Can I use the previously defined PVC?
Do I need to create a new PVC with readOnly access mode?
As per documentation:
ReadWriteOnce:
the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node.
That means that if you enforce a strict rule that both pods are deployed to the same node, you can use the same PVC.
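One way to express such a rule is pod affinity on the node hostname, combined with a read-only mount of the existing claim. The sketch below assumes the Apache pod carries the label `app: apache` and that the claim is called `apache-data`; both names, the schedule and the image `your-backup-image` are hypothetical and need to be replaced.

```shell
#!/bin/sh
# Write a CronJob manifest whose pod must land on the same node as the
# Apache pod and mounts the shared claim read-only.
# Label, claim name, schedule and image are placeholders.
cat > backup-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: volume-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchLabels:
                      app: apache
                  topologyKey: kubernetes.io/hostname
          containers:
            - name: backup
              image: your-backup-image
              volumeMounts:
                - name: www
                  mountPath: /source
                  readOnly: true
          volumes:
            - name: www
              persistentVolumeClaim:
                claimName: apache-data
                readOnly: true
EOF

# Apply with: kubectl apply -f backup-cronjob.yaml
```

With a required affinity rule the backup pod stays Pending if the Apache pod is not running anywhere, which is usually preferable to it starting on a node where the volume cannot be attached.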

Does a PV/PVC in Kubernetes guarantee sticky mounting of pods?

I would like to understand whether, through a PVC/PV, a pod that is using a volume will always be re-attached to the same volume after a failure. I know that this can be the case for a StatefulSet, but I am trying to understand if this can also be achieved with a PVC and PV. Assume that Pod_A is attached to Volume_X, then Pod_A fails, but in the meantime a Volume_Y was added to the cluster that can potentially fulfil the PVC requirements. So what happens when Pod_A is re-created: does it always get mounted to Volume_X, or is there any chance that it gets mounted to the new Volume_Y?
a pod that is using a volume after a failure will be always re-attached to the same volume or not
Yes, the Pod will be re-attached to the same volume, because it still has the same PVC declared in its manifest.
Essentially assuming that a Pod_A is attached to Volume_X, then Pod_A fails but in the meantime a Volume_Y was added to the cluster that can potentially fulfil the PVC requirements.
The Pod still has the same PVC in its manifest, so it will use the same volume. But if you create a new PVC, it might be bound to the new volume.
So what happens when Pod_A is re-created: does it always get mounted to Volume_X, or is there any chance that it gets mounted to the new Volume_Y?
The Pod still has the same PVC in its manifest, so it will use the volume that is bound to that PVC. Only when you create a new PVC can that claim be bound to the new volume.

Access Kubernetes Persistent Volume data

Is there any way to access Google Cloud Kubernetes persistent volume data without using a pod? I cannot start the pod due to data corruption in the persistent volume. Is there a command line tool or any other way?
If your concern is running a pod with a specific application, you can instead run an Ubuntu Pod, attach the PVC to it and access the data from there.
Another option is to clone the PV and PVC and perform the testing on the newly created PV and PVC, while the old ones serve as the backup.
For cloning the PV and PVC you can also use this tool: https://velero.io/
You can also attach the PVC to the Pod in read-only mode and try accessing the data.
PersistentVolume resources are cluster resources that exist independently of Pods. This means that the disk and data represented by a PersistentVolume continue to exist as the cluster changes and as Pods are deleted and recreated.
It is possible to save data from a PersistentVolume with Status: Terminating and the reclaim policy set to the default (Delete). The PersistentVolume will not actually be terminated as long as there is a pod, a deployment or, to be more specific, a PersistentVolumeClaim using it.
The steps we took to remedy our broken state are as follows:
The first thing you want to do is to create a snapshot of your PersistentVolumes.
In the Google Cloud console, go to Compute Engine -> Disks, find your volume there and create a snapshot of it. To look up the name of the volume backing your claim, use:
kubectl get pv | grep pvc-name
Use the snapshot to create a disk:
gcloud compute disks create name-of-disk --size=10 --source-snapshot=name-of-snapshot --type=pd-standard --zone=your-zone
At this point, stop the services using the volume and delete the volume and volume claim.
Re-create the volume manually with the data from the disk and update your volume claim to target that specific volume.
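The last step can be sketched as a statically defined PersistentVolume that points at the restored disk, plus a claim pinned to it through `volumeName`. This assumes the cluster uses the Compute Engine persistent disk CSI driver (`pd.csi.storage.gke.io`); `PROJECT_ID`, `your-zone`, the object names `restored-pv` and `restored-claim`, the `ext4` filesystem and the 10Gi size (matching `--size=10` above) are placeholders or assumptions to adapt.

```shell
#!/bin/sh
# Write a PV that references the pre-existing disk and a PVC bound to it.
# PROJECT_ID, your-zone and the object names are placeholders.
cat > restored-volume.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: restored-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  csi:
    driver: pd.csi.storage.gke.io
    volumeHandle: projects/PROJECT_ID/zones/your-zone/disks/name-of-disk
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-claim
spec:
  storageClassName: ""
  volumeName: restored-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF

# Apply with: kubectl apply -f restored-volume.yaml
```

Setting the reclaim policy to Retain on the re-created volume avoids losing the disk again if the claim is deleted later, and the empty `storageClassName` keeps the claim from triggering dynamic provisioning.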
For more information refer to the links below.
Accessing file shares from Google Kubernetes Engine clusters.
Configure a Pod to Use a PersistentVolume for Storage

Kubernetes: hostPath Static Storage with PV vs hard coded hostPath in Pod Volume

I'm learning Kubernetes and there is something I don't get well.
There are 3 ways of setting up static storage:
Pods with volumes you attach the storage to directly
Pods with a PVC attached to its volume
StatefulSets with also PVC inside
I can understand the power of PVC when working together with StorageClass, but not when working with static storage and local storage like hostPath
To me, it sounds very similar:
In the first case I have a volume directly attached to a pod.
In the second case I have a volume statically attached to a PVC, which is also manually attached to a Pod. In the end, the volume will be statically attached to the Pod.
In both cases, the data will remain when the Pod terminates and will be adopted by the next Pod with the corresponding definition, right?
The only benefit I see from using PVCs over a plain Pod volume is that you can define the access mode. Apart from that, is there a difference when working with hostPath?
On the other hand, the advantage of using a StatefulSet instead of a PVC is (if I understood properly) that it gets a headless service, and that the rollout and rollback mechanism works differently. Is that the point?
Thank you in advance!
Extracted from this blog:
The biggest difference is that the Kubernetes scheduler understands
which node a Local Persistent Volume belongs to. With HostPath
volumes, a pod referencing a HostPath volume may be moved by the
scheduler to a different node resulting in data loss. But with Local
Persistent Volumes, the Kubernetes scheduler ensures that a pod using
a Local Persistent Volume is always scheduled to the same node.
Using hostPath does not guarantee that a pod will restart on the same node. So your pod can attach /tmp/storage on k8s-node-1, then if you delete and re-create the pod, it may attach /tmp/storage on k8s-node-[2-n].
By contrast, if you use a PVC/PV with the local persistent storage class, then when you delete and re-create a pod, it will stick to the node that holds the local persistent storage.
A StatefulSet creates pods and has a volumeClaimTemplates field, which creates a dedicated PVC for each pod. So each pod created by the StatefulSet has its own dedicated storage, linked as Pod -> PVC -> PV -> storage. A StatefulSet therefore also uses the PVC/PV mechanism.
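What lets the scheduler pin the pod is the `nodeAffinity` block on the local PersistentVolume. A minimal sketch, assuming a node named `k8s-node-1` and a directory `/mnt/disks/vol1` that already exists on it (the path, size and object names are illustrative):

```shell
#!/bin/sh
# Write a no-provisioner StorageClass and a local PV tied to one node.
# Node name, path, size and object names are placeholders.
cat > local-pv.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node-1
EOF

# Apply with: kubectl apply -f local-pv.yaml
```

A plain hostPath volume in a Pod spec carries no such node information, which is why the scheduler is free to place the pod elsewhere.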
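For illustration, a small StatefulSet with one claim template might look like the sketch below. The names `web` and `data`, the `nginx` image and the 1Gi request are arbitrary choices, and the manifest assumes a headless Service named `web` and a default StorageClass exist in the cluster.

```shell
#!/bin/sh
# Write a StatefulSet whose pods each get their own PVC from the template.
# All names and sizes are placeholders.
cat > statefulset.yaml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
EOF

# Apply with: kubectl apply -f statefulset.yaml
```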
More details are available here.

StatefulSet behavior when a node dies/gets restarted and has a PersistentVolume

Suppose I have a resource foo which is a statefulset with 3 replicas. Each makes a persistent volume claim.
One of the foo pods (foo-1) dies, and a new one starts in its place. Will foo-1 be bound to the same persistent volume that the previous foo-1 had before it died? Will the number of persistent volume claims stay the same or grow?
This edge case doesn't seem to be in the documentation on StatefulSets.
Yes. The PVC creates a disk on GCP and attaches it as a secondary disk to the node on which the pod is running.
Upon deletion of an individual pod, K8s re-creates the pod with the same name and the same claim. If it cannot be placed on the node it was running on (say the node no longer exists), the pod is created on another node, and the secondary disk is moved to that node.
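The reason the replacement pod finds its old volume is the naming scheme: each claim is named after the volumeClaimTemplates entry and the pod (template name, then StatefulSet name, then ordinal), so a re-created foo-1 looks up the claim that already exists and the number of claims does not grow. The helper below simply prints the names one would expect; the template name `data` is an assumption, since the question does not give one.

```shell
#!/bin/sh
# Print the PVC names a StatefulSet is expected to use:
#   <template>-<statefulset>-<ordinal>
# "data" is a hypothetical volumeClaimTemplates name.
pvc_names() {
  template=$1
  statefulset=$2
  replicas=$3
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${template}-${statefulset}-${i}"
    i=$((i + 1))
  done
}

pvc_names data foo 3
# prints:
#   data-foo-0
#   data-foo-1
#   data-foo-2
```

Comparing this list with the output of `kubectl get pvc` before and after deleting a pod is a quick way to confirm that no additional claim was created.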