Reusing Persistent Volume for another PersistentVolumeClaim - postgresql

I am using Rook (rook.io) on Kubernetes with CoreOS to dynamically create Persistent Volumes.
So I create a PersistentVolumeClaim (kubectl create -f postgres-pvc.yaml) and apply a patch to set persistentVolumeReclaimPolicy to Retain. When I do a "kubectl get pv" I can see a dynamically created PersistentVolume, and it is bound. Now when I delete the PersistentVolumeClaim, the status goes to Released.
I have stored some precious data in that PersistentVolume. Is there a way I can reuse a PersistentVolume that has gone into Released status?
thanks
-sonam

If you have precious data that you want to use in another PostgreSQL pod, maybe StatefulSets are what you are looking for, as they allow:
Stable, persistent storage [...] across Pod (re)schedulings.
Therefore, I would advise you to deploy your PostgreSQL database as a StatefulSet. You would need to check that your already existing Volume is bound.
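As a rough, hedged sketch of the re-binding step (a commonly used approach, not spelled out above): a Released volume still carries a claimRef to the deleted claim, so that reference has to be cleared before a new claim can bind to it. The volume name pvc-xxxxxxxx and the claim name postgres-pvc-new below are placeholders.

# confirm the volume is Released and its reclaim policy is Retain
kubectl get pv

# clear the stale claim reference so the volume becomes Available again
kubectl patch pv pvc-xxxxxxxx -p '{"spec":{"claimRef": null}}'

Then a new claim (which a StatefulSet pod, or any pod, can mount) can target that exact volume by name:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc-new
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi              # must not exceed the capacity of the existing PV
  volumeName: pvc-xxxxxxxx       # bind to the existing, retained volume
  # if the PV was provisioned with a storage class (e.g. by Rook), set the same
  # storageClassName here so the binder matches the two objects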
[1] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

Related

Access Kubernetes Persistent Volume data

Is there any way to access Google Cloud Kubernetes persistent volume data without using a pod? I cannot start a pod due to data corruption in the persistent volume. Is there any command-line tool or any other way?
If you have concerns about running a pod with your specific application, you can instead run a plain Ubuntu Pod, attach it to the PVC and access the data through it (see the sketch below).
Another option is to clone the PV and PVC, perform your testing against the newly created PV and PVC, and keep the old ones as a backup.
For cloning the PV and PVC you can use a tool such as Velero: https://velero.io/
You can also attach the PVC to the Pod in read-only mode and try accessing the data.
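A minimal sketch of such a utility Pod, assuming the existing claim is named my-pvc (the names here are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
spec:
  containers:
    - name: inspector
      image: ubuntu
      command: ["sleep", "infinity"]   # keep the container alive for interactive inspection
      volumeMounts:
        - name: data
          mountPath: /mnt/data
          readOnly: true               # mount read-only to avoid touching the damaged data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc              # the existing claim that holds the data

Once it is running, you can browse or copy the files with kubectl exec -it pvc-inspector -- ls /mnt/data or kubectl cp pvc-inspector:/mnt/data ./data-backup.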
PersistentVolume resources are cluster resources that exist independently of Pods. This means that the disk and data represented by a PersistentVolume continue to exist as the cluster changes and as Pods are deleted and recreated.
It is possible to save the data from your PersistentVolume even with Status: Terminating and the reclaim policy left at the default (Delete). Your PersistentVolume will not actually be removed as long as there is a pod, a deployment, or, to be more specific, a PersistentVolumeClaim using it.
The steps we took to remedy our broken state are as follows:
The first thing you want to do is to create a snapshot of your PersistentVolumes.
In the GKE console, go to Compute Engine -> Disks, find your volume there and create a snapshot of it. To identify which disk backs your volume, use:
kubectl get pv | grep pvc-name
Use the snapshot to create a disk:
gcloud compute disks create name-of-disk --size=10 --source-snapshot=name-of-snapshot --type=pd-standard --zone=your-zone
At this point, stop the services using the volume and delete the volume and volume claim.
Re-create the volume manually from the new disk and update your volume claim to target that specific volume.
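A hedged sketch of that last step, assuming the disk created above is called name-of-disk (the PV/PVC names are illustrative, and the classic in-tree gcePersistentDisk source is shown; newer clusters would use the GCE PD CSI driver instead):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: restored-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""              # keep the default provisioner out of the way
  gcePersistentDisk:
    pdName: name-of-disk            # the disk created from the snapshot
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi
  volumeName: restored-pv           # bind explicitly to the manually created PV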
For more information refer to the links below.
Accessing file shares from Google Kubernetes Engine clusters.
Configure a Pod to Use a PersistentVolume for Storage

Kubernetes: hostPath Static Storage with PV vs hard coded hostPath in Pod Volume

I'm learning Kubernetes and there is something I don't get well.
There are 3 ways of setting up static storage:
Pods with volumes you attach the storage to directly
Pods with a PVC attached to their volume
StatefulSets, also with PVCs inside
I can understand the power of PVCs when working together with a StorageClass, but not when working with static storage and local storage like hostPath.
To me, it sounds very similar:
In the first case I have a volume directly attached to a pod.
In the second case I have a volume statically attached to a PVC, which is also manually attached to a Pod. In the end, the volume will be statically attached to the Pod.
In both cases, the data will remain when the Pod terminates and will be adopted by the next Pod with the corresponding definition, right?
The only benefit I see in using PVCs over a plain Pod volume is that you can define the access mode. Apart from that, is there a difference when working with hostPath?
On the other hand, the advantage of using a StatefulSet instead of a PVC is (if I understood properly) that it gets a headless service, and that the rollout and rollback mechanisms work differently. Is that the point?
Thank you in advance!
Extracted from this blog:
The biggest difference is that the Kubernetes scheduler understands
which node a Local Persistent Volume belongs to. With HostPath
volumes, a pod referencing a HostPath volume may be moved by the
scheduler to a different node resulting in data loss. But with Local
Persistent Volumes, the Kubernetes scheduler ensures that a pod using
a Local Persistent Volume is always scheduled to the same node.
Using hostPath does not guarantee that a pod will restart on the same node. So your pod can attach /tmp/storage on k8s-node-1; then, if you delete and re-create the pod, it may attach /tmp/storage on k8s-node-[2-n].
On the contrary, if you use PVC/PV with the local persistent storage class, then if you delete and re-create a pod, it will stick to the node that hosts the local persistent storage.
StatefulSets create pods and have a volumeClaimTemplates field, which creates a dedicated PVC for each pod. So each pod created by the StatefulSet has its own dedicated storage, linked as Pod -> PVC -> PV -> Storage. So StatefulSets also use the PVC/PV mechanism.
More details are available here.
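What makes a local PersistentVolume schedulable, unlike a plain hostPath volume, is the required nodeAffinity on the PV. A minimal sketch (names and paths are illustrative):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # static provisioning only
volumeBindingMode: WaitForFirstConsumer     # delay binding until a pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-1
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1                   # directory or disk on that specific node
  nodeAffinity:                             # mandatory for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node-1                # pods using this PV are pinned to this node

Because the node affinity lives on the PV, the scheduler always places any pod claiming this volume on k8s-node-1, which is exactly the guarantee hostPath lacks.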

How to share persistent volume of a StatefulSet with another StatefulSet?

I have StatefulSet-1 running with 3 replicas, each pod writing logs to its own persistent volume, say pv1, pv2, pv3 (achieved using volumeClaimTemplates).
I have another StatefulSet-2 running with 3 replicas, and I want each pod of StatefulSet-2 to access StatefulSet-1's already created volumes, i.e. pv1, pv2 and pv3, to process the separate logs written by each pod of StatefulSet-1.
So pv1, pv2, pv3 should be used by both StatefulSet-1 and StatefulSet-2, even though they were created as part of the StatefulSet-1 deployment. pv1, pv2, pv3 will of course carry the pod names of StatefulSet-1, which is OK for StatefulSet-2.
How do I configure StatefulSet-2 to achieve the above scenario? Please help!
Thanks & Regards,
Sudhir
This won't work.
1. PVs backed by GCE disks are in ReadWriteOnce mode, so one PVC per pod.
2. You are attaching PVCs to the StatefulSet pods using PVC templates, which rely on dynamic volume provisioning to create the appropriate PVs and PVCs.
If you need these pods to share the PVCs, your best bet is to use a ReadWriteMany PV such as one backed by NFS. You will also need to create the pods of StatefulSet-2 manually to have them mount the appropriate PVCs. You could achieve this by creating a single-pod deployment for each one (see the sketch below).
Something else to consider: can you have the containers of each StatefulSet run together in the same pods? Normally this is not recommended, but it would allow them both to share the same volumes (as long as they are not using the same ports).
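As a hedged illustration of the "single-pod deployment per claim" idea: StatefulSet PVC names follow the pattern <template-name>-<statefulset-name>-<ordinal>, so assuming StatefulSet-1 is literally named statefulset-1 and its volumeClaimTemplates entry is named logs (both assumptions), the claim for ordinal 0 is logs-statefulset-1-0. Concurrent access only works if the backing volume supports ReadWriteMany (e.g. NFS), as noted above.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-processor-0
spec:
  replicas: 1
  selector:
    matchLabels:
      app: log-processor-0
  template:
    metadata:
      labels:
        app: log-processor-0
    spec:
      containers:
        - name: processor
          image: busybox
          command: ["sh", "-c", "tail -f /dev/null"]   # placeholder for the real log-processing command
          volumeMounts:
            - name: logs
              mountPath: /logs
              readOnly: true                           # only reading StatefulSet-1's logs
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: logs-statefulset-1-0            # PVC created by StatefulSet-1 for pod ordinal 0

You would create one such Deployment per ordinal (logs-statefulset-1-1 and logs-statefulset-1-2) to cover all three volumes.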

What happens to persistent volume if the StatefulSet got deleted and re-created?

I deployed Kafka and ZooKeeper as StatefulSets and exposed Kafka outside of the cluster. However, whenever I delete the Kafka StatefulSet and re-create it, the data seems to be gone (when I try to consume all the messages using kafkacat, the old messages seem to be gone), even though it is using the same PVC and PV. I am currently using EBS as my persistent volume.
Can someone explain to me what is happening to PV when I delete the statefulset? Please help me.
I would probably look at how the persistent volume is created.
If you run the command
kubectl get pv
you can see the reclaim policy; if it is set to Retain, then your volume will survive even when the StatefulSet is deleted.
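If it is currently Delete, a commonly used command to switch the policy to Retain (the volume name is a placeholder) is:

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'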
This is the expected behaviour, because the new StatefulSet will create a new set of PVs and start over. (If there is no other choice it can randomly land on old PVs as well, for example with local volumes.)
A StatefulSet doesn't mean that Kubernetes will remember what you were doing in some other old StatefulSet that you have deleted.
A StatefulSet means that if the pod is restarted or re-created for some reason, the same volume will be assigned to it. This doesn't mean that the volume will be assigned across StatefulSets.
I assume your scenario is that you have a StatefulSet which has a PersistentVolumeClaim definition in it - or it is just referencing an existing volume - and you try to delete it.
In this case the persistent volume will stay there, and the PVC won't disappear either.
This is so that you can, if you want to, remount the same volume to a different StatefulSet pod, or to an updated version of the previous one.
If you want to delete the persistent volume, you should delete the PVC, and the bound PVs will disappear.
By default, Kubernetes prevents deleting PersistentVolumeClaims and bound PersistentVolume objects when you scale a StatefulSet down or delete it.
Retaining PersistentVolumeClaims is the default behavior, but you can configure the StatefulSet to delete them via the persistentVolumeClaimRetentionPolicy field.
This example shows part of a StatefulSet manifest where the retention policy causes the PersistentVolumeClaim to be deleted when the StatefulSet is scaled down and retained when the StatefulSet is deleted.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quiz
spec:
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Delete
    whenDeleted: Retain
Make sure you have properly configured your StatefulSet manifest and Kafka cluster.
NOTE
If you want to delete a StatefulSet but keep the Pods and the
PersistentVolumeClaims, you can use the --cascade=orphan option. In
this case, the PersistentVolumeClaims will be preserved even if the
retention policy is set to Delete.
Marko Lukša "Kubernetes in Action, Second Edition"
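For reference, the command form the note refers to looks like this (the StatefulSet name is a placeholder):

kubectl delete statefulset <statefulset-name> --cascade=orphan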

Does a stellar core deployment on k8s need persistent storage?

I want to deploy stellar core on k8s with CATCHUP COMPLETE. I'm using this Docker image: satoshipay/stellar-core
The Docker image docs mention that /data is used to store some information about the DB. And I've seen that the Helm template uses a persistent volume and mounts it at /data.
I was wondering what will happen if I use a Deployment instead of the StatefulSet and I restart the pod, update its Docker version or delete it. Does it initialize the DB again?
Also, does stellar core need any extra storage for the catchup?
Statefulset vs Deployment
A StatefulSet "provides guarantees about the ordering and uniqueness of these Pods".
If your application needs to be brought up in a specific order, use statefulset.
Storage
Definitely leverage a persistent volume for the database. From the K8s docs:
On-disk files in a Container are ephemeral
Since it appears you're deploying some kind of blockchain application, losing that on-disk state could cause significant delays at startup.
In a Deployment you specify a PersistentVolumeClaim that is shared by all pod replicas. In other words, a shared volume.
The backing storage obviously must have ReadWriteMany or ReadOnlyMany accessMode if you have more than one replica pod.
In a StatefulSet you specify volumeClaimTemplates so that each replica pod gets a unique PersistentVolumeClaim associated with it.
In other words, no shared volume.
StatefulSets are useful for running clustered applications, e.g. a Hadoop or MySQL cluster, where each node has its own storage.
So in your case, to have more isolation (no shared volumes), it is better to go with a StatefulSet-based solution.
If you use a Deployment-based solution and restart the pod, update its Docker version or delete it, your DB will be initialized again.
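A minimal sketch of that StatefulSet shape, with volumeClaimTemplates giving each replica its own /data volume (the names, replica count and storage size are illustrative, not taken from the satoshipay chart):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: stellar-core
spec:
  serviceName: stellar-core
  replicas: 1
  selector:
    matchLabels:
      app: stellar-core
  template:
    metadata:
      labels:
        app: stellar-core
    spec:
      containers:
        - name: stellar-core
          image: satoshipay/stellar-core
          volumeMounts:
            - name: data
              mountPath: /data          # the path the image uses for its DB state
  volumeClaimTemplates:                 # one dedicated PVC per replica
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi              # size is a guess; a complete catchup needs plenty of disk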
Regarding catchup:
In general, running CATCHUP_COMPLETE=true is not recommended in Docker containers, as they have limited resources by default (if you really want to do it, make sure to give them access to more resources: CPU, memory and disk space).