Kubernetes Snapshots

I have an on-premise Rancher server with two clusters in it. Let's call them cluster A and cluster B. In cluster A, I am creating a DB snapshot, and I need to copy that snapshot into cluster B. I am not a Kubernetes expert, so could someone help me with good ideas to achieve this, or point me to reference materials I could use to accomplish this task?

There could be multiple ways to synchronize data between clusters. First off, see if your hypervisor supports volume replication. That way, you can copy data across to another volume and mount that volume into the applications in the secondary cluster.
Another approach would be to use Velero with Restic to back up the volumes to an object store (MinIO/S3) and then restore them in the second cluster; a rough sketch of that workflow follows.
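A minimal sketch with the Velero CLI, assuming both clusters are configured against the same backup storage location and the database lives in a namespace called database (the namespace, backup name, and Restic flag are assumptions; flag names vary slightly between Velero versions):

```bash
# In cluster A: back up the namespace that holds the database,
# including its pod volumes via Restic (file-system backup).
velero backup create db-backup \
  --include-namespaces database \
  --default-volumes-to-restic

# In cluster B (pointed at the same object store),
# restore everything from that backup.
velero restore create db-restore --from-backup db-backup
```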
OpenEBS sounds like another viable option but I haven't had a chance to work with it yet. Linstor is another solution I have heard of.

Related

How to have data in a database with FastAPI persist across multiple nodes?

If I use the https://github.com/tiangolo/full-stack-fastapi-postgresql project generator, how would one be able to persist data across multiple nodes (either with docker swarm or kubernetes)?
As I understand it, any PostgreSQL data in a volumes directory would be different on every node (e.g. every DigitalOcean droplet). In that case, a user could ask for their data, get directed by Traefik to a node with a different volumes directory, and get back different information than if they had been directed to another node. Is this correct?
If so, what would be the best approach to have multiple servers running a database work together and have the same data in the database?
On Kubernetes, persistent volumes are used to attach storage to pods wherever they are scheduled in the cluster. They are managed by providing the cluster with storage classes, which map to provisioner drivers backed by some kind of SAN or network storage; a minimal NFS-backed example is sketched below.
Docker / Docker Swarm has similar support via Docker volume plugins, but with the ascendancy of Kubernetes there are virtually no active open-source projects left, and most of the former commercial SAN driver vendors have migrated to Kubernetes instead.
Nonetheless, depending on your tolerance, you can use a mix of direct NFS / FUSE mounts; there are also some not-entirely-abandoned Docker volume drivers available in the NFS / GlusterFS space.
The issue moby/moby #39624 tracks CSI support, which we will hopefully see land in 2021 and which would bring Swarm back in line with Kubernetes.
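As a rough illustration of the Kubernetes side, here is a minimal NFS-backed PersistentVolume and claim that pods on any node can mount (the server address, export path, and sizes are placeholders for your environment):

```bash
kubectl apply -f - <<'EOF'
# A PersistentVolume backed by an NFS export reachable from every node.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-data-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.10        # placeholder NFS server address
    path: /exports/pg-data   # placeholder export path
---
# The claim that pods reference; it binds to the PV above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""       # bind to the statically provisioned PV above
  resources:
    requests:
      storage: 10Gi
EOF
```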

Can I sync Persistent volume between two k8s clusters?

I have deployed two k8s clusters, and I want that if someone creates a PV in the first cluster, it automatically gets created in the second cluster as well. How can I achieve this?
Simply speaking, you can't: these are separate clusters, and each of them has its own configuration. There is no built-in mechanism for triggering actions between separate clusters. You would need to build your own program that watches both API servers and applies the changes.
I'm guessing, however, that what you actually want is to share filesystem data between clusters. If so, have a look at volume types backed by network/distributed file systems such as NFS or Ceph; a sketch follows.
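For example, assuming both clusters can reach the same NFS server, you could apply the same NFS-backed PersistentVolume manifest to each cluster (the kubectl context names and the shared-pv.yaml file are placeholders):

```bash
# shared-pv.yaml is assumed to describe a PersistentVolume whose nfs.server
# and nfs.path point at an export reachable from the nodes of both clusters.
# Applying it to each cluster lets pods in both mount the same data.
for ctx in cluster-a cluster-b; do
  kubectl --context "$ctx" apply -f shared-pv.yaml
done
```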

Is it possible to join two separate kubernetes clusters?

I have deployments on one Kubernetes cluster that I might want to move to another Kubernetes cluster in the future. Is it possible to combine these two clusters or must I redeploy everything? If the answer is yes, what if there are StatefulSets?
The short answer is no.
You can connect clusters with something like Kubernetes Federation, or, if you have Calico, with something like BGP peering.
You'll have to redeploy everything, and in the case of StatefulSets it really depends on where you are storing your state. For example:
Is it MySQL? Back up your DB and restore it in the new place (see the sketch below).
Is it Cassandra? Can you reattach the same physical volumes in the cloud provider? If not, you'll have to transfer your data.
Is it etcd, Consul or ZooKeeper? Can you back it up or attach the same physical volumes?
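For the MySQL case, a rough sketch of moving the data with mysqldump over kubectl exec; the context names, the deployment name mysql, and the MYSQL_ROOT_PASSWORD environment variable are all assumptions about your setup:

```bash
# Dump all databases from the MySQL pod in the old cluster...
kubectl --context old-cluster exec deploy/mysql -- \
  sh -c 'exec mysqldump --all-databases -uroot -p"$MYSQL_ROOT_PASSWORD"' \
  > all-databases.sql

# ...then load the dump into the freshly deployed MySQL in the new cluster.
kubectl --context new-cluster exec -i deploy/mysql -- \
  sh -c 'exec mysql -uroot -p"$MYSQL_ROOT_PASSWORD"' \
  < all-databases.sql
```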

Recover a Kubernetes Cluster

At the moment I have a Kubernetes cluster deployed on AWS via kops. I have a question: is it possible to take a sort of snapshot of the Kubernetes cluster and recreate the same environment (master and worker nodes), for example to be resilient or to migrate the cluster easily? I know that Heptio Ark exists, and it is very nice. But I'm curious whether there is a simpler way to do it. For example, is it enough to back up etcd (or, in my case, to snapshot the EBS volumes)?
Thanks a lot. All suggestions are welcome
kops stores its state in an S3 bucket identified by the KOPS_STATE_STORE. So yes, if your cluster has been removed you can restore it by running kops create cluster.
Keep in mind that it doesn't restore your etcd state, so for that you are going to have to set up etcd backups. You could also make use of Heptio Ark (now Velero).
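For the etcd backups themselves, the usual building block is an etcdctl snapshot taken against an etcd member. A minimal sketch; the endpoint and certificate paths below are the kubeadm defaults and will differ on a kops-provisioned cluster, so adjust them to your environment:

```bash
# Take a point-in-time snapshot of etcd.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /var/backups/etcd-snapshot.db

# Inspect the snapshot to confirm it is usable.
ETCDCTL_API=3 etcdctl snapshot status /var/backups/etcd-snapshot.db
```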
Similar answers to this topic:
Recover kops Kubernetes cluster
How to restore kubernetes cluster using kops?
As mentioned by Rico in the earlier post, you can use Velero to back up your cluster state using its CLI client. Another option to consider for the scenario you described is CAPE: it provides an easy-to-use control plane for Kubernetes multi-cluster app & data management via a friendly user interface.
See below for resources:
How to create an on-demand K8s Backup:
https://www.youtube.com/watch?v=MOPtRTeG8sw&list=PLByzHLEsOQEB01EIybmgfcrBMO6WNFYZL&index=7
How to Restore/Migrate K8s Backup to Another Cluster:
https://www.youtube.com/watch?v=dhBnUgfTsh4&list=PLByzHLEsOQEB01EIybmgfcrBMO6WNFYZL&index=10

Kubernetes - Persistent storage for PostgreSQL

We currently have a 2-node Kubernetes environment running on bare-metal machines (no GCE) and now we wish to set up a PostgreSQL instance on top of this.
Our plan was to map a data volume for the PostgreSQL Data Directory to the node using the volumeMounts option in Kubernetes. However this would be a problem because if the Pod ever gets stopped, Kubernetes will re-launch it at random on one of the other nodes. Thus we have no guarantee that it will use the correct data directory on re-launch...
So what is the best approach for maintaining a consistent and persistent PostgreSQL Data Directory across a Kubernetes cluster?
One solution is to deploy HA PostgreSQL, for example https://github.com/sorintlab/stolon.
Another is to have some network storage attached to all nodes (NFS, GlusterFS) and use volumeMounts in the pods; a minimal sketch follows.
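A rough sketch of the second approach, assuming a PersistentVolumeClaim named pg-data-pvc already exists and is backed by storage (NFS, GlusterFS, etc.) that every node can reach; the names, image tag, and password handling are placeholders:

```bash
kubectl apply -f - <<'EOF'
# A single-replica PostgreSQL Deployment whose data directory lives on a
# PersistentVolumeClaim, so the data survives rescheduling to another node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            - name: POSTGRES_PASSWORD
              value: change-me          # placeholder; use a Secret in practice
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: pg-data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: pg-data
          persistentVolumeClaim:
            claimName: pg-data-pvc      # placeholder claim backed by the shared storage
EOF
```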