Recover a Kubernetes Cluster

At the moment I have a Kubernetes cluster deployed on AWS via kops. Is it possible to take a sort of snapshot of the Kubernetes cluster and recreate the same environment (master and worker nodes), for example to be resilient or to migrate the cluster easily? I know that Heptio Ark exists and it looks very nice, but I'm curious whether there is a simpler way to do it. For example, is it enough to back up etcd (or, in my case, to snapshot the EBS volumes)?
Thanks a lot. All suggestions are welcome.

kops stores its state in an S3 bucket identified by the KOPS_STATE_STORE environment variable. So yes, if your cluster has been removed you can restore it by running kops create cluster against the same state store.
Keep in mind that this does not restore your etcd state, so you will need to set up etcd backups for that. You could also make use of Heptio Ark.
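A minimal sketch of that flow, assuming the state bucket still exists (the bucket and cluster names below are placeholders):

# Point kops at the existing state store (placeholder names)
export KOPS_STATE_STORE=s3://my-kops-state-bucket
export NAME=cluster.example.com

# Verify the cluster spec is still present in the state store
kops get cluster --name "$NAME"

# Recreate the AWS resources from the stored spec and refresh the kubeconfig
kops update cluster --name "$NAME" --yes
kops export kubecfg --name "$NAME"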
Similar answers to this topic:
Recover kops Kubernetes cluster
How to restore kubernetes cluster using kops?

As mentioned by Rico in the earlier post, you can use Velero to back up your etcd using the CLI client (see the sketch after the links below). Another option to consider for the scenario you described is CAPE, which provides an easy-to-use control plane for Kubernetes multi-cluster app and data management via a friendly user interface.
See below for resources:
How to create an on-demand K8s Backup:
https://www.youtube.com/watch?v=MOPtRTeG8sw&list=PLByzHLEsOQEB01EIybmgfcrBMO6WNFYZL&index=7
How to Restore/Migrate K8s Backup to Another Cluster:
https://www.youtube.com/watch?v=dhBnUgfTsh4&list=PLByzHLEsOQEB01EIybmgfcrBMO6WNFYZL&index=10
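For reference, the Velero CLI flow mentioned above might look roughly like the following (the bucket, region, plugin version, and backup names are placeholders and depend on your provider and Velero version):

# Install Velero into the cluster, pointing it at an object-store bucket (AWS example, placeholder values)
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.8.0 \
  --bucket my-velero-bucket \
  --backup-location-config region=us-east-1 \
  --secret-file ./credentials-velero

# Create an on-demand backup of the whole cluster (or scope it with --include-namespaces)
velero backup create full-backup

# Later, on the same or another cluster pointed at the same bucket:
velero restore create --from-backup full-backup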

Related

Kubernetes Snapshots

I have an on-premise Rancher server with two clusters in it. Let's call them cluster A and cluster B. In cluster A, I am creating a DB snapshot and I need to copy that snapshot into cluster B. I am not a Kubernetes expert, so can someone help me with good ideas to achieve this, or some reference materials I could use to achieve this task?
There could be multiple ways to synchronize data between clusters. First off, see if your hypervisor supports volume replication. That way, you can copy data across to another volume and mount that volume into the applications in the secondary cluster.
Another approach would be to use Velero with Restic to back up the volumes to an object store (MinIO/S3) and then restore them in the second cluster, as shown in this example.
OpenEBS sounds like another viable option, but I haven't had a chance to work with it yet. Linstor is another solution I have heard of.
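A rough sketch of the Velero-with-Restic approach, assuming both clusters can reach the same object-store bucket (the bucket, MinIO URL, namespace, and pod/volume names below are placeholders; newer Velero versions replace the --use-restic flag with a node agent):

# In cluster A: install Velero with Restic support against a shared bucket (MinIO example)
velero install --provider aws --plugins velero/velero-plugin-for-aws:v1.8.0 \
  --bucket shared-backup-bucket \
  --backup-location-config region=minio,s3ForcePathStyle=true,s3Url=http://minio.example.com:9000 \
  --secret-file ./credentials-velero --use-restic

# Opt the pod's volumes into Restic backups, then create the backup
kubectl -n my-db-namespace annotate pod/my-db-0 backup.velero.io/backup-volumes=data
velero backup create db-backup --include-namespaces my-db-namespace

# In cluster B (Velero installed against the same bucket): restore it
velero restore create --from-backup db-backup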

Setting up a Kubernetes cluster and running a database

This is my proposed Kubernetes cluster. I want to be able to run a PostgreSQL database, with my nodes accessing a storage machine for storing the data. Is NFS a good option for this? How best can I run a database instance here?
I recommend using a Helm chart for deploying any kind of database; it is very handy and easy to deploy. See the link:
https://github.com/bitnami/charts/tree/master/bitnami/postgresql
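For example, deploying that Bitnami chart could look roughly like this (the release name and password value are placeholders; the value keys can differ between chart versions):

# Add the Bitnami repository and install the PostgreSQL chart
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-postgres bitnami/postgresql --set auth.postgresPassword=changeme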
Anyway, if you want to deploy PostgreSQL yourself, you first need to create a PersistentVolume (PV) and a PersistentVolumeClaim (PVC), because you chose NFS as your cluster storage solution. With plain NFS you have to create the PV and PVC manually, for example:
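A minimal sketch of a manually created NFS-backed PV and matching PVC (the server address, export path, and sizes are placeholders):

# Create an NFS-backed PersistentVolume and a claim that binds to it (placeholder values)
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: 192.168.1.100   # your NFS server
    path: /exports/postgres
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""      # bind to the pre-created PV, not a dynamic provisioner
  resources:
    requests:
      storage: 10Gi
EOF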
Kubernetes also has a StorageClass mechanism; it is better to use a volume plugin with an internal provisioner, such as GlusterFS or CephFS.
https://kubernetes.io/docs/concepts/storage/storage-classes/

HA PostgreSQL on Kubernetes

I wanted to deploy PostgreSQL as the database in my Kubernetes cluster. So far I have followed this tutorial.
By reading the whole thing, I understood that we claim static storage before starting PostgreSQL so that we still have the data in case the pod fails. We can also do replication by pointing to the same storage space to get our data back.
What happens if we use two worker nodes and the pod containing the database migrates to another node? I don't think local storage will work in that case.
A hostPath volume is not recommended for production use because it is tied to a single node: if the pod is rescheduled to another node, the storage does not move with it, and if the node is lost, the data is lost too.
For durable storage, use an external block or file storage system mounted on the nodes through a supported CSI driver.
For HA Postgres I suggest you explore the Postgres Operator, which delivers easy-to-run, highly available PostgreSQL clusters on Kubernetes (K8s), powered by Patroni. It is configured only through Postgres manifests (CRDs) to ease integration into automated CI/CD pipelines with no direct access to the Kubernetes API, promoting infrastructure as code over manual operations.
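Assuming the Zalando postgres-operator referenced above is installed, a cluster is declared with a small custom resource. A minimal sketch (the names, size, user, database, and PostgreSQL version are placeholders; check the operator documentation for the exact fields of your version):

# Declare a two-instance PostgreSQL cluster managed by the operator (placeholder values)
kubectl apply -f - <<EOF
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: acid-minimal-cluster
spec:
  teamId: acid
  numberOfInstances: 2
  volume:
    size: 5Gi
  users:
    app_user:
    - superuser
    - createdb
  databases:
    app_db: app_user
  postgresql:
    version: "15"
EOF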

how to recover from master failure with kubeadm

I set up a Kubernetes cluster with a single master node and two worker nodes using kubeadm, and I am trying to figure out how to recover from node failure.
When a worker node fails, recovery is straightforward: I create a new worker node from scratch, run kubeadm join, and everything's fine.
However, I cannot figure out how to recover from master node failure (without interrupting the deployments running on the worker nodes). Do I need to back up and restore the original certificates, or can I just run kubeadm init to create a new master from scratch? How do I join the existing worker nodes?
I ended up writing a Kubernetes CronJob that backs up the etcd data. If you are interested, I wrote a blog post about it: https://labs.consol.de/kubernetes/2018/05/25/kubeadm-backup.html
In addition to that, you may want to back up all of /etc/kubernetes/pki to avoid issues with secrets (tokens) having to be renewed.
For example, kube-proxy uses a secret to store a token and this token becomes invalid if only the etcd certificate is backed up.
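The core of such a backup, whether run by hand or from the CronJob, is roughly the following (kubeadm default paths are assumed; your certificate file names and backup location may differ):

# Take an etcd snapshot using the client certificates kubeadm generates (run on the master)
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-snapshot-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

# Back up the PKI directory as well, as mentioned above
tar czf /var/backups/kube-pki-$(date +%F).tar.gz /etc/kubernetes/pki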
Regarding the master backup: as far as I know, traditional/legacy backup procedures and tools are not mentioned directly in the official documentation, but you can take precautions with a few options/workarounds:
Set up HA masters (only for GCE):
  Set up High-Availability Kubernetes Masters
Set up an HA etcd cluster / master load balancer (see the kubeadm sketch below):
  Setting up an HA etcd cluster
  Set up master Load Balancer
  Operating etcd clusters for Kubernetes
OS file system snapshot/backup
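For the HA route, a very rough sketch of bootstrapping a kubeadm control plane behind a load balancer (the load-balancer endpoint is a placeholder, and the token/hash/key values are printed by kubeadm itself):

# First control-plane node: point the API server at the load balancer's address
kubeadm init --control-plane-endpoint "lb.example.com:6443" --upload-certs

# Additional control-plane nodes join with the --control-plane flag
# (the join command, token, and certificate key are printed by the init above)
kubeadm join lb.example.com:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>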
kubeadm init will definitely not work out of the box, as it will create a new cluster altogether: new credentials, IP space, etc.
At a minimum, restoring the master node will require a backup of your etcd data, which typically lives in the /var/lib/etcd directory.
You will also need the kubeadm config from the cluster; kubeadm config view should output this (v1.8 and upward).
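Put together, a rough sketch of how those pieces get used during a rebuild (the file names are placeholders, and the exact kubeadm/etcdctl steps vary by version):

# Assumes you saved these beforehand: kubeadm-config.yaml (from kubeadm config view),
# a tarball of /etc/kubernetes/pki, and an etcd snapshot (see the snapshot commands above).
# On the rebuilt master, restore them before re-running kubeadm init:
tar xzf kube-pki.tar.gz -C /
ETCDCTL_API=3 etcdctl snapshot restore etcd-snapshot.db --data-dir /var/lib/etcd
kubeadm init --config kubeadm-config.yaml \
  --ignore-preflight-errors=DirAvailable--var-lib-etcd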
The step-by-step procedure to restore a master node really isn't so clean cut, which is why HA (high availability) was introduced. That is a much safer way of maintaining redundancy and uptime, particularly because restoring anything from etcd can be a real pain (in my humble opinion and experience).
If I may go a bit off topic from your question: if you are still getting started with Kubernetes and not deeply invested in kubeadm, I would suggest you consider creating your cluster with kops instead. It supports HA already, and I found kops to be more robust and easier to use than either kubeadm or kube-aws (the CoreOS cluster builder).
https://kubernetes.io/docs/getting-started-guides/kops/

Use case for etcd inside Kubernetes

I was just wondering why it is useful to run an etcd cluster inside Kubernetes, when Kubernetes itself depends on etcd.
It just does not make sense to me: if I have HA Kubernetes, I am also forced to have HA etcd outside, hence no reason to install it again inside...
I have an external etcd that manages my k8s HA cluster, and I'm not letting any developer apps near it. I would be too concerned about something going wrong and breaking the k8s cluster. It is also a fixed size of 3, which works well for the cluster size and its requirements. If the developers need a key/value store for their applications and want etcd, running one inside the cluster (as sketched below) would be a great way to provide it for the applications. Since it runs as a StatefulSet, it is scalable.
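One quick way to stand up such an application-level etcd is a packaged chart; for example, the Bitnami etcd chart deploys it as a StatefulSet (the release name and replica count below are placeholders):

# Deploy a 3-node etcd StatefulSet for application use, separate from the control-plane etcd
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install app-etcd bitnami/etcd --set replicaCount=3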
If you're using Kubernetes via GKE, the underlying etcd cluster is not exposed in any way.