How to backup and restore a kubernetes master node? - kubernetes

There is a k8s single master node, I need to back it up and restore on a different server with different ip addresses. I googled this topic and found a solution -
https://elastisys.com/2018/12/10/backup-kubernetes-how-and-why/
Everything looked easy; so, I followed the instruction and got a copy of the certificates and a snapshot of the etcd database. Then I used the second script to restore the node on a different server. It did not go well this time. It gave me a bunch of errors related to mismatching the certificates and server's local ip addresses.
As far as I understood, when a kubernetes cluster is initializing, it creates a set of certificates assigned to the original server's ip addresses and I cannot just back it up and restore somewhere else.
So, how to backup a k8s master node and restore it?

Make sure that you added an extra flag to the kubeadm init command (--ignore-preflight-errors=DirAvailable--var-lib-etcd) to acknowledge that we want to use the pre-existing data.
Do the following steps:
replace the IP address in all config files in /etc/kubernetes
back up /etc/kubernetes/pki
identify certs in /etc/kubernetes/pki that have the old IP address
as an alt name - 1st step
delete both the cert and key for each of them (for me it was just
apiserver and etcd/peer)
regenerate the certs using kubeadm alpha phase certs - 2nd step
identify configmap in the kube-system namespace that referenced
the old IP - 3rd step
manually edit those configmaps
restart kubelet and docker (to force all containers to be
recreated)
1.
/etc/kubernetes/pki# for f in $(find -name "*.crt"); do openssl x509 -in $f -text -noout > $f.txt; done
/etc/kubernetes/pki# grep -Rl 12\\.34\\.56\\.78 .
./apiserver.crt.txt
./etcd/peer.crt.txt
/etc/kubernetes/pki# for f in $(find -name "*.crt"); do rm $f.txt; done
2.
/etc/kubernetes/pki# rm apiserver.crt apiserver.key
/etc/kubernetes/pki# kubeadm alpha phase certs apiserver
...
/etc/kubernetes/pki# rm etcd/peer.crt etcd/peer.key
/etc/kubernetes/pki# kubeadm alpha phase certs etcd-peer
...
3.
$ kubectl -n kube-system get cm -o yaml | less
...
$ kubectl -n kube-system edit cm ...
Take a look here: master-backup.
UPDATE:
During replacing master nodes and changing IP you cannot contact the api-server to change the configmaps in step 4. Moreover if you have single master k8s cluster connection between worker nodes will be interrupted till new master will be up.
To ensure connection between master and worker nodes during master replacement you have to create HA cluster.
The certificate is signed for {your-old-IP-here} and secure communication can't then happen to {your-new-ip-here}
You can add more IPs in the certificate in beforehand though...
The api-server certificate is signed for hostname kubernetes, so you can add that as an alias to the new IP address in /etc/hosts then do kubectl --server=https://kubernetes:6443 .... .

Related

No pods started after "kubeadm alpha certs renew"

I did a
kubeadm alpha certs renew
but after that, no pods get started. When starting from a Deployment, kubectl get pod doesn't even list the pod, when explicitly starting a pod, it is stuck on Pending.
What am I missing?
Normally I would follow a pattern to debug such issues starting with:
Check all the certificate files are rotated by kubeadm using sudo cat /etc/kubernetes/ssl/apiserver.crt | openssl x509 -text.
Make sure all the control plane services (api-server, controller, scheduler etc) have been restarted to use the new certificates.
If [1] and [2] are okay you should be able to do kubectl get pods
Now you should check the certificates for kubelet and make sure you are not hitting https://github.com/kubernetes/kubeadm/issues/1753
Make sure kubelet is restarted to use the new certificate.
I think of control plane (not being able to do kubectl) and kubelet (node status not ready, should see certificates attempts in api-server logs from the node) certificates expiry separately so I can quickly tell which might be broken.

Is phase kubeconfig required after phase certs in kubeadm?

I've recently upgraded with kubeadm, which I expect to rotate all certificates, and for good measure, I also ran kubeadm init phase certs all, but I'm not sure what steps are required to verify that the certs are all properly in place and not about to expire.
I've seen a SO answer reference kubeadm init phase kubeconfig all is required in addition, but cannot find in the kubernetes kubeadm documentation telling me that it needs to be used in conjunction with phase certs.
What do I need to do to make sure that the cluster will not encounter expired certificates?
I've tried verifying by connecting to the secure local port: echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not, which gives me expirations next month.
While openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text and openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -text yield dates over a year in advance.
These conflicting dates certainly have me concerned that I will find myself like many others with expired certificates. How do I get in front of that?
Thank you for any guidance.
In essence kubeadm init phase certs all regenerates all your certificates including your ca.crt (Certificate Authority), and Kubernetes components use certificate-based authentication to connect to the kube-apiserver (kubelet, kube-scheduler, kube-controller-manager) so you will also have to regenerate pretty much all of those configs by running kubeadm init phase kubeconfig all
Keep in mind that you will have to regenerate the kubelet.conf on all your nodes since it also needs to be updated to connect to the kube-apiserver with the new ca.crt. Also, make sure you add all your hostnames/IP addresses that your kube-apiserver is going to serve on to the kubeadm init phase certs all command (--apiserver-cert-extra-sans)
Most likely you are not seeing the updated certs when connecting through openssl is because you haven't restarted the Kubernetes components and in particular the kube-apiserver. So you will have to start your kube-apiserver, kube-scheduler, kube-controller-manager, etc (or kube-apiservers, kube-schedulers, etc if you are running a multi-master control plane) You will also have to restart your kubelets on all your nodes.
A month later, I've learned a little more and wanted to update this question for those who follow behind me.
I filed an issue on Kubernetes requesting more information on how the kubeadm upgrade process automatically updates certificates. The documentation on Kubernetes says:
Note: kubelet.conf is not included in the list above because kubeadm configures kubelet for automatic certificate renewal.
After upgrading, I did not see an automatic cert renewal for the kubelet. I was then informed that:
the decision on when to rotate the certificate is non-deterministic and it may happen 70 - 90% of the total lifespan of the certificate to prevent overlap on node cert rotations.
They also provided the following process, which resolved my last outstanding certificate rotation:
sudo mv /var/lib/kubelet/pki /var/lib/kubelet/pki-backup
sudo systemctl restart kubelet
# the pki folder should be re-created.

Set authentication-token-webhook-config-file

My goal is to set appscode guard application.
In order to so i need to set the value of authentication-token-webhook-config-file flag in Kubernetes api server.
How to do that ?
If you are looking for the way to add an option key to kube-apiserver pod on existing cluster, you just need to edit file /etc/kubernetes/manifests/kube-apiserver.yaml on master node.
After saving this file, kube-apiserver pod will be restarted by kubelet service automatically.
Considering that flag you've mentioned has to have name of the configuration file as parameter, ensure the file exists on the master node file system.
--authentication-token-webhook-config-file string
File with webhook configuration for token authentication in kubeconfig format. The API server will query the remote service to determine authentication for bearer tokens.
The directory for the manifests is defined by kubelet option --pod-manifest-path and can be found using command:
$ ps aux | grep kubelet
You can find more information about life cycle of such pods in Kubernetes documentation

Get kubeconfig by ssh into cluster

If I am able to SSH into the master or any nodes in the cluster, is it possible for me to get 1) the kubeconfig file or 2) all information necessary to compose my own kubeconfig file?
You could find configuration on master node under /etc/kubernetes/admin.conf (on v1.8+).
On some versions of kubernetes, this can be found under ~/.kube
I'd be interested in hearing the answer to this as well. But I think it depends on how the authentication is set up. For example,
Minikube uses "client certificate" authentication. If it stores the client.key on the cluster as well, you might construct a kubeconfig file by combining it with the cluster’s CA public key.
GKE (Google Kubernetes Engine) uses authentication on a frontend that's separate from the Kubernetes cluster (masters are hosted separately). You can't ssh into the master, but if it was possible, you still might not be able to construct a token that works against the API server.
However, by default Pods have a service account token that can be used to authenticate to Kubernetes API. So if you SSH into a node and run docker exec into a container managed by Kubernetes, you will see this:
/ # ls run/secrets/kubernetes.io/serviceaccount
ca.crt namespace token
You can combine ca.crt and token to construct a kubeconfig file that will authenticate to the Kubernetes master.
So the answer to your question is yes, if you SSH into a node, you can then jump into a Pod and collect information to compose your own kubeconfig file. (See this question on how to disable this. I think there are solutions to disable it by default as well by forcing RBAC and disabling ABAC, but I might be wrong.)

Cannot create priviledged containers

I am using instructions from https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/docker-multinode.md to setup a multinode kubernetes cluster on vmware vcloud infrastructure.
I was able to get the cluster working but when I tried the nfs example I was not able to create the nfs container. So I recreated all the VMs and rebuilt kubernetes from source using:
git clone https://github.com/kubernetes/kubernetes.git
cd kubernetes
sed -i 's/allow_privileged: .*/allow_privileged: true/g' cluster/saltbase/pillar/privilege.sls
./build/run.sh hack/build-cross.sh
cp _output/dockerized/bin/linux/$(dpkg --print-architecture)/kubectl /usr/local/bin
chmod +x /usr/local/bin/kubectl
and continued to setup the kubernetes cluster and retried the NFS example and I get the following error:
kubectl create -f nfs-server-pod.yaml
The Pod "nfs-server" is invalid.
spec.containers[0].securityContext.privileged: forbidden '<*>(0xc20931650c)true'
I tried with both the master and 1.0.3 release and had the same result.
Can you please tell me how to resolve this issue and Thanks for your support
We thought that turning privileged containers off by default would be good for security. It turns out to just be a pain point for a lot of people, so we're working to turn it on by default in kubernetes v1.1.
The --allow-privileged flag has to be set on both the kubelet and the apiserver - please check that