How to convert a Kubernetes non-HA control plane into an HA control plane?

What is the best way to convert a kubernetes non-HA control plane into an HA control plane?
I started my cluster as a non-HA control plane: one master node and several worker nodes. The cluster is already running a lot of services.
Now I would like to add additional master nodes to convert my cluster into an HA control plane. I have set up and configured a load balancer.
But I could not figure out how to change the --control-plane-endpoint of my existing master node to my load balancer's IP address.
Calling kubeadm results in the following error:
sudo kubeadm init --control-plane-endpoint "my-load-balancer:6443" --upload-certs
[init] Using Kubernetes version: v1.20.1
[preflight] Running pre-flight checks
[WARNING SystemVerification]: missing optional cgroups: hugetlb
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
The error message seems to be clear as my master is already running.
Is there a way to easily tell my existing master node to use the new load balancer and run as an HA control plane?

Best solution in my opinion
The best approach to convert a non-HA control plane into an HA control plane is to create a completely new HA control plane and then migrate all your applications to it.
Possible solution
Below I will try to help you achieve your goal, but I do not recommend using this procedure on any cluster that will ever be considered a production cluster. It worked for my scenario, and it might help you as well.
Update the kube-apiserver certificate
First of all, the kube-apiserver uses a certificate to encrypt control plane traffic, and this certificate has something known as a SAN (Subject Alternative Name).
The SAN is a list of IP addresses and hostnames that can be used to access the API, so you need to add your load balancer's IP address, and probably its hostname, to it.
To do that, first get the current kubeadm configuration, e.g. with:
$ kubeadm config view > kubeadm-config.yaml
(If your kubeadm version no longer ships config view, the same data lives in the kubeadm-config ConfigMap and can be dumped with kubectl -n kube-system get cm kubeadm-config -o yaml.)
Then add certSANs to the kubeadm-config.yaml file under the apiServer section so that it looks like the example below; you may also need to add controlPlaneEndpoint to point to your LB.
apiServer:
  certSANs:
  - "192.168.0.2"  # your LB address
  - "loadbalancer" # your LB hostname
  extraArgs:
    authorization-mode: Node,RBAC
  ...
controlPlaneEndpoint: "loadbalancer" # your LB DNS name or DNS CNAME
...
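If you want to check which SANs the current apiserver certificate already contains, you can inspect it with openssl (standard kubeadm path shown):
$ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A1 'Subject Alternative Name'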
Now you can regenerate the kube-apiserver cert, BUT remember that you must first delete/move the old kube-apiserver cert and key out of /etc/kubernetes/pki/:
$ kubeadm init phase certs apiserver --config kubeadm-config.yaml
Finally, restart your kube-apiserver.
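For reference, the whole sequence might look like this on a default kubeadm layout (the backup directory is just an example):
$ sudo mkdir -p /root/pki-backup
$ sudo mv /etc/kubernetes/pki/apiserver.{crt,key} /root/pki-backup/
$ sudo kubeadm init phase certs apiserver --config kubeadm-config.yaml
# kube-apiserver runs as a static pod; moving its manifest out of the manifests
# directory and back forces the kubelet to recreate it with the new certificate
$ sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/ && sleep 20 && sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/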
Update the kubelet, the scheduler and the controller manager kubeconfig files
The next step is to update the kubelet, the scheduler and the controller manager to communicate with the kube-apiserver through the load balancer.
All three of these components use standard kubeconfig files:
/etc/kubernetes/kubelet.conf, /etc/kubernetes/scheduler.conf, /etc/kubernetes/controller-manager.conf to communicate with kube-apiserver.
The only thing to do is to edit the server: line in each file to point to the LB instead of the kube-apiserver directly, and then restart these components.
The kubelet is a systemd service, so restart it with:
systemctl restart kubelet
The controller manager and scheduler run as static pods, so restart them by temporarily moving their manifests out of /etc/kubernetes/manifests/ and back, as sketched below.
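A rough sketch of these edits and restarts, assuming the example LB hostname loadbalancer from above (double-check the sed pattern against your actual server: lines first):
$ sudo sed -i 's#server: https://.*:6443#server: https://loadbalancer:6443#' \
    /etc/kubernetes/kubelet.conf /etc/kubernetes/scheduler.conf /etc/kubernetes/controller-manager.conf
$ sudo systemctl restart kubelet
# the scheduler and controller manager are static pods; moving their manifests out
# and back makes the kubelet restart them with the updated kubeconfig files
$ sudo mv /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml /tmp/
$ sleep 20
$ sudo mv /tmp/kube-scheduler.yaml /tmp/kube-controller-manager.yaml /etc/kubernetes/manifests/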
Update the kube-proxy kubeconfig files
Next it is time to update the kubeconfig file for kube-proxy, and same as before, the only thing to do is to edit the server: line to point to the load balancer instead of the kube-apiserver directly.
This kubeconfig is in fact a configmap, so you can edit it directly using:
$ kubectl edit cm kube-proxy -n kube-system
or first save it as manifest file:
$ kubectl get cm kube-proxy -n kube-system -o yaml > kube-proxy.yml
and then apply changes.
Don't forget to restart kube-proxy after these changes.
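For example, using the manifest route (kubectl rollout restart requires kubectl v1.15 or newer; on older versions delete the kube-proxy pods instead):
$ kubectl -n kube-system get cm kube-proxy -o yaml > kube-proxy.yml
# edit the server: line in kube-proxy.yml to point at the load balancer, then:
$ kubectl apply -f kube-proxy.yml
$ kubectl -n kube-system rollout restart daemonset kube-proxy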
Update the kubeadm-config configmap
Finally, upload the new kubeadm-config ConfigMap (with the certSANs and controlPlaneEndpoint entries) to the cluster; this is especially important when you want to add new nodes to the cluster.
$ kubeadm config upload from-file --config kubeadm-config.yaml
If the command above doesn't work, try this:
$ kubeadm upgrade apply --config kubeadm-config.yaml

Related

adding master to Kubernetes cluster: cluster doesn't have a stable controlPlaneEndpoint address

How can I add a second master to the control plane of an existing Kubernetes 1.14 cluster?
The available documentation apparently assumes that both masters (in stacked control plane and etcd nodes) are created at the same time. I have created my first master already a while ago with kubeadm init --pod-network-cidr=10.244.0.0/16, so I don't have a kubeadm-config.yaml as referred to by this documentation.
I have tried the following instead:
kubeadm join ... --token ... --discovery-token-ca-cert-hash ... \
--experimental-control-plane --certificate-key ...
The part kubeadm join ... --token ... --discovery-token-ca-cert-hash ... is what is suggested when running kubeadm token create --print-join-command on the first master; it normally serves for adding another worker. --experimental-control-plane is for adding another master instead. The key in --certificate-key ... is as suggested by running kubeadm init phase upload-certs --experimental-upload-certs on the first master.
I receive the following errors:
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver.
The recommended driver is "systemd". Please follow the guide at
https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight:
One or more conditions for hosting a new control plane instance is not satisfied.
unable to add a new control plane instance a cluster that doesn't have a stable
controlPlaneEndpoint address
Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.
What does it mean for my cluster not to have a stable controlPlaneEndpoint address? Could this be related to controlPlaneEndpoint in the output from kubectl -n kube-system get configmap kubeadm-config -o yaml currently being an empty string? How can I overcome this situation?
As per HA - Create load balancer for kube-apiserver:
In a cloud environment you should place your control plane nodes behind a TCP forwarding load balancer. This load balancer distributes traffic to all healthy control plane nodes in its target list. The health check for an apiserver is a TCP check on the port the kube-apiserver listens on (default value :6443).
The load balancer must be able to communicate with all control plane nodes on the apiserver port. It must also allow incoming traffic on its listening port.
Make sure the address of the load balancer always matches the address of kubeadm’s ControlPlaneEndpoint.
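Before pointing kubeadm at the load balancer, you can check from a node that the LB is actually listening and forwarding on the apiserver port, for example with netcat (placeholders as in the config example below):
$ nc -v LOAD_BALANCER_DNS LOAD_BALANCER_PORT
A connection refused error is expected while no apiserver is running behind the LB yet; a timeout usually points to a firewall or LB misconfiguration.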
To set ControlPlaneEndpoint config, you should use kubeadm with the --config flag. Take a look here for a config file example:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT"
Kubeadm config file examples are scattered across many documentation sections. I recommend that you read the /apis/kubeadm/v1beta1 GoDoc, which has fully populated examples of the YAML files used by multiple kubeadm configuration types.
If you are configuring a self-hosted control-plane, consider using the kubeadm alpha selfhosting feature:
[..] key components such as the API server, controller manager, and scheduler run as DaemonSet pods configured via the Kubernetes API instead of static pods configured in the kubelet via static files.
This PR (#59371) may clarify the differences of using a self-hosted config.
You need to copy the certificates (etcd, API server, CA, etc.) from the existing master and place them on the second master.
Then run kubeadm init; since the certs are already present, the cert creation step is skipped and the rest of the cluster initialization steps resume.
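For reference, on a default kubeadm install the certificates and keys that have to be shared between control plane nodes are usually these:
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/ca.key
/etc/kubernetes/pki/sa.key
/etc/kubernetes/pki/sa.pub
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-ca.key
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/ca.key (not needed when using an external etcd)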

Something seems to be catching TCP traffic to pods

I'm trying to deploy Kubernetes with Calico (IPIP) using kubeadm. After the deployment is done I'm deploying Calico using these manifests:
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
Before applying it, I'm editing CALICO_IPV4POOL_CIDR and setting it to 10.250.0.0/17 as well as using command kubeadm init --pod-cidr 10.250.0.0/17.
After a few seconds the CoreDNS pods (getting, for example, the address 10.250.2.2) start restarting with the error 10.250.2.2:8080 connection refused.
Now a bit of digging:
from any node in the cluster, ping 10.250.2.2 works and reaches the pod (tcpdump in the pod's network namespace shows it)
from a different pod (on a different node), curl 10.250.2.2:8080 works fine
from any node, curl 10.250.2.2:8080 fails with connection refused
Because it's a CoreDNS pod it listens on port 53 over both UDP and TCP, so I've also tried netcat from the nodes:
nc 10.250.2.2 53 - connection refused
nc -u 10.250.2.2 55 - works
Now I've run tcpdump on each interface on the source node for port 8080, and the curl to the CoreDNS pod doesn't even seem to leave the node... so, iptables?
I've also tried weave, canal and flannel, all seem to have same issue.
I've ran out of ideas by now...any pointers please?
This seems to be a problem with the Calico setup; CoreDNS Pods depend on the CNI network Pods functioning successfully.
For a proper CNI network plugin installation you have to pass the --pod-network-cidr flag to the kubeadm init command and afterwards apply the same value to the CALICO_IPV4POOL_CIDR parameter inside calico.yaml.
Moreover, for a successful Pod network installation you have to apply some RBAC rules in order to grant sufficient permissions in compliance with general cluster security restrictions, as described in the official Kubernetes documentation:
For Calico to work correctly, you need to pass --pod-network-cidr=192.168.0.0/16 to kubeadm init or update the calico.yml file to match your Pod network. Note that Calico works on amd64 only.
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
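If you keep the 10.250.0.0/17 range from the question, a rough sketch of matching the Calico pool to it could look like this (assuming the downloaded manifest still carries the default 192.168.0.0/16 pool):
$ kubeadm init --pod-network-cidr=10.250.0.0/17
$ curl -O https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
$ sed -i 's#192.168.0.0/16#10.250.0.0/17#' calico.yaml
$ kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
$ kubectl apply -f calico.yaml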
In your case I would switch to the latest Calico version rather than v3.3 as given in the example.
If you are sure that you ran the Pod network plugin installation properly, please update the question with your current environment setup and the versions and health status of your Kubernetes components.

Kubernetes: Trying to add second master node to K8S master using stacked control plane instructions

I was following instructions at https://kubernetes.io/docs/setup/independent/high-availability/#stacked-control-plane-and-etcd-nodes and I can't get the secondary master node to join the primary master.
$> kubeadm join LB_IP:6443 --token TOKEN --discovery-token-ca-cert-hash sha256:HASH --experimental-control-plane
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "LB_IP:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://LB_IP:6443"
[discovery] Requesting info from "https://LB_IP:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "LB_IP:6443"
[discovery] Successfully established connection with API Server "LB_IP:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
One or more conditions for hosting a new control plane instance is not satisfied.
unable to add a new control plane instance on a cluster that doesn't use an external etcd
Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The cluster uses an external etcd.
* The certificates that must be shared among control plane instances are provided.
Here is my admin init config:
apiVersion: kubeadm.k8s.io/v1alpha3
kind: ClusterConfiguration
kubernetesVersion: "1.12.3"
apiServer:
  certSANs:
  - "LB_IP"
controlPlaneEndpoint: "LB_IP:6443"
networking:
  podSubnet: "192.168.128.0/17"
  serviceSubnet: "192.168.0.0/17"
And I initialized the primary master node like:
kubeadm init --config=./kube-adm-config.yaml
I have also copied all the certs to the secondary node and kubectl works on the secondary:
[root@secondary ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
primary.fqdn Ready master 8h v1.12.3
I was really hoping to not set up external etcd nodes. The instructions seem pretty straightforward and I don't understand what I am missing.
Any advice to help get this stacked control plane multi-master setup with local etcd to work would be appreciated. Or any debugging ideas. Or at least "stacked control plane doesn't work, you must use external etcd".
Upgrading to k8s version 1.13.0 resolved my issue. I think the instructions were specifically for this newer version.
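For anyone doing this on a newer kubeadm release: the experimental flags have since been renamed, so the equivalent sequence is roughly as follows (verify the exact flag names against your kubeadm version):
# on the first master
$ kubeadm init phase upload-certs --upload-certs
# on the joining master
$ kubeadm join LB_IP:6443 --token TOKEN --discovery-token-ca-cert-hash sha256:HASH --control-plane --certificate-key KEY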

My kubernetes cluster IP address changed and now kubectl will no longer connect

Running under Ubuntu, I used kubeadm init to set up my cluster (master node) and copied /etc/kubernetes/admin.conf to $HOME/.kube/config, and all was well when using kubectl.
However, after a reboot my master node's IP address changed, which is not the same as what is in $HOME/.kube/config, so now kubectl can no longer connect.
So how do I regenerate the admin.conf now that I have a new IP address? Running kubeadm init will just kill everything which is not what I want.
I found this solution on the internet and it works for me:
systemctl stop kubelet docker
cd /etc/
mv kubernetes kubernetes-backup
mv /var/lib/kubelet /var/lib/kubelet-backup
mkdir -p kubernetes
cp -r kubernetes-backup/pki kubernetes
rm kubernetes/pki/{apiserver.*,etcd/peer.*}
systemctl start docker
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd
#Run "kubeadm reset" on all nodes if was this error "error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists"
cp kubernetes/admin.conf ~/.kube/config
kubectl get nodes --sort-by=.metadata.creationTimestamp
kubectl delete node $(kubectl get nodes -o jsonpath='{.items[?(@.status.conditions[0].status=="Unknown")].metadata.name}')
kubectl get pods --all-namespaces
After these steps, join your worker nodes to the master again.
Reference: https://medium.com/@juniarto.samsudin/ip-address-changes-in-kubernetes-master-node-11527b867e88
The following command can be used to regenerate admin.conf
kubeadm alpha phase kubeconfig admin --apiserver-advertise-address <new_ip>
However, if you use an IP instead of a hostname, your API server certificate will be invalid. So either regenerate your certs (kubeadm alpha phase certs renew apiserver), use hostnames instead of IPs, or add the insecure --insecure-skip-tls-verify flag when using kubectl.
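On newer kubeadm releases the alpha phases mentioned above have been promoted, so the rough equivalents would be something like this (check kubeadm --help on your version before running them):
$ kubeadm init phase kubeconfig admin --apiserver-advertise-address <new_ip>
$ kubeadm certs renew apiserver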
You do not want to use kubeadm reset. That will reset everything and you would have to start configuring your cluster again.
Well, in your scenario, please have a look at the steps below:
nano /etc/hosts (update your new IP against YOUR_HOSTNAME)
nano /etc/kubernetes/config (configuration settings related to your cluster); in this file, look for the following params and update them accordingly:
KUBE_MASTER="--master=http://YOUR_HOSTNAME:8080"
KUBE_ETCD_SERVERS="--etcd-servers=http://YOUR_HOSTNAME:2379" #2379 is default port
nano /etc/etcd/etcd.conf (conf related to etcd)
KUBE_ETCD_SERVERS="--etcd-servers=http://YOUR_HOSTNAME/WHERE_EVER_ETCD_HOSTED:2379"
2379 is the default port for etcd, and you can have multiple etcd servers defined here, comma separated.
Restart the kubelet, apiserver and etcd services.
It is good to use hostnames instead of IPs to avoid such scenarios.
Hope it helps!

How to change name of a kubernetes node

I have a running node in a kubernetes cluster. Is there a way I can change its name?
I have tried to
delete the node using kubectl delete
change the name in the node's manifest
add the node back.
But the node won't start.
Anyone know how it should be done?
Thanks
Usually it's the kubelet that is responsible for registering the node under a particular name, so you should make changes to your node's kubelet configuration and then it should pop up as a new node.
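For a kubeadm-based node, a rough sketch of that would be setting the kubelet's --hostname-override flag via KUBELET_EXTRA_ARGS (the file location below is just the common Debian/Ubuntu one; RHEL-based systems typically use /etc/sysconfig/kubelet):
# /etc/default/kubelet (example location)
KUBELET_EXTRA_ARGS=--hostname-override=new-node-name
then:
$ sudo systemctl restart kubelet
Note that, as described below, the old node object still has to be removed and the node rejoined for the new name to be usable.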
Changing a node's name is not possible at the moment; it requires you to remove and rejoin the node.
You need to make sure the hostname is changed to the new name, then remove the node, reset it and rejoin it.
(You will notice that with the command kubectl edit node you will get an error if you try to save a changed name:
A copy of your changes has been stored to "/tmp/kubectl-edit-qlh54.yaml"
error: At least one of apiVersion, kind and name was changed
)
Ideally you will have removed the running pods on it first.
You can try to run kubectl drain <node_name_to_rename>; proceed at your own risk if that doesn't complete. --ignore-daemonsets can be used to ignore pods that cannot be evicted, as in the example below.
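For example (the flag for wiping emptyDir data is --delete-local-data on older kubectl versions and --delete-emptydir-data on newer ones):
$ kubectl drain <node_name_to_rename> --ignore-daemonsets --delete-emptydir-data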
In short, for a node that has been renamed and is out of the cluster on CentOS 7:
kubectl delete node <original-nodename>
Then on the node that you want to rejoin, as root:
kubeadm reset
check the output and see if it applies to your setup (for potential further cleanup).
Now generate the join command on the master node:
export KUBECONFIG=/etc/kubernetes/admin.conf #(or wherever you have it)
kubeadm token create --print-join-command
Run the output on the worker node you have just reset:
kubeadm join <masternode_ip_address>:6443 --token somegeneratedtoken --discovery-token-ca-cert-hash sha256:somesha256hashthatyougotfromtheabovecommand
If you run kubectl get nodes, the node should now show up with its new name.
The output in my case:
W0220 10:43:23.286109 11473 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Enjoy your renamed node!
Based on source: https://www.youtube.com/watch?v=TqoA9HwFLVU