Unauthorized issues when adding new kubernetes master - kubernetes

I am trying to add a new master, and I copied the certs and keys (e.g. /etc/kubernetes/pki/apiserver-kubelet-client.crt) from the current master to the new one. I noticed that after I run 'kubeadm init --config=config.yaml' this key (probably all of them) changes, even though kubeadm init itself is successful. Why is this happening, and could it be the root cause of my new master being in NotReady status?
systemctl status kubelet shows a lot of errors like Failed to list *v1.Node: Unauthorized and Failed to list *v1.Secret: Unauthorized.
docker@R90HE73F:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master-0 Ready master 7d1h v1.13.1
k8s-master-1 Ready master 7d v1.13.1
k8s-master-2 NotReady master 104m v1.13.1
k8s-worker-0 Ready <none> 7d v1.13.1
k8s-worker-1 Ready <none> 7d v1.13.1
k8s-worker-2 Ready <none> 7d v1.13.1
Btw, the etcd cluster is healthy.

To add a new master to a kubernetes cluster, you need to copy four files from your existing kubernetes master's certificate directory before running kubeadm init on the new master: ca.crt, ca.key, sa.pub and sa.key. Copy these files into the /etc/kubernetes/pki folder on the new master. If you don't copy the sa* files, your new kubernetes master will end up in the NotReady state and show exactly those Unauthorized errors.
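For example, a rough sketch of copying those files over SSH (new-master is a hypothetical hostname for your new node, and root SSH access is assumed):
# Run on the existing master; new-master is a placeholder hostname
ssh root@new-master "mkdir -p /etc/kubernetes/pki"
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.pub,sa.key} root@new-master:/etc/kubernetes/pki/
# Then run kubeadm init --config=config.yaml on the new master as usual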
For more information on how to set up a multi-master kubernetes cluster, please check out my blog on kubernetes high availability:
https://velotio.com/blog/2018/6/15/kubernetes-high-availability-kubeadm

Related

How to constrain the kubectl kubeconfig to only display the worker node, not the master node

When using the kubelet kubeconfig, only the worker nodes are displayed and the master node is not, like in the following output on an AWS EKS worker node:
kubectl get node --kubeconfig /var/lib/kubelet/kubeconfig
NAME STATUS ROLES AGE VERSION
ip-172-31-12-2.ap-east-1.compute.internal Ready <none> 30m v1.18.9-eks-d1db3c
ip-172-31-42-138.ap-east-1.compute.internal Ready <none> 4m7s v1.18.9-eks-d1db3c
For certain reasons, I need to hide the information of the other worker and master nodes, and only display the information of the worker node where the kubectl command is currently executed.
What should I do?
I really appreciate your help.

Kubernetes nodes get status "NotReady" if I turn off one of the masters

I have a Kubernetes cluster of 3 masters and 2 nodes in a VM cloud on CentOS 7:
[root@kbn-mst-02 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kbn-mst-01 Ready master 15d v1.18.3
kbn-mst-02 Ready master 14d v1.18.3
kbn-mst-03 Ready master 14d v1.18.3
kbn-wn-01 Ready <none> 25h v1.18.5
kbn-wn-02 Ready <none> 150m v1.18.5
If I turn off kbn-mst-03 (212.46.30.7), then kbn-wn-01 and kbn-wn-02 get status NotReady:
[root@kbn-mst-02 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kbn-mst-01 Ready master 15d v1.18.3
kbn-mst-02 Ready master 14d v1.18.3
kbn-mst-03 NotReady master 14d v1.18.3
kbn-wn-01 NotReady <none> 25h v1.18.5
kbn-wn-02 NotReady <none> 154m v1.18.5
The log on kbn-wn-02 shows lost connection to 212.46.30.7:
Jul 3 09:28:10 kbn-wn-02 kubelet: E0703 09:28:10.295233 12339 kubelet_node_status.go:402] Error updating node status, will retry: error getting node "kbn-wn-02": Get https://212.46.30.7:6443/api/v1/nodes/kbn-wn-02?resourceVersion=0&timeout=10s: context deadline exceede
Turning off other masters doesn't change the status of nodes.
Why does kbn-wn-02 have a hard bind to kbn-mst-03 (212.46.30.7) and how can I change it?
Currently your worker nodes only know about the kbn-mst-03 master. When that master is turned off, the kubelet on the worker nodes cannot send the health status and metrics of the worker node to the master node kbn-mst-03, and hence you see the worker nodes as NotReady. This is also the reason why turning off the other masters does not change the status of the nodes: they are not known to, and are never contacted by, the kubelet on the worker nodes.
You should use a load balancer in front of the masters and use the load balancer endpoint while creating the worker nodes. Then, if one master is turned off, the other master nodes can still handle requests because the load balancer will stop sending traffic to the failed master and route it to the others.
How you change the hard bind to one master and move to the load balancer endpoint will depend on what tool you used to set up the kubernetes cluster. If you are using kubeadm, you can specify a load balancer endpoint in kubeadm init on the master nodes and use that same endpoint in kubeadm join on the worker nodes, as sketched below the quoted docs.
From the kubeadm docs here:
--control-plane-endpoint can be used to set the shared endpoint for all control-plane nodes. --control-plane-endpoint allows both IP addresses and DNS names that can map to IP addresses. Please contact your network administrator to evaluate possible solutions with respect to such mapping.
Here is an example mapping:
192.168.0.102 cluster-endpoint
Where 192.168.0.102 is the IP address of this node and cluster-endpoint is a custom DNS name that maps to this IP. This will allow you to pass --control-plane-endpoint=cluster-endpoint to kubeadm init and pass the same DNS name to kubeadm join. Later you can modify cluster-endpoint to point to the address of your load balancer in a high-availability scenario.
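For example, a rough sketch of what that looks like with kubeadm (cluster-endpoint is a hypothetical DNS name pointing at the load balancer, and the token, hash and certificate key are placeholders taken from the kubeadm init output):
# On the first master: advertise the shared load balancer endpoint, not the node IP
kubeadm init --control-plane-endpoint=cluster-endpoint:6443 --upload-certs
# On the additional masters
kubeadm join cluster-endpoint:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> --control-plane --certificate-key <key>
# On the worker nodes: join through the same shared endpoint
kubeadm join cluster-endpoint:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>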

Kubernetes HA master set up

I have made an HA Kubernetes cluster. First I added a node and then joined another node in the master role.
I basically did the multi-etcd setup, which worked fine for me, and the failover testing also worked fine. The problem is that once I was done working, I drained and deleted the other node and then shut down the other machine (a VM on GCP). After that my kubectl commands don't work... Let me share the steps:
kubectl get node (when the multi-node setup was running)
NAME STATUS ROLES AGE VERSION
instance-1 Ready <none> 17d v1.15.1
instance-3 Ready <none> 25m v1.15.1
masternode Ready master 18d v1.16.0
kubectl get node (when I shut down my other node)
root@masternode:~# kubectl get nodes
The connection to the server k8smaster:6443 was refused - did you specify the right host or port?
Any clue?
After rebooting the server you need to do the following steps:
sudo -i
swapoff -a
exit
strace -eopenat kubectl version
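Note that swapoff -a only lasts until the next reboot. A minimal sketch to keep swap off permanently, assuming the swap entry lives in /etc/fstab:
sudo swapoff -a
# Comment out the swap line so it is not re-enabled after a reboot
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab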

kubernetes worker node in "NotReady" status

I am trying to set up my first cluster using Kubernetes 1.13.1. The master got initialized okay, but both of my worker nodes are NotReady. kubectl describe node shows that Kubelet stopped posting node status on both worker nodes. On one of the worker nodes I get log output like
> kubelet[3680]: E0107 20:37:21.196128 3680 kubelet.go:2266] node
> "xyz" not found.
Here are the full details:
I am using CentOS 7 & Kubernetes 1.13.1.
Initialization was done as follows:
[root@master ~]# kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.142.0.0/24
Successfully initialized the cluster:
You can now join any number of machines by running the following on each node
as root:
`kubeadm join 10.142.0.4:6443 --token y0epoc.zan7yp35sow5rorw --discovery-token-ca-cert-hash sha256:f02d43311c2696e1a73e157bda583247b9faac4ffb368f737ee9345412c9dea4`
Deployed the flannel CNI:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
The join command worked fine.
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node01" as an annotation
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
Result of kubectl get nodes:
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 9h v1.13.1
node01 NotReady <none> 9h v1.13.1
node02 NotReady <none> 9h v1.13.1
On both nodes:
[root@node01 ~]# service kubelet status
Redirecting to /bin/systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2019-01-08 04:49:20 UTC; 32s ago
Docs: https://kubernetes.io/docs/
Main PID: 4224 (kubelet)
Memory: 31.3M
CGroup: /system.slice/kubelet.service
└─4224 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfi
`Jan 08 04:54:10 node01 kubelet[4224]: E0108 04:54:10.957115 4224 kubelet.go:2266] node "node01" not found`
I would appreciate your advice on how to troubleshoot this.
The previous answer sounds correct. You can verify that by running kubectl describe node node01 on the master, or wherever kubectl is correctly configured.
It seems like the reason for this error is an incorrect subnet. The Flannel documentation states that you should use /16, not /24, for the pod network:
NOTE: If kubeadm is used, then pass --pod-network-cidr=10.244.0.0/16 to kubeadm init to ensure that the podCIDR is set.
I tried to run kubeadm with /24 and although the nodes were in the Ready state, the flannel pods did not run properly, which resulted in some issues.
You can check whether your flannel pods are running properly by running:
kubectl get pods -n kube-system
If the status is anything other than Running, something is wrong. In that case you can check the details by running kubectl describe pod PODNAME -n kube-system. Try changing the subnet and update us if that fixes the problem.
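For example, a rough sketch of re-creating the control plane with the /16 pod network that flannel expects (this wipes the existing kubeadm state, and the advertise address is taken from the question, so adjust it for your environment):
# On the master: reset and re-initialize with the flannel-compatible CIDR
kubeadm reset -f
kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Verify the flannel pods reach the Running state
kubectl get pods -n kube-system -l app=flannel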
I ran into almost the same problem, and in the end I found that the reason was that the firewall was not turned off. You can try the following commands:
sudo ufw disable
or
systemctl disable firewalld
or
setenforce 0
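Alternatively, instead of disabling the firewall completely, a hedged sketch of opening just the ports a kubeadm cluster normally needs with firewalld (the exact list depends on your CNI and setup):
sudo firewall-cmd --permanent --add-port=6443/tcp        # kube-apiserver
sudo firewall-cmd --permanent --add-port=10250/tcp       # kubelet API
sudo firewall-cmd --permanent --add-port=8472/udp        # flannel VXLAN backend, if used
sudo firewall-cmd --permanent --add-port=30000-32767/tcp # NodePort services
sudo firewall-cmd --reload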

Kubernetes worker node is in Not Ready state

I am comparatively new to kubernetes, but I have successfully created many clusters before. Now I am facing an issue where I tried to add a node to an already existing cluster. At first kubeadm join seemed to be successful, but even after initializing the pod network only the master went into the Ready state.
root@master# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-virtual-machine Ready master 18h v1.9.0
testnode-virtual-machine NotReady <none> 16h v1.9.0
OS: Ubuntu 16.04
Any help will be appreciated.
Thanks.
Try the following on the slave node and then get the node status again on the master.
> sudo swapoff -a
> exit
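If disabling swap alone does not help, a small diagnostic sketch (run the first two commands on the NotReady node and the last one on the master) usually reveals the real cause:
sudo systemctl restart kubelet
sudo journalctl -u kubelet -f                    # watch the kubelet logs for the actual error
kubectl describe node testnode-virtual-machine  # check the Conditions and Events sections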