How to join worker node in existing cluster? - kubernetes

I was facing some issues while joining worker node in existing cluster.
Please find below the details of my scenario.
I've Created a HA clusters with 4 master and 3 worker.
I removed 1 master node.
Removed node was not a part of cluster now and reset was successful.
Now joining the removed node as worker node in existing cluster.
I'm firing below command
kubeadm join --token giify2.4i6n2jmc7v50c8st 192.168.230.207:6443 --discovery-token-ca-cert-hash sha256:dd431e6e19db45672add3ab0f0b711da29f1894231dbeb10d823ad833b2f6e1b
In above command - 192.168.230.207 is cluster IP
Result of above command
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING FileExisting-tc]: tc not found in system path
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get config map: Get https://192.168.230.206:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config: dial tcp 192.168.230.206:6443: connect: connection refused
Already tried Steps
ted this file(kubectl -n kube-system get cm kubeadm-config -o yaml) using kubeadm patch and removed references of removed node("192.168.230.206")
We are using external etcd so checked member list to confirm removed node is not a part of etcd now. Fired below command etcdctl --endpoints=https://cluster-ip --ca-file=/etc/etcd/pki/ca.pem --cert-file=/etc/etcd/pki/client.pem --key-file=/etc/etcd/pki/client-key.pem member list
Can someone please help me resolve this issue as I'm not able to join this node?

In addition to #P Ekambaram answer, I assume that you probably have completely dispose of all redundant data from previous kubeadm join setup.
Remove cluster entries via kubeadm command on worker node: kubeadm reset;
Wipe all redundant data residing on worker node: rm -rf /etc/kubernetes; rm -rf ~/.kube;
Try to re-join worker node.

Use these instructions one after another to completely remove all old installation on worker node.....
kubeadm reset
systemctl stop kubelet
systemctl stop docker
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
ip link delete cni0
ip link delete flannel.1
systemctl start docker.service
yum remove kubeadm kubectl kubelet kubernetes-cni kube*
yum autoremove
rm -rf ~/.kube
then reinstall using
yum install -y kubelet kubeadm kubectl
reboot
systemctl start docker && systemctl enable docker
systemctl start kubelet && systemctl enable kubelet
then use kubeadm join command

fix these problems and run join command
docker service is not enabled, please run 'systemctl enable docker.service'
detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd".
kubelet service is not enabled, please run 'systemctl enable kubelet.service

kubeadm join try to reach 192.168.230.206 whereas new IP is 192.168.230.207.
In addition to change in cm kubeadm-config, you may change your cluster IP address (cluster.server) in this configmap
kubectl edit cm -n kube-public cluster-info

Related

The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?

I have setup the Kubernetes cluster with Kubespray
Once I restart the node and check the status of the node I am getting as below
$ kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
Environment:
OS : CentOS 7
Kubespray
kubelet version: 1.22.3
Need your help on this.
Regards,
Zain
This work for me, I'm using minukube,
When checking the minikube status by running the command minikube status you'll probably get something like that
E0121 07:14:19.882656 7165 status.go:415] kubeconfig endpoint: got:
127.0.0.1:55900, want: 127.0.0.1:49736
type: Control Plane
host: Running
kubelet: Stopped
apiserver: Stopped
kubeconfig: Misconfigured
To fix it, I just followed the next steps:
minikube update-context
minukube start
Below step can solve your issue.
kubelet may be down, use the below commands on the master node.
1. sudo -i
2. swapoff -a
3. exit
4. strace -eopenat kubectl version
Then try using kubectl get nodes.
Thank you Sai for your inputs. i was getting journalctl -xeu kubelet output was Error while dialing dial unix /var/run/cri-dockerd.sock: connect: no such file or directory i was restarted and enabled cri-dockerd services
sudo systemctl enable cri-dockerd.service
sudo systemctl restart cri-dockerd.service
then sudo systemctl start kubelet finally it works for me.
#kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:6443
this link will give https://github.com/kubernetes-sigs/kubespray/issues/8734 more info.
Regards,Zain

Kubernetes Deployment: Error: failed to create deployment:

Environment Details:
Kubernetes version: `v1.20.2`
Master Node: `Bare Metal/Host OS: CentOS 7`
Worker Node: `VM/Host OS: CentOS 7`
I have installed & configured the Kubernetes cluster, the Master node on the bare metal server & the worker node on windows server 2012 HyperV VM. Both master and worker nodes have the same Kubernetes version ( v1.20.2) & centos7. Successfully joined worker node to master, below is the get nodes status.
$ kubectl get nodes
**NAME STATUS ROLES AGE VERSION
k8s-worker-node1 Ready <none> 2d2h v1.20.2
master-node Ready control-plane,master 3d4h v1.20.2**
While creating a deployment on the worker node I am getting the below error message.
On worker node, I issued the following command.
$ kubectl create deployment nginx-depl --image=nginx
Error message is:
error: failed to create deployment: Post “http://localhost:8080/apis/apps/v1/namespaces/default/deployments?fieldManager=kubectl-create”: dial tcp: lookup localhost on 8.8.8.8:53: no such host
please help me to resolve this issue as I am not able to understand what is the problem.
May you have to run minikube start before. I’m learning and between one class and another I forgot to run this command. I hope I have helped someone.
This worked for me.
It seems that you are issuing the kubectl create deployment command on the worker node. This won't work because the kubectl command communicates with the kub-apiserver for cluster communication. Since the apiserver does not run on the worker node executing the command on it will raise an error.
Instead execute the same kubectl command on the master node as a non-root user with the following additional commands,
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ kubectl create deployment nginx-depl --image=nginx

My kubernetes cluster IP address changed and now kubectl will no longer connect

Running under Ubuntu I used kubeadm init to setup my cluster (master node) and copied over the /etc/kubernetes/admin.conf $HOME/.kube/config and all was well when using kubectl.
However after a reboot my master node has had an IP address change which is not the same as what is in $HOME/.kube/config so now I can no longer connect kubectl
So how do I regenerate the admin.conf now that I have a new IP address? Running kubeadm init will just kill everything which is not what I want.
I found this solution on the internet and it works for me:
systemctl stop kubelet docker
cd /etc/
mv kubernetes kubernetes-backup
mv /var/lib/kubelet /var/lib/kubelet-backup
mkdir -p kubernetes
cp -r kubernetes-backup/pki kubernetes
rm kubernetes/pki/{apiserver.*,etcd/peer.*}
systemctl start docker
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd
#Run "kubeadm reset" on all nodes if was this error "error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists"
cp kubernetes/admin.conf ~/.kube/config
kubectl get nodes --sort-by=.metadata.creationTimestamp
kubectl delete node $(kubectl get nodes -o jsonpath='{.items[(#.status.conditions[0].status=="Unknown")].metadata.name}')
kubectl get pods --all-namespaces
After These, Join your Slaves to Master.
Reference: https://medium.com/#juniarto.samsudin/ip-address-changes-in-kubernetes-master-node-11527b867e88
The following command can be used to regenerate admin.conf
kubeadm alpha phase kubeconfig admin --apiserver-advertise-address <new_ip>
However, if you use an IP instead of a hostname, your API-server certificate will be invalid. So, either regenerate your certs ( kubeadm alpha phase certs renew apiserver ), use hostnames instead of IPs or add the insecure --insecure-skip-tls-verify flag when using kubectl
You do not want to use kubeadm reset. That will reset everything and you would have to start configuring your cluster again.
Well, in your scenario, please have a look on the steps below:
nano /etc/hosts (update your new IP against YOUR_HOSTNAME)
nano /etc/kubernetes/config (configuration settings related to your cluster) here in this file look for the following params and update accordingly
KUBE_MASTER="--master=http://YOUR_HOSTNAME:8080"
KUBE_ETCD_SERVERS="--etcd-servers=http://YOUR_HOSTNAME:2379" #2379 is default port
nano /etc/etcd/etcd.conf (conf related to etcd)
KUBE_ETCD_SERVERS="--etcd-servers=http://YOUR_HOSTNAME/WHERE_EVER_ETCD_HOSTED:2379"
2379 is default port for etcd. and you can have multiple etcd servers defined here comma separated
Restart kubelet, apiserver, etcd services.
It is good to use hostname instead of IP to avoid such scenarios.
Hope it helps!

Unable to see join nodes in Kubernetes master

This is my worker node:
root#ivu:~# kubeadm join 10.16.70.174:6443 --token hl36mu.0uptj0rp3x1lfw6n --discovery-token-ca-cert-hash sha256:daac28160d160f938b82b8c720cfc91dd9e6988d743306f3aecb42e4fb114f19 --ignore-preflight-errors=swap
[preflight] Running pre-flight checks.
[WARNING Swap]: running with swap on is not supported. Please disable swap
[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[discovery] Trying to connect to API Server "10.16.70.174:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.16.70.174:6443"
[discovery] Requesting info from "https://10.16.70.174:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.16.70.174:6443"
[discovery] Successfully established connection with API Server "10.16.70.174:6443"
This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
While checking in master nodes using command kubectl get nodes, I can only able to see master:
ivum01#ivum01-HP-Pro-3330-SFF:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ivum01-hp-pro-3330-sff Ready master 36m v1.10.0
For question answer:
docker kubelet kubeadm kubectl installed fine
kubectl get node can not see the current added node; of cause kubectl get pods --all-namespaces has no result for this node;
docker which in current node has no report for kubeadm command(means no k8s images pull, no running container for that)
must import is kubelet not running in work node
run kubelet output:
Failed to get system container stats for "/user.slice/user-1000.slice/session-1.scope": failed to get cgroup stats for "/user.slice/user-1000.slice/session-1.scope": failed to get container info for "/user.slice/user-1000.slice/session-1.scope": unknown container "/user.slice/user-1000.slice/session-1.scope"
same as this issue said
tear down and reset cluster(kubeadm reset) and redo that has no problem in my case.
I had this problem and it was solved by ensuring that cgroup driver on the worker nodes also were set properly.
check with:
docker info | grep -i cgroup
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
set it with:
sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
then restart kubelet service and rejoin the cluster:
systemctl daemon-reload
systemctl restart kubelet
kubeadm reset
kubeadm join ...
Info from docs: https://kubernetes.io/docs/tasks/tools/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node

Installing a Kubernetes pod network for cluster nodes hosted on VirtualBox VMs

On OS X 10.11.6, I created 4 CentOS 7 VMs each with two interfaces ( One NAT, and one Host-only network.) in VirtualBox. Each VM's host-only interface receives an IP via DCHCP and DNS via dnsmasq.
OS X is running dnsmasq configure via a /usr/local/etc/dnsmasq.conf file that contains:
interface=vboxnet0
bind-interfaces
dhcp-range=vboxnet0,192.168.56.100,192.168.56.200,255.255.255.0,infinite
dhcp-leasefile=/usr/local/etc/dnsmasq.leases
local=/dev/
expand-hosts
domain=dev
address=/kube-master.dev/192.168.56.100
address=/kube-minion1.dev/192.168.56.101
address=/kube-minion2.dev/192.168.56.102
address=/kube-minion3.dev/192.168.56.103
address=/vbox-host.dev/192.168.56.1
dhcp-host=08:00:27:09:48:16,192.168.56.100
dhcp-host=0a:00:27:00:00:00,192.168.56.1
dhcp-host=08:00:27:95:AE:39,192.168.56.101
dhcp-host=08:00:27:97:C9:D4,192.168.56.102
dhcp-host=08:00:27:9B:AD:B5,192.168.56.103
I can ssh into each VM through their respective host-only adapter's associated address (e.g., kube-master.dev, kube-minion1.dev, kube-minion2.dev, kube-minion3.dev), and then
yum update
skipping a few steps, get to the point of installing kubeadm as per http://kubernetes.io/docs/getting-started-guides/kubeadm/, that is:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
setenforce 0
yum install -y docker kubelet kubeadm kubectl kubernetes-cni ebtables
systemctl enable docker && systemctl start docker
systemctl enable kubelet && systemctl start kubelet
Then it is unclear to me if the following is correct but on kube-master.dev I execute
kubeadm init --api-advertise-addresses=192.168.56.100 --api-external-dns-names=kube-master.dev
And then on each minion execute:
rm -Rf /etc/kubernetes/manifests/
kubeadm join --token=e7cd12.68011e93d5db7670 192.168.56.100
On kube-master.dev, I then run
kubectl get nodes
to verify the each node has joined the cluster.
The command returns:
NAME STATUS AGE
kube-master.dev Ready 44m
kube-minion1.dev Ready 40m
kube-minion2.dev Ready 39m
kube-minion3.dev Ready 39m
indicating things are groovy.
Afterward, things go entirely off the rail when I attempt to install a pod network.
On kube-master.dev, I run:
kubectl apply -f https://git.io/weave-kube
to install Weave Net, and once the POD network is installed I start monitoring that network is working via executing:
watch kubectl get pods --all-namespaces
And
kube-dns-654381707-05i1t 0/3
never moves off of zero.
So please what am I doing wrong? I've hammered at this for days. The kubeadm documentation is a bit thin in a few place, so I'm not sure I init'ed the master correctly, and installing the pod network is bit conjecture on my part. Also, I haven't found a tutorial other than the Kubernetes kubeadm and the associated youtube video documenting the use of kubeadm to set up a kubernetes cluster.