Kubernetes worker node staying in "NotReady" state - kubernetes

I have a two-node Kubernetes setup in VirtualBox. The master is up and running fine, but the worker node stays in the "NotReady" state.
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
master   Ready      master   1d    v1.10.2
node     NotReady   <none>   1h    v1.10.2
The "journalctl -u kubelet" command on the worker node reports networking-related errors:
kuberuntime_manager.go:757] checking backoff for container "install-cni" in pod "kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5ab)"
remote_runtime.go:278] ContainerStatus "459643e54de7f82df8ada0f60e8f3d51d42c5ce348747a66e20ad5720155e63f" from runtime service failed: rpc error: code = U
kuberuntime_container.go:636] failed to remove pod init container "install-cni": failed to get container status "459643e54de7f82df8ada0f60e8f3d51d42c5ce34
kuberuntime_manager.go:757] checking backoff for container "install-cni" in pod "kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5ab)"
kuberuntime_manager.go:767] Back-off 10s restarting failed container=install-cni pod=kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5a
pod_workers.go:186] Error syncing pod 873fa36d-4b83-11e8-9997-080027afb5ab ("kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5ab)"), sk
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni con
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni con
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni con
I am running Kubernetes version 1.10 and docker version 1.13.1. Could you please help me identify the root cause and resolution for this issue?

A Kubernetes cluster needs a CNI plugin deployed to provide networking between your pods. The error you have shown occurs when a CNI plugin is not installed or not configured properly.
The kube-dns pod stays in the Pending state until a CNI plugin is deployed on your cluster. Once kube-dns moves to the Running state (after you deploy the CNI provider), you can run your application workloads.
If you have not deployed a CNI plugin, there are several you can choose from.
Calico: Provides Pod networking via standard BGP. (Follow the documentation for further info)
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
Weave: Creates an overlay network.
export kubever=$(kubectl version | base64 | tr -d '\n')
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
Flannel: Creates an overlay network treating each host as a subnet.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
Bridged container traffic needs to be made visible to iptables, which you can do with:
sysctl net.bridge.bridge-nf-call-iptables=1
Flannel and Weave require this setting to function.
Please refer to the documentation of whichever CNI plugin is suitable for your cluster.
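The kubelet message "No networks found in /etc/cni/net.d" suggests a quick check on the worker node: does that directory hold any network config at all? A minimal sketch follows. The function name is hypothetical, and the three file extensions are an assumption about what the kubelet's CNI loader accepts.

```shell
# Sketch: report whether a CNI config directory contains a network config file.
# cni_config_present is a hypothetical helper, not part of any Kubernetes tool.
cni_config_present() {
  local dir="${1:-/etc/cni/net.d}"
  local f
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    # An unmatched glob stays literal, so confirm the file really exists.
    if [ -f "$f" ]; then
      echo "found: $f"
      return 0
    fi
  done
  echo "no CNI config in $dir"
  return 1
}
```

Called with no argument it inspects /etc/cni/net.d. If it reports no config after you have applied one of the manifests above, the plugin's own pod on that node (here the flannel install-cni init container) is the next place to look.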

Related

Resolving Minikube metallb imagepullbackoff

I am moving from Docker Desktop to Minikube and have been having some trouble getting MetalLB to work properly. I am running Minikube on macOS Monterey.
I've started a Minikube profile using the command below:
minikube start -p myprofile --cpus=4 --memory='32g' --disk-size='100000mb' --driver=hyperkit --kubernetes-version=v1.21.8 --addons=metallb
When I check the pods for MetalLB, they are in an ImagePullBackOff status. The pods are trying to pull images docker.io/metallb/controller:v0.9.6 and docker.io/metallb/speaker:v0.9.6 respectively.
NAME                          READY   STATUS             RESTARTS   AGE
controller-5fd6788656-jvj4m   0/1     ImagePullBackOff   0          26m
speaker-ctdmw                 0/1     ImagePullBackOff   0          37m
After running eval $(minikube -p myprofile docker-env) and manually pulling through docker pull docker.io/metallb/speaker:v0.9.6, I get the error:
Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on <ip-address>:53: read udp <ip-address>:49978-><ip-address>:53: i/o timeout
I'm not certain if it's useful, but after SSHing into the Minikube node, I've also verified ping google.com does not return a result.
When starting my Minikube profile, I had the following output:
😄 [myprofile] minikube v1.28.0 on Darwin 12.3.1
🆕 Kubernetes 1.25.3 is now available. If you would like to upgrade, specify: --kubernetes-version=v1.25.3
✨ Using the hyperkit driver based on existing profile
👍 Starting control plane node myprofile in cluster myprofile
🔄 Restarting existing hyperkit VM for "myprofile" ...
❗ This VM is having trouble accessing https://k8s.gcr.io
💡 To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
🐳 Preparing Kubernetes v1.21.8 on Docker 20.10.20 ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
▪ Using image metallb/speaker:v0.9.6
▪ Using image metallb/controller:v0.9.6
🌟 Enabled addons: storage-provisioner, metallb, default-storageclass
❗ /usr/local/bin/kubectl is version 1.25.4, which may have incompatibilities with Kubernetes 1.21.8.
▪ Want kubectl v1.21.8? Try 'minikube kubectl -- get pods -A'
🏄 Done! kubectl is now configured to use "myprofile" cluster and "default" namespace by default

Kubernetes: Calico not working on remote worker, local ok

I set up a Kubernetes cluster with Calico.
The setup is "simple":
1x master (local network, ok)
1x node (local network, ok)
1x node (cloud server, not ok)
All run Debian Buster with Docker 19.03.
On the cloud server the calico pods do not come up:
calico-kube-controllers-token-x:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 47m (x50 over 72m) kubelet Pod sandbox changed, it will be killed and re-created.
Warning FailedMount 43m kubelet MountVolume.SetUp failed for volume "calico-kube-controllers-token-x" : failed to sync secret cache: timed out waiting for the condition
Normal SandboxChanged 3m41s (x78 over 43m) kubelet Pod sandbox changed, it will be killed and re-created.
calico-node-x:
Warning Unhealthy 43m (x5 over 43m) kubelet Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp [::1]:9099: connect: connection refused
Warning Unhealthy 14m (x77 over 43m) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/bird/bird.ctl: connect: no such file or directory
Warning BackOff 4m26s (x115 over 39m) kubelet Back-off restarting failed container
My guess is that something is wrong with the IP/network config, but I could not figure out what.
The required ports (k8s and BGP) are forwarded from the router; I also tried with the master connected directly to the internet.
--control-plane-endpoint is a hostname and is publicly resolvable.
Calico is using BGP peering (with the public IP as peer).
This entry worries me the most:
kubectl get --raw /api displays the local IP.
I tried to find a way to change this to the public IP of the master, without success.
Does anyone have a clue what to try next?
After spending more time on analysis, the problem turned out to be that the API server address distributed to the nodes was the local IP, not the DNS name.
I created a VPN with WireGuard from the cloud node to the local master, so the local IP of the master is reachable from the cloud node.
I don't know if that is the cleanest solution, but it works.
Run this command to check whether the IP_AUTODETECTION_METHOD environment variable has been set in the calico-node daemonset:
kubectl get daemonset/calico-node -n kube-system --output json | jq '.spec.template.spec.containers[].env[] | select(.name | startswith("IP"))'
Run this command on each of your k8s nodes to find the valid network interface:
ifconfig
Explicitly set the IP_AUTODETECTION_METHOD environment variable so that the Calico node uses the correct network interface of the k8s node. Quote the value so the shell does not try to expand the pattern:
kubectl set env daemonset/calico-node -n kube-system 'IP_AUTODETECTION_METHOD=interface=en.*'
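Before changing the daemonset, it can help to preview which interface names on a node a pattern such as en.* would select. The sketch below is only an approximation: match_interfaces is a hypothetical helper, it uses grep's extended regex syntax rather than Calico's own matcher, and it anchors the pattern to the whole name, which may be stricter than what Calico does.

```shell
# Sketch: print the interface names read from stdin that match a pattern.
# match_interfaces is a hypothetical helper; anchoring with ^...$ is an assumption.
match_interfaces() {
  local pattern="$1"
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E "^(${pattern})$" || true
}

# Example on a Linux node (interface names are listed under /sys/class/net):
#   ls /sys/class/net | match_interfaces 'en.*'
```

If the output lists more than one interface, a narrower pattern is probably worth using so that the autodetection does not pick the wrong one.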

Failed to create pod sandbox kubernetes error

I have an Ubuntu 16.04 machine acting as the Kubernetes master. I have installed Kubernetes v1.13.1 and am using Weave for networking. I have 2 Raspberry Pi devices running the same version of Kubernetes. I created a cluster and joined the Raspberry Pis to the Ubuntu master. I started a deployment and everything appeared to be working fine.
When I checked the logs of the container, I found that it was not able to connect to the internet. I tried pinging but got no results. When I ran the command to describe the pod, I got the following:
Warning FailedCreatePodSandBox 42m (x3 over 42m) kubelet, node02 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "dea99f80488031b84b7b1f934343e54d877adf931071401651628505d52f55f9" network for pod "deployment-cnfc5": NetworkPlugin cni failed to set up pod "deployment-cnfc5_matrix-device" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/dea99f80488031b84b7b1f934343e54d877adf931071401651628505d52f55f9: dial tcp 127.0.0.1:6784: connect: connection refused
I have checked the directory /etc/cni/net.d, and it contains 10-weave.conflist on both the master and the worker node. I have also checked the directory /opt/cni/bin and found the following on the master node:
bridge flannel ipvlan macvlan ptp tuning weave-ipam weave-plugin-2.5.1
dhcp host-local loopback portmap sample vlan weave-net
and on the worker, I got the following:
bridge flannel ipvlan macvlan ptp tuning weave-ipam weave-plugin-2.5.0
dhcp host-local loopback portmap sample vlan weave-net weave-plugin-2.5.1
Can anyone please let me know what I can do to resolve this issue? Thanks.
I initialized the Kubernetes master using the command below:
sudo kubeadm init --token-ttl=0 --apiserver-advertise-address=192.168.0.142
and installed weave using:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
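The two /opt/cni/bin listings above differ only in the weave-plugin-* binaries: the worker has both 2.5.0 and 2.5.1. A small sketch for printing just those versions on each node makes the comparison easier to repeat. The function name is hypothetical, and whether the leftover 2.5.0 binary has anything to do with the error is not established here.

```shell
# Sketch: list the versions of the weave-plugin-* binaries in a CNI bin directory.
# weave_plugin_versions is a hypothetical helper name.
weave_plugin_versions() {
  local dir="${1:-/opt/cni/bin}"
  local f
  for f in "$dir"/weave-plugin-*; do
    # Skip the literal pattern when nothing matches.
    [ -e "$f" ] || continue
    # Strip the directory and the "weave-plugin-" prefix, leaving the version.
    echo "${f##*/weave-plugin-}"
  done
}
```

Run it with no argument on the master and on each worker and compare the output.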

Pods are not created on new nodes

When I create a sample nginx pod with some replicas to test my Kubernetes cluster, I get strange output. The pods are created on the first node, but on the 2 other nodes they are stuck at status "ContainerCreating".
When I describe the pods (only the ones on the other nodes), they give this error message:
Warning FailedCreatePodSandBox 1m kubelet, xploregroup Failed create pod sandbox.
Normal SandboxChanged 1m kubelet, xploregroup Pod sandbox changed, it will be killed and re-created.
The strange part is that all nodes have exactly the same configuration (I cloned the image from the master), and I joined them all in exactly the same way.
The pods get distributed normally, but only the pods on node1 are running.
Can someone point me in the right direction? :(
[EDIT]
journalctl -u kubelet gives this error
Mar 12 13:42:45 kubeMaster kubelet[16379]: W0312 13:42:45.824314 16379 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 12 13:42:45 kubeMaster kubelet[16379]: E0312 13:42:45.824816 16379 kubelet.go:2104] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
The problem seems to be with my network plugin. In my /etc/systemd/system/kubelet.service.d/10.kubeadm.conf, are the flags for the network plugin present? environment= kubelet_network_args --cni-bin-dir=/etc/cni/net.d --network-plugin=cni
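The flags quoted above point --cni-bin-dir at /etc/cni/net.d, the same directory the kubelet log reports as containing no networks. In kubeadm drop-ins of that period the config directory and the plugin binary directory are normally passed as two separate flags (--cni-conf-dir and --cni-bin-dir), so it is worth printing what the file really contains. A minimal sketch, with flag_value as a hypothetical helper:

```shell
# Sketch: print the value of a --flag=value option found in a kubelet drop-in file.
# flag_value is a hypothetical helper; pass the flag name without the leading dashes.
flag_value() {
  local flag="$1" file="$2"
  # -o prints only the matched text; the value ends at a space or a double quote.
  grep -oE -- "--${flag}=[^\" ]+" "$file" | head -n 1 | cut -d= -f2-
}

# Example (replace the path with your own drop-in file):
#   flag_value cni-bin-dir  /path/to/kubelet-dropin.conf
#   flag_value cni-conf-dir /path/to/kubelet-dropin.conf
```

If cni-bin-dir prints the config directory rather than the directory holding the plugin binaries, that mismatch is a likely candidate for the "cni config uninitialized" message, though this is a guess from the quoted flags alone.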

using network plugins "cni": cni config unintialized; Skipping pod

I created the Kubernetes cluster using kubeadm init.
I am getting error messages in /var/log/messages:
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.015921 7116 docker_manager.go:1787] DNS ResolvConfPath exists: /var/lib/docker/containers/717adf7a8481637ac20a9ba103d8f97635a88bf05f18bd4299f0d164e48f2920/resolv.conf. Will attempt to add ndots option: options ndots:5
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.015963 7116 docker_manager.go:2121] Calling network plugin cni to setup pod for kube-dns-2247936740-cjij4_kube-system(3b296413-96aa-11e6-8c40-02fff663a168)
Oct 20 10:09:52 aws08 kubelet: E1020 10:09:52.015982 7116 docker_manager.go:2127] Failed to setup network for pod "kube-dns-2247936740-cjij4_kube-system(3b296413-96aa-11e6-8c40-02fff663a168)" using network plugins "cni": cni config unintialized; Skipping pod
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.018824 7116 docker_manager.go:1492] Killing container "717adf7a8481637ac20a9ba103d8f97635a88bf05f18bd4299f0d164e48f2920 kube-system/kube-dns-2247936740-cjij4" with 30 second grace period
The DNS pod is failing:
kube-system kube-dns-2247936740-j5rtc 0/3 ContainerCreating 21 1h
If I disable CNI, the DNS pod runs, but the DNS issue persists.
To disable CNI, I comment out the KUBELET_NETWORK_ARGS line in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and restart the kubelet service:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
# Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=100.64.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--v=4"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_EXTRA_ARGS
followed by:
sudo systemctl restart kubelet
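The manual edit described above can be written as a small, reversible step. This is only a sketch of that edit, not a recommendation to run without CNI. comment_out_network_args is a hypothetical helper, and it assumes the line starts exactly with Environment="KUBELET_NETWORK_ARGS= as in the file shown.

```shell
# Sketch: comment out the KUBELET_NETWORK_ARGS line in a kubelet drop-in file.
# comment_out_network_args is a hypothetical helper name.
comment_out_network_args() {
  local file="$1"
  # Leave a .bak copy next to the file; an already commented line no longer
  # starts with "Environment=", so running this twice changes nothing more.
  sed -i.bak 's/^Environment="KUBELET_NETWORK_ARGS=/# &/' "$file"
}

# Example (needs root; reload systemd and restart the kubelet afterwards):
#   comment_out_network_args /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
```

Restoring the .bak copy undoes the change.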
I'm guessing that you forgot to set up the pod network.
From the documentation:
It is necessary to do this before you try to deploy any applications to your cluster, and before kube-dns will start up. Note also that kubeadm only supports CNI based networks and therefore kubenet based networks will not work.
You can install a pod network add-on with the following command:
kubectl apply -f <add-on.yaml>
Example, to install the Weave Net add-on:
kubectl create -f https://git.io/weave-kube
After you have done this, you might need to recreate the kube-dns pod.
CNI initialization should complete during kubelet initialization, so try restarting the kubelet service and make sure that the CNI configuration can be parsed correctly.