using network plugins "cni": cni config unintialized; Skipping pod - kubernetes

I created the Kubernetes cluster using kubeadm (kubeadm init).
I am getting error messages in /var/log/messages:
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.015921 7116 docker_manager.go:1787] DNS ResolvConfPath exists: /var/lib/docker/containers/717adf7a8481637ac20a9ba103d8f97635a88bf05f18bd4299f0d164e48f2920/resolv.conf. Will attempt to add ndots option: options ndots:5
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.015963 7116 docker_manager.go:2121] Calling network plugin cni to setup pod for kube-dns-2247936740-cjij4_kube-system(3b296413-96aa-11e6-8c40-02fff663a168)
Oct 20 10:09:52 aws08 kubelet: E1020 10:09:52.015982 7116 docker_manager.go:2127] Failed to setup network for pod "kube-dns-2247936740-cjij4_kube-system(3b296413-96aa-11e6-8c40-02fff663a168)" using network plugins "cni": cni config unintialized; Skipping pod
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.018824 7116 docker_manager.go:1492] Killing container "717adf7a8481637ac20a9ba103d8f97635a88bf05f18bd4299f0d164e48f2920 kube-system/kube-dns-2247936740-cjij4" with 30 second grace period
The DNS pod is failing:
kube-system kube-dns-2247936740-j5rtc 0/3 ContainerCreating 21 1h
If I disable CNI, the DNS pod runs, but the DNS issue persists.
To disable CNI, I commented out the KUBELET_NETWORK_ARGS line in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and restarted the kubelet service:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
# Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=100.64.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--v=4"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_EXTRA_ARGS
followed by:
sudo systemctl restart kubelet
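Note that after editing a systemd drop-in file, the unit definition has to be reloaded before the restart picks up the change; a minimal sequence:
sudo systemctl daemon-reload
sudo systemctl restart kubelet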

I'm guessing that you forgot to set up the pod network.
From the documentation:
It is necessary to do this before you try to deploy any applications to your cluster, and before kube-dns will start up. Note also that kubeadm only supports CNI based networks and therefore kubenet based networks will not work.
You can install a pod network add-on with the following command:
kubectl apply -f <add-on.yaml>
For example, to install the Weave Net add-on:
kubectl create -f https://git.io/weave-kube
After you have done this, you might need to recreate the kube-dns pod.
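One way to force that, assuming the standard k8s-app=kube-dns label on the kube-dns pods, is to delete them and let their controller recreate them:
kubectl delete pod -n kube-system -l k8s-app=kube-dns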

CNI initialization is completed during kubelet startup, so restart the kubelet service and make sure the CNI configuration can be parsed correctly.
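A rough way to check that, assuming the default config directory /etc/cni/net.d, is to run each config file through a JSON parser before restarting:
for f in /etc/cni/net.d/*; do
  [ -e "$f" ] || continue   # directory may be empty
  python -m json.tool < "$f" > /dev/null && echo "$f: OK" || echo "$f: parse error"
done
sudo systemctl restart kubelet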

Related

kubernetes worker node in "NotReady" status

I am trying to set up my first cluster using Kubernetes 1.13.1. The master got initialized okay, but both of my worker nodes are NotReady. kubectl describe node shows that the kubelet stopped posting node status on both worker nodes. On one of the worker nodes I get log output like:
kubelet[3680]: E0107 20:37:21.196128 3680 kubelet.go:2266] node "xyz" not found
Here is the full details:
I am using CentOS 7 and Kubernetes 1.13.1.
Initializing was done as follows:
[root@master ~]# kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.142.0.0/24
Successfully initialized the cluster:
You can now join any number of machines by running the following on each node
as root:
`kubeadm join 10.142.0.4:6443 --token y0epoc.zan7yp35sow5rorw --discovery-token-ca-cert-hash sha256:f02d43311c2696e1a73e157bda583247b9faac4ffb368f737ee9345412c9dea4`
Then I deployed the flannel CNI:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
The join command worked fine.
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node01" as an annotation
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
Result of kubectl get nodes:
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
master   Ready      master   9h    v1.13.1
node01   NotReady   <none>   9h    v1.13.1
node02   NotReady   <none>   9h    v1.13.1
On both nodes:
[root@node01 ~]# service kubelet status
Redirecting to /bin/systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2019-01-08 04:49:20 UTC; 32s ago
Docs: https://kubernetes.io/docs/
Main PID: 4224 (kubelet)
Memory: 31.3M
CGroup: /system.slice/kubelet.service
└─4224 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfi
`Jan 08 04:54:10 node01 kubelet[4224]: E0108 04:54:10.957115 4224 kubelet.go:2266] node "node01" not found`
I appreciate your advice on how to troubleshoot this.
The previous answer sounds correct. You can verify that by running kubectl describe node node01 on the master, or wherever kubectl is correctly configured.
It seems the reason for this error is an incorrect subnet. The Flannel documentation says you should use a /16, not a /24, for the pod network:
NOTE: If kubeadm is used, then pass --pod-network-cidr=10.244.0.0/16
to kubeadm init to ensure that the podCIDR is set.
I tried to run kubeadm with a /24, and although the nodes were in Ready state, the flannel pods did not run properly, which resulted in some issues.
You can check whether your flannel pods are running properly with:
kubectl get pods -n kube-system
If the status is anything other than Running, something is wrong. In that case you can get the details by running kubectl describe pod PODNAME -n kube-system. Try changing the subnet and update us if that fixes the problem.
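If the cluster was already initialized with the wrong CIDR, changing it generally means re-initializing. A destructive sketch (it wipes the cluster state), reusing the advertise address from the question and the /16 from the Flannel note:
sudo kubeadm reset
sudo kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml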
I ran into almost the same problem, and in the end I found that the reason was that the firewall had not been turned off. You can try the following commands:
sudo ufw disable
or
systemctl disable firewalld
or
setenforce 0
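Note that setenforce 0 puts SELinux into permissive mode rather than touching a firewall, although kubeadm setups commonly need that as well. If you would rather not disable firewalld entirely, a sketch that opens only the core ports (6443 for the API server, 10250 for the kubelet) is:
sudo firewall-cmd --permanent --add-port=6443/tcp    # Kubernetes API server
sudo firewall-cmd --permanent --add-port=10250/tcp   # kubelet API
sudo firewall-cmd --reload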

Kubernetes worker node staying in "NotReady" state

I have a two-node Kubernetes setup in VirtualBox. The master is up and running fine, but the worker node is staying in "NotReady" state.
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
master   Ready      master   1d    v1.10.2
node     NotReady   <none>   1h    v1.10.2
"journalctl -u kubelet" command on worker node is reporting networking related errors:
kuberuntime_manager.go:757] checking backoff for container "install-cni" in pod "kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5ab)"
remote_runtime.go:278] ContainerStatus "459643e54de7f82df8ada0f60e8f3d51d42c5ce348747a66e20ad5720155e63f" from runtime service failed: rpc error: code = U
kuberuntime_container.go:636] failed to remove pod init container "install-cni": failed to get container status "459643e54de7f82df8ada0f60e8f3d51d42c5ce34
kuberuntime_manager.go:757] checking backoff for container "install-cni" in pod "kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5ab)"
kuberuntime_manager.go:767] Back-off 10s restarting failed container=install-cni pod=kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5a
pod_workers.go:186] Error syncing pod 873fa36d-4b83-11e8-9997-080027afb5ab ("kube-flannel-ds-zjlvn_kube-system(873fa36d-4b83-11e8-9997-080027afb5ab)"), sk
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni con
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni con
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni con
I am running Kubernetes version 1.10 and Docker version 1.13.1. Could you please help me identify the root cause and resolution for this issue?
When you form a Kubernetes cluster, you must deploy a CNI plugin to provide networking between your pods. The error you have shown here is due to a CNI plugin not being installed or not being configured properly.
The kube-dns pod stays in Pending state until a CNI plugin is deployed on the cluster. Once kube-dns moves to Running (after the CNI provider is deployed), you can run your application workloads.
If you have not deployed a CNI plugin yet, there are several to choose from:
Calico: Provides Pod networking via standard BGP. (Follow the documentation for further info)
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
Weave: Creates an overlay network.
export kubever=$(kubectl version | base64 | tr -d '\n')
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
Flannel: Creates an overlay network treating each host as a subnet.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
Container traffic needs to be made visible to iptables, which you can do with:
sysctl net.bridge.bridge-nf-call-iptables=1
This is required by Flannel and Weave to function.
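To make that setting survive reboots, the usual approach is to persist it in a sysctl config file (the file name below is just a convention):
echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/k8s.conf
sudo sysctl --system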
Please refer to the documentation of each CNI plugin to decide which one is suitable for your cluster.

Install openiscsi initiator on kubelet

I have created a 3-node Azure Kubernetes cluster using the following commands:
az group create --name ResourceGroup --location canadacentral
az provider register -n Microsoft.ContainerService
az provider register -n Microsoft.Compute
az provider register -n Microsoft.Network
az aks create --resource-group ResourceGroup --name ReplicaSet --node-count 3 --kubernetes-version 1.8.7 --node-vm-size Standard_A0 --generate-ssh-keys
kubectl create -f https://raw.githubusercontent.com/openebs/openebs/master/k8s/openebs-operator.yaml
kubectl create -f https://raw.githubusercontent.com/openebs/openebs/master/k8s/openebs-storageclasses.yaml
Subsequently I created a Postgres StatefulSet as well, which does not start because open-iscsi is not installed on the kubelet.
Kubelet logs from Node-1 (where the pgset pod is scheduled)
I0313 05:42:41.910525 7845 reconciler.go:257] operationExecutor.MountVolume started for volume "pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3" (UniqueName: "kubernetes.io/iscsi/10.0.20.229:3260:iqn.2016-09.com.openebs.jiva:pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3:0") pod "pgset-0" (UID: "a9826973-2674-11e8-a384-0a58ac1f03e3")
I0313 05:42:41.910605 7845 operation_generator.go:416] MountVolume.WaitForAttach entering for volume "pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3" (UniqueName: "kubernetes.io/iscsi/10.0.20.229:3260:iqn.2016-09.com.openebs.jiva:pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3:0") pod "pgset-0" (UID: "a9826973-2674-11e8-a384-0a58ac1f03e3") DevicePath ""
E0313 05:42:41.910744 7845 iscsi_util.go:207] iscsi: could not read iface default error:
E0313 05:42:41.910815 7845 nestedpendingoperations.go:264] Operation for "\"kubernetes.io/iscsi/10.0.20.229:3260:iqn.2016-09.com.openebs.jiva:pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3:0\"" failed. No retries permitted until 2018-03-13 05:44:43.910784094 +0000 UTC (durationBeforeRetry 2m2s). Error: MountVolume.WaitForAttach failed for volume "pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3" (UniqueName: "kubernetes.io/iscsi/10.0.20.229:3260:iqn.2016-09.com.openebs.jiva:pvc-a980c1e4-2674-11e8-a384-0a58ac1f03e3:0") pod "pgset-0" (UID: "a9826973-2674-11e8-a384-0a58ac1f03e3") : executable file not found in $PATH
E0313 05:43:12.080406 7845 kubelet.go:1628] Unable to mount volumes for pod "pgset-0_default(a9826973-2674-11e8-a384-0a58ac1f03e3)": timeout expired waiting for volumes to attach/mount for pod "default"/"pgset-0". list of unattached/unmounted volumes=[pgdata]; skipping pod
E0313 05:43:12.081262 7845 pod_workers.go:182] Error syncing pod a9826973-2674-11e8-a384-0a58ac1f03e3 ("pgset-0_default(a9826973-2674-11e8-a384-0a58ac1f03e3)"), skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"pgset-0". list of unattached/unmounted volumes=[pgdata]
My question is whether there is a way to configure and ensure that the kubelet comes up by default with the open-iscsi initiator utils installed and running.
The following steps were followed to manually install the iSCSI initiator in the kubelet container:
SSH into the Kubernetes Nodes
Identify the docker container running the kubelet using sudo docker ps.
Enter the kubelet container shell
sudo docker exec -it kubelet_container_id bash
Install open-iscsi.
apt-get update
apt install -y open-iscsi
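Combining those steps into a single sketch, assuming the kubelet container can be found by name (verify against the output of sudo docker ps on your node first):
KUBELET_ID=$(sudo docker ps --filter name=kubelet --format '{{.ID}}' | head -n 1)   # assumes the container name contains "kubelet"
sudo docker exec -it "$KUBELET_ID" bash -c 'apt-get update && apt-get install -y open-iscsi'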

Pods are not created on new nodes

When I create a sample nginx deployment with a few replicas to test my Kubernetes cluster, I get strange output. The pods are created on the first node, but on the two other nodes they are stuck in status "ContainerCreating".
When I describe the pods (only the ones on the other nodes), they give this error message:
Warning FailedCreatePodSandBox 1m kubelet, xploregroup Failed create pod sandbox.
Normal SandboxChanged 1m kubelet, xploregroup Pod sandbox changed, it will be killed and re-created.
The strange part is that all nodes have exactly the same configuration (the image was cloned from the master) and I joined them all exactly the same way.
The pods get distributed normally, but only the pods on node1 are running.
Can someone point me in the right direction? :(
[EDIT]
journalctl -u kubelet gives this error:
Mar 12 13:42:45 kubeMaster kubelet[16379]: W0312 13:42:45.824314 16379 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 12 13:42:45 kubeMaster kubelet[16379]: E0312 13:42:45.824816 16379 kubelet.go:2104] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
The problem seems to be with my network plugin. In my /etc/systemd/system/kubelet.service.d/10-kubeadm.conf the flags for the network plugin are present:
Environment="KUBELET_NETWORK_ARGS=--cni-bin-dir=/etc/cni/net.d --network-plugin=cni"
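Given the "No networks found in /etc/cni/net.d" messages above, a quick check is whether that directory is actually populated on the affected nodes; if it is empty, no pod network has been deployed yet (Flannel below only as one example of a provider):
ls -l /etc/cni/net.d/   # empty output means no CNI network is configured
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml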

How to debug when Kubernetes nodes are in 'Not Ready' state

I initialized the master node and added 2 worker nodes, but only the master and one of the worker nodes show up when I run the following command:
kubectl get nodes
Also, both of these nodes are in 'NotReady' state.
What steps should I take to understand what the problem could be?
I can ping all the nodes from each of the other nodes.
The version of Kubernetes is 1.8.
The OS is CentOS 7.
I used the following repo to install Kubernetes:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
EOF
yum install kubelet kubeadm kubectl kubernetes-cni
First, describe nodes and see if it reports anything:
$ kubectl describe nodes
Look for conditions, capacity and allocatable:
Conditions:
Type             Status
----             ------
OutOfDisk        False
MemoryPressure   False
DiskPressure     False
Ready            True
Capacity:
cpu: 2
memory: 2052588Ki
pods: 110
Allocatable:
cpu: 2
memory: 1950188Ki
pods: 110
If everything is alright here, SSH into the node and observe the kubelet logs to see if it reports anything, like certificate errors, authentication errors, etc.
If kubelet is running as a systemd service, you can use
$ journalctl -u kubelet
Steps to debug:
If you face any issue in Kubernetes, the first step is to check whether the Kubernetes system pods are running fine or not.
Command to check: kubectl get pods -n kube-system
If you see any pod crashing, check its logs.
If you are getting a NotReady state error, check the network pod logs.
If you cannot resolve it with the above, follow these steps:
kubectl get nodes # Check which node is not in ready state
kubectl describe node nodename #nodename which is not in readystate
ssh to that node
execute systemctl status kubelet # Make sure kubelet is running
systemctl status docker # Make sure docker service is running
journalctl -u kubelet # To Check logs in depth
Most probably you will find the error here. After fixing it, restart the kubelet with the commands below:
systemctl daemon-reload
systemctl restart kubelet
If you still haven't found the root cause, check the following:
Make sure your node has enough disk space and memory; check the /var directory especially.
Commands to check: df -kh, free -m
Verify CPU utilization with the top command, and make sure no process is consuming an unexpected amount of memory.
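Collecting those steps into one rough triage pass (the node name is a placeholder):
NODE=node01                                              # placeholder: the NotReady node
kubectl describe node "$NODE"                            # look at Conditions and Events
kubectl get pods -n kube-system -o wide | grep "$NODE"   # system pods scheduled on that node
# then, on the node itself:
# systemctl status kubelet docker
# journalctl -u kubelet --no-pager | tail -n 50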
I was having a similar issue because of a different reason:
Error:
cord@node1:~$ kubectl get nodes
NAME    STATUS     ROLES    AGE     VERSION
node1   Ready      master   17h     v1.13.5
node2   Ready      <none>   17h     v1.13.5
node3   NotReady   <none>   9m48s   v1.13.5
cord@node1:~$ kubectl describe node node3
Name: node3
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
Ready False Thu, 18 Apr 2019 01:15:46 -0400 Thu, 18 Apr 2019 01:03:48 -0400 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 192.168.2.6
Hostname: node3
cord@node3:~$ journalctl -u kubelet
Apr 18 01:24:50 node3 kubelet[54132]: W0418 01:24:50.649047 54132 cni.go:149] Error loading CNI config list file /etc/cni/net.d/10-calico.conflist: error parsing configuration list: no 'plugins' key
Apr 18 01:24:50 node3 kubelet[54132]: W0418 01:24:50.649086 54132 cni.go:203] Unable to update cni config: No valid networks found in /etc/cni/net.d
Apr 18 01:24:50 node3 kubelet[54132]: E0418 01:24:50.649402 54132 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 18 01:24:55 node3 kubelet[54132]: W0418 01:24:55.650816 54132 cni.go:149] Error loading CNI config list file /etc/cni/net.d/10-calico.conflist: error parsing configuration list: no 'plugins' key
Apr 18 01:24:55 node3 kubelet[54132]: W0418 01:24:55.650845 54132 cni.go:203] Unable to update cni config: No valid networks found in /etc/cni/net.d
Apr 18 01:24:55 node3 kubelet[54132]: E0418 01:24:55.651056 54132 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 18 01:24:57 node3 kubelet[54132]: I0418 01:24:57.248519 54132 setters.go:72] Using node IP: "192.168.2.6"
Issue:
My file 10-calico.conflist was incorrect. I verified it against a different node and against the sample file calico.conflist.template in the same directory.
Resolution:
Correcting 10-calico.conflist and restarting the service with systemctl restart kubelet resolved my issue:
NAME    STATUS   ROLES    AGE   VERSION
node1   Ready    master   18h   v1.13.5
node2   Ready    <none>   18h   v1.13.5
node3   Ready    <none>   48m   v1.13.5
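A quick way to catch this class of error before restarting the kubelet is to check that the conflist both parses and actually contains a 'plugins' key, e.g.:
python -c 'import json; cfg = json.load(open("/etc/cni/net.d/10-calico.conflist")); print("plugins" in cfg)'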
I recently started using VMware Octant (https://github.com/vmware-tanzu/octant). It is a better UI than the Kubernetes Dashboard: you can view the cluster, inspect cluster and pod details, check logs, and open a terminal into the pod(s).
I found that applying the pod network and rebooting both nodes did the trick for me:
kubectl apply -f [podnetwork].yaml
I recently had this issue. Checking the known issues on the kind website (https://kind.sigs.k8s.io/docs/user/known-issues/) told me that the main problem usually comes from a lack of memory allocated to Docker. They actually advise allocating 8 GB to Docker; I raised mine from 3 GB to 6 GB and it worked fine for me. This is the kind version I am running at the moment:
$ kind version
kind v0.10.0 go1.15.7 darwin/amd64
and this is docker version
$ docker version
Client:
Cloud integration: 1.0.17
Version: 20.10.8
API version: 1.41
Go version: go1.16.6
Git commit: 3967b7d
Built: Fri Jul 30 19:55:20 2021
OS/Arch: darwin/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.8
API version: 1.41 (minimum version 1.12)
Go version: go1.16.6
Git commit: 75249d8
Built: Fri Jul 30 19:52:10 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.9
GitCommit: e25210fe30a0a703442421b0f60afac609f950a3
runc:
Version: 1.0.1
GitCommit: v1.0.1-0-g4144b63
docker-init:
Version: 0.19.0
GitCommit: de40ad0
I hope this helps you or anyone facing the same issue.
And here is the output from kind:
$ k get node
NAME                  STATUS   ROLES                  AGE     VERSION
test2-control-plane   Ready    control-plane,master   4m42s   v1.20.2