I am a beginner in Kubernetes and I'm trying to set up my first cluster. My worker node has joined the cluster successfully, but when I run kubectl get nodes it is in NotReady status.
This message appears when I run
kubectl describe node k8s-node-1
runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
I have run this command to install a Pod network add-on:
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
How can I solve this issue?
Adding this answer as community wiki for better visibility. The OP already solved the problem by rebooting the machine.
It is worth remembering that going through all the steps of bootstrapping the cluster and installing all the prerequisites will get your cluster running successfully. If you had any previous installations, please remember to perform kubeadm reset and remove the .kube folder from the home or root directory.
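For reference, a minimal cleanup sketch along those lines (clearing /etc/cni/net.d as well is my own addition and only matters if old CNI configs were left behind):
sudo kubeadm reset -f          # undo the previous init/join
rm -rf $HOME/.kube             # remove the stale kubeconfig
sudo rm -rf /etc/cni/net.d     # optional: clear leftover CNI configs (assumption)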
I'm also linking this GitHub issue about the same problem, where people provide solutions for it.
Hope someone can help me.
To describe the situation in short, I have a self-managed k8s cluster running on 3 machines (1 master, 2 worker nodes). In order to make it HA, I attempted to add a second master to the cluster.
After some failed attempts, I found out that I needed to add a controlPlaneEndpoint setting to the kubeadm-config ConfigMap. So I did, with masternodeHostname:6443.
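For context, the change was roughly of this shape (the apiVersion is my best guess for 1.15 and the hostname is a placeholder, not my real value):
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
controlPlaneEndpoint: "masternodeHostname:6443"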
I generated the certificate and join command for the second master, and after running it on the second master machine, it failed with
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available
Checking the first master now, I get connection refused for the IP on port 6443. So I cannot run any kubectl commands.
Tried recreating the .kube folder, with all the config copied there, no luck.
Restarted kubelet, docker.
The containers running on the cluster seem ok, but I am locked out of any cluster configuration (dashboard is down, kubectl commands not working).
Is there any way I can make it work again, without losing any of the configuration or the deployments already present?
Thanks! Sorry if it's a noob question.
Cluster information:
Kubernetes version: 1.15.3
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: RHEL 7
CNI and version: weave 0.3.0
CRI and version: containerd 1.2.6
This is an old, known problem with Kubernetes 1.15 [1,2].
It is caused by a short etcd timeout period. As far as I'm aware it is a hard-coded value in the source and cannot be changed (a feature request to make it configurable is open for version 1.22).
Your best bet would be to upgrade to a newer version, and recreate your cluster.
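Until then, if you want to poke at the broken control plane from the node itself, a few standard checks can at least tell you whether the apiserver and etcd static pods are still up (a sketch; it assumes crictl is available, since your CRI is containerd):
crictl ps -a | grep -E 'kube-apiserver|etcd'    # are the control-plane containers running?
journalctl -u kubelet --since "30 min ago"      # kubelet errors from around the failed join
ss -tlnp | grep 6443                            # is anything still listening on the API server port?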
I have an EKS setup (v1.16) with 2 ASGs: one for compute ("c5.9xlarge") and the other for GPU ("p3.2xlarge").
Both are configured as Spot and set with desiredCapacity 0.
The K8s Cluster Autoscaler works as expected and scales out each ASG when necessary; the issue is that the newly created GPU instance is not recognized by the master, and running kubectl get nodes shows nothing.
I can see that the EC2 instance is in the Running state and I can also SSH into the machine.
I double-checked the labels and tags and compared them to the "compute" node group.
Both are configured almost identically; the only difference is that the GPU nodegroup has a few additional tags.
Since I'm using the eksctl tool (v0.35.0) and the compute nodeGroup vs. GPU nodeGroup is basically copy&paste, I can't figure out what the problem could be.
UPDATE:
SSHing into the instance, I could see the following error in /var/log/messages:
failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
and the kubelet service crashed.
Could it be that my GPU node group uses the wrong AMI (amazon-eks-gpu-node-1.18-v20201211)?
As a simple workaround, you can use preBootstrapCommands in the eksctl YAML config file:
  - name: test-node-group
    preBootstrapCommands:
      - "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"
There is a known issue with EKS 1.16; even Graviton processor machines won't join the cluster. To fix it, first try upgrading your CNI version. Please refer to the documentation here:
https://docs.aws.amazon.com/eks/latest/userguide/cni-upgrades.html
And if that doesn't work, upgrade your EKS cluster to the latest available version; then it should work.
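If it helps, one common way to check which CNI plugin version is currently installed (following the pattern in the AWS docs linked above, so treat it as a sketch):
kubectl describe daemonset aws-node -n kube-system | grep Image | cut -d "/" -f 2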
I've found the issue. It seems to be a misalignment between eksctl (v0.35.0) and the AL2-GPU AMI.
The AWS team changed the Docker cgroup driver to "systemd" instead of "cgroupfs" (GitHub), while the eksctl version I used hadn't absorbed that change.
A temporary solution is to edit the /etc/eksctl/kubelet.yaml file using preBootstrapCommands.
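A rough sketch of how that can look in an eksctl config (the node group name and instance type here are illustrative, not the original ones):
nodeGroups:
  - name: gpu-spot
    instanceType: p3.2xlarge
    desiredCapacity: 0
    preBootstrapCommands:
      - "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"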
I have been setting up a multi-node Kubernetes cluster using kubeadm. The setup includes 1 master and 1 worker node. I created the VMs using Vagrant.
I followed the docs,
https://kubernetes.io/docs/setup/independent/install-kubeadm/
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm
Created 2 VMs using Vagrant.
IPs: Master - 192.168.33.10, Worker - 192.168.1.21 (both on a host-only network)
I have experienced 2 scenarios,
Case 1:
Ran kubeadm init --pod-network-cidr=10.244.0.0/16 successfully with all pods running.
Installed "Canal" pod network add on.
Followed all the instructions given at the end of the successfull kubeadm init command.
SSH into 2nd VM and ran kubeadm join .. command and I am struck at "[preflight] Running pre-flight checks"
Case 2:
Did the same process with the flag --apiserver-advertise-address=192.168.33.10.
Successfully ran the command kubeadm init --apiserver-advertise-address=192.168.33.10.
But when I ran the command kubectl get nodes, it only showed the master node (I expected the worker node to show up too).
Kindly help me understand how can I complete this setup. Thank you.
I have a GitHub repository which does exactly what you want. I am pretty sure you will get the idea from it. If anything is not clear, please leave a comment or update the original post.
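In the meantime, one thing worth trying on the stuck worker is regenerating the join command on the master and re-running it with verbose output, which usually shows which preflight step or connection is hanging (a sketch; the token and hash below are placeholders):
kubeadm token create --print-join-command        # run on the master
sudo kubeadm join 192.168.33.10:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> --v=5    # run on the worker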
I am trying to implement an audit policy.
My YAML:
~/.minikube/addons$ cat audit-policy.yaml
# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
- level: Metadata
Pods got stuck when I ran:
minikube start --extra-config=apiserver.Authorization.Mode=RBAC --extra-config=apiserver.Audit.LogOptions.Path=/var/logs/audit.log --extra-config=apiserver.Audit.PolicyFile=/etc/kubernetes/addons/audit-policy.yaml
minikube v0.35.0 on linux (amd64)
Tip: Use 'minikube start -p <name>' to create a new cluster, or 'minikube delete' to delete this one.
Restarting existing virtualbox VM for "minikube" ...
Waiting for SSH access ...
"minikube" IP address is 192.168.99.101
Configuring Docker as the container runtime ...
Preparing Kubernetes environment ...
  - apiserver.Authorization.Mode=RBAC
  - apiserver.Audit.LogOptions.Path=/var/logs/audit.log
  - apiserver.Audit.PolicyFile=/etc/kubernetes/addons/audit-policy.yaml
Pulling images required by Kubernetes v1.13.4 ...
Relaunching Kubernetes v1.13.4 using kubeadm ...
Waiting for pods: apiserver
Why?
I can do this:
minikube start
Then I run minikube ssh
$ sudo bash
$ cd /var/logs
bash: cd: /var/logs: No such file or directory
ls
cache empty lib lock log run spool tmp
How do I apply the extra-config?
I don't have good news. Although you made some mistakes with /var/logs, it does not matter in this case, as there seems to be no way to implement an audit policy in Minikube (or rather, there are a few ways at least, but they all seem to fail).
You can try a couple of approaches presented in GitHub issues and other links I will provide, but I have tried probably all of them and they do not work with the current Minikube version. You might be able to make this work with earlier versions, as it seems that at some point it was possible with the approach you describe in your question, but in the current version it is not. I have spent some time trying the approaches from the links and a couple of ideas of my own, with no success; maybe you will be able to find the missing piece.
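For reference, the extra-config values you passed are meant to translate into the standard kube-apiserver audit flags, roughly like this (a sketch of the target flags, not a working Minikube invocation):
kube-apiserver \
  --authorization-mode=RBAC \
  --audit-policy-file=/etc/kubernetes/addons/audit-policy.yaml \
  --audit-log-path=/var/logs/audit.log    # note: inside the VM only /var/log exists, not /var/logs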
You can find more information in these documents:
Audit Logfile Not Created
Service Accounts and Auditing in Kubernetes
fails with -extra-config=apiserver.authorization-mode=RBAC and audit logging: timed out waiting for kube-proxy
How do I enable an audit log on minikube?
Enable Advanced Auditing Webhook Backend Configuration
I have a Weave network plugin.
Inside my folder /etc/cni/net.d there is a 10-weave.conf:
{
    "name": "weave",
    "type": "weave-net",
    "hairpinMode": true
}
My Weave pods are running and the DNS pod is also running.
But when I want to run a pod, like a simple nginx which will pull an nginx image, the pod gets stuck at ContainerCreating. kubectl describe pod gives me the error: failed create pod sandbox.
When I run journalctl -u kubelet I get this error:
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Is my network plugin not configured properly?
I used this command to configure my Weave network:
kubectl apply -f https://git.io/weave-kube-1.6
After this didn't work, I also tried this command:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
I even tried flannel and that gives me the same error.
The system I am setting Kubernetes up on is a Raspberry Pi.
I am trying to build a Raspberry Pi cluster with 3 nodes and 1 master running Kubernetes.
Does anyone have any ideas on this?
Thank you all for responding to my question. I have solved the problem now. For anyone who comes across this question in the future, the solution was as follows.
I cloned my Raspberry Pi images because I wanted a basicConfig.img for when I needed to add a new node to my cluster or when one goes down.
Weave (the plugin I used) got confused because on every node and the master the OS had the same machine-id. When I deleted the machine-id and created a new one (and rebooted the nodes), my error was fixed. The commands to do this were:
sudo rm /etc/machine-id
sudo rm /var/lib/dbus/machine-id
sudo dbus-uuidgen --ensure=/etc/machine-id
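To confirm the fix took effect, you can check that each node now reports a different id (plain coreutils, nothing Kubernetes-specific):
cat /etc/machine-id    # run on every node; the values should all differ now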
Once again my patience was being tested, because my Kubernetes setup was normal and my Raspberry Pi OS was normal. I found this with the help of someone in the Kubernetes community, which again shows how important and great the IT community is. To the people of the future who come across this question: I hope this solution fixes your error and saves you some of the time you would otherwise spend searching for such a small thing.
Looking at the pertinent code in Kubernetes and in CNI, the specific error you see seems to indicate that it cannot find any files ending in .json, .conf or .conflist in the directory given.
This makes me think it could be something such as the conf file not being present on all of the hosts, so I would verify that as a first step.
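A quick way to verify that on each node (standard shell commands; the filename is the one from your question):
ls -l /etc/cni/net.d/                 # the kubelet expects at least one .conf/.conflist/.json file here
cat /etc/cni/net.d/10-weave.conf      # should exist on every node and contain valid JSON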