I am trying to set up a Kubernetes cluster (two nodes: 1 master, 1 worker) on VirtualBox. My host computer runs Windows 10, and the virtual machines run Ubuntu 18.10 (codename Cosmic).
I have configured two adapters on each virtual machine, one NAT and one Host-Only adapter. I did that because I need to access some internal resources using the host IP (NAT), and I also need a stable network between the host and the virtual machines (Host-Only network).
I have installed Kubernetes v1.12.4 and successfully joined the worker to the master node.
NAME STATUS ROLES AGE VERSION
kubernetes-master Ready master 36m v1.12.4
kubernetes-slave Ready <none> 25m v1.12.4
I am using Flannel for networking.
All pods seem to be OK.
NAMESPACE NAME READY STATUS RESTARTS AGE
default nginx-server-7bb6997d9c-kdcld 1/1 Running 0 27m
kube-system coredns-576cbf47c7-btrvb 1/1 Running 1 38m
kube-system coredns-576cbf47c7-zfscv 1/1 Running 1 38m
kube-system etcd-kubernetes-master 1/1 Running 1 38m
kube-system kube-apiserver-kubernetes-master 1/1 Running 1 38m
kube-system kube-controller-manager-kubernetes-master 1/1 Running 1 38m
kube-system kube-flannel-ds-amd64-29p96 1/1 Running 1 28m
kube-system kube-flannel-ds-amd64-sb2fq 1/1 Running 1 37m
kube-system kube-proxy-59v6b 1/1 Running 1 38m
kube-system kube-proxy-bfd78 1/1 Running 0 28m
kube-system kube-scheduler-kubernetes-master 1/1 Running 1 38m
I have deployed nginx to verify that everything is working:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 41m
nginx-http ClusterIP 10.111.151.28 <none> 80/TCP 29m
However, when I try to reach nginx I get a timeout. kubectl describe pod gives me the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 32m default-scheduler Successfully assigned default/nginx-server-7bb6997d9c-kdcld to kubernetes-slave
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "dbb2595628fc2579c29779e31e27e27eaeff2dbcf2bdb68467c47f22a3590bd0" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "801e0f3f8ca4a9b7cc21d87d41141485e1b1da357f2d89e1644acf0ecf634016" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "77214c757449097bfbe05b24ebb5fd3c7f1d96f7e3e9a3cd48f3b37f30224feb" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "ebffdd723083d916c0910489e12368dc4069dd99c24a3a4ab1b1d4ab823866ff" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d87b93815380246a05470e597a88d50eb31c132a50e30000ab41a456d1e65107" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3ef233ef0a6c447134c7b027747a701d6576a80e76c9cc8ffd8287e8ee5f02a4" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "6b621aab3c57154941b37360240228fe939b528855a5fe8cd9536df63d41ed93" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "fa992bde90e0a1839180666bedaf74965fb26f3dccb33a66092836a25882ab44" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "81f74f687e17d67bd2853849f84ece33a118744278d78ac7af3bdeadff8aa9c7" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m (x2 over 32m) kubelet, kubernetes-slave (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "29188c3e73d08e81b08b2258254dc2691fcaa514ecc96e9df86f2e61ba455b76" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 32m (x11 over 32m) kubelet, kubernetes-slave Pod sandbox changed, it will be killed and re-created.
Normal Pulling 32m kubelet, kubernetes-slave pulling image "nginx"
Normal Pulled 32m kubelet, kubernetes-slave Successfully pulled image "nginx"
Normal Created 32m kubelet, kubernetes-slave Created container
I have tried exactly the same installation with only a bridged adapter configured on the virtual machines, and then everything works as expected.
I believe it's a configuration issue, but I am unable to solve it. Can someone advise me?
As I mentioned in a now-deleted comment, I recreated this on my Ubuntu 18.04 host. I created two Ubuntu 18.10 VMs, each with two adapters (one NAT and one Host-Only adapter). That is the same configuration you have described here, and everything works fine.
The one thing I had to do was add the second adapter manually. I did that with netplan before running kubeadm init on the master and kubeadm join on the node.
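For reference, a netplan entry for a second, host-only adapter could look like the sketch below. The interface name enp0s8, the address 192.168.56.10/24, and the function name are assumptions for illustration only (192.168.56.0/24 is VirtualBox's usual default host-only range); check ip link on the VM for the real interface name.

```shell
# Print a netplan fragment that gives one interface a static address.
# $1: interface name (assumed to be enp0s8 in the example call)
# $2: address in CIDR notation (assumed, pick one from your host-only range)
hostonly_netplan_yaml() {
  local iface="$1" addr="$2"
  cat <<EOF
network:
  version: 2
  ethernets:
    ${iface}:
      dhcp4: false
      addresses: [${addr}]
EOF
}

# Example call (prints the YAML to stdout, changes nothing on the system):
# hostonly_netplan_yaml enp0s8 192.168.56.10/24
```

The printed fragment can be merged into the existing netplan YAML file by hand; review it before applying, because a wrong interface name can cut off the VM's network.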
Just in case you did not do that: add the host-only adapter's network to the YAML file /etc/netplan/50-cloud-init.yaml, then run sudo netplan generate and sudo netplan apply. For nginx I used the deployment from the official Kubernetes documentation. Then I exposed the service:
kubectl create service nodeport nginx --tcp=80:80
Curling the node's IP address on the NodePort from the host machine works fine.
This was just to demonstrate what I did to make it work in my environment. Judging from the error in the pod description, it seems like there is something wrong with Flannel itself:
/run/flannel/subnet.env: no such file or directory
I checked this directory on master and it looks like this:
/run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Check whether the file is there on your nodes. If that does not help, we can troubleshoot further if you provide more information. There are too many unknowns, so I had to guess in some places. My advice would be to destroy everything and try again with the information I have provided, and to expose nginx with a NodePort service rather than ClusterIP. A ClusterIP service is only reachable from inside the cluster, for example from a node.
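To check the file on every node rather than only on the master, a small helper along these lines may be handy. It is only a sketch: the function name check_flannel_env is made up, and the key list is simply the four variables shown in the listing above.

```shell
# check_flannel_env [FILE]: succeed only if FILE exists and defines the
# four FLANNEL_* keys. FILE defaults to /run/flannel/subnet.env.
check_flannel_env() {
  local file="${1:-/run/flannel/subnet.env}"
  local key
  if [ ! -f "$file" ]; then
    echo "missing: $file (is the flannel pod on this node running?)" >&2
    return 1
  fi
  for key in FLANNEL_NETWORK FLANNEL_SUBNET FLANNEL_MTU FLANNEL_IPMASQ; do
    if ! grep -q "^${key}=" "$file"; then
      echo "missing key: $key in $file" >&2
      return 1
    fi
  done
  echo "ok: $file"
}
```

Run check_flannel_env with no argument on the master and on the worker. If it fails on the worker only, the flannel pod on that node is the first place to look.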
Please let me bump this thread. A long time ago I configured one NAT adapter for internet access and one Host-Only adapter for remote SSH, and I got the same errors, especially when setting up Rancher Longhorn.
Now I don't build it like that. First, I build a gateway server using CentOS with iptables (1 NAT adapter, 1 Host-Only adapter).
Then the other VMs have just one Host-Only interface, connected directly to the gateway server.
Related
I have installed microk8s on my CentOS 8 operating system.
kube-system coredns-7f9c69c78c-lxm7c 0/1 Running 1 18m
kube-system calico-node-thhp8 1/1 Running 1 68m
kube-system calico-kube-controllers-f7868dd95-dpsnl 0/1 CrashLoopBackOff 23 68m
When I run microk8s enable dns, coredns and calico-kube-controllers fail to start, as shown above.
Describe the pod for coredns :
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned kube-system/coredns-7f9c69c78c-lxm7c to localhost.localdomain
Normal Pulled 14m kubelet Container image "coredns/coredns:1.8.0" already present on machine
Normal Created 14m kubelet Created container coredns
Normal Started 14m kubelet Started container coredns
Warning Unhealthy 11m (x22 over 14m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Normal SandboxChanged 2m8s kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 2m7s kubelet Container image "coredns/coredns:1.8.0" already present on machine
Normal Created 2m7s kubelet Created container coredns
Normal Started 2m6s kubelet Started container coredns
Warning Unhealthy 2m6s kubelet Readiness probe failed: Get "http://10.1.102.132:8181/ready": dial tcp 10.1.102.132:8181: connect: connection refused
Warning Unhealthy 9s (x12 over 119s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Describe the pod for calico-kube-controllers :
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 73m default-scheduler no nodes available to schedule pods
Warning FailedScheduling 73m (x1 over 73m) default-scheduler no nodes available to schedule pods
Warning FailedScheduling 72m (x1 over 72m) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Normal Scheduled 72m default-scheduler Successfully assigned kube-system/calico-kube-controllers-f7868dd95-dpsnl to localhost.localdomain
Warning FailedCreatePodSandBox 72m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f3ea36b003b0c9142ae63fee31531f9102e40ab837f4d795d1efb5c85af223ec": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Warning FailedCreatePodSandBox 71m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a1c405cdcebe79c586badcc8da47700247751a50ef9a1403e95fc4995485fba0": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Warning FailedCreatePodSandBox 71m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4adb07610eef0d7a618105abf72a114e486c373a02d5d1b204da2bd35268dd1b": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Warning FailedCreatePodSandBox 71m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "96aac009175973ac4c20034824db3443b3ab184cfcd1ed23786e539fb6147796": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Warning FailedCreatePodSandBox 71m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "79639a18edcffddbdb93492157af43bb6c1f1a9ac2af1b3fbbac58335737d5dc": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Warning FailedCreatePodSandBox 70m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3264f006447297583a37d8cc87ffe01311deaf2a31bf25867b3b18c83db2167d": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Warning FailedCreatePodSandBox 70m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5c5cf6509bfcf515ad12bc51451e4c385e5242c4f7bb593779d207abf9c906a4": error getting ClusterInformation: resource does not exist: ClusterInformation(default) with error: clusterinformations.crd.projectcalico.org "default" not found
Normal Pulling 70m kubelet Pulling image "calico/kube-controllers:v3.13.2"
Normal Pulled 69m kubelet Successfully pulled image "calico/kube-controllers:v3.13.2" in 50.744281789s
Normal Created 69m kubelet Created container calico-kube-controllers
Normal Started 69m kubelet Started container calico-kube-controllers
Warning Unhealthy 69m (x2 over 69m) kubelet Readiness probe failed: Failed to read status file status.json: open status.json: no such file or directory
Warning MissingClusterDNS 37m (x185 over 72m) kubelet pod: "calico-kube-controllers-f7868dd95-dpsnl_kube-system(d8c3ee40-7d3b-4a84-9398-19ec8a6d9082)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
Warning Unhealthy 31m (x6 over 32m) kubelet Readiness probe failed: Failed to read status file status.json: open status.json: no such file or directory
Normal Pulled 30m (x4 over 32m) kubelet Container image "calico/kube-controllers:v3.13.2" already present on machine
Normal Created 30m (x4 over 32m) kubelet Created container calico-kube-controllers
Normal Started 30m (x4 over 32m) kubelet Started container calico-kube-controllers
Warning BackOff 22m (x42 over 32m) kubelet Back-off restarting failed container
Normal SandboxChanged 10m kubelet Pod sandbox changed, it will be killed and re-created.
Warning Unhealthy 9m36s (x6 over 10m) kubelet Readiness probe failed: Failed to read status file status.json: open status.json: no such file or directory
Normal Pulled 8m51s (x4 over 10m) kubelet Container image "calico/kube-controllers:v3.13.2" already present on machine
Normal Created 8m51s (x4 over 10m) kubelet Created container calico-kube-controllers
Normal Started 8m51s (x4 over 10m) kubelet Started container calico-kube-controllers
Warning BackOff 42s (x42 over 10m) kubelet Back-off restarting failed container
I cannot start my microk8s services. I don't encounter these errors on my Ubuntu server. What can I do about the errors I get on my CentOS 8 server?
Have you tried updating the microk8s version?
We are trying to create a pod, but its status has been stuck at ContainerCreating for a long time.
This is the output we got after running kubectl describe pod:
Name: demo-6c59fb8f77-9x6sr
Namespace: default
Priority: 0
Node: k8-slave2/10.0.0.5
Start Time: Wed, 23 Dec 2020 10:16:23 +0000
Labels: app=demo
pod-template-hash=6c59fb8f77
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/demo-6c59fb8f77
Containers:
private-docker-registry:
Container ID:
Image: private-docker-registry:5000/mahin/mof-docker-demo:v1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-p94zw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-p94zw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-p94zw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/demo-6c59fb8f77-9x6sr to k8-slave2
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8eee497a2176c7f5782222f804cc63a4abac7f4a2fc7813016793857ae1b1dff" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "95e72bfc6f6c13de7f5c96eb76b012c2e6639ca03f4c2f270b23ed1a09b90413" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "566370012e4a1d32af2ef9035ff64d743cd81f36f25d2724e7b033e393b8247e" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7d499e40f572cfc29ecfb44f8376493df56a44213b1c1e9333b65499a0c288cd" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "53241e64de1e4470712b4061e2c82f44916d654bc532f8f1d12e5d5d4e136914" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "fd168faab4546f988dc38fc56df2f71cf80c922e86d3f869be15a43f08328f99" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e578afe329abb0cba64802dfa480e00f2bbbb8c80be537791c24a31c853eb62f" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "a3cb32dba55907ca907fc4f38f7ca05ef6db10a6af2dd1fa3c4db166e4ab9ffe" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7e4368ba8ec460b3c94de24ab0a04b6c799eb28df885cbbacfc3bb3ffa8c1e67" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m (x4 over 10m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c4aaa8f8cd2dc1eff788baf04774c4ecc845568d00ed1b386df311ec224eb6f3" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 56s (x551 over 10m) kubelet Pod sandbox changed, it will be killed and re-created.
azureuser@k8-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default demo-6c59fb8f77-2jq6k 0/1 ContainerCreating 0 5m23s
kube-system coredns-f9fd979d6-q8s9b 1/1 Running 2 27h
kube-system coredns-f9fd979d6-qnm4j 1/1 Running 2 27h
kube-system etcd-k8-master 1/1 Running 2 27h
kube-system kube-apiserver-k8-master 1/1 Running 3 27h
kube-system kube-controller-manager-k8-master 1/1 Running 3 27h
kube-system kube-flannel-ds-kqz4t 0/1 CrashLoopBackOff 92 27h
kube-system kube-flannel-ds-szqzn 1/1 Running 3 27h
kube-system kube-flannel-ds-v9q47 0/1 CrashLoopBackOff 142 27h
kube-system kube-proxy-4mb47 1/1 Running 2 27h
kube-system kube-proxy-54m9b 1/1 Running 2 27h
kube-system kube-proxy-wdxfz 1/1 Running 1 27h
kube-system kube-scheduler-k8-master 1/1 Running 3 27h
kubernetes-dashboard dashboard-metrics-scraper-7b59f7d4df-zmlvs 0/1 ContainerCreating 0 27h
kubernetes-dashboard kubernetes-dashboard-665f4c5ff-cnsvn 0/1 ContainerCreating 0 6h3m
To fix the flannel CrashLoopBackOff we ran kubeadm reset, but after some time the problem showed up again.
Currently we are working with one master and two worker nodes.
My cluster details as follows:
azureuser@k8-master:~$ kubectl config view
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: DATA+OMITTED
server: https://52.150.11.168:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
user:
client-certificate-data: REDACTED
client-key-data: REDACTED
Docker version:
azureuser@k8-master:~$ sudo docker version
[sudo] password for azureuser:
Client:
Version: 19.03.6
API version: 1.40
Go version: go1.12.17
Git commit: 369ce74a3c
Built: Wed Oct 14 19:00:27 2020
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 19.03.6
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: 369ce74a3c
Built: Wed Oct 14 16:52:50 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.3-0ubuntu1~18.04.2
GitCommit:
runc:
Version: spec: 1.0.1-dev
GitCommit:
docker-init:
Version: 0.18.0
GitCommit:
kubeadm version :
azureuser@k8-master:~$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:15:05Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Flannel crashes whenever I try to schedule a pod.
Background
I think your issue is caused by the CrashLoopBackOff status of 2 of your Flannel CNI pods.
Your error
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8eee497a2176c7f5782222f804cc63a4abac7f4a2fc7813016793857ae1b1dff" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
indicates that the pod cannot be created because the /run/flannel/subnet.env file is missing.
In the Flannel GitHub documentation you can find:
Flannel runs a small, single binary agent called flanneld on each host, and is responsible for allocating a subnet lease to each host out of a larger, preconfigured address space.
This means that, to work properly, a Flannel pod has to be running on each node, because flanneld is what holds that node's subnet information and writes /run/flannel/subnet.env there. From your output I can see that only 1 of the 3 Flannel pods is working properly.
NAMESPACE NAME READY STATUS RESTARTS AGE
...
kube-system kube-flannel-ds-kqz4t 0/1 CrashLoopBackOff 92 27h
kube-system kube-flannel-ds-szqzn 1/1 Running 3 27h
kube-system kube-flannel-ds-v9q47 0/1 CrashLoopBackOff 142 27h
If the pod in question was scheduled on a node where the flannel pod is not working, its containers won't be created because of the CNI network failure. Besides your demo pod, the kubernetes-dashboard pods have the same issue and are stuck in ContainerCreating status.
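To pick the failing flannel pods out of a longer listing, a filter like the following could be used. It is a sketch that assumes the six-column layout of kubectl get pods --all-namespaces shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); the function name is hypothetical.

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print every
# flannel DaemonSet pod whose STATUS column is not "Running".
# Columns assumed: $2 = NAME, $4 = STATUS, $5 = RESTARTS.
unhealthy_flannel_pods() {
  awk '$2 ~ /^kube-flannel-ds/ && $4 != "Running" { print $2, $4, "restarts=" $5 }'
}

# Example call:
# kubectl get pods --all-namespaces | unhealthy_flannel_pods
```

With kubectl get pods -o wide -A the layout gains extra columns on the right, so the same filter still applies and the NODE column then tells you which node to inspect.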
Conclusion
Your demo pod is scheduled but cannot start, because Kubernetes runs into a network issue related to the flannel configuration file (...network: open /run/flannel/subnet.env: no such file or directory).
The restart counts of your flannel pods are very high for 27 hours of uptime. You have to determine why and fix it. It might be a lack of resources, network issues in your infrastructure, or many other reasons. Once all flannel pods are working correctly, you shouldn't encounter this error.
Solution
You have to make the flannel pods work correctly on each node.
Additional Troubleshooting Details
For a detailed investigation, please provide the output of:
$ kubectl describe pod kube-flannel-ds-kqz4t -n kube-system
$ kubectl describe pod kube-flannel-ds-v9q47 -n kube-system
The logs would also be helpful:
$ kubectl logs kube-flannel-ds-kqz4t -n kube-system
$ kubectl logs kube-flannel-ds-v9q47 -n kube-system
Please also replace the output of kubectl get pods --all-namespaces with that of kubectl get pods -o wide -A, and add the output of kubectl get nodes -o wide.
If you provide this information, it should be possible to determine the root cause of the flannel pod failures, and I will edit this answer with an exact solution.
I am new to Kubernetes. I have created a Kubernetes cluster with one master node and 2 worker nodes. I have installed Helm for deploying apps. I am getting the following error while starting the tiller pod:
tiller-deploy-5b4685ffbf-znbdc 0/1 ContainerCreating 0 23h
After describing the pod I got the following result:
[root@master-node flannel]# kubectl --namespace kube-system describe pod tiller-deploy-5b4685ffbf-znbdc
Events:
Type Reason Age From Message
Warning FailedCreatePodSandBox 10m (x34020 over 22h) kubelet, worker-node1 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "cdda0a8ae9200668a2256e8c7b41904dce604f73f0282b0443d972f5e2846059" network for pod "tiller-deploy-5b4685ffbf-znbdc": networkPlugin cni failed to set up pod "tiller-deploy-5b4685ffbf-znbdc_kube-system" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 25s (x34556 over 22h) kubelet, worker-node1 Pod sandbox changed, it will be killed and re-created.
Any hint on how I can get rid of this error?
You need to set up a CNI plugin such as Flannel. Verify that all the pods in the kube-system namespace are running.
To apply flannel in your cluster, run the following command:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
For flannel to work correctly with this manifest, the --pod-network-cidr passed to kubeadm init should be 10.244.0.0/16. If you use a different CIDR, customize the flannel manifest (kube-flannel.yml) so that its Network value matches.
Example:
net-conf.json: |
  {
    "Network": "10.10.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
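A quick way to confirm that the two values agree is to compare the CIDR given to kubeadm with the Network value in the flannel configuration. The sketch below assumes the net-conf.json content has been saved to a local file; both function names are hypothetical, and the sed expression only handles the simple one-key-per-line layout shown in the example.

```shell
# Print the value of the "Network" key from a net-conf.json file ($1).
flannel_network() {
  sed -n 's/.*"Network"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Compare the pod CIDR used for kubeadm init ($1) with the flannel
# Network value found in the net-conf.json file ($2).
check_pod_cidr() {
  local want="$1" got
  got="$(flannel_network "$2")"
  if [ "$want" = "$got" ]; then
    echo "match: $got"
  else
    echo "mismatch: kubeadm=$want flannel=$got" >&2
    return 1
  fi
}

# Example call:
# check_pod_cidr 10.10.0.0/16 ./net-conf.json
```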
I am trying to install GlusterFS on my Kubernetes cluster using heketi. I start gk-deploy, but it shows that the pods aren't found:
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
GlusterFS pods ... not found.
deploy-heketi pod ... not found.
heketi pod ... not found.
gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/heketi/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
node/sapdh2wrk1 not labeled
node/sapdh2wrk2 not labeled
node/sapdh2wrk3 not labeled
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ... pods not found.
I've started gk-deploy more than once.
I have 3 nodes in my Kubernetes cluster and it seems like the pods can't start up on any of them, but I don't understand why.
Pods are created but aren't ready:
kubectl get pods
NAME READY STATUS RESTARTS AGE
glusterfs-65mc7 0/1 Running 0 16m
glusterfs-gnxms 0/1 Running 0 16m
glusterfs-htkmh 0/1 Running 0 16m
heketi-754dfc7cdf-zwpwn 0/1 ContainerCreating 0 74m
Here is a log of one GlusterFS Pod, it ends with a warning:
Events:
Type Reason Age From Message
Normal Scheduled 19m default-scheduler Successfully assigned default/glusterfs-65mc7 to sapdh2wrk1
Normal Pulled 19m kubelet, sapdh2wrk1 Container image "gluster/gluster-centos:latest" already present on machine
Normal Created 19m kubelet, sapdh2wrk1 Created container
Normal Started 19m kubelet, sapdh2wrk1 Started container
Warning Unhealthy 13m (x12 over 18m) kubelet, sapdh2wrk1 Liveness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
Warning Unhealthy 3m58s (x35 over 18m) kubelet, sapdh2wrk1 Readiness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
Glusterfs-5.8-100.1 is installed and started up on every node including master.
What is the reason why Pods don't start up?
I am adding a worker node to a Kubernetes cluster that uses flannel. Here are the nodes in my cluster:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
jetson-80 NotReady <none> 167m v1.15.0
p4 Ready master 18d v1.15.0
This machine is reachable through the same network. When joining the cluster, Kubernetes pulls some images, among others k8s.gcr.io/pause:3.1, but for some reason it failed to pull them:
Warning FailedCreatePodSandBox 15d kubelet, jetson-81 Failed create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Error response from daemon: Get https://k8s.gcr.io/v2/: read tcp 192.168.8.81:58820->108.177.126.82:443: read: connection reset by peer
The machine is connected to the internet, but only wget works, not ping.
I tried to pull images elsewhere and copy them to the machine.
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.15.0 d235b23c3570 2 months ago 82.4MB
quay.io/coreos/flannel v0.11.0-arm64 32ffa9fadfd7 6 months ago 53.5MB
k8s.gcr.io/pause 3.1 da86e6ba6ca1 20 months ago 742kB
Here is the list of pods on the master:
NAME READY STATUS RESTARTS AGE
coredns-5c98db65d4-gmsz7 1/1 Running 0 2d22h
coredns-5c98db65d4-j6gz5 1/1 Running 0 2d22h
etcd-p4 1/1 Running 0 2d22h
kube-apiserver-p4 1/1 Running 0 2d22h
kube-controller-manager-p4 1/1 Running 0 2d22h
kube-flannel-ds-amd64-cq7kz 1/1 Running 9 17d
kube-flannel-ds-arm64-4s7kk 0/1 Init:CrashLoopBackOff 0 2m8s
kube-proxy-l2slz 0/1 CrashLoopBackOff 4 2m8s
kube-proxy-q6db8 1/1 Running 0 2d22h
kube-scheduler-p4 1/1 Running 0 2d22h
tiller-deploy-5d6cc99fc-rwdrl 1/1 Running 1 17d
Copying the images didn't work either. This is what I see when I check the associated flannel pod kube-flannel-ds-arm64-4s7kk:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 66s default-scheduler Successfully assigned kube-system/kube-flannel-ds-arm64-4s7kk to jetson-80
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 68ffc44cf8cd655234691b0362615f97c59d285bec790af40f890510f27ba298
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: a196d8540b68dc7fcd97b0cda1e2f3183d1410598b6151c191b43602ac2faf8e
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 9d05d1fcb54f5388ca7e64d1b6627b05d52aea270114b5a418e8911650893bc6
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 5b730961cddf5cc3fb2af564b1abb46b086073d562bb2023018cd66fc5e96ce7
Normal Created <invalid> (x5 over <invalid>) kubelet, jetson-80 Created container install-cni
Warning Failed <invalid> kubelet, jetson-80 Error: failed to start container "install-cni": Error response from daemon: cannot join network of a non running container: 1767e9eb9198969329eaa14a71a110212d6622a8b9844137ac5b247cb9e90292
Normal SandboxChanged <invalid> (x5 over <invalid>) kubelet, jetson-80 Pod sandbox changed, it will be killed and re-created.
Warning BackOff <invalid> (x4 over <invalid>) kubelet, jetson-80 Back-off restarting failed container
Normal Pulled <invalid> (x6 over <invalid>) kubelet, jetson-80 Container image "quay.io/coreos/flannel:v0.11.0-arm64" already present on machine
I still can't tell whether it's a Kubernetes or a Flannel issue, and I haven't been able to solve it despite multiple attempts. Please let me know if you need me to share more details.
EDIT:
using kubectl describe pod -n kube-system kube-proxy-l2slz :
Normal Pulled <invalid> (x67 over <invalid>) kubelet, ahold-jetson-80 Container image "k8s.gcr.io/kube-proxy:v1.15.0" already present on machine
Normal SandboxChanged <invalid> (x6910 over <invalid>) kubelet, ahold-jetson-80 Pod sandbox changed, it will be killed and re-created.
Warning FailedSync <invalid> (x77 over <invalid>) kubelet, ahold-jetson-80 (combined from similar events): error determining status: rpc error: code = Unknown desc = Error: No such container: 03e7ee861f8f63261ff9289ed2d73ea5fec516068daa0f1fe2e4fd50ca42ad12
Warning BackOff <invalid> (x8437 over <invalid>) kubelet, ahold-jetson-80 Back-off restarting failed container
Your problem may be caused by multiple sandbox containers on your node. Try restarting the kubelet:
$ systemctl restart kubelet
Check that you have generated an SSH key pair (ssh-keygen) and copied the public key to the right node, so that the nodes can connect to each other.
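Please make sure the firewall/security groups allow flannel's backend traffic between the nodes. By default that is UDP port 8285 for the udp backend and UDP port 8472 for the vxlan backend; port 58820 in your error message is only the ephemeral source port of the failed connection to k8s.gcr.io.
Look at the flannel logs and see if there are any errors there, but also look for "Subnet added: " messages. Each node should have added the subnets of all the other nodes.
While running ping, use tcpdump to see where the packets get dropped.
Try, in order: source flannel0 (ICMP), source host interface (UDP port 8285), destination host interface (UDP port 8285), destination flannel0 (ICMP), docker0 (ICMP). With the vxlan backend the interface is flannel.1 and the port is 8472 instead.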
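The "Subnet added: " check can be scripted roughly as follows. This is a sketch, not an official flannel tool: the function name is made up, and it only counts matching log lines, so repeated messages for the same subnet would inflate the count.

```shell
# Count "Subnet added: " lines in flannel logs read from stdin and compare
# the count with the number of peer nodes.
# $1: expected number of peers (nodes in the cluster minus one).
check_subnets_added() {
  local expected="$1" found
  # grep -c prints 0 and exits non-zero when nothing matches, hence `|| true`.
  found="$(grep -c 'Subnet added: ' || true)"
  if [ "$found" -ge "$expected" ]; then
    echo "ok: $found subnet(s) added"
  else
    echo "only $found of $expected subnet(s) added" >&2
    return 1
  fi
}

# Example call (replace <flannel-pod> with a pod name from your cluster):
# kubectl logs <flannel-pod> -n kube-system | check_subnets_added 1
```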
Here is useful documentation: flannel-documentation.