Error while starting a Pod in a newly created Kubernetes cluster (ContainerCreating)

I am new to Kubernetes. I have created a Kubernetes cluster with one master node and 2 worker nodes, and I have installed Helm for deploying apps. I am getting the following error while starting the tiller pod:
tiller-deploy-5b4685ffbf-znbdc 0/1 ContainerCreating 0 23h
After describing the pod, I got the following result:
[root@master-node flannel]# kubectl --namespace kube-system describe pod tiller-deploy-5b4685ffbf-znbdc
Events:
Type Reason Age From Message
Warning FailedCreatePodSandBox 10m (x34020 over 22h) kubelet, worker-node1 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "cdda0a8ae9200668a2256e8c7b41904dce604f73f0282b0443d972f5e2846059" network for pod "tiller-deploy-5b4685ffbf-znbdc": networkPlugin cni failed to set up pod "tiller-deploy-5b4685ffbf-znbdc_kube-system" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 25s (x34556 over 22h) kubelet, worker-node1 Pod sandbox changed, it will be killed and re-created.
Any hint on how I can get past this error would be appreciated.

You need to set up a CNI plugin such as Flannel. Verify that all the pods in the kube-system namespace are running.
To apply Flannel in your cluster, run the following command:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
For Flannel to work correctly, pod-network-cidr should be 10.244.0.0/16. If you use a different CIDR, you can customize the Flannel manifest (kube-flannel.yml) accordingly.
Example:
net-conf.json: |
  {
    "Network": "10.10.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
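Note that the pod network CIDR is normally fixed when the cluster is created, so a customized value in the manifest has to match what the control plane was initialized with. A minimal sketch, assuming a kubeadm-based setup like the one in the question:
# Initialize the control plane with the CIDR that Flannel will use
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# After applying kube-flannel.yml, verify that a flannel pod is Running on every node
kubectl get pods -n kube-system -o wide
kubectl get nodes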

Related

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container

We are trying to create a Pod, but its status has been stuck at ContainerCreating for a long time.
This is the output we got after running the command: kubectl describe pod
Name: demo-6c59fb8f77-9x6sr
Namespace: default
Priority: 0
Node: k8-slave2/10.0.0.5
Start Time: Wed, 23 Dec 2020 10:16:23 +0000
Labels: app=demo
pod-template-hash=6c59fb8f77
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/demo-6c59fb8f77
Containers:
private-docker-registry:
Container ID:
Image: private-docker-registry:5000/mahin/mof-docker-demo:v1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-p94zw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-p94zw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-p94zw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/demo-6c59fb8f77-9x6sr to k8-slave2
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8eee497a2176c7f5782222f804cc63a4abac7f4a2fc7813016793857ae1b1dff" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "95e72bfc6f6c13de7f5c96eb76b012c2e6639ca03f4c2f270b23ed1a09b90413" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "566370012e4a1d32af2ef9035ff64d743cd81f36f25d2724e7b033e393b8247e" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7d499e40f572cfc29ecfb44f8376493df56a44213b1c1e9333b65499a0c288cd" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "53241e64de1e4470712b4061e2c82f44916d654bc532f8f1d12e5d5d4e136914" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "fd168faab4546f988dc38fc56df2f71cf80c922e86d3f869be15a43f08328f99" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e578afe329abb0cba64802dfa480e00f2bbbb8c80be537791c24a31c853eb62f" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "a3cb32dba55907ca907fc4f38f7ca05ef6db10a6af2dd1fa3c4db166e4ab9ffe" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7e4368ba8ec460b3c94de24ab0a04b6c799eb28df885cbbacfc3bb3ffa8c1e67" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m (x4 over 10m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c4aaa8f8cd2dc1eff788baf04774c4ecc845568d00ed1b386df311ec224eb6f3" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 56s (x551 over 10m) kubelet Pod sandbox changed, it will be killed and re-created.
azureuser@k8-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default demo-6c59fb8f77-2jq6k 0/1 ContainerCreating 0 5m23s
kube-system coredns-f9fd979d6-q8s9b 1/1 Running 2 27h
kube-system coredns-f9fd979d6-qnm4j 1/1 Running 2 27h
kube-system etcd-k8-master 1/1 Running 2 27h
kube-system kube-apiserver-k8-master 1/1 Running 3 27h
kube-system kube-controller-manager-k8-master 1/1 Running 3 27h
kube-system kube-flannel-ds-kqz4t 0/1 CrashLoopBackOff 92 27h
kube-system kube-flannel-ds-szqzn 1/1 Running 3 27h
kube-system kube-flannel-ds-v9q47 0/1 CrashLoopBackOff 142 27h
kube-system kube-proxy-4mb47 1/1 Running 2 27h
kube-system kube-proxy-54m9b 1/1 Running 2 27h
kube-system kube-proxy-wdxfz 1/1 Running 1 27h
kube-system kube-scheduler-k8-master 1/1 Running 3 27h
kubernetes-dashboard dashboard-metrics-scraper-7b59f7d4df-zmlvs 0/1 ContainerCreating 0 27h
kubernetes-dashboard kubernetes-dashboard-665f4c5ff-cnsvn 0/1 ContainerCreating 0 6h3m
To fix the Flannel CrashLoopBackOff we did a kubeadm reset, but after some time this problem showed up again.
Currently we are working with one master and two worker nodes.
My cluster details as follows:
azureuser@k8-master:~$ kubectl config view
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: DATA+OMITTED
server: https://52.150.11.168:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
user:
client-certificate-data: REDACTED
client-key-data: REDACTED
Docker version:
azureuser@k8-master:~$ sudo docker version
[sudo] password for azureuser:
Client:
Version: 19.03.6
API version: 1.40
Go version: go1.12.17
Git commit: 369ce74a3c
Built: Wed Oct 14 19:00:27 2020
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 19.03.6
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: 369ce74a3c
Built: Wed Oct 14 16:52:50 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.3-0ubuntu1~18.04.2
GitCommit:
runc:
Version: spec: 1.0.1-dev
GitCommit:
docker-init:
Version: 0.18.0
GitCommit:
kubeadm version :
azureuser@k8-master:~$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:15:05Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Flannel crashes whenever I try to schedule pod creation.
Background
I think your issue is caused by your 2 Flannel CNI pods being in CrashLoopBackOff status.
Your error
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8eee497a2176c7f5782222f804cc63a4abac7f4a2fc7813016793857ae1b1dff" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
indicates that the pod cannot be created because the /run/flannel/subnet.env file is missing.
In the Flannel GitHub documentation you can find:
Flannel runs a small, single binary agent called flanneld on each host, and is responsible for allocating a subnet lease to each host out of a larger, preconfigured address space.
Meaning that, for things to work properly, a Flannel pod should be running on each node, as it holds the subnet information. From your output I can see that only 1 of your 3 Flannel pods is working properly.
NAMESPACE NAME READY STATUS RESTARTS AGE
...
kube-system kube-flannel-ds-kqz4t 0/1 CrashLoopBackOff 92 27h
kube-system kube-flannel-ds-szqzn 1/1 Running 3 27h
kube-system kube-flannel-ds-v9q47 0/1 CrashLoopBackOff 142 27h
If a pod is scheduled on a node where the Flannel pod is not working, it won't be created due to CNI network issues. Besides your demo pod, the kubernetes-dashboard pods have the same issue and are stuck in ContainerCreating status.
Conclusion
Your demo pod cannot be created because Kubernetes encounters network issues related to the Flannel configuration file (...network: open /run/flannel/subnet.env: no such file or directory).
Your Flannel pods' restart counts are very high for 27 hours. You have to determine why and fix it. It might be a lack of resources, network issues with your infrastructure, or many other reasons. Once all Flannel pods are working correctly, you shouldn't encounter this error.
Solution
You have to get the Flannel pods working correctly on each node.
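A minimal way to start, assuming the standard Flannel DaemonSet (the pod names below are taken from your output): check whether the subnet file exists on each node, and force the crashing pods to be recreated so you get fresh logs.
# On each worker node, check whether Flannel has written its subnet file
ls -l /run/flannel/subnet.env
# From the control plane, delete the crashing pods; the DaemonSet recreates them
kubectl delete pod kube-flannel-ds-kqz4t kube-flannel-ds-v9q47 -n kube-system
# Watch them restart and inspect the logs of any pod that keeps crashing
kubectl get pods -n kube-system -w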
Additional Troubleshooting Details
For a detailed investigation, please provide:
$ kubectl describe pod kube-flannel-ds-kqz4t -n kube-system
$ kubectl describe pod kube-flannel-ds-v9q47 -n kube-system
Log details would also be helpful:
$ kubectl logs kube-flannel-ds-kqz4t -n kube-system
$ kubectl logs kube-flannel-ds-v9q47 -n kube-system
Please also replace kubectl get pods --all-namespaces with kubectl get pods -o wide -A, and include the output of kubectl get nodes -o wide.
If you provide that information, it should be possible to determine the root cause of the Flannel pod issues, and I will edit this answer with the exact solution.

More detailed monitoring of Pod states

Our Pods usually spend at least one minute, and up to several minutes, in the Pending state; the events from kubectl describe pod x are:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned testing/runner-2zyekyp-project-47-concurrent-0tqwl4 to host
Normal Pulled 55s kubelet, host Container image "registry.com/image:c1d98da0c17f9b1d4ca81713c138ee2e" already present on machine
Normal Created 55s kubelet, host Created container build
Normal Started 54s kubelet, host Started container build
Normal Pulled 54s kubelet, host Container image "gitlab/gitlab-runner-helper:x86_64-6214287e" already present on machine
Normal Created 54s kubelet, host Created container helper
Normal Started 54s kubelet, host Started container helper
The information provided is not detailed enough to figure out exactly what is happening.
Question:
How can we gather more detailed metrics about what exactly happens, and when, on the way to getting a Pod running, so we can troubleshoot how much time each step takes?
Of special interest is how long it takes to mount a volume.
Check the kubelet and kube-scheduler logs, because the kube-scheduler assigns the pod to a node, and the kubelet then starts the pod on that node and reports its status as Ready.
journalctl -u kubelet # after logging into the kubernetes node
kubectl logs kube-scheduler -n kube-system
Describe the pod, deployment, replicaset to get more details
kubectl describe pod podname -n namespacename
kubectl describe deploy deploymentname -n namespacename
kubectl describe rs replicasetname -n namespacename
Check events
kubectl get events -n namespacename
Describe the nodes and check the available resources and their status, which should be Ready.
kubectl describe node nodename
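If the goal is to see how long each step takes, the event timestamps themselves are usually enough. A small sketch (the pod and namespace names are placeholders), listing the events of a single pod in chronological order with their timestamps:
kubectl get events -n namespacename \
  --field-selector involvedObject.name=podname \
  --sort-by=.metadata.creationTimestamp \
  -o custom-columns=FIRST:.firstTimestamp,LAST:.lastTimestamp,TYPE:.type,REASON:.reason,MESSAGE:.message
Volume-related delays typically show up here as FailedAttachVolume or FailedMount events with their own timestamps.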

AWS has per node Pod IP restrictions, pods are stuck at ContainerCreating state

As we all know, AWS has a per-node Pod IP restriction, and Kubernetes doesn't take this into account while scheduling, so pods get scheduled on nodes where no pod IPs can be allocated and get stuck in the ContainerCreating state, as follows:
Normal Scheduled 114s default-scheduler Successfully assigned default/whoami-deployment-9f9c86c4f-r4flx to ip-192-168-15-248.ec2.internal
Warning FailedCreatePodSandBox 111s kubelet, ip-192-168-15-248.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8d4b5f98f9b600ad9ec486f994fa2f9223d5224842df7f78802616f014b52970" network for pod "whoami-deployment-9f9c86c4f-r4flx": NetworkPlugin cni failed to set up pod "whoami-deployment-9f9c86c4f-r4flx_default" network: add cmd: failed to assign an IP address to container
Normal SandboxChanged 86s (x12 over 109s) kubelet, ip-192-168-15-248.ec2.internal Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 61s (x4 over 76s) kubelet, ip-192-168-15-248.ec2.internal (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e2a3c54ba7d9a33a45248f7c276f4a2d5b0c8ba6c3deb5184392156b35638553" network for pod "whoami-deployment-9f9c86c4f-r4flx": NetworkPlugin cni failed to set up pod "whoami-deployment-9f9c86c4f-r4flx_default" network: add cmd: failed to assign an IP address to container
So I tried to work around the issue by tainting the nodes with key=value:NoSchedule, so that the default scheduler doesn't schedule pods to nodes that have already reached their pod IP limit, and I deleted all pods that were stuck in the ContainerCreating state (the taint command is sketched below). I was hoping this would stop the scheduler from placing any more pods on the tainted nodes, and that's what happened. But since the pods were no longer scheduled, I was also hoping the cluster-autoscaler would scale the ASG so my pods would run on new nodes, and that's what didn't happen.
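For reference, the taint described above was applied with a command of this shape (the key and value are placeholders):
kubectl taint nodes <node-name> key=value:NoSchedule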
When I do describe pod I see:
Warning FailedScheduling 40s (x5 over 58s) default-scheduler 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.
Normal NotTriggerScaleUp 5s (x6 over 56s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) had taints that the pod didn't tolerate
When I look at cluster-autoscaler logs I see:
I1108 16:30:47.521026 1 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"whoami-deployment-9f9c86c4f-x5h4d", UID:"158cc806-0245-11ea-a67a-0efb4254edc4", APIVersion:"v1", ResourceVersion:"2483839", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) had taints that the pod didn't tolerate
Now, I tried an alternative way to mark my nodes unschedulable by removing the above NoSchedule taint and patching nodes by:
kubectl patch nodes node1.internal -p '{"spec": {"unschedulable": true}}'
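For reference, this patch is equivalent to cordoning the node, which sets the same spec.unschedulable field:
kubectl cordon node1.internal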
And these are the logs I see in the cluster-autoscaler:
I1109 10:47:50.894680 1 static_autoscaler.go:138] Starting main loop
W1109 10:47:50.894719 1 static_autoscaler.go:562] Cluster has no ready nodes.
I1109 10:47:50.901157 1 event.go:209] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"7c949105-0153-11ea-9a39-12e5fc698b6e", APIVersion:"v1", ResourceVersion:"2629645", FieldPath:""}): type: 'Warning' reason: 'ClusterUnhealthy' Cluster has no ready nodes.
So my idea for working around the issue didn't pan out. How should I overcome this?
Kubernetes version: 1.14
Cluster Autoscaler: 1.14.6
Let me know if you guys need more details.

Troubles while installing GlusterFS on Kubernetes cluster using Heketi

I am trying to install GlusterFS on my Kubernetes cluster using Heketi. I start gk-deploy, but it shows that the pods aren't found:
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
GlusterFS pods ... not found.
deploy-heketi pod ... not found.
heketi pod ... not found.
gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/heketi/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
node/sapdh2wrk1 not labeled
node/sapdh2wrk2 not labeled
node/sapdh2wrk3 not labeled
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ... pods not found.
I've started gk-deploy more than once.
I have 3 nodes in my Kubernetes cluster, and it seems like the pods can't start up on any of them, but I don't understand why.
Pods are created but aren't ready:
kubectl get pods
NAME READY STATUS RESTARTS AGE
glusterfs-65mc7 0/1 Running 0 16m
glusterfs-gnxms 0/1 Running 0 16m
glusterfs-htkmh 0/1 Running 0 16m
heketi-754dfc7cdf-zwpwn 0/1 ContainerCreating 0 74m
Here are the events of one GlusterFS Pod; they end with warnings:
Events:
Type Reason Age From Message
Normal Scheduled 19m default-scheduler Successfully assigned default/glusterfs-65mc7 to sapdh2wrk1
Normal Pulled 19m kubelet, sapdh2wrk1 Container image "gluster/gluster-centos:latest" already present on machine
Normal Created 19m kubelet, sapdh2wrk1 Created container
Normal Started 19m kubelet, sapdh2wrk1 Started container
Warning Unhealthy 13m (x12 over 18m) kubelet, sapdh2wrk1 Liveness probe failed: /usr/local/bin/status-probe.sh failed check: systemctl -q is-active glusterd.service
Warning Unhealthy 3m58s (x35 over 18m) kubelet, sapdh2wrk1 Readiness probe failed: /usr/local/bin/status-probe.sh failed check: systemctl -q is-active glusterd.service
GlusterFS 5.8-100.1 is installed and started on every node, including the master.
Why don't the Pods start up?
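Since both probes run systemctl -q is-active glusterd.service inside the container, one way to narrow this down (a diagnostic sketch, not a fix) is to run the same check by hand:
# On an affected node (e.g. sapdh2wrk1), check the unit the probe checks
systemctl status glusterd
journalctl -u glusterd --no-pager | tail -n 50
# Or run the probe's check inside the pod itself
kubectl exec glusterfs-65mc7 -- systemctl is-active glusterd.service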

Kubernetes cluster VirtualBox issues with networking (NAT and Host-only adapters)

I am trying to setup a kubernetes cluster (two nodes, 1 master, 1 worker) on VirtualBox. My host computer runs Windows 10 and on the VirtualBox I have installed Ubuntu 18.10, Codename cosmic.
I have configured two adapters on each VirtualBox, one NAT and one Host-Only adapter. I did that because I need to access some internal resources using the host IP (NAT) and I also need a stable network between the host and the virtual machines (Host-only network).
I have installed Kubernetes v1.12.4 and successfully joined the worker to the master node.
NAME STATUS ROLES AGE VERSION
kubernetes-master Ready master 36m v1.12.4
kubernetes-slave Ready <none> 25m v1.12.4
I am using Flannel for networking.
All pods seems to be ok.
NAMESPACE NAME READY STATUS RESTARTS AGE
default nginx-server-7bb6997d9c-kdcld 1/1 Running 0 27m
kube-system coredns-576cbf47c7-btrvb 1/1 Running 1 38m
kube-system coredns-576cbf47c7-zfscv 1/1 Running 1 38m
kube-system etcd-kubernetes-master 1/1 Running 1 38m
kube-system kube-apiserver-kubernetes-master 1/1 Running 1 38m
kube-system kube-controller-manager-kubernetes-master 1/1 Running 1 38m
kube-system kube-flannel-ds-amd64-29p96 1/1 Running 1 28m
kube-system kube-flannel-ds-amd64-sb2fq 1/1 Running 1 37m
kube-system kube-proxy-59v6b 1/1 Running 1 38m
kube-system kube-proxy-bfd78 1/1 Running 0 28m
kube-system kube-scheduler-kubernetes-master 1/1 Running 1 38m
I have deployed nginx to verify that everything is working
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 41m
nginx-http ClusterIP 10.111.151.28 <none> 80/TCP 29m
However, when I try to reach nginx I get a timeout. kubectl describe pod gives me the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 32m default-scheduler Successfully assigned default/nginx-server-7bb6997d9c-kdcld to kubernetes-slave
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "dbb2595628fc2579c29779e31e27e27eaeff2dbcf2bdb68467c47f22a3590bd0" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "801e0f3f8ca4a9b7cc21d87d41141485e1b1da357f2d89e1644acf0ecf634016" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "77214c757449097bfbe05b24ebb5fd3c7f1d96f7e3e9a3cd48f3b37f30224feb" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "ebffdd723083d916c0910489e12368dc4069dd99c24a3a4ab1b1d4ab823866ff" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d87b93815380246a05470e597a88d50eb31c132a50e30000ab41a456d1e65107" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3ef233ef0a6c447134c7b027747a701d6576a80e76c9cc8ffd8287e8ee5f02a4" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "6b621aab3c57154941b37360240228fe939b528855a5fe8cd9536df63d41ed93" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "fa992bde90e0a1839180666bedaf74965fb26f3dccb33a66092836a25882ab44" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m kubelet, kubernetes-slave Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "81f74f687e17d67bd2853849f84ece33a118744278d78ac7af3bdeadff8aa9c7" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 32m (x2 over 32m) kubelet, kubernetes-slave (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "29188c3e73d08e81b08b2258254dc2691fcaa514ecc96e9df86f2e61ba455b76" network for pod "nginx-server-7bb6997d9c-kdcld": NetworkPlugin cni failed to set up pod "nginx-server-7bb6997d9c-kdcld_default" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 32m (x11 over 32m) kubelet, kubernetes-slave Pod sandbox changed, it will be killed and re-created.
Normal Pulling 32m kubelet, kubernetes-slave pulling image "nginx"
Normal Pulled 32m kubelet, kubernetes-slave Successfully pulled image "nginx"
Normal Created 32m kubelet, kubernetes-slave Created container
I have tried to do exactly the same installation with only a bridged adapter configured on the virtual machines, and then everything works as expected.
I believe it's a configuration issue, but I am unable to solve it. Can someone advise me?
As I mentioned in a deleted comment, I recreated this on my Ubuntu 18.04 host: two Ubuntu 18.10 VMs, each with two adapters (one NAT and one host-only adapter), i.e. the same configuration you have specified here. Everything works fine.
What I had to do was add the second adapter manually; I did that with netplan before running kubeadm init and kubeadm join on the nodes.
Just in case you did not do that: add the host-only adapter network to the YAML file in /etc/netplan/50-cloud-init.yaml and run sudo netplan generate and sudo netplan apply (see the netplan sketch further below). For nginx I used the deployment from the official Kubernetes documentation. Then I exposed the service:
kubectl create service nodeport nginx --tcp=80:80
Curling my node's IP address on the NodePort from the host machine works fine.
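For completeness, a minimal sketch of such a netplan entry; the interface name (enp0s8) and address are assumptions and have to match your host-only adapter, and putting it in a separate file avoids touching the cloud-init-generated NAT config:
# Create /etc/netplan/51-host-only.yaml (interface name and address are assumptions)
cat <<'EOF' | sudo tee /etc/netplan/51-host-only.yaml
network:
  version: 2
  ethernets:
    enp0s8:
      addresses: [192.168.56.10/24]
EOF
sudo netplan generate && sudo netplan apply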
This was just to demonstrate what I did so that it works in my environment. Judging from the describe pod error, it seems like there is something wrong with Flannel itself:
/run/flannel/subnet.env: no such file or directory
I checked this directory on master and it looks like this:
/run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Check whether the file is there. If that does not help, we can troubleshoot further if you provide more information. However, there are too many unknowns and I had to guess in some places; my advice would be to tear it all down and try again with the information I have provided, and to run nginx with a NodePort service rather than ClusterIP. A ClusterIP service is only reachable from inside the cluster, for example from a node.
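A quick way to check this on the worker node (a sketch; get the Flannel pod name from kubectl get pods -n kube-system -o wide):
# On the worker node: does the Flannel subnet file exist?
ls -l /run/flannel/subnet.env && cat /run/flannel/subnet.env
# If it is missing, inspect the Flannel pod running on that node
kubectl get pods -n kube-system -o wide | grep flannel
kubectl logs <flannel-pod-on-that-node> -n kube-system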
Let me bump up this thread. A long time ago I configured 1 NAT adapter for internet access and 1 host-only adapter for remote SSH, and I got the same errors, especially when setting up Rancher Longhorn.
Now I don't build it like that. First, I build a gateway server using CentOS with iptables (1 NAT adapter, 1 host-only adapter).
Then the other VMs have just 1 host-only interface, connected directly to the gateway server.
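A minimal sketch of what such a gateway typically does, assuming eth0 is the NAT (internet-facing) interface and eth1 the host-only one; adapt the names and rules to your setup:
# Enable IPv4 forwarding on the gateway
sudo sysctl -w net.ipv4.ip_forward=1
# Masquerade traffic from the host-only network out through the NAT interface
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
sudo iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
sudo iptables -A FORWARD -i eth0 -o eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT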