Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod network - kubernetes

Issue: Redis pod creation on a k8s (v1.10) cluster is stuck at "ContainerCreating". The pod events are:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 30m default-scheduler Successfully assigned redis to k8snode02
Normal SuccessfulMountVolume 30m kubelet, k8snode02 MountVolume.SetUp succeeded for volume "default-token-f8tcg"
Warning FailedCreatePodSandBox 5m (x1202 over 30m) kubelet, k8snode02 Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "redis_default" network: failed to find plugin "loopback" in path [/opt/loopback/bin /opt/cni/bin]
Normal SandboxChanged 47s (x1459 over 30m) kubelet, k8snode02 Pod sandbox changed, it will be killed and re-created.

When I used Calico as the CNI, I faced a similar issue.
The container remained in the creating state. I checked /etc/cni/net.d and /opt/cni/bin on the master; both are present, but I am not sure whether they are required on the worker node as well.
root#KubernetesMaster:/opt/cni/bin# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-5c7588df-5zds6 0/1 ContainerCreating 0 21m
root#KubernetesMaster:/opt/cni/bin# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubernetesmaster Ready master 26m v1.13.4
kubernetesslave1 Ready <none> 22m v1.13.4
root#KubernetesMaster:/opt/cni/bin#
kubectl describe pods
Name: nginx-5c7588df-5zds6
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: kubernetesslave1/10.0.3.80
Start Time: Sun, 17 Mar 2019 05:13:30 +0000
Labels: app=nginx
pod-template-hash=5c7588df
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/nginx-5c7588df
Containers:
nginx:
Container ID:
Image: nginx
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qtfbs (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-qtfbs:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-qtfbs
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18m default-scheduler Successfully assigned default/nginx-5c7588df-5zds6 to kubernetesslave1
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "123d527490944d80f44b1976b82dbae5dc56934aabf59cf89f151736d7ea8adc" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8cc5e62ebaab7075782c2248e00d795191c45906cc9579464a00c09a2bc88b71" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "30ffdeace558b0935d1ed3c2e59480e2dd98e983b747dacae707d1baa222353f" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "630e85451b6ce2452839c4cfd1ecb9acce4120515702edf29421c123cf231213" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "820b919b7edcfc3081711bb78b79d33e5be3f7dafcbad29fe46b6d7aa22227aa" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "abbfb5d2756f12802072039dec20ba52f546ae755aaa642a9a75c86577be589f" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "dfeb46ffda4d0f8a434f3f3af04328fcc4b6c7cafaa62626e41b705b06d98cc4" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "9ae3f47bb0282a56e607779d3267127ee8b0ae1d7f416f5a184682119203b1c8" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Warning FailedCreatePodSandBox 18m kubelet, kubernetesslave1 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "546d07f1864728b2e2675c066775f94d658e221ada5fb4ed6bf6689ec7b8de23" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
Normal SandboxChanged 18m (x12 over 18m) kubelet, kubernetesslave1 Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 3m39s (x829 over 18m) kubelet, kubernetesslave1 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f586be437843537a3082f37ad139c88d0eacfbe99ddf00621efd4dc049a268cc" network for pod "nginx-5c7588df-5zds6": NetworkPlugin cni failed to set up pod "nginx-5c7588df-5zds6_default" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
root#KubernetesMaster:/etc/cni/net.d#
On the worker node, NGINX is trying to come up but keeps exiting. I am not sure what's going on here; I am a newbie to Kubernetes and not able to fix this issue:
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 3 minutes ago Up 3 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 3 minutes ago Up 3 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 3 minutes ago Up 3 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
94b2994401d0 k8s.gcr.io/pause:3.1 "/pause" 1 second ago Up Less than a second k8s_POD_nginx-5c7588df-5zds6_default_677a722b-4873-11e9-a33a-06516e7d78c4_534
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 4 minutes ago Up 4 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 4 minutes ago Up 4 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f72500cae2b7 k8s.gcr.io/pause:3.1 "/pause" 1 second ago Up Less than a second k8s_POD_nginx-5c7588df-5zds6_default_677a722b-4873-11e9-a33a-06516e7d78c4_585
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 4 minutes ago Up 4 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
root#kubernetesslave1:/home/ubuntu# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ad5500e8270 fadcc5d2b066 "/usr/local/bin/kube…" 5 minutes ago Up 5 minutes k8s_kube-proxy_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
b1c9929ebe9e k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_calico-node-749qx_kube-system_4e2d8c9c-4873-11e9-a33a-06516e7d78c4_1
ceb78340b563 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_kube-proxy-f24gd_kube-system_4e2d313a-4873-11e9-a33a-06516e7d78c4_1
I checked /etc/cni/net.d and /opt/cni/bin on the worker node as well; they are there:
root#kubernetesslave1:/home/ubuntu# cd /etc/cni
root#kubernetesslave1:/etc/cni# ls -ltr
total 4
drwxr-xr-x 2 root root 4096 Mar 17 05:19 net.d
root#kubernetesslave1:/etc/cni# cd /opt/cni
root#kubernetesslave1:/opt/cni# ls -ltr
total 4
drwxr-xr-x 2 root root 4096 Mar 17 05:19 bin
root#kubernetesslave1:/opt/cni# cd bin
root#kubernetesslave1:/opt/cni/bin# ls -ltr
total 107440
-rwxr-xr-x 1 root root 3890407 Aug 17 2017 bridge
-rwxr-xr-x 1 root root 3475802 Aug 17 2017 ipvlan
-rwxr-xr-x 1 root root 3520724 Aug 17 2017 macvlan
-rwxr-xr-x 1 root root 3877986 Aug 17 2017 ptp
-rwxr-xr-x 1 root root 3475750 Aug 17 2017 vlan
-rwxr-xr-x 1 root root 9921982 Aug 17 2017 dhcp
-rwxr-xr-x 1 root root 2605279 Aug 17 2017 sample
-rwxr-xr-x 1 root root 32351072 Mar 17 05:19 calico
-rwxr-xr-x 1 root root 31490656 Mar 17 05:19 calico-ipam
-rwxr-xr-x 1 root root 2856252 Mar 17 05:19 flannel
-rwxr-xr-x 1 root root 3084347 Mar 17 05:19 loopback
-rwxr-xr-x 1 root root 3036768 Mar 17 05:19 host-local
-rwxr-xr-x 1 root root 3550877 Mar 17 05:19 portmap
-rwxr-xr-x 1 root root 2850029 Mar 17 05:19 tuning
root#kubernetesslave1:/opt/cni/bin#

Ensure that /etc/cni/net.d and its /opt/cni/bin friend both exist and are correctly populated with the CNI configuration files and binaries on all Nodes. For flannel specifically, one might make use of the flannel cni repo
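A minimal sketch for checking this on a node (assuming the default kubelet paths; run it on every node, e.g. over SSH):
# expect at least one *.conf/*.conflist describing your CNI (calico, flannel, ...)
ls -l /etc/cni/net.d/
# expect the standard plugins (loopback, bridge, portmap, ...) plus your CNI's binaries
ls -l /opt/cni/bin/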

I had this issue with my GKE cluster on GCP with one of my preemptible node pools. Thanks to @mdaniel's tip of checking the integrity of /etc/cni/net.d, I could reproduce the issue by SSHing into a node of a testing cluster with the command gcloud compute ssh <name of some node> --zone <zone-of-cluster> --internal-ip. Then I simply edited the file /etc/cni/net.d/10-gke-ptp.conflist and messed with the values in "routes": [ {"dst": "0.0.0.0/0"} ] (changed from 0.0.0.0/0 to 1.0.0.0/0).
After that, I deleted the pods that were running inside of it, and they all got stuck in the ContainerCreating status forever, generating kubelet events with the error Failed create pod sandbox: rpc error: code...
Note that in order to test this I set up my node pool to have a maximum of 1 node. Otherwise it would scale up a new one and the pods would be recreated on the new node. In my production incident the node pool had reached its maximum node count, so limiting my tests to 1 node reproduced a similar situation.
Since deleting the node from GKE solved the issue in production, I created a Python script that lists all events on the cluster and filters the ones that contain the keyword "Failed create pod sandbox: rpc error: code". Then I go over those events and get their pods, and from the pods I get the nodes. Finally I loop over the nodes, deleting them both from the Kubernetes API and from the Compute API with their respective Python clients. For the Python script I used the libs kubernetes and google-cloud-compute.
This is a simpler version of the script; set the PROJECT_ID and CLUSTER_ZONE placeholders to your own values and test it before using it:
import time

from kubernetes import client, config
from google.cloud.compute_v1.services.instances import InstancesClient

# Placeholders -- set these to your own GCP project and zone
PROJECT_ID = "your-gcp-project"
CLUSTER_ZONE = "your-cluster-zone"

ERROR_KEYWORDS = [
    'Failed to create pod sandbox'.lower()
]

config.load_kube_config()
v1 = client.CoreV1Api()
gcp_client = InstancesClient()

events_result = v1.list_event_for_all_namespaces()

# filter only the events containing ERROR_KEYWORDS
filtered_events = []
for event in events_result.items:
    for error_keyword in ERROR_KEYWORDS:
        if error_keyword in event.message.lower():
            filtered_events.append(event)

# get the list of pods from those events
pods_list = {}
for event in filtered_events:
    try:
        pod = v1.read_namespaced_pod(
            event.involved_object.name,
            namespace=event.involved_object.namespace
        )
        pod_dict = {
            "name": event.involved_object.name,
            "namespace": event.involved_object.namespace,
            "node": pod.spec.node_name
        }
        pods_list[event.involved_object.name] = pod_dict
    except Exception as e:
        pass

# get the nodes from those pods
broken_nodes = set()
for name, pod_dict in pods_list.items():
    if pod_dict.get('node'):
        broken_nodes.add(pod_dict["node"])
broken_nodes = list(broken_nodes)

# delete the nodes from both the Kubernetes API and the Compute Engine API
if broken_nodes:
    broken_nodes_str = ", ".join(broken_nodes)
    print(f'BROKEN NODES: "{broken_nodes_str}"')
    for node in broken_nodes:
        try:
            api_response = v1.delete_node(node)
        except Exception as e:
            pass
        time.sleep(30)
        try:
            result = gcp_client.delete(project=PROJECT_ID, zone=CLUSTER_ZONE, instance=node)
        except Exception as e:
            pass
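To try it, something like the following should work (delete_broken_nodes.py is a hypothetical filename; your kubeconfig and GCP credentials must already be configured):
pip install kubernetes google-cloud-compute
python delete_broken_nodes.py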

In my case the cause was that AWS EKS didn't yet support t3a, m5ad and r5ad instances.

kubectl drain node1 node2 --delete-local-data --force --ignore-daemonsets
I was only intending to evict the pods from all nodes, but unexpectedly the pods that had been erroring all along became Running. You can try executing it; I hope it is useful to you.
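Note that drain also cordons the nodes, so once the pods are healthy again you would typically re-enable scheduling; a minimal sketch:
kubectl uncordon node1 node2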

This problem appeared for me when I added a PVC on AWS EKS.
Updating the aws-node CNI plugin to the latest version resolved it -
https://docs.aws.amazon.com/eks/latest/userguide/managing-vpc-cni.html
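A hedged way to check which aws-node (VPC CNI) version the cluster is running before updating, along the lines of the linked guide (aws-node is the default daemonset name on EKS):
kubectl describe daemonset aws-node -n kube-system | grep Image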

The following steps reset the Kubernetes cluster and helped me solve my problem (a condensed command sketch follows this list):
Stop all running pods
Delete all worker nodes from the cluster
Perform kubeadm reset on the master and the nodes
Initialize the master node:
kubeadm init --apiserver-advertise-address
Install the pod network "WeaveNet":
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.IPALLOC_RANGE=192.168.0.0/16"
Join the nodes to the cluster
Restart all nodes
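A condensed, hedged sketch of the same sequence (the master IP, token and hash are placeholders that kubeadm prints itself; the Weave URL and IP range are taken from the step above):
# on the master and on every worker being reset
sudo kubeadm reset
# on the master
sudo kubeadm init --apiserver-advertise-address=<master-ip>
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.IPALLOC_RANGE=192.168.0.0/16"
# on each worker, using the join command printed by kubeadm init
sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>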

#-------------------------------------
#Reset the kubernetes environment
#-----------------------------------
#[root#centos8-Master: ~]# k get nodes
#NAME STATUS ROLES AGE VERSION
#centos8-master Ready control-plane 14m v1.24.1
#centos8-slave Ready <none> 11m v1.24.3
#
#Master Node
#1. Delete the nodes
#First delete all pods, deployments, svc
#kubectl delete --all pods
#kubectl delete --all deployments
#kubectl delete --all svc
#kubectl drain centos8-slave --ignore-daemonsets --delete-emptydir-data --force
#kubectl delete node centos8-slave
#
#Worker Node
#2. Go to worker node, stop all the kubelet services.
#[root#centos8-Slave rprasads]# kubectl version --short
#Client Version: v1.24.3
#Kustomize Version: v4.5.4
#[root#centos8-Slave rprasads]# systemctl stop kubelet
#[root#centos8-Slave rprasads]# netstat -tulnp |grep kube
#kill -9 <pid> [kube-proxy]
#
#Master Node
#2. Reset the kubeadm.
#$ sudo kubeadm reset
#$ sudo swapoff -a
#
#Master Node
#3. Get you kubeadm version
#[root#centos8-Master: ~]# kubectl version --short
#Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
#Client Version: v1.24.1
#Kustomize Version: v4.5.4
#Server Version: v1.24.3
#
#Master Node
#4.On Master Initialize the kubeadm with proper network address and version
#$ kubeadm init --apiserver-advertise-address=192.168.56.101 --pod-network-cidr=192.168.0.0/16
##Download calico yaml file from the site: Refer the documentation https://projectcalico.docs.tigera.io/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico-with-kubernetes-api-datastore-more-than-50-nodes
#
#$ curl https://projectcalico.docs.tigera.io/manifests/calico.yaml -O
#$ kubectl apply -f calico.yaml
#
#Worker Node
#5. Go to worker node and add the node with the command displayed.
# kubeadm join 192.168.56.101:6443 --token h0nuxq.zk9m731nc4ia93pq --discovery-token-ca-cert-hash sha256:1682644baf3433caeb0e6f9099ed487ef48b94ab6a0314df88e3ff42ae501a13
#
#Master Node
#6.On the master node run below commands.
#$ sudo rm -rf $HOME/.kube
#
#$ mkdir -p $HOME/.kube
#$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
#$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
#
#$ sudo systemctl enable docker.service
#$ sudo service kubelet restart
#
#$ kubectl get nodes
#
#
#------------------------------------------------
#Test your new kubernetes cluster environment.
#-----------------------------------------------
#[root#centos8-Master: ~]# kubectl run nginx --image=nginx
#Wait for some time.
#
#[root#centos8-Master: ~]# k describe pods nginx
#Normal Scheduled 21s default-scheduler Successfully assigned default/nginx to centos8-slave
#
#[root#centos8-Master: ~]# k get pods
#NAME READY STATUS RESTARTS AGE
#nginx 1/1 Running 0 25s
#
#*************************************END*************************************

Related

k8s unable to pull image from the local unsecured registry

I am doing the CKAD course from the Linux Foundation (LFD259)
In Lab 3.2. (Configure a Local Repository) we spin up a local unsecured registry from which k8s would pull the simple app image. However, I am unable to make it work.
So, before creating the deployment everything seems to be in order:
student#master:~$ curl 10.97.82.186:5000/v2/_catalog
{"repositories":["simpleapp"]}
student#master:~$ k get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 1/1 1 1 118m
registry 1/1 1 1 118m
student#master:~$ k get pod
NAME READY STATUS RESTARTS AGE
nginx-6488f757bc-cf4q4 1/1 Running 1 (51m ago) 118m
registry-d4cf9fd7d-qj6tn 1/1 Running 1 (51m ago) 118m
student#master:~$ sudo podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/simpleapp latest bb19ffc6050a 2 hours ago 943 MB
10.97.82.186:5000/simpleapp latest bb19ffc6050a 2 hours ago 943 MB
docker.io/library/python 3 e285995a3494 8 days ago 943 MB
10.97.82.186:5000/tagtest latest 9c6f07244728 6 weeks ago 5.83 MB
student#master:~$ echo $repo
10.97.82.186:5000
student#master:~$
Let us create the deployment as per the lab instructions:
student#master:~$ k create deployment try1 --image=$repo/simpleapp
deployment.apps/try1 created
student#master:~$ k describe pod try1-5f97db4fb8-j9csw |grep Failed
Warning Failed 11s kubelet Failed to pull image "10.97.82.186:5000/simpleapp": rpc error: code = Unknown desc = failed to pull and unpack image "10.97.82.186:5000/simpleapp:latest": failed to resolve reference "10.97.82.186:5000/simpleapp:latest": failed to do request: Head https://10.97.82.186:5000/v2/simpleapp/manifests/latest: http: server gave HTTP response to HTTPS client
Warning Failed 11s kubelet Error: ErrImagePull
Warning Failed 10s (x2 over 11s) kubelet Error: ImagePullBackOff
student#master:~$
What I find suspicious is the url https://10.97.82.186:5000/v2/simpleapp/manifests/latest - no way https is going to work here.
How do we fix it?
P.S.
Also posted the question here - https://forum.linuxfoundation.org/discussion/862137/k8s-unable-to-pull-image-from-the-local-unsecured-registry
EDIT 1
To work with a local image registry we are instructed to modify the following two files:
/etc/containers/registries.conf.d/registry.conf
student#master:~$ cat /etc/containers/registries.conf.d/registry.conf
[[registry]]
location = "10.97.82.186:5000"
insecure = true
student#master:~$
/etc/containerd/config.toml
student#master:~$ diff -U3 /etc/containerd/config.toml /etc/containerd/config.toml.orig
--- /etc/containerd/config.toml 2022-09-21 21:22:37.032171446 +0000
+++ /etc/containerd/config.toml.orig 2022-09-22 03:35:37.032007211 +0000
@@ -152,9 +152,6 @@
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
- [plugins."io.containerd.grpc.v1.cri".registry.mirrors."*"]
- endpoint = ["10.97.82.186:5000"]
-
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
student#master:~$
You can try to set the "http" protocol explicitly in your endpoint URL, i.e.:
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."*"]
  endpoint = ["http://10.97.82.186:5000"]

network: failed to find plugin "bandwidth" in path [/opt/cni/bin]]

k8s cannot successfully create the pod; it is always stuck in the ContainerCreating stage.
Describing the pod reports the error messages:
network: failed to find plugin "bandwidth" in path [/opt/cni/bin]]
Pod sandbox changed, it will be killed and re-created.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22s default-scheduler Successfully assigned monitoring/vpc-minibase-fronted-monitor-687545864b-2tfr7 to 192.168.1.90
Warning FailedCreatePodSandBox 22s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "df25ea8a9310e4bac83aa7661e8b0304991790c4bf727ff7272738a460a229f2" network for pod "vpc-minibase-fronted-monitor-687545864b-2tfr7": networkPlugin cni failed to set up pod "vpc-minibase-fronted-monitor-687545864b-2tfr7_monitoring" network: failed to find plugin "loopback" in path [/opt/cni/bin], failed to clean up sandbox container "df25ea8a9310e4bac83aa7661e8b0304991790c4bf727ff7272738a460a229f2" network for pod "vpc-minibase-fronted-monitor-687545864b-2tfr7": networkPlugin cni failed to teardown pod "vpc-minibase-fronted-monitor-687545864b-2tfr7_monitoring" network: failed to find plugin "bandwidth" in path [/opt/cni/bin]]
Normal SandboxChanged 8s (x2 over 21s) kubelet Pod sandbox changed, it will be killed and re-created.
I checked the CNI path /opt/cni/bin/bandwidth, but it does exist:
> ls /opt/cni/bin/
bandwidth calico calico-ipam flannel host-local install loopback portmap tuning
Since the path /opt/cni/bin/bandwidth exists, I suspected that the CNI plugin could not be identified correctly because of an inode change.
I experimented by changing the inode of each path in turn:
Changing the inode of /opt/cni/bin/bandwidth: pods can still be created.
Changing the inode of /opt/cni/bin: pods can still be created.
Changing the inode of /opt/cni: the failure above is reproduced.
> sudo cp cni cni.bak
> ls -li
2737603 drwxr-xr-x 3 root root 4096 Sep 22 20:56 cni
2737669 drwxr-xr-x 3 root root 4096 Sep 22 20:58 cni.bak
> sudo rm -rf cni
> sudo mv cni.bak cni
> ls -li
2737669 drwxr-xr-x 3 root root 4096 Sep 22 20:56 cni
Restarting kubelet makes it pick up the CNI plugins in the current environment again; after that, all pods were back to normal.
docker restart kubelet
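In this environment kubelet apparently runs as a Docker container; on a node where kubelet is managed by systemd, the equivalent would be (a minimal sketch):
sudo systemctl restart kubelet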

Today I created three pods from a deployment.yaml, but the pods' status is always ContainerCreating. Could anyone help?

this is my deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.12.2
        ports:
        - containerPort: 80
These are my pods:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-5cc6c7559b-6vk87 0/1 ContainerCreating 0 51m <none> k8s-node2 <none> <none>
nginx-deployment-5cc6c7559b-g7wpz 0/1 ContainerCreating 0 51m <none> k8s-node1 <none> <none>
nginx-deployment-5cc6c7559b-s6k2s 0/1 ContainerCreating 0 51m <none> k8s-node1 <none> <none>
this is my description of a pod
Name: nginx-deployment-5cc6c7559b-6vk87
Namespace: default
Priority: 0
Node: k8s-node2/192.168.74.136
Start Time: Mon, 22 Mar 2021 03:02:36 -0400
Labels: app=nginx
pod-template-hash=5cc6c7559b
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/nginx-deployment-5cc6c7559b
Containers:
nginx:
Container ID:
Image: nginx:1.12.2
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-s7x98 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-s7x98:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-s7x98
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/nginx-deployment-5cc6c7559b-6vk87 to k8s-node2
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "688ae6e1b403f8cf0f56bb41ef6e2341044c949304874400a3f4ced159c40f08" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d8b9f498bf0407ebc5e8e47700af9cec559632f38d12252b1edcde723ce9863f" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "2c72ed28e5672a1da32f7941ba0b638eb459048ff9e70aec42bd125a569faf3f" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7dd06af29506a6e4f22c9484b47ca23412a57b61398ee6caa89edec59e2dcfa5" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "6c14c33fdbb3bb8e42d7e33c991bc51220dcbfd5acc71115c26f966a759fff29" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "dacb90c7ab07cc55c83dba82286e65dd89e30be569e9b5744202c2ae65f54830" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d03004bff912d6f9aaf614e892d2b43c153392e8fcc03e7988c43d4dfb46ebf0" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7eaf53ffba761c30bfa13f2b3cae2ca2957f9fefee47edf6c0b46943bb09d7a3" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 33m kubelet, k8s-node2 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "5047e9ad878b99b69090cf96e5534dfe10ec46830cdcc7e73a8afc96dc11e98c" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 18m (x859 over 33m) kubelet, k8s-node2 Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 3m34s (x1712 over 33m) kubelet, k8s-node2 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "501ff9b578eac098d6f763a0bc6212423b71714c9d2b1c83ea94b25e7a30e374" network for pod "nginx-deployment-5cc6c7559b-6vk87": networkPlugin cni failed to set up pod "nginx-deployment-5cc6c7559b-6vk87_default" network: open /run/flannel/subnet.env: no such file or directory
I found a similar question: Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod network
I have /etc/cni/net.d and /opt/cni/bin:
[root#k8s-master bin]# cd /etc/cni/net.d/
[root#k8s-master net.d]# ll -a
total 4
drwxr-xr-x. 2 root root 33 Feb 25 11:08 .
drwxr-xr-x. 3 root root 19 Feb 25 11:08 ..
-rw-r--r--. 1 root root 292 Feb 25 11:08 10-flannel.conflist
[root#k8s-master net.d]# cd /opt/cni/bin
[root#k8s-master bin]# ll -a
total 56484
drwxr-xr-x. 2 root root 239 Feb 25 10:01 .
drwxr-xr-x. 3 root root 17 Feb 25 10:01 ..
-rwxr-xr-x. 1 root root 3254624 Sep 9 2020 bandwidth
-rwxr-xr-x. 1 root root 3581192 Sep 9 2020 bridge
-rwxr-xr-x. 1 root root 9837552 Sep 9 2020 dhcp
-rwxr-xr-x. 1 root root 4699824 Sep 9 2020 firewall
-rwxr-xr-x. 1 root root 2650368 Sep 9 2020 flannel
-rwxr-xr-x. 1 root root 3274160 Sep 9 2020 host-device
-rwxr-xr-x. 1 root root 2847152 Sep 9 2020 host-local
-rwxr-xr-x. 1 root root 3377272 Sep 9 2020 ipvlan
-rwxr-xr-x. 1 root root 2715600 Sep 9 2020 loopback
-rwxr-xr-x. 1 root root 3440168 Sep 9 2020 macvlan
-rwxr-xr-x. 1 root root 3048528 Sep 9 2020 portmap
-rwxr-xr-x. 1 root root 3528800 Sep 9 2020 ptp
-rwxr-xr-x. 1 root root 2849328 Sep 9 2020 sbr
-rwxr-xr-x. 1 root root 2503512 Sep 9 2020 static
-rwxr-xr-x. 1 root root 2820128 Sep 9 2020 tuning
-rwxr-xr-x. 1 root root 3377120 Sep 9 2020 vlan
I have three nodes named k8s-master, k8s-node1 and k8s-node2, but I haven't added any rules for the nodes.
Something is not right:
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-m7bjr 0/1 CrashLoopBackOff 171 24d
coredns-7ff77c879f-x4xjf 0/1 Running 170 24d
etcd-k8s-master 1/1 Running 0 24d
kube-apiserver-k8s-master 1/1 Running 8 24d
kube-controller-manager-k8s-master 1/1 Running 2 24d
kube-proxy-6wxcp 1/1 Running 1 24d
kube-proxy-cmhn6 1/1 Running 0 24d
kube-proxy-pzhqc 1/1 Running 0 24d
kube-scheduler-k8s-master 1/1 Running 2 24d
My network plugin flannel isn't working; maybe that caused this problem.
I only had to execute one command:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
That resolved the issue.
As you said, it looks like the issue is with flannel.
Please try to follow the advises on this GitHub issue: https://github.com/kubernetes/kubernetes/issues/70202#issuecomment-481173403
The top voted answer is:
Just got the same problem - fixed it by manually adding the file:
/run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
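A sketch for creating that file on an affected node, assuming the default 10.244.0.0/16 flannel pod CIDR (adjust to your --pod-network-cidr); note that /run is usually tmpfs, so the file will not survive a reboot:
cat <<'EOF' | sudo tee /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF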

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container

We are trying to create a pod, but its status has been stuck at ContainerCreating for a long time.
This is the output we got after running the command: kubectl describe pod
Name: demo-6c59fb8f77-9x6sr
Namespace: default
Priority: 0
Node: k8-slave2/10.0.0.5
Start Time: Wed, 23 Dec 2020 10:16:23 +0000
Labels: app=demo
pod-template-hash=6c59fb8f77
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/demo-6c59fb8f77
Containers:
private-docker-registry:
Container ID:
Image: private-docker-registry:5000/mahin/mof-docker-demo:v1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-p94zw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-p94zw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-p94zw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/demo-6c59fb8f77-9x6sr to k8-slave2
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8eee497a2176c7f5782222f804cc63a4abac7f4a2fc7813016793857ae1b1dff" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "95e72bfc6f6c13de7f5c96eb76b012c2e6639ca03f4c2f270b23ed1a09b90413" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "566370012e4a1d32af2ef9035ff64d743cd81f36f25d2724e7b033e393b8247e" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7d499e40f572cfc29ecfb44f8376493df56a44213b1c1e9333b65499a0c288cd" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "53241e64de1e4470712b4061e2c82f44916d654bc532f8f1d12e5d5d4e136914" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "fd168faab4546f988dc38fc56df2f71cf80c922e86d3f869be15a43f08328f99" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e578afe329abb0cba64802dfa480e00f2bbbb8c80be537791c24a31c853eb62f" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "a3cb32dba55907ca907fc4f38f7ca05ef6db10a6af2dd1fa3c4db166e4ab9ffe" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7e4368ba8ec460b3c94de24ab0a04b6c799eb28df885cbbacfc3bb3ffa8c1e67" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 10m (x4 over 10m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c4aaa8f8cd2dc1eff788baf04774c4ecc845568d00ed1b386df311ec224eb6f3" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
Normal SandboxChanged 56s (x551 over 10m) kubelet Pod sandbox changed, it will be killed and re-created.
azureuser#k8-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default demo-6c59fb8f77-2jq6k 0/1 ContainerCreating 0 5m23s
kube-system coredns-f9fd979d6-q8s9b 1/1 Running 2 27h
kube-system coredns-f9fd979d6-qnm4j 1/1 Running 2 27h
kube-system etcd-k8-master 1/1 Running 2 27h
kube-system kube-apiserver-k8-master 1/1 Running 3 27h
kube-system kube-controller-manager-k8-master 1/1 Running 3 27h
kube-system kube-flannel-ds-kqz4t 0/1 CrashLoopBackOff 92 27h
kube-system kube-flannel-ds-szqzn 1/1 Running 3 27h
kube-system kube-flannel-ds-v9q47 0/1 CrashLoopBackOff 142 27h
kube-system kube-proxy-4mb47 1/1 Running 2 27h
kube-system kube-proxy-54m9b 1/1 Running 2 27h
kube-system kube-proxy-wdxfz 1/1 Running 1 27h
kube-system kube-scheduler-k8-master 1/1 Running 3 27h
kubernetes-dashboard dashboard-metrics-scraper-7b59f7d4df-zmlvs 0/1 ContainerCreating 0 27h
kubernetes-dashboard kubernetes-dashboard-665f4c5ff-cnsvn 0/1 ContainerCreating 0 6h3m
To fix the flannel CrashLoopBackOff we did a kubeadm reset, and after some time this problem showed up again.
Currently we are working with one master and two worker nodes.
My cluster details are as follows:
azureuser#k8-master:~$ kubectl config view
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: DATA+OMITTED
server: https://52.150.11.168:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin#kubernetes
current-context: kubernetes-admin#kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
user:
client-certificate-data: REDACTED
client-key-data: REDACTED
Docker version:
azureuser#k8-master:~$ sudo docker version
[sudo] password for azureuser:
Client:
Version: 19.03.6
API version: 1.40
Go version: go1.12.17
Git commit: 369ce74a3c
Built: Wed Oct 14 19:00:27 2020
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 19.03.6
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: 369ce74a3c
Built: Wed Oct 14 16:52:50 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.3-0ubuntu1~18.04.2
GitCommit:
runc:
Version: spec: 1.0.1-dev
GitCommit:
docker-init:
Version: 0.18.0
GitCommit:
kubeadm version :
azureuser#k8-master:~$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:15:05Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Flannel crashes whenever I try to schedule pod creation.
Background
I think your issue is caused by your 2 flannel CNI pods being in CrashLoopBackOff status.
Your error
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8eee497a2176c7f5782222f804cc63a4abac7f4a2fc7813016793857ae1b1dff" network for pod "demo-6c59fb8f77-9x6sr": networkPlugin cni failed to set up pod "demo-6c59fb8f77-9x6sr_default" network: open /run/flannel/subnet.env: no such file or directory
indicates that the pod cannot be created because the /run/flannel/subnet.env file is missing.
In the Flannel GitHub documentation you can find:
Flannel runs a small, single binary agent called flanneld on each host, and is responsible for allocating a subnet lease to each host out of a larger, preconfigured address space.
Meaning that, to work properly, a flannel pod should be running on each node, since it holds the subnet information. From your output I can see that only 1 out of 3 flannel pods is working properly.
NAMESPACE NAME READY STATUS RESTARTS AGE
...
kube-system kube-flannel-ds-kqz4t 0/1 CrashLoopBackOff 92 27h
kube-system kube-flannel-ds-szqzn 1/1 Running 3 27h
kube-system kube-flannel-ds-v9q47 0/1 CrashLoopBackOff 142 27h
If the mentioned pod was scheduled on a node where the flannel pod is not working, it won't be created due to CNI network issues. Besides your demo pod, the kubernetes-dashboard pods also have the same issue, with ContainerCreating status.
Conclusion
Your demo pod cannot start because Kubernetes encounters network issues related to the flannel configuration file (...network: open /run/flannel/subnet.env: no such file or directory).
Your flannel pods' restart counts are very high for 27 hours. You have to determine why and fix it. It might be a lack of resources, network issues with your infrastructure, or many other reasons. Once all flannel pods are working correctly, you shouldn't encounter this error.
Solution
You have to make the flannel pods work correctly on each node.
Additional Troubleshooting Details
For a detailed investigation, please provide:
$ kubectl describe pod kube-flannel-ds-kqz4t -n kube-system
$ kubectl describe pod kube-flannel-ds-v9q47 -n kube-system
Log details would also be helpful:
$ kubectl logs kube-flannel-ds-kqz4t -n kube-system
$ kubectl logs kube-flannel-ds-v9q47 -n kube-system
Please also run kubectl get pods -o wide -A instead of kubectl get pods --all-namespaces, and provide the output of kubectl get nodes -o wide.
If you provide that information, it should be possible to determine the root cause of the flannel pod issues, and I will edit this answer with the exact solution.
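In the meantime, a hedged sketch for seeing why the flannel pods crash (the app=flannel label is assumed from the upstream kube-flannel manifest, and the pod names are taken from your output):
kubectl -n kube-system get pods -l app=flannel -o wide
kubectl -n kube-system logs kube-flannel-ds-kqz4t --previous --tail=50
kubectl -n kube-system logs kube-flannel-ds-v9q47 --previous --tail=50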

How to setup a kubernetes cluster in the purely IPV6 setting?

I have two PCs: A has Windows 10 installed, B has Ubuntu 20.04 installed.
In general, A and B are in different local networks; for example, A is in 10.0.7.X and B is in 10.1.7.X. They cannot ping each other, but they can reach each other by IPv6 address.
Now I want to set up a cluster on these two PCs, with A as the master node and B as the slave node.
On A, I have installed Hyper-V and installed Ubuntu 20.04 in a Hyper-V virtual machine, and I can set up an IPv4-based Kubernetes cluster when putting them in the same local network (e.g., 10.0.7.X).
Now I'm trying to set up the cluster in a purely IPv6 setting. I've tried the following command:
kubeadm init \
--apiserver-advertise-address=2400:dd01:1032:7:4070:XXXX:XXXX:XXXX \
--pod-network-cidr 10.244.0.0/16 \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--cri-socket /var/run/dockershim.sock \
--feature-gates IPv6DualStack=true
and added the network add-on based on:
curl https://docs.projectcalico.org/manifests/canal.yaml -O
kubectl apply -f canal.yaml
and the outputs of kubectl get pods -A are:
root#master:~# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-8f59968d4-jsw55 0/1 ContainerCreating 0 65s
kube-system canal-gwgjw 0/2 CrashLoopBackOff 2 66s
kube-system coredns-6c76c8bb89-6xjc2 0/1 ContainerCreating 0 98s
kube-system coredns-6c76c8bb89-ww6cg 0/1 ContainerCreating 0 98s
kube-system etcd-master 1/1 Running 0 104s
kube-system kube-apiserver-master 1/1 Running 0 104s
kube-system kube-controller-manager-master 1/1 Running 0 104s
kube-system kube-proxy-nd9bt 1/1 Running 0 98s
kube-system kube-scheduler-master 1/1 Running 0 104s
and the outputs of systemctl status kubelet is:
ubuntu#master:~$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2020-10-19 18:24:59 CST; 1min 24s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 98042 (kubelet)
Tasks: 24 (limit: 4657)
Memory: 43.6M
CGroup: /system.slice/kubelet.service
├─ 98042 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_co>
└─100328 /opt/cni/bin/calico
10月 19 18:26:17 master kubelet[98042]: I1019 18:26:17.206835 98042 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 8b35778f95a8f3fed7e15a1d29456f6067bba1e827ca584f1474e073a085b41e
10月 19 18:26:17 master kubelet[98042]: I1019 18:26:17.213580 98042 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: a8b573738e9b72c47c55f49b6946812206251770a4d8490002a04e80991680ef
10月 19 18:26:17 master kubelet[98042]: E1019 18:26:17.225830 98042 pod_workers.go:191] Error syncing pod fcdd3939-7388-4308-bad6-bd419adcd039 ("canal-gwgjw_kube-system(fcdd3939-7388-4308-bad6-bd419adcd039)"), skipping: failed to "StartContainer" for "kube-flannel" with >
10月 19 18:26:22 master kubelet[98042]: W1019 18:26:22.801044 98042 cni.go:333] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "ea68e0f25fbd137253c572e36bcb9cc9a29ebb4850571411a3ed55a5858f4ebc"
10月 19 18:26:23 master kubelet[98042]: W1019 18:26:23.828962 98042 cni.go:333] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "8f789a2f1bd4d9d6a6f590b6cbb218b2dda0589ce30242efbf8391e596f4050b"
10月 19 18:26:23 master kubelet[98042]: E1019 18:26:23.922994 98042 cni.go:387] Error deleting kube-system_calico-kube-controllers-8f59968d4-jsw55/ea68e0f25fbd137253c572e36bcb9cc9a29ebb4850571411a3ed55a5858f4ebc from network calico/k8s-pod-network: error getting ClusterI>
10月 19 18:26:23 master kubelet[98042]: E1019 18:26:23.925412 98042 remote_runtime.go:140] StopPodSandbox "ea68e0f25fbd137253c572e36bcb9cc9a29ebb4850571411a3ed55a5858f4ebc" from runtime service failed: rpc error: code = Unknown desc = networkPlugin cni failed to teardown>
10月 19 18:26:23 master kubelet[98042]: E1019 18:26:23.925499 98042 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "ea68e0f25fbd137253c572e36bcb9cc9a29ebb4850571411a3ed55a5858f4ebc"}
10月 19 18:26:23 master kubelet[98042]: E1019 18:26:23.925575 98042 kuberuntime_manager.go:677] killPodWithSyncResult failed: failed to "KillPodSandbox" for "bab09661-da02-414f-8faa-ccb02c8267dd" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin c>
10月 19 18:26:23 master kubelet[98042]: E1019 18:26:23.925617 98042 pod_workers.go:191] Error syncing pod bab09661-da02-414f-8faa-ccb02c8267dd ("calico-kube-controllers-8f59968d4-jsw55_kube-system(bab09661-da02-414f-8faa-ccb02c8267dd)"), skipping: failed to "KillPodSandb>
ubuntu#master:~$
Does anybody know how to set up a Kubernetes cluster in a purely IPv6 setting? Thanks.