KubeletNotReady - Failed to initialize CSINodeInfo - kubernetes

I'm having the below error while installing a Kubernetes cluster on Ubuntu 18.04. The Kubernetes master is ready. I'm using flannel as the pod network. I'm going to add my first node to the cluster using the join command.
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Wed, 11 Dec 2019 05:43:02 +0000 Wed, 11 Dec 2019 05:38:47 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 11 Dec 2019 05:43:02 +0000 Wed, 11 Dec 2019 05:38:47 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 11 Dec 2019 05:43:02 +0000 Wed, 11 Dec 2019 05:38:47 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready False Wed, 11 Dec 2019 05:43:02 +0000 Wed, 11 Dec 2019 05:38:47 +0000 KubeletNotReady Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condition; caused by: the server could not find the requested resource
Update:
I noticed the following on the worker node:
root@worker02:~# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Wed 2019-12-11 06:47:41 UTC; 27s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 14247 (kubelet)
Tasks: 14 (limit: 2295)
CGroup: /system.slice/kubelet.service
└─14247 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driv
Dec 11 06:47:43 worker02 kubelet[14247]: I1211 06:47:43.085292 14247 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "flannel-cfg" (UniqueName: "kuber
Dec 11 06:47:43 worker02 kubelet[14247]: I1211 06:47:43.086115 14247 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "flannel-token-nbss2" (UniqueName
Dec 11 06:47:43 worker02 kubelet[14247]: I1211 06:47:43.087975 14247 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy" (UniqueName: "kubern
Dec 11 06:47:43 worker02 kubelet[14247]: I1211 06:47:43.088104 14247 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "xtables-lock" (UniqueName: "kube
Dec 11 06:47:43 worker02 kubelet[14247]: I1211 06:47:43.088153 14247 reconciler.go:156] Reconciler: start to sync state
Dec 11 06:47:45 worker02 kubelet[14247]: E1211 06:47:45.130889 14247 csi_plugin.go:267] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condit
Dec 11 06:47:48 worker02 kubelet[14247]: E1211 06:47:48.134042 14247 csi_plugin.go:267] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condit
Dec 11 06:47:50 worker02 kubelet[14247]: E1211 06:47:50.538096 14247 csi_plugin.go:267] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condit
Dec 11 06:47:53 worker02 kubelet[14247]: E1211 06:47:53.131425 14247 csi_plugin.go:267] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condit
Dec 11 06:47:56 worker02 kubelet[14247]: E1211 06:47:56.840529 14247 csi_plugin.go:267] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condit
Please let me know how to fix this.

I have the same issue; however, my situation is that I had a running k8s cluster and suddenly the CSINodeInfo issue happened on some of my k8s nodes, and those nodes were not in the cluster anymore. After searching with Google for some days, I finally got the answer from
Node cannot join #86094
Just editing /var/lib/kubelet/config.yaml to add:
featureGates:
  CSIMigration: false
... at the end of the file seemed to allow the cluster to start as expected.
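After saving that edit, you will most likely also need to restart the kubelet so it re-reads the config:
sudo systemctl restart kubelet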

Link to install Kubernetes
This link is only for one cluster with one master node. If you want to add worker nodes, you need to specify the IP and name of each machine in the /etc/hosts file of your master node. Then instantiate your Kubernetes master. Once it is started, join your worker nodes to the master. Make sure you install kubectl and docker on your worker nodes. If you want the master only to manage the Kubernetes cluster, then please skip step 26 of the link shared.
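As a rough sketch of those two steps (the IPs, hostname, token, and hash below are placeholders, not values from this question):
# on the master: map the worker's IP to its hostname
echo "192.168.1.12 worker01" | sudo tee -a /etc/hosts
# on the worker: join with the token printed by 'kubeadm init'
sudo kubeadm join 192.168.1.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>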

Related

kubernetes master node is NOTReady | Unable to register node "node-server", it is forbidden

I've been trying to troubleshoot my kubernetes cluster since my master node is NotReady. I've followed guides on StackOverflow and the Kubernetes troubleshooting guide, but I am not able to pinpoint the issue. I'm relatively new to kubernetes.
Here's what I have tried:
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
NodeName NotReady master 213d v1.16.2
# kubectl describe node NodeName
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 12 Jan 2021 15:41:24 +0530 Tue, 12 Jan 2021 15:41:24 +0530 CalicoIsUp Calico is running on this node
MemoryPressure Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
# sudo journalctl -u kubelet -n 100 --no-pager
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.581359 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.681814 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.782649 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.883846 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: I0226 12:23:03.912585 11311 kubelet_node_status.go:286] Setting node annotation to enable volume controller attach/detach
Feb 26 12:23:03 devportal-test kubelet[11311]: I0226 12:23:03.918664 11311 kubelet_node_status.go:72] Attempting to register node devportal-test
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.926545 11311 kubelet_node_status.go:94] Unable to register node "devportal-test" with API server: nodes "devportal-test" is forbidden: node "NodeName" is not allowed to modify node "devportal-test"
Feb 26 12:23:05 devportal-test kubelet[11311]: E0226 12:23:05.893160 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:05 devportal-test kubelet[11311]: E0226 12:23:05.993770 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:06 devportal-test kubelet[11311]: E0226 12:23:06.095640 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:06 devportal-test kubelet[11311]: E0226 12:23:06.147085 11311 controller.go:135] failed to ensure node lease exists, will retry in 7s, error: leases.coordination.k8s.io "devportal-test" is forbidden: User "system:node:NodeName" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease": can only access node lease with the same name as the requesting node
# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6d85fdfbd8-dkxjx 1/1 Terminating 8 213d
calico-kube-controllers-6d85fdfbd8-jsxjd 0/1 Pending 0 28d
calico-node-v5w2w 1/1 Running 8 213d
coredns-5644d7b6d9-g8rnl 1/1 Terminating 16 213d
coredns-5644d7b6d9-vgzg2 0/1 Pending 0 28d
coredns-5644d7b6d9-z8dzw 1/1 Terminating 16 213d
coredns-5644d7b6d9-zmcjr 0/1 Pending 0 28d
etcd-NodeName 1/1 Running 34 213d
kube-apiserver-NodeName 1/1 Running 85 213d
kube-controller-manager-NodeName 1/1 Running 790 213d
kube-proxy-gd5jx 1/1 Running 9 213d
kube-scheduler-NodeName 1/1 Running 800 213d
local-path-provisioner-56db8cbdb5-gqgqr 1/1 Running 3 44d
# kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

How to fix "The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)" error

I'm setting up a new Kubernetes cluster on my local PC with the specifications below. While trying to initialize the Kubernetes cluster I'm facing some issues. Need your inputs.
OS version: Linux server.cent.com 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Docker version: Docker version 1.13.1, build 07f3374/1.13.1
[root@server ~]# rpm -qa |grep -i kube
kubectl-1.13.2-0.x86_64
kubernetes-cni-0.6.0-0.x86_64
kubeadm-1.13.2-0.x86_64
kubelet-1.13.2-0.x86_64
The issue I'm facing is:
[root@server ~]# kubeadm init --apiserver-advertise-address=192.168.203.154 --pod-network-cidr=10.244.0.0/16
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
Kubelet status:
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.354902 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.456166 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.558500 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.660833 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.763840 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.867118 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:09 server.cent.com kubelet[10994]: E0129 09:34:09.968783 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.071722 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.173396 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.274892 10994 kubelet.go:2266] node "server.cent.com" not found
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.292021 10994 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.328447 10994 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://192.168.20?
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.329742 10994 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168
Jan 29 09:34:10 server.cent.com kubelet[10994]: E0129 09:34:10.376238 10994 kubelet.go:2266] node "server.cent.com" not found
I have tried the same in all of these versions, but hit the same issue: 1.13.2, 1.12.0, 1.11.0, 1.10.0, and 1.9.0.
I faced this problem while installing k8s on Fedora CoreOS. Then I did:
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl restart docker
See: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
Then the docker restart failed, and I overcame that by creating a new file /etc/systemd/system/docker.service.d/docker.conf with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd
See: https://docs.docker.com/config/daemon/
After that, everything was fine and able to setup k8s cluster.
As per your outputs, it seems the kubelet service is not able to establish a connection to the Kubernetes API server, and therefore it hasn't passed the health check during installation. The reasons might differ; however, I suggest wiping your current kubeadm setup and proceeding with the installation from scratch. A good tutorial for that can be found within a similar case, or you can follow the official Kubernetes kubeadm installation guidelines.
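A minimal sketch of such a wipe, assuming a kubeadm-based install (these exact commands are my suggestion, not from the original answer):
sudo kubeadm reset -f
sudo rm -rf /etc/cni/net.d $HOME/.kube/config
Then re-run kubeadm init as before.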
For investigation purposes you can use Kubeadm Troubleshooting guide.
In case you have any doubts about installation steps or any other related questions just write a comment below this answer.

ICP 2.1.0.3 Install Timeout: FAILED - RETRYING: Waiting for kube-dns to start

Looks like the issue is because of CNI (Calico), but I'm not sure what the fix is in ICP (see the journalctl -u kubelet logs below).
ICP Installer Log:
FAILED! => {"attempts": 100, "changed": true, "cmd": "kubectl -n kube-system get daemonset kube-dns -o=custom-columns=A:.status.numberAvailable,B:.status.desiredNumberScheduled --no-headers=true | tr -s \" \" | awk '$1 == $2 {print \"READY\"}'", "delta": "0:00:00.403489", "end": "2018-07-08 09:04:21.922839", "rc": 0, "start": "2018-07-08 09:04:21.519350", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
journalctl -u kubelet:
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.548157 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: nodes is forbidden: User "kubelet" cannot list nodes at the cluster scope
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.549872 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "kubelet" cannot list pods at the cluster scope
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.555379 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: services is forbidden: User "kubelet" cannot list services at the cluster scope
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.738411 2763 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"k8s-master-10.50.50.201.153f85e7528e5906", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"k8s-master-10.50.50.201", UID:"b0ed63e50c3259666286e5a788d12b81", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{scheduler}"}, Reason:"Started", Message:"Started container", Source:v1.EventSource{Component:"kubelet", Host:"10.50.50.201"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbec8c296b58a5506, ext:106413065445, loc:(*time.Location)(0xb58e300)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbec8c296b58a5506, ext:106413065445, loc:(*time.Location)(0xb58e300)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "kubelet" cannot create events in the namespace "kube-system"' (will not retry!)
Jul 08 22:40:43 dev-master hyperkube[2763]: E0708 22:40:43.938806 2763 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jul 08 22:40:44 dev-master hyperkube[2763]: E0708 22:40:44.556337 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: nodes is forbidden: User "kubelet" cannot list nodes at the cluster scope
Jul 08 22:40:44 dev-master hyperkube[2763]: E0708 22:40:44.557513 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "kubelet" cannot list pods at the cluster scope
Jul 08 22:40:44 dev-master hyperkube[2763]: E0708 22:40:44.561007 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: services is forbidden: User "kubelet" cannot list services at the cluster scope
Jul 08 22:40:45 dev-master hyperkube[2763]: E0708 22:40:45.557584 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: nodes is forbidden: User "kubelet" cannot list nodes at the cluster scope
Jul 08 22:40:45 dev-master hyperkube[2763]: E0708 22:40:45.558375 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "kubelet" cannot list pods at the cluster scope
Jul 08 22:40:45 dev-master hyperkube[2763]: E0708 22:40:45.561807 2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: services is forbidden: User "kubelet" cannot list services at the cluster scope
Jul 08 22:40:46 dev-master hyperkube[2763]: I0708 22:40:46.393905 2763 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jul 08 22:40:46 dev-master hyperkube[2763]: I0708 22:40:46.396261 2763 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jul 08 22:40:46 dev-master hyperkube[2763]: E0708 22:40:46.397540 2763 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: nodes is forbidden: User "kubelet" cannot create nodes at the cluster scope
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.161949 9689 cni.go:259] Error adding network: no configured Calico pools
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.161980 9689 cni.go:227] Error while adding to cni network: no configured Calico pools
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468392 9689 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "kube-dns-splct_kube-system" network: no configured Calico
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468455 9689 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-dns-splct_kube-system(113e64b2-82e6-11e8-83bb-0242a9e42805)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468479 9689 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-splct_kube-system(113e64b2-82e6-11e8-83bb-0242a9e42805)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468556 9689 pod_workers.go:186] Error syncing pod 113e64b2-82e6-11e8-83bb-0242a9e42805 ("kube-dns-splct_kube-system(113e64b2-82e6-11e8-83bb-0242a9e42805)"), skipping: failed to "CreatePodSandbox" for "kube-d
Jul 08 19:43:48 dev-master hyperkube[9689]: I0708 19:43:48.938222 9689 kuberuntime_manager.go:513] Container {Name:calico-node Image:ibmcom/calico-node:v3.0.4 Command:[] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[{Name:ETCD_ENDPOINTS Value: ValueFrom:&EnvVarSource
Jul 08 19:43:48 dev-master hyperkube[9689]: e:FELIX_HEALTHENABLED Value:true ValueFrom:nil} {Name:IP_AUTODETECTION_METHOD Value:can-reach=10.50.50.201 ValueFrom:nil}] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:lib-modules ReadOnly:true MountPath:/lib/m
Jul 08 19:43:48 dev-master hyperkube[9689]: I0708 19:43:48.938449 9689 kuberuntime_manager.go:757] checking backoff for container "calico-node" in pod "calico-node-wpln7_kube-system(10107b3e-82e6-11e8-83bb-0242a9e42805)"
Jul 08 19:43:48 dev-master hyperkube[9689]: I0708 19:43:48.938699 9689 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=calico-node pod=calico-node-wpln7_kube-system(10107b3e-82e6-11e8-83bb-0242a9e42805)
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.938735 9689 pod_workers.go:186] Error syncing pod 10107b3e-82e6-11e8-83bb-0242a9e42805 ("calico-node-wpln7_kube-system(10107b3e-82e6-11e8-83bb-0242a9e42805)"), skipping: failed to "StartContainer" for "calic
docker ps (master node): the container k8s_POD_kube-dns-splct_kube-system-* is repeatedly crashing.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ed24d636fdd1 ibmcom/pause:3.0 "/pause" 1 second ago Up Less than a second k8s_POD_kube-dns-splct_kube-system_113e64b2-82e6-11e8-83bb-0242a9e42805_121
49b648837900 ibmcom/calico-cni "/install-cni.sh" 5 minutes ago Up 5 minutes k8s_install-cni_calico-node-wpln7_kube-system_10107b3e-82e6-11e8-83bb-0242a9e42805_0
933ff30177de ibmcom/calico-kube-controllers "/usr/bin/kube-contr…" 5 minutes ago Up 5 minutes k8s_calico-kube-controllers_calico-kube-controllers-759f7fc556-mm5tg_kube-system_1010712e-82e6-11e8-83bb-0242a9e42805_0
12e9262299af ibmcom/pause:3.0 "/pause" 6 minutes ago Up 5 minutes k8s_POD_calico-kube-controllers-759f7fc556-mm5tg_kube-system_1010712e-82e6-11e8-83bb-0242a9e42805_0
8dcb2b2b3eb5 ibmcom/pause:3.0 "/pause" 6 minutes ago Up 5 minutes k8s_POD_calico-node-wpln7_kube-system_10107b3e-82e6-11e8-83bb-0242a9e42805_0
9486ff78df31 ibmcom/tiller "/tiller" 6 minutes ago Up 6 minutes k8s_tiller_tiller-deploy-c59888d97-7nwph_kube-system_016019ab-82e6-11e8-83bb-0242a9e42805_0
e5588f68af1b ibmcom/pause:3.0 "/pause" 6 minutes ago Up 6 minutes k8s_POD_tiller-deploy-c59888d97-7nwph_kube-system_016019ab-82e6-11e8-83bb-0242a9e42805_0
e80460d857ff ibmcom/icp-image-manager "/icp-image-manager …" 10 minutes ago Up 10 minutes k8s_image-manager_image-manager-0_kube-system_7b7554ce-82e5-11e8-83bb-0242a9e42805_0
e207175f19b7 ibmcom/registry "/entrypoint.sh /etc…" 10 minutes ago Up 10 minutes k8s_icp-registry_image-manager-0_kube-system_7b7554ce-82e5-11e8-83bb-0242a9e42805_0
477faf0668f3 ibmcom/pause:3.0 "/pause" 10 minutes ago Up 10 minutes k8s_POD_image-manager-0_kube-system_7b7554ce-82e5-11e8-83bb-0242a9e42805_0
8996bb8c37b7 d4b6454d4873 "/hyperkube schedule…" 10 minutes ago Up 10 minutes k8s_scheduler_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
835ee941432c d4b6454d4873 "/hyperkube apiserve…" 10 minutes ago Up 10 minutes k8s_apiserver_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
de409ff63cb2 d4b6454d4873 "/hyperkube controll…" 10 minutes ago Up 10 minutes k8s_controller-manager_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
716032a308ea ibmcom/pause:3.0 "/pause" 10 minutes ago Up 10 minutes k8s_POD_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
bd9e64e3d6a2 d4b6454d4873 "/hyperkube proxy --…" 12 minutes ago Up 12 minutes k8s_proxy_k8s-proxy-10.50.50.201_kube-system_3e068267cfe8f990cd2c9a4635be044d_0
bab3c9ef7e40 ibmcom/pause:3.0 "/pause" 12 minutes ago Up 12 minutes k8s_POD_k8s-proxy-10.50.50.201_kube-system_3e068267cfe8f990cd2c9a4635be044d_0
Kubectl (master node): I believe kube should have been initialized and running by this time, but it seems it is not.
kubectl get pods -s 127.0.0.1:8888 --all-namespaces
The connection to the server 127.0.0.1:8888 was refused - did you specify the right host or port?
Following are the options I tried:
- Created the cluster with IP_IP both enabled and disabled. As all nodes are on the same subnet, the IP_IP setup should not have an impact.
- Ran etcd on a separate node and as part of the master node.
ifconfig tunl0 returns the following (i.e. without an IP assignment) in all of the above scenarios:
tunl0 Link encap:IPIP Tunnel HWaddr
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
'calicoctl get profile' returns empty, and so does 'calicoctl get nodes', which I believe is because Calico is not configured yet.
Other checks, thoughts and options?
Calico Kube Controller Logs (repeated):
2018-07-09 05:46:08.440 [WARNING][1] cache.go 278: Value for key has changed, queueing update to reprogram key="kns.default" type=v3.Profile
2018-07-09 05:46:08.440 [WARNING][1] cache.go 278: Value for key has changed, queueing update to reprogram key="kns.kube-public" type=v3.Profile
2018-07-09 05:46:08.440 [WARNING][1] cache.go 278: Value for key has changed, queueing update to reprogram key="kns.kube-system" type=v3.Profile
2018-07-09 05:46:08.440 [INFO][1] namespace_controller.go 223: Create/Update Profile in Calico datastore key="kns.default"
2018-07-09 05:46:08.441 [INFO][1] namespace_controller.go 246: Update Profile in Calico datastore with resource version key="kns.default"
2018-07-09 05:46:08.442 [INFO][1] namespace_controller.go 252: Successfully updated profile key="kns.default"
2018-07-09 05:46:08.442 [INFO][1] namespace_controller.go 223: Create/Update Profile in Calico datastore key="kns.kube-public"
2018-07-09 05:46:08.446 [INFO][1] namespace_controller.go 246: Update Profile in Calico datastore with resource version key="kns.kube-public"
2018-07-09 05:46:08.447 [INFO][1] namespace_controller.go 252: Successfully updated profile key="kns.kube-public"
2018-07-09 05:46:08.447 [INFO][1] namespace_controller.go 223: Create/Update Profile in Calico datastore key="kns.kube-system"
2018-07-09 05:46:08.465 [INFO][1] namespace_controller.go 246: Update Profile in Calico datastore with resource version key="kns.kube-system"
2018-07-09 05:46:08.476 [INFO][1] namespace_controller.go 252: Successfully updated profile key="kns.kube-system"
Firstly, as of ICP 2.1.0.3, the insecure port 8888 for the K8S apiserver is disabled, so you cannot use this insecure port to talk to Kubernetes.
For this issue, could you let me know the below information or outputs:
The network configurations in your environment:
-> ifconfig -a
The route table:
-> route
The contents of your /etc/hosts file:
-> cat /etc/hosts
The ICP installation configuration files:
-> config.yaml & hosts
Well, the issue seemed to be with the Docker storage driver (btrfs) that I was using. Once I switched to 'overlay', I was able to move forward.
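For reference, the driver switch can be made in /etc/docker/daemon.json, similar to the earlier answer (a sketch; overlay2 is the current driver name, and note that existing images become inaccessible when the driver changes):
{
  "storage-driver": "overlay2"
}
... followed by sudo systemctl restart docker.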
I had the same experience, at least the same high-level error message:
"FAILED - RETRYING: Waiting for kube-dns to start".
I had to do 2 things to pass this installation step:
- change the hostname (and the entry in my /etc/hosts) to lowercase. Calico doesn't like uppercase.
- comment out the localhost entry in /etc/hosts (#127.0.0.1 localhost.localdomain localhost)
After doing that, the installation completed fine.
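If it helps, on systemd-based distros the lowercase rename can be done with hostnamectl (the hostname below is a placeholder):
sudo hostnamectl set-hostname dev-master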

Kubelet Not Starting: Crashing with Exit Status: 255/n/a

Ubuntu 16.04 LTS, Docker 17.12.1, Kubernetes 1.10.0
Kubelet not starting:
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
Note: No issue with v1.9.1
LOGS:
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.518085 20051 docker_service.go:249] Docker Info: &{ID:WDJK:3BCI:BGCM:VNF3:SXGW:XO5G:KJ3Z:EKIH:XGP7:XJGG:LFBL:YWAJ Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:1 Driver:btrfs DriverStatus:[[Build Version Btrfs v4.15.1] [Library Vers
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.521232 20051 docker_service.go:262] Setting cgroupDriver to cgroupfs
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.532834 20051 remote_runtime.go:43] Connecting to runtime service unix:///var/run/dockershim.sock
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.533812 20051 kuberuntime_manager.go:186] Container runtime docker initialized, version: 18.05.0-ce, apiVersion: 1.37.0
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.534071 20051 csi_plugin.go:61] kubernetes.io/csi: plugin initializing...
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.534846 20051 kubelet.go:903] Accelerators feature is deprecated and will be removed in v1.11. Please use device plugins instead. They can be enabled using the DevicePlugins feature gate.
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.535035 20051 kubelet.go:909] GPU manager init error: couldn't get a handle to the library: unable to open a handle to the library, GPU feature is disabled.
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535082 20051 server.go:129] Starting to listen on 0.0.0.0:10250
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.535164 20051 kubelet.go:1282] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data for container /
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535189 20051 server.go:944] Started kubelet
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.535555 20051 event.go:209] Unable to write event: 'Post https://10.50.50.201:8001/api/v1/namespaces/default/events: dial tcp 10.50.50.201:8001: getsockopt: connection refused' (may retry after sleeping)
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.535825 20051 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536202 20051 status_manager.go:140] Starting to sync pod status with apiserver
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536253 20051 kubelet.go:1782] Starting kubelet main sync loop.
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536285 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536464 20051 volume_manager.go:247] Starting Kubelet Volume Manager
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.536613 20051 desired_state_of_world_populator.go:129] Desired state populator starts to run
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.538574 20051 server.go:299] Adding debug handlers to kubelet server.
Jun 22 06:45:55 dev-master hyperkube[20051]: W0622 06:45:55.538664 20051 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.539199 20051 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.636465 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.636795 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.638630 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.638954 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.836686 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.839219 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:55 dev-master hyperkube[20051]: I0622 06:45:55.841028 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:55 dev-master hyperkube[20051]: E0622 06:45:55.841357 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.236826 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.241590 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:56 dev-master hyperkube[20051]: I0622 06:45:56.245081 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.245475 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.492206 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://10.50.50.201:8001/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.493216 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.50.50.201:8001/api/v1/pods?fieldSelector=spec.nodeName%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: co
Jun 22 06:45:56 dev-master hyperkube[20051]: E0622 06:45:56.494240 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://10.50.50.201:8001/api/v1/nodes?fieldSelector=metadata.name%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connecti
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.036893 20051 kubelet.go:1799] skipping pod synchronization - [container runtime is down]
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.045705 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.047489 20051 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.047787 20051 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: Post https://10.50.50.201:8001/api/v1/nodes: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.413319 20051 event.go:209] Unable to write event: 'Post https://10.50.50.201:8001/api/v1/namespaces/default/events: dial tcp 10.50.50.201:8001: getsockopt: connection refused' (may retry after sleeping)
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.492781 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://10.50.50.201:8001/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connection refused
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.493560 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.50.50.201:8001/api/v1/pods?fieldSelector=spec.nodeName%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: co
Jun 22 06:45:57 dev-master hyperkube[20051]: E0622 06:45:57.494574 20051 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://10.50.50.201:8001/api/v1/nodes?fieldSelector=metadata.name%3D10.50.50.201&limit=500&resourceVersion=0: dial tcp 10.50.50.201:8001: getsockopt: connecti
Jun 22 06:45:57 dev-master hyperkube[20051]: W0622 06:45:57.549477 20051 manager.go:340] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.659932 20051 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661447 20051 cpu_manager.go:155] [cpumanager] starting with none policy
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661459 20051 cpu_manager.go:156] [cpumanager] reconciling every 10s
Jun 22 06:45:57 dev-master hyperkube[20051]: I0622 06:45:57.661468 20051 policy_none.go:42] [cpumanager] none policy: Start
Jun 22 06:45:57 dev-master hyperkube[20051]: W0622 06:45:57.661523 20051 fs.go:539] stat failed on /dev/loop10 with error: no such file or directory
Jun 22 06:45:57 dev-master hyperkube[20051]: F0622 06:45:57.661535 20051 kubelet.go:1359] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 126 in cached partitions map
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jun 22 06:45:57 dev-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
Run the following command on all your nodes. It worked for me.
swapoff -a
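To keep swap off across reboots, you would typically also comment out the swap entry in /etc/fstab (a companion step, not part of the original answer):
sudo sed -i '/ swap / s/^/#/' /etc/fstab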
I have found a lot of the same error messages in your logs:
dial tcp 10.50.50.201:8001: getsockopt: connection refused
There may be several problems:
- the IP address and/or port are incorrect
- no access from the worker to the master
- something is wrong with your master, for example, kube-apiserver is down
You should look in that direction.
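A quick probe from the affected node, using the address from the logs, can narrow this down (curl here is a generic connectivity check I'm suggesting, not something from the thread):
curl -k https://10.50.50.201:8001/healthz
If the connection is refused, the apiserver is not listening on that address.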
user1188867's answer is definitely correct.
I want to add a piece of information for further reference for those not using Ubuntu.
I hit this on a cluster with Clear Linux on bare metal. I'm attaching instructions on how to detect the issue in such an environment and disable swap to solve it.
First, launching sudo systemctl status kubelet after reboot produces the following:
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─0-cni.conf, 0-containerd.conf, 10-cgroup-driver.conf
Active: activating (auto-restart) (Result: exit-code) since Thu 2020-12-17 11:04:37 CET; 2s ago
Docs: http://kubernetes.io/docs/
Process: 2404 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, st>
Process: 2405 ExecStartPost=/usr/bin/kubelet-version-check.sh store (code=exited, status=0/SUCCESS)
Main PID: 2404 (code=exited, status=255/EXCEPTION)
CPU: 683ms
The issue was actually the existence of the swap file.
To disable it:
- add the nofail option to /usr/lib/systemd/system/var-swapfile.swap. I personally used the following one-liner: sudo sed -i s/Options=/Options=nofail,/ /usr/lib/systemd/system/var-swapfile.swap
- disable swapping: sudo swapoff -a
- delete the swap file: sudo rm /var/swapfile
This procedure on Clear Linux persists the deactivation of the swap on reboots.
I had the same exit status, but my kubelet failed to start due to a limitation on the number of max_user_watches. The following got the kubelet working again:
https://github.com/google/cadvisor/issues/1581#issuecomment-367616070
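The linked comment boils down to raising the inotify watch limit; a sketch of the usual sysctl change (the exact value is a common choice, not taken from this thread):
sudo sysctl fs.inotify.max_user_watches=524288
echo 'fs.inotify.max_user_watches=524288' | sudo tee -a /etc/sysctl.conf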
This problem can also arise if the service is not enabled after the Docker installation. After the next reboot, Docker is not started and kubelet cannot start either. So don't forget after the Docker installation:
sudo systemctl enable docker
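(On current systemd versions, sudo systemctl enable --now docker enables and starts the service in one step.)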

kubernetes pod status always "pending"

I am following the Fedora getting started guide (https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/getting-started-guides/fedora/fedora_ansible_config.md) and trying to run the pod fedoraapache. But kubectl always shows fedoraapache as pending:
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS
fedoraapache fedoraapache fedora/apache 192.168.226.144/192.168.226.144 name=fedoraapache Pending
Since it is pending, I cannot run kubectl log pod fedoraapache. So I instead run kubectl describe pod fedoraapache, which shows the following errors:
Fri, 20 Mar 2015 22:00:05 +0800 Fri, 20 Mar 2015 22:00:05 +0800 1 {kubelet 192.168.226.144} implicitly required container POD created Created with docker id d4877bdffd4f2a13a17d4cc93c27c1c93d5494807b39ee8a823f5d9350e404d4
Fri, 20 Mar 2015 22:00:05 +0800 Fri, 20 Mar 2015 22:00:05 +0800 1 {kubelet 192.168.226.144} failedSync Error syncing pod, skipping: API error (500): Cannot start container d4877bdffd4f2a13a17d4cc93c27c1c93d5494807b39ee8a823f5d9350e404d4: (exit status 1)
Fri, 20 Mar 2015 22:00:15 +0800 Fri, 20 Mar 2015 22:00:15 +0800 1 {kubelet 192.168.226.144} implicitly required container POD created Created with docker id 1c32b4c6e1aad0e575f6a155aebefcd5dd96857b12c47a63bfd8562fba961747
Fri, 20 Mar 2015 22:00:15 +0800 Fri, 20 Mar 2015 22:00:15 +0800 1 {kubelet 192.168.226.144} implicitly required container POD failed Failed to start with docker id 1c32b4c6e1aad0e575f6a155aebefcd5dd96857b12c47a63bfd8562fba961747 with error: API error (500): Cannot start container 1c32b4c6e1aad0e575f6a155aebefcd5dd96857b12c47a63bfd8562fba961747: (exit status 1)
Fri, 20 Mar 2015 22:00:15 +0800 Fri, 20 Mar 2015 22:00:15 +0800 1 {kubelet 192.168.226.144} failedSync Error syncing pod, skipping: API error (500): Cannot start container 1c32b4c6e1aad0e575f6a155aebefcd5dd96857b12c47a63bfd8562fba961747: (exit status 1)
Fri, 20 Mar 2015 22:00:25 +0800 Fri, 20 Mar 2015 22:00:25 +0800 1 {kubelet 192.168.226.144} failedSync Error syncing pod, skipping: API error (500): Cannot start container 8b117ee5c6bf13f0e97b895c367ce903e2a9efbd046a663c419c389d9953c55e: (exit status 1)
Fri, 20 Mar 2015 22:00:25 +0800 Fri, 20 Mar 2015 22:00:25 +0800 1 {kubelet 192.168.226.144} implicitly required container POD created Created with docker id 8b117ee5c6bf13f0e97b895c367ce903e2a9efbd046a663c419c389d9953c55e
Fri, 20 Mar 2015 22:00:25 +0800 Fri, 20 Mar 2015 22:00:25 +0800 1 {kubelet 192.168.226.144} implicitly required container POD failed Failed to start with docker id 8b117ee5c6bf13f0e97b895c367ce903e2a9efbd046a663c419c389d9953c55e with error: API error (500): Cannot start container 8b117ee5c6bf13f0e97b895c367ce903e2a9efbd046a663c419c389d9953c55e: (exit status 1)
Fri, 20 Mar 2015 22:00:35 +0800 Fri, 20 Mar 2015 22:00:35 +0800 1 {kubelet 192.168.226.144} implicitly required container POD failed Failed to start with docker id 4b463040842b6a45db2ab154652fd2a27550dbd2e1a897c98473cd0b66d2d614 with error: API error (500): Cannot start container 4b463040842b6a45db2ab154652fd2a27550dbd2e1a897c98473cd0b66d2d614: (exit status 1)
Fri, 20 Mar 2015 22:00:35 +0800 Fri, 20 Mar 2015 22:00:35 +0800 1 {kubelet 192.168.226.144} implicitly required container POD created Created with docker id 4b463040842b6a45db2ab154652fd2a27550dbd2e1a897c98473cd0b66d2d614
Fri, 20 Mar 2015 21:42:35 +0800 Fri, 20 Mar 2015 22:00:35 +0800 109 {kubelet 192.168.226.144} implicitly required container POD pulled Successfully pulled image "kubernetes/pause:latest"
Fri, 20 Mar 2015 22:00:35 +0800 Fri, 20 Mar 2015 22:00:35 +0800 1 {kubelet 192.168.226.144} failedSync Error syncing pod, skipping: API error (500): Cannot start container 4b463040842b6a45db2ab154652fd2a27550dbd2e1a897c98473cd0b66d2d614: (exit status 1)
There are several reasons a container can fail to start:
- The container command itself fails and exits -> check your docker image and startup script to make sure it works. Use sudo docker ps -a to find the offending container and sudo docker logs <container> to check for a failure inside the container.
- A dependency is not there: that happens, for example, when one tries to mount a volume that is not present, such as Secrets that are not created yet. -> make sure the dependent volumes are created.
Kubelet is unable to start the container we use for holding the network namespace. Some things to try:
- Can you manually pull and run gcr.io/google_containers/pause:0.8.0? (This is the image used for the network namespace container at head right now.)
- As mentioned already, /var/log/kubelet.log should have more detail; but the log location is distro-dependent, so check https://github.com/GoogleCloudPlatform/kubernetes/wiki/Debugging-FAQ#checking-logs.
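That manual pull/run check would look like this (image tag taken from the answer above):
sudo docker pull gcr.io/google_containers/pause:0.8.0
sudo docker run gcr.io/google_containers/pause:0.8.0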
Step one is to describe the pod and see the problems:
$ kubectl describe pod <pod name>
Or, if you are using only a master node, you can configure it to run pods with:
$ kubectl taint nodes --all node-role.kubernetes.io/master-
Check if the kubelet is running on your machine. I came across this problem one time and discovered that the kubelet was not running, which explains why the pod status was stuck in "pending". The kubelet runs as a systemd service in my environment, so if that is also the case for you, the following command will help you check its status:
systemctl status kubelet
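If it turns out to be stopped, sudo systemctl start kubelet brings it up, and journalctl -u kubelet -f lets you watch its logs while it registers the node (generic systemd usage, added for convenience).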