kube-proxy failing to start with "error: unrecognized key:"

I just upgraded my 1.10.0 Kubernetes cluster to 1.10.12.
I also updated a node or two to the same version.
However, I now see this:
kube-proxy-r5ts5 0/1 CrashLoopBackOff 5 3m 134.79.129.110 gpu03
Showing the logs gives:
# kubectl -n kube-system logs -f kube-proxy-r5ts5
error: unrecognized key:
Help? I don't know how to troubleshoot this further.
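(For troubleshooting: kube-proxy here is started with --config=/var/lib/kube-proxy/config.conf, which is mounted from the kube-proxy ConfigMap, as the describe output below shows. An "unrecognized key" error usually means that ConfigMap contains a key the v1.10.12 KubeProxyConfiguration schema doesn't accept. A minimal check, assuming a kubeadm-managed cluster:
# kubectl -n kube-system get configmap kube-proxy -o yaml
Inspect the config.conf key for any field that isn't a valid KubeProxyConfiguration setting; after fixing it with kubectl -n kube-system edit configmap kube-proxy, delete the kube-proxy pods so the DaemonSet recreates them with the corrected config.)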
Coincidentally, I added a new node at the same time, and I see that Weave also has problems starting:
# kubectl -n kube-system logs -f weave-net-mb299 weave
FATA: 2018/12/20 01:43:35.703088 [kube-peers] Could not get peers: Get https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: i/o timeout
Failed to get peers
# kubectl -n kube-system logs -f weave-net-mb299 weave-npc
ERROR: logging before flag.Parse: E1220 01:44:02.447197 28249 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:230: Failed to list *v1.NetworkPolicy: Get https://10.96.0.1:443/apis/networking.k8s.io/v1/networkpolicies?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I guess this is because kube-proxy isn't up.
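That guess can be checked: weave reaches the API server through the kubernetes service VIP (10.96.0.1), and that VIP only works once kube-proxy has programmed the service NAT rules. A quick, hedged check on the node:
# iptables-save -t nat | grep 10.96.0.1
No output would mean kube-proxy never wrote the service rules, which matches the i/o timeout above.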
# kubectl -n kube-system describe pods kube-proxy-r5ts5
Name: kube-proxy-r5ts5
Namespace: kube-system
Node: gpu02/134.79.129.96
Start Time: Thu, 20 Dec 2018 02:01:10 +0000
Labels: controller-revision-hash=3231443654
k8s-app=kube-proxy
pod-template-generation=4
Annotations: <none>
Status: Running
IP: 134.79.129.96
Controlled By: DaemonSet/kube-proxy
Containers:
kube-proxy:
Container ID: docker://1bcfca6db8f68d7130de86947343a24f9fc23b506ea295509933473f3d830845
Image: gcr.io/google_containers/kube-proxy-amd64:v1.10.12
Image ID: docker-pullable://gcr.io/google_containers/kube-proxy-amd64@sha256:a9ed73c3526033cd3cf732b4a84de9d211f425ef08cce4f0535617cadf0f4200
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/kube-proxy
--config=/var/lib/kube-proxy/config.conf
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 20 Dec 2018 02:04:00 +0000
Finished: Thu, 20 Dec 2018 02:04:00 +0000
Ready: False
Restart Count: 5
Environment: <none>
Mounts:
/lib/modules from lib-modules (ro)
/run/xtables.lock from xtables-lock (rw)
/var/lib/kube-proxy from kube-proxy (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-proxy-token-m4hvr (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
kube-proxy:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-proxy
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
kube-proxy-token-m4hvr:
Type: Secret (a volume populated by a Secret)
SecretName: kube-proxy-token-m4hvr
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulMountVolume 3m kubelet, gpu02 MountVolume.SetUp succeeded for volume "xtables-lock"
Normal SuccessfulMountVolume 3m kubelet, gpu02 MountVolume.SetUp succeeded for volume "lib-modules"
Normal SuccessfulMountVolume 3m kubelet, gpu02 MountVolume.SetUp succeeded for volume "kube-proxy"
Normal SuccessfulMountVolume 3m kubelet, gpu02 MountVolume.SetUp succeeded for volume "kube-proxy-token-m4hvr"
Normal Started 2m (x4 over 3m) kubelet, gpu02 Started container
Warning BackOff 2m (x7 over 3m) kubelet, gpu02 Back-off restarting failed container
Normal Pulled 2m (x5 over 3m) kubelet, gpu02 Container image "gcr.io/google_containers/kube-proxy-amd64:v1.10.12" already present on machine
Normal Created 2m (x5 over 3m) kubelet, gpu02 Created container
Probably not related, but I did have problems with cri-tools: kubeadm join said it couldn't find dockershim.sock, so I did an rpm -e --nodeps cri-tools and that appeared to fix the join. I'm pretty sure the Docker subsystem is working, as I can see other Kubernetes pods on the machine (e.g. k8s_POD_weave-net-mb299_kube-system, k8s_weave-npc_weave-net-mb299_kube-system).
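Since the nodes were upgraded piecemeal, it may also be worth confirming which version each component actually runs; these are read-only checks:
# kubectl get nodes -o wide
# kubectl version --short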
A snapshot of the logs from one of the minions:
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.459850 10526 cni.go:227] Error while adding to cni network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/e7ba3feb145f2004ac730c96eb6e1f7c91ad30d70515984de37d325b98abb616: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.637709 10526 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "hub-85c95bbd57-bx4sr_jupyter-prod" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/de1f07ee792f8d2e666efffdf756774ebab0558e279e6f0e8375d520ca7cb63e: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.637826 10526 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "hub-85c95bbd57-bx4sr_jupyter-prod(bd2287cb-0475-11e9-90de-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "hub-85c95bbd57-bx4sr_jupyter-prod" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/de1f07ee792f8d2e666efffdf756774ebab0558e279e6f0e8375d520ca7cb63e: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.637852 10526 kuberuntime_manager.go:646] createPodSandbox for pod "hub-85c95bbd57-bx4sr_jupyter-prod(bd2287cb-0475-11e9-90de-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "hub-85c95bbd57-bx4sr_jupyter-prod" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/de1f07ee792f8d2e666efffdf756774ebab0558e279e6f0e8375d520ca7cb63e: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.637947 10526 pod_workers.go:186] Error syncing pod bd2287cb-0475-11e9-90de-fa163e21c438 ("hub-85c95bbd57-bx4sr_jupyter-prod(bd2287cb-0475-11e9-90de-fa163e21c438)"), skipping: failed to "CreatePodSandbox" for "hub-85c95bbd57-bx4sr_jupyter-prod(bd2287cb-0475-11e9-90de-fa163e21c438)" with CreatePodSandboxError: "CreatePodSandbox for pod \"hub-85c95bbd57-bx4sr_jupyter-prod(bd2287cb-0475-11e9-90de-fa163e21c438)\" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod \"hub-85c95bbd57-bx4sr_jupyter-prod\" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/de1f07ee792f8d2e666efffdf756774ebab0558e279e6f0e8375d520ca7cb63e: dial tcp 127.0.0.1:6784: getsockopt: connection refused"
Dec 20 08:41:19 gpu01 kubelet[10526]: W1220 08:41:19.661793 10526 container.go:507] Failed to update stats for container "/libcontainer_14802_systemd_test_default.slice": read /sys/fs/cgroup/cpu,cpuacct/libcontainer_14802_systemd_test_default.slice/cpuacct.usage: no such device, continuing to push stats
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.745423 10526 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nvidia-device-plugin-daemonset-ljmv9_kube-system" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/e7ba3feb145f2004ac730c96eb6e1f7c91ad30d70515984de37d325b98abb616: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.745492 10526 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nvidia-device-plugin-daemonset-ljmv9_kube-system" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/e7ba3feb145f2004ac730c96eb6e1f7c91ad30d70515984de37d325b98abb616: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.745526 10526 kuberuntime_manager.go:646] createPodSandbox for pod "nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nvidia-device-plugin-daemonset-ljmv9_kube-system" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/e7ba3feb145f2004ac730c96eb6e1f7c91ad30d70515984de37d325b98abb616: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:19 gpu01 kubelet[10526]: E1220 08:41:19.745640 10526 pod_workers.go:186] Error syncing pod ad93d43c-f986-11e8-a0db-fa163e21c438 ("nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)"), skipping: failed to "CreatePodSandbox" for "nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)" with CreatePodSandboxError: "CreatePodSandbox for pod \"nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)\" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod \"nvidia-device-plugin-daemonset-ljmv9_kube-system\" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/e7ba3feb145f2004ac730c96eb6e1f7c91ad30d70515984de37d325b98abb616: dial tcp 127.0.0.1:6784: getsockopt: connection refused"
Dec 20 08:41:19 gpu01 kubelet[10526]: W1220 08:41:19.858313 10526 pod_container_deletor.go:77] Container "e7ba3feb145f2004ac730c96eb6e1f7c91ad30d70515984de37d325b98abb616" not found in pod's containers
Dec 20 08:41:19 gpu01 kubelet[10526]: W1220 08:41:19.934213 10526 pod_container_deletor.go:77] Container "de1f07ee792f8d2e666efffdf756774ebab0558e279e6f0e8375d520ca7cb63e" not found in pod's containers
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.696842 10526 cni.go:259] Error adding network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/264521a208ca5f0a3081b5b40e6f0176624c44ee40d0b02e31e4f148194faa78: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.696892 10526 cni.go:227] Error while adding to cni network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/264521a208ca5f0a3081b5b40e6f0176624c44ee40d0b02e31e4f148194faa78: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: W1220 08:41:20.697306 10526 container.go:393] Failed to create summary reader for "/libcontainer_14936_systemd_test_default.slice": none of the resources are being tracked.
Dec 20 08:41:20 gpu01 kubelet[10526]: W1220 08:41:20.697520 10526 container.go:393] Failed to create summary reader for "/libcontainer_14941_systemd_test_default.slice": none of the resources are being tracked.
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.708833 10526 cni.go:259] Error adding network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/296ffa649c2fdb61d7b0e10aa9e0051fbcb2931a0f12dc471820a0b58ad4fc4a: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.708860 10526 cni.go:227] Error while adding to cni network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/296ffa649c2fdb61d7b0e10aa9e0051fbcb2931a0f12dc471820a0b58ad4fc4a: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.860952 10526 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nvidia-device-plugin-daemonset-ljmv9_kube-system" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/264521a208ca5f0a3081b5b40e6f0176624c44ee40d0b02e31e4f148194faa78: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.861039 10526 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nvidia-device-plugin-daemonset-ljmv9_kube-system" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/264521a208ca5f0a3081b5b40e6f0176624c44ee40d0b02e31e4f148194faa78: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.861067 10526 kuberuntime_manager.go:646] createPodSandbox for pod "nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nvidia-device-plugin-daemonset-ljmv9_kube-system" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/264521a208ca5f0a3081b5b40e6f0176624c44ee40d0b02e31e4f148194faa78: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.861167 10526 pod_workers.go:186] Error syncing pod ad93d43c-f986-11e8-a0db-fa163e21c438 ("nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)"), skipping: failed to "CreatePodSandbox" for "nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)" with CreatePodSandboxError: "CreatePodSandbox for pod \"nvidia-device-plugin-daemonset-ljmv9_kube-system(ad93d43c-f986-11e8-a0db-fa163e21c438)\" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod \"nvidia-device-plugin-daemonset-ljmv9_kube-system\" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/264521a208ca5f0a3081b5b40e6f0176624c44ee40d0b02e31e4f148194faa78: dial tcp 127.0.0.1:6784: getsockopt: connection refused"
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.954796 10526 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "hub-85c95bbd57-bx4sr_jupyter-prod" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/296ffa649c2fdb61d7b0e10aa9e0051fbcb2931a0f12dc471820a0b58ad4fc4a: dial tcp 127.0.0.1:6784: getsockopt: connection refused
Dec 20 08:41:20 gpu01 kubelet[10526]: E1220 08:41:20.954851 10526 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "hub-85c95bbd57-bx4sr_jupyter-prod(bd2287cb-0475-11e9-90de-fa163e21c438)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "hub-85c95bbd57-bx4sr_jupyter-prod" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/296ffa649c2fdb61d7b0e10aa9e0051fbcb2931a0f12dc471820a0b58ad4fc4a: dial tcp 127.0.0.1:6784: getsockopt: connection refused
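(Those kubelet errors are the CNI plugin failing to reach Weave's IPAM API on 127.0.0.1:6784, which is served by the weave router container, so they are expected while weave-net is down. Assuming Weave's default HTTP API port, this can be confirmed from the node:
# curl -s http://127.0.0.1:6784/status
Connection refused here just re-confirms the router isn't running; once kube-proxy and weave recover, it returns the router status.)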

Related

coredns connection refused error while setting up kubernetes cluster

I've got a Kubernetes cluster set up with kubeadm. I haven't deployed any pods yet, but the coredns pods are stuck in ContainerCreating status.
[root@master-node ~]# kubectl get -A pods
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-64897985d-f5kjh 0/1 ContainerCreating 0 151m
kube-system coredns-64897985d-xz9nt 0/1 ContainerCreating 0 151m
[...]
When I check it out with kubectl describe I see this:
[root@master-node ~]# kubectl describe -n kube-system pod coredns-64897985d-f5kjh
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 22m (x570 over 145m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4974dadd11fecf1ebfbcccd75701641b752426808889895672f34e6934776207": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/4974dadd11fecf1ebfbcccd75701641b752426808889895672f34e6934776207": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "bce2558b24468c0d0e83fe1eedf2fa70108420a466d000b74ceaf351e595007d": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/bce2558b24468c0d0e83fe1eedf2fa70108420a466d000b74ceaf351e595007d": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e53e79bc3642c9a0c2b240dc174931af9f5dddf7d5b7df50382fcb3fea351df9": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/e53e79bc3642c9a0c2b240dc174931af9f5dddf7d5b7df50382fcb3fea351df9": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b6da6e72057c3b48ac6ced3ba6b81917111e94c20216b65126a2733462139ed1": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/b6da6e72057c3b48ac6ced3ba6b81917111e94c20216b65126a2733462139ed1": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "09416534b75ef7beea279f9389eb1a732b6a288c3b170a489e04cce01c294fa2": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/09416534b75ef7beea279f9389eb1a732b6a288c3b170a489e04cce01c294fa2": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "411fe06179ab24a3999b1c034bc99452d99249bbb6cb966b496f7a8b467e1806": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/411fe06179ab24a3999b1c034bc99452d99249bbb6cb966b496f7a8b467e1806": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e0fc2a5d4852cd31eca4b473f614cadcb9235a2a325c01b469110bfd6bbf9a3b": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/e0fc2a5d4852cd31eca4b473f614cadcb9235a2a325c01b469110bfd6bbf9a3b": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4528997239e55f7ef546c0af9cc7c12cf5fe4942a370ed2a772ba7fc405773d2": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/4528997239e55f7ef546c0af9cc7c12cf5fe4942a370ed2a772ba7fc405773d2": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b534273b4fe3b893cdeac05555e47429bc7578c1e0c0095481fe155637f0c4ae": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/b534273b4fe3b893cdeac05555e47429bc7578c1e0c0095481fe155637f0c4ae": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "afc479a4bfa16cef4367ecfee74333dfa9bbf12c59995446792f22c8e39ca16d": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/afc479a4bfa16cef4367ecfee74333dfa9bbf12c59995446792f22c8e39ca16d": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 3m50s (x61 over 16m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a9254528ba611403a9b2293a2201c8758ff4adf75fd4a1d2b9690d15446cc92a": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/a9254528ba611403a9b2293a2201c8758ff4adf75fd4a1d2b9690d15446cc92a": dial tcp 127.0.0.1:6784: connect: connection refused
Any idea what could be causing this?
Turns out this is a firewall issue. I was using Weave Net as my CNI, which requires port 6784 to be open to work. You can see this in the error, where it's trying to access 127.0.0.1:6784 and getting connection refused (pretty obvious in hindsight). I fixed it by opening port 6784 in my firewall. For firewalld, I did:
firewall-cmd --permanent --add-port=6784/tcp
firewall-cmd --reload
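To confirm the rule took effect and that the pods recover, something like:
firewall-cmd --list-ports
kubectl -n kube-system get pods -w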
This might be a security problem: the Weave Net docs say something about this port needing to be restricted to certain processes, though I'm not sure of the details. Security isn't a big concern for my application, so I didn't look into it further.

Failed to create pod sandbox [flannel]

I am running into this error on random pods. Thank you @matthew-l-daniel for the comment, as I didn't know where to start.
Here is the contents of /opt/cni/bin on the node
:/opt/cni/bin$ ls
bridge host-local loopback
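For what it's worth, on a flannel cluster /opt/cni/bin would typically also contain the flannel plugin binary (installed by the kube-flannel DaemonSet), and the CNI config under /etc/cni/net.d shows which plugins the kubelet will actually invoke; a hedged check:
$ cat /etc/cni/net.d/*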
Here are the kubelet logs for a container that failed.
Jan 30 15:42:00 ip-172-20-39-216 kubelet[32233]: E0130 15:42:00.924370 32233 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "postgres-core-0_service-master-459cf23(d8acae2f-24a2-11e9-b79c-0a0d1213cce2)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "postgres-core-0": Error response from daemon: grpc: the connection is unavailable
Jan 30 15:42:00 ip-172-20-39-216 kubelet[32233]: E0130 15:42:00.924380 32233 kuberuntime_manager.go:647] createPodSandbox for pod "postgres-core-0_service-master-459cf23(d8acae2f-24a2-11e9-b79c-0a0d1213cce2)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "postgres-core-0": Error response from daemon: grpc: the connection is unavailable
Jan 30 15:42:00 ip-172-20-39-216 kubelet[32233]: E0130 15:42:00.924427 32233 pod_workers.go:186] Error syncing pod d8acae2f-24a2-11e9-b79c-0a0d1213cce2 ("postgres-core-0_service-master-459cf23(d8acae2f-24a2-11e9-b79c-0a0d1213cce2)"), skipping: failed to "CreatePodSandbox" for "postgres-core-0_service-master-459cf23(d8acae2f-24a2-11e9-b79c-0a0d1213cce2)" with CreatePodSandboxError: "CreatePodSandbox for pod \"postgres-core-0_service-master-459cf23(d8acae2f-24a2-11e9-b79c-0a0d1213cce2)\" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod \"postgres-core-0\": Error response from daemon: grpc: the connection is unavailable"
As for the flannel container logs: there are many flannel pods running, and all are healthy.
Kubernetes v1.10.11
Docker version 17.03.2-ce, build f5ec1e2
Flannel logs
E0130 15:34:16.536354 1 vxlan_network.go:187] DelFDB failed: no such file or directory
E0130 15:34:16.536411 1 vxlan_network.go:191] failed to delete vxlanRoute (100.107.178.0/24 -> 100.107.178.0): no such process
E0130 17:33:44.848163 1 vxlan_network.go:187] DelFDB failed: no such file or directory
E0130 17:33:44.848219 1 vxlan_network.go:191] failed to delete vxlanRoute (100.107.201.0/24 -> 100.107.201.0): no such process
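Note that the sandbox failure itself comes from the Docker daemon ("Error response from daemon: grpc: the connection is unavailable"), which points at dockerd's connection to its embedded containerd rather than at flannel. Checking the daemon on the affected node seems like the next step:
$ systemctl status docker
$ journalctl -u docker --since "1 hour ago" | tail -n 50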

Why does the Kubernetes v1.13 API randomly go down?

cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
Clean install of Kubernetes using kubeadm init, following the steps directly in the docs. Tried with Flannel, Weave Net, and Calico.
After about 5-10 minutes of running watch kubectl get nodes, I'll get these messages at random, leaving me with an inaccessible cluster that I can't apply any .yml files to:
Unable to connect to the server: net/http: TLS handshake timeout
The connection to the server 66.70.180.162:6443 was refused - did you specify the right host or port?
Unable to connect to the server: http2: server sent GOAWAY and closed the connection; LastStreamID=1, ErrCode=NO_ERROR, debug=""
kubelet is fine, aside from showing that it can't get random resources from 66.70.180.162 (the master node):
[root@play ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2018-12-10 13:57:17 EST; 21min ago
Docs: https://kubernetes.io/docs/
Main PID: 3411939 (kubelet)
CGroup: /system.slice/kubelet.service
└─3411939 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-...
Dec 10 14:18:53 play kubelet[3411939]: E1210 14:18:53.811213 3411939 reflector.go:134] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/c...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.011239 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://66.70.180.162:6443/api/v1/...nection refused
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.211160 3411939 reflector.go:134] object-"kube-system"/"kube-proxy-token-n5qjm": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.411190 3411939 reflector.go:134] object-"kube-system"/"coredns-token-7qjzv": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube-sy...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.611103 3411939 reflector.go:134] object-"kube-system"/"coredns": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/conf...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.811105 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://66.70.180.162:6443/api/v1/nod...nection refused
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.011204 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://66.70.180.162:6443/api...nection refused
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.211132 3411939 reflector.go:134] object-"kube-system"/"weave-net-token-5zb86": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube-...
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.411281 3411939 reflector.go:134] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/c...
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.611125 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://66.70.180.162:6443/api/v1/...nection refused
Hint: Some lines were ellipsized, use -l to show in full.
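One cheap check while this is happening is whether the apiserver is listening at all; assuming the default port, from the master:
curl -k https://66.70.180.162:6443/healthz
If that is refused during an outage, the apiserver process itself is cycling rather than being blocked somewhere on the network path.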
The Docker container that runs CoreDNS shows issues getting resources from what looks like anything in the default Kubernetes service subnet CIDR range (this is a VPS on a separate hosting provider, using a local IP here):
.:53
2018-12-10T10:34:52.589Z [INFO] CoreDNS-1.2.6
2018-12-10T10:34:52.589Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
...
E1210 10:55:53.286644 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E1210 10:55:53.290019 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:318: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
The kube-apiserver is just showing random failures, and it looks like it's mixing in IPv6:
I1210 19:23:09.067462 1 trace.go:76] Trace[1029933921]: "Get /api/v1/nodes/play" (started: 2018-12-10 19:23:00.256692931 +0000 UTC m=+188.530973072) (total time: 8.810746081s):
Trace[1029933921]: [8.810746081s] [8.810715241s] END
E1210 19:23:09.068687 1 available_controller.go:316] v2beta1.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v2beta1.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.069678 1 available_controller.go:316] v1. failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1./status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.073019 1 available_controller.go:316] v1beta1.apiextensions.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.apiextensions.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.074112 1 available_controller.go:316] v1beta1.batch failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.batch/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.075151 1 available_controller.go:316] v2beta2.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v2beta2.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.077408 1 available_controller.go:316] v1.authorization.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.authorization.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.078457 1 available_controller.go:316] v1.networking.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.networking.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.079449 1 available_controller.go:316] v1beta1.coordination.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.coordination.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.080558 1 available_controller.go:316] v1.authentication.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.authentication.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.081628 1 available_controller.go:316] v1beta1.scheduling.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.scheduling.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.082803 1 available_controller.go:316] v1.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.083845 1 available_controller.go:316] v1beta1.events.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.events.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.084882 1 available_controller.go:316] v1beta1.storage.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.storage.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.085985 1 available_controller.go:316] v1.apps failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.apps/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.087019 1 available_controller.go:316] v1beta1.apps failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.apps/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.088113 1 available_controller.go:316] v1beta1.certificates.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.certificates.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.089164 1 available_controller.go:316] v1.storage.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.storage.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.090268 1 available_controller.go:316] v1beta1.authentication.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.authentication.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
W1210 19:23:28.996746 1 controller.go:181] StopReconciling() timed out
And I'm out of troubleshooting steps.
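For completeness, under kubeadm the apiserver runs as a static pod, so its crash logs can be pulled straight from the container runtime on the master (assuming Docker; <container-id> is whatever the first command reports):
docker ps -a --filter name=kube-apiserver
docker logs --tail 50 <container-id>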

network service in kubernetes worker nodes

I have installed a 3-server Kubernetes setup by following https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
I created the Calico network service on the master node. My question: should I create the Calico service on the worker nodes also?
I am getting the error below on a worker node when I create a pod:
ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"
Jul 11 13:25:05 ip-172-31-20-212 kubelet: I0711 13:25:05.144142 23325 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=calico-node pod=calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)
Jul 11 13:25:05 ip-172-31-20-212 kubelet: E0711 13:25:05.144169 23325 pod_workers.go:186] Error syncing pod e17770a3-8507-11e8-962c-0ac29e406ef0 ("calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=calico-node pod=calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.221953 23325 cni.go:280] Error deleting network: context deadline exceeded
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222595 23325 remote_runtime.go:115] StopPodSandbox "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-k2xsm_default" network: context deadline exceeded
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222630 23325 kuberuntime_manager.go:799] Failed to stop sandbox {"docker" "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326"}
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222664 23325 kuberuntime_manager.go:594] killPodWithSyncResult failed: failed to "KillPodSandbox" for "67e18616-850d-11e8-962c-0ac29e406ef0" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"test2-597bdc85dc-k2xsm_default\" network: context deadline exceeded"
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222685 23325 pod_workers.go:186] Error syncing pod 67e18616-850d-11e8-962c-0ac29e406ef0 ("test2-597bdc85dc-k2xsm_default(67e18616-850d-11e8-962c-0ac29e406ef0)"), skipping: failed to "KillPodSandbox" for "67e18616-850d-11e8-962c-0ac29e406ef0" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"test2-597bdc85dc-k2xsm_default\" network: context deadline exceeded"
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.007944 23325 cni.go:280] Error deleting network: context deadline exceeded
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.008783 23325 remote_runtime.go:115] StopPodSandbox "4b14d68c7bc892594dedd1f62d92414574a3fb00873a805b62707c7a63bfdfe7" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-qmc85_default" network: context deadline exceeded
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.008819 23325 kuberuntime_gc.go:153] Failed to stop sandbox "4b14d68c7bc892594dedd1f62d92414574a3fb00873a805b62707c7a63bfdfe7" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-qmc85_default" network: context deadline exceeded
Jul 11 13:25:19 ip-172-31-20-212 kubelet: W0711 13:25:19.145386 23325 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326"
I tried to install the Calico network on the worker nodes as well with the command below, but no luck; I'm getting this error:
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
Every node needs the calico-node service running, but you don't install it per node: applying the manifest once creates a DaemonSet, which schedules a calico-node pod on every node automatically. The connection refused on localhost:8080 above just means kubectl on the worker has no kubeconfig pointing at the master's API server; apply the manifest from the master instead.
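To confirm from the master (the only place kubectl is configured by default), something like:
kubectl -n kube-system get daemonset calico-node
kubectl -n kube-system get pods -o wide | grep calico
The DaemonSet's DESIRED/READY counts should match the node count, with one calico-node pod per node.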

Kubernetes Authentication issue

I started looking at different ways of using authentication on Kubernetes. Of course, I started with the simplest option, a static password file. Basically, I created a file named users.csv with the following content:
mauro,maurosil,maurosil123,group_mauro
When I start minikube using this file, it hangs at "starting cluster components". The command I use is:
minikube --extra-config=apiserver.Authentication.PasswordFile.BasicAuthFile=~/temp/users.csv start
After a while (~ 10 minutes), the minikube start command fails with the following error message:
E0523 10:23:57.391692 30932 util.go:151] Error uploading error message: : Post https://clouderrorreporting.googleapis.com/v1beta1/projects/k8s-minikube/events:report?key=AIzaSyACUwzG0dEPcl-eOgpDKnyKoUFgHdfoFuA: x509: certificate signed by unknown authority
I can see that there are several errors on the log (minikube logs):
ay 23 09:47:32 minikube kubelet[3301]: E0523 09:47:32.473157 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.414460 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://192.168.99.100:8443/api/v1/nodes?fieldSelector=metadata.name%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.470604 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://192.168.99.100:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.474548 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: I0523 09:47:34.086654 3301 kubelet_node_status.go:271] Setting node annotation to enable volume controller attach/detach
May 23 09:47:34 minikube kubelet[3301]: I0523 09:47:34.090697 3301 kubelet_node_status.go:82] Attempting to register node minikube
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.091108 3301 kubelet_node_status.go:106] Unable to register node "minikube" with API server: Post https://192.168.99.100:8443/api/v1/nodes: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.370484 3301 event.go:209] Unable to write event: 'Patch https://192.168.99.100:8443/api/v1/namespaces/default/events/minikube.15313c5b8cf5913c: dial tcp 192.168.99.100:8443: getsockopt: connection refused' (may retry after sleeping)
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.419833 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://192.168.99.100:8443/api/v1/nodes?fieldSelector=metadata.name%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.472826 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://192.168.99.100:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.479619 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
I also logged in the minikube VM (minikube ssh) and I noticed that the apiserver docker container is down. Looking at the logs of this container I see the following error:
error: unknown flag: --Authentication.PasswordFile.BasicAuthFile
Therefore, I changed my command to something like:
minikube start --extra-config=apiserver.basic-auth-file=~/temp/users.csv
It failed again, but now the container shows a different error. The error is no longer about an invalid flag; instead, it complains that the file is not found (no such file or directory). I also tried specifying a file on the minikube VM (/var/lib/localkube), but I had the same issue.
The minikube version is:
minikube version: v0.26.0
When I start minikube without the authentication flag, it works fine. Are there any other steps that I need to do?
Mauro
You will need to mount the file into the Docker container that runs the apiserver. Please see a hack that worked: https://github.com/kubernetes/minikube/issues/1898#issuecomment-402714802
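A minimal sketch of that approach, assuming the kubeadm bootstrapper, where /var/lib/minikube/certs is already mounted into the apiserver container (paths here are illustrative):
minikube ssh
sudo cp /tmp/users.csv /var/lib/minikube/certs/users.csv
exit
minikube start --bootstrapper=kubeadm --extra-config=apiserver.basic-auth-file=/var/lib/minikube/certs/users.csv
Note that the ~ in the original command would not expand to a path the apiserver container can see, so an absolute path inside the VM is needed either way.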