Error - getsockopt: connection refused - Kubernetes apiserver

I am facing an unexpected issue. Earlier today I noticed that "kubectl get pods" was returning "Unable to connect to the server: EOF". Upon further investigation I found that the kubelet is unable to connect to the apiserver at 127.0.0.1:443. I have been unable to resolve this problem; any assistance would be highly appreciated. Below are the logs I found.
Nov 20 18:13:30 ip-172-31-152-166.us-west-2.compute.internal kubelet[6398]: E1120 18:13:30.362106 6398 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:481: Failed to list *v1.Node: Get https://127.0.0.1/api/v1/nodes?fieldSelector=metadata.name%3Dip-172-31-152-166.us-west-2.compute.internal&limit=500&resourceVersion=0: dial tcp 127.0.0.1:443: getsockopt: connection refused
Nov 20 18:13:30 ip-172-31-152-166.us-west-2.compute.internal kubelet[6398]: E1120 18:13:30.362928 6398 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://127.0.0.1/api/v1/pods?fieldSelector=spec.nodeName%3Dip-172-31-152-166.us-west-2.compute.internal&limit=500&resourceVersion=0: dial tcp 127.0.0.1:443: getsockopt: connection refused
Nov 20 18:13:30 ip-172-31-152-166.us-west-2.compute.internal kubelet[6398]: E1120 18:13:30.363719 6398 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:472: Failed to list *v1.Service: Get https://127.0.0.1/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:443: getsockopt: connection refused

Related

Why does the Kubernetes v1.13 API randomly go down?

cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
Clean install of Kubernetes using kubeadm init, following the steps directly from the docs. Tried with Flannel, Weave Net, and Calico.
After about 5-10 minutes, while running a watch kubectl get nodes, I'll get these messages at random, leaving me with an inaccessible cluster that I can't apply any .yml files to.
Unable to connect to the server: net/http: TLS handshake timeout
The connection to the server 66.70.180.162:6443 was refused - did you specify the right host or port?
Unable to connect to the server: http2: server sent GOAWAY and closed the connection; LastStreamID=1, ErrCode=NO_ERROR, debug=""
The kubelet is fine, aside from showing that it can't get random resources from 66.70.180.162 (the master node):
[root@play ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2018-12-10 13:57:17 EST; 21min ago
Docs: https://kubernetes.io/docs/
Main PID: 3411939 (kubelet)
CGroup: /system.slice/kubelet.service
└─3411939 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-...
Dec 10 14:18:53 play kubelet[3411939]: E1210 14:18:53.811213 3411939 reflector.go:134] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/c...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.011239 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://66.70.180.162:6443/api/v1/...nection refused
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.211160 3411939 reflector.go:134] object-"kube-system"/"kube-proxy-token-n5qjm": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.411190 3411939 reflector.go:134] object-"kube-system"/"coredns-token-7qjzv": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube-sy...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.611103 3411939 reflector.go:134] object-"kube-system"/"coredns": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/conf...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.811105 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://66.70.180.162:6443/api/v1/nod...nection refused
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.011204 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://66.70.180.162:6443/api...nection refused
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.211132 3411939 reflector.go:134] object-"kube-system"/"weave-net-token-5zb86": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube-...
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.411281 3411939 reflector.go:134] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/c...
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.611125 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://66.70.180.162:6443/api/v1/...nection refused
Hint: Some lines were ellipsized, use -l to show in full.
A Docker container that runs CoreDNS shows issues getting resources from what looks like anything in the default Kubernetes Service subnet CIDR range (this is a VPS on a separate hosting provider, so a cluster-local IP is shown here):
.:53
2018-12-10T10:34:52.589Z [INFO] CoreDNS-1.2.6
2018-12-10T10:34:52.589Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
...
E1210 10:55:53.286644 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E1210 10:55:53.290019 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:318: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
The kube-apiserver is just showing random failures, and it looks like it's mixing in IPv6:
I1210 19:23:09.067462 1 trace.go:76] Trace[1029933921]: "Get /api/v1/nodes/play" (started: 2018-12-10 19:23:00.256692931 +0000 UTC m=+188.530973072) (total time: 8.810746081s):
Trace[1029933921]: [8.810746081s] [8.810715241s] END
E1210 19:23:09.068687 1 available_controller.go:316] v2beta1.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v2beta1.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.069678 1 available_controller.go:316] v1. failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1./status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.073019 1 available_controller.go:316] v1beta1.apiextensions.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.apiextensions.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.074112 1 available_controller.go:316] v1beta1.batch failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.batch/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.075151 1 available_controller.go:316] v2beta2.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v2beta2.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.077408 1 available_controller.go:316] v1.authorization.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.authorization.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.078457 1 available_controller.go:316] v1.networking.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.networking.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.079449 1 available_controller.go:316] v1beta1.coordination.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.coordination.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.080558 1 available_controller.go:316] v1.authentication.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.authentication.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.081628 1 available_controller.go:316] v1beta1.scheduling.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.scheduling.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.082803 1 available_controller.go:316] v1.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.083845 1 available_controller.go:316] v1beta1.events.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.events.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.084882 1 available_controller.go:316] v1beta1.storage.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.storage.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.085985 1 available_controller.go:316] v1.apps failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.apps/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.087019 1 available_controller.go:316] v1beta1.apps failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.apps/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.088113 1 available_controller.go:316] v1beta1.certificates.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.certificates.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.089164 1 available_controller.go:316] v1.storage.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.storage.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.090268 1 available_controller.go:316] v1beta1.authentication.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.authentication.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
W1210 19:23:28.996746 1 controller.go:181] StopReconciling() timed out
And I'm out of troubleshooting steps.

GKE very slow to start with node pools - cluster and k8s/gcloud API unavailable

We have a GKE cluster composed of 7 nodes and 9 microservices at the moment.
We also add 2 node pools with 2 nodes each by default.
We use Istio for load balancing between the microservices.
Our CI environment creates everything with a script.
The issue is that it takes minutes for the cluster to be available with the node pools.
My main question is: why is the API unavailable during this time?
There are also a lot of errors in the kube-system logs; here is a small excerpt:
k8s.io/dns/pkg/dns/dns.go:192: Failed to list *v1.Service: Get https://10.0.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
k8s.io/dns/pkg/dns/dns.go:189: Failed to list *v1.Endpoints: Get https://10.0.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
github.com/GoogleCloudPlatform/k8s-stackdriver/event-exporter/watchers/watcher.go:55: Failed to list *v1.Event: Get https://10.0.0.1:443/api/v1/events?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"ERROR: logging before flag.Parse: E1114 09:50:42.925080 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"
k8s.io/dns/pkg/dns/dns.go:189: Failed to list *v1.Endpoints: Get https://10.0.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
k8s.io/dns/pkg/dns/dns.go:192: Failed to list *v1.Service: Get https://10.0.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
"ERROR: logging before flag.Parse: E1114 09:50:42.873176 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"
k8s.io/heapster/metrics/heapster.go:331: Failed to list *v1.Pod: Get https://10.0.0.1:443/api/v1/pods?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/processors/namespace_based_enricher.go:90: Failed to list *v1.Namespace: Get https://10.0.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/util/util.go:32: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/util/util.go:32: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/util/util.go:32: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
Error while getting cluster status: Get https://10.0.0.1:443/api/v1/nodes: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/dns/pkg/dns/dns.go:192: Failed to list *v1.Service: Get https://10.0.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
k8s.io/dns/pkg/dns/dns.go:189: Failed to list *v1.Endpoints: Get https://10.0.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
github.com/GoogleCloudPlatform/k8s-stackdriver/event-exporter/watchers/watcher.go:55: Failed to list *v1.Event: Get https://10.0.0.1:443/api/v1/events?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/heapster.go:254: Failed to list *v1.Pod: Get https://10.0.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/processors/namespace_based_enricher.go:85: Failed to list *v1.Namespace: Get https://10.0.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"ERROR: logging before flag.Parse: E1114 09:50:41.824128 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"
It takes time for the GCE resources to be created. In any environment, provisioning one or more VMs takes some time. The endpoint is not available because the master is not yet ready. Once the cluster is created, you can add the 2 additional node pools without interrupting the master.
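A minimal sketch of that ordering with gcloud (the cluster and pool names here are hypothetical, not taken from the question):
gcloud container clusters create demo-cluster --zone us-central1-a --num-nodes 7
gcloud container node-pools create pool-a --cluster demo-cluster --zone us-central1-a --num-nodes 2
gcloud container node-pools create pool-b --cluster demo-cluster --zone us-central1-a --num-nodes 2
gcloud container clusters describe demo-cluster --zone us-central1-a --format='value(status)'   # API only becomes reachable once this reports RUNNING
Adding the node pools after the cluster reports RUNNING keeps the master (and therefore the kubectl/gcloud API endpoint) available while the extra nodes come up.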

CoreDNS container is getting restarted repeatedly on the Kubernetes worker node

I have a Kubernetes cluster with 1 master node and 2 worker nodes. I have installed Flannel as the network plugin. The CoreDNS pods on the worker node keep going into CrashLoopBackOff state and then back to Running. Does anyone know what could be the reason?
If yes, please help me with the solution.
Logs:
E0820 08:27:52.824620 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0820 08:27:52.824913 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:315: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:27:54.825587 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0820 08:27:54.827941 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:27:56.827227 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:27:58.831233 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
2018/08/20 08:27:59 [INFO] SIGTERM: Shutting down servers then terminating
E0820 08:28:00.839084 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:315: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:28:02.847262 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host

Network service in Kubernetes worker nodes

I have installed a 3-server Kubernetes setup by following https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
I created the Calico network service on the master node. My question: should I create the Calico service on the worker nodes also?
I am getting the error below on the worker node when I create a pod:
ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"
Jul 11 13:25:05 ip-172-31-20-212 kubelet: I0711 13:25:05.144142 23325 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=calico-node pod=calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)
Jul 11 13:25:05 ip-172-31-20-212 kubelet: E0711 13:25:05.144169 23325 pod_workers.go:186] Error syncing pod e17770a3-8507-11e8-962c-0ac29e406ef0 ("calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=calico-node pod=calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.221953 23325 cni.go:280] Error deleting network: context deadline exceeded
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222595 23325 remote_runtime.go:115] StopPodSandbox "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-k2xsm_default" network: context deadline exceeded
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222630 23325 kuberuntime_manager.go:799] Failed to stop sandbox {"docker" "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326"}
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222664 23325 kuberuntime_manager.go:594] killPodWithSyncResult failed: failed to "KillPodSandbox" for "67e18616-850d-11e8-962c-0ac29e406ef0" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"test2-597bdc85dc-k2xsm_default\" network: context deadline exceeded"
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222685 23325 pod_workers.go:186] Error syncing pod 67e18616-850d-11e8-962c-0ac29e406ef0 ("test2-597bdc85dc-k2xsm_default(67e18616-850d-11e8-962c-0ac29e406ef0)"), skipping: failed to "KillPodSandbox" for "67e18616-850d-11e8-962c-0ac29e406ef0" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"test2-597bdc85dc-k2xsm_default\" network: context deadline exceeded"
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.007944 23325 cni.go:280] Error deleting network: context deadline exceeded
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.008783 23325 remote_runtime.go:115] StopPodSandbox "4b14d68c7bc892594dedd1f62d92414574a3fb00873a805b62707c7a63bfdfe7" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-qmc85_default" network: context deadline exceeded
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.008819 23325 kuberuntime_gc.go:153] Failed to stop sandbox "4b14d68c7bc892594dedd1f62d92414574a3fb00873a805b62707c7a63bfdfe7" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-qmc85_default" network: context deadline exceeded
Jul 11 13:25:19 ip-172-31-20-212 kubelet: W0711 13:25:19.145386 23325 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326"
I tried to install the Calico network on the worker nodes as well with the command below, but no luck; I'm getting this error:
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
Every single node needs the Calico service running; that's general knowledge.
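A minimal way to verify this, assuming the v3.1 kubeadm-hosted manifest from the question (it deploys calico-node as a DaemonSet, so applying it once is enough, and kubectl must be run where a valid kubeconfig exists, e.g. on the master):
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml   # run once, from the master
kubectl -n kube-system get daemonset calico-node -o wide    # DESIRED/READY should equal the number of nodes
kubectl -n kube-system get pods -o wide | grep calico-node  # expect one calico-node pod scheduled on each node
The localhost:8080 error in the question is what kubectl reports when it has no kubeconfig, which is why running the apply on a worker fails.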

Kubernetes Authentication issue

I started looking at different ways of using authentication on Kubernetes. Of course, I started with the simplest option, a static password file. Basically, I created a file named users.csv with the following content:
mauro,maurosil,maurosil123,group_mauro
When I start minikube using this file, it hangs at "starting cluster components". The command I use is:
minikube --extra-config=apiserver.Authentication.PasswordFile.BasicAuthFile=~/temp/users.csv start
After a while (~ 10 minutes), the minikube start command fails with the following error message:
E0523 10:23:57.391692 30932 util.go:151] Error uploading error message: : Post https://clouderrorreporting.googleapis.com/v1beta1/projects/k8s-minikube/events:report?key=AIzaSyACUwzG0dEPcl-eOgpDKnyKoUFgHdfoFuA: x509: certificate signed by unknown authority
I can see several errors in the logs (minikube logs):
May 23 09:47:32 minikube kubelet[3301]: E0523 09:47:32.473157 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.414460 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://192.168.99.100:8443/api/v1/nodes?fieldSelector=metadata.name%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.470604 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://192.168.99.100:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.474548 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: I0523 09:47:34.086654 3301 kubelet_node_status.go:271] Setting node annotation to enable volume controller attach/detach
May 23 09:47:34 minikube kubelet[3301]: I0523 09:47:34.090697 3301 kubelet_node_status.go:82] Attempting to register node minikube
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.091108 3301 kubelet_node_status.go:106] Unable to register node "minikube" with API server: Post https://192.168.99.100:8443/api/v1/nodes: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.370484 3301 event.go:209] Unable to write event: 'Patch https://192.168.99.100:8443/api/v1/namespaces/default/events/minikube.15313c5b8cf5913c: dial tcp 192.168.99.100:8443: getsockopt: connection refused' (may retry after sleeping)
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.419833 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://192.168.99.100:8443/api/v1/nodes?fieldSelector=metadata.name%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.472826 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://192.168.99.100:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.479619 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
I also logged into the minikube VM (minikube ssh) and noticed that the apiserver Docker container is down. Looking at the logs of this container, I see the following error:
error: unknown flag: --Authentication.PasswordFile.BasicAuthFile
Therefore, I changed my command to something like:
minikube start --extra-config=apiserver.basic-auth-file=~/temp/users.csv
It failed again, but now the container shows a different error. The error is no longer about an invalid flag; instead, it complains that the file was not found (no such file or directory). I also tried specifying a file on the minikube VM (/var/lib/localkube), but I had the same issue.
The minikube version is:
minikube version: v0.26.0
When I start minikube without the authentication settings, it works fine. Are there any other steps that I need to do?
Mauro
You will need to mount the file into the Docker container that runs the apiserver. Please see a hack that worked: https://github.com/kubernetes/minikube/issues/1898#issuecomment-402714802
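A rough sketch of the shape of that workaround (not the exact steps from the linked issue; the in-VM destination path is an assumption and must be a directory that the apiserver container actually mounts on your minikube version):
scp -i $(minikube ssh-key) ~/temp/users.csv docker@$(minikube ip):/home/docker/users.csv   # copy the password file from the host into the minikube VM
minikube ssh 'sudo cp /home/docker/users.csv /var/lib/localkube/users.csv'                 # place it under a directory visible to the apiserver container (assumed path)
minikube start --extra-config=apiserver.basic-auth-file=/var/lib/localkube/users.csv       # pass the in-VM path, not a host path like ~/temp
The key point is that ~/temp/users.csv only exists on your host, so the apiserver inside the VM/container can never see it; the path you pass must exist from the apiserver's point of view.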