I started looking at different ways of using authentication on Kubernetes. Of course, I started with the simplest option: a static password file. I created a file named users.csv with the following content:
mauro,maurosil,maurosil123,group_mauro
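For reference, the column order the apiserver expects in a static password file is password,user,uid, with an optional quoted fourth column of comma-separated group names; an illustrative line (values made up) would be:
maurosil123,mauro,1000,"group_mauro"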
When I start minikube using this file, it hangs at "starting cluster components". The command I use is:
minikube --extra-config=apiserver.Authentication.PasswordFile.BasicAuthFile=~/temp/users.csv start
After a while (~ 10 minutes), the minikube start command fails with the following error message:
E0523 10:23:57.391692 30932 util.go:151] Error uploading error message: : Post https://clouderrorreporting.googleapis.com/v1beta1/projects/k8s-minikube/events:report?key=AIzaSyACUwzG0dEPcl-eOgpDKnyKoUFgHdfoFuA: x509: certificate signed by unknown authority
I can see several errors in the log (minikube logs):
May 23 09:47:32 minikube kubelet[3301]: E0523 09:47:32.473157 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.414460 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://192.168.99.100:8443/api/v1/nodes?fieldSelector=metadata.name%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.470604 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://192.168.99.100:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:33 minikube kubelet[3301]: E0523 09:47:33.474548 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: I0523 09:47:34.086654 3301 kubelet_node_status.go:271] Setting node annotation to enable volume controller attach/detach
May 23 09:47:34 minikube kubelet[3301]: I0523 09:47:34.090697 3301 kubelet_node_status.go:82] Attempting to register node minikube
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.091108 3301 kubelet_node_status.go:106] Unable to register node "minikube" with API server: Post https://192.168.99.100:8443/api/v1/nodes: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.370484 3301 event.go:209] Unable to write event: 'Patch https://192.168.99.100:8443/api/v1/namespaces/default/events/minikube.15313c5b8cf5913c: dial tcp 192.168.99.100:8443: getsockopt: connection refused' (may retry after sleeping)
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.419833 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://192.168.99.100:8443/api/v1/nodes?fieldSelector=metadata.name%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.472826 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://192.168.99.100:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
May 23 09:47:34 minikube kubelet[3301]: E0523 09:47:34.479619 3301 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.99.100:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 192.168.99.100:8443: getsockopt: connection refused
I also logged into the minikube VM (minikube ssh) and noticed that the apiserver Docker container is down. Looking at the logs of this container, I see the following error:
error: unknown flag: --Authentication.PasswordFile.BasicAuthFile
Therefore, I changed my command to something like:
minikube start --extra-config=apiserver.basic-auth-file=~/temp/users.csv
It failed again, but now the container shows a different error. It no longer complains about an invalid flag; instead, it complains that the file is not found (no such file or directory). I also tried specifying a path on the minikube VM (/var/lib/localkube), but I had the same issue.
The minikube version is:
minikube version: v0.26.0
When I start minikube without the authentication options, it works fine. Are there any other steps that I need to do?
Mauro
You will need to mount the file into the Docker container that runs the apiserver. Please see a hack that worked: https://github.com/kubernetes/minikube/issues/1898#issuecomment-402714802
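In practice that means getting users.csv onto a path inside the minikube VM that the apiserver container also mounts, and then pointing basic-auth-file at that in-VM path rather than a host path (an unexpanded ~/temp/users.csv from the host is not visible inside the VM or the container, which is why the apiserver reports "no such file or directory"). A rough sketch; the paths and the ~/.minikube/files sync behaviour are only illustrative and vary between minikube versions, so check the apiserver pod spec for which hostPath directories are actually mounted:
# minikube syncs anything under ~/.minikube/files into the VM on start
mkdir -p ~/.minikube/files/var/lib/minikube/certs
cp ~/temp/users.csv ~/.minikube/files/var/lib/minikube/certs/users.csv
# point the apiserver at the in-VM path, not the host path
minikube start --extra-config=apiserver.basic-auth-file=/var/lib/minikube/certs/users.csv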
Related
I am facing an unexpected issue. Earlier today I noticed that "kubectl get pods" was returning "Unable to connect to the server: EOF". Upon further investigation, I found that the Kubernetes apiserver is unable to connect to 127.0.0.1:443. I have been unable to resolve this problem, and any assistance would be highly appreciated. Below are the logs I found.
Nov 20 18:13:30 ip-172-31-152-166.us-west-2.compute.internal kubelet[6398]: E1120 18:13:30.362106 6398 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:481: Failed to list *v1.Node: Get https://127.0.0.1/api/v1/nodes?fieldSelector=metadata.name%3Dip-172-31-152-166.us-west-2.compute.internal&limit=500&resourceVersion=0: dial tcp 127.0.0.1:443: getsockopt: connection refused
Nov 20 18:13:30 ip-172-31-152-166.us-west-2.compute.internal kubelet[6398]: E1120 18:13:30.362928 6398 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://127.0.0.1/api/v1/pods?fieldSelector=spec.nodeName%3Dip-172-31-152-166.us-west-2.compute.internal&limit=500&resourceVersion=0: dial tcp 127.0.0.1:443: getsockopt: connection refused
Nov 20 18:13:30 ip-172-31-152-166.us-west-2.compute.internal kubelet[6398]: E1120 18:13:30.363719 6398 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:472: Failed to list *v1.Service: Get https://127.0.0.1/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:443: getsockopt: connection refused
cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
Clean install of Kubernetes using kubeadm init, following the steps directly in the docs. Tried with Flannel, Weave Net and Calico.
After about 5-10 minutes of watching kubectl get nodes, I'll get these messages at random, leaving me with an inaccessible cluster that I can't apply any .yml files to:
Unable to connect to the server: net/http: TLS handshake timeout
The connection to the server 66.70.180.162:6443 was refused - did you specify the right host or port?
Unable to connect to the server: http2: server sent GOAWAY and closed the connection; LastStreamID=1, ErrCode=NO_ERROR, debug=""
kubelet is fine, aside from showing that it can't get random services from 66.70.180.162 (the master node):
[root@play ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2018-12-10 13:57:17 EST; 21min ago
Docs: https://kubernetes.io/docs/
Main PID: 3411939 (kubelet)
CGroup: /system.slice/kubelet.service
└─3411939 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-...
Dec 10 14:18:53 play kubelet[3411939]: E1210 14:18:53.811213 3411939 reflector.go:134] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/c...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.011239 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://66.70.180.162:6443/api/v1/...nection refused
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.211160 3411939 reflector.go:134] object-"kube-system"/"kube-proxy-token-n5qjm": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.411190 3411939 reflector.go:134] object-"kube-system"/"coredns-token-7qjzv": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube-sy...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.611103 3411939 reflector.go:134] object-"kube-system"/"coredns": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/conf...
Dec 10 14:18:54 play kubelet[3411939]: E1210 14:18:54.811105 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://66.70.180.162:6443/api/v1/nod...nection refused
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.011204 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://66.70.180.162:6443/api...nection refused
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.211132 3411939 reflector.go:134] object-"kube-system"/"weave-net-token-5zb86": Failed to list *v1.Secret: Get https://66.70.180.162:6443/api/v1/namespaces/kube-...
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.411281 3411939 reflector.go:134] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Get https://66.70.180.162:6443/api/v1/namespaces/kube-system/c...
Dec 10 14:18:55 play kubelet[3411939]: E1210 14:18:55.611125 3411939 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://66.70.180.162:6443/api/v1/...nection refused
Hint: Some lines were ellipsized, use -l to show in full.
The Docker container that runs CoreDNS shows issues getting resources from what looks like anything in the default k8s service subnet CIDR range (this is a VPS on a separate hosting provider, showing up with a local IP here):
.:53
2018-12-10T10:34:52.589Z [INFO] CoreDNS-1.2.6
2018-12-10T10:34:52.589Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
...
E1210 10:55:53.286644 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E1210 10:55:53.290019 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:318: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
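For context, 10.96.0.1 here should be the ClusterIP of the default kubernetes Service, i.e. the in-cluster front end for the apiserver, under the default kubeadm service CIDR 10.96.0.0/12; a quick way to confirm (assuming kubectl is reachable at that moment):
# expect CLUSTER-IP 10.96.0.1 and PORT(S) 443/TCP
kubectl -n default get service kubernetes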
The kube-apiserver is just showing random failures, and it looks like it's mixing in IPv6:
I1210 19:23:09.067462 1 trace.go:76] Trace[1029933921]: "Get /api/v1/nodes/play" (started: 2018-12-10 19:23:00.256692931 +0000 UTC m=+188.530973072) (total time: 8.810746081s):
Trace[1029933921]: [8.810746081s] [8.810715241s] END
E1210 19:23:09.068687 1 available_controller.go:316] v2beta1.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v2beta1.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.069678 1 available_controller.go:316] v1. failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1./status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.073019 1 available_controller.go:316] v1beta1.apiextensions.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.apiextensions.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.074112 1 available_controller.go:316] v1beta1.batch failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.batch/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.075151 1 available_controller.go:316] v2beta2.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v2beta2.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.077408 1 available_controller.go:316] v1.authorization.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.authorization.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.078457 1 available_controller.go:316] v1.networking.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.networking.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.079449 1 available_controller.go:316] v1beta1.coordination.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.coordination.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.080558 1 available_controller.go:316] v1.authentication.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.authentication.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.081628 1 available_controller.go:316] v1beta1.scheduling.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.scheduling.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.082803 1 available_controller.go:316] v1.autoscaling failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.autoscaling/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.083845 1 available_controller.go:316] v1beta1.events.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.events.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.084882 1 available_controller.go:316] v1beta1.storage.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.storage.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.085985 1 available_controller.go:316] v1.apps failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.apps/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.087019 1 available_controller.go:316] v1beta1.apps failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.apps/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.088113 1 available_controller.go:316] v1beta1.certificates.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.certificates.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.089164 1 available_controller.go:316] v1.storage.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1.storage.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
E1210 19:23:09.090268 1 available_controller.go:316] v1beta1.authentication.k8s.io failed with: Put https://[::1]:6443/apis/apiregistration.k8s.io/v1/apiservices/v1beta1.authentication.k8s.io/status: dial tcp [::1]:6443: connect: connection refused
W1210 19:23:28.996746 1 controller.go:181] StopReconciling() timed out
And I'm out of troubleshooting steps.
I have had a k8s cluster running for 2 days, and then it started behaving strangely.
My specific question is about kube-proxy: it is not updating iptables.
From the kube-proxy logs, I can see it failed to connect to the Kubernetes apiserver (in my case the connection is kube-proxy --> HAProxy --> k8s API server). But the pod is shown as RUNNING.
Question: I am expecting the kube-proxy pod to be down if it is not able to register with the apiserver for events.
How do I achieve this behavior via liveness probes?
Note: After killing the pod, kube-proxy works fine.
kube-proxy logs
sudo docker logs 1de375c94fd4 -f
W0910 15:18:22.091902 1 server.go:195] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
I0910 15:18:22.091962 1 feature_gate.go:226] feature gates: &{{} map[]}
time="2018-09-10T15:18:22Z" level=warning msg="Running modprobe ip_vs failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.15.0-33-generic/modules.dep.bin'\nmodprobe: WARNING: Module ip_vs not found in directory /lib/modules/4.15.0-33-generic`, error: exit status 1"
time="2018-09-10T15:18:22Z" level=error msg="Could not get ipvs family information from the kernel. It is possible that ipvs is not enabled in your kernel. Native loadbalancing will not work until this is fixed."
I0910 15:18:22.185086 1 server.go:409] Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.
I0910 15:18:22.186885 1 server_others.go:140] Using iptables Proxier.
W0910 15:18:22.438408 1 server.go:601] Failed to retrieve node info: nodes "$(node_name)" not found
W0910 15:18:22.438494 1 proxier.go:306] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
I0910 15:18:22.438595 1 server_others.go:174] Tearing down inactive rules.
I0910 15:18:22.861478 1 server.go:444] Version: v1.10.2
I0910 15:18:22.867003 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 2883584
I0910 15:18:22.867046 1 conntrack.go:52] Setting nf_conntrack_max to 2883584
I0910 15:18:22.867267 1 conntrack.go:83] Setting conntrack hashsize to 720896
I0910 15:18:22.893396 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0910 15:18:22.893505 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0910 15:18:22.893737 1 config.go:102] Starting endpoints config controller
I0910 15:18:22.893749 1 controller_utils.go:1019] Waiting for caches to sync for endpoints config controller
I0910 15:18:22.893742 1 config.go:202] Starting service config controller
I0910 15:18:22.893765 1 controller_utils.go:1019] Waiting for caches to sync for service config controller
I0910 15:18:22.993904 1 controller_utils.go:1026] Caches are synced for endpoints config controller
I0910 15:18:22.993921 1 controller_utils.go:1026] Caches are synced for service config controller
W0910 16:13:28.276082 1 reflector.go:341] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: watch of *core.Endpoints ended with: very short watch: k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Unexpected watch close - watch lasted less than a second and no items received
W0910 16:13:28.276083 1 reflector.go:341] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: watch of *core.Service ended with: very short watch: k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Unexpected watch close - watch lasted less than a second and no items received
E0910 16:13:29.276678 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:29.276677 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:30.277201 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:30.278009 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:31.277723 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:31.278574 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:32.278197 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:32.279134 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:33.278684 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:33.279587 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
Question: I am expecting the kube-proxy pod to be down if it is not able to register with the apiserver for events.
The kube-proxy is not supposed to go down. It listens for events on the kube-apiserver and performs whatever it needs to do when a change/deployment happens. The rationale I can think of is that it may be caching information to keep the iptables rules on your system consistent. Kubernetes is designed in such a way that if the kube-apiserver or other master components go down, traffic should still be flowing to the nodes with no downtime.
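If you want to see the state kube-proxy is keeping consistent, you can inspect the KUBE-* chains it programs (a quick sketch; the chain names assume the default iptables proxy mode, run on the node):
# list the service dispatch chain kube-proxy maintains in the nat table
sudo iptables -t nat -L KUBE-SERVICES -n | head
# count the per-service chains it has programmed
sudo iptables-save -t nat | grep -c KUBE-SVC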
How do I achieve this behavior via liveness probes?
You can always add liveness probes to the kube-proxy DaemonSet but it's not a recommended practice:
spec:
  containers:
  - command:
    - /usr/local/bin/kube-proxy
    - --config=/var/lib/kube-proxy/config.conf
    image: k8s.gcr.io/kube-proxy-amd64:v1.11.2
    imagePullPolicy: IfNotPresent
    name: kube-proxy
    resources: {}
    securityContext:
      privileged: true
    livenessProbe:
      exec:
        command:
        - curl
        - -f
        # 10256 is kube-proxy's own healthz port; kube-proxy runs with hostNetwork,
        # so localhost resolves to the node (assumes curl exists in the image)
        - localhost:10256/healthz
      initialDelaySeconds: 5
      periodSeconds: 5
Make sure the healthz endpoint is enabled on kube-proxy itself (the --healthz-port / --healthz-bind-address flags, default 0.0.0.0:10256); it is kube-proxy, not the kube-apiserver, that serves this port.
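To see what such a probe would observe, you can hit kube-proxy's healthz endpoint directly on the node (this assumes the default bind address of 0.0.0.0:10256):
# run on the node where kube-proxy is scheduled; a healthy kube-proxy returns HTTP 200
curl -s http://localhost:10256/healthz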
I have a Kubernetes cluster with 1 master node and 2 worker nodes. I have installed Flannel as the network plugin. The CoreDNS pods on the worker node keep going into CrashLoopBackOff state and then back to Running. Does anyone know what could be the reason?
If yes, please help me with the solution.
logs:
E0820 08:27:52.824620 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0820 08:27:52.824913 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:315: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:27:54.825587 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0820 08:27:54.827941 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:27:56.827227 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:27:58.831233 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
2018/08/20 08:27:59 [INFO] SIGTERM: Shutting down servers then terminating
E0820 08:28:00.839084 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:315: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0820 08:28:02.847262 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
I have installed a 3-server Kubernetes setup by following https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
I created the Calico network service on the master node. My question: should I create the Calico service on the worker nodes also?
I am getting the below error on the worker node when I create a pod:
ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"
Jul 11 13:25:05 ip-172-31-20-212 kubelet: I0711 13:25:05.144142 23325 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=calico-node pod=calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)
Jul 11 13:25:05 ip-172-31-20-212 kubelet: E0711 13:25:05.144169 23325 pod_workers.go:186] Error syncing pod e17770a3-8507-11e8-962c-0ac29e406ef0 ("calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=calico-node pod=calico-node-ngwhq_kube-system(e17770a3-8507-11e8-962c-0ac29e406ef0)"
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.221953 23325 cni.go:280] Error deleting network: context deadline exceeded
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222595 23325 remote_runtime.go:115] StopPodSandbox "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-k2xsm_default" network: context deadline exceeded
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222630 23325 kuberuntime_manager.go:799] Failed to stop sandbox {"docker" "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326"}
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222664 23325 kuberuntime_manager.go:594] killPodWithSyncResult failed: failed to "KillPodSandbox" for "67e18616-850d-11e8-962c-0ac29e406ef0" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"test2-597bdc85dc-k2xsm_default\" network: context deadline exceeded"
Jul 11 13:25:07 ip-172-31-20-212 kubelet: E0711 13:25:07.222685 23325 pod_workers.go:186] Error syncing pod 67e18616-850d-11e8-962c-0ac29e406ef0 ("test2-597bdc85dc-k2xsm_default(67e18616-850d-11e8-962c-0ac29e406ef0)"), skipping: failed to "KillPodSandbox" for "67e18616-850d-11e8-962c-0ac29e406ef0" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod \"test2-597bdc85dc-k2xsm_default\" network: context deadline exceeded"
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.007944 23325 cni.go:280] Error deleting network: context deadline exceeded
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.008783 23325 remote_runtime.go:115] StopPodSandbox "4b14d68c7bc892594dedd1f62d92414574a3fb00873a805b62707c7a63bfdfe7" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-qmc85_default" network: context deadline exceeded
Jul 11 13:25:12 ip-172-31-20-212 kubelet: E0711 13:25:12.008819 23325 kuberuntime_gc.go:153] Failed to stop sandbox "4b14d68c7bc892594dedd1f62d92414574a3fb00873a805b62707c7a63bfdfe7" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "test2-597bdc85dc-qmc85_default" network: context deadline exceeded
Jul 11 13:25:19 ip-172-31-20-212 kubelet: W0711 13:25:19.145386 23325 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "22fe8b5db360011aa79afadfe91a46bfef0322092478d378ef657d3babfc1326"
I tried to install the Calico network on the worker nodes as well with the below-mentioned command, but no luck; I get the error below:
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
unable to recognize "https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml": Get http://localhost:8080/api?timeout=32s: dial tcp 127.0.0.1:8080: connect: connection refused
Every single node needs the Calico service (the calico-node agent) running; that is expected.
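For what it's worth, the standard calico.yaml deploys calico-node as a DaemonSet, so applying the manifest once from the master is enough to schedule the agent on every node; the localhost:8080 errors on the worker just mean kubectl there has no kubeconfig pointing at the apiserver, not that the manifest must be applied from each node. A quick check, run from a machine with a working kubeconfig (e.g. the master):
# the DaemonSet should report one calico-node pod per node
kubectl -n kube-system get daemonset calico-node
kubectl -n kube-system get pods -o wide | grep calico-node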