Recover kubernetes-dashboard pod?

I've installed Kubernetes on Ubuntu 18.04.5 LTS (Bionic Beaver) and am now trying to run the kubernetes-dashboard. However, it keeps crashing.
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-7b59f7d4df-wj7ts 1/1 Running 0 153m
kubernetes-dashboard-74d688b6bc-2c6m6 0/1 CrashLoopBackOff 32 153m
$ kubectl logs kubernetes-dashboard-74d688b6bc-2c6m6 -n kubernetes-dashboard
2021/04/23 08:32:26 Using namespace: kubernetes-dashboard
2021/04/23 08:32:26 Starting overwatch
2021/04/23 08:32:26 Using in-cluster config to connect to apiserver
2021/04/23 08:32:26 Using secret token for csrf signing
2021/04/23 08:32:26 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004e69e0)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:41 +0x446
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc00047a080)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:501 +0xc6
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc00047a080)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:469 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:550
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x20d
What did I miss or how can I recover from this?

Try the following:
Check kubectl get svc | grep kubernetes and then kubectl describe svc kubernetes, and correct anything that looks wrong there.
Define the hostname in /etc/hosts:
# vim /etc/hosts
YOUR_IP HOSTNAME
Disable the firewall.
Restart kubelet.
Restart Docker.
Flush iptables.
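A rough sketch of what those last steps look like on an Ubuntu 18.04 host (a sketch, assuming ufw as the firewall; kube-proxy and the CNI plugin repopulate their own iptables rules after the flush):
$ sudo ufw disable                   # disable the firewall
$ sudo systemctl restart kubelet     # restart kubelet
$ sudo systemctl restart docker      # restart Docker
$ sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X   # flush iptables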

Related

Why does 'unauthorized' appear in the startup of kubernetes cluster?

I start k8s on my local machine and find that most of the pods are not ready:
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-65dbdb44db-zrfpm 0/1 Running 0 3m9s
kube-system dashboard-metrics-scraper-545bbb8767-nfnbg 1/1 Running 1 16m
kube-system kube-flannel-ds-amd64-nm7kr 0/1 CrashLoopBackOff 5 16m
kube-system kubernetes-dashboard-65665f84db-rqxpv 0/1 CrashLoopBackOff 4 118s
kube-system metrics-server-869ffc99cd-6fhfl 0/1 CrashLoopBackOff 5 16m
Then I check the status of flannel:
kubectl logs kube-flannel-ds-amd64-nm7kr -n kube-system
The log tells me that something is unauthorized:
I0809 10:23:51.307347 1 main.go:518] Determining IP address of default interface
I0809 10:23:51.308840 1 main.go:531] Using interface with name wlo1 and address 192.168.1.102
I0809 10:23:51.308894 1 main.go:548] Defaulting external address to interface address (192.168.1.102)
W0809 10:23:51.308917 1 client_config.go:517] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
E0809 10:23:51.620449 1 main.go:243] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-nm7kr': Unauthorized
Then I check the coredns:
E0809 10:28:06.807582 1 reflector.go:153] pkg/mod/k8s.io/client-go#v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Namespace: Unauthorized
Then I check the metrics-server:
2020/08/09 10:28:06 Starting overwatch
2020/08/09 10:28:06 Using namespace: kube-system
2020/08/09 10:28:06 Using in-cluster config to connect to apiserver
2020/08/09 10:28:06 Using secret token for csrf signing
2020/08/09 10:28:06 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Unauthorized
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc00036f0e0)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:41 +0x446
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc000114080)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:501 +0xc6
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc000114080)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:469 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:550
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x20d
I'm using Kubernetes 1.18.2.
So what is going wrong here?

Kubespray: Netchecker connectivity check fails

I deployed a Kubernetes (v1.17.5) cluster on OpenStack instances using Kubespray. Those instances are CentOS 7.6.1811 qcow2 images imported in Glance.
The install was successful, and I can see my nodes and pods with kubectl commands.
I used the deploy_netchecker option to deploy NetChecker and test the network within my cluster, and set network_plugin="flannel".
I also tried kube_proxy_mode="iptables", but it doesn't seem to affect the result.
Those are pretty much all the changes I made in the k8s-cluster.yml file (sketched below).
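For reference, those settings look roughly like this in k8s-cluster.yml (a sketch; the exact Kubespray variable names are an assumption):
kube_network_plugin: flannel    # network_plugin="flannel"
kube_proxy_mode: iptables
deploy_netchecker: true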
All the pods are running, and so are the services:
[centos@cl1-master-0 ~]$ kubectl get svc --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 46h
default netchecker-service NodePort 10.233.13.213 <none> 8081:31081/TCP 46h
kube-system coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 46h
kube-system dashboard-metrics-scraper ClusterIP 10.233.59.12 <none> 8000/TCP 46h
kube-system kubernetes-dashboard ClusterIP 10.233.63.20 <none> 443/TCP 46h
But the netchecker API gives the following answer:
[root@localhost ~]# curl http://X.X.X.X:31081/api/v1/connectivity_check
{"Message":"Connectivity check fails. Reason: there are absent or outdated pods; look up the payload","Absent":["netchecker-agent-hostnet-kk56x","netchecker-agent-hostnet-klldn","netchecker-agent-hostnet-r2vqs","netchecker-agent-hostnet-wqhjs"],"Outdated":["netchecker-agent-4jsgf","netchecker-agent-c9pcf","netchecker-agent-hostnet-jzbfv","netchecker-agent-vxgpf"]}
For an unknown reason, I cannot access the API from a cluster node with localhost, so I used a floating IP with OpenStack.
Here are some logs from the agent:
[centos@cl1-master-0 ~]$ sudo vi /var/log/pods/default_netchecker-agent-vjnwl_d8290268-3ea4-4e3c-acb4-295ab162a735/netchecker-agent/0.log
{"log":"I0701 13:04:01.814246 1 agent.go:135] Response status code: 200\n","stream":"stderr","time":"2020-07-01T13:04:01.81437579Z"}
{"log":"I0701 13:04:01.814272 1 agent.go:128] Sleep for 15 second(s)\n","stream":"stderr","time":"2020-07-01T13:04:01.814393199Z"}
{"log":"I0701 13:04:16.817398 1 agent.go:55] Send payload via URL: http://netchecker-service:8081/api/v1/agents/netchecker-agent-vjnwl\n","stream":"stderr","time":"2020-07-01T13:04:16.817786735Z"}
[centos@cl1-master-0 ~]$ sudo vi /var/log/pods/default_netchecker-agent-hostnet-klldn_d5fa6e72-885f-44e1-97a6-880a25e6d6d6/netchecker-agent/0.log
{"log":"E0701 13:05:22.804428 1 agent.go:133] Error while sending info. Details: Post http://netchecker-service:8081/api/v1/agents/netchecker-agent-hostnet-klldn: dial tcp 10.233.13.213:8081: i/o timeout\n","stream":"stderr","time":"2020-07-01T13:05:22.805138032Z"}
{"log":"I0701 13:05:22.804474 1 agent.go:128] Sleep for 15 second(s)\n","stream":"stderr","time":"2020-07-01T13:05:22.805190295Z"}
{"log":"I0701 13:05:37.807140 1 agent.go:55] Send payload via URL: http://netchecker-service:8081/api/v1/agents/netchecker-agent-hostnet-klldn\n","stream":"stderr","time":"2020-07-01T13:05:37.807309111Z"}
Logs from the server do not indicate any error.
I tried to check DNS resolution with the following:
[centos@cl1-master-0 ~]$ kubectl exec -it netchecker-agent-4jsgf -- /bin/sh
/ $ nslookup kubernetes.default
Server: 169.254.25.10
Address 1: 169.254.25.10
nslookup: can't resolve 'kubernetes.default'
[centos@cl1-master-0 ~]$ kubectl exec -it netchecker-agent-4jsgf -- cat /etc/resolv.conf
nameserver 169.254.25.10
search default.svc.cluster.local svc.cluster.local cluster.local openstacklocal
options ndots:5
169.254.25.10 is the IP of nodelocaldns, but it doesn't seem to query the deployed coredns service.
When I run nslookup netchecker-service.default.svc.cluster.local 10.233.0.3, querying the coredns IP directly, I get a correct answer (command sketched below).
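For reference, that direct query against coredns was along these lines (a sketch, reusing the agent pod name and the coredns ClusterIP from the listings above):
[centos@cl1-master-0 ~]$ kubectl exec -it netchecker-agent-4jsgf -- nslookup netchecker-service.default.svc.cluster.local 10.233.0.3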
What could be wrong with my configuration?
Thanks in advance.
UPDATE: The Flannel plugin has a known issue, and a fix needs to be applied on all nodes of the cluster. Once done, the pods successfully report back to the netchecker server.

Cannot log in to kubernetes dashboard: dial tcp 172.17.0.6:8443: connect: connection refused

I deployed Kubernetes v1.15.2 dashboard successfully. Checking the cluster info:
$ kubectl cluster-info
Kubernetes master is running at http://172.19.104.231:8080
kubernetes-dashboard is running at http://172.19.104.231:8080/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
When I access the dashboard, the result is:
[root@ops001 ~]# curl -L http://172.19.104.231:8080/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Error: 'dial tcp 172.17.0.6:8443: connect: connection refused'
Trying to reach: 'https://172.17.0.6:8443/'
This is the dashboard status:
[root@ops001 ~]# kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-74d7cc788-mk9c7 1/1 Running 0 92m
What should I do to access the dashboard? When I use kubectl proxy to access the dashboard UI:
$ kubectl proxy --address='localhost' --port=8086 --accept-hosts='^*$'
Starting to serve on 127.0.0.1:8086
the result is:
[root@ops001 ~]# curl -L http://127.0.0.1:8086/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Error: 'dial tcp 172.17.0.6:8443: connect: connection refused'
Trying to reach: 'https://172.17.0.6:8443/'
What should I do to fix this problem?
I finally found that the problem is that the kubernetes-dashboard pod's container cannot communicate with the proxy nginx container, because the proxy container was deployed before kubernetes flannel and is not on the same network. Adding the proxy nginx container to the flannel network solves the problem. Check the current flannel network:
[root@ops001 conf.d]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.30.0.0/16
FLANNEL_SUBNET=172.30.224.1/21
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Generate the Docker start parameters:
./mk-docker-opts.sh -d /run/docker_opts.env -c
Check the parameters:
[root@ops001 conf.d]# cat /run/docker_opts.env
DOCKER_OPTS=" --bip=172.30.224.1/21 --ip-masq=false --mtu=1450"
Add the parameters to the Docker service:
# vim /lib/systemd/system/docker.service
EnvironmentFile=/run/docker_opts.env
ExecStart=/usr/bin/dockerd $DOCKER_OPTS -H fd://
Restart Docker; the container then joins the flannel network and the containers can communicate with each other:
systemctl daemon-reload
systemctl restart docker
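To verify the restart took effect, a quick check (a sketch; the bridge address should match the FLANNEL_SUBNET shown above):
ip addr show docker0    # docker0 should now report 172.30.224.1/21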
Hope this helps!

Coredns in CrashLoopBackOff (kubernetes 1.11)

I'm trying to install Kubernetes on an Ubuntu 16.04 VM, following the instructions at https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/, using Weave as my pod network add-on.
I'm seeing a similar issue to the question coredns pods have CrashLoopBackOff or Error state, but I didn't see a solution there, and the versions I'm using are different:
kubeadm 1.11.4-00
kubectl 1.11.4-00
kubelet 1.11.4-00
kubernetes-cni 0.6.0-00
Docker version 1.13.1-cs8, build 91ca5f2
weave script 2.5.0
weave 2.5.0
I'm running behind a corporate firewall, so I set my proxy variables, then ran kubeadm init as follows:
# echo $http_proxy
http://135.28.13.11:8080
# echo $https_proxy
http://135.28.13.11:8080
# echo $no_proxy
127.0.0.1,135.21.27.139,135.0.0.0/8,10.96.0.0/12,10.32.0.0/12
# kubeadm init --pod-network-cidr=10.32.0.0/12
# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
# kubectl taint nodes --all node-role.kubernetes.io/master-
Both coredns pods stay in CrashLoopBackOff:
# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default hostnames-674b556c4-2b5h2 1/1 Running 0 5h 10.32.0.6 mtpnjvzonap001 <none>
default hostnames-674b556c4-4bzdj 1/1 Running 0 5h 10.32.0.5 mtpnjvzonap001 <none>
default hostnames-674b556c4-64gx5 1/1 Running 0 5h 10.32.0.4 mtpnjvzonap001 <none>
kube-system coredns-78fcdf6894-s7rvx 0/1 CrashLoopBackOff 18 1h 10.32.0.7 mtpnjvzonap001 <none>
kube-system coredns-78fcdf6894-vxwgv 0/1 CrashLoopBackOff 80 6h 10.32.0.2 mtpnjvzonap001 <none>
kube-system etcd-mtpnjvzonap001 1/1 Running 0 6h 135.21.27.139 mtpnjvzonap001 <none>
kube-system kube-apiserver-mtpnjvzonap001 1/1 Running 0 1h 135.21.27.139 mtpnjvzonap001 <none>
kube-system kube-controller-manager-mtpnjvzonap001 1/1 Running 0 6h 135.21.27.139 mtpnjvzonap001 <none>
kube-system kube-proxy-2c4tx 1/1 Running 0 6h 135.21.27.139 mtpnjvzonap001 <none>
kube-system kube-scheduler-mtpnjvzonap001 1/1 Running 0 1h 135.21.27.139 mtpnjvzonap001 <none>
kube-system weave-net-bpx22 2/2 Running 0 6h 135.21.27.139 mtpnjvzonap001 <none>
The coredns pods have this entry in their log:
E1114 20:59:13.848196 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
This suggests to me that coredns cannot access the apiserver using its cluster IP:
# kubectl describe svc/kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 135.21.27.139:6443
Session Affinity: None
Events: <none>
I also went through the troubleshooting steps at https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/
I created a busybox pod for testing
I created the hostnames deployment successfully
I exposed the hostnames deployment successfully
From the busybox pod, I accessed the hostnames service by its cluster IP successfully
From the node, I accessed the hostnames service by its cluster IP successfully (these checks are sketched below)
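For reference, those checks were roughly the following (a sketch based on the linked troubleshooting guide; the image, ports, and the cluster-IP placeholder are assumptions):
# kubectl run busybox --image=busybox:1.28 --restart=Never -- sleep 3600
# kubectl run hostnames --image=k8s.gcr.io/serve_hostname --replicas=3
# kubectl expose deployment hostnames --port=80 --target-port=9376
# kubectl exec busybox -- wget -qO- http://<hostnames-cluster-ip>:80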
So in short, I created the hostnames service which had a cluster IP in 10.96.0.0/12 space (as expected), and it works, but for some reason, pods cannot access the apiserver's cluster IP of 10.96.0.1, though from the node I can access 10.96.0.1:
# wget --no-check-certificate https://10.96.0.1/hello
--2018-11-14 21:44:25-- https://10.96.0.1/hello
Connecting to 10.96.0.1:443... connected.
WARNING: cannot verify 10.96.0.1's certificate, issued by ‘CN=kubernetes’:
Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 403 Forbidden
2018-11-14 21:44:25 ERROR 403: Forbidden.
Some other things I checked, based on advice from others who reported a similar problem:
# sysctl net.ipv4.conf.all.forwarding
net.ipv4.conf.all.forwarding = 1
# sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
# iptables-save | egrep ':INPUT|:OUTPUT|:POSTROUTING|:FORWARD'
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [11:692]
:POSTROUTING ACCEPT [11:692]
:INPUT ACCEPT [1697:364811]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [1652:363693]
# ls -l /usr/sbin/conntrack
-rwxr-xr-x 1 root root 65632 Jan 24 2016 /usr/sbin/conntrack
# systemctl status firewalld
● firewalld.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
I checked the log for kube-proxy and did not see any errors.
I also tried deleting the coredns pods and the apiserver pod (commands sketched below); they are recreated (as expected), but the problem remains.
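For reference, the deletions were along these lines (pod names taken from the listing above; kube-apiserver is a static pod, so deleting it only removes the mirror pod object, which the kubelet recreates):
# kubectl -n kube-system delete pod coredns-78fcdf6894-s7rvx coredns-78fcdf6894-vxwgv
# kubectl -n kube-system delete pod kube-apiserver-mtpnjvzonap001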
Here's a copy of the log from the weave container:
# kubectl logs -n kube-system weave-net-bpx22 weave
DEBU: 2018/11/14 15:56:10.909921 [kube-peers] Checking peer "aa:53:be:75:71:f7" against list &{[]}
Peer not in list; removing persisted data
INFO: 2018/11/14 15:56:11.041807 Command line options: map[name:aa:53:be:75:71:f7 nickname:mtpnjvzonap001 ipalloc-init:consensus=1 ipalloc-range:10.32.0.0/12 db-prefix:/weavedb/weave-net docker-api: expect-npc:true host-root:/host http-addr:127.0.0.1:6784 metrics-addr:0.0.0.0:6782 conn-limit:100 datapath:datapath no-dns:true port:6783]
INFO: 2018/11/14 15:56:11.042230 weave 2.5.0
INFO: 2018/11/14 15:56:11.198348 Bridge type is bridged_fastdp
INFO: 2018/11/14 15:56:11.198372 Communication between peers is unencrypted.
INFO: 2018/11/14 15:56:11.203206 Our name is aa:53:be:75:71:f7(mtpnjvzonap001)
INFO: 2018/11/14 15:56:11.203249 Launch detected - using supplied peer list: [135.21.27.139]
INFO: 2018/11/14 15:56:11.216398 Checking for pre-existing addresses on weave bridge
INFO: 2018/11/14 15:56:11.229313 [allocator aa:53:be:75:71:f7] No valid persisted data
INFO: 2018/11/14 15:56:11.233391 [allocator aa:53:be:75:71:f7] Initialising via deferred consensus
INFO: 2018/11/14 15:56:11.233443 Sniffing traffic on datapath (via ODP)
INFO: 2018/11/14 15:56:11.234120 ->[135.21.27.139:6783] attempting connection
INFO: 2018/11/14 15:56:11.234302 ->[135.21.27.139:49182] connection accepted
INFO: 2018/11/14 15:56:11.234818 ->[135.21.27.139:6783|aa:53:be:75:71:f7(mtpnjvzonap001)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/11/14 15:56:11.234843 ->[135.21.27.139:49182|aa:53:be:75:71:f7(mtpnjvzonap001)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/11/14 15:56:11.236010 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2018/11/14 15:56:11.236424 Listening for metrics requests on 0.0.0.0:6782
INFO: 2018/11/14 15:56:11.990529 [kube-peers] Added myself to peer list &{[{aa:53:be:75:71:f7 mtpnjvzonap001}]}
DEBU: 2018/11/14 15:56:11.995901 [kube-peers] Nodes that have disappeared: map[]
10.32.0.1
135.21.27.139
DEBU: 2018/11/14 15:56:12.075738 registering for updates for node delete events
INFO: 2018/11/14 15:56:41.279799 Error checking version: Get https://checkpoint-api.weave.works/v1/check/weave-net?arch=amd64&flag_docker-version=none&flag_kernel-version=4.4.0-135-generic&flag_kubernetes-cluster-size=1&flag_kubernetes-cluster-uid=ce66cb23-e825-11e8-abc3-525400314503&flag_kubernetes-version=v1.11.4&os=linux&signature=EJdydeNysrC7LC5xAJAKyDvxXCvkeWUFzepdk3QDfr0%3D&version=2.5.0: dial tcp 74.125.196.121:443: i/o timeout
INFO: 2018/11/14 20:52:47.025412 Error checking version: Get https://checkpoint-api.weave.works/v1/check/weave-net?arch=amd64&flag_docker-version=none&flag_kernel-version=4.4.0-135-generic&flag_kubernetes-cluster-size=1&flag_kubernetes-cluster-uid=ce66cb23-e825-11e8-abc3-525400314503&flag_kubernetes-version=v1.11.4&os=linux&signature=EJdydeNysrC7LC5xAJAKyDvxXCvkeWUFzepdk3QDfr0%3D&version=2.5.0: dial tcp 74.125.196.121:443: i/o timeout
INFO: 2018/11/15 01:46:32.842792 Error checking version: Get https://checkpoint-api.weave.works/v1/check/weave-net?arch=amd64&flag_docker-version=none&flag_kernel-version=4.4.0-135-generic&flag_kubernetes-cluster-size=1&flag_kubernetes-cluster-uid=ce66cb23-e825-11e8-abc3-525400314503&flag_kubernetes-version=v1.11.4&os=linux&signature=EJdydeNysrC7LC5xAJAKyDvxXCvkeWUFzepdk3QDfr0%3D&version=2.5.0: dial tcp 74.125.196.121:443: i/o timeout
INFO: 2018/11/15 09:06:03.624359 Error checking version: Get https://checkpoint-api.weave.works/v1/check/weave-net?arch=amd64&flag_docker-version=none&flag_kernel-version=4.4.0-135-generic&flag_kubernetes-cluster-size=1&flag_kubernetes-cluster-uid=ce66cb23-e825-11e8-abc3-525400314503&flag_kubernetes-version=v1.11.4&os=linux&signature=EJdydeNysrC7LC5xAJAKyDvxXCvkeWUFzepdk3QDfr0%3D&version=2.5.0: dial tcp 172.217.9.147:443: i/o timeout
INFO: 2018/11/15 14:34:02.070893 Error checking version: Get https://checkpoint-api.weave.works/v1/check/weave-net?arch=amd64&flag_docker-version=none&flag_kernel-version=4.4.0-135-generic&flag_kubernetes-cluster-size=1&flag_kubernetes-cluster-uid=ce66cb23-e825-11e8-abc3-525400314503&flag_kubernetes-version=v1.11.4&os=linux&signature=EJdydeNysrC7LC5xAJAKyDvxXCvkeWUFzepdk3QDfr0%3D&version=2.5.0: dial tcp 172.217.9.147:443: i/o timeout
Here are the events for the two coredns pods:
# kubectl get events -n kube-system --field-selector involvedObject.name=coredns-78fcdf6894-6f9q6
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
41m 20h 245 coredns-78fcdf6894-6f9q6.1568eab25f0acb02 Pod spec.containers{coredns} Normal Killing kubelet, mtpnjvzonap001 Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
26m 20h 248 coredns-78fcdf6894-6f9q6.1568ea920f72ddd4 Pod spec.containers{coredns} Normal Pulled kubelet, mtpnjvzonap001 Container image "k8s.gcr.io/coredns:1.1.3" already present on machine
5m 20h 1256 coredns-78fcdf6894-6f9q6.1568eaa1fd9216d2 Pod spec.containers{coredns} Warning Unhealthy kubelet, mtpnjvzonap001 Liveness probe failed: HTTP probe failed with statuscode: 503
1m 19h 2963 coredns-78fcdf6894-6f9q6.1568eb75f2b1af3e Pod spec.containers{coredns} Warning BackOff kubelet, mtpnjvzonap001 Back-off restarting failed container
# kubectl get events -n kube-system --field-selector involvedObject.name=coredns-78fcdf6894-skjwz
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
6m 20h 1259 coredns-78fcdf6894-skjwz.1568eaa181fbeffe Pod spec.containers{coredns} Warning Unhealthy kubelet, mtpnjvzonap001 Liveness probe failed: HTTP probe failed with statuscode: 503
1m 19h 2969 coredns-78fcdf6894-skjwz.1568eb7578188f24 Pod spec.containers{coredns} Warning BackOff kubelet, mtpnjvzonap001 Back-off restarting failed container
#
Any help or further troubleshooting steps are welcome.
I had the same problem and needed to allow several ports in my firewall: 22, 53, 6443, 6783, 6784, 8285.
I copied the rules from an existing healthy cluster. Probably only 6443, shown above as the target port of the kubernetes apiserver service, is required for this error; the others are for other services I run in my cluster.
On Ubuntu this was Uncomplicated Firewall (ufw):
ufw allow 22/tcp # allowed for ssh, included in case you had firewall disabled altogether
ufw allow 6443
ufw allow 53
ufw allow 8285
ufw allow 6783
ufw allow 6784

Kubernetes dashboard not working, “already exists” and “could not find the requested resource (get services heapster)”

I am new to Kubernetes.
The goal is to get the Kubernetes cluster dashboard working.
The Kubernetes cluster was deployed using Kubespray: github.com/kubernetes-incubator/kubespray
Versions:
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.6", GitCommit:"4bc5e7f9a6c25dc4c03d4d656f2cefd21540e28c", GitTreeState:"clean", BuildDate:"2017-09-15T08:51:21Z", GoVersion:"go1.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3+coreos.0", GitCommit:"42de91f04e456f7625941a6c4aaedaa69708be1b", GitTreeState:"clean", BuildDate:"2017-08-07T19:44:31Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
When I do kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml --validate=false as described here, I get:
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml": secrets "kubernetes-dashboard-certs" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml": serviceaccounts "kubernetes-dashboard" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml": roles.rbac.authorization.k8s.io "kubernetes-dashboard-minimal" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml": rolebindings.rbac.authorization.k8s.io "kubernetes-dashboard-minimal" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml": deployments.extensions "kubernetes-dashboard" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml": services "kubernetes-dashboard" already exists
When I run kubectl get services --namespace kube-system, I get:
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.233.0.3 <none> 53/UDP,53/TCP 10d
kubernetes-dashboard 10.233.28.132 <none> 80/TCP 9d
When I try to reach the Kubernetes cluster dashboard, I get Connection refused.
kubectl logs --namespace=kube-system kubernetes-dashboard-4167803980-1dz53 output:
2017/09/27 10:54:11 Using in-cluster config to connect to apiserver
2017/09/27 10:54:11 Using service account token for csrf signing
2017/09/27 10:54:11 No request provided. Skipping authorization
2017/09/27 10:54:11 Starting overwatch
2017/09/27 10:54:11 Successful initial request to the apiserver, version: v1.7.3+coreos.0
2017/09/27 10:54:11 New synchronizer has been registered: kubernetes-dashboard-key-holder-kube-system. Starting
2017/09/27 10:54:11 Starting secret synchronizer for kubernetes-dashboard-key-holder in namespace kube-system
2017/09/27 10:54:11 Initializing secret synchronizer synchronously using secret kubernetes-dashboard-key-holder from namespace kube-system
2017/09/27 10:54:11 Initializing JWE encryption key from synchronized object
2017/09/27 10:54:11 Creating in-cluster Heapster client
2017/09/27 10:54:11 Serving securely on HTTPS port: 8443
2017/09/27 10:54:11 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Other outputs:
kubectl get pods --namespace=kube-system:
NAME READY STATUS RESTARTS AGE
calico-node-bqckz 1/1 Running 0 12d
calico-node-r9svd 1/1 Running 2 12d
calico-node-w3tps 1/1 Running 0 12d
kube-apiserver-kubetest1 1/1 Running 0 12d
kube-apiserver-kubetest2 1/1 Running 0 12d
kube-controller-manager-kubetest1 1/1 Running 2 12d
kube-controller-manager-kubetest2 1/1 Running 2 12d
kube-dns-3888408129-n0m8d 3/3 Running 0 12d
kube-dns-3888408129-z8xx3 3/3 Running 0 12d
kube-proxy-kubetest1 1/1 Running 0 12d
kube-proxy-kubetest2 1/1 Running 0 12d
kube-proxy-kubetest3 1/1 Running 0 12d
kube-scheduler-kubetest1 1/1 Running 2 12d
kube-scheduler-kubetest2 1/1 Running 2 12d
kubedns-autoscaler-1629318612-sd924 1/1 Running 0 12d
kubernetes-dashboard-4167803980-1dz53 1/1 Running 0 1d
nginx-proxy-kubetest3 1/1 Running 0 12d
kubectl proxy:
Starting to serve on 127.0.0.1:8001
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2692f20]
goroutine 1 [running]:
k8s.io/kubernetes/pkg/kubectl.(*ProxyServer).ServeOnListener(0x0, 0x3a95a60, 0xc420114110, 0x17, 0xc4208b7c28)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubectl/proxy_server.go:201 +0x70
k8s.io/kubernetes/pkg/kubectl/cmd.RunProxy(0x3aa5ec0, 0xc42074e960, 0x3a7f1e0, 0xc42000c018, 0xc4201d7200, 0x0, 0x0)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/proxy.go:156 +0x774
k8s.io/kubernetes/pkg/kubectl/cmd.NewCmdProxy.func1(0xc4201d7200, 0xc4203586e0, 0x0, 0x2)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/proxy.go:79 +0x4f
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc4201d7200, 0xc420358500, 0x2, 0x2, 0xc4201d7200, 0xc420358500)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:603 +0x234
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc4202e4240, 0x5000107, 0x0, 0xffffffffffffffff)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:689 +0x2fe
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0xc4202e4240, 0xc42074e960, 0x3a7f1a0)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:648 +0x2b
k8s.io/kubernetes/cmd/kubectl/app.Run(0x0, 0x0)
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubectl/app/kubectl.go:39 +0xd5
main.main()
/private/tmp/kubernetes-cli-20170915-41661-iccjh1/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubectl/kubectl.go:26 +0x22
kubectl top nodes:
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
kubectl get svc --namespace=kube-system:
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.233.0.3 <none> 53/UDP,53/TCP 12d
kubernetes-dashboard 10.233.28.132 <none> 80/TCP 11d
curl http://localhost:8001/ui:
curl: (7) Failed to connect to 10.2.3.211 port 8001: Connection refused
How can I get the dashboard working? Appreciate your help.
You may be installing dashboard version 1.7. Try installing version 1.6.3; it's well tested.
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.6.3/src/deploy/kubernetes-dashboard.yaml
Update 10/2/17:
Can you try this: delete the current deployment and install version 1.6.3.
kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.6.3/src/deploy/kubernetes-dashboard.yaml
I believe the kubernetes dashboard is already available by default if you are deploying through GCP or Azure; the first error already indicates this. To verify, you may type the following commands to look for the pods/services in the kube-system namespace.
>kubectl get pods --namespace=kube-system
>kubectl get svc --namespace=kube-system
From the above commands, you should find the kubernetes dashboard already available, so you don't need to deploy it again. To access the dashboard, you can run the following command.
>kubectl proxy
This will make the Dashboard available at http://localhost:8001/ui on the machine where you type this command.
But to understand your problem better, may I know which version of Kubernetes you are using and in what environment? Also, it would be great if you could show me the results of these two commands.
>kubectl get pods --namespace=kube-system
>kubectl top nodes