Can not login to kubernetes dashboard dial tcp 172.17.0.6:8443: connect: connection refused - kubernetes

I deployed Kubernetes v1.15.2 dashboard successfully. Checking the cluster info:
$ kubectl cluster-info
Kubernetes master is running at http://172.19.104.231:8080
kubernetes-dashboard is running at http://172.19.104.231:8080/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
When I access the dashboard, the result is:
[root#ops001 ~]# curl -L http://172.19.104.231:8080/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Error: 'dial tcp 172.17.0.6:8443: connect: connection refused'
Trying to reach: 'https://172.17.0.6:8443/'
This is dashboard status:
[root#ops001 ~]# kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-74d7cc788-mk9c7 1/1 Running 0 92m
What should I do to access the dashboard? When I using proxy to access dashboard UI:
$ kubectl proxy --address='localhost' --port=8086 --accept-hosts='^*$'
Starting to serve on 127.0.0.1:8086
the result is:
[root#ops001 ~]# curl -L http://127.0.0.1:8086/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Error: 'dial tcp 172.17.0.6:8443: connect: connection refused'
Trying to reach: 'https://172.17.0.6:8443/'
What should I do to fix this problem?

I finally find the problem is kubernetes dashboard pod container does not communicate with the proxy nginx container. Because the proxy container is deployed before kubernetes flannel,and not in the same network.Trying to add proxy nginx container to flannel network would solve the problem.Check current flannel network:
[root#ops001 conf.d]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.30.0.0/16
FLANNEL_SUBNET=172.30.224.1/21
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
generate start parameter of docker:
./mk-docker-opts.sh -d /run/docker_opts.env -c
check the parameter:
[root#ops001 conf.d] cat /run/docker_opts.env
DOCKER_OPTS=" --bip=172.30.224.1/21 --ip-masq=false --mtu=1450"
add paramter to docker service:
# vim /lib/systemd/system/docker.service
EnvironmentFile=/run/docker_opts.env
ExecStart=/usr/bin/dockerd $DOCKER_OPTS -H fd://
restart docker and the container would join the flannel network,could communicate with each other:
systemctl daemon-reload
systemctl restart docker
hope this may help you!

Related

Recover kubernetes-dashboard pod?

I've installed kubernetes on Ubuntu 18.04.5 LTS (Bionic Beaver). And now try to run the kubernetes-dashboard. However it keeps crashing.
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-7b59f7d4df-wj7ts 1/1 Running 0 153m
kubernetes-dashboard-74d688b6bc-2c6m6 0/1 CrashLoopBackOff 32 153m
$ kubectl logs kubernetes-dashboard-74d688b6bc-2c6m6 -n kubernetes-dashboard
2021/04/23 08:32:26 Using namespace: kubernetes-dashboard
2021/04/23 08:32:26 Starting overwatch
2021/04/23 08:32:26 Using in-cluster config to connect to apiserver
2021/04/23 08:32:26 Using secret token for csrf signing
2021/04/23 08:32:26 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004e69e0)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:41 +0x446
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc00047a080)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:501 +0xc6
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc00047a080)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:469 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:550
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x20d
What did I miss or how can I recover from this?
TRY THE FOLLOWING
Check kubectl get svc | grep kubernetes and then kubectl describe svc kubernetes - Correct it if there is anything wrong here.
Define Hostname In /etc/hosts
#vim /etc/hosts
YOUR_IP HOSTNAME
Disable Firewall
Restart Kubelet
Restart Docker
Flush IP Tables

How can I get CoreDNS to resolve on my Raspberry Pi Kubernetes cluster?

I've followed a number of online tutorials to set up a Kubernetes cluster on four Raspberry Pi 4s. I ended up using Flannel as the networking plugin as that seems to be the only one that actually works on RPi, with a pod network CIDR of 10.244.0.0/16, per this guide from 2017. Most everything is working... all of the base pods in the kube-system namespace are running/healthy, and I can pull down images and launch new containers. At first I wasn't able to get any pod logs, but that was quickly remedied by opening up port 10250 on each node.
But there still seems to be a problem DNS resolution. I should clarify that DNS resolution on the hosts clearly does work, as the cluster is able to download any container image I specify. But once a container is running, it isn't able to "dial out" to anything. As a test, I'm running the arm32v7/buildpack-deps:latest container in a pod. It pulls the image from Docker hub just fine. But when I shell into it and simply type curl https://www.google.com it hangs before eventually timing out. And the same is true of any pod I launch that needs to interact with the external Internet: they hang and hang and hang.
Here are all the networking-related commands I've already run on each node:
sudo iptables -P FORWARD ACCEPT
sudo iptables -A FORWARD -i cni0 -j ACCEPT
sudo iptables -A FORWARD -o cni0 -j ACCEPT
sudo ufw allow ssh
sudo ufw allow 443 # can't remember why i ran this one
sudo ufw allow 6443
sudo ufw allow 8080 # this one might not be strictly necessary, either
sudo ufw allow 10250
sudo ufw default allow routed
sudo ufw enable
I'm not entirely sure that the last two iptables commands did anything; I grabbed them from the comment section of that guide I linked to earlier. I know that guide assumes one is using kube-dns but it's also 3 years old so I am using the (newer) default, coredns, instead.
What am I missing? I feel like I'm so close to having this cluster fully operational, but obviously I need functioning DNS!
UPDATE: I know that it's a DNS problem, and not general Internet connectivity, for two reasons: (1) the cluster itself can pull down any image I specify from Dockerhub, and (2) when I shell into a running container that has curl and execute curl -H "Host: www.google.com" 142.250.73.206, it successfully returns the Google homepage HTML. But as mentioned if I try and do my earlier curl command using the hostname, that times out.
Create a simple Pod to use as a test environment for DNS diagnosing:
apiVersion: v1
kind: Pod
metadata:
name: dnsutils
namespace: default
spec:
containers:
- name: dnsutils
image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
command:
- sleep
- "3600"
imagePullPolicy: IfNotPresent
restartPolicy: Always
kubectl apply -f dnsutils.yaml
Check the status of Pod
$ kubectl get pods dnsutils
NAME READY STATUS RESTARTS AGE
dnsutils 1/1 Running 0 <some-time>
Once that Pod is running, you can exec nslookup in that environment. If you see something like the following, DNS is working correctly.
$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10
Name: kubernetes.default
Address 1: 10.0.0.1
If the nslookup command fails, check the following:
Take a look inside the resolv.conf file.
kubectl exec -ti dnsutils -- cat /etc/resolv.conf
Verify that the search path and name server are set up like the following (note that search path may vary for different cloud providers):
search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.gce_project_id.internal
nameserver 10.0.0.10
options ndots:5
Errors such as the following indicate a problem with the CoreDNS (or kube-dns) add-on or with associated Services:
$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10
nslookup: can't resolve 'kubernetes.default'
OR
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'kubernetes.default'
Check if the DNS pod is running
$ kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
...
coredns-7b96bf9f76-5hsxb 1/1 Running 0 1h
coredns-7b96bf9f76-mvmmt 1/1 Running 0 1h
...
Check for errors in the DNS pod
Here is an example of a healthy CoreDNS log:
$ kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
2018/08/15 14:37:17 [INFO] CoreDNS-1.2.2
2018/08/15 14:37:17 [INFO] linux/amd64, go1.10.3, 2e322f6
CoreDNS-1.2.2
linux/amd64, go1.10.3, 2e322f6
2018/08/15 14:37:17 [INFO] plugin/reload: Running configuration MD5 = 24e6c59e83ce706f07bcc82c31b1ea1c
Verify that the DNS service is up by using the kubectl get service command.
$ kubectl get svc --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
...
kube-dns ClusterIP 10.0.0.10 <none> 53/UDP,53/TCP 1h
...
You can verify that DNS endpoints are exposed by using the kubectl get endpoints command.
$ kubectl get endpoints kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.180.3.17:53,10.180.3.17:53 1h
You can verify if queries are being received by CoreDNS by adding the log plugin to the CoreDNS configuration (aka Corefile). The CoreDNS Corefile is held in a ConfigMap named coredns. To edit it, use the command:
$ kubectl -n kube-system edit configmap coredns
Then add log in the Corefile section per the example below:
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
log
errors
health
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
upstream
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}
After saving the changes, it may take up to minute or two for Kubernetes to propagate these changes to the CoreDNS pods.
Next, make some queries and view the logs per the sections above in this document. If CoreDNS pods are receiving the queries, you should see them in the logs.
Here is an example of a query in the log:
.:53
2018/08/15 14:37:15 [INFO] CoreDNS-1.2.0
2018/08/15 14:37:15 [INFO] linux/amd64, go1.10.3, 2e322f6
CoreDNS-1.2.0
linux/amd64, go1.10.3, 2e322f6
2018/09/07 15:29:04 [INFO] plugin/reload: Running configuration MD5 = 162475cdf272d8aa601e6fe67a6ad42f
2018/09/07 15:29:04 [INFO] Reloading complete
172.17.0.18:41675 - [07/Sep/2018:15:29:11 +0000] 59925 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd,ra 106 0.000066649s
As pointed out in the comments: The configuration of kubeadm seems fine.
Your pods have the correct /etc/resolv.conf and they should work.
It's pretty hard to clarily determine the problem - many things can be happend here.
My guess: There something not right with ufw.
You can easily proof it: Disable ufw on all nodes (with ufw disable).
I'm not hundred percent sure which ports are needed. I'm using iptables for my single node k8s and at the start I had many problems FORWARD vs INPUT rules. In docker all ports are forwarded.
So I guess there is something wrong with FORWARD-rules and/or the dns-ports (53/udp and 53/tcp).
Good luck.

flannel restart very often

Flannel on node restarts always.
Log as follows:
root#debian:~# docker logs faa668852544
I0425 07:14:37.721766 1 main.go:514] Determining IP address of default interface
I0425 07:14:37.724855 1 main.go:527] Using interface with name eth0 and address 192.168.50.19
I0425 07:14:37.815135 1 main.go:544] Defaulting external address to interface address (192.168.50.19)
E0425 07:15:07.825910 1 main.go:241] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-arm-bg9rn': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-arm-bg9rn: dial tcp 10.96.0.1:443: i/o timeout
master configuration:
ubuntu: 16.04
node:
embedded system with debian rootfs(linux4.9).
kubernetes version:v1.14.1
docker version:18.09
flannel version:v0.11.0
I hope flannel run normal on node.
First, for flannel to work correctly, you must pass --pod-network-cidr=10.244.0.0/16 to kubeadm init.
kubeadm init --pod-network-cidr=10.244.0.0/16
Set /proc/sys/net/bridge/bridge-nf-call-iptables to 1 by running
sysctl net.bridge.bridge-nf-call-iptables=1
Next is to create the clusterrole and clusterrolebinding
as follows:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml

Kubernetes unable to retrieve logs

I have kubeadm cluster deployed in CentOS VM. while trying to deploy ingress controller following github i noticed that i'm unable to see logs:
kubectl logs -n ingress-nginx nginx-ingress-controller-697f7c6ddb-x9xkh --previous
Error from server: Get https://192.168.56.34:10250/containerLogs/ingress-nginx/nginx-ingress-controller-697f7c6ddb-x9xkh/nginx-ingress-controller?previous=true: dial tcp 192.168.56.34:10250: getsockopt: connection timed out
In 192.168.56.34 (node1) netstat returns:
tcp6 0 0 :::10250 :::* LISTEN 1068/kubelet
In fact i'm unable to see any logs despite the status of the pod.
I disabled both the firewalld and SELinux.
I used proxy to enable kubernertes to download images, now i removed the proxy.
When navigating to the url in the error above i get Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=proxy)
I'm also able to fetch my nodes:
kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 32d v1.9.3
k8s-node1 Ready <none> 30d v1.9.3
k8s-node2 NotReady <none> 32d v1.9.3
getsockopt: connection timed out
Is 99.99999% a firewall issue. If it was "connection refused" then showing the output of netstat would be meaningful, but (as you can see) kubelet is listening on that port just fine -- it's the networking configuration between the machine that is running kubectl and "192.168.56.34" that is incorrectly configured to allow traffic.
The apiserver expects that everyone who would want to view logs (or use kubectl exec) can reach that port on every Node in the cluster; so be sure you don't just fix the firewall rule(s) for that one Node -- fix it for all of them.
This message is from the apiserver running on your master. The command kubectl logs, running on your local machine, fetches logs via the apiserver. So the error message reveals a firewall misconfiguration between the master and the node(s) (port 10250)

Can't Connect to Kubernetes Service from Inside Service Pod?

I create a one-replica zookeeper + kafka cluster with the official kafka chart from the official incubator repo:
helm install --name mykafka -f kafka.yaml incubator/kafka
This gives me two pods:
kubectl get pods
NAME READY STATUS
mykafka-kafka-0 1/1 Running
mykafka-zookeeper-0 1/1 Running
And four services (in addition to the default kubernetes service)
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP
mykafka-kafka ClusterIP 10.108.143.59 <none> 9092/TCP
mykafka-kafka-headless ClusterIP None <none> 9092/TCP
mykafka-zookeeper ClusterIP 10.109.43.48 <none> 2181/TCP
mykafka-zookeeper-headless ClusterIP None <none> 2888/TCP,3888/TCP
If I shell into the zookeeper pod:
> kubectl exec -it mykafka-zookeeper-0 -- /bin/bash
I use the curl tool to test TCP connectivity. I expect a communications error as the server isn't using HTTP, but if curl can't even connect and I have to ctrl-C out, then the TCP connection isn't working.
I can access the local pod through curl localhost:2181:
root#mykafka-zookeeper-0:/# curl localhost:2181
curl: (52) Empty reply from server
I can access other pod through curl mykafka-kafka:9092:
root#mykafka-zookeeper-0:/# curl mykafka-kafka:9092
curl: (56) Recv failure: Connection reset by peer
But I can't access mykafka-zookeeper:2181. That name resolves to the cluster IP, but the attempt to TCP connect hangs until I ctrl-C:
root#mykafka-zookeeper-0:/# curl -v mykafka-zookeeper:2181
* Rebuilt URL to: mykafka-zookeeper:2181/
* Trying 10.109.43.48...
^C
Similarly, I can shell into the kafka pod:
> kubectl exec -it mykafka-kafka-0 -- /bin/bash
Connecting to the Zookeeper pod by the service name works fine:
root#mykafka-kafka-0:/# curl mykafka-zookeeper:2181
curl: (52) Empty reply from server
Connecting to localhost kafka works fine:
root#mykafka-kafka-0:/# curl localhost:9092
curl: (56) Recv failure: Connection reset by peer
But connecting to the Kafka pod by the service name doesn't work and I must ctrl-C the curl attempt:
curl -v mykafka-kafka:9092
* Rebuilt URL to: mykafka-kafka:9092/
* Hostname was NOT found in DNS cache
* Trying 10.108.143.59...
^C
Can anyone explain why using I can only connect to a Kubernetes service from outside the service and not from within the service?
I believe what you're experiencing can be resolved by looking at how your kubelet is set up to run. There is a setting you can toggle when starting up the kubelet called --hairpin-mode. By default this setting is set to the string promiscuous, where a pod can't connect to its own service, but you can change it to be hairpin-veth, which would allow a pod to connect to its own service.
There are a few issues on the topic, but this seems to be referenced the most:
https://github.com/kubernetes/kubernetes/issues/45790