Kubernetes DNS lookup not working from worker node - connection timed out; no servers could be reached

I have built a new Kubernetes v1.20.1 cluster with a single master and a single worker node, using the Calico CNI.
I deployed the busybox pod in the default namespace.
# kubectl get pods busybox -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 12m 10.203.0.129 node02 <none> <none>
nslookup is not working:
kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'kubernetes.default'
The cluster is running RHEL 8 with the latest updates.
I followed these steps: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
The nslookup command is not able to reach the nameserver:
# kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
resolv.conf file:
# kubectl exec -ti dnsutils -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
DNS pods running
# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-74ff55c5b-472vx 1/1 Running 1 85m
coredns-74ff55c5b-c75bq 1/1 Running 1 85m
DNS pod logs
kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
Service is defined
# kubectl get svc --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 86m
I can see the endpoints of the DNS pods:
# kubectl get endpoints kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.203.0.5:53,10.203.0.6:53,10.203.0.5:53 + 3 more... 86m
I enabled logging, but didn't see any traffic reaching the DNS pods:
# kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
I can ping the DNS pod:
# kubectl exec -i -t dnsutils -- ping 10.203.0.5
PING 10.203.0.5 (10.203.0.5): 56 data bytes
64 bytes from 10.203.0.5: seq=0 ttl=62 time=6.024 ms
64 bytes from 10.203.0.5: seq=1 ttl=62 time=6.052 ms
64 bytes from 10.203.0.5: seq=2 ttl=62 time=6.175 ms
64 bytes from 10.203.0.5: seq=3 ttl=62 time=6.000 ms
^C
--- 10.203.0.5 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 6.000/6.062/6.175 ms
nmap shows the port as filtered:
# ke netshoot-6f677d4fdf-5t5cb -- nmap 10.203.0.5
Starting Nmap 7.80 ( https://nmap.org ) at 2021-01-15 22:29 UTC
Nmap scan report for 10.203.0.5
Host is up (0.0060s latency).
Not shown: 997 closed ports
PORT STATE SERVICE
53/tcp filtered domain
8080/tcp filtered http-proxy
8181/tcp filtered intermapper
Nmap done: 1 IP address (1 host up) scanned in 14.33 seconds
If I schedule the pod on the master node, nslookup works and nmap shows the port as open:
# ke netshoot -- bash
bash-5.0# nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
nmap -p 53 10.96.0.10
Starting Nmap 7.80 ( https://nmap.org ) at 2021-01-15 22:46 UTC
Nmap scan report for kube-dns.kube-system.svc.cluster.local (10.96.0.10)
Host is up (0.000098s latency).
PORT STATE SERVICE
53/tcp open domain
Nmap done: 1 IP address (1 host up) scanned in 0.14 seconds
Why is nslookup not working from a pod running on the worker node? How can I troubleshoot this issue?
I rebuilt the servers twice and still have the same issue.
Thanks
SR
Update: adding the kubeadm config file
# cat kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  kubeletExtraArgs:
    cgroup-driver: "systemd"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "master01:6443"
networking:
  dnsDomain: cluster.local
  podSubnet: 10.0.0.0/14
  serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"

First of all, please note that both the Calico and kubeadm documentation state that they support CentOS/RHEL 7+.
By default RHEL 8 uses nftables instead of iptables (we can still use iptables, but "iptables" on RHEL 8 actually uses the kernel's nft framework in the background; see "Running Iptables on RHEL 8").
9.2.1. nftables replaces iptables as the default network packet filtering framework
I believe that nftables may be causing these network issues because, as the nftables adoption page states:
Kubernetes does not support nftables yet.
Note: For now I highly recommend using RHEL 7 instead of RHEL 8.
With that in mind, I'll present some information that may help you with RHEL 8.
I have reproduced your issue and found a solution that works for me.
As a workaround:
First I opened the ports required by Calico; these ports are listed under "Network requirements" in the Calico documentation.
Next I reverted to the old iptables backend on all cluster nodes. You can easily do so by setting FirewallBackend to iptables in /etc/firewalld/firewalld.conf.
Finally I restarted firewalld to make the new rules active.
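The FirewallBackend change described above can be sketched as a small text transformation. This is only an illustration in Python: the function name and the sample contents are made up for the example, and it works on a string rather than touching /etc/firewalld/firewalld.conf, so you would still have to write the result back and restart firewalld yourself.

```python
import re

# Matches an active (uncommented) FirewallBackend line. [ \t]* is used
# instead of \s* so the match never spans more than one line.
_BACKEND_RE = re.compile(r"^[ \t]*FirewallBackend[ \t]*=.*$", re.MULTILINE)


def set_firewall_backend(conf_text, backend="iptables"):
    """Return conf_text with FirewallBackend set to `backend`.

    An existing FirewallBackend line is replaced; if there is none,
    a new line is appended. The input string itself is not modified.
    """
    new_line = "FirewallBackend=" + backend
    if _BACKEND_RE.search(conf_text):
        return _BACKEND_RE.sub(lambda _m: new_line, conf_text)
    separator = "" if (not conf_text or conf_text.endswith("\n")) else "\n"
    return conf_text + separator + new_line + "\n"


# Hypothetical sample contents, not a real firewalld.conf.
sample = "DefaultZone=public\nFirewallBackend=nftables\n"
print(set_firewall_backend(sample), end="")
```

Working on a string keeps the sketch safe to run anywhere; applying it to the real file needs root and should be done on every node, as the answer says.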
I've tried nslookup from Pod running on worker node (kworker) and it seems to work correctly.
root@kmaster:~# kubectl get pod,svc -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/web 1/1 Running 0 112s 10.99.32.1 kworker <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/kubernetes ClusterIP 10.99.0.1 <none> 443/TCP 5m51s <none>
root@kmaster:~# kubectl exec -it web -- bash
root@web:/# nslookup kubernetes.default
Server: 10.99.0.10
Address: 10.99.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.99.0.1
root@web:/#

In my situation, we're using a K3s cluster, and the new agent couldn't make DNS queries with the default (ClusterFirst) policy. After a lot of research, I found that I needed to change the kube-proxy cluster-cidr argument to make DNS work.
I hope this info is useful for others.

I ran into the same issue setting up a vanilla kubeadm 1.25 cluster on RHEL 8, and @matt_j's answer led me to another solution that avoids nftables by using ipvs mode in kube-proxy.
Just modify the kube-proxy ConfigMap in the kube-system namespace so that the config.conf file has this value:
...
data:
  config.conf:
    ...
    mode: "ipvs"
    ...
Then ensure kube-proxy or your nodes are restarted.
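To double-check that the edit took, one option is to read the mode back out of the config.conf text. The helper below is a rough Python sketch with a made-up name; it assumes you have already fetched the config.conf contents from the ConfigMap as a string, and it only looks for an unindented mode: line rather than parsing YAML properly.

```python
import re


def kube_proxy_mode(config_conf):
    """Return the top-level `mode` value from kube-proxy config.conf text.

    Returns an empty string when the field is missing or set to "".
    Only an unindented `mode:` line is considered, so nested keys are
    ignored. This is a line-based check, not a YAML parser.
    """
    match = re.search(r"^mode:[ \t]*(.*)$", config_conf, re.MULTILINE)
    if match is None:
        return ""
    # Drop surrounding whitespace and optional quotes around the value.
    return match.group(1).strip().strip("\"'")


# Hypothetical excerpt of a config.conf for illustration.
sample = (
    "apiVersion: kubeproxy.config.k8s.io/v1alpha1\n"
    "kind: KubeProxyConfiguration\n"
    'mode: "ipvs"\n'
)
print(kube_proxy_mode(sample))
```

An empty result means kube-proxy falls back to its platform default rather than ipvs, so the ConfigMap edit did not land where expected.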

Related

K8 DNS not resolving

My K8s DNS isn't resolving, so I followed the debugging steps mentioned here. As I am new to K8s, can someone point me to the issue I am facing? I can't extract any useful information from the debugging steps.
cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:02:01Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
kubectl get namespace
NAME STATUS AGE
default Active 7d4h
kubectl get pods dnsutils
NAME READY STATUS RESTARTS AGE
dnsutils 1/1 Running 18 18h
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
kubectl exec -ti dnsutils -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-74ff55c5b-6vsml 1/1 Running 12 7d4h
coredns-74ff55c5b-mww7g 1/1 Running 12 7d4h
kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
[INFO] Reloading
[INFO] plugin/health: Going into lameduck mode for 5s
[INFO] plugin/reload: Running configuration MD5 = 3d3f6363f05ccd60e0f885f0eca6c5ff
[INFO] Reloading complete
[INFO] 10.244.0.1:16732 - 59651 "HINFO IN 6307445054232439722.7934820194057826263. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.006053527s
[INFO] 127.0.0.1:58672 - 59651 "HINFO IN 6307445054232439722.7934820194057826263. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.00658948s
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
[INFO] Reloading
[INFO] plugin/health: Going into lameduck mode for 5s
[INFO] plugin/reload: Running configuration MD5 = 3d3f6363f05ccd60e0f885f0eca6c5ff
[INFO] Reloading complete
[INFO] 10.244.0.62:56364 - 32900 "HINFO IN 2808379183970575835.6786373795048579500. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.004922932s
[INFO] 127.0.0.1:48277 - 32900 "HINFO IN 2808379183970575835.6786373795048579500. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.007889024s
[INFO] 10.244.0.62:49106 - 59651 "HINFO IN 6307445054232439722.7934820194057826263. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.005058199s
kubectl get svc --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 7d4h
monitoring-influxdb ClusterIP 10.102.51.183 <none> 8086/TCP 4d21h
kubectl get endpoints kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.244.0.45:53,10.244.0.47:53,10.244.0.45:53 + 3 more... 7d4h
cat /run/systemd/resolve/resolv.conf
nameserver 8.8.8.8
nameserver 2001:4860:4860::8888
cat /etc/systemd/resolved.conf
[Resolve]
DNS=8.8.8.8 2001:4860:4860::8888
cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad
It is kind of odd that the two resolv.conf files have different values. Also, I have no clue which IP to choose if I had to set the DNS IP manually.
kubeadm config view
apiServer:
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.20.5
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Update:
The IP assigned to the dnsutils pod is 10.244.2.20, and it is not reachable from the single K8s master node:
ping 10.244.2.20

There were several issues with my configuration. First off, I was using an incompatible Docker version (20.10.5), which isn't supported yet, so I don't know whether this issue also arises with a supported Docker version. However, even with this incompatible Docker version, I was able to fix the issue with the following steps:
1. DNS misconfiguration
I don't know who or what sets the DNS entries in resolved.conf, but my entry was clearly wrong. First, we need to obtain the Cluster-IP address of the K8s DNS service:
kubectl get services --all-namespaces -o wide
You will receive all services within all namespaces, including the kube-dns Cluster-IP. In my case it looks like the following:
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 11d k8s-app=kube-dns
kube-system monitoring-influxdb ClusterIP 10.102.51.183 <none> 8086/TCP 9d k8s-app=influxdb
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.110.126.218 <none> 8000/TCP 11d k8s-app=dashboard-metrics-scraper
kubernetes-dashboard kubernetes-dashboard ClusterIP 10.98.164.199 <none> 443/TCP 11d k8s-app=kubernetes-dashboard
Use that DNS IP in your resolved.conf file. Where that file is located depends on your OS; in my case (Ubuntu 20.04) it is /etc/systemd/resolved.conf.
nano /etc/systemd/resolved.conf
[Resolve]
DNS=10.96.0.10 8.8.8.8 2001:4860:4860::8888
2. Re-Join all nodes
I was using UFW alongside iptables, which was somehow messing with the configuration. Hence, I removed all nodes, installed a fresh OS, and re-joined the cluster without activating UFW.
3. Forward packet policy
In some versions, Docker modifies iptables such that packets are dropped in packet-forwarding scenarios. Override this behaviour on all nodes with:
iptables -P FORWARD ACCEPT
Just to be sure, also enable ipv4 forwarding with:
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
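Note that appending to /etc/sysctl.conf only takes effect after the file is reloaded (for example with sysctl -p) or after a reboot. To confirm the state after step 3, the two checks can be sketched in Python as below. The function names are made up for this example; they take the text of iptables -S output and the contents of /proc/sys/net/ipv4/ip_forward as strings, so nothing here needs root.

```python
def forward_policy(iptables_s_output):
    """Return the FORWARD chain policy found in `iptables -S` output.

    Policy lines have the form "-P <CHAIN> <TARGET>". Returns an empty
    string if no FORWARD policy line is present.
    """
    for line in iptables_s_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "-P" and parts[1] == "FORWARD":
            return parts[2]
    return ""


def ip_forward_enabled(proc_value):
    """Interpret the contents of /proc/sys/net/ipv4/ip_forward."""
    return proc_value.strip() == "1"


# Hypothetical captured output, for illustration only.
rules = "-P INPUT ACCEPT\n-P FORWARD DROP\n-P OUTPUT ACCEPT\n"
print(forward_policy(rules))
print(ip_forward_enabled("1\n"))
```

A FORWARD policy of DROP combined with no matching ACCEPT rules for the pod interfaces is the situation the iptables -P FORWARD ACCEPT command above is meant to undo.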

Which operating system are you using? I was using Red Hat Enterprise Linux and had a similar error.
I removed everything in /etc/resolv.conf, kept only the IP of the DNS server, and it worked.
Which network plugin are you using? For me Calico didn't work; I used kube-router with the above /etc/resolv.conf setting.
Thanks.

How can I get CoreDNS to resolve on my Raspberry Pi Kubernetes cluster?

I've followed a number of online tutorials to set up a Kubernetes cluster on four Raspberry Pi 4s. I ended up using Flannel as the networking plugin as that seems to be the only one that actually works on RPi, with a pod network CIDR of 10.244.0.0/16, per this guide from 2017. Most everything is working... all of the base pods in the kube-system namespace are running/healthy, and I can pull down images and launch new containers. At first I wasn't able to get any pod logs, but that was quickly remedied by opening up port 10250 on each node.
But there still seems to be a problem with DNS resolution. I should clarify that DNS resolution on the hosts clearly does work, as the cluster is able to download any container image I specify. But once a container is running, it isn't able to "dial out" to anything. As a test, I'm running the arm32v7/buildpack-deps:latest container in a pod. It pulls the image from Docker Hub just fine. But when I shell into it and simply type curl https://www.google.com it hangs before eventually timing out. And the same is true of any pod I launch that needs to interact with the external Internet: they hang and hang and hang.
Here are all the networking-related commands I've already run on each node:
sudo iptables -P FORWARD ACCEPT
sudo iptables -A FORWARD -i cni0 -j ACCEPT
sudo iptables -A FORWARD -o cni0 -j ACCEPT
sudo ufw allow ssh
sudo ufw allow 443 # can't remember why i ran this one
sudo ufw allow 6443
sudo ufw allow 8080 # this one might not be strictly necessary, either
sudo ufw allow 10250
sudo ufw default allow routed
sudo ufw enable
I'm not entirely sure that the last two iptables commands did anything; I grabbed them from the comment section of that guide I linked to earlier. I know that guide assumes one is using kube-dns but it's also 3 years old so I am using the (newer) default, coredns, instead.
What am I missing? I feel like I'm so close to having this cluster fully operational, but obviously I need functioning DNS!
UPDATE: I know that it's a DNS problem, and not general Internet connectivity, for two reasons: (1) the cluster itself can pull down any image I specify from Dockerhub, and (2) when I shell into a running container that has curl and execute curl -H "Host: www.google.com" 142.250.73.206, it successfully returns the Google homepage HTML. But as mentioned if I try and do my earlier curl command using the hostname, that times out.

Create a simple Pod to use as a test environment for diagnosing DNS:
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
kubectl apply -f dnsutils.yaml
Check the status of Pod
$ kubectl get pods dnsutils
NAME READY STATUS RESTARTS AGE
dnsutils 1/1 Running 0 <some-time>
Once that Pod is running, you can exec nslookup in that environment. If you see something like the following, DNS is working correctly.
$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10
Name: kubernetes.default
Address 1: 10.0.0.1
If the nslookup command fails, check the following:
Take a look inside the resolv.conf file.
kubectl exec -ti dnsutils -- cat /etc/resolv.conf
Verify that the search path and name server are set up like the following (note that search path may vary for different cloud providers):
search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.gce_project_id.internal
nameserver 10.0.0.10
options ndots:5
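To see why the search path and ndots matter for a short name like kubernetes.default, here is a rough Python sketch of how a resolver might expand it. It is a simplified model of glibc-style behaviour, not an exact reimplementation (other resolvers, such as the one in musl-based images, can differ), and the function names are invented for the example.

```python
def parse_resolv_conf(text):
    """Extract (nameservers, search_domains, ndots) from resolv.conf text."""
    nameservers, search, ndots = [], [], 1  # ndots defaults to 1
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) > 1:
            nameservers.append(parts[1])
        elif parts[0] == "search":
            search = parts[1:]
        elif parts[0] == "options":
            for option in parts[1:]:
                if option.startswith("ndots:"):
                    ndots = int(option.split(":", 1)[1])
    return nameservers, search, ndots


def candidate_names(name, search, ndots):
    """List the fully qualified names tried for `name`, in order."""
    if name.endswith("."):
        return [name]  # already absolute, search list is not used
    expanded = [name + "." + domain + "." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


conf = (
    "search default.svc.cluster.local svc.cluster.local cluster.local\n"
    "nameserver 10.96.0.10\n"
    "options ndots:5\n"
)
servers, domains, ndots = parse_resolv_conf(conf)
for candidate in candidate_names("kubernetes.default", domains, ndots):
    print(candidate)
```

With this configuration the second candidate, kubernetes.default.svc.cluster.local., is the one that names the API server Service, which is why a missing or wrong search line breaks short-name lookups even when the nameserver itself is reachable.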
Errors such as the following indicate a problem with the CoreDNS (or kube-dns) add-on or with associated Services:
$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10
nslookup: can't resolve 'kubernetes.default'
OR
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'kubernetes.default'
Check if the DNS pod is running
$ kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
...
coredns-7b96bf9f76-5hsxb 1/1 Running 0 1h
coredns-7b96bf9f76-mvmmt 1/1 Running 0 1h
...
Check for errors in the DNS pod
Here is an example of a healthy CoreDNS log:
$ kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
2018/08/15 14:37:17 [INFO] CoreDNS-1.2.2
2018/08/15 14:37:17 [INFO] linux/amd64, go1.10.3, 2e322f6
CoreDNS-1.2.2
linux/amd64, go1.10.3, 2e322f6
2018/08/15 14:37:17 [INFO] plugin/reload: Running configuration MD5 = 24e6c59e83ce706f07bcc82c31b1ea1c
Verify that the DNS service is up by using the kubectl get service command.
$ kubectl get svc --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
...
kube-dns ClusterIP 10.0.0.10 <none> 53/UDP,53/TCP 1h
...
You can verify that DNS endpoints are exposed by using the kubectl get endpoints command.
$ kubectl get endpoints kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.180.3.17:53,10.180.3.17:53 1h
You can verify if queries are being received by CoreDNS by adding the log plugin to the CoreDNS configuration (aka Corefile). The CoreDNS Corefile is held in a ConfigMap named coredns. To edit it, use the command:
$ kubectl -n kube-system edit configmap coredns
Then add log in the Corefile section per the example below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        log
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
After saving the changes, it may take up to a minute or two for Kubernetes to propagate these changes to the CoreDNS pods.
Next, make some queries and view the logs per the sections above in this document. If CoreDNS pods are receiving the queries, you should see them in the logs.
Here is an example of a query in the log:
.:53
2018/08/15 14:37:15 [INFO] CoreDNS-1.2.0
2018/08/15 14:37:15 [INFO] linux/amd64, go1.10.3, 2e322f6
CoreDNS-1.2.0
linux/amd64, go1.10.3, 2e322f6
2018/09/07 15:29:04 [INFO] plugin/reload: Running configuration MD5 = 162475cdf272d8aa601e6fe67a6ad42f
2018/09/07 15:29:04 [INFO] Reloading complete
172.17.0.18:41675 - [07/Sep/2018:15:29:11 +0000] 59925 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd,ra 106 0.000066649s
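When there are many log lines, it can help to pull out just the query entries. The following Python sketch parses lines shaped like the example above, and also the newer "[INFO] ..." form that appears in the logs earlier on this page. The regular expression is inferred from those sample lines only, so treat it as an assumption about the format rather than a specification of the log plugin.

```python
import re

# Pattern inferred from the sample log lines on this page.
QUERY_RE = re.compile(
    r'(?P<client>\S+):(?P<port>\d+) - '
    r'(?:\[[^\]]*\] )?'  # optional timestamp, seen in older CoreDNS output
    r'(?P<id>\d+) '
    r'"(?P<qtype>\S+) (?P<qclass>\S+) (?P<name>\S+) (?P<proto>\S+) [^"]*" '
    r'(?P<rcode>\S+) '
)


def parse_query_log(line):
    """Return a dict describing a CoreDNS query log line, or None."""
    match = QUERY_RE.search(line)
    if match is None:
        return None  # startup and reload messages are not query lines
    return {
        "client": match.group("client"),
        "qtype": match.group("qtype"),
        "name": match.group("name"),
        "proto": match.group("proto"),
        "rcode": match.group("rcode"),
    }


line = (
    '172.17.0.18:41675 - [07/Sep/2018:15:29:11 +0000] 59925 '
    '"A IN kubernetes.default.svc.cluster.local. udp 54 false 512" '
    'NOERROR qr,aa,rd,ra 106 0.000066649s'
)
print(parse_query_log(line))
```

If no line in the CoreDNS logs parses as a query coming from your pod's IP, the request is most likely being dropped before it reaches CoreDNS, which points at the network path rather than at CoreDNS itself.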

As pointed out in the comments, the kubeadm configuration seems fine.
Your pods have the correct /etc/resolv.conf, so they should work.
It's pretty hard to clearly determine the problem; many things could be happening here.
My guess: something is not right with ufw.
You can easily prove it: disable ufw on all nodes (with ufw disable).
I'm not one hundred percent sure which ports are needed. I'm using iptables for my single-node k8s, and at the start I had many problems with FORWARD vs INPUT rules. In Docker, all ports are forwarded.
So I guess there is something wrong with the FORWARD rules and/or the DNS ports (53/udp and 53/tcp).
Good luck.

Can't resolve dns from inside k8s pod

Running ping stackoverflow.com inside the dnsutils pod fails:
/ # ping stackoverflow.com
ping: bad address 'stackoverflow.com'
The /etc/resolv.conf file looks fine from inside the pod:
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search weika.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
10.96.0.10 is the kube-dns service IP:
[root@test3 k8s]# kubectl -n kube-system get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 75d
CoreDNS pods:
[root@test3 k8s]# kubectl -n kube-system get pod -o wide | grep core
coredns-6557d7f7d6-5nkv7 1/1 Running 0 10d 10.244.0.14 test3.weikayuninternal.com <none> <none>
coredns-6557d7f7d6-gtrgc 1/1 Running 0 10d 10.244.0.13 test3.weikayuninternal.com <none> <none>
When I change the nameserver IP to a CoreDNS pod IP, DNS resolution works:
/ # cat /etc/resolv.conf
nameserver 10.244.0.14
#nameserver 10.96.0.10
search weika.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # ping stackoverflow.com
PING stackoverflow.com (151.101.65.69): 56 data bytes
64 bytes from 151.101.65.69: seq=0 ttl=49 time=100.497 ms
64 bytes from 151.101.65.69: seq=1 ttl=49 time=101.014 ms
64 bytes from 151.101.65.69: seq=2 ttl=49 time=100.462 ms
64 bytes from 151.101.65.69: seq=3 ttl=49 time=101.465 ms
64 bytes from 151.101.65.69: seq=4 ttl=49 time=100.318 ms
^C
--- stackoverflow.com ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 100.318/100.751/101.465 ms
/ #
Why is it happening?

You have not mentioned how Kubernetes was installed. Try restarting the CoreDNS pods with the command below:
kubectl -n kube-system rollout restart deployment coredns

This might only apply to you if there was trouble during either your initial installation of microk8s or enablement of the dns addon, but it might still be worth a shot. I've invested so much gd time in this I couldn't live with myself if I didn't at least share to help that one person out there.
In my case, the server I provisioned to set up a single-node cluster was too small: only 1GB of memory. When I was setting up microk8s for the first time and enabling all the addons I wanted (dns, ingress, hostpath-storage), I started running into problems that were remedied by just giving the server more memory. Unfortunately though, screwing that up initially seems to have left the addons in some kind of undefined, partially initialized/configured state, such that everything appeared to be running normally as best I could tell (i.e. CoreDNS was deployed and ready, and the kube-dns service showed CoreDNS's ClusterIP as its backend endpoint) but none of my pods could resolve any DNS names, internal or external to the cluster, and I would get these annoying event logs when I ran kubectl describe <pod> suggesting there was a DNS issue of some kind.
What ended up fixing it was resetting the cluster (microk8s reset --destroy-storage) and then re-enabling all my addons (microk8s enable dns ingress hostpath-storage) now that I had enough memory to do so cleanly. After that, CoreDNS and the kube-dns service appeared ready just like before, but DNS queries actually worked like they should from within the pods running in the cluster.
tl;dr: Your dns addon might have been f'd up during cluster installation. Try resetting your cluster, re-enabling the addons, and re-deploying your resources.

"nslookup: read: Connection refused" from inside of a pod in Kubernetes (K8S) cluster (DNS problem)

Problem
I have a custom installation of a k8s cluster with 1 master and 1 node on AWS EC2, based on CentOS 7. It uses CoreDNS (the pods are running fine, with no errors in the logs).
Inside a node pod, calling e.g. nslookup google.com produces this output:
nslookup: write to '10.96.0.10': Connection refused
;; connection timed out; no servers could be reached
By contrast, ping 8.8.8.8 inside the pod works fine:
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=50 time=1.330 ms
/etc/resolv.conf inside the pod looks like this:
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5
From the node itself, nslookup google.com works fine:
Server: 172.31.0.2
Address: 172.31.0.2#53
Non-authoritative answer:
Name: google.com
Address: 172.217.15.110
Name: google.com
Address: 2607:f8b0:4004:801::200e
The kubelet config (kubectl get configmap kubelet-config-1.17 -n kube-system -o yaml) returns:
data:
  kubelet: |
    apiVersion: kubelet.config.k8s.io/v1beta1
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 0s
        enabled: true
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.crt
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 0s
        cacheUnauthorizedTTL: 0s
    clusterDNS:
    - 10.96.0.10
    clusterDomain: cluster.local
    cpuManagerReconcilePeriod: 0s
    evictionPressureTransitionPeriod: 0s
    fileCheckFrequency: 0s
    healthzBindAddress: 127.0.0.1
    healthzPort: 10248
    httpCheckFrequency: 0s
    imageMinimumGCAge: 0s
    kind: KubeletConfiguration
    nodeStatusReportFrequency: 0s
    nodeStatusUpdateFrequency: 0s
    rotateCertificates: true
    runtimeRequestTimeout: 0s
    staticPodPath: /etc/kubernetes/manifests
    streamingConnectionIdleTimeout: 0s
    syncFrequency: 0s
    volumeStatsAggPeriod: 0s
kind: ConfigMap
Pods in the kube-system namespace (kubectl get pods -n kube-system) look like this:
coredns-6955765f44-qdbgx 1/1 Running 6 11d
coredns-6955765f44-r4v7n 1/1 Running 6 11d
etcd-ip-172-31-42-121.ec2.internal 1/1 Running 7 11d
kube-apiserver-ip-172-31-42-121.ec2.internal 1/1 Running 7 11d
kube-controller-manager-ip-172-31-42-121.ec2.internal 1/1 Running 6 11d
kube-proxy-lrpd9 1/1 Running 6 11d
kube-proxy-z55cv 1/1 Running 6 11d
kube-scheduler-ip-172-31-42-121.ec2.internal 1/1 Running 6 11d
weave-net-bdn5n 2/2 Running 0 39h
weave-net-z7mks 2/2 Running 5 39h
UPDATE
From the pod if I do ip route it returns:
default via 10.32.0.1 dev eth0
10.32.0.0/12 dev eth0 scope link src 10.32.0.16
From master:
default via 172.31.32.1 dev eth0
10.32.0.0/12 dev weave proto kernel scope link src 10.32.0.1
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.31.32.0/20 dev eth0 proto kernel scope link src 172.31.42.121
From node:
default via 172.31.32.1 dev eth0
10.32.0.0/12 dev weave proto kernel scope link src 10.32.0.1
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.31.32.0/20 dev eth0 proto kernel scope link src 172.31.46.62
The CoreDNS ConfigMap (kubectl -n kube-system get configmap coredns -oyaml) is:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
So why doesn't nslookup google.com work inside a pod?

The installation of the k8s cluster was wrong: the Ansible script should contain the correct private IPs of the master and nodes on the EC2 VMs.
dev-kubernetes-master ansible_host=34.233.207.xxx private_ip=172.31.37.xx
dev-kubernetes-slave ansible_host=52.6.10.xxx private_ip=172.31.42.xxx
I've reinstalled the cluster with the correct private IPs specified (before, no private IP was specified at all) and the problem is gone.

coredns do not resolve service name correctly

I use Kubernetes v1.11.3, which uses CoreDNS to resolve host and service names, but I find that resolution does not work correctly inside a pod:
# kubectl get services --all-namespaces -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 50d <none>
kube-system calico-etcd ClusterIP 10.96.232.136 <none> 6666/TCP 50d k8s-app=calico-etcd
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 50d k8s-app=kube-dns
kube-system kubelet ClusterIP None <none> 10250/TCP 32d <none>
testalex grafana NodePort 10.96.51.173 <none> 3000:30002/TCP 2d app=grafana
testalex k8s-alert NodePort 10.108.150.47 <none> 9093:30093/TCP 13m app=alertmanager
testalex prometheus NodePort 10.96.182.108 <none> 9090:30090/TCP 16m app=prometheus
The following command gets no response:
# kubectl exec -it k8s-monitor-7ddcb74b87-n6jsd -n testalex /bin/bash
[root@k8s-monitor-7ddcb74b87-n6jsd /]# ping k8s-alert
PING k8s-alert.testalex.svc.cluster.local (10.108.150.47) 56(84) bytes of data.
and there is no CoreDNS log output:
# kubectl logs coredns-78fcdf6894-h78sd -n kube-system
I think something may be wrong, but I cannot locate the problem. Another question: why are both CoreDNS pods on the master node? I would expect one on each node.
UPDATE
It seems CoreDNS works fine, but I do not understand why the ping command gets no reply:
[root@k8s-monitor-7ddcb74b87-n6jsd yum.repos.d]# nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
[root@k8s-monitor-7ddcb74b87-n6jsd yum.repos.d]# cat /etc/resolv.conf
nameserver 10.96.0.10
search testalex.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
# kubectl get ep kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 192.168.121.3:53,192.168.121.4:53,192.168.121.3:53 + 1 more... 50d
Also, the DNS server cannot be reached:
# kubectl exec -it k8s-monitor-7ddcb74b87-n6jsd -n testalex /bin/bash
[root@k8s-monitor-7ddcb74b87-n6jsd /]# cat /etc/resolv.conf
nameserver 10.96.0.10
search testalex.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
[root@k8s-monitor-7ddcb74b87-n6jsd /]# ping 10.96.0.10
PING 10.96.0.10 (10.96.0.10) 56(84) bytes of data.
^C
--- 10.96.0.10 ping statistics ---
9 packets transmitted, 0 received, 100% packet loss, time 8000ms
I think maybe I misconfigured the network.
This is my cluster init command:
kubeadm init --kubernetes-version=v1.11.3 --apiserver-advertise-address=10.100.1.20 --pod-network-cidr=172.16.0.0/16
and this is the Calico IP pool setting:
# kubectl exec -it calico-node-77m9l -n kube-system /bin/sh
Defaulting container name to calico-node.
Use 'kubectl describe pod/calico-node-77m9l -n kube-system' to see all of the containers in this pod.
/ # cd /tmp
/tmp # ls
calicoctl tunl-ip
/tmp # ./calicoctl get ipPool
CIDR
192.168.0.0/16
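The outputs above show kubeadm initialised with --pod-network-cidr=172.16.0.0/16, a Calico pool of 192.168.0.0/16, and kube-dns endpoints in 192.168.121.x. Whether that mismatch is what breaks things here is not established in this thread, but checking it is cheap. A small Python sketch using the standard ipaddress module (the function name is made up for the example):

```python
import ipaddress


def ips_outside_cidr(ips, cidr):
    """Return the addresses from `ips` that are not inside `cidr`."""
    network = ipaddress.ip_network(cidr)
    return [ip for ip in ips if ipaddress.ip_address(ip) not in network]


# Endpoint IPs and CIDRs copied from the outputs in the question.
endpoints = ["192.168.121.3", "192.168.121.4"]
print(ips_outside_cidr(endpoints, "172.16.0.0/16"))   # kubeadm pod CIDR
print(ips_outside_cidr(endpoints, "192.168.0.0/16"))  # Calico IP pool
```

If pod IPs fall outside the CIDR the cluster was initialised with, components that rely on that CIDR may treat pod traffic differently from what the CNI expects, so it is worth making the two agree.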

You can start by checking whether DNS is working.
Run nslookup on kubernetes.default from inside the pod k8s-monitor-7ddcb74b87-n6jsd and check whether it works:
[root@k8s-monitor-7ddcb74b87-n6jsd /]# nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
If this returns output, CoreDNS is working. If the output is not okay, look into resolv.conf inside the pod k8s-monitor-7ddcb74b87-n6jsd; it should look something like this:
[root@metrics-master-2 /]# cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5
Finally, check that the CoreDNS endpoints are exposed:
kubectl get ep kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.180.3.17:53,10.180.3.17:53 1h
You can verify if queries are being received by CoreDNS by adding the log plugin to the CoreDNS configuration (aka Corefile). The CoreDNS Corefile is held in a ConfigMap named coredns.
Hope this helps.
EDIT:
You might be hitting this issue; please have a look:
https://github.com/kubernetes/kubeadm/issues/1056

You cannot always ping the IP address or hostname of a Service, since its cluster IP is a virtual IP.
A Service's cluster IP is a virtual IP and only has meaning when combined with the service port. You can try the same via an SRV record (a combination of virtual IP and port); see Kubernetes in Action by Marko Lukša.

Thanks for the answer. This is the output (the IPs are certainly not the real ones).
[root@master ~]# nslookup kubernetes.default
Server: 203.150.92.12
Address: 203.150.92.12#53
** server can't find kubernetes.default: NXDOMAIN
[root@master ~]# kubectl cluster-info
Kubernetes master is running at https://203.150.72.81:6443
coredns is running at https://203.150.72.81:6443/api/v1/namespaces/kube-system/services/coredns:dns/proxy
kubernetes-dashboard is running at https://203.150.72.81:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
metrics-server is running at https://203.150.72.81:6443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@master ~]# cat /etc/resolv.conf
search invalid
nameserver 203.150.92.12
nameserver 203.150.92.10
nameserver 1111:c207::2:55
[root@master ~]# kubectl get ep kube-dns --namespace=kube-system
Error from server (NotFound): endpoints "kube-dns" not found
[root@master ~]#

I think the reason you cannot get ping working is that iptables is used to redirect requests for the service cluster IP to the correct pods. The iptables rules only redirect traffic sent to the service cluster IP on the exported ports, so ICMP requests are never redirected to the real endpoints.
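Since ICMP to a cluster IP tells you nothing, a more meaningful reachability test is to open a connection to the Service on one of its actual ports. A minimal Python sketch, using only the standard socket module (the function name is made up for the example, and it covers TCP only, so a UDP-only problem would not show up here):

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

From inside a pod you would call, for instance, tcp_port_open("10.96.0.10", 53) to test the kube-dns Service on its TCP port, or tcp_port_open("10.108.150.47", 9093) for the k8s-alert Service from the question. A False result there is a real signal, unlike an unanswered ping.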