k3s debugging DNS resolution - kubernetes

I'm new to Kubernetes and I have some issues with DNS names in my k3s cluster, running on a PC with an ARM architecture.
I've tried to debug as the docs (https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/) suggest.
I installed k3s as follows:
sudo curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -
And applied the manifest for the debugging pod:
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
I've checked that the pod is running:
kubectl get pods dnsutils
and tried to run
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
and expected something like this:
Server: 10.0.0.10
Address 1: 10.0.0.10
Name: kubernetes.default
Address 1: 10.0.0.1
But I get:
;; connection timed out; no servers could be reached
command terminated with exit code 1
Any thoughts on how to debug this? It seems that I'm messing something up...
UPD. Tried to debug as Rancher suggests (https://docs.ranchermanager.rancher.io/v2.5/troubleshooting/other-troubleshooting-tips/dns):
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default
And here is the output:
If you don't see a command prompt, try pressing enter.
Address 1: 10.43.0.10
nslookup: can't resolve 'kubernetes.default'
pod "busybox" deleted
pod default/busybox terminated (Error)
So I tried the next step:
for p in $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name); do kubectl logs --namespace=kube-system $p; done
and the logs are:
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
.:53
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[INFO] plugin/reload: Running configuration SHA512 = b941b080e5322f6519009bb49349462c7ddb6317425b0f6a83e5451175b720703949e3f3b454a24e77f3ffe57fd5e9c6130e528a5a1dd00d9000e4afd6c1108d
CoreDNS-1.9.1
linux/arm64, go1.17.8, 4b597f8
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:39581->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:52272->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:41480->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:52059->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:46821->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:35222->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:38013->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:42222->8.8.8.8:53: i/o timeout
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:50612->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 4288512074117887106.1437335397389171032. HINFO: read udp 10.42.0.5:50341->8.8.8.8:53: i/o timeout
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
...
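These timeouts suggest CoreDNS starts fine but cannot reach its upstream resolver (8.8.8.8) from the pod network at all. A quick check to separate "CoreDNS is broken" from "pod egress is broken", reusing the dnsutils pod from above (just a sketch):
kubectl exec -i -t dnsutils -- nslookup google.com 8.8.8.8
kubectl exec -i -t dnsutils -- nslookup kubernetes.default 10.43.0.10
The first query bypasses CoreDNS and goes straight to the upstream; the second queries the CoreDNS service directly. If both time out, the pod network itself cannot pass traffic.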
UPD2
kubectl -n kube-system get cm coredns -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
    import /etc/coredns/custom/*.server
  NodeHosts: |
    192.168.0.103 ubuntu
kind: ConfigMap
metadata:
  annotations:
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/4yQwWrzMBCEX0Xs2fEf20nsX9BDybH02lMva2kdq1Z2g6SkBJN3L8IUCiVtbyNGOzvfzoAn90IhOmHQcKmgAIsJQc+wl0CD8wQaSr1t1PzKSilFIUiIix4JfRoXHQjtdZHTuafAlCgq488xUSi9wK2AybEFDXvhwR2e8QQFHCnh50ZkloTJCcf8lP6NTIqUyuCkNJiSp9LJP5czoLjryztTWB0uE2iYmvjFuVSFenJsHx6tFf41gvGY6Y0Eshz/9D2e0OSZfIJVvMZExwzusSf/I9SIcQQNvaG6a+r/XVdV7abBddPtsN9W66Eedi0N7aberM22zaHf6t0tcPsIAAD//8Ix+PfoAQAA
    objectset.rio.cattle.io/id: ""
    objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
    objectset.rio.cattle.io/owner-name: coredns
    objectset.rio.cattle.io/owner-namespace: kube-system
  creationTimestamp: "2022-09-23T09:06:05Z"
  labels:
    objectset.rio.cattle.io/hash: bce283298811743a0386ab510f2f67ef74240c57
  name: coredns
  namespace: kube-system
  resourceVersion: "315"
  uid: 33a8ccf6-511f-49c4-9752-424859d67d70
UPD3
kubectl -n kube-system get po -o wide
Output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-b96499967-sct84 1/1 Running 1 (17h ago) 20h 10.42.0.6 ubuntu <none> <none>
helm-install-traefik-crd-wrh5b 0/1 Completed 0 20h 10.42.0.3 ubuntu <none> <none>
helm-install-traefik-wx7s2 0/1 Completed 1 20h 10.42.0.5 ubuntu <none> <none>
local-path-provisioner-7b7dc8d6f5-qxjvs 1/1 Running 1 (17h ago) 20h 10.42.0.3 ubuntu <none> <none>
metrics-server-668d979685-ngbmr 1/1 Running 1 (17h ago) 20h 10.42.0.5 ubuntu <none> <none>
svclb-traefik-67fcd721-mz6sd 2/2 Running 2 (17h ago) 20h 10.42.0.2 ubuntu <none> <none>
traefik-7cd4fcff68-j74gd 1/1 Running 1 (17h ago) 20h 10.42.0.4 ubuntu <none> <none>
kubectl -n kube-system get svc
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP,9153/TCP 20h
metrics-server ClusterIP 10.43.178.64 <none> 443/TCP 20h
traefik LoadBalancer 10.43.36.41 192.168.0.103 80:30268/TCP,443:30293/TCP 20h

Actually, I found a workaround. When installing k3s, one should use the flag --flannel-backend=ipsec:
curl -sfL https://get.k3s.io | sh -s - server --write-kubeconfig-mode 644 --flannel-backend=ipsec
By default it uses --flannel-backend=vxlan. I've also tried --flannel-backend=host-gw,
but --flannel-backend=ipsec is what works well for me.
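Note that the flag is consumed by the installer, so switching the backend on an existing setup means re-running the installation. One way to do it cleanly, as a sketch (k3s-uninstall.sh is placed by the standard install script; this wipes the cluster state):
/usr/local/bin/k3s-uninstall.sh
curl -sfL https://get.k3s.io | sh -s - server --write-kubeconfig-mode 644 --flannel-backend=ipsec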

Related

Kubernetes: Can't Resolve Hostnames from within pods

New to Kubernetes, but I have used K3s a little in the past. I just set up a K8s cluster, and none of my pods can do DNS lookups, not even to google.com or to an internal domain.
I init'd with: --pod-network-cidr=10.244.0.0/16. MetalLB is installed (10.7.7.10-10.7.7.254), and the master and worker nodes have IPs 10.7.50.X/16 and 10.7.60.X/16 respectively. Flannel is set up with the default kube-flannel manifest: https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
So far it's just 1 master with 2 nodes.
Versions:
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:44:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:39:34Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
$ kubelet --version
Kubernetes v1.22.1
Troubleshooting commands:
$ kubectl describe service kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=CoreDNS
Annotations: prometheus.io/port: 9153
prometheus.io/scrape: true
Selector: k8s-app=kube-dns
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.96.0.10
IPs: 10.96.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 10.244.1.20:53,10.244.2.28:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 10.244.1.20:53,10.244.2.28:53
Port: metrics 9153/TCP
TargetPort: 9153/TCP
Endpoints: 10.244.1.20:9153,10.244.2.28:9153
Session Affinity: None
Events: <none>
$ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-84f8874d6d-jgvwk 1/1 Running 1 (115m ago) 21h 10.244.1.20 k-w-001 <none> <none>
coredns-84f8874d6d-qh2f4 1/1 Running 1 (115m ago) 21h 10.244.2.28 k-w-002 <none> <none>
etcd-k-m-001 1/1 Running 12 (15m ago) 2d22h 10.7.50.11 k-m-001 <none> <none>
kube-apiserver-k-m-001 1/1 Running 11 (15m ago) 2d22h 10.7.50.11 k-m-001 <none> <none>
kube-controller-manager-k-m-001 1/1 Running 12 (15m ago) 2d22h 10.7.50.11 k-m-001 <none> <none>
kube-flannel-ds-286dc 1/1 Running 10 (15m ago) 2d22h 10.7.50.11 k-m-001 <none> <none>
kube-flannel-ds-rbmhx 1/1 Running 6 (114m ago) 2d21h 10.7.60.11 k-w-001 <none> <none>
kube-flannel-ds-vjl7l 1/1 Running 4 (115m ago) 2d21h 10.7.60.12 k-w-002 <none> <none>
kube-proxy-948z8 1/1 Running 8 (15m ago) 2d22h 10.7.50.11 k-m-001 <none> <none>
kube-proxy-l7h64 1/1 Running 4 (115m ago) 2d21h 10.7.60.12 k-w-002 <none> <none>
kube-proxy-pqmsr 1/1 Running 4 (115m ago) 2d21h 10.7.60.11 k-w-001 <none> <none>
kube-scheduler-k-m-001 1/1 Running 12 (15m ago) 2d22h 10.7.50.11 k-m-001 <none> <none>
metrics-server-6dfddc5fb8-47mnb 0/1 Running 3 (115m ago) 2d20h 10.244.1.21 k-w-001 <none> <none>
$ kubectl logs --namespace=kube-system coredns-84f8874d6d-jgvwk
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.8.4
linux/amd64, go1.16.4, 053c4d5
$ kubectl logs --namespace=kube-system coredns-84f8874d6d-qh2f4
[INFO] plugin/ready: Still waiting on: "kubernetes"
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.8.4
linux/amd64, go1.16.4, 053c4d5
These were run seconds apart:
$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10:53
*** Can't find kubernetes.default: No answer
*** Can't find kubernetes.default: No answer
$ kubectl exec -ti busybox -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
Here are some more tests:
$ kubectl exec -ti busybox -- nslookup google.com
;; connection timed out; no servers could be reached
command terminated with exit code 1
$ kubectl exec -ti busybox -- nslookup google.com 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8:53
Non-authoritative answer:
Name: google.com
Address: 142.251.33.78
*** Can't find google.com: No answer
$ kubectl exec -ti busybox -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=116 time=6.437 ms
$ kubectl exec busybox -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
$ kubectl exec -ti busybox -- nslookup kubernetes.default 10.96.0.10
Server: 10.96.0.10
Address: 10.96.0.10:53
*** Can't find kubernetes.default: No answer
*** Can't find kubernetes.default: No answer
$ kubectl exec -ti busybox -- nslookup kubernetes.default 10.96.0.10
;; connection timed out; no servers could be reached
command terminated with exit code 1
I also noticed that the kube-dns service has its selector set to k8s-app=kube-dns and the coredns pods have the label k8s-app=kube-dns. Is this correct?
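(That pairing is exactly how a Service selects its backing pods. A quick way to confirm the selector actually resolves to endpoints, as a sketch:
$ kubectl get pods -n kube-system -l k8s-app=kube-dns
$ kubectl get endpoints kube-dns -n kube-system
If the endpoints list were empty, the selector and labels would not match.)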
The pods running in the kube-system namespace seem to have 2 different IP ranges: one uses the node's IP, and the other uses Flannel's.
I'm not sure what's happening here, being new to Kubernetes, but it appears the DNS pods or service are not working at all.
Edit:
Further info:
$ sudo ufw status
Status: inactive
The issue was actually Flannel. DNS queries worked fine until the nodes were restarted, and then all pod queries failed until the Flannel pods were restarted.
Man, this was a rabbit hole.
See: https://github.com/flannel-io/flannel/issues/1321
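For anyone landing here: restarting the Flannel pods just means deleting them so the DaemonSet recreates them. A sketch, assuming the app=flannel label from the standard kube-flannel manifest:
$ kubectl -n kube-system delete pod -l app=flannel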

k3s - networking between pods not working

I'm struggling with cross-communication between pods even though ClusterIP services are set up for them. All the pods are on the same master node and in the same namespace. In summary:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-744f4df6df-rxhph 1/1 Running 0 136m 10.42.0.31 raspberrypi <none> <none>
nginx-2-867f4f8859-csn48 1/1 Running 0 134m 10.42.0.32 raspberrypi <none> <none>
$ kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
nginx-service ClusterIP 10.43.155.201 <none> 80/TCP 136m app=nginx
nginx-service2 ClusterIP 10.43.182.138 <none> 85/TCP 134m app=nginx-2
where I can't curl http://nginx-service2:85 from within the nginx container, or vice versa... while I validated that this works from my Docker Desktop installation:
# docker desktop
root@nginx-7dc45fbd74-7prml:/# curl http://nginx-service2:85
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
nginx.org.<br/>
Commercial support is available at
nginx.com.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
# k3s
root@nginx-744f4df6df-rxhph:/# curl http://nginx-service2.pwk3spi-vraptor:85
curl: (6) Could not resolve host: nginx-service2.pwk3spi-vraptor
After googling the issue (and please correct me if I'm wrong), it seems like a CoreDNS issue, because looking at the logs I can see the timeout errors:
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
helm-install-traefik-qr2bd 0/1 Completed 0 153d
metrics-server-7566d596c8-nnzg2 1/1 Running 59 148d
svclb-traefik-kjbbr 2/2 Running 60 153d
traefik-758cd5fc85-wzjrn 1/1 Running 20 62d
local-path-provisioner-6d59f47c7-4hvf2 1/1 Running 72 148d
coredns-7944c66d8d-gkdp4 1/1 Running 0 3m47s
$ kubectl logs coredns-7944c66d8d-gkdp4 -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 1c648f07b77ab1530deca4234afe0d03
CoreDNS-1.6.9
linux/arm, go1.14.1, 1766568
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:50482->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:34160->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:53485->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:46642->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:55329->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:44471->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:49182->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:54082->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:48151->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:48599->192.168.8.109:53: i/o timeout
where people recommended
changing the CoreDNS ConfigMap to forward to your master node's IP:
... other Corefile stuff
forward . <host server IP>
... other Corefile stuff
or adding your CoreDNS ClusterIP as a nameserver to /etc/resolv.conf:
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.42.0.38
nameserver 192.168.8.1
nameserver fe80::266:19ff:fea7:85e7%wlan0
However, I didn't find that these solutions worked.
Details for reference:
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
raspberrypi Ready master 153d v1.18.9+k3s1 192.168.8.109 <none> Raspbian GNU/Linux 10 (buster) 5.10.9-v7l+ containerd://1.3.3-k3s2
$ kubectl get svc -n kube-system -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP,9153/TCP 153d k8s-app=kube-dns
metrics-server ClusterIP 10.43.205.8 <none> 443/TCP 153d k8s-app=metrics-server
traefik-prometheus ClusterIP 10.43.222.138 <none> 9100/TCP 153d app=traefik,release=traefik
traefik LoadBalancer 10.43.249.133 192.168.8.109 80:31222/TCP,443:32509/TCP 153d app=traefik,release=traefik
$ kubectl get ep kube-dns -n kube-system
NAME ENDPOINTS AGE
kube-dns 10.42.0.38:53,10.42.0.38:9153,10.42.0.38:53 153d
No idea where I'm going wrong, whether I focused on the wrong stuff, or how to continue. Any help will be much appreciated, please.
When all else fails... go back to the manual. I tried finding the 'issue' in all the wrong places, when I just had to follow Rancher's installation documentation for k3s (sigh).
Rancher's documentation is very good (you just have to actually follow it): it states that when installing k3s on Raspbian Buster environments you need to switch to legacy iptables.
To check the version:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description: Raspbian GNU/Linux 10 (buster)
Release: 10
Codename: buster
To make the switch, they state to run (link):
sudo iptables -F
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo reboot
Note: when switching iptables, do it directly on the Pi, not via SSH; you will be kicked out.
After doing this, all my services were happy and could curl each other from within the containers via their defined ClusterIP service names, etc.
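As a sanity check, you can confirm which iptables variant is active after the reboot; the suffix in the version string names the backend (the version number here is just illustrative):
$ sudo iptables --version
iptables v1.8.2 (legacy)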
For anyone who doesn't want to waste 3 hours like me: on CentOS using k3s, you need to disable the firewall for those services to call each other.
https://rancher.com/docs/k3s/latest/en/advanced/#additional-preparation-for-red-hat-centos-enterprise-linux
It is recommended to turn off firewalld:
systemctl disable firewalld --now
If enabled, it is required to disable nm-cloud-setup and reboot the node:
systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
reboot
After I disabled it, the services were able to call each other through DNS names in my config.
I'm still looking for a better way than disabling the firewall, but that depends on the developers of the k3s project. (A less drastic alternative is sketched below.)
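One less drastic option, assuming the default k3s CIDRs (10.42.0.0/16 for pods, 10.43.0.0/16 for services), is to keep firewalld running and trust those ranges instead; a sketch:
sudo firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16
sudo firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16
sudo firewall-cmd --reload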
Is there a reason why you're trying to curl this address:
curl http://nginx-service2.pwk3spi-vraptor:85
Shouldn't this be just:
curl http://nginx-service2:85
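For context: a bare service name resolves within the same namespace, and the fully qualified form is <service>.<namespace>.svc.cluster.local. So the dotted variant only works if pwk3spi-vraptor really is the service's namespace, e.g. (illustrative):
curl http://nginx-service2.pwk3spi-vraptor.svc.cluster.local:85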
In my case, I followed the Rancher docs:
The nodes need to be able to reach other nodes over UDP port 8472 when the Flannel VXLAN backend is used, or over UDP ports 51820 and 51821 (when using IPv6) when the Flannel WireGuard backend is used.
I just opened the UDP port in Oracle Cloud and it works.
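Where the filtering is done by iptables on the node rather than a cloud security list, the equivalent rule would look something like this (a sketch for the VXLAN port only; it is not persisted across reboots):
sudo iptables -A INPUT -p udp --dport 8472 -j ACCEPT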

K8s no internet connection inside the container

I installed a clean K8s cluster in virtual machines (Debian 10). After the installation and the integration into my landscape, I checked the connectivity inside my testing alpine image. The result: outgoing connections did not work, and there was no information in the CoreDNS log. As a workaround, I overwrote /etc/resolv.conf in my build image and replaced the DNS entries (e.g. set 1.1.1.1 as nameserver). After that temporary "hack", the connection to the internet works perfectly. But the workaround is not a long-term solution, and I want to use the official way. In the K8s CoreDNS documentation I found the forward section, and I interpret that flag as an option to forward queries to the predefined local resolver. I think the forwarding to the local resolv.conf, and hence the resolution process, is not working correctly. Can anyone help me solve this issue?
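For reference, the forward plugin can also be pointed at explicit upstreams instead of the node's /etc/resolv.conf. A sketch of what that Corefile line could look like (whether hardcoding upstreams is acceptable depends on the environment):
forward . 1.1.1.1 8.8.8.8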
Basic setup:
K8s version: 1.19.0
K8s setup: 1 master + 2 worker nodes
Based on: Debian 10 VM's
CNI: Flannel
Status of CoreDNS Pods
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-xxxx 1/1 Running 1 26h
kube-system coredns-yyyy 1/1 Running 1 26h
CoreDNS Log:
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
CoreDNS config:
apiVersion: v1
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}
kind: ConfigMap
metadata:
creationTimestamp: ""
name: coredns
namespace: kube-system
resourceVersion: "219"
selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
uid: xxx
Output from the alpine image:
/ # nslookup -debug google.de
;; connection timed out; no servers could be reached
Output of pods resolv.conf
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search development.svc.cluster.local svc.cluster.local cluster.local invalid
options ndots:5
Output of host resolv.conf
cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 213.136.95.11
nameserver 213.136.95.10
search invalid
Output of host /run/flannel/subnet.env
cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Output of kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-54694b8f47-4sm4t 1/1 Running 0 14d 10.244.1.48 xxx3-node-1 <none> <none>
coredns-54694b8f47-6c7zh 1/1 Running 0 14d 10.244.0.43 xxx2-master <none> <none>
coredns-54694b8f47-lcthf 1/1 Running 0 14d 10.244.2.88 xxx4-node-2 <none> <none>
etcd-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-apiserver-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-controller-manager-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-flannel-ds-amd64-4w8zl 1/1 Running 8 28d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-flannel-ds-amd64-w7m44 1/1 Running 7 28d xxx.xx.xx.xxx xxx3-node-1 <none> <none>
kube-flannel-ds-amd64-xztqm 1/1 Running 6 28d xxx.xx.xx.xxx xxx4-node-2 <none> <none>
kube-proxy-dfs85 1/1 Running 4 28d xxx.xx.xx.xxx xxx4-node-2 <none> <none>
kube-proxy-m4hl2 1/1 Running 4 28d xxx.xx.xx.xxx xxx3-node-1 <none> <none>
kube-proxy-s7p4s 1/1 Running 8 28d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-scheduler-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
Problem:
The (two) CoreDNS pods were deployed only on the master node. You can check the placement with this command:
kubectl get pods -n kube-system -o wide | grep coredns
Solution:
I could solve the problem by scaling up the CoreDNS pods and editing the deployment configuration. The following commands must be executed:
kubectl edit deployment coredns -n kube-system
Set the replicas value to the number of nodes, e.g. 3.
kubectl patch deployment coredns -n kube-system -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"force-update/updated-at\":\"$(date +%s)\"}}}}}"
kubectl get pods -n kube-system -o wide | grep coredns
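Equivalently, the replica count can be set non-interactively instead of going through the editor; a sketch:
kubectl -n kube-system scale deployment coredns --replicas=3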
Source
https://blog.dbi-services.com/kubernetes-dns-resolution-using-coredns-force-update-deployment/
Hint
If you still have problems with your CoreDNS and your DNS resolution only works sporadically, take a look at this post.

Setting up Kubernetes - API not reachable from Pods

I'm trying to set up a basic Kubernetes cluster on an (Ubuntu 16) VM. I've just followed the getting-started docs and would expect a working cluster, but unfortunately, no such luck: no pods seem to be able to connect to the Kubernetes API. Since I'm new to Kubernetes, it is very tough for me to find where things are going wrong.
Provision script:
apt-get update && apt-get upgrade -y
apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl docker.io
apt-mark hold kubelet kubeadm kubectl
swapoff -a
sysctl net.bridge.bridge-nf-call-iptables=1
kubeadm init
mkdir -p /home/ubuntu/.kube
cp -i /etc/kubernetes/admin.conf /home/ubuntu/.kube/config
chown -R ubuntu:ubuntu /home/ubuntu/.kube
runuser -l ubuntu -c "kubectl apply -f \"https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')\""
runuser -l ubuntu -c "kubectl taint nodes --all node-role.kubernetes.io/master-"
Installation seems fine.
ubuntu@packer-Ubuntu-16:~$ kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-86c58d9df4-lbp46 0/1 CrashLoopBackOff 7 18m 10.32.0.2 packer-ubuntu-16 <none> <none>
kube-system coredns-86c58d9df4-t8nnn 0/1 CrashLoopBackOff 7 18m 10.32.0.3 packer-ubuntu-16 <none> <none>
kube-system etcd-packer-ubuntu-16 1/1 Running 0 17m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-apiserver-packer-ubuntu-16 1/1 Running 0 18m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-controller-manager-packer-ubuntu-16 1/1 Running 0 17m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-proxy-dwhhf 1/1 Running 0 18m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-scheduler-packer-ubuntu-16 1/1 Running 0 17m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system weave-net-sfvz5 2/2 Running 0 18m 145.100.100.100 packer-ubuntu-16 <none> <none>
Question: is it normal that the Kubernetes pods have the IP of the host's eth0 (145.100.100.100)? It seems weird to me; I would expect them to have a virtual IP.
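(For what it's worth: control-plane and CNI pods usually run with hostNetwork: true, which would explain the node IP. A quick check on one of them, as a sketch:
kubectl -n kube-system get pod weave-net-sfvz5 -o jsonpath='{.spec.hostNetwork}'
)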
As you can see the coredns pod is crashing, because, well, it cannot reach the API.
This is, as I understand it, the service:
ubuntu@packer-Ubuntu-16:~$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 22m
CoreDNS is crashing because the API is unreachable:
ubuntu@packer-Ubuntu-16:~$ kubectl logs -n kube-system coredns-86c58d9df4-lbp46
.:53
2018-12-06T12:54:28.481Z [INFO] CoreDNS-1.2.6
2018-12-06T12:54:28.481Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
E1206 12:54:53.482269 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:318: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1206 12:54:53.482363 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:311: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1206 12:54:53.482540 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I tried launching a simple alpine pod/container, and indeed 10.96.0.1 doesn't respond to pings or anything else.
I'm stuck here. I've googled a lot, but nothing comes up and my understanding is pretty basic. I guess something's up with the networking, but I don't know what (it seems suspicious to me that the pods show up with the host IP in get pods, but perhaps this is normal too?).
I found that the problem is caused by the host's iptables rules.
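The answer doesn't say which rules, but a frequent culprit in similar single-node setups is the FORWARD chain policy being DROP (Docker sets it that way by default), which silently blocks pod-to-service traffic. A sketch for checking and temporarily relaxing it:
sudo iptables -L FORWARD -n | head -1
sudo iptables -P FORWARD ACCEPT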

Name or service not known [tcp://redis-slave:6379]

Running the Guestbook Kubernetes app with the Kube-Sky add-on. On the guestbook app page I get this error in the JavaScript console:
Fatal error: Uncaught exception 'Predis\Connection\ConnectionException' with message 'php_network_getaddresses: getaddrinfo failed: Name or service not known [tcp://redis-slave:6379]' in /vendor/predis/predis/lib/Predis/Connection/AbstractConnection.php:141
The following are all the changes (vs. the provided templates) relating to DNS that I can find in my setup:
diff -r sample-configs/unmodified/cloud-init/node.yaml sample-configs/defaults/cloud-init/node.yaml
88a91,92
> --cluster_dns=10.100.88.88 \
> --cluster_domain=cluster.local \
diff -r sample-configs/unmodified/skydns-controller.yaml sample-configs/defaults/skydns-controller.yaml
11c11
< replicas: {{ pillar['dns_replicas'] }}
---
> replicas: 3
50c50,51
< - -domain={{ pillar['dns_domain'] }}
---
> - -domain=cluster.local
> - -kube_master_url=http://$(KUBERNETES_MASTER_IPV4):8080
62c63
< - -domain={{ pillar['dns_domain'] }}.
---
> - -domain=cluster.local.
91c92
< - -cmd=nslookup kubernetes.default.svc.{{ pillar['dns_domain'] }} 127.0.0.1 >/dev/null
---
> - -cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
diff -r sample-configs/unmodified/skydns-service.yaml sample-configs/defaults/skydns-service.yaml
13c13,14
< clusterIP: {{ pillar['dns_server'] }}
---
> clusterIP: 10.100.88.88
> type: NodePort
17a19
> nodePort:
20a23
> nodePort:
\ No newline at end of file
./sample-configs/unmodified/cloud-init/master.yaml: --service-cluster-ip-range=10.100.0.0/16 \
NAMESPACE NAME READY STATUS RESTARTS AGE NODE
default frontend-jw0ud 1/1 Running 0 48m $publicip.23
default frontend-mwu18 1/1 Running 0 48m $publicip.23
default frontend-o33ei 1/1 Running 0 48m $publicip.26
default redis-master-ubpga 1/1 Running 0 46m $publicip.23
default redis-slave-7aqp9 1/1 Running 0 46m $publicip.97
default redis-slave-w6rn3 1/1 Running 0 46m $publicip.26
default redis-slave-wny9v 1/1 Running 0 46m $publicip.26
kube-system kube-dns-v9-jek26 4/4 Running 0 50m $publicip.23
kube-system kube-dns-v9-ua150 4/4 Running 0 50m $publicip.26
kube-system kube-dns-v9-ycloq 4/4 Running 0 50m $publicip.97
NAMESPACE NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
default frontend 10.100.221.197 nodes 80/TCP name=frontend 46m
default kubernetes 10.100.0.1 <none> 443/TCP <none> 1h
default redis-master 10.100.151.114 <none> 6379/TCP name=redis-master 46m
default redis-slave 10.100.223.227 nodes 6379/TCP name=redis-slave 46m
kube-system kube-dns 10.100.88.88 nodes 53/UDP,53/TCP k8s-app=kube-dns 46m
NAMESPACE CONTROLLER CONTAINER(S) IMAGE(S) SELECTOR REPLICAS AGE
default frontend php-redis kubernetes/example-guestbook-php-redis:v2 name=frontend 3 48m
default redis-master master redis name=redis-master 1 46m
default redis-slave worker kubernetes/redis-slave:v2 name=redis-slave 3 46m
kube-system kube-dns-v9 etcd gcr.io/google_containers/etcd:2.0.9 k8s-app=kube-dns 3 50m
kube2sky gcr.io/google_containers/kube2sky:1.11
skydns gcr.io/google_containers/skydns:2015-10-13-8c72f8c
healthz gcr.io/google_containers/exechealthz:1.0
$ rkubectl exec busybox -- nslookup kubernetes
Server: 10.100.88.88
Address 1: 10.100.88.88
nslookup: can't resolve 'kubernetes'
error: error executing remote command: Error executing command in container: Error executing in Docker Container: 1
$ rkubectl exec busybox -- ping -w 1 10.100.88.88
PING 10.100.88.88 (10.100.88.88): 56 data bytes
--- 10.100.88.88 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
error: error executing remote command: Error executing command in container: Error executing in Docker Container: 1
$ rkubectl exec busybox -- route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.17.42.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
$ rkubectl exec busybox -- ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:1A
inet addr:172.17.0.26 Bcast:0.0.0.0 Mask:255.255.0.0
$ rkubectl exec busybox -- cat /etc/resolv.conf
nameserver 10.100.88.88
nameserver 8.8.8.8
nameserver 8.8.4.4
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5