I have a k8s service/deployment in a minikube cluster (named amq, in the default namespace):
D20181472:argo-k8s gms$ kubectl get svc --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argo argo-ui ClusterIP 10.97.242.57 <none> 80/TCP 5h19m
default amq LoadBalancer 10.102.205.126 <pending> 61616:32514/TCP 4m4s
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 5h23m
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 5h23m
I spun up infoblox/dnstools, and tried nslookup, dig and ping of amq.default with the following results:
dnstools# nslookup amq.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: amq.default.svc.cluster.local
Address: 10.102.205.126
dnstools# ping amq.default
PING amq.default (10.102.205.126): 56 data bytes
^C
--- amq.default ping statistics ---
28 packets transmitted, 0 packets received, 100% packet loss
dnstools# dig amq.default
; <<>> DiG 9.11.3 <<>> amq.default
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 15104
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;amq.default. IN A
;; Query time: 32 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Sat Jan 26 01:58:13 UTC 2019
;; MSG SIZE rcvd: 29
dnstools# ping amq.default
PING amq.default (10.102.205.126): 56 data bytes
^C
--- amq.default ping statistics ---
897 packets transmitted, 0 packets received, 100% packet loss
(NB: pinging the IP address directly gives the same result.)
I admittedly am not very knowledgeable about the deep workings of DNS, so I am not sure why I can do a lookup and dig for the hostname, but not ping it.
Because Service IP addresses are figments of your cluster's imagination, caused by either iptables or ipvs, and don't actually exist. You can see them with iptables -t nat -L -n on any Node that is running kube-proxy (or with ipvsadm -ln), as described on the helpful Debugging Services page.
Since they are not real IPs bound to actual NICs, they don't respond to any traffic other than the port numbers registered in the Service resource. The correct way of testing connectivity against a service is with something like curl or netcat and using the port number upon which you are expecting application traffic to travel.
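For example, from a debug pod that has netcat or curl available (the dnstools image may or may not ship them, so treat this as a sketch), you could verify the broker port of the Service above like this:
nc -vz amq.default 61616
curl -v telnet://amq.default:61616
Either one should report the TCP connection succeeding if the Service has healthy endpoints, even though ICMP ping never will.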
That’s because the service’s cluster IP is a virtual IP, and only has meaning when combined with the service port.
Whenever a service gets created by the API server, a virtual IP address is assigned to it immediately, and the API server then notifies all kube-proxy agents running on the worker nodes that a new Service has been created. It's then kube-proxy's job to make that service addressable on the node it's running on. kube-proxy does this by setting up a few iptables rules, which make sure each packet destined for the service IP/port pair is intercepted and its destination address modified, so the packet is redirected to one of the pods backing the service.
IPs and VIPs
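If you want to see that machinery for yourself, a rough sketch (run on a node; the ClusterIP is the one from the output above, and the chain name is the standard one kube-proxy creates in iptables mode):
sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.102.205.126
sudo ipvsadm -ln | grep -A 2 10.102.205.126
The first command applies to iptables mode, the second to ipvs mode; in both cases you should see the DNAT/virtual-server entries that redirect traffic for the Service port to the backing pod IPs.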
I have a pod named 'sample_pod' deployed in a Rancher cluster, with a container named 'sample_container'. The sample pod has a service named 'test'. Inside sample_container, if I try to resolve cluster domain names using the 'host', 'dig', or 'nslookup' commands, I always get 'connection timed out; no servers could be reached'.
I have coredns pods running inside my cluster
user@abc$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7fbff695b4-f7vxc 1/1 Running 0 21h
canal-928m6 2/2 Running 0 21h
canal-d7vjr 2/2 Running 0 20h
coredns-6f85d5fb88-9txmx 1/1 Running 0 21h
coredns-autoscaler-79599b9dc6-ndgfj 1/1 Running 0 21h
kube-multus-ds-769n6 1/1 Running 0 20h
metrics-server-8449844bf-jz66w 1/1 Running 0 21h
rke-coredns-addon-deploy-job-dlvlh 0/1 Completed 0 21h
rke-ingress-controller-deploy-job-jcj6w 0/1 Completed 0 21h
rke-metrics-addon-deploy-job-wnhbq 0/1 Completed 0 21h
rke-network-plugin-deploy-job-wzqfb 0/1 Completed 0 21h
whereabouts-p6vcc 1/1 Running 0 20h
I am not touching the default Corefile of coredns
Corefile:
.:53 {
log
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . "/etc/resolv.conf"
cache 30
loop
reload
loadbalance
}
/etc/hosts file of sample_container:
[root@sample_container]# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.42.1.18 sample_pod
# Entries added by HostAliases.
127.0.0.1 localhost
10.94.66.8 netboot.com
/etc/resolv.conf of sample_container:
[root@sample_container]# cat /etc/resolv.conf
nameserver 10.43.0.10
search default.svc.cluster.local svc.cluster.local cluster.local openstacklocal
options ndots:5
I used the host and dig commands to resolve the following domains and got these errors:
[root@sample_container]# ping 10.43.0.10
PING 10.43.0.10 (10.43.0.10) 56(84) bytes of data.
^C
--- 10.43.0.10 ping statistics ---
99 packets transmitted, 0 received, 100% packet loss, time 98003ms
[root@sample_container]# host kube-dns.kube-system
;; connection timed out; no servers could be reached
[root@sample_container]# host localhost
;; connection timed out; no servers could be reached
I tried to resolve the test service in the default namespace (sample_pod and its sample_container reside in the same namespace):
[root@sample_container]# host test
;; connection timed out; no servers could be reached
The dig and nslookup commands return the same error:
[root@sample_container]# nslookup localhost
;; connection timed out; no servers could be reached
[root@sample_container]# dig localhost
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.8 <<>> localhost
;; global options: +cmd
;; connection timed out; no servers could be reached
Additional information on the pod IP and service IP:
root@user$ kubectl get all -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/sample_pod 1/1 Running 0 177m 10.42.1.18 dsc-worker-node <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/test ClusterIP 10.43.19.85 <none> 80/TCP,443/TCP 177m role=test
Note: I deployed this pod in such a way that some containers access a bare-metal machine to serve their purpose, and I need to forward certain domain names to that bare-metal server, which will answer those DNS queries. I am aware of the forward plugin, which does this job. But even without touching the Corefile, I am unable to reach CoreDNS for cluster domain names at all.
Could someone help me to solve this issue? It would be really helpful for me. Thanks in advance!!!
I solved this issue by changing the route. By default, DNS queries were being sent to the Kubernetes nameserver via the private interface instead of via the default gateway (public interface). After changing the route so that DNS queries are sent via the default gateway, the problem was solved.
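For reference, the change was roughly of this shape (a hypothetical sketch: the gateway and interface are placeholders, and 10.43.0.0/16 is the default service CIDR in Rancher/RKE, which matches the 10.43.0.10 nameserver above):
ip route add 10.43.0.0/16 via <default-gateway-ip> dev <public-interface>
The idea is simply that traffic for the service CIDR (including the kube-dns ClusterIP) leaves via the interface where kube-proxy's rules can actually handle it.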
I recently created a private AKS cluster via Terraform and everything went OK. How is it possible that two pods within the same namespace are unable to communicate with each other?
AKS version= 1.19.11
coredns:1.6.6
# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 5d18h
Cluster has been created with below resources:
Network type (plugin)=Kubenet
Pod CIDR=10.x.x.x/16
Service CIDR=10.x.x.0/16
DNS service IP=10.x.x.10
Docker bridge CIDR=172.x.x.1/16
Network Policy=Calico
Ping response:
/ # ping 10.x.x.89
PING 10.x.x.89 (10.x.x.89): 56 data bytes
^C
--- 10.x.x.89 ping statistics ---
25 packets transmitted, 0 packets received, 100% packet loss
/ # ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1): 56 data bytes
64 bytes from 10.0.0.1: seq=0 ttl=241 time=27.840 ms
64 bytes from 10.0.0.1: seq=1 ttl=241 time=28.790 ms
64 bytes from 10.0.0.1: seq=2 ttl=241 time=28.725 ms
^C
--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 27.840/28.451/28.790 ms
/ # ping kubernetes
ping: bad address 'kubernetes'
/ # nslookup kubernetes
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'kubernetes': Name does not resolve
/ #
Network policy was the issue. Check with:
kubectl get netpol -n <namespace>
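To see which pods each policy selects and whether a default-deny is in place (namespace and policy name are placeholders), something like:
kubectl get networkpolicy --all-namespaces
kubectl describe networkpolicy <policy-name> -n <namespace>
A policy that selects your pods but does not allow the relevant ingress/egress will silently drop the traffic, which is exactly the symptom above.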
I have built a new Kubernetes cluster (v1.20.1, single master and single worker node) with the Calico CNI.
I deployed a busybox pod in the default namespace.
# kubectl get pods busybox -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 12m 10.203.0.129 node02 <none> <none>
nslookup not working
kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'kubernetes.default'
The cluster is running RHEL 8 with the latest updates.
I followed these steps: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
The nslookup command is not able to reach the nameserver:
# kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
resolv.conf file:
# kubectl exec -ti dnsutils -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
DNS pods running
# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-74ff55c5b-472vx 1/1 Running 1 85m
coredns-74ff55c5b-c75bq 1/1 Running 1 85m
DNS pod logs
kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
Service is defined
# kubectl get svc --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 86m
I can see the endpoints of the DNS pods:
# kubectl get endpoints kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.203.0.5:53,10.203.0.6:53,10.203.0.5:53 + 3 more... 86m
I enabled logging, but didn't see any traffic coming to the DNS pods:
# kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
I can ping the DNS pod:
# kubectl exec -i -t dnsutils -- ping 10.203.0.5
PING 10.203.0.5 (10.203.0.5): 56 data bytes
64 bytes from 10.203.0.5: seq=0 ttl=62 time=6.024 ms
64 bytes from 10.203.0.5: seq=1 ttl=62 time=6.052 ms
64 bytes from 10.203.0.5: seq=2 ttl=62 time=6.175 ms
64 bytes from 10.203.0.5: seq=3 ttl=62 time=6.000 ms
^C
--- 10.203.0.5 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 6.000/6.062/6.175 ms
nmap shows the ports as filtered:
# ke netshoot-6f677d4fdf-5t5cb -- nmap 10.203.0.5
Starting Nmap 7.80 ( https://nmap.org ) at 2021-01-15 22:29 UTC
Nmap scan report for 10.203.0.5
Host is up (0.0060s latency).
Not shown: 997 closed ports
PORT STATE SERVICE
53/tcp filtered domain
8080/tcp filtered http-proxy
8181/tcp filtered intermapper
Nmap done: 1 IP address (1 host up) scanned in 14.33 seconds
If I schedule the pod on the master node, nslookup works and nmap shows the port as open:
# ke netshoot -- bash
bash-5.0# nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
nmap -p 53 10.96.0.10
Starting Nmap 7.80 ( https://nmap.org ) at 2021-01-15 22:46 UTC
Nmap scan report for kube-dns.kube-system.svc.cluster.local (10.96.0.10)
Host is up (0.000098s latency).
PORT STATE SERVICE
53/tcp open domain
Nmap done: 1 IP address (1 host up) scanned in 0.14 seconds
Why is nslookup from a pod running on the worker node not working? How do I troubleshoot this issue?
I have rebuilt the server twice, and it's still the same issue.
Thanks
SR
Update: adding the kubeadm config file
# cat kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
nodeRegistration:
criSocket: unix:///run/containerd/containerd.sock
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
kubeletExtraArgs:
cgroup-driver: "systemd"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "master01:6443"
networking:
dnsDomain: cluster.local
podSubnet: 10.0.0.0/14
serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs
"
First of all, please note that according to both the Calico and kubeadm docs, CentOS/RHEL 7+ is what's supported.
By default, RHEL 8 uses nftables instead of iptables (we can still use iptables, but "iptables" on RHEL 8 is actually using the kernel's nft framework in the background; see "Running Iptables on RHEL 8").
9.2.1. nftables replaces iptables as the default network packet filtering framework
I believe that nftables may cause this network issues because as we can find on nftables adoption page:
Kubernetes does not support nftables yet.
Note: For now, I highly recommend using RHEL 7 instead of RHEL 8.
With that in mind, I'll present some information that may help you with RHEL8.
I have reproduced your issue and found a solution that works for me.
First, I opened the ports required by Calico; these ports can be found here under "Network requirements".
As a workaround:
Next, I reverted to the old iptables backend on all cluster nodes; you can easily do so by setting FirewallBackend in /etc/firewalld/firewalld.conf to iptables, as described here.
Finally, I restarted firewalld to make the new rules active.
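In concrete terms, the workaround amounted to something like this on each node (a sketch; the exact Calico port list depends on your setup, 179/tcp being the BGP port):
firewall-cmd --permanent --add-port=179/tcp
sed -i 's/^FirewallBackend=.*/FirewallBackend=iptables/' /etc/firewalld/firewalld.conf
systemctl restart firewalld
Restarting firewalld both applies the permanent port rule and switches it over to the iptables backend.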
I've tried nslookup from a Pod running on the worker node (kworker), and it seems to work correctly:
root@kmaster:~# kubectl get pod,svc -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/web 1/1 Running 0 112s 10.99.32.1 kworker <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/kubernetes ClusterIP 10.99.0.1 <none> 443/TCP 5m51s <none>
root@kmaster:~# kubectl exec -it web -- bash
root@web:/# nslookup kubernetes.default
Server: 10.99.0.10
Address: 10.99.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.99.0.1
root@web:/#
In my situation, we're using a K3s cluster, and a new agent couldn't make the default (ClusterFirst) DNS queries. After lots of research, I found I needed to change the kube-proxy cluster-cidr argument to make DNS work.
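If it helps, the change was roughly of this shape (a sketch, not my exact setup: K3s lets you pass flags straight through to kube-proxy, and 10.42.0.0/16 is the default K3s cluster CIDR):
k3s server --kube-proxy-arg=cluster-cidr=10.42.0.0/16
The same --kube-proxy-arg pass-through works on the agent start command if that's where your kube-proxy settings need to change.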
Hope this info is useful for others.
I ran into the same issue setting up a vanilla kubeadm 1.25 cluster on RHEL 8, and @matt_j's answer led me to another solution that avoids nftables: using ipvs mode in kube-proxy.
Just modify the kube-proxy ConfigMap in the kube-system namespace so that the config.conf file has this value:
...
data:
config.conf:
...
mode: "ipvs"
...
And ensure kube-proxy or your nodes are restarted.
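A minimal sketch of doing that (assuming the stock kubeadm kube-proxy DaemonSet; ipvs mode also needs the ip_vs kernel modules loaded on the nodes):
kubectl -n kube-system edit configmap kube-proxy
kubectl -n kube-system rollout restart daemonset kube-proxy
The edit sets mode: "ipvs" inside config.conf, and the rollout restart makes every kube-proxy pod pick up the new ConfigMap.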
I could not access my application from the k8s cluster.
With NodePort everything works. If I use the ingress controller, I can see that it is created successfully, and I am able to ping the IP, but if I try to telnet, it says connection refused, and I am unable to access the application. What am I missing? I do not see any exceptions in the ingress pod.
kubectl get ing -n test
NAME CLASS HOSTS ADDRESS PORTS AGE
web-ingress <none> * 192.168.0.102 80 44m
ping 192.168.0.102
PING 192.168.0.102 (192.168.0.102) 56(84) bytes of data.
64 bytes from 192.168.0.102: icmp_seq=1 ttl=64 time=0.795 ms
64 bytes from 192.168.0.102: icmp_seq=2 ttl=64 time=0.860 ms
64 bytes from 192.168.0.102: icmp_seq=3 ttl=64 time=0.631 ms
^C
--- 192.168.0.102 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2038ms
rtt min/avg/max/mdev = 0.631/0.762/0.860/0.096 ms
telnet 192.168.0.102 80
Trying 192.168.0.102...
telnet: Unable to connect to remote host: Connection refused
kubectl get all -n ingress-nginx
shows this
NAME READY STATUS RESTARTS AGE
pod/ingress-nginx-admission-create-htvkh 0/1 Completed 0 99m
pod/ingress-nginx-admission-patch-cf8gj 0/1 Completed 0 99m
pod/ingress-nginx-controller-7fd7d8df56-kll4v 1/1 Running 0 99m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ingress-nginx-controller NodePort 10.102.220.87 <none> 80:31692/TCP,443:32736/TCP 99m
service/ingress-nginx-controller-admission ClusterIP 10.106.159.230 <none> 443/TCP 99m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ingress-nginx-controller 1/1 1 1 99m
NAME DESIRED CURRENT READY AGE
replicaset.apps/ingress-nginx-controller-7fd7d8df56 1 1 1 99m
NAME COMPLETIONS DURATION AGE
job.batch/ingress-nginx-admission-create 1/1 7s 99m
job.batch/ingress-nginx-admission-patch 1/1 8s 99m
Answer
The IP from kubectl get ing -n test is not an externally accessible address that you should be using.
Your NGINX Ingress Controller Deployment has a Service deployed alongside it. You can use the external IP of this Service (if it has one) to hit your Ingress Controller.
Because your Service is of NodePort type (and does not show an external IP), you must address the Ingress Controller Pods through your cluster's Node IPs. You would need to track which Node the Pod is on, then find the Node's IP. Here is an example of doing this:
NODE=$(kubectl get pods -n ingress-nginx -o wide | grep "ingress-nginx-controller" | awk '{print $7}')
NODE_IP=$(kubectl get nodes "$NODE" -o wide | grep -w Ready | awk '{print $6}')
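With those two variables set, you can then hit the controller through the node, using the HTTP NodePort shown in your service output (31692 for port 80), for example:
curl -v http://$NODE_IP:31692/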
More Info
If your cluster is managed (i.e. GKE/Azure/AWS), you can use a LoadBalancer Service to provide an external IP to hit the Ingress Controller.
I exec into a dnsutils pod and ping stackoverflow.com:
/ # ping stackoverflow.com
ping: bad address 'stackoverflow.com'
The /etc/resolv.conf file looks fine from inside the pod:
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search weika.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
10.96.0.10 is the kube-dns service IP:
[root@test3 k8s]# kubectl -n kube-system get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 75d
The CoreDNS pods:
[root@test3 k8s]# kubectl -n kube-system get pod -o wide | grep core
coredns-6557d7f7d6-5nkv7 1/1 Running 0 10d 10.244.0.14 test3.weikayuninternal.com <none> <none>
coredns-6557d7f7d6-gtrgc 1/1 Running 0 10d 10.244.0.13 test3.weikayuninternal.com <none> <none>
When I change the nameserver IP to a CoreDNS pod IP, DNS resolution works:
/ # cat /etc/resolv.conf
nameserver 10.244.0.14
#nameserver 10.96.0.10
search weika.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # ping stackoverflow.com
PING stackoverflow.com (151.101.65.69): 56 data bytes
64 bytes from 151.101.65.69: seq=0 ttl=49 time=100.497 ms
64 bytes from 151.101.65.69: seq=1 ttl=49 time=101.014 ms
64 bytes from 151.101.65.69: seq=2 ttl=49 time=100.462 ms
64 bytes from 151.101.65.69: seq=3 ttl=49 time=101.465 ms
64 bytes from 151.101.65.69: seq=4 ttl=49 time=100.318 ms
^C
--- stackoverflow.com ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 100.318/100.751/101.465 ms
/ #
Why is this happening?
You have not mentioned how Kubernetes was installed. You should restart the CoreDNS pods using the command below:
kubectl -n kube-system rollout restart deployment coredns
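You can then confirm the rollout finished (assuming the deployment is named coredns, as it is in a standard kubeadm install):
kubectl -n kube-system rollout status deployment coredns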
This might only apply to you if there was trouble during either your initial installation of microk8s or enablement of the dns addon, but it might still be worth a shot. I've invested so much gd time in this I couldn't live with myself if I didn't at least share to help that one person out there.
In my case, the server I provisioned to set up a single-node cluster was too small - only 1GB of memory. When I was setting up microk8s for the first time and enabling all the addons I wanted (dns, ingress, hostpath-storage), I started running into problems that were remedied by just giving the server more memory. Unfortunately, though, screwing that up initially seems to have left the addons in some kind of undefined, partially initialized/configured state, such that everything appeared to be running normally as best I could tell (i.e. CoreDNS was deployed and ready, and the kube-dns service showed CoreDNS's ClusterIP as its backend endpoint), but none of my pods could resolve any DNS names, internal or external to the cluster, and I would get annoying event logs when I ran kubectl describe <pod> suggesting there was a DNS issue of some kind.
What ended up fixing it was resetting the cluster (microk8s reset --destroy-storage) and then re-enabling all my addons (microk8s enable dns ingress hostpath-storage) now that I had enough memory to do so cleanly. After that, CoreDNS and the kube-dns service appeared ready just like before, but DNS queries actually worked like they should from within the pods running in the cluster.
tl;dr - Your dns addon might have been f'd up during cluster installation. Try resetting your cluster, re-enabling the addons, and re-deploying your resources.
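For reference, the sequence looks roughly like this (a sketch; adjust the addon list to whatever you actually use):
microk8s reset --destroy-storage
microk8s enable dns ingress hostpath-storage
microk8s status --wait-ready
microk8s kubectl -n kube-system get pods -l k8s-app=kube-dns
The last command just confirms CoreDNS came back up and is the pod answering behind the kube-dns service.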