Recently I have created private AKS via Terraform, every thing went OK, how is it possible that two pods within the same namespace are unable to communicate with each other?
AKS version= 1.19.11
coredns:1.6.6
# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 5d18h
Cluster has been created with below resources:
Network type (plugin)=Kubenet
Pod CIDR=10.x.x.x/16
Service CIDR=10.x.x.0/16
DNS service IP=10.x.x.10
Docker bridge CIDR=172.x.x.1/16
Network Policy=Calico
Ping response:
/ # ping 10.x.x.89
PING 10.x.x.89 (10.x.x.89): 56 data bytes
^C
--- 10.x.x.89 ping statistics ---
25 packets transmitted, 0 packets received, 100% packet loss
/ # ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1): 56 data bytes
64 bytes from 10.0.0.1: seq=0 ttl=241 time=27.840 ms
64 bytes from 10.0.0.1: seq=1 ttl=241 time=28.790 ms
64 bytes from 10.0.0.1: seq=2 ttl=241 time=28.725 ms
^C
--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 27.840/28.451/28.790 ms
/ # ping kubernetes
ping: bad address 'kubernetes'
/ # nslookup kubernetes
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'kubernetes': Name does not resolve
/ #
Network policy was the issue
Kubectl get netpol -n namespace
Related
I've noticed that even failed pods replays to ICMP pings ( pods in not Ready state ). Is there a way to configure CNI ( or Kubernetes ) in the way so failed pods didn't generate ICMP replies ?
#kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
multitool-1 1/1 Running 0 20m 172.17.0.3 minikube <none> <none>
multitool-2 0/1 ImagePullBackOff 0 20m 172.17.0.4 minikube <none> <none>
multitool-3 1/1 Running 0 3m9s 172.17.0.5 minikube <none> <none>
#kubectl exec multitool-3 -it bash
bash-5.0# ping 172.17.0.4
PING 172.17.0.4 (172.17.0.4) 56(84) bytes of data.
64 bytes from 172.17.0.4: icmp_seq=1 ttl=64 time=0.048 ms
64 bytes from 172.17.0.4: icmp_seq=2 ttl=64 time=0.107 ms
^C
--- 172.17.0.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1041ms
rtt min/avg/max/mdev = 0.048/0.077/0.107/0.029 ms
bash-5.0#
No, that's not how ICMP works. The kernel handles those, it only checks if the networking interface is operational, which it is regardless of how broken the container process might be.
Implemented on Oracle VM as per the document, but failover didn't work.
https://github.com/justmeandopensource/kubernetes/tree/master/kubeadm-ha-multi-master
Role FQDN IP OS RAM CPU
Load Balancer loadbalancer.example.com172.16.16.100 Ubuntu 20.04 1G 1
Master kmaster1.example.com 172.16.16.101 Ubuntu 20.04 2G 2
Master kmaster2.example.com 172.16.16.102 Ubuntu 20.04 2G 2
Worker kworker1.example.com 172.16.16.201 Ubuntu 20.04 1G 1
Below are the details before shutting down the kmaster1 and after.
root#kmaster2:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster1 Ready master 22h v1.19.2
kmaster2 Ready master 22h v1.19.2
kworker1 Ready <none> 22h v1.19.2
===>shutdown now on master1-========
root#kmaster2:~# kubectl get nodes
Error from server: etcdserver: request timed out
root#kmaster2:~# kubectl get nodes
Error from server: etcdserver: request timed out
root#kmaster2:~# ping 172.16.16.100
PING 172.16.16.100 (172.16.16.100) 56(84) bytes of data.
64 bytes from 172.16.16.100: icmp_seq=1 ttl=64 time=0.580 ms
64 bytes from 172.16.16.100: icmp_seq=2 ttl=64 time=0.716 ms
64 bytes from 172.16.16.100: icmp_seq=3 ttl=64 time=1.08 ms
^C
--- 172.16.16.100 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2053ms
rtt min/avg/max/mdev = 0.580/0.792/1.081/0.211 ms
root#kmaster2:~# kubectl get nodes
Unable to connect to the server: net/http: TLS handshake timeout
root#kmaster2:~# kubectl get nodes
Unable to connect to the server: net/http: TLS handshake timeout
I could not access my application from the k8s cluster.
With nodePort everything works. If I use ingress controller, I could see that it is created successfully. I am able to ping IP. If I try to telnet, it says connection refused. I am also unable to access the application. What do i miss? I do not see any exception in the ingress pod.
kubectl get ing -n test
NAME CLASS HOSTS ADDRESS PORTS AGE
web-ingress <none> * 192.168.0.102 80 44m
ping 192.168.0.102
PING 192.168.0.102 (192.168.0.102) 56(84) bytes of data.
64 bytes from 192.168.0.102: icmp_seq=1 ttl=64 time=0.795 ms
64 bytes from 192.168.0.102: icmp_seq=2 ttl=64 time=0.860 ms
64 bytes from 192.168.0.102: icmp_seq=3 ttl=64 time=0.631 ms
^C
--- 192.168.0.102 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2038ms
rtt min/avg/max/mdev = 0.631/0.762/0.860/0.096 ms
telnet 192.168.0.102 80
Trying 192.168.0.102...
telnet: Unable to connect to remote host: Connection refused
kubectl get all -n ingress-nginx
shows this
NAME READY STATUS RESTARTS AGE
pod/ingress-nginx-admission-create-htvkh 0/1 Completed 0 99m
pod/ingress-nginx-admission-patch-cf8gj 0/1 Completed 0 99m
pod/ingress-nginx-controller-7fd7d8df56-kll4v 1/1 Running 0 99m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ingress-nginx-controller NodePort 10.102.220.87 <none> 80:31692/TCP,443:32736/TCP 99m
service/ingress-nginx-controller-admission ClusterIP 10.106.159.230 <none> 443/TCP 99m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ingress-nginx-controller 1/1 1 1 99m
NAME DESIRED CURRENT READY AGE
replicaset.apps/ingress-nginx-controller-7fd7d8df56 1 1 1 99m
NAME COMPLETIONS DURATION AGE
job.batch/ingress-nginx-admission-create 1/1 7s 99m
job.batch/ingress-nginx-admission-patch 1/1 8s 99m
Answer
The IP from kubectl get ing -n test is not an externally accessible address that you should be using.
Your NGINX Ingress Controller Deployment has a Service deployed alongside it. You can use the external IP of this Service (if it has one) to hit your Ingress Controller.
Because your Service is of NodePort type (and does not show an external IP), you must address the Ingress Controller Pods through your cluster's Node IPs. You would need to track which Node the Pod is on, then find the Node's IP. Here is an example of doing this:
NODE=$(kubectl get pods -o wide | grep "ingress-nginx-controller" | awk {'print $8'})
NODE_IP=$(kubectl get nodes "$NODE" -o wide | grep Ready | awk {'print $7'})
More Info
If your cluster is managed (i.e. GKE/Azure/AWS), you can use a LoadBalancer Service to provide an external IP to hit the Ingress Controller.
In dnsutils pod exec ping stackoverflow.com
/ # ping stackoverflow.com
ping: bad address 'stackoverflow.com'
The /etc/resolve.conf file looks fine from inside the pod
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search weika.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
10.96.0.10 is the kube-dns service ip:
[root#test3 k8s]# kubectl -n kube-system get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 75d
core dns
[root#test3 k8s]# kubectl -n kube-system get pod -o wide | grep core
coredns-6557d7f7d6-5nkv7 1/1 Running 0 10d 10.244.0.14 test3.weikayuninternal.com <none> <none>
coredns-6557d7f7d6-gtrgc 1/1 Running 0 10d 10.244.0.13 test3.weikayuninternal.com <none> <none>
when I change the nameserver ip to coredns ip. resolve dns is ok.
/ # cat /etc/resolv.conf
nameserver 10.244.0.14
#nameserver 10.96.0.10
search weika.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # ping stackoverflow.com
PING stackoverflow.com (151.101.65.69): 56 data bytes
64 bytes from 151.101.65.69: seq=0 ttl=49 time=100.497 ms
64 bytes from 151.101.65.69: seq=1 ttl=49 time=101.014 ms
64 bytes from 151.101.65.69: seq=2 ttl=49 time=100.462 ms
64 bytes from 151.101.65.69: seq=3 ttl=49 time=101.465 ms
64 bytes from 151.101.65.69: seq=4 ttl=49 time=100.318 ms
^C
--- stackoverflow.com ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 100.318/100.751/101.465 ms
/ #
Why is it happening?
You have not mentioned how kubernetes was installed. You should restart coredns pods using below command.
kubectl -n kube-system rollout restart deployment coredns
This might only apply to you if there was trouble during either your initial installation of microk8s or enablement of the dns addon, but it might still be worth a shot. I've invested so much gd time in this I couldn't live with myself if I didn't at least share to help that one person out there.
In my case, the server I provisioned to set up a single-node cluster was too small - only 1GB of memory. When I was setting up microk8s for the first time and enabling all the addons I wanted (dns, ingress, hostpath-storage), I started running into problems that were remedied by just giving the server more memory. Unfortunately though, screwing that up initially seems to have left the addons in some kind of undefined, partially initialized/configured state, such that everything appeared to be running normally as best I could tell (i.e. CoreDNS was deployed and ready, and the kube-dns service showed CoreDNS's ClusterIP as it's backend endpoint) but none of my pods could resolve any DNS names, internal or external to the cluster, and I would get these annoying event logs when I ran kubectl describe <pod> suggesting there was a DNS issue of some kind.
What ended up fixing it is resetting the cluster (microk8s reset --destroy-storage) and then re-enabling all my addons (microk8s enable dns ingress hostpath-storage) now that I had enough memory to do so cleanly do so. After that, CoreDNS and the kube-dns service appeared ready just like before, but DNS queries actually worked like they should from within the pods running in the cluster.
tl;dr; - Your dns addon might have have been f'd up during cluster installation. Try resetting your cluster, re-enabling the addons, and re-deploying your resources.
I have a k8s service/deployment in a minikube cluster (name amq in default namespace:
D20181472:argo-k8s gms$ kubectl get svc --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argo argo-ui ClusterIP 10.97.242.57 <none> 80/TCP 5h19m
default amq LoadBalancer 10.102.205.126 <pending> 61616:32514/TCP 4m4s
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 5h23m
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 5h23m
I spun up infoblox/dnstools, and tried nslookup, dig and ping of amq.default with the following results:
dnstools# nslookup amq.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: amq.default.svc.cluster.local
Address: 10.102.205.126
dnstools# ping amq.default
PING amq.default (10.102.205.126): 56 data bytes
^C
--- amq.default ping statistics ---
28 packets transmitted, 0 packets received, 100% packet loss
dnstools# dig amq.default
; <<>> DiG 9.11.3 <<>> amq.default
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 15104
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;amq.default. IN A
;; Query time: 32 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Sat Jan 26 01:58:13 UTC 2019
;; MSG SIZE rcvd: 29
dnstools# ping amq.default
PING amq.default (10.102.205.126): 56 data bytes
^C
--- amq.default ping statistics ---
897 packets transmitted, 0 packets received, 100% packet loss
(NB: pinging the ip address directly gives the same result)
I admittedly am not very knowledgable about the deep workings of DNS, so I am not sure why I can do a lookup and dig for the hostname, but not ping it.
I admittedly am not very knowledgable about the deep workings of DNS, so I am not sure why I can do a lookup and dig for the hostname, but not ping it.
Because Service IP addresses are figments of your cluster's imagination, caused by either iptables or ipvs, and don't actually exist. You can see them with iptables -t nat -L -n on any Node that is running kube-proxy (or ipvsadm -ln), as is described by the helpful Debug[-ing] Services page
Since they are not real IPs bound to actual NICs, they don't respond to any traffic other than the port numbers registered in the Service resource. The correct way of testing connectivity against a service is with something like curl or netcat and using the port number upon which you are expecting application traffic to travel.
That’s because the service’s cluster IP is a virtual IP, and only has meaning when combined with the service port.
Whenever a service gets created by API server a Virtual IP address is assigned to it immediately and after that, the API server notifies all kube-proxy agents running on the worker nodes that a new Service has been created. Then, It's kube-proxy's job to make that service addressable on the node it’s running on. kube-proxy does this by setting up a few iptables rules, which make sure each packet destined for the service IP/port pair is intercepted and its destination address modified, so the packet is redirected to one of the pods backing the service.
IPs and VIPs