Pods cant reach kube-dns - kubernetes

I'm newbie at kubernetes.
I set up a local cluster with 1 master and 2 workers (worker1,worker2) using kubeadm and virtualbox.
I chose containerd as my Container Runtime.
I'm facing a issue with networking that it's driving me crazy.
I cant ping any outside address from pods because DNS is not resolving
I used the following to set up the cluster:
kubeadm init --apiserver-advertise-address=10.16.10.10 --apiserver-cert-extra-sans=10.16.10.10 --node-name=master0 --pod-network-cidr=10.244.0.0/16
Swap and SELinux are disabled.
I'm using flannel.
[masterk8s#master0 .kube]$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master0 Ready control-plane,master 3h26m v1.23.1 10.16.10.10 <none> CentOS Linux 7 (Core) 3.10.0-1160.49.1.el7.x86_64 containerd://1.4.12
worker1 Ready <none> 169m v1.23.1 10.16.10.11 <none> CentOS Linux 7 (Core) 3.10.0-1160.49.1.el7.x86_64 containerd://1.4.12
worker2 Ready <none> 161m v1.23.1 10.16.10.12 <none> CentOS Linux 7 (Core) 3.10.0-1160.49.1.el7.x86_64 containerd://1.4.12
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default pod/dnsutils 1/1 Running 1 (59m ago) 119m 10.244.3.2 worker1 <none> <none>
default pod/nginx 1/1 Running 0 11s 10.244.4.2 worker2 <none> <none>
kube-system pod/coredns-64897985d-lnzs7 1/1 Running 0 126m 10.244.0.2 master0 <none> <none>
kube-system pod/coredns-64897985d-vfngl 1/1 Running 0 126m 10.244.0.3 master0 <none> <none>
kube-system pod/etcd-master0 1/1 Running 1 (125m ago) 126m 10.16.10.10 master0 <none> <none>
kube-system pod/kube-apiserver-master0 1/1 Running 1 (125m ago) 126m 10.16.10.10 master0 <none> <none>
kube-system pod/kube-controller-manager-master0 1/1 Running 1 (125m ago) 126m 10.16.10.10 master0 <none> <none>
kube-system pod/kube-flannel-ds-6g4dm 1/1 Running 0 81m 10.16.10.12 worker2 <none> <none>
kube-system pod/kube-flannel-ds-lvgpf 1/1 Running 0 89m 10.16.10.11 worker1 <none> <none>
kube-system pod/kube-flannel-ds-pkm4k 1/1 Running 1 (125m ago) 126m 10.16.10.10 master0 <none> <none>
kube-system pod/kube-proxy-8gnfx 1/1 Running 0 89m 10.16.10.11 worker1 <none> <none>
kube-system pod/kube-proxy-cbws6 1/1 Running 0 81m 10.16.10.12 worker2 <none> <none>
kube-system pod/kube-proxy-fxvm5 1/1 Running 1 (125m ago) 126m 10.16.10.10 master0 <none> <none>
kube-system pod/kube-scheduler-master0 1/1 Running 1 (125m ago) 126m 10.16.10.10 master0 <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 126m <none>
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 126m k8s-app=kube-dns
cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
master:
[masterk8s#master0 .kube]$ ip r
default via 10.0.2.2 dev enp0s3
default via 10.16.10.1 dev enp0s9 proto static metric 102
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 100
10.16.10.0/24 dev enp0s9 proto kernel scope link src 10.16.10.10 metric 102
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.3.0/24 via 10.244.3.0 dev flannel.1 onlink
10.244.4.0/24 via 10.244.4.0 dev flannel.1 onlink
192.168.56.0/24 dev enp0s8 proto kernel scope link src 192.168.56.100 metric 101
worker1:
[workerk8s#worker1 ~]$ ip r
default via 10.0.2.2 dev enp0s3 proto dhcp metric 100
default via 10.16.10.1 dev enp0s9 proto static metric 102
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 100
10.16.10.0/24 dev enp0s9 proto kernel scope link src 10.16.10.11 metric 102
10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink
10.244.3.0/24 dev cni0 proto kernel scope link src 10.244.3.1
10.244.4.0/24 via 10.244.4.0 dev flannel.1 onlink
192.168.56.0/24 dev enp0s8 proto kernel scope link src 192.168.56.101 metric 101
I can reach kube-dns cluster-IP from master:
[masterk8s#master0 .kube]$ telnet 10.96.0.10 53
Trying 10.96.0.10...
Connected to 10.96.0.10.
Escape character is '^]'.
But cannot from worker:
[workerk8s#worker1 ~]$ telnet 10.96.0.10 53
Trying 10.96.0.10...
^C
I used dnsutils pod from kubernetes (https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/) to do some tests:
(This pod's been deployed on worker1 but same issue for worker2)
[masterk8s#master0 .kube]$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default
^C
command terminated with exit code 1
[masterk8s#master0 .kube]$ kubectl exec -i -t dnsutils -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local Home
nameserver 10.96.0.10
options ndots:5
There's connection between nodes. But pods on different nodes can't ping each other. Example:
default pod/dnsutils 1/1 Running 1 (59m ago) 119m 10.244.3.2 worker1 <none> <none>
default pod/nginx 1/1 Running 0 11s 10.244.4.2 worker2 <none> <none>
10.244.3.2 is only reachable from worker1 and 10.224.4.2 only reachable from worker2.
My guessing is there's something wrong with kube-proxy but don't know what it could be.
I can't see any errors in pod logs.
Any suggestions?
Thanks
EDITED:
SOLVED
Flannel was using wrong interface, as my nodes have 3 network interfaces, I specified the correct one with --iface
name: kube-flannel
image: quay.io/coreos/flannel:v0.15.1
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=enp0s9
Also realized firewalld was blocking requests to DNS, and solved that adding (How can I use Flannel without disabing firewalld (Kubernetes)):
firewall-cmd --add-masquerade --permanent

Related

Rancher: kubernetes cluster stuck in pending. "No route to host"

I built a kubernetes cluster on CentOS 8 first. I followed the how-to found here: https://www.tecmint.com/install-a-kubernetes-cluster-on-centos-8/
And then I built an Ubuntu 18.04 VM and installed Rancher on it. I can access the Rancher website just fine and all appears to be working on the rancher side, except I can't add my kubernetes cluster to it.
When I use the "Add Cluster" feature, I chose the "Other Cluster" option, give it a name, and then click create. I then copy the insecure "Cluster Registration Command" to the master node. It appears to take the command just fine.
In troubleshooting, I've issued the following command: kubectl -n cattle-system logs -l app=cattle-cluster-agent
The output I get is as follows:
INFO: Environment: CATTLE_ADDRESS=10.42.0.1 CATTLE_CA_CHECKSUM=94ad10e756d390cdf8b25465f938c04344a396b16b4ff6c0922b9cd6b9fc454c CATTLE_CLUSTER=true CATTLE_CLUSTER_REGISTRY= CATTLE_FEATURES= CATTLE_INTERNAL_ADDRESS= CATTLE_IS_RKE=false CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-7b9df685cf-9kr4p CATTLE_SERVER=https://192.168.188.189:8443
INFO: Using resolv.conf: nameserver 10.96.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local options ndots:5
ERROR: https://192.168.188.189:8443/ping is not accessible (Failed to connect to 192.168.188.189 port 8443: No route to host)
INFO: Environment: CATTLE_ADDRESS=10.40.0.0 CATTLE_CA_CHECKSUM=94ad10e756d390cdf8b25465f938c04344a396b16b4ff6c0922b9cd6b9fc454c CATTLE_CLUSTER=true CATTLE_CLUSTER_REGISTRY= CATTLE_FEATURES= CATTLE_INTERNAL_ADDRESS= CATTLE_IS_RKE=false CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-7bc7687557-tkvzt CATTLE_SERVER=https://192.168.188.189:8443
INFO: Using resolv.conf: nameserver 10.96.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local options ndots:5
ERROR: https://192.168.188.189:8443/ping is not accessible (Failed to connect to 192.168.188.189 port 8443: No route to host)
[root#k8s-master ~]# ping 192.168.188.189
PING 192.168.188.189 (192.168.188.189) 56(84) bytes of data.
64 bytes from 192.168.188.189: icmp_seq=1 ttl=64 time=0.432 ms
64 bytes from 192.168.188.189: icmp_seq=2 ttl=64 time=0.400 ms
^C
--- 192.168.188.189 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.400/0.416/0.432/0.016 ms
As you can see I'm getting a "No route to host" error message. But, I can ping the rancher VM using its IP address.
It appears to be attempting to use resolv.conf inside the cluster and looking to use a 10.96.0.10 to resolve the ip address of 192.168.188.189 (my Rancher VM). But it appears to be failing to resolve it.
I'm thinking I have some sort of DNS issue that's preventing me from using hostnames. Though I've edited the /etc/hosts file on the master and worker nodes to include entries for each of the devices. I can ping devices using their hostname, but I can't reach a pod using :. I get a "No route to host" error message when I try that too. See here:
[root#k8s-master ~]# ping k8s-worker1
PING k8s-worker1 (192.168.188.191) 56(84) bytes of data.
64 bytes from k8s-worker1 (192.168.188.191): icmp_seq=1 ttl=64 time=0.478 ms
64 bytes from k8s-worker1 (192.168.188.191): icmp_seq=2 ttl=64 time=0.449 ms
^C
--- k8s-worker1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.449/0.463/0.478/0.025 ms
[root#k8s-master ~]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-world NodePort 10.103.5.49 <none> 8080:30370/TCP 45m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 26h
nginx NodePort 10.97.172.245 <none> 80:30205/TCP 3h43m
[root#k8s-master ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hello-world-7884c6997d-2dc9z 1/1 Running 0 28m 10.40.0.4 k8s-worker3 <none> <none>
hello-world-7884c6997d-562lh 1/1 Running 0 28m 10.35.0.8 k8s-worker2 <none> <none>
hello-world-7884c6997d-78dmm 1/1 Running 0 28m 10.36.0.3 k8s-worker1 <none> <none>
hello-world-7884c6997d-7vt4f 1/1 Running 0 28m 10.40.0.6 k8s-worker3 <none> <none>
hello-world-7884c6997d-bpq5g 1/1 Running 0 49m 10.36.0.2 k8s-worker1 <none> <none>
hello-world-7884c6997d-c529d 1/1 Running 0 28m 10.35.0.6 k8s-worker2 <none> <none>
hello-world-7884c6997d-ddk7k 1/1 Running 0 28m 10.36.0.5 k8s-worker1 <none> <none>
hello-world-7884c6997d-fq8hx 1/1 Running 0 28m 10.35.0.7 k8s-worker2 <none> <none>
hello-world-7884c6997d-g5lxs 1/1 Running 0 28m 10.40.0.3 k8s-worker3 <none> <none>
hello-world-7884c6997d-kjb7f 1/1 Running 0 49m 10.35.0.3 k8s-worker2 <none> <none>
hello-world-7884c6997d-nfdpc 1/1 Running 0 28m 10.40.0.5 k8s-worker3 <none> <none>
hello-world-7884c6997d-nnd6q 1/1 Running 0 28m 10.36.0.7 k8s-worker1 <none> <none>
hello-world-7884c6997d-p6gxh 1/1 Running 0 49m 10.40.0.1 k8s-worker3 <none> <none>
hello-world-7884c6997d-p7v4b 1/1 Running 0 28m 10.35.0.4 k8s-worker2 <none> <none>
hello-world-7884c6997d-pwpxr 1/1 Running 0 28m 10.36.0.4 k8s-worker1 <none> <none>
hello-world-7884c6997d-qlg9h 1/1 Running 0 28m 10.40.0.2 k8s-worker3 <none> <none>
hello-world-7884c6997d-s89c5 1/1 Running 0 28m 10.35.0.5 k8s-worker2 <none> <none>
hello-world-7884c6997d-vd8ch 1/1 Running 0 28m 10.40.0.7 k8s-worker3 <none> <none>
hello-world-7884c6997d-wvnh7 1/1 Running 0 28m 10.36.0.6 k8s-worker1 <none> <none>
hello-world-7884c6997d-z57kx 1/1 Running 0 49m 10.36.0.1 k8s-worker1 <none> <none>
nginx-6799fc88d8-gm5ls 1/1 Running 0 4h11m 10.35.0.1 k8s-worker2 <none> <none>
nginx-6799fc88d8-k2jtw 1/1 Running 0 4h11m 10.44.0.1 k8s-worker1 <none> <none>
nginx-6799fc88d8-mc5mz 1/1 Running 0 4h12m 10.36.0.0 k8s-worker1 <none> <none>
nginx-6799fc88d8-qn6mh 1/1 Running 0 4h11m 10.35.0.2 k8s-worker2 <none> <none>
[root#k8s-master ~]# curl k8s-worker1:30205
curl: (7) Failed to connect to k8s-worker1 port 30205: No route to host
I suspect this is the underlying reason why I can't join the cluster to rancher.
EDIT: I want to add additional details to this question. Each of my nodes (master & worker nodes) have the following ports open on the firewall:
firewall-cmd --list-ports
6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 6783/tcp 6783/udp 6784/udp
For the CNI, the Kubernetes cluster is using Weavenet.
Each node (master & worker) is configured to use my main home DNS server (which is also an active directory domain controller) in their networking configuration. I've created AAA records for each node in the DNS server. The nodes are NOT joined to the domain. However, I've also edited each node's /etc/hosts file to contain the following records:
# more /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.188.190 k8s-master
192.168.188.191 k8s-worker1
192.168.188.192 k8s-worker2
192.168.188.193 k8s-worker3
I've found that I CAN use "curl k8s-worker1.mydomain.com:30370" with about 33% success. But I would have thought that the /etc/hosts file would take precedence over using my home DNS server.
And finally, I noticed an additional anomaly. I've discovered that the cluster is not load balancing across the three worker nodes. As shown above, I'm running a deployment called "hello-world" based on the bashofmann/rancher-demo image with 20 replicas. I've also created a nodeport service for hello-world that maps nodeport 30370 to port 8080 on each respective pod.
If I open my web browser and go to http://192.168.188.191:30370 then it'll load the website but only served up by pods on k8s-worker1. It'll never load the website served up by any pods on any of the other worker nodes. This would explain why I only get ~33% success, as long as it's served up by the same worker node that I've specified in my url.
OP confirmed that the issue is found to be due to firewall rules. This was debugged by disabling the firewall, that lead to desired operation(cluster addition) to be successful.
In order to nodePort service to work properly, port range 30000 - 32767 should be reachable on all the nodes of the cluster.
I also found that disabling the firewall "fixes" the issue, but that's not a great fix. Also, adding ports 30000-32767 for tcp/udp didn't work for me. Still no route to host.

K8s pods running in diffrent node can't communicate with each other

I have k8s cluster with two node, master and worker node, installed Calico.
I initialized cluster and installed calico with following commands
# Initialize cluster
kubeadm init --apiserver-advertise-address=<MatserNodePublicIP> --pod-network-cidr=192.168.0.0/16
# Install Calico. Refer to official document
# https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico-with-kubernetes-api-datastore-50-nodes-or-less
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
After that, I found pods running in different node can't communicate with each other, but pods running in same node can communicate with each other.
Here are my operations:
# With following command, I ran a nginx pod scheduled to worker node
# and assigned pod id 192.168.199.72
kubectl create nginx --image=nginx
# With following command, I ran a busybox pod scheduled to master node
# and assigned pod id 192.168.119.197
kubectl run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox sh
# In busybox bash, I executed following command
# '>' represents command output
wget 192.168.199.72
> Connecting to 192.168.199.72 (192.168.199.72:80)
> wget: can't connect to remote host (192.168.199.72): Connection timed out
However, if nginx pod run in master node (same as busybox), the wget would output a correct welcome html.
(For scheduling nginx pod to master node, I cordon worker node, and restarted nginx pod)
I also tried to schedule nginx and busybox pod to worker node, the wget ouput is a correct welcome html.
Here are my cluster status, everything looks fine. I searched all I can find but couldn't find solution.
matser and worker node can ping each other with private IP.
For firewall
systemctl status firewalld
> Unit firewalld.service could not be found.
For node infomation
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
pro-con-scrapydmanager Ready control-plane,master 26h v1.21.2 10.120.0.5 <none> CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://20.10.5
pro-con-scraypd-01 Ready,SchedulingDisabled <none>
For pod infomation
kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default busybox 0/1 Error 0 24h 192.168.199.72 pro-con-scrapydmanager <none> <none>
default nginx 1/1 Running 1 26h 192.168.119.197 pro-con-scraypd-01 <none> <none>
kube-system calico-kube-controllers-78d6f96c7b-msrdr 1/1 Running 1 26h 192.168.199.77 pro-con-scrapydmanager <none> <none>
kube-system calico-node-gjhwh 1/1 Running 1 26h 10.120.0.2 pro-con-scraypd-01 <none> <none>
kube-system calico-node-x8d7g 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system coredns-558bd4d5db-mllm5 1/1 Running 1 26h 192.168.199.78 pro-con-scrapydmanager <none> <none>
kube-system coredns-558bd4d5db-whfnn 1/1 Running 1 26h 192.168.199.75 pro-con-scrapydmanager <none> <none>
kube-system etcd-pro-con-scrapydmanager 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-apiserver-pro-con-scrapydmanager 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-controller-manager-pro-con-scrapydmanager 1/1 Running 2 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-proxy-84cxb 1/1 Running 2 26h 10.120.0.2 pro-con-scraypd-01 <none> <none>
kube-system kube-proxy-nj2tq 1/1 Running 2 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-scheduler-pro-con-scrapydmanager 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
lens-metrics kube-state-metrics-78596b555-zxdst 1/1 Running 1 26h 192.168.199.76 pro-con-scrapydmanager <none> <none>
lens-metrics node-exporter-ggwtc 1/1 Running 1 26h 192.168.199.73 pro-con-scrapydmanager <none> <none>
lens-metrics node-exporter-sbz6t 1/1 Running 1 26h 192.168.119.196 pro-con-scraypd-01 <none> <none>
lens-metrics prometheus-0 1/1 Running 1 26h 192.168.199.74 pro-con-scrapydmanager <none> <none>
For services
kubectl get services -o wide --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 26h <none>
default nginx ClusterIP 10.99.117.158 <none> 80/TCP 24h run=nginx
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 26h k8s-app=kube-dns
lens-metrics kube-state-metrics ClusterIP 10.104.32.63 <none> 8080/TCP 26h name=kube-state-metrics
lens-metrics node-exporter ClusterIP None <none> 80/TCP 26h name=node-exporter,phase=prod
lens-metrics prometheus ClusterIP 10.111.86.164 <none> 80/TCP 26h name=prometheus
Ok. It's fault of firewall. I opened all of the following ports on my master node and recreated my cluster, then everything got fine and cni0 interface appeared. Although I still don't know why.
During the proccessing of tring, I find cni0 interface is important. If there is no cni0, I could not ping pod running in diffrent node.
(Refer: https://docs.projectcalico.org/getting-started/bare-metal/requirements)
Configuration Host(s) Connection type Port/protocol
Calico networking (BGP) All Bidirectional TCP 179
Calico networking with IP-in-IP enabled (default) All Bidirectional IP-in-IP, often represented by its protocol number 4
Calico networking with VXLAN enabled All Bidirectional UDP 4789
Calico networking with Typha enabled Typha agent hosts Incoming TCP 5473 (default)
flannel networking (VXLAN) All Bidirectional UDP 4789
All kube-apiserver host Incoming Often TCP 443 or 6443*
etcd datastore etcd hosts Incoming Officially TCP 2379 but can vary

Pods not accessible (timeout) on 3 Node cluster created in AWS ec2 from master

I have 3 node cluster in AWS ec2 (Centos 8 ami).
When I try to access pods scheduled on worker node from master:
kubectl exec -it kube-flannel-ds-amd64-lfzpd -n kube-system /bin/bash
Error from server: error dialing backend: dial tcp 10.41.12.53:10250: i/o timeout
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-54ff9cd656-8mpbx 1/1 Running 2 7d21h 10.244.0.7 master <none> <none>
kube-system coredns-54ff9cd656-xcxvs 1/1 Running 2 7d21h 10.244.0.6 master <none> <none>
kube-system etcd-master 1/1 Running 2 7d21h 10.41.14.198 master <none> <none>
kube-system kube-apiserver-master 1/1 Running 2 7d21h 10.41.14.198 master <none> <none>
kube-system kube-controller-manager-master 1/1 Running 2 7d21h 10.41.14.198 master <none> <none>
kube-system kube-flannel-ds-amd64-8zgpw 1/1 Running 2 7d21h 10.41.14.198 master <none> <none>
kube-system kube-flannel-ds-amd64-lfzpd 1/1 Running 2 7d21h 10.41.12.53 worker1 <none> <none>
kube-system kube-flannel-ds-amd64-nhw5j 1/1 Running 2 7d21h 10.41.15.9 worker3 <none> <none>
kube-system kube-flannel-ds-amd64-s6nms 1/1 Running 2 7d21h 10.41.15.188 worker2 <none> <none>
kube-system kube-proxy-47s8k 1/1 Running 2 7d21h 10.41.15.9 worker3 <none> <none>
kube-system kube-proxy-6lbvq 1/1 Running 2 7d21h 10.41.15.188 worker2 <none> <none>
kube-system kube-proxy-vhmfp 1/1 Running 2 7d21h 10.41.14.198 master <none> <none>
kube-system kube-proxy-xwsnk 1/1 Running 2 7d21h 10.41.12.53 worker1 <none> <none>
kube-system kube-scheduler-master 1/1 Running 2 7d21h 10.41.14.198 master <none> <none>
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 7d21h v1.13.10
worker1 Ready <none> 7d21h v1.13.10
worker2 Ready <none> 7d21h v1.13.10
worker3 Ready <none> 7d21h v1.13.10
I tried below steps in all nodes, but no luck so far:
iptables -w -P FORWARD ACCEPT on all nodes
Turn on Masquerade
Turn on port 10250/tcp
Turn on port 8472/udp
Start kubelet
Any pointer would be helpful.
Flannel does not support NFT, and since you are using CentOS 8, you can't fallback to iptables.
Your best bet in this situation would be to switch to Calico.
You have to update Calico DaemonSet with:
....
Environment:
FELIX_IPTABLESBACKEND: NFT
....
or use version 3.12 or newer, as it adds
Autodetection of iptables backend
Previous versions of Calico required you to specify the host’s iptables backend (one of NFT or Legacy). With this release, Calico can now autodetect the iptables variant on the host by setting the Felix configuration parameter IptablesBackend to Auto. This is useful in scenarios where you don’t know what the iptables backend might be such as in mixed deployments. For more information, see the documentation for iptables dataplane configuration
Or switch to Ubuntu 20.04. Ubuntu doesn't use nftables yet.
Issue was because of inbound port in SG.I added these ports in SG I am able to get pass that issue.
2222
24007
24008
49152-49251
My original installer script does not need to follow above steps while running on VMs and standalone machine.
As SG is specific to EC2, so port should be allowed in inbound.
Point to note here is all my nodes(master and worker) are on same SG. even then port hast to be opened in inbound rule, that's the way SG works.

Virtualbox kubernetes NodePort access

Morning,
I have a simple nginx setup that is using NodePort to access on an alternate port 30000. I cannot seem to figure out how to actually access it on my workstation that has the virtualbox installed.
Some basic stuff:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP
25h
nginx-55bc597fbf-zb2ml ClusterIP 10.101.124.73 <none> 8080/TCP
24h
nginx-service-np NodePort 10.105.157.230 <none>
8082:30000/TCP 22h
user-login-service NodePort 10.106.129.60 <none>
5000:31395/TCP 38m
I am using flannel
kubectl cluster-info
Kubernetes master is running at https://192.168.56.101:6443
KubeDNS is running at https://192.168.56.101:6443/api/v1/namespaces/kube-
system/services/kube-dns:dns/proxy
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 25h v1.15.1
k8s-worker1 Ready <none> 94m v1.15.1
k8s-worker2 Ready <none> 98m v1.15.1
I did port forwarding for NAT where it is supposed to forward 30000 to 80 and also did 31395 to 31396 for the user-login-service
Trying to access using master ip https://192.168.56.101:80 or https://192.168.56.101:31396 fails. I did try http as well, but cluster-info seems to show master using https and kubernetes is using 443/tcp.
There are two adapters for master and the workers. One adapter is NAT and used to allow flow of traffic outbound (e.g., for use with apt-get commands)
This seems to use 10.0.3.15 address assigned to all three nodes
The other adapter is host-ip and is what is giving the servers addresses in the 192.168.56.0 network. I did set those as static using netplan.
The three servers can see each other fine. I can do external traffic fine.
/etc/netplan# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-5c98db65d4-4xg8h 1/1 Running 17
120m
coredns-5c98db65d4-xn797 1/1 Running 17
120m
etcd-k8s-master 1/1 Running 8 25h
kube-apiserver-k8s-master 1/1 Running 8 25h
kube-controller-manager-dashap-k8s-master 1/1 Running 12 25h
kube-flannel-ds-amd64-6fw7x 1/1 Running 0 25h
kube-flannel-ds-amd64-hd4ng 1/1 Running 0
122m
kube-flannel-ds-amd64-z2wls 1/1 Running 0
126m
kube-proxy-g8k5l 1/1 Running 0 25h
kube-proxy-khn67 1/1 Running 0
126m
kube-proxy-zsvqs 1/1 Running 0
122m
kube-scheduler-k8s-master 1/1 Running 10 25h
weave-net-2l5cs 2/2 Running 0 44m
weave-net-n4zmr 2/2 Running 0 44m
weave-net-v6t74 2/2 Running 0 44m
This is my first setup, so it is hard to troubleshoot for me. Any help on how to reach the the two services using my browser on my workstation and not within the nodes would be appreciated.

Kubernetes Dashborad is not opening

My Master node ip address is 192.168.56.101. there is no node connected to master yet.
master#kmaster:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster Ready master 125m v1.15.1
master#kmaster:~$
When i deployed my kubernetes-dashborad using below command, why running IP Address of kubernetes-dashboard-5c8f9556c4-f2jpz is 192.168.189.6
Similarly the other pods has also different IP address.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta1/aio/deploy/recommended.yaml
master#kmaster:~$ kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-7bd78b474d-r2bwg 1/1 Running 0 113m 192.168.189.2 kmaster <none> <none>
kube-system calico-node-dsgqt 1/1 Running 0 113m 192.168.56.101 kmaster <none> <none>
kube-system coredns-5c98db65d4-n2wml 1/1 Running 0 114m 192.168.189.3 kmaster <none> <none>
kube-system coredns-5c98db65d4-v5qc8 1/1 Running 0 114m 192.168.189.1 kmaster <none> <none>
kube-system etcd-kmaster 1/1 Running 0 114m 192.168.56.101 kmaster <none> <none>
kube-system kube-apiserver-kmaster 1/1 Running 0 114m 192.168.56.101 kmaster <none> <none>
kube-system kube-controller-manager-kmaster 1/1 Running 0 114m 192.168.56.101 kmaster <none> <none>
kube-system kube-proxy-bgtmr 1/1 Running 0 114m 192.168.56.101 kmaster <none> <none>
kube-system kube-scheduler-kmaster 1/1 Running 0 114m 192.168.56.101 kmaster <none> <none>
kubernetes-dashboard kubernetes-dashboard-5c8f9556c4-f2jpz 1/1 Running 0 107m 192.168.189.6 kmaster <none> <none>
kubernetes-dashboard kubernetes-metrics-scraper-86456cdd8f-w45w2 1/1 Running 0 107m 192.168.189.4 kmaster <none> <none>
master#kmaster:~$
And also not able to access the kubernetes-dashboard UI. i am using the link
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/.
and the link KubeDNS https://192.168.56.101:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy is also not working.
but when trying to access Kubernetes master at https://192.168.56.101:6443 is working.
master#kmaster:~$ kubectl cluster-info
Kubernetes master is running at https://192.168.56.101:6443
KubeDNS is running at https://192.168.56.101:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Any suggestions.
Solution (see comments): Don't mix your physical and overlay network ranges.
Accessing the KubeDNS is only possible with DNS as protocol, not HTTP. If you want to query the DNS service you need to kubectl port-forward, not the HTTP (API) proxy.
If you try to access the dashboard with localhost:8081, you have to run kubectl proxy --port 8081 from your console to setup the proxy between you localhost to the k8s apiserver.
If you want to access dashboard from apiserver directly without the local proxy, try the following url https://192.168.56.101:6443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy (assuming your service name is kubernetes-dashboard)
You can also run kubectl port-forward svc/kubernetes-dashboard -n kubernetes-dashboard 443, then access the dashboard with https://localhost:443