DNS does not work correctly in Kubernetes - kubernetes

I am deploying statefulset in my local PC (for doing research) follow this link
In this step:
kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm
nslookup web-0.nginx
I meet this error:
nslookup web-0.nginx
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'web-0.nginx'
My pod and node are still working correctly and my coredns is running correctly
kube-system coredns-fb8b8dccf-hbrhw 1/1 Running 0 26m
kube-system coredns-fb8b8dccf-rmrwp 1/1 Running 0 26m
nguyen#kmaster:~/Documents$ kubectl get --all-namespaces=true -o wide pods
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default busybox 1/1 Running 1 65m 10.244.1.218 knode <none> <none>
default web-0 1/1 Running 0 75m 10.244.1.215 knode <none> <none>
default web-1 1/1 Running 0 75m 10.244.1.216 knode <none> <none>
kube-system coredns-fb8b8dccf-hbrhw 1/1 Running 0 51m 10.244.1.219 knode <none> <none>
kube-system coredns-fb8b8dccf-rmrwp 1/1 Running 0 51m 10.244.0.37 kmaster <none> <none>
kube-system etcd-kmaster 1/1 Running 20 20d 192.168.146.132 kmaster <none> <none>
kube-system kube-apiserver-kmaster 1/1 Running 514 20d 192.168.146.132 kmaster <none> <none>
kube-system kube-controller-manager-kmaster 1/1 Running 144 20d 192.168.146.132 kmaster <none> <none>
kube-system kube-flannel-ds-amd64-ndpjq 1/1 Running 0 76m 192.168.146.129 knode <none> <none>
kube-system kube-flannel-ds-amd64-s2vhp 1/1 Running 0 76m 192.168.146.132 kmaster <none> <none>
kube-system kube-proxy-dk5jd 1/1 Running 6 20d 192.168.146.132 kmaster <none> <none>
kube-system kube-proxy-ts79l 1/1 Running 2 20d 192.168.146.129 knode <none> <none>
kube-system kube-scheduler-kmaster 1/1 Running 172 20d 192.168.146.132 kmaster <none> <none>
nguyen#kmaster:~$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21d
nginx ClusterIP None <none> 80/TCP 6h8m
Did I miss something? Someone can help me.
Thank you!

nginx statefulset is deployed in default namespace as shown below
default web-0 1/1 Running 0 75m 10.244.1.215 knode <none> <none>
default web-1 1/1 Running 0 75m 10.244.1.216 knode <none> <none>
This is how you should test
master $ kubectl get po
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 1m
web-1 1/1 Running 0 1m
master $ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 35m
nginx ClusterIP None <none> 80/TCP 2m
master $ kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm
If you don't see a command prompt, try pressing enter.
/ # nslookup nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: nginx
Address 1: 10.40.0.1 web-0.nginx.default.svc.cluster.local
Address 2: 10.40.0.2 web-1.nginx.default.svc.cluster.local
/ #
/ # nslookup web-0.nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx
Address 1: 10.40.0.1 web-0.nginx.default.svc.cluster.local
/ # nslookup web-0.nginx.default.svc.cluster.local
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx.default.svc.cluster.local
Address 1: 10.40.0.1 web-0.nginx.default.svc.cluster.local

Related

K8s pods running in diffrent node can't communicate with each other

I have k8s cluster with two node, master and worker node, installed Calico.
I initialized cluster and installed calico with following commands
# Initialize cluster
kubeadm init --apiserver-advertise-address=<MatserNodePublicIP> --pod-network-cidr=192.168.0.0/16
# Install Calico. Refer to official document
# https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico-with-kubernetes-api-datastore-50-nodes-or-less
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
After that, I found pods running in different node can't communicate with each other, but pods running in same node can communicate with each other.
Here are my operations:
# With following command, I ran a nginx pod scheduled to worker node
# and assigned pod id 192.168.199.72
kubectl create nginx --image=nginx
# With following command, I ran a busybox pod scheduled to master node
# and assigned pod id 192.168.119.197
kubectl run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox sh
# In busybox bash, I executed following command
# '>' represents command output
wget 192.168.199.72
> Connecting to 192.168.199.72 (192.168.199.72:80)
> wget: can't connect to remote host (192.168.199.72): Connection timed out
However, if nginx pod run in master node (same as busybox), the wget would output a correct welcome html.
(For scheduling nginx pod to master node, I cordon worker node, and restarted nginx pod)
I also tried to schedule nginx and busybox pod to worker node, the wget ouput is a correct welcome html.
Here are my cluster status, everything looks fine. I searched all I can find but couldn't find solution.
matser and worker node can ping each other with private IP.
For firewall
systemctl status firewalld
> Unit firewalld.service could not be found.
For node infomation
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
pro-con-scrapydmanager Ready control-plane,master 26h v1.21.2 10.120.0.5 <none> CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://20.10.5
pro-con-scraypd-01 Ready,SchedulingDisabled <none>
For pod infomation
kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default busybox 0/1 Error 0 24h 192.168.199.72 pro-con-scrapydmanager <none> <none>
default nginx 1/1 Running 1 26h 192.168.119.197 pro-con-scraypd-01 <none> <none>
kube-system calico-kube-controllers-78d6f96c7b-msrdr 1/1 Running 1 26h 192.168.199.77 pro-con-scrapydmanager <none> <none>
kube-system calico-node-gjhwh 1/1 Running 1 26h 10.120.0.2 pro-con-scraypd-01 <none> <none>
kube-system calico-node-x8d7g 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system coredns-558bd4d5db-mllm5 1/1 Running 1 26h 192.168.199.78 pro-con-scrapydmanager <none> <none>
kube-system coredns-558bd4d5db-whfnn 1/1 Running 1 26h 192.168.199.75 pro-con-scrapydmanager <none> <none>
kube-system etcd-pro-con-scrapydmanager 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-apiserver-pro-con-scrapydmanager 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-controller-manager-pro-con-scrapydmanager 1/1 Running 2 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-proxy-84cxb 1/1 Running 2 26h 10.120.0.2 pro-con-scraypd-01 <none> <none>
kube-system kube-proxy-nj2tq 1/1 Running 2 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
kube-system kube-scheduler-pro-con-scrapydmanager 1/1 Running 1 26h 10.120.0.5 pro-con-scrapydmanager <none> <none>
lens-metrics kube-state-metrics-78596b555-zxdst 1/1 Running 1 26h 192.168.199.76 pro-con-scrapydmanager <none> <none>
lens-metrics node-exporter-ggwtc 1/1 Running 1 26h 192.168.199.73 pro-con-scrapydmanager <none> <none>
lens-metrics node-exporter-sbz6t 1/1 Running 1 26h 192.168.119.196 pro-con-scraypd-01 <none> <none>
lens-metrics prometheus-0 1/1 Running 1 26h 192.168.199.74 pro-con-scrapydmanager <none> <none>
For services
kubectl get services -o wide --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 26h <none>
default nginx ClusterIP 10.99.117.158 <none> 80/TCP 24h run=nginx
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 26h k8s-app=kube-dns
lens-metrics kube-state-metrics ClusterIP 10.104.32.63 <none> 8080/TCP 26h name=kube-state-metrics
lens-metrics node-exporter ClusterIP None <none> 80/TCP 26h name=node-exporter,phase=prod
lens-metrics prometheus ClusterIP 10.111.86.164 <none> 80/TCP 26h name=prometheus
Ok. It's fault of firewall. I opened all of the following ports on my master node and recreated my cluster, then everything got fine and cni0 interface appeared. Although I still don't know why.
During the proccessing of tring, I find cni0 interface is important. If there is no cni0, I could not ping pod running in diffrent node.
(Refer: https://docs.projectcalico.org/getting-started/bare-metal/requirements)
Configuration Host(s) Connection type Port/protocol
Calico networking (BGP) All Bidirectional TCP 179
Calico networking with IP-in-IP enabled (default) All Bidirectional IP-in-IP, often represented by its protocol number 4
Calico networking with VXLAN enabled All Bidirectional UDP 4789
Calico networking with Typha enabled Typha agent hosts Incoming TCP 5473 (default)
flannel networking (VXLAN) All Bidirectional UDP 4789
All kube-apiserver host Incoming Often TCP 443 or 6443*
etcd datastore etcd hosts Incoming Officially TCP 2379 but can vary

k3s - can't access from one pod to another if pods on different master nodes (HighAvailability setup)

k3s - can't access from one pod to another if pods on different nodes
Update:
I've narrowed the issue down - it's pods that are on other master nodes that can't communicate with those on the original master
pods on rpi4-server1 - the original cluster - can communicate with pods on rpi-worker01 and rpi3-worker02
pods on rpi4-server2 are unable to communicate with the others
I'm trying to run a HighAvailability cluster with embedded DB and using flannel / vxlan
I'm trying to setup a project with 5 services in k3s
When all of the pods are contained on a single node, they work together fine.
As soon as I add other nodes into the system and pods are deployed to them, the links seem to break.
In troubleshooting I've exec'd into one of the pods and tried to curl another. When they are on the same node this works, if the second service is on another node it doesn't.
I'm sure this is something simple that I'm missing, but I can't work it out! Help appreciated.
Key details:
Using k3s and native traefik
Two rpi4s as servers (High Availability) and two rpi3s as worker nodes
metallb as loadbalancer
Two services - blah-interface and blah-svc are configured as LoadBalancer to allow external access. The others blah-server, n34 and test-apisas NodePort to support debugging, but only really need internal access
Info on nodes, pods and services....
pi#rpi4-server1:~/Projects/test_demo_2020/test_kube_config/testchart/templates $ sudo kubectl get nodes --all-namespaces -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
rpi4-server1 Ready master 11h v1.17.0+k3s.1 192.168.0.140 <none> Raspbian GNU/Linux 10 (buster) 4.19.75-v7l+ docker://19.3.5
rpi-worker01 Ready,SchedulingDisabled <none> 10h v1.17.0+k3s.1 192.168.0.41 <none> Raspbian GNU/Linux 10 (buster) 4.19.66-v7+ containerd://1.3.0-k3s.5
rpi3-worker02 Ready,SchedulingDisabled <none> 10h v1.17.0+k3s.1 192.168.0.142 <none> Raspbian GNU/Linux 10 (buster) 4.19.75-v7+ containerd://1.3.0-k3s.5
rpi4-server2 Ready master 10h v1.17.0+k3s.1 192.168.0.143 <none> Raspbian GNU/Linux 10 (buster) 4.19.75-v7l+ docker://19.3.5
pi#rpi4-server1:~/Projects/test_demo_2020/test_kube_config/testchart/templates $ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system helm-install-traefik-l2z6l 0/1 Completed 2 11h 10.42.0.2 rpi4-server1 <none> <none>
test-demo n34-5c7b9475cb-zjlgl 1/1 Running 1 4h30m 10.42.0.32 rpi4-server1 <none> <none>
kube-system metrics-server-6d684c7b5-5wgf9 1/1 Running 3 11h 10.42.0.26 rpi4-server1 <none> <none>
metallb-system speaker-62rkm 0/1 Pending 0 99m <none> rpi-worker01 <none> <none>
metallb-system speaker-2shzq 0/1 Pending 0 99m <none> rpi3-worker02 <none> <none>
metallb-system speaker-2mcnt 1/1 Running 0 99m 192.168.0.143 rpi4-server2 <none> <none>
metallb-system speaker-v8j9g 1/1 Running 0 99m 192.168.0.140 rpi4-server1 <none> <none>
metallb-system controller-65895b47d4-pgcs6 1/1 Running 0 90m 10.42.0.49 rpi4-server1 <none> <none>
test-demo blah-server-858ccd7788-mnf67 1/1 Running 0 64m 10.42.0.50 rpi4-server1 <none> <none>
default nginx2-6f4f6f76fc-n2kbq 1/1 Running 0 22m 10.42.0.52 rpi4-server1 <none> <none>
test-demo blah-interface-587fc66bf9-qftv6 1/1 Running 0 22m 10.42.0.53 rpi4-server1 <none> <none>
test-demo blah-svc-6f8f68f46-gqcbw 1/1 Running 0 21m 10.42.0.54 rpi4-server1 <none> <none>
kube-system coredns-d798c9dd-hdwn5 1/1 Running 1 11h 10.42.0.27 rpi4-server1 <none> <none>
kube-system local-path-provisioner-58fb86bdfd-tjh7r 1/1 Running 31 11h 10.42.0.28 rpi4-server1 <none> <none>
kube-system traefik-6787cddb4b-tgq6j 1/1 Running 0 4h50m 10.42.1.23 rpi4-server2 <none> <none>
default testdemo2020-testchart-6f8d44b496-2hcfc 1/1 Running 1 6h31m 10.42.0.29 rpi4-server1 <none> <none>
test-demo test-apis-75bb68dcd7-d8rrp 1/1 Running 0 7m13s 10.42.1.29 rpi4-server2 <none> <none>
pi#rpi4-server1:~/Projects/test_demo_2020/test_kube_config/testchart/templates $ sudo kubectl get svc --all-namespaces -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 11h <none>
kube-system kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP,9153/TCP 11h k8s-app=kube-dns
kube-system metrics-server ClusterIP 10.43.74.118 <none> 443/TCP 11h k8s-app=metrics-server
kube-system traefik-prometheus ClusterIP 10.43.78.135 <none> 9100/TCP 11h app=traefik,release=traefik
test-demo blah-server NodePort 10.43.224.128 <none> 5055:31211/TCP 10h io.kompose.service=blah-server
default testdemo2020-testchart ClusterIP 10.43.91.7 <none> 80/TCP 10h app.kubernetes.io/instance=testdemo2020,app.kubernetes.io/name=testchart
test-demo traf-dashboard NodePort 10.43.60.155 <none> 8080:30808/TCP 10h io.kompose.service=traf-dashboard
test-demo test-apis NodePort 10.43.248.59 <none> 8075:31423/TCP 7h11m io.kompose.service=test-apis
kube-system traefik LoadBalancer 10.43.168.18 192.168.0.240 80:30688/TCP,443:31263/TCP 11h app=traefik,release=traefik
default nginx2 LoadBalancer 10.43.249.123 192.168.0.241 80:30497/TCP 92m app=nginx2
test-demo n34 NodePort 10.43.171.206 <none> 7474:30474/TCP,7687:32051/TCP 72m io.kompose.service=n34
test-demo blah-interface LoadBalancer 10.43.149.158 192.168.0.242 80:30634/TCP 66m io.kompose.service=blah-interface
test-demo blah-svc LoadBalancer 10.43.19.242 192.168.0.243 5005:30005/TCP,5006:31904/TCP,5002:30685/TCP 51m io.kompose.service=blah-svc
Hi you issue could be related to the following issue.
After configuring the network under /etc/systemd/network/eth0.network (filename may differ in your case, since i am using arch linux on all pis)
[Match]
Name=eth0
[Network]
Address=x.x.x.x/24 # ip of node
Gateway=x.x.x.x # ip of gateway router
Domains=default.svc.cluster.local svc.cluster.local cluster.local
DNS=10.x.x.x # k3s dns ip x.x.x.x # ip of gateway router
After that I removed the 10.x.x.x routes with ip route del 10.x.x.x dev [flannel|cni0] on every node and restarted them.

Cannot access to Kubernetes Dashboard

I have a K8s cluster (1 master, 2 workers) running on 3 vagrant VMs on my computer.
I've installed kubernetes dashboard, like explained here.
All my pods are running correctly:
kubectl get pods -o wide --namespace=kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-fb8b8dccf-n5cpm 1/1 Running 1 61m 10.244.0.4 kmaster.example.com <none> <none>
coredns-fb8b8dccf-qwcr4 1/1 Running 1 61m 10.244.0.5 kmaster.example.com <none> <none>
etcd-kmaster.example.com 1/1 Running 1 60m 172.42.42.100 kmaster.example.com <none> <none>
kube-apiserver-kmaster.example.com 1/1 Running 1 60m 172.42.42.100 kmaster.example.com <none> <none>
kube-controller-manager-kmaster.example.com 1/1 Running 1 60m 172.42.42.100 kmaster.example.com <none> <none>
kube-flannel-ds-amd64-hcjsm 1/1 Running 1 61m 172.42.42.100 kmaster.example.com <none> <none>
kube-flannel-ds-amd64-klv4f 1/1 Running 3 56m 172.42.42.102 kworker2.example.com <none> <none>
kube-flannel-ds-amd64-lmpnd 1/1 Running 2 59m 172.42.42.101 kworker1.example.com <none> <none>
kube-proxy-86qsw 1/1 Running 1 59m 10.0.2.15 kworker1.example.com <none> <none>
kube-proxy-dp29s 1/1 Running 1 61m 172.42.42.100 kmaster.example.com <none> <none>
kube-proxy-gqqq9 1/1 Running 1 56m 10.0.2.15 kworker2.example.com <none> <none>
kube-scheduler-kmaster.example.com 1/1 Running 1 60m 172.42.42.100 kmaster.example.com <none> <none>
kubernetes-dashboard-5f7b999d65-zqbbz 1/1 Running 1 28m 10.244.1.3 kworker1.example.com <none> <none>
As you can see the dashboard is in "Running" status.
I also ran kubectl proxy and it's serving on 127.0.0.1:8001.
But when I try to open http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/ I have the error:
This site can’t be reached
127.0.0.1 refused to connect.
ERR_CONNECTION_REFUSED
I'm trying to open the dashboard directly on my computer, not inside the vagram VM. Could that be the problem? If yes, how to solve it ? I'm able to ping my VM from my computer without any issue.
Thanks for helping me.
EDIT
Here is the ouput of kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 96m
kubernetes-dashboard NodePort 10.109.230.83 <none> 443:30089/TCP 63m
Kubernetes dashboard runs only in the cluster as default. You can control it with get svc command:
kubectl get svc -n kube-system
Default type of that service is ClusterIp, to reach from outside of the cluster yo have to change it to NodePort.
To change it follow this doc.

Helm error: dial tcp *:10250: i/o timeout

Created a local cluster using Vagrant + Ansible + VirtualBox. Manually deploying works fine, but when using Helm:
:~$helm install stable/nginx-ingress --name nginx-ingress-controller --set rbac.create=true
Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 10.0.52.15:10250: i/o timeout
Kubernetes cluster info:
:~$kubectl get nodes,po,deploy,svc,ingress --all-namespaces -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/ubuntu18-kube-master Ready master 32m v1.13.3 10.0.51.15 <none> Ubuntu 18.04.1 LTS 4.15.0-43-generic docker://18.6.1
node/ubuntu18-kube-node-1 Ready <none> 31m v1.13.3 10.0.52.15 <none> Ubuntu 18.04.1 LTS 4.15.0-43-generic docker://18.6.1
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default pod/nginx-server 1/1 Running 0 40s 10.244.1.5 ubuntu18-kube-node-1 <none> <none>
default pod/nginx-server-b8d78876d-cgbjt 1/1 Running 0 4m25s 10.244.1.4 ubuntu18-kube-node-1 <none> <none>
kube-system pod/coredns-86c58d9df4-5rsw2 1/1 Running 0 31m 10.244.0.2 ubuntu18-kube-master <none> <none>
kube-system pod/coredns-86c58d9df4-lfbvd 1/1 Running 0 31m 10.244.0.3 ubuntu18-kube-master <none> <none>
kube-system pod/etcd-ubuntu18-kube-master 1/1 Running 0 31m 10.0.51.15 ubuntu18-kube-master <none> <none>
kube-system pod/kube-apiserver-ubuntu18-kube-master 1/1 Running 0 30m 10.0.51.15 ubuntu18-kube-master <none> <none>
kube-system pod/kube-controller-manager-ubuntu18-kube-master 1/1 Running 0 30m 10.0.51.15 ubuntu18-kube-master <none> <none>
kube-system pod/kube-flannel-ds-amd64-jffqn 1/1 Running 0 31m 10.0.51.15 ubuntu18-kube-master <none> <none>
kube-system pod/kube-flannel-ds-amd64-vc6p2 1/1 Running 0 31m 10.0.52.15 ubuntu18-kube-node-1 <none> <none>
kube-system pod/kube-proxy-fbgmf 1/1 Running 0 31m 10.0.52.15 ubuntu18-kube-node-1 <none> <none>
kube-system pod/kube-proxy-jhs6b 1/1 Running 0 31m 10.0.51.15 ubuntu18-kube-master <none> <none>
kube-system pod/kube-scheduler-ubuntu18-kube-master 1/1 Running 0 31m 10.0.51.15 ubuntu18-kube-master <none> <none>
kube-system pod/tiller-deploy-69ffbf64bc-x8lkc 1/1 Running 0 24m 10.244.1.2 ubuntu18-kube-node-1 <none> <none>
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
default deployment.extensions/nginx-server 1/1 1 1 4m25s nginx-server nginx run=nginx-server
kube-system deployment.extensions/coredns 2/2 2 2 32m coredns k8s.gcr.io/coredns:1.2.6 k8s-app=kube-dns
kube-system deployment.extensions/tiller-deploy 1/1 1 1 24m tiller gcr.io/kubernetes-helm/tiller:v2.12.3 app=helm,name=tiller
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 32m <none>
default service/nginx-server NodePort 10.99.84.201 <none> 80:31811/TCP 12s run=nginx-server
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 32m k8s-app=kube-dns
kube-system service/tiller-deploy ClusterIP 10.99.4.74 <none> 44134/TCP 24m app=helm,name=tiller
Vagrantfile:
...
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
$hosts.each_with_index do |(hostname, parameters), index|
ip_address = "#{$subnet}.#{$ip_offset + index}"
config.vm.define vm_name = hostname do |vm_config|
vm_config.vm.hostname = hostname
vm_config.vm.box = box
vm_config.vm.network "private_network", ip: ip_address
vm_config.vm.provider :virtualbox do |vb|
vb.gui = false
vb.name = hostname
vb.memory = parameters[:memory]
vb.cpus = parameters[:cpus]
vb.customize ['modifyvm', :id, '--macaddress1', "08002700005#{index}"]
vb.customize ['modifyvm', :id, '--natnet1', "10.0.5#{index}.0/24"]
end
end
end
end
Workaround for VirtualBox issue: set diffenrent macaddress and internal_ip.
It is interesting to find a solution that can be placed in one of the configuration files: vagrant, ansible roles. Any ideas on the problem?
Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 10.0.52.15:10250: i/o timeout
You're getting bitten by a very common kubernetes-on-Vagrant bug: the kubelet believes its IP address is eth0, which is the NAT interface in Vagrant, versus using (what I hope you have) the :private_address network in your Vagrantfile. Thus, since all kubelet interactions happen directly to it (and not through the API server), things like kubectl exec and kubectl logs will fail in exactly the way you see.
The solution is to force kubelet to bind to the private network interface, or I guess you could switch your Vagrantfile to use the bridge network, if that's an option for you -- just so long as the interface isn't the NAT one.
The question is about how you manage TLS Certificates in the cluster, ensure that port 10250 is reachable.
Here is an example of how i fix it when i try to run exec a pod running in node (instance aws in my case),
resource "aws_security_group" "My_VPC_Security_Group" {
...
ingress {
description = "TLS from VPC"
from_port = 10250
to_port = 10250
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
For more details you can visit [1]: http://carnal0wnage.attackresearch.com/2019/01/kubernetes-unauth-kublet-api-10250.html

Kubernetes services sometime no response

My cluster contains 1 master with 3 worker nodes in which 1 POD with 2 replica sets and 1 service are created. When I try to access the service via the command curl <ClusterIP>:<port> either from 2 worker nodes, sometimes it can feedback Nginx welcome, but sometimes it gets stuck and connection refused and timeout.
I checked the Kubernetes Service, POD and endpoints are fine, but no clue what is going on. Please advise.
vagrant#k8s-master:~/_projects/tmp1$ sudo kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master Ready master 23d v1.12.2 192.168.205.10 <none> Ubuntu 16.04.4 LTS 4.4.0-139-generic docker://17.3.2
k8s-worker1 Ready <none> 23d v1.12.2 192.168.205.11 <none> Ubuntu 16.04.4 LTS 4.4.0-139-generic docker://17.3.2
k8s-worker2 Ready <none> 23d v1.12.2 192.168.205.12 <none> Ubuntu 16.04.4 LTS 4.4.0-139-generic docker://17.3.2
vagrant#k8s-master:~/_projects/tmp1$ sudo kubectl get pod -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default my-nginx-756f645cd7-pfdck 1/1 Running 0 5m23s 10.244.2.39 k8s-worker2 <none>
default my-nginx-756f645cd7-xpbnp 1/1 Running 0 5m23s 10.244.1.40 k8s-worker1 <none>
kube-system coredns-576cbf47c7-ljx68 1/1 Running 18 23d 10.244.0.38 k8s-master <none>
kube-system coredns-576cbf47c7-nwlph 1/1 Running 18 23d 10.244.0.39 k8s-master <none>
kube-system etcd-k8s-master 1/1 Running 18 23d 192.168.205.10 k8s-master <none>
kube-system kube-apiserver-k8s-master 1/1 Running 18 23d 192.168.205.10 k8s-master <none>
kube-system kube-controller-manager-k8s-master 1/1 Running 18 23d 192.168.205.10 k8s-master <none>
kube-system kube-flannel-ds-54xnb 1/1 Running 2 2d5h 192.168.205.12 k8s-worker2 <none>
kube-system kube-flannel-ds-9q295 1/1 Running 2 2d5h 192.168.205.11 k8s-worker1 <none>
kube-system kube-flannel-ds-q25xw 1/1 Running 2 2d5h 192.168.205.10 k8s-master <none>
kube-system kube-proxy-gkpwp 1/1 Running 15 23d 192.168.205.11 k8s-worker1 <none>
kube-system kube-proxy-gncjh 1/1 Running 18 23d 192.168.205.10 k8s-master <none>
kube-system kube-proxy-m4jfm 1/1 Running 15 23d 192.168.205.12 k8s-worker2 <none>
kube-system kube-scheduler-k8s-master 1/1 Running 18 23d 192.168.205.10 k8s-master <none>
kube-system kubernetes-dashboard-77fd78f978-4r62r 1/1 Running 15 23d 10.244.1.38 k8s-worker1 <none>
vagrant#k8s-master:~/_projects/tmp1$ sudo kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23d <none>
my-nginx ClusterIP 10.98.9.75 <none> 80/TCP 75s run=my-nginx
vagrant#k8s-master:~/_projects/tmp1$ sudo kubectl get endpoints
NAME ENDPOINTS AGE
kubernetes 192.168.205.10:6443 23d
my-nginx 10.244.1.40:80,10.244.2.39:80 101s
This sounds odd but it could be that one of your pods is serving traffic and the other is not. You can try shelling into the pods:
$ kubectl exec -it my-nginx-756f645cd7-rs2w2 sh
$ kubectl exec -it my-nginx-756f645cd7-vwzrl sh
You can see if they are listening on port 80:
$ curl localhost:80
You can also see if your service has the two endpoints 10.244.2.28:80 and 10.244.1.29:80.
$ kubectl get ep my-nginx
$ kubectl get ep my-nginx -o=yaml
Also, try to connect to each one of your endpoints from a node:
$ curl 10.244.2.28:80
$ curl 10.244.2.29:80