Pods on different nodes can't ping each other - kubernetes

I set up a Kubernetes cluster with 1 master and 2 nodes according to the documentation. A pod can ping another pod on the same node but can't ping a pod on the other node.
To demonstrate the problem, I deployed the deployment below, which has 3 replicas. Two of them sit on the same node and the third sits on the other node.
$ cat nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: nginx-svc
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-31-21-115.us-west-2.compute.internal Ready master 20m v1.11.2
ip-172-31-26-62.us-west-2.compute.internal Ready <none> 19m v1.11.2
ip-172-31-29-204.us-west-2.compute.internal Ready <none> 14m v1.11.2
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
nginx-deployment-966857787-22qq7 1/1 Running 0 11m 10.244.2.3 ip-172-31-29-204.us-west-2.compute.internal
nginx-deployment-966857787-lv7dd 1/1 Running 0 11m 10.244.1.2 ip-172-31-26-62.us-west-2.compute.internal
nginx-deployment-966857787-zkzg6 1/1 Running 0 11m 10.244.2.2 ip-172-31-29-204.us-west-2.compute.internal
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21m
nginx-svc ClusterIP 10.105.205.10 <none> 80/TCP 11m
Everything looks fine.
Let me show you containers.
# docker exec -it 489b180f512b /bin/bash
root@nginx-deployment-966857787-zkzg6:/# ifconfig
eth0: flags=4163 mtu 8951
inet 10.244.2.2 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::cc4d:61ff:fe8a:5aeb prefixlen 64 scopeid 0x20
root@nginx-deployment-966857787-zkzg6:/# ping 10.244.2.3
PING 10.244.2.3 (10.244.2.3) 56(84) bytes of data.
64 bytes from 10.244.2.3: icmp_seq=1 ttl=64 time=0.066 ms
64 bytes from 10.244.2.3: icmp_seq=2 ttl=64 time=0.055 ms
^C
So it pings its neighbor pod on the same node.
root@nginx-deployment-966857787-zkzg6:/# ping 10.244.1.2
PING 10.244.1.2 (10.244.1.2) 56(84) bytes of data.
^C
--- 10.244.1.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1059ms
And can't ping its replica on the other node.
Here are the host interfaces on the two nodes:
# ifconfig
cni0: flags=4163 mtu 8951
inet 10.244.2.1 netmask 255.255.255.0 broadcast 0.0.0.0
docker0: flags=4099 mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
eth0: flags=4163 mtu 9001
inet 172.31.29.204 netmask 255.255.240.0 broadcast 172.31.31.255
flannel.1: flags=4163 mtu 8951
inet 10.244.2.0 netmask 255.255.255.255 broadcast 0.0.0.0
lo: flags=73 mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
veth09fb984a: flags=4163 mtu 8951
inet6 fe80::d819:14ff:fe06:174c prefixlen 64 scopeid 0x20
veth87b3563e: flags=4163 mtu 8951
inet6 fe80::d09c:d2ff:fe7b:7dd7 prefixlen 64 scopeid 0x20
# ifconfig
cni0: flags=4163 mtu 8951
inet 10.244.1.1 netmask 255.255.255.0 broadcast 0.0.0.0
docker0: flags=4099 mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
eth0: flags=4163 mtu 9001
inet 172.31.26.62 netmask 255.255.240.0 broadcast 172.31.31.255
flannel.1: flags=4163 mtu 8951
inet 10.244.1.0 netmask 255.255.255.255 broadcast 0.0.0.0
lo: flags=73 mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
veth9733e2e6: flags=4163 mtu 8951
inet6 fe80::8003:46ff:fee2:abc2 prefixlen 64 scopeid 0x20
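As a sanity check on the addressing above, here is a minimal Python sketch (not part of the original post) using only the standard ipaddress module. It shows that the two pods that can ping each other share a /24, while the unreachable pod sits in the other node's /24, so that traffic has to cross the flannel.1 overlay. The 50-byte overhead is the usual figure for VXLAN over IPv4 and is an assumption about this cluster's backend; it matches the 9001 and 8951 MTUs shown.

```python
import ipaddress

# Addresses copied from the outputs above.
pod_a = ipaddress.ip_interface("10.244.2.2/24")  # zkzg6 on ip-172-31-29-204
pod_b = ipaddress.ip_address("10.244.2.3")       # 22qq7, same node
pod_c = ipaddress.ip_address("10.244.1.2")       # lv7dd, other node


def same_node_subnet(src, dst):
    """True when dst is inside src's local /24, i.e. reachable over cni0 alone."""
    return dst in src.network


# Assumption: VXLAN over IPv4 costs outer IP (20) + UDP (8) + VXLAN (8)
# + inner Ethernet (14) bytes.
VXLAN_OVERHEAD = 20 + 8 + 8 + 14
HOST_MTU = 9001  # eth0 MTU from the ifconfig output

print(same_node_subnet(pod_a, pod_b))  # same-node pod
print(same_node_subnet(pod_a, pod_c))  # pod on the other node
print(HOST_MTU - VXLAN_OVERHEAD)       # expected overlay MTU
```

Anything that is not in the local /24 depends on the overlay between the nodes, which is the part that fails here.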
Processes on the nodes:
# ps auxww|grep kube
root 4059 0.1 2.8 43568 28316 ? Ssl 00:31 0:01 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
root 4260 0.0 3.4 358984 34288 ? Ssl 00:31 0:00 /opt/bin/flanneld --ip-masq --kube-subnet-mgr
root 4455 1.1 9.6 760868 97260 ? Ssl 00:31 0:14 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
Because of this network problem clusterIP is also unreachable:
$ curl 10.105.205.10:80
Any suggestion?
Thanks.

I found the problem.
Flannel uses UDP ports 8285 and 8472 (the defaults for its udp and vxlan backends, respectively), which were being blocked by the AWS security groups. I had only opened TCP ports.
I enabled UDP ports 8285 and 8472, as well as TCP ports 6443, 10250 and 10256.
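To make the fix easier to check, here is a small Python sketch that compares a list of allowed ingress rules against the ports named in this answer. The (protocol, from_port, to_port) tuple format is a hypothetical simplification, not the AWS API's data model.

```python
# Ports named in the answer above.
REQUIRED = {
    ("udp", 8285), ("udp", 8472),
    ("tcp", 6443), ("tcp", 10250), ("tcp", 10256),
}


def missing_ports(rules):
    """Return the required (protocol, port) pairs not covered by any rule.

    rules: iterable of (protocol, from_port, to_port) tuples; this shape is a
    hypothetical stand-in for security group ingress rules.
    """
    allowed = set()
    for proto, low, high in rules:
        for req_proto, req_port in REQUIRED:
            if req_proto == proto and low <= req_port <= high:
                allowed.add((req_proto, req_port))
    return sorted(REQUIRED - allowed)


# The situation described in the answer: only TCP was open.
tcp_only = [("tcp", 6443, 6443), ("tcp", 10250, 10256)]
print(missing_ports(tcp_only))

# After adding the two UDP rules nothing is missing.
fixed = tcp_only + [("udp", 8285, 8285), ("udp", 8472, 8472)]
print(missing_ports(fixed))
```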

The docker0 virtual bridge interface currently has the IP 172.17.0.1 on both hosts.
But as per the docker/flannel integration guide, the docker0 virtual bridge should be in the flannel network on each host.
A high-level workflow of the flannel/docker networking integration is below:
Flannel creates /run/flannel/subnet.env from the etcd network configuration during flanneld startup.
Docker reads /run/flannel/subnet.env, sets the --bip flag during dockerd startup, and assigns an IP from the flannel network to docker0.
Refer to the docker/flannel integration doc for more details:
http://docker-k8s-lab.readthedocs.io/en/latest/docker/docker-flannel.html#restart-docker-daemon-with-flannel-network
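The workflow above can be sketched in Python as follows. This is only an illustration of how the values in subnet.env map onto dockerd's --bip and --mtu flags. The sample file content is an assumption chosen to be consistent with the cni0 address and MTU shown in the question, not a copy of the asker's file.

```python
def docker_flags_from_subnet_env(text):
    """Turn the KEY=VALUE lines of a flannel subnet.env into dockerd flags."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key] = value
    # --bip sets the docker0 address/netmask, --mtu the bridge MTU.
    return ["--bip=" + env["FLANNEL_SUBNET"], "--mtu=" + env["FLANNEL_MTU"]]


# Assumed sample content (illustrative values only).
SAMPLE = """\
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.2.1/24
FLANNEL_MTU=8951
FLANNEL_IPMASQ=true
"""

print(docker_flags_from_subnet_env(SAMPLE))
```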


minikube ingress remote access

I have an Nginx service running in the minikube VM, which has the IP 192.168.99.106
kubectl get ingress
`NAME CLASS HOSTS ADDRESS PORTS AGE`
`ingress-service <none> * 192.168.99.106 80 153m`
kubectl describe ingress
Name: ingress-service
Namespace: default
Address: 192.168.99.106
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
*
/ fe-cluster-ip-service:3000 (172.17.0.20:3000)
/login/ login-cluster-ip-service:9090 (172.17.0.18:9090)
Annotations: kubernetes.io/ingress.class: nginx
Events: <none>
I want to expose 192.168.99.106:80 to the outside world so that I will be able to access the app from 10.105.230.34:8888
enp129s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet **10.105.230.34** netmask 255.255.255.0 broadcast 10.105.230.255
inet6 fe80::2be:75ff:fee1:57ce prefixlen 64 scopeid 0x20<link>
ether 00:be:75:e1:57:ce txqueuelen 1000 (Ethernet)
RX packets 3441670 bytes 4623846194 (4.3 GiB)
RX errors 0 dropped 38 overruns 0 frame 0
TX packets 971511 bytes 235934965 (225.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xfbc00000-fbcfffff
Is it possible to achieve this functionality? I tried tunneling but could not make it work.
I sorted it out by using an nginx reverse proxy to the minikube IP.

Single node Microk8s multus master interface cannot be reached

I have a single node Microk8s with calico.
I have deployed Multus successfully and I can create pods with a second network interface: I can see the interfaces in the pod with the IP addresses correctly assigned. The pods can reach each other on the second interface, but from the pods I cannot reach the host's eno8 (IP address 10.128.1.244), which is the Multus master interface. I also cannot reach the pods from outside.
I am new to this kind of deployment and need help figuring out where the problem is.
Thanks.
Here are some details about my environment:
ubuntu@test:$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
test Ready <none> 9d v1.21.4-3+e5758f73ed2a04
ip a on the host
ubuntu@test:$ ip a
8: eno8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 3c:ec:ef:6c:2c:ff brd ff:ff:ff:ff:ff:ff
inet 10.128.1.244/24 brd 10.128.1.255 scope global eno8
valid_lft forever preferred_lft forever
inet6 fe80::3eec:efff:fe6c:2cff/64 scope link
valid_lft forever preferred_lft forever
ubuntu@test:$ kubectl get pods --all-namespaces | grep -i multus
kube-system kube-multus-ds-amd64-dz42s 1/1 Running 0 175m
Network Deployment:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: test-network
spec:
  config: '{
    "cniVersion": "{{ .Values.Multus_cniVersion}}",
    "name": "test-network",
    "type": "{{ .Values.Multus_driverType}}",
    "master": "{{ .Values.Multus_master_interface}}",
    "mode": "{{ .Values.Multus_interface_mode}}",
    "ipam": {
      "type": "{{ .Values.Multus_ipam_type}}",
      "subnet": "{{ .Values.Multus_ipam_subnet}}",
      "rangeStart": "{{ .Values.Multus_ipam_rangeStart}}",
      "rangeEnd": "{{ .Values.Multus_ipam_rangeStop}}",
      "routes": [
        { "dst": "{{ .Values.Multus_defaultRoute}}" }
      ],
      "dns": {"nameservers": ["{{ .Values.Multus_DNS}}"]},
      "gateway": "{{ .Values.Multus_ipam_gw}}"
    }
  }'
Multus_cniVersion: 0.3.1
Multus_driverType: macvlan
Multus_master_interface: eno8
Multus_interface_mode: bridge
Multus_ipam_type: host-local
Multus_ipam_subnet: 10.128.1.0/24
Multus_ipam_rangeStart: 10.128.1.147
Multus_ipam_rangeStop: 10.128.1.156
Multus_defaultRoute: 0.0.0.0/0
Multus_DNS: 10.128.1.1
Multus_ipam_gw: 10.128.1.1
ubuntu@test:$ kubectl get network-attachment-definitions
NAME AGE
test-network 8m39s
Network description:
ubuntu@test:$ kubectl describe network-attachment-definitions.k8s.cni.cncf.io test-network
Name: test-network
Namespace: default
Labels: app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: test-demo
meta.helm.sh/release-namespace: default
API Version: k8s.cni.cncf.io/v1
Kind: NetworkAttachmentDefinition
Metadata:
Creation Timestamp: 2021-09-24T12:15:08Z
Generation: 1
Managed Fields:
API Version: k8s.cni.cncf.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:meta.helm.sh/release-name:
f:meta.helm.sh/release-namespace:
f:labels:
.:
f:app.kubernetes.io/managed-by:
f:spec:
.:
f:config:
Manager: Go-http-client
Operation: Update
Time: 2021-09-24T12:15:08Z
Resource Version: 1062851
Self Link: /apis/k8s.cni.cncf.io/v1/namespaces/default/network-attachment-definitions/test-network
UID: c96f3a0f-b30f-4972-9271-6b2871adf299
Spec:
Config: { "cniVersion": "0.3.1", "name": "test-network", "type": "macvlan", "master": "eno8", "mode": "bridge", "ipam": { "type": "host-local", "subnet": "10.128.1.0/24", "rangeStart": "10.128.1.147", "rangeEnd": "10.128.1.156", "routes": [ { "dst": "0.0.0.0/0" } ], "dns": {"nameservers": ["10.128.1.1"]}, "gateway": "10.128.1.1" } }
Events: <none>
ip a in POD
root@test-deployment-6465bdfccc-k2sst:# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0@if505: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
link/ether 22:a8:17:13:35:39 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.1.19.149/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::20a8:17ff:fe13:3539/64 scope link
valid_lft forever preferred_lft forever
4: eth1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether de:c1:d7:67:08:93 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.128.1.149/24 brd 10.128.1.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::dcc1:d7ff:fe67:893/64 scope link
valid_lft forever preferred_lft forever
Ping to eno8 in POD
root@test-deployment-6465bdfccc-g8bd4:# ping 10.128.1.244
PING 10.128.1.244 (10.128.1.244) 56(84) bytes of data.
^X^C
--- 10.128.1.244 ping statistics ---
14 packets transmitted, 0 received, 100% packet loss, time 13313ms
Ping to multus gateway
root@test-deployment-6465bdfccc-k2sst:# ping 10.128.1.1
PING 10.128.1.1 (10.128.1.1) 56(84) bytes of data.
From 10.128.1.149 icmp_seq=1 Destination Host Unreachable
From 10.128.1.149 icmp_seq=2 Destination Host Unreachable
From 10.128.1.149 icmp_seq=3 Destination Host Unreachable
From 10.128.1.149 icmp_seq=4 Destination Host Unreachable
From 10.128.1.149 icmp_seq=5 Destination Host Unreachable
From 10.128.1.149 icmp_seq=6 Destination Host Unreachable
^C
--- 10.128.1.1 ping statistics ---
8 packets transmitted, 0 received, +6 errors, 100% packet loss, time 7164ms
pipe 4
Netstat in the POD
root@test-deployment-6465bdfccc-k2sst:# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
10.128.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
169.254.1.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
ip r in the POD
root@test-deployment-6465bdfccc-g8bd4:# ip r
default via 169.254.1.1 dev eth0
10.128.1.0/24 dev eth1 proto kernel scope link src 10.128.1.149
169.254.1.1 dev eth0 scope link
Your problem may stem from the fact that MACVLAN interfaces cannot be reached from the same host's default route interface. Let's say your PC has interface eth0 with IP 10.0.0.2 and you use MACVLAN to create an interface in a container with eth0 (or a sub-interface such as eth0.1) as its parent, using IP 10.0.0.3. You won't be able to reach services running on 10.0.0.3 from the same host, but you will from another host. To resolve this, either use IPVLAN in Layer-3 mode to get a fully routable plane, or use a sub-interface in 802.1q trunking mode (but you will need a switch that supports promiscuous mode on the ports to be able to pass VLAN-tagged traffic). Note that you can't use port forwarding to access the container, because MACVLAN separates the communication at the lower layers.
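To connect this to the question's output, here is a minimal Python sketch of a longest-prefix lookup over the pod's routing table (copied from the ip r output in the question). It shows that traffic to the host's eno8 address leaves the pod directly on the macvlan interface eth1 with no gateway in between. That is the host-to-child path described above as unreachable.

```python
import ipaddress

# (prefix, gateway, device), copied from the pod's `ip r` output.
ROUTES = [
    ("0.0.0.0/0", "169.254.1.1", "eth0"),
    ("10.128.1.0/24", None, "eth1"),
    ("169.254.1.1/32", None, "eth0"),
]


def lookup(dst):
    """Longest-prefix match; returns (gateway, device) for dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gateway, dev in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, dev)
    return best[1], best[2]


print(lookup("10.128.1.244"))  # host's eno8 address
print(lookup("10.128.1.1"))    # configured gateway
print(lookup("8.8.8.8"))       # an arbitrary off-subnet address
```

Both 10.128.1.244 and 10.128.1.1 resolve to eth1 with no gateway, so they are reached at layer 2 over the macvlan interface rather than through Calico's eth0.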

netcat listening pod in kubernetes namespace unable to connect

I am running Kubernetes v 19.4 with weave-net (image: weaveworks/weave-npc:2.7.0).
There are no network policies active in the default namespace.
I want to run a netcat listener on pod1 port 8080 and connect to it from pod2.
[root@node01 ~]# kubectl run pod1 -i -t --image=ubuntu -- /bin/bash
If you don't see a command prompt, try pressing enter.
root@pod1:/# apt update ; apt install netcat-openbsd -y
........
root@pod1:/# nc -l -p 8080
I verify that the port is listening on pod1:
[root@node01 ~]# kubectl exec -i -t pod1 -- /bin/bash
root@pod1:/# apt install net-tools -y
...........
root@pod1:/# netstat -tulpen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 0 213960 263/nc
root@pod1:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1376
inet 10.32.0.3 netmask 255.240.0.0 broadcast 10.47.255.255
ether a2:b9:3e:bc:6e:25 txqueuelen 0 (Ethernet)
RX packets 8429 bytes 17438639 (17.4 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4217 bytes 284639 (284.6 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I install pod2 with netcat on it:
[root@node01 ~]# kubectl run pod2 -i -t --image=ubuntu -- /bin/bash
If you don't see a command prompt, try pressing enter.
root@pod2:/# apt update ; apt install netcat-openbsd -y
I test my netcat listener on pod1 from pod2:
root@pod2:/# nc 10.32.0.3 8080
....times out
So I decided to create a service for port 8080 on pod1:
kubectl expose pod pod1 --port=8080 ; kubectl get svc ; kubectl get netpol
[root@node01 ~]# kubectl expose pod pod1 --port=8080 ; kubectl get svc
service/pod1 exposed
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
apache ClusterIP 10.104.218.123 <none> 80/TCP 20d
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21d
nginx ClusterIP 10.98.221.196 <none> 80/TCP 13d
pod1 ClusterIP 10.105.194.196 <none> 8080/TCP 2s
No resources found in default namespace.
Retry from pod2 now by service:
ping pod1
PING pod1.default.svc.cluster.local (10.105.194.196) 56(84) bytes of data.
root@pod2:/# nc pod1 8080
....times out
I also tried this with the regular netcat package.
For good measure I tried to expose port 8080 on the pod as a NodePort:
[root@node01 ~]# kubectl delete svc pod1 ; kubectl expose pod pod1 --port=8080 --type=NodePort ; kubectl get svc
When I try to access that port from outside Kubernetes I am unable to connect. For good measure I also tested the ssh port to verify that my base connectivity is OK:
user@DESKTOP-7TIH9:~$ nc -zv 10.10.70.112 30743
nc: connect to 10.10.70.112 port 30743 (tcp) failed: Connection refused
user@DESKTOP-7TIH9:~$ nc -zv 10.10.70.112 22
Connection to 10.10.70.112 22 port [tcp/ssh] succeeded!
Can anybody tell me if I am doing something wrong, have the wrong expectation, or advise me how to proceed?
Thank you in advance.
While trying to solve this I somehow decided to enable the firewall on the k8s hosts.
This led me to a broken cluster. I decided to reinit the cluster and make sure all the firewall ports are opened, including these: https://www.weave.works/docs/net/latest/faq#ports
All is working now!
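For re-checking ports after a firewall change, here is a minimal Python equivalent of the nc -zv probes used above. It is a sketch using only the standard socket module. The port constants reflect Weave Net's documented defaults (TCP 6783, UDP 6783 and 6784) and are not taken from the post itself. A TCP connect test like this cannot confirm that the UDP ports are open.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Weave Net defaults (assumption based on its documentation).
WEAVE_TCP_PORTS = [6783]
WEAVE_UDP_PORTS = [6783, 6784]  # not checkable with a TCP connect
```

For example, tcp_port_open("10.10.70.112", 22) mirrors the ssh check from the question, and looping over WEAVE_TCP_PORTS against each node address checks the Weave control port.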

Calico CNI pod networking not working across different hosts on EKS Kubernetes worker nodes

I am running vanilla EKS Kubernetes at version 1.12.
I've used CNI Genie to allow custom selection of the CNI that pods use when starting and I've installed the standard Calico CNI setup.
With CNI Genie I configured the default CNI to be the AWS CNI (aws-node) and all pods start up as usual and get assigned an IP from my VPC subnets.
I then selectively use calico as the CNI for some basic pods I am testing with. I'm using the default calico 192.168.0.0/16 CIDR range. Everything works great if the pods are on the same EKS worker nodes.
Core DNS is working great too (as long as I keep the coredns pods running on the aws CNI).
However, if a pod moves to a different worker node, then networking between them does not work inside the cluster.
I've checked the routing tables on the worker nodes that calico auto configures and it appears logical to me.
Here is my wide pod listing across all namespaces:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default hello-node1-865588ccd7-64p5x 1/1 Running 0 31m 192.168.106.129 ip-10-0-2-31.eu-west-2.compute.internal <none>
default hello-node2-dc7bbcb74-gqpwq 1/1 Running 0 17m 192.168.25.193 ip-10-0-3-222.eu-west-2.compute.internal <none>
kube-system aws-node-cm2dp 1/1 Running 0 26m 10.0.3.222 ip-10-0-3-222.eu-west-2.compute.internal <none>
kube-system aws-node-vvvww 1/1 Running 0 31m 10.0.2.31 ip-10-0-2-31.eu-west-2.compute.internal <none>
kube-system calico-kube-controllers-56bfccb786-fc2j4 1/1 Running 0 30m 10.0.2.41 ip-10-0-2-31.eu-west-2.compute.internal <none>
kube-system calico-node-flmnl 1/1 Running 0 31m 10.0.2.31 ip-10-0-2-31.eu-west-2.compute.internal <none>
kube-system calico-node-hcmqd 1/1 Running 0 26m 10.0.3.222 ip-10-0-3-222.eu-west-2.compute.internal <none>
kube-system coredns-6c64c9f456-g2h9k 1/1 Running 0 30m 10.0.2.204 ip-10-0-2-31.eu-west-2.compute.internal <none>
kube-system coredns-6c64c9f456-g5lhl 1/1 Running 0 30m 10.0.2.200 ip-10-0-2-31.eu-west-2.compute.internal <none>
kube-system genie-plugin-hspts 1/1 Running 0 26m 10.0.3.222 ip-10-0-3-222.eu-west-2.compute.internal <none>
kube-system genie-plugin-vqd2d 1/1 Running 0 31m 10.0.2.31 ip-10-0-2-31.eu-west-2.compute.internal <none>
kube-system kube-proxy-jm7f7 1/1 Running 0 26m 10.0.3.222 ip-10-0-3-222.eu-west-2.compute.internal <none>
kube-system kube-proxy-nnp76 1/1 Running 0 31m 10.0.2.31 ip-10-0-2-31.eu-west-2.compute.internal <none>
As you can see, the two hello-node pods are using the Calico CNI.
I've exposed the hello-node pods with two services:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-node1 ClusterIP 172.20.90.83 <none> 8081/TCP 43m
hello-node2 ClusterIP 172.20.242.22 <none> 8082/TCP 43m
I've confirmed if I start the hello-node pods with the aws CNI that I can ping / curl between them when they run on separate hosts using the cluster service names.
Things stop working when I use Calico CNI as above.
I only have two EKS worker hosts in this test cluster. Here is the routing for each:
K8s Worker 1 routes
[ec2-user@ip-10-0-3-222 ~]$ ip route
default via 10.0.3.1 dev eth0
10.0.3.0/24 dev eth0 proto kernel scope link src 10.0.3.222
169.254.169.254 dev eth0
blackhole 192.168.25.192/26 proto bird
192.168.25.193 dev calia0da7d91dc2 scope link
192.168.106.128/26 via 10.0.2.31 dev tunl0 proto bird onlink
K8s Worker 2 routes
[ec2-user@ip-10-0-2-31 ~]$ ip route
default via 10.0.2.1 dev eth0
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.31
10.0.2.41 dev enif4cf9019f11 scope link
10.0.2.200 dev eni412af1a0e55 scope link
10.0.2.204 dev eni04260ebbbe1 scope link
169.254.169.254 dev eth0
192.168.25.192/26 via 10.0.3.222 dev tunl0 proto bird onlink
blackhole 192.168.106.128/26 proto bird
192.168.106.129 dev cali19da7817849 scope link
To me, the route:
192.168.25.192/26 via 10.0.3.222 dev tunl0 proto bird onlink
tells me that traffic destined for the 192.168.25.192/26 subnet from this worker (and its containers/pods) should go out to 10.0.3.222 (the AWS VPC ENI for the EC2 host) on the tunl0 interface.
This route is on the EC2 host 10.0.2.31. In other words, when talking from this host's containers to containers on the Calico subnet 192.168.25.192/26, network traffic should route to 10.0.3.222 (the ENI IP of my other EKS worker node, where the containers using Calico on that subnet run).
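Note that the prefix in that route is /26, which matches the blockSize of the Calico IP pool, and not /16. A short Python check with the standard ipaddress module (addresses copied from the outputs above) shows what the block covers:

```python
import ipaddress

block = ipaddress.ip_network("192.168.25.192/26")  # route on host 10.0.2.31
pod = ipaddress.ip_address("192.168.25.193")       # hello-node2
pool = ipaddress.ip_network("192.168.0.0/16")      # default Calico pool

print(block.num_addresses)   # size of one per-node block
print(block[0], block[-1])   # first and last address in the block
print(pod in block)          # is hello-node2 covered by the route?
print(block.subnet_of(pool)) # is the block carved out of the pool?
```

So the route only covers the 64 addresses assigned to the other worker, and hello-node2's address is one of them.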
To clarify my testing procedure:
Exec into hello-node1 pod, and curl http://hello-node2:8082 (or ping the calico assigned IP address of the hello-node2 pod).
EDIT
To further test this, I've run tcpdump on the host where the hello-node2 pod is running, capturing on port 8080 (the container listens on this port).
I do get activity on the destination host where the test container that I am curling to is running, but it doesn't seem to indicate dropped traffic.
[ec2-user@ip-10-0-3-222 ~]$ sudo tcpdump -vv -x -X -i tunl0 'port 8080'
tcpdump: listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
14:32:42.859238 IP (tos 0x0, ttl 254, id 63813, offset 0, flags [DF], proto TCP (6), length 60)
10.0.2.31.29192 > 192.168.25.193.webcache: Flags [S], cksum 0xf932 (correct), seq 3206263598, win 28000, options [mss 1400,sackOK,TS val 2836614698 ecr 0,nop,wscale 7], length 0
0x0000: 4500 003c f945 4000 fe06 9ced 0a00 021f E..<.E@.........
0x0010: c0a8 19c1 7208 1f90 bf1b b32e 0000 0000 ....r...........
0x0020: a002 6d60 f932 0000 0204 0578 0402 080a ..m`.2.....x....
0x0030: a913 4e2a 0000 0000 0103 0307 ..N*........
14:32:43.870168 IP (tos 0x0, ttl 254, id 63814, offset 0, flags [DF], proto TCP (6), length 60)
10.0.2.31.29192 > 192.168.25.193.webcache: Flags [S], cksum 0xf53f (correct), seq 3206263598, win 28000, options [mss 1400,sackOK,TS val 2836615709 ecr 0,nop,wscale 7], length 0
0x0000: 4500 003c f946 4000 fe06 9cec 0a00 021f E..<.F@.........
0x0010: c0a8 19c1 7208 1f90 bf1b b32e 0000 0000 ....r...........
0x0020: a002 6d60 f53f 0000 0204 0578 0402 080a ..m`.?.....x....
0x0030: a913 521d 0000 0000 0103 0307 ..R.........
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
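The hex dump can also be decoded by hand. Below is a minimal Python sketch (standard library only) that unpacks the IPv4 and TCP headers of the first captured packet. The bytes are copied from the dump above; the field layout is the standard IPv4/TCP header format.

```python
import socket
import struct

# First 40 bytes of the first captured packet (IPv4 header + TCP header start).
HEX = (
    "4500 003c f945 4000 fe06 9ced 0a00 021f "
    "c0a8 19c1 7208 1f90 bf1b b32e 0000 0000 "
    "a002 6d60 f932 0000"
)


def decode(hex_text):
    raw = bytes.fromhex(hex_text.replace(" ", ""))
    # IPv4 header: version/IHL, TOS, length, id, flags/frag, TTL, proto, csum, src, dst
    (ver_ihl, _tos, _length, _ident, _frag,
     ttl, proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl = (ver_ihl & 0x0F) * 4  # IP header length in bytes
    # TCP header start: ports, sequence, ack number, data offset + flags
    sport, dport, seq, _ack, off_flags = struct.unpack("!HHLLH", raw[ihl:ihl + 14])
    return {
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "ttl": ttl,
        "proto": proto,
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "syn": bool(off_flags & 0x002),
        "ack": bool(off_flags & 0x010),
    }


packet = decode(HEX)
print(packet)
```

The decoded packet is a bare SYN from 10.0.2.31 to 192.168.25.193 port 8080. The second captured packet carries the same sequence number one second later, i.e. a retransmitted SYN, which is consistent with no SYN-ACK getting back to the sender.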
Even the calia0da7d91dc2 interface on the host running my target/test pod shows increased RX packets and byte counts whenever I run the curl from the other pod on the other host. Traffic is definitely traversing.
[ec2-user@ip-10-0-3-222 ~]$ ifconfig
calia0da7d91dc2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1440
inet6 fe80::ecee:eeff:feee:eeee prefixlen 64 scopeid 0x20<link>
ether ee:ee:ee:ee:ee:ee txqueuelen 0 (Ethernet)
RX packets 84 bytes 5088 (4.9 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
What is preventing the networking from working between hosts here? Am I missing something obvious?
Edit 2 - information for Arjun Pandey- parjun8840
Here is some more info about my Calico configuration:
I have disabled source/destination checking on all AWS EC2 worker nodes
I've followed the latest calico docs to configure the IP pool for cross-subnet and NAT use for traffic outside the cluster
calicoctl configs Note: it seems that the workloadendpoints are non-existent...
me@mine ~ aws-vault exec my-vault-entry -- kubectl get IPPool --all-namespaces
NAME AGE
default-ipv4-ippool 1d
me@mine ~ aws-vault exec my-vault-entry -- kubectl get IPPool default-ipv4-ippool -o yaml
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  annotations:
    projectcalico.org/metadata: '{"uid":"41bd2c82-d576-11e9-b1ef-121f3d7b4d4e","creationTimestamp":"2019-09-12T15:59:09Z"}'
  creationTimestamp: "2019-09-12T15:59:09Z"
  generation: 1
  name: default-ipv4-ippool
  resourceVersion: "500448"
  selfLink: /apis/crd.projectcalico.org/v1/ippools/default-ipv4-ippool
  uid: 41bd2c82-d576-11e9-b1ef-121f3d7b4d4e
spec:
  blockSize: 26
  cidr: 192.168.0.0/16
  ipipMode: CrossSubnet
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Never
me@mine ~ aws-vault exec my-vault-entry -- calicoctl get nodes
NAME
ip-10-254-109-184.ec2.internal
ip-10-254-109-237.ec2.internal
ip-10-254-111-147.ec2.internal
me@mine ~ aws-vault exec my-vault-entry -- calicoctl get workloadendpoints
WORKLOAD NODE NETWORKS INTERFACE
me@mine ~
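Since the pool uses ipipMode: CrossSubnet, whether traffic between two nodes is IP-in-IP encapsulated depends on whether they share a subnet. Here is a minimal Python sketch of that decision, using the two worker addresses and /24 prefixes from the routing tables earlier in the question. The function name is hypothetical and this is not Calico's code.

```python
import ipaddress


def uses_ipip(node_a, node_b, mode):
    """Decide whether traffic from node_a to node_b would be IPIP-encapsulated.

    node_a, node_b: host addresses with prefix length, e.g. "10.0.2.31/24".
    mode: the IPPool ipipMode value ("Always", "CrossSubnet" or "Never").
    """
    a = ipaddress.ip_interface(node_a)
    b = ipaddress.ip_interface(node_b)
    if mode == "Always":
        return True
    if mode == "Never":
        return False
    if mode == "CrossSubnet":
        # Encapsulate only when the peer is outside our own subnet.
        return b.ip not in a.network
    raise ValueError("unknown ipipMode: " + mode)


print(uses_ipip("10.0.2.31/24", "10.0.3.222/24", "CrossSubnet"))
print(uses_ipip("10.0.2.31/24", "10.0.2.41/24", "CrossSubnet"))
```

The two workers sit in 10.0.2.0/24 and 10.0.3.0/24, so pod traffic between them goes through tunl0, as the routes show. IP-in-IP is IP protocol 4, neither TCP nor UDP, so it is worth checking that the security groups allow it; the post does not establish whether that is the cause here.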
Here is some network info for a sample host in the cluster and one of the test container's container network:
host ip a
[ec2-user@ip-10-254-109-184 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:1b:79:d1:c5:bc brd ff:ff:ff:ff:ff:ff
inet 10.254.109.184/26 brd 10.254.109.191 scope global dynamic eth0
valid_lft 2881sec preferred_lft 2881sec
inet6 fe80::1b:79ff:fed1:c5bc/64 scope link
valid_lft forever preferred_lft forever
3: eni808caba7453@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
link/ether c2:be:80:d4:6a:f3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::c0be:80ff:fed4:6af3/64 scope link
valid_lft forever preferred_lft forever
5: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
inet 192.168.29.128/32 brd 192.168.29.128 scope global tunl0
valid_lft forever preferred_lft forever
6: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:12:58:bb:c6:1a brd ff:ff:ff:ff:ff:ff
inet 10.254.109.137/26 brd 10.254.109.191 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::12:58ff:febb:c61a/64 scope link
valid_lft forever preferred_lft forever
7: enia6f1918d9e2@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
link/ether 96:f5:36:53:e9:55 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::94f5:36ff:fe53:e955/64 scope link
valid_lft forever preferred_lft forever
8: enia32d23ac2d1@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
link/ether 36:5e:34:a7:82:30 brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::345e:34ff:fea7:8230/64 scope link
valid_lft forever preferred_lft forever
9: cali5e7dde1e39e@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 3
inet6 fe80::ecee:eeff:feee:eeee/64 scope link
valid_lft forever preferred_lft forever
[ec2-user@ip-10-254-109-184 ~]$
nsenter on the test container pid to get ip a info:
[ec2-user@ip-10-254-109-184 ~]$ sudo nsenter -t 15715 -n ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
link/ether 9a:6d:db:06:74:cb brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.29.129/32 scope global eth0
valid_lft forever preferred_lft forever
I am not sure about the exact solution right now (I haven't tested Calico on AWS; normally I use amazon-vpc-cni-k8s on AWS and Calico on physical clusters), but below are some quick things we can look into.
Calico AWS requirement- https://docs.projectcalico.org/v2.3/reference/public-cloud/aws
kubectl get IPPool --all-namespaces
NAME AGE
default-ipv4-ippool 15d
kubectl get IPPool default-ipv4-ippool -o yaml
~ calicoctl get nodes
NAME
node1
node2
node3
node4
~ calicoctl get workloadendpoints
NODE ORCHESTRATOR WORKLOAD NAME
node2 k8s default.myapp-569c54f85-xtktk eth0
node1 k8s kube-system.calico-kube-controllers-5cbcccc885-b9x8s eth0
node1 k8s kube-system.coredns-fb8b8dcde-2zpw8 eth0
node1 k8s kube-system.coredns-fb8b8dcfg-hc6zv eth0
Also if we can get the detail of container network:
nsenter -t pid -n ip a
And for the host as well:
ip a

Building a Bare Metal Kubernetes Cluster with kubeadm

I am trying to build a 3 master, 3 worker Kubernetes Cluster, with 3 separate etcd servers.
[root@K8sMaster01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8smaster01 Ready master 5h v1.11.1
k8smaster02 Ready master 4h v1.11.1
k8smaster03 Ready master 4h v1.11.1
k8snode01 Ready <none> 4h v1.11.1
k8snode02 Ready <none> 4h v1.11.1
k8snode03 Ready <none> 4h v1.11.1
I have spent weeks trying to get this to work, but cannot get beyond one problem.
The containers / pods cannot access the API server.
[root@K8sMaster01 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:50:16Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
[root@K8sMaster01 ~]# cat /etc/redhat-release
Fedora release 28 (Twenty Eight)
[root@K8sMaster01 ~]# uname -a
Linux K8sMaster01 4.16.3-301.fc28.x86_64 #1 SMP Mon Apr 23 21:59:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-c2wbh 1/1 Running 1 4h
coredns-78fcdf6894-psbtq 1/1 Running 1 4h
heapster-77f99d6b7c-5pxj6 1/1 Running 0 4h
kube-apiserver-k8smaster01 1/1 Running 1 4h
kube-apiserver-k8smaster02 1/1 Running 1 4h
kube-apiserver-k8smaster03 1/1 Running 1 4h
kube-controller-manager-k8smaster01 1/1 Running 1 4h
kube-controller-manager-k8smaster02 1/1 Running 1 4h
kube-controller-manager-k8smaster03 1/1 Running 1 4h
kube-flannel-ds-amd64-542x6 1/1 Running 0 4h
kube-flannel-ds-amd64-6dw2g 1/1 Running 4 4h
kube-flannel-ds-amd64-h6j9b 1/1 Running 1 4h
kube-flannel-ds-amd64-mgggx 1/1 Running 0 3h
kube-flannel-ds-amd64-p8xfk 1/1 Running 0 4h
kube-flannel-ds-amd64-qp86h 1/1 Running 4 4h
kube-proxy-4bqxh 1/1 Running 0 3h
kube-proxy-56p4n 1/1 Running 0 3h
kube-proxy-7z8p7 1/1 Running 0 3h
kube-proxy-b59ns 1/1 Running 0 3h
kube-proxy-fc6zg 1/1 Running 0 3h
kube-proxy-wrxg7 1/1 Running 0 3h
kube-scheduler-k8smaster01 1/1 Running 1 4h
kube-scheduler-k8smaster02 1/1 Running 1 4h
kube-scheduler-k8smaster03 1/1 Running 1 4h
**kubernetes-dashboard-6948bdb78-4f7qj 1/1 Running 19 1h**
node-problem-detector-v0.1-77fdw 1/1 Running 0 4h
node-problem-detector-v0.1-96pld 1/1 Running 1 4h
node-problem-detector-v0.1-ctnfn 1/1 Running 0 3h
node-problem-detector-v0.1-q2xvw 1/1 Running 0 4h
node-problem-detector-v0.1-vvf4j 1/1 Running 1 4h
traefik-ingress-controller-7w44f 1/1 Running 0 4h
traefik-ingress-controller-8cprj 1/1 Running 1 4h
traefik-ingress-controller-f6c7q 1/1 Running 0 3h
traefik-ingress-controller-tf8zw 1/1 Running 0 4h
kube-ops-view-6744bdc77d-2x5w8 1/1 Running 0 2h
kube-ops-view-redis-74578dcc5d-5fnvf 1/1 Running 0 2h
The kubernetes-dashboard will not start, and the same goes for kube-ops-view. CoreDNS also has errors. All of this looks network-related to me. I have tried:
sudo iptables -P FORWARD ACCEPT
sudo iptables --policy FORWARD ACCEPT
sudo iptables -A FORWARD -o flannel.1 -j ACCEPT
CoreDNS gives this error in the logs:
[root@K8sMaster01 ~]# kubectl logs coredns-78fcdf6894-c2wbh -n kube-system
.:53
2018/08/26 15:15:28 [INFO] CoreDNS-1.1.3
2018/08/26 15:15:28 [INFO] linux/amd64, go1.10.1, b0fd575c
2018/08/26 15:15:28 [INFO] plugin/reload: Running configuration MD5 = 2a066f12ec80aeb2b92740dd74c17138
CoreDNS-1.1.3
linux/amd64, go1.10.1, b0fd575c
E0826 17:12:19.624560 1 reflector.go:322] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to watch *v1.Service: Get https://10.96.0.1:443/api/v1/services?resourceVersion=556&timeoutSeconds=389&watch=true: dial tcp 10.96.0.1:443: i/o timeout
2018/08/26 17:35:34 [ERROR] 2 kube-ops-view-redis.uk.specsavers.com. A: unreachable backend: read udp 10.96.0.7:46862->10.4.4.28:53: i/o timeout
2018/08/26 17:35:34 [ERROR] 2 kube-ops-view-redis.uk.specsavers.com. AAAA: unreachable backend: read udp 10.96.0.7:46690->10.4.4.28:53: i/o timeout
2018/08/26 17:35:37 [ERROR] 2 kube-ops-view-redis.uk.specsavers.com. AAAA: unreachable backend: read udp 10.96.0.7:60267->10.4.4.28:53: i/o timeout
2018/08/26 17:35:37 [ERROR] 2 kube-ops-view-redis.uk.specsavers.com. A: unreachable backend: read udp 10.96.0.7:41482->10.4.4.28:53: i/o timeout
2018/08/26 17:36:58 [ERROR] 2 kube-ops-view-redis.specsavers.local. AAAA: unreachable backend: read udp 10.96.0.7:58042->10.4.4.28:53: i/o timeout
2018/08/26 17:36:58 [ERROR] 2 kube-ops-view-redis.specsavers.local. A: unreachable backend: read udp 10.96.0.7:53149->10.4.4.28:53: i/o timeout
2018/08/26 17:37:01 [ERROR] 2 kube-ops-view-redis.specsavers.local. A: unreachable backend: read udp 10.96.0.7:36861->10.4.4.28:53: i/o timeout
2018/08/26 17:37:01 [ERROR] 2 kube-ops-view-redis.specsavers.local. AAAA: unreachable backend: read udp 10.96.0.7:43235->10.4.4.28:53: i/o timeout
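One way to read these errors is to pull out the source and destination of each failed upstream query. The sketch below parses three of the log lines above and groups them. The regular expression is a hypothetical pattern written for this log format, not part of CoreDNS. In this sample every timeout targets the same upstream, 10.4.4.28:53, while the source port changes on every query, which suggests those ports are ephemeral client ports (presumably of the CoreDNS pod at 10.96.0.7) rather than anything listening there.

```python
import re
from collections import Counter

# Three lines copied verbatim from the CoreDNS log above.
LOG = """\
2018/08/26 17:35:34 [ERROR] 2 kube-ops-view-redis.uk.specsavers.com. A: unreachable backend: read udp 10.96.0.7:46862->10.4.4.28:53: i/o timeout
2018/08/26 17:35:34 [ERROR] 2 kube-ops-view-redis.uk.specsavers.com. AAAA: unreachable backend: read udp 10.96.0.7:46690->10.4.4.28:53: i/o timeout
2018/08/26 17:36:58 [ERROR] 2 kube-ops-view-redis.specsavers.local. AAAA: unreachable backend: read udp 10.96.0.7:58042->10.4.4.28:53: i/o timeout
"""

# Hypothetical pattern written for this log format; not a CoreDNS API.
PATTERN = re.compile(
    r"\[ERROR\] \d+ (?P<name>\S+) (?P<qtype>AAAA|A): unreachable backend: "
    r"read udp (?P<src>[\d.]+):(?P<sport>\d+)->(?P<dst>[\d.]+):(?P<dport>\d+)"
)


def summarize(log_text):
    """Count failed queries per upstream and collect the source ports used."""
    upstreams = Counter()
    source_ports = set()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not upstream timeouts
        upstreams[f"{match['dst']}:{match['dport']}"] += 1
        source_ports.add(int(match["sport"]))
    return upstreams, source_ports


upstreams, source_ports = summarize(LOG)
print(dict(upstreams))       # failures grouped by upstream resolver
print(sorted(source_ports))  # a different client port for each query
```

If that reading is right, connecting to one of those source ports afterwards would not be expected to succeed, since nothing keeps listening on them.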
The dashboard:
[root#K8sMaster01 ~]# kubectl logs kubernetes-dashboard-6948bdb78-4f7qj -n kube-system
2018/08/26 20:10:31 Starting overwatch
2018/08/26 20:10:31 Using in-cluster config to connect to apiserver
2018/08/26 20:10:31 Using service account token for csrf signing
2018/08/26 20:10:31 No request provided. Skipping authorization
2018/08/26 20:11:01 Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
Refer to our FAQ and wiki pages for more information: https://github.com/kubernetes/dashboard/wiki/FAQ
kube-ops-view:
ERROR:kube_ops_view.update:Failed to query cluster 10-96-0-1:443 (https://10.96.0.1:443): ConnectTimeout (try 141, wait 63 seconds)
10.96.3.1 - - [2018-08-26 20:12:34] "GET /health HTTP/1.1" 200 117 0.001002
10.96.3.1 - - [2018-08-26 20:12:44] "GET /health HTTP/1.1" 200 117 0.000921
10.96.3.1 - - [2018-08-26 20:12:54] "GET /health HTTP/1.1" 200 117 0.000926
10.96.3.1 - - [2018-08-26 20:13:04] "GET /health HTTP/1.1" 200 117 0.000924
10.96.3.1 - - [2018-08-26 20:13:14] "GET /health HTTP/1.1" 200 117 0.000942
10.96.3.1 - - [2018-08-26 20:13:24] "GET /health HTTP/1.1" 200 117 0.000924
10.96.3.1 - - [2018-08-26 20:13:34] "GET /health HTTP/1.1" 200 117 0.000939
ERROR:kube_ops_view.update:Failed to query cluster 10-96-0-1:443 (https://10.96.0.1:443): ConnectTimeout (try 142, wait 61 seconds)
Flannel has created the networks:
[root#K8sMaster01 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:9a:80:f7 brd ff:ff:ff:ff:ff:ff
    inet 10.34.88.182/24 brd 10.34.88.255 scope global dynamic ens192
       valid_lft 7071sec preferred_lft 7071sec
    inet 10.10.40.90/24 brd 10.10.40.255 scope global ens192:1
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe9a:80f7/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:cf:ec:b3:ee brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 06:df:1e:87:b8:ee brd ff:ff:ff:ff:ff:ff
    inet 10.96.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::4df:1eff:fe87:b8ee/64 scope link
       valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:58:0a:60:00:01 brd ff:ff:ff:ff:ff:ff
    inet 10.96.0.1/24 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::8c77:39ff:fe6e:8710/64 scope link
       valid_lft forever preferred_lft forever
7: veth9527916b@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 46:62:b6:b8:b9:ac brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::4462:b6ff:feb8:b9ac/64 scope link
       valid_lft forever preferred_lft forever
8: veth6e6f08f5@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 3e:a5:4b:8d:11:ce brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::3ca5:4bff:fe8d:11ce/64 scope link
       valid_lft forever preferred_lft forever
I can ping the IP:
[root#K8sMaster01 ~]# ping 10.96.0.1
PING 10.96.0.1 (10.96.0.1) 56(84) bytes of data.
64 bytes from 10.96.0.1: icmp_seq=1 ttl=64 time=0.052 ms
64 bytes from 10.96.0.1: icmp_seq=2 ttl=64 time=0.032 ms
64 bytes from 10.96.0.1: icmp_seq=3 ttl=64 time=0.042 ms
and telnet to the port:
[root#K8sMaster01 ~]# telnet 10.96.0.1 443
Trying 10.96.0.1...
Connected to 10.96.0.1.
Escape character is '^]'.
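Note that the ping and telnet above were run from the master host itself, whereas the timeouts in the logs come from pods. A minimal sketch of the same TCP check, which could be run from inside a pod where telnet may not be installed, is below. The function name `tcp_reachable` and the 3-second default timeout are illustrative choices; the code only relies on Python's standard `socket` module.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake, like `telnet host port`.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

From a pod that has Python available, `tcp_reachable("10.96.0.1", 443)` returning False would match the `dial tcp 10.96.0.1:443: i/o timeout` errors, while True from the host and False from a pod would point at the pod-to-service path rather than the API server itself.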
Someone PLEASE save my bank holiday weekend and tell me what is going wrong!
As requested here is my get services:
[root#K8sMaster01 ~]# kubectl get services --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default blackbox-database ClusterIP 10.110.56.121 <none> 3306/TCP 5h
default kube-ops-view ClusterIP 10.105.35.23 <none> 82/TCP 1d
default kube-ops-view-redis ClusterIP 10.107.254.193 <none> 6379/TCP 1d
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 1d
kube-system heapster ClusterIP 10.103.5.79 <none> 80/TCP 1d
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 1d
kube-system kubernetes-dashboard ClusterIP 10.96.220.152 <none> 443/TCP 1d
kube-system traefik-ingress-service ClusterIP 10.102.84.167 <none> 80/TCP,8080/TCP 1d
liab-live-bb blackbox-application ClusterIP 10.98.40.25 <none> 8080/TCP 5h
liab-live-bb blackbox-database ClusterIP 10.108.43.196 <none> 3306/TCP 5h
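One thing that may be worth checking with these addresses: the `kubernetes` service ClusterIP (10.96.0.1) is the same address that `ip addr show` reports on cni0 (10.96.0.1/24). The sketch below uses Python's `ipaddress` module to test whether the pod subnet on that node overlaps the service range. The 10.96.0.0/12 service range is an assumption here (it is kubeadm's default and also the range named in the Corefile further down); the other values are copied from the outputs above.

```python
import ipaddress

# Copied from the `ip addr show` output on K8sMaster01.
cni0 = ipaddress.ip_interface("10.96.0.1/24")

# Copied from `kubectl get services`: ClusterIP of default/kubernetes.
kubernetes_svc = ipaddress.ip_address("10.96.0.1")

# Assumption: the service range is kubeadm's default, 10.96.0.0/12.
service_cidr = ipaddress.ip_network("10.96.0.0/12")

# The subnet that cni0 sits in, i.e. the pod subnet on this node.
pod_subnet = cni0.network

subnet_overlaps = pod_subnet.overlaps(service_cidr)
same_address = kubernetes_svc == cni0.ip

print(f"pod subnet {pod_subnet} overlaps service range {service_cidr}: {subnet_overlaps}")
print(f"kubernetes ClusterIP equals the cni0 address: {same_address}")
```

If both checks print True, a successful ping of 10.96.0.1 from the host may simply be the local cni0 interface replying rather than anything service-related. That is a hypothesis to verify against the flannel network config and the cluster's pod and service CIDR settings, not a confirmed diagnosis.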
Telnet to port 46690:
[root#K8sMaster01 ~]# telnet 10.96.0.7 46690
Trying 10.96.0.7...
(no response)
Today I tried deploying two of my applications to the cluster, as can be seen in the get services output. The "app" is unable to connect to the "db"; it cannot resolve the DB service name. I believe I have a networking issue, but I am not sure whether it is at the host level or within the Kubernetes layer. I did notice that my resolv.conf files were not pointing to localhost, and I found some changes to make to the CoreDNS config. When I looked at its configuration, it was trying to point to an IPv6 address, so I changed it to this:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local 10.96.0.0/12 {
           pods insecure
        }
        prometheus :9153
        proxy 10.4.4.28
        cache 30
        reload
    }
kind: ConfigMap
metadata:
  creationTimestamp: 2018-08-27T12:28:57Z
  name: coredns
  namespace: kube-system
  resourceVersion: "174571"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: c5016361-a9f4-11e8-b0b4-0050569afad9
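The names in the CoreDNS errors (`kube-ops-view-redis.uk.specsavers.com.`, `kube-ops-view-redis.specsavers.local.`) look like a short service name expanded with the resolver's search domains. A rough sketch of that expansion is below, following the usual resolv.conf `search`/`ndots` rules. The search list and `ndots:5` are assumptions: the first three entries are what Kubernetes typically writes into a pod's resolv.conf for the `default` namespace, and the last two are inferred from the log lines, not read from an actual file.

```python
# Assumed pod resolv.conf settings (not taken from a real file in this post).
SEARCH = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "uk.specsavers.com",   # inferred from the CoreDNS error log
    "specsavers.local",    # inferred from the CoreDNS error log
]
NDOTS = 5


def candidate_queries(name, search=SEARCH, ndots=NDOTS):
    """Return the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already absolute: the search list is not applied
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as-is first
    return expanded + [absolute]      # few dots: walk the search list first


for query in candidate_queries("kube-ops-view-redis"):
    print(query)
```

Under these assumptions, the candidates under cluster.local are handled by the `kubernetes` plugin, while those ending in the host's own domains would be forwarded to the upstream resolver (10.4.4.28 in the log), which is where the timeouts show up.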