VPN to access cluster services / pods : cannot ping anything except openvpn server - kubernetes

I'm trying to set up a VPN to access my cluster's workloads without exposing public endpoints.
The service is deployed using the OpenVPN Helm chart, and Kubernetes using Rancher v2.3.2, with these changes:
replacing the L4 load balancer with a simple service discovery
editing the configMap to allow TCP to go through the load balancer and reach the VPN
What does / doesn't work:
OpenVPN client can connect successfully
Cannot ping public servers
Cannot ping Kubernetes services or pods
Can ping openvpn cluster IP "10.42.2.11"
My files
vars.yml
---
replicaCount: 1
nodeSelector:
  openvpn: "true"
openvpn:
  OVPN_K8S_POD_NETWORK: "10.42.0.0"
  OVPN_K8S_POD_SUBNET: "255.255.0.0"
  OVPN_K8S_SVC_NETWORK: "10.43.0.0"
  OVPN_K8S_SVC_SUBNET: "255.255.0.0"
persistence:
  storageClass: "local-path"
service:
  externalPort: 444
The connection works, but I'm not able to reach any IP inside my cluster.
The only IP I'm able to reach is the OpenVPN cluster IP.
openvpn.conf:
server 10.240.0.0 255.255.0.0
verb 3
key /etc/openvpn/certs/pki/private/server.key
ca /etc/openvpn/certs/pki/ca.crt
cert /etc/openvpn/certs/pki/issued/server.crt
dh /etc/openvpn/certs/pki/dh.pem
key-direction 0
keepalive 10 60
persist-key
persist-tun
proto tcp
port 443
dev tun0
status /tmp/openvpn-status.log
user nobody
group nogroup
push "route 10.42.2.11 255.255.255.255"
push "route 10.42.0.0 255.255.0.0"
push "route 10.43.0.0 255.255.0.0"
push "dhcp-option DOMAIN-SEARCH openvpn.svc.cluster.local"
push "dhcp-option DOMAIN-SEARCH svc.cluster.local"
push "dhcp-option DOMAIN-SEARCH cluster.local"
client.ovpn
client
nobind
dev tun
remote xxxx xxx tcp
CERTS CERTS
dhcp-option DOMAIN openvpn.svc.cluster.local
dhcp-option DOMAIN svc.cluster.local
dhcp-option DOMAIN cluster.local
dhcp-option DOMAIN online.net
I don't really know how to debug this.
I'm using Windows.
Output of the route command on the client:
Destination Gateway Genmask Flags Metric Ref Use Ifac
0.0.0.0 livebox.home 255.255.255.255 U 0 0 0 eth0
192.168.1.0 0.0.0.0 255.255.255.0 U 256 0 0 eth0
192.168.1.17 0.0.0.0 255.255.255.255 U 256 0 0 eth0
192.168.1.255 0.0.0.0 255.255.255.255 U 256 0 0 eth0
224.0.0.0 0.0.0.0 240.0.0.0 U 256 0 0 eth0
255.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 eth0
224.0.0.0 0.0.0.0 240.0.0.0 U 256 0 0 eth1
255.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 eth1
0.0.0.0 10.240.0.5 255.255.255.255 U 0 0 0 eth1
10.42.2.11 10.240.0.5 255.255.255.255 U 0 0 0 eth1
10.42.0.0 10.240.0.5 255.255.0.0 U 0 0 0 eth1
10.43.0.0 10.240.0.5 255.255.0.0 U 0 0 0 eth1
10.240.0.1 10.240.0.5 255.255.255.255 U 0 0 0 eth1
127.0.0.0 0.0.0.0 255.0.0.0 U 256 0 0 lo
127.0.0.1 0.0.0.0 255.255.255.255 U 256 0 0 lo
127.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 lo
224.0.0.0 0.0.0.0 240.0.0.0 U 256 0 0 lo
255.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 lo
And finally ifconfig
inet 192.168.1.17 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 2a01:cb00:90c:5300:603c:f8:703e:a876 prefixlen 64 scopeid 0x0<global>
inet6 2a01:cb00:90c:5300:d84b:668b:85f3:3ba2 prefixlen 128 scopeid 0x0<global>
inet6 fe80::603c:f8:703e:a876 prefixlen 64 scopeid 0xfd<compat,link,site,host>
ether 00:d8:61:31:22:32 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.240.0.6 netmask 255.255.255.252 broadcast 10.240.0.7
inet6 fe80::b9cf:39cc:f60a:9db2 prefixlen 64 scopeid 0xfd<compat,link,site,host>
ether 00:ff:42:04:53:4d (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 1500
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0xfe<compat,link,site,host>
loop (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

For anybody looking for a working sample, this goes into your openvpn deployment alongside your container definition:
initContainers:
- args:
  - -w
  - net.ipv4.ip_forward=1
  command:
  - sysctl
  image: busybox
  name: openvpn-sidecar
  securityContext:
    privileged: true

I don't know if it is the RIGHT answer,
but I got it to work by adding a sidecar to my pods that runs
sysctl -w net.ipv4.ip_forward=1
which solved the issue.

You can set ipForwardInitContainer option to "true" in values.yaml
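
Both answers above come down to enabling IPv4 forwarding in the network namespace where OpenVPN runs. As a rough way to confirm whether the setting took effect, the sketch below reads the standard Linux procfs entry behind net.ipv4.ip_forward. The function name and the idea of running it inside the OpenVPN pod are assumptions for illustration; the chart's image may not ship Python, in which case reading the same file with cat gives the same information.

```python
from pathlib import Path

# Standard Linux procfs location backing the net.ipv4.ip_forward sysctl.
IP_FORWARD_PATH = Path("/proc/sys/net/ipv4/ip_forward")


def ip_forwarding_enabled(path: Path = IP_FORWARD_PATH) -> bool:
    """Return True if the file at `path` contains "1" (forwarding enabled)."""
    return path.read_text().strip() == "1"
```

Calling ip_forwarding_enabled() with no argument checks the namespace the interpreter runs in. A result of False inside the VPN pod would be consistent with the symptom in the question: the client reaches the OpenVPN pod itself, but nothing behind it.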

Related

Problem with socket connect between raspberry (client) and host

I have a Raspberry Pi 3 Model B and would like to send data from it to a host. I use a Python program with the socket package.
The problem occurs when the Raspberry Pi is the client and my laptop is the server.
I get the following error:
OSError: [Errno 113] No route to host
Code client.py:
import socket

HOST = '192.168.0.107'
PORT = 5353

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.send(b'Hello, world')
    data = s.recv(1024)
print('Received', repr(data))
server.py:
import socket

hostname = socket.gethostname()
HOST = socket.gethostbyname(hostname)
print(HOST)
PORT = 5353

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # An empty host string binds to all local interfaces.
    s.bind(('', PORT))
    s.listen(1)
    conn, addr = s.accept()
    with conn:
        print('Connected by', addr)
        print(conn)
        while True:
            data = conn.recv(1024)
            print(data)
            if not data:
                break
            # Echo the received bytes back to the client.
            conn.sendall(data)
ifconfig on server:
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.107 netmask 255.255.255.0 broadcast 192.168.0.255
inet6 fe80::ab65:70bf:9921:1d4b prefixlen 64 scopeid 0x20<link>
ether 18:31:bf:51:9d:9c txqueuelen 1000 (Ethernet)
RX packets 213759 bytes 177479962 (169.2 MiB)
RX errors 0 dropped 27 overruns 0 frame 0
TX packets 144335 bytes 25485658 (24.3 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 43423 bytes 3729254 (3.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 43423 bytes 3729254 (3.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
When I swap them, i.e. the Raspberry Pi becomes the server, all messages are sent.
raspberry: sudo ufw status
Status: active
To Action From
-- ------ ----
SSH ALLOW Anywhere
OpenSSH ALLOW Anywhere
80 ALLOW Anywhere
443 ALLOW Anywhere
443/tcp ALLOW Anywhere
5353 ALLOW Anywhere
5353/tcp ALLOW Anywhere
SSH (v6) ALLOW Anywhere (v6)
OpenSSH (v6) ALLOW Anywhere (v6)
80 (v6) ALLOW Anywhere (v6)
443 (v6) ALLOW Anywhere (v6)
443/tcp (v6) ALLOW Anywhere (v6)
5353 (v6) ALLOW Anywhere (v6)
5353/tcp (v6) ALLOW Anywhere (v6)
raspberry: netstat -lntu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:53 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 :::53 :::* LISTEN
tcp6 0 0 ::1:631 :::* LISTEN
udp 0 0 0.0.0.0:33841 0.0.0.0:*
udp 0 0 0.0.0.0:53 0.0.0.0:*
udp 0 0 0.0.0.0:68 0.0.0.0:*
udp 0 0 0.0.0.0:631 0.0.0.0:*
udp 0 0 0.0.0.0:5353 0.0.0.0:*
udp6 0 0 :::48624 :::*
udp6 0 0 :::53 :::*
udp6 0 0 :::5353 :::*
I think my problem is that port 5353 is not open on tcp. But the command
sudo ufw allow 5353/tcp
does not help.
Also, I reset my ufw's rules:
sudo ufw reset
added new rules like
sudo ufw allow SSH
sudo ufw allow OpenSSH
sudo ufw allow 80
sudo ufw allow 443
sudo ufw allow 5353/tcp
and I disabled and re-enabled ufw.
That did not work either.
SOLVED:
My host machine runs Fedora, which has its own firewall (firewalld): link
What I did on the host:
sudo firewall-cmd --state
>> running
firewall-cmd --list-ports
>> [empty]
sudo firewall-cmd --add-port=5353/tcp --timeout 15m
>> success
firewall-cmd --list-ports
>> 5353/tcp
Then I launched the client on raspberry and got the data. Yippee!
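
For similar cases, it can help to classify how a connection attempt fails before touching firewall rules. The sketch below is a minimal probe, not part of the original post; the function name probe_tcp and its return labels are made up for illustration. On Linux, "unreachable" corresponds to errors such as Errno 113 (No route to host), which on a shared subnet often points at a firewall on the target rejecting the packet, while "refused" usually means the host answered but nothing is listening on that port.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"
    except socket.timeout:
        # No reply at all within the timeout (packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but no listener.
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is Errno 113 on Linux ("No route to host").
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        raise
    finally:
        s.close()
```

Run from the Raspberry Pi as probe_tcp('192.168.0.107', 5353), this would be expected to report "unreachable" before the firewall-cmd change and "open" afterwards, provided server.py is running.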

kubernetes: can't ping pods

I created a k8s cluster whose pod network configuration is podSubnet: 172.168.0.0/12
Then I found that I can't ping the pods' IPs.
For example, the metrics-server deployment:
# on k8s-master01 node:
$ kubectl get po -n kube-system -o wide
metrics-server-545b8b99c6-r2ql5 1/1 Running 0 5d1h 172.171.14.193 k8s-node02 <none> <none>
# ping 172.171.14.193 -c 2
PING 172.171.14.193 (172.171.14.193) 56(84) bytes of data.
^C
--- 172.171.14.193 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1016ms
# this is route table
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.180.104.1 0.0.0.0 UG 0 0 0 eth0
10.180.104.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.161.125.0 10.180.104.110 255.255.255.192 UG 0 0 0 tunl0
172.162.195.0 10.180.104.109 255.255.255.192 UG 0 0 0 tunl0
172.169.92.64 10.180.104.108 255.255.255.192 UG 0 0 0 tunl0
172.169.244.192 0.0.0.0 255.255.255.255 UH 0 0 0 cali06e1673851f
172.169.244.192 0.0.0.0 255.255.255.192 U 0 0 0 *
172.171.14.192 10.180.104.111 255.255.255.192 UG 0 0 0 tunl0
This shows that the metrics pod is hosted on k8s-node02. This is k8s-node02's route table:
# route -n
Kernel IP routing table of k8s-master01
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.180.104.1 0.0.0.0 UG 0 0 0 eth0
10.180.104.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
169.254.169.254 10.180.104.11 255.255.255.255 UGH 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.161.125.0 10.180.104.110 255.255.255.192 UG 0 0 0 tunl0
172.162.195.0 10.180.104.109 255.255.255.192 UG 0 0 0 tunl0
172.169.92.64 10.180.104.108 255.255.255.192 UG 0 0 0 tunl0
172.169.244.192 10.180.104.107 255.255.255.192 UG 0 0 0 tunl0
172.171.14.192 0.0.0.0 255.255.255.192 U 0 0 0 *
172.171.14.193 0.0.0.0 255.255.255.255 UH 0 0 0 cali872eed170f4
172.171.14.194 0.0.0.0 255.255.255.255 UH 0 0 0 cali7d7625dd37e
172.171.14.203 0.0.0.0 255.255.255.255 UH 0 0 0 calid4e258f95f6
172.171.14.204 0.0.0.0 255.255.255.255 UH 0 0 0 cali5cf96eb1028
In fact, none of the pods are reachable. I created a service based on an example deployment.
# kubectl describe svc my-service
Name: my-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=demo-nginx
Type: ClusterIP
IP Families: <none>
IP: 10.100.75.139
IPs: 10.100.75.139
Port: http 80/TCP
TargetPort: 80/TCP
Endpoints: 172.161.125.14:80,172.161.125.15:80,172.171.14.203:80
Session Affinity: None
Events: <none>
# ping 10.100.75.139 -c 1
PING 10.100.75.139 (10.100.75.139) 56(84) bytes of data.
64 bytes from 10.100.75.139: icmp_seq=1 ttl=64 time=0.077 ms
# nc -vz 10.100.75.139 80
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.
I suppose the root cause is the route table, but I'm not sure. Could you please help me fix this issue?
Please feel free to let me know if you need more information.
Thanks a lot in advance.
BR//
In connect mode, Ncat initiates a connection (or sends UDP data) to a service that is listening somewhere. For those familiar with socket programming, connect mode is like using the connect function. In listen mode, Ncat waits for an incoming connection (or data receipt), like using the bind and listen functions.
A "Connection timed out" response indicates that the connection attempt got no reply, which could mean a firewall is blocking the port. Test this by adding a rule that accepts connections on the required port.
There might also be issues with traffic rules: if you face the error again, try adding outbound traffic rules in your security groups. If the error persists even after adding them, some restrictive rule on the node is probably blocking access to the port.
Refer to Testing Network services for more information.
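
One detail of the question that is easy to check offline is the podSubnet value itself. The sketch below uses Python's ipaddress module to show which range 172.168.0.0/12 actually covers and whether the addresses in the posted output fall inside it. Whether this contributes to the connectivity problem is not established by the thread; the example only does the arithmetic.

```python
import ipaddress

POD_SUBNET = "172.168.0.0/12"  # value quoted in the question

# Pod and route addresses copied from the kubectl and route output above.
OBSERVED = ["172.171.14.193", "172.161.125.14", "172.162.195.0", "172.169.244.192"]


def is_aligned(cidr: str) -> bool:
    """True if the address part has no bits set below the prefix length."""
    try:
        ipaddress.ip_network(cidr)  # strict=True is the default
        return True
    except ValueError:
        return False


# strict=False masks the host bits, giving the range the prefix really covers.
effective = ipaddress.ip_network(POD_SUBNET, strict=False)
rfc1918_block = ipaddress.ip_network("172.16.0.0/12")

print("aligned:", is_aligned(POD_SUBNET))
print("effective range:", effective)
print("observed addresses inside it:",
      all(ipaddress.ip_address(a) in effective for a in OBSERVED))
print("overlaps private 172.16.0.0/12:", effective.overlaps(rfc1918_block))
```

The prefix is not aligned, and the range it covers is 172.160.0.0/12 (172.160.0.0 to 172.175.255.255). That matches the 172.161.x.x to 172.171.x.x addresses in the route tables, and it lies outside the RFC 1918 private block 172.16.0.0/12.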

Requests timing out when accessing a Kubernetes clusterIP service

I am looking for help to troubleshoot this basic scenario that isn't working OK:
Three nodes installed with kubeadm on VirtualBox VMs running on a MacBook:
sudo kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubernetes-master Ready master 4h v1.10.2
kubernetes-node1 Ready <none> 4h v1.10.2
kubernetes-node2 Ready <none> 34m v1.10.2
The Virtualbox VMs have 2 adapters: 1) Host-only 2) NAT. The node IP's from the guest computer are:
kubernetes-master (192.168.56.3)
kubernetes-node1 (192.168.56.4)
kubernetes-node2 (192.168.56.5)
I am using flannel pod network (I also tried Calico previously with the same result).
When installing the master node I used this command:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.56.3
I deployed an nginx application whose pods are up, one pod per node:
nginx-deployment-64ff85b579-sk5zs 1/1 Running 0 14m 10.244.2.2 kubernetes-node2
nginx-deployment-64ff85b579-sqjgb 1/1 Running 0 14m 10.244.1.2 kubernetes-node1
I exposed them as a ClusterIP service:
sudo kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 22m
nginx-deployment ClusterIP 10.98.206.211 <none> 80/TCP 14m
Now the problem:
I ssh into kubernetes-node1 and curl the service using the cluster IP:
ssh 192.168.56.4
---
curl 10.98.206.211
Sometimes the request goes fine, returning the nginx welcome page. I can see in the logs that these requests are always answered by the pod on the same node (kubernetes-node1). Other requests hang until they time out. I guess that these were sent to the pod on the other node (kubernetes-node2).
The same happens the other way around: when ssh'd into kubernetes-node2, the pod on that node logs the successful requests and the others time out.
It seems there is some kind of networking problem and nodes can't access pods on the other nodes. How can I fix this?
UPDATE:
I downscaled the number of replicas to 1, so now there is only one pod on kubernetes-node2
If I ssh into kubernetes-node2 all curls go fine. When in kubernetes-node1 all requests time out.
UPDATE 2:
kubernetes-master ifconfig
cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.0.1 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::20a0:c7ff:fe6f:8271 prefixlen 64 scopeid 0x20<link>
ether 0a:58:0a:f4:00:01 txqueuelen 1000 (Ethernet)
RX packets 10478 bytes 2415081 (2.4 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11523 bytes 2630866 (2.6 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:cd:ce:84:a9 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.56.3 netmask 255.255.255.0 broadcast 192.168.56.255
inet6 fe80::a00:27ff:fe2d:298f prefixlen 64 scopeid 0x20<link>
ether 08:00:27:2d:29:8f txqueuelen 1000 (Ethernet)
RX packets 20784 bytes 2149991 (2.1 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 26567 bytes 26397855 (26.3 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.3.15 netmask 255.255.255.0 broadcast 10.0.3.255
inet6 fe80::a00:27ff:fe09:f08a prefixlen 64 scopeid 0x20<link>
ether 08:00:27:09:f0:8a txqueuelen 1000 (Ethernet)
RX packets 12662 bytes 12491693 (12.4 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4507 bytes 297572 (297.5 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.0.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::c078:65ff:feb9:e4ed prefixlen 64 scopeid 0x20<link>
ether c2:78:65:b9:e4:ed txqueuelen 0 (Ethernet)
RX packets 6 bytes 444 (444.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6 bytes 444 (444.0 B)
TX errors 0 dropped 15 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 464615 bytes 130013389 (130.0 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 464615 bytes 130013389 (130.0 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
tunl0: flags=193<UP,RUNNING,NOARP> mtu 1440
tunnel txqueuelen 1000 (IPIP Tunnel)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
vethb1098eb3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet6 fe80::d8a3:a2ff:fedf:4d1d prefixlen 64 scopeid 0x20<link>
ether da:a3:a2:df:4d:1d txqueuelen 0 (Ethernet)
RX packets 10478 bytes 2561773 (2.5 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11538 bytes 2631964 (2.6 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
kubernetes-node1 ifconfig
cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.1.1 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::5cab:32ff:fe04:5b89 prefixlen 64 scopeid 0x20<link>
ether 0a:58:0a:f4:01:01 txqueuelen 1000 (Ethernet)
RX packets 199 bytes 41004 (41.0 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 331 bytes 56438 (56.4 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:0f:02:bb:ff txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.56.4 netmask 255.255.255.0 broadcast 192.168.56.255
inet6 fe80::a00:27ff:fe36:741a prefixlen 64 scopeid 0x20<link>
ether 08:00:27:36:74:1a txqueuelen 1000 (Ethernet)
RX packets 12834 bytes 9685221 (9.6 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 9114 bytes 1014758 (1.0 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.3.15 netmask 255.255.255.0 broadcast 10.0.3.255
inet6 fe80::a00:27ff:feb2:23a3 prefixlen 64 scopeid 0x20<link>
ether 08:00:27:b2:23:a3 txqueuelen 1000 (Ethernet)
RX packets 13263 bytes 12557808 (12.5 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 5065 bytes 341321 (341.3 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.1.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::7815:efff:fed6:1423 prefixlen 64 scopeid 0x20<link>
ether 7a:15:ef:d6:14:23 txqueuelen 0 (Ethernet)
RX packets 483 bytes 37506 (37.5 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 483 bytes 37506 (37.5 KB)
TX errors 0 dropped 15 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 3072 bytes 269588 (269.5 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3072 bytes 269588 (269.5 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
veth153293ec: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet6 fe80::70b6:beff:fe94:9942 prefixlen 64 scopeid 0x20<link>
ether 72:b6:be:94:99:42 txqueuelen 0 (Ethernet)
RX packets 81 bytes 19066 (19.0 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 129 bytes 10066 (10.0 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
kubernetes-node2 ifconfig
cni0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.244.2.1 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::4428:f5ff:fe8b:a76b prefixlen 64 scopeid 0x20<link>
ether 0a:58:0a:f4:02:01 txqueuelen 1000 (Ethernet)
RX packets 184 bytes 36782 (36.7 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 284 bytes 36940 (36.9 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:7f:e9:79:cd txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.56.5 netmask 255.255.255.0 broadcast 192.168.56.255
inet6 fe80::a00:27ff:feb7:ff54 prefixlen 64 scopeid 0x20<link>
ether 08:00:27:b7:ff:54 txqueuelen 1000 (Ethernet)
RX packets 12634 bytes 9466460 (9.4 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8961 bytes 979807 (979.8 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.3.15 netmask 255.255.255.0 broadcast 10.0.3.255
inet6 fe80::a00:27ff:fed8:9210 prefixlen 64 scopeid 0x20<link>
ether 08:00:27:d8:92:10 txqueuelen 1000 (Ethernet)
RX packets 12658 bytes 12491919 (12.4 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4544 bytes 297215 (297.2 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.2.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::c832:e4ff:fe3e:f616 prefixlen 64 scopeid 0x20<link>
ether ca:32:e4:3e:f6:16 txqueuelen 0 (Ethernet)
RX packets 111 bytes 8466 (8.4 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 111 bytes 8466 (8.4 KB)
TX errors 0 dropped 15 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 2940 bytes 258968 (258.9 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2940 bytes 258968 (258.9 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
UPDATE 3:
Kubelet logs:
kubernetes-master kubelet logs
kubernetes-node1 kubelet logs
kubernetes-node2 kubelet logs
IP Routes
Master
kubernetes-master:~$ ip route
default via 10.0.3.2 dev enp0s8 proto dhcp src 10.0.3.15 metric 100
10.0.3.0/24 dev enp0s8 proto kernel scope link src 10.0.3.15
10.0.3.2 dev enp0s8 proto dhcp scope link src 10.0.3.15 metric 100
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.56.0/24 dev enp0s3 proto kernel scope link src 192.168.56.3
Node1
kubernetes-node1:~$ ip route
default via 10.0.3.2 dev enp0s8 proto dhcp src 10.0.3.15 metric 100
10.0.3.0/24 dev enp0s8 proto kernel scope link src 10.0.3.15
10.0.3.2 dev enp0s8 proto dhcp scope link src 10.0.3.15 metric 100
10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink
10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.56.0/24 dev enp0s3 proto kernel scope link src 192.168.56.4
Node2
kubernetes-node2:~$ ip route
default via 10.0.3.2 dev enp0s8 proto dhcp src 10.0.3.15 metric 100
10.0.3.0/24 dev enp0s8 proto kernel scope link src 10.0.3.15
10.0.3.2 dev enp0s8 proto dhcp scope link src 10.0.3.15 metric 100
10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.56.0/24 dev enp0s3 proto kernel scope link src 192.168.56.5
iptables-save:
kubernetes-master iptables-save
kubernetes-node1 iptables-save
kubernetes-node2 iptables-save
I was running into a similar problem with my K8s cluster with Flannel. I had set up the VMs with a NAT NIC for internet connectivity and a Host-Only NIC for node-to-node communication. Flannel was choosing the NAT NIC by default for node-to-node communication, which obviously won't work in this scenario.
I modified the flannel manifest before deploying to add the --iface=enp0s8 argument,
pointing Flannel at the Host-Only NIC that should have been chosen (enp0s8 in my case). In your case it looks like enp0s3 would be the correct NIC. Node-to-node communication worked fine after that.
I failed to note that I also modified the kube-proxy manifest to include --cluster-cidr=10.244.0.0/16 and --proxy-mode=iptables, which appears to be required as well.
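
To see which NIC is affected in a given setup, one can compare the interface carrying the default route (which, per the flannel documentation, is what Flannel uses when --iface is not given) with the interface that actually reaches the other nodes. The sketch below does this on the ip route output that the question posted for kubernetes-node1. The helper names are invented for illustration and the parsing covers only the simple line shapes shown here.

```python
import ipaddress

# `ip route` output of kubernetes-node1, copied from the question.
IP_ROUTE_NODE1 = """\
default via 10.0.3.2 dev enp0s8 proto dhcp src 10.0.3.15 metric 100
10.0.3.0/24 dev enp0s8 proto kernel scope link src 10.0.3.15
10.0.3.2 dev enp0s8 proto dhcp scope link src 10.0.3.15 metric 100
10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink
10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.56.0/24 dev enp0s3 proto kernel scope link src 192.168.56.4
"""


def _device(fields):
    """Return the token following 'dev' in a split route line."""
    return fields[fields.index("dev") + 1]


def default_route_device(ip_route_text):
    """Interface used by the default route, or None if there is none."""
    for line in ip_route_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "default" and "dev" in fields:
            return _device(fields)
    return None


def device_for_address(ip_route_text, address):
    """Interface of the most specific non-default route containing address."""
    addr = ipaddress.ip_address(address)
    best = None
    for line in ip_route_text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "default" or "dev" not in fields:
            continue
        net = ipaddress.ip_network(fields[0], strict=False)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, _device(fields))
    return best[1] if best else None


print("default route uses:", default_route_device(IP_ROUTE_NODE1))
print("kubernetes-node2 is reached via:",
      device_for_address(IP_ROUTE_NODE1, "192.168.56.5"))
```

For the posted routes the default route goes through enp0s8 (the NAT adapter, where every VM has the same address 10.0.3.15), while the other nodes are reached through enp0s3. That is consistent with the suggestion above to set --iface=enp0s3 for this cluster.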
Flushing all firewall rules with iptables --flush and iptables -tnat --flush and then restarting Docker fixed it.
Check this GitHub issue: link
Based on your logs and the fact that you only had problems with connections between nodes, which go through Flannel, I guess you had a problem with the Flannel CNI during installation.
In the logs from node1 and master, I see the following messages:
Error adding network: open /run/flannel/subnet.env: no such file or directory
Error while adding to cni network: open /run/flannel/subnet.env: no such file or directory
The root cause may be a network problem between the VMs.
I recommend creating two networks for each instance in your cluster: one with NAT for Internet access and one Host-only for in-cluster communication.
Alternatively, you can use Bridge mode for the VM interfaces if your network allows it.
Finally, the only other suggestion I can offer is to remove all cluster components and initialize the cluster once more using the configuration mentioned above. That is the fastest way.
I had the same issue after a raw install of Kubernetes with flannel on a Raspberry Pi cluster.
The resolution was to disable the ufw firewall.

expose Kubernetes API to the rest of the network

ss -tnulp|grep 8443
tcp LISTEN 0 128 172.16.1.4:8443 *:* users:(("kube-apiserver",pid=29513,fd=5))
I have my API server running and I want to expose it to the rest of the network. This is the network config on my cluster:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.16.1.4 netmask 255.255.255.0 broadcast 172.16.1.255
inet6 fe80::f816:3eff:feb5:93a3 prefixlen 64 scopeid 0x20<link>
ether fa:16:3e:b5:93:a3 txqueuelen 1000 (Ethernet)
RX packets 218935 bytes 2518654013 (2.3 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 160281 bytes 33994810 (32.4 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 139.54.130.39 netmask 255.255.254.0 broadcast 139.54.131.255
inet6 3ffe:302:11:2:f816:3eff:fe46:ab28 prefixlen 64 scopeid 0x0<global>
inet6 fd12:1f4b:e0bf:10:f816:3eff:fe46:ab28 prefixlen 64 scopeid 0x0<global>
inet6 fd12:1f4b:e0bf:1:f816:3eff:fe46:ab28 prefixlen 64 scopeid 0x0<global>
inet6 fe80::f816:3eff:fe46:ab28 prefixlen 64 scopeid 0x20<link>
ether fa:16:3e:46:ab:28 txqueuelen 1000 (Ethernet)
RX packets 3227129 bytes 845879874 (806.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1072031 bytes 132806957 (126.6 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
The VM has an external IP, 139.54.130.39.
Any leads on how to do that?
Did you try using this option?
- --apiserver-advertise-address=139.54.130.39
kubectl on this network will then be able to reach the API server at 139.54.130.39.
How you apply this depends on your installation:
.......
If you installed the API server as a pod,
you can change the --advertise-address parameter (the kube-apiserver flag that kubeadm's --apiserver-advertise-address sets) in
/etc/kubernetes/manifests/kube-apiserver.yaml
or
list the kube-system pods to get the actual API server pod name and edit it (carefully):
kubectl get pod -n kube-system
kubectl edit pod -n kube-system kube-apiserver
........
If you installed the API server as a service, edit the systemd unit,
e.g.:
vim /etc/systemd/system/kube-apiserver.service
Edit:
ExecStart=/usr/local/bin/kube-apiserver \
  --bind-address=0.0.0.0 \
  --advertise-address=139.54.130.39
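
The reason --bind-address matters here can be read off the ss output in the question: the listener is bound to 172.16.1.4 only. A TCP listener accepts connections addressed to the IP it is bound to, or to any local IP if it is bound to the wildcard address. The sketch below encodes that rule; the function names are hypothetical, and it ignores port forwarding or DNAT, which could also make the port reachable.

```python
import ipaddress


def bound_host(local_addr: str) -> str:
    """Extract the host part of an ss-style 'host:port' local address."""
    host, _, _port = local_addr.rpartition(":")
    return host.strip("[]")  # IPv6 addresses are shown in brackets


def accepts_connections_to(local_addr: str, target_ip: str) -> bool:
    """True if a listener bound at local_addr would accept traffic to target_ip."""
    host = bound_host(local_addr)
    if host == "*":  # ss may print the wildcard address as '*'
        return True
    bound = ipaddress.ip_address(host)
    return bound.is_unspecified or bound == ipaddress.ip_address(target_ip)


# Local address from the ss output in the question, then the wildcard case.
print(accepts_connections_to("172.16.1.4:8443", "139.54.130.39"))
print(accepts_connections_to("0.0.0.0:8443", "139.54.130.39"))
```

With the binding from the question the first call returns False, and with 0.0.0.0 it returns True, which is what the --bind-address=0.0.0.0 line above aims for.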

kubernetes default gateway not routing to local network

I'm seeing a weird issue on kubernetes and I'm not sure how to debug it. The k8s environment was installed by kube-up for vsphere using the 2016-01-08 kube.vmdk
The symptom is that the dns for a container in a pod is not working correctly. When I logon to the kube-dns service to check the settings everything looks correct. When I ping outside the local network it works as it should but when I ping inside my local network it cannot reach any of the hosts.
For the following my host network is 10.1.1.x, the gateway / dns server is 10.1.1.1.
inside the kube-dns container:
(I can ping outside the network by ip and I can ping the gateway just fine. dns isn't working since the nameserver is unreachable)
kube@kubernetes-master:~$ kubectl --namespace=kube-system exec -ti kube-dns-v20-in2me -- /bin/sh
/ # cat /etc/resolv.conf
nameserver 10.1.1.1
options ndots:5
/ # ping google.com
^C
/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=54 time=13.542 ms
64 bytes from 8.8.8.8: seq=1 ttl=54 time=13.862 ms
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 13.542/13.702/13.862 ms
/ # ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1): 56 data bytes
^C
--- 10.1.1.1 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
/ # netstat -r
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
default 10.244.2.1 0.0.0.0 UG 0 0 0 eth0
10.244.2.0 * 255.255.255.0 U 0 0 0 eth0
/ # ping 10.244.2.1
PING 10.244.2.1 (10.244.2.1): 56 data bytes
64 bytes from 10.244.2.1: seq=0 ttl=64 time=0.249 ms
64 bytes from 10.244.2.1: seq=1 ttl=64 time=0.091 ms
^C
--- 10.244.2.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.091/0.170/0.249 ms
on the master:
kube@kubernetes-master:~$ netstat -r
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
default 10.1.1.1 0.0.0.0 UG 0 0 0 eth0
10.1.1.0 * 255.255.255.0 U 0 0 0 eth0
10.244.0.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
10.244.1.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
10.244.2.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
10.244.3.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
10.246.0.0 * 255.255.255.0 U 0 0 0 cbr0
172.17.0.0 * 255.255.0.0 U 0 0 0 docker0
kube@kubernetes-master:~$ ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.409 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=0.481 ms
^C
--- 10.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.409/0.445/0.481/0.036 ms
version:
kube@kubernetes-master:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.5", GitCommit:"5a0a696437ad35c133c0c8493f7e9d22b0f9b81b", GitTreeState:"clean", BuildDate:"2016-10-29T01:38:40Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.5", GitCommit:"5a0a696437ad35c133c0c8493f7e9d22b0f9b81b", GitTreeState:"clean", BuildDate:"2016-10-29T01:32:42Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
kubernetes-minion-2 (10.244.2.1):
(Per @der's response adding info from 10.244.2.1)
kube@kubernetes-minion-2:~$ ip addr show cbr0
5: cbr0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default
link/ether 8a:ef:b5:fc:28:f4 brd ff:ff:ff:ff:ff:ff
inet 10.244.2.1/24 scope global cbr0
valid_lft forever preferred_lft forever
inet6 fe80::38b5:44ff:fe8a:6d79/64 scope link
valid_lft forever preferred_lft forever
kube@kubernetes-minion-2:~$ ping google.com
PING google.com (216.58.192.14) 56(84) bytes of data.
64 bytes from nuq04s29-in-f14.1e100.net (216.58.192.14): icmp_seq=1 ttl=52 time=11.8 ms
64 bytes from nuq04s29-in-f14.1e100.net (216.58.192.14): icmp_seq=2 ttl=52 time=11.6 ms
64 bytes from nuq04s29-in-f14.1e100.net (216.58.192.14): icmp_seq=3 ttl=52 time=10.4 ms
^C
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 10.477/11.343/11.878/0.624 ms
kube@kubernetes-minion-2:~$ ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.369 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=0.456 ms
64 bytes from 10.1.1.1: icmp_seq=3 ttl=64 time=0.442 ms
^C
--- 10.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.369/0.422/0.456/0.041 ms
kube@kubernetes-minion-2:~$ netstat -r
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
default 10.1.1.1 0.0.0.0 UG 0 0 0 eth0
10.1.1.0 * 255.255.255.0 U 0 0 0 eth0
10.244.0.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
10.244.1.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
10.244.2.0 * 255.255.255.0 U 0 0 0 cbr0
10.244.3.0 kubernetes-mini 255.255.255.0 UG 0 0 0 eth0
172.17.0.0 * 255.255.0.0 U 0 0 0 docker0
kube@kubernetes-minion-2:~$ routel
target gateway source proto scope dev tbl
default 10.1.1.1 eth0
10.1.1.0 24 10.1.1.86 kernel link eth0
10.244.0.0 24 10.1.1.88 eth0
10.244.1.0 24 10.1.1.87 eth0
10.244.2.0 24 10.244.2.1 kernel link cbr0
10.244.3.0 24 10.1.1.85 eth0
172.17.0.0 16 172.17.0.1 kernel linkdocker0
10.1.1.0 broadcast 10.1.1.86 kernel link eth0 local
10.1.1.86 local 10.1.1.86 kernel host eth0 local
10.1.1.255 broadcast 10.1.1.86 kernel link eth0 local
10.244.2.0 broadcast 10.244.2.1 kernel link cbr0 local
10.244.2.1 local 10.244.2.1 kernel host cbr0 local
10.244.2.255 broadcast 10.244.2.1 kernel link cbr0 local
127.0.0.0 broadcast 127.0.0.1 kernel link lo local
127.0.0.0 8 local 127.0.0.1 kernel host lo local
127.0.0.1 local 127.0.0.1 kernel host lo local
127.255.255.255 broadcast 127.0.0.1 kernel link lo local
172.17.0.0 broadcast 172.17.0.1 kernel linkdocker0 local
172.17.0.1 local 172.17.0.1 kernel hostdocker0 local
172.17.255.255 broadcast 172.17.0.1 kernel linkdocker0 local
::1 local kernel lo
fe80:: 64 kernel eth0
fe80:: 64 kernel cbr0
fe80:: 64 kernel veth6129284
default unreachable kernel lo unspec
::1 local none lo local
fe80::250:56ff:fe8e:d580 local none lo local
fe80::38b5:44ff:fe8a:6d79 local none lo local
fe80::88ef:b5ff:fefc:28f4 local none lo local
ff00:: 8 eth0 local
ff00:: 8 cbr0 local
ff00:: 8 veth6129284 local
default unreachable kernel lo unspec
How can I diagnose what is going on here?
thanks!
Turns out this is an issue with the default NAT routing rules on the minions:
$ iptables -t nat -vnxL
...
...
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
...
80 4896 MASQUERADE all -- * * 0.0.0.0/0 !10.0.0.0/8 /* kubelet: SNAT outbound cluster traffic */ ADDRTYPE match dst-type !LOCAL
...
...
This shows that traffic destined for the 10.x.x.x network is excluded from the MASQUERADE rule, so pod traffic to the local network leaves without source NAT.
If anyone runs across this, fix it with:
$ iptables -t nat -I POSTROUTING 1 -s 10.244.0.0/16 -d 10.1.1.1/32 -j MASQUERADE
where 10.244.0.0/16 is the container network and 10.1.1.1 is the gateway IP.
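
The effect of the rule and of the fix can be worked through with a few lines of Python. This is a simplified model, not a reimplementation of iptables: it looks only at the source and destination columns of the POSTROUTING listing above (source 0.0.0.0/0, destination !10.0.0.0/8) and ignores the ADDRTYPE match. The pod address 10.244.2.5 is a made-up example inside the container network.

```python
import ipaddress

# Destination range excluded by the kubelet MASQUERADE rule (!10.0.0.0/8).
EXCLUDED_DST = ipaddress.ip_network("10.0.0.0/8")

# Source and destination of the rule inserted by the fix.
FIX_SRC = ipaddress.ip_network("10.244.0.0/16")
FIX_DST = ipaddress.ip_network("10.1.1.1/32")


def is_masqueraded(src: str, dst: str, with_fix: bool) -> bool:
    """Would a packet from src to dst get source NAT under this model?"""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    # The inserted rule sits at position 1, so it is evaluated first.
    if with_fix and s in FIX_SRC and d in FIX_DST:
        return True
    # Original rule: any source, destination outside 10.0.0.0/8.
    return d not in EXCLUDED_DST


POD = "10.244.2.5"  # hypothetical pod address
for dst in ("8.8.8.8", "10.1.1.1"):
    print(dst,
          "before fix:", is_masqueraded(POD, dst, with_fix=False),
          "after fix:", is_masqueraded(POD, dst, with_fix=True))
```

In this model, traffic to 8.8.8.8 is masqueraded in both cases, which matches the working ping to the outside. Traffic to 10.1.1.1 is masqueraded only after the fix, which matches the failing ping to the gateway from inside the container.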
First, figure out what's up with kubernetes-mini. Run the same diagnostics on it that you ran on the two nodes you've shown us.
All traffic between 10.1.1.0 and 10.244.2.0 goes through it. It may, however, have a bad route for the 10.1.1.0 net.