route missed in kubernetes with calico - kubernetes

I am installing k8s with Calico on CentOS 8. Everything looks fine, but pods cannot ping each other.
I am using k8s as the Calico datastore; the deployment file is calico.yaml.
I don't know why some routes are missing. Any suggestions are appreciated.
Here is some information about the cluster:
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
instance-4njec0xa-1 Ready <none> 3h55m v1.19.4
instance-4njec0xa-2 Ready <none> 3h55m v1.19.4
instance-4njec0xa-3 Ready master 3h56m v1.19.4
on master node
# ./calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 192.168.0.194 | node-to-node mesh | up | 04:10:41 | Established |
| 192.168.0.195 | node-to-node mesh | up | 04:10:41 | Established |
+---------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.0.1 0.0.0.0 UG 100 0 0 eth0
169.254.169.254 192.168.0.2 255.255.255.255 UGH 100 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.17.139.128 0.0.0.0 255.255.255.255 UH 0 0 0 cali96f39d92828
172.17.139.128 0.0.0.0 255.255.255.192 U 0 0 0 *
172.17.139.129 0.0.0.0 255.255.255.255 UH 0 0 0 caliccf893b1917
172.17.139.130 0.0.0.0 255.255.255.255 UH 0 0 0 cali09dc1beebda
172.17.153.64 192.168.0.194 255.255.255.192 UG 0 0 0 eth0
172.17.181.64 192.168.0.195 255.255.255.192 UG 0 0 0 eth0
192.168.0.0 0.0.0.0 255.255.240.0 U 100 0 0 eth0
on node1
# ./calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 192.168.0.195 | node-to-node mesh | up | 04:10:42 | Established |
| 192.168.0.196 | node-to-node mesh | up | 04:10:40 | Established |
+---------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.0.1 0.0.0.0 UG 100 0 0 eth0
169.254.169.254 192.168.0.2 255.255.255.255 UGH 100 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.17.139.128 192.168.0.196 255.255.255.192 UG 0 0 0 eth0
172.17.153.64 0.0.0.0 255.255.255.192 U 0 0 0 *
172.17.153.69 0.0.0.0 255.255.255.255 UH 0 0 0 cali2587d39bec8
172.17.181.64 192.168.0.195 255.255.255.192 UG 0 0 0 eth0
192.168.0.0 0.0.0.0 255.255.240.0 U 100 0 0 eth0
on node2
# ./calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 192.168.0.194 | node-to-node mesh | up | 04:10:42 | Established |
| 192.168.0.196 | node-to-node mesh | up | 04:10:40 | Established |
+---------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.0.1 0.0.0.0 UG 100 0 0 eth0
169.254.169.254 192.168.0.2 255.255.255.255 UGH 100 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.17.139.128 192.168.0.196 255.255.255.192 UG 0 0 0 eth0
172.17.153.64 192.168.0.194 255.255.255.192 UG 0 0 0 eth0
172.17.181.64 0.0.0.0 255.255.255.192 U 0 0 0 *
172.17.181.66 0.0.0.0 255.255.255.255 UH 0 0 0 cali12d4a061371
192.168.0.0 0.0.0.0 255.255.240.0 U 100 0 0 eth0

The default calico.yaml disables both IPIP and VXLAN, which caused the network failure. After I enabled IPIP and set the correct veth_mtu, everything worked.
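For the veth_mtu part, here is a minimal sketch of the arithmetic, assuming the commonly documented IPv4 encapsulation overheads (20 bytes for IPIP, 50 bytes for VXLAN). The function name and the 1500-byte NIC MTU are illustrative and not taken from the question; check the overheads against the documentation of your Calico version.

```python
# Per-packet overhead in bytes added by each encapsulation (IPv4 underlay).
# Assumed values: verify them for your Calico version and underlay network.
ENCAP_OVERHEAD = {"none": 0, "ipip": 20, "vxlan": 50}


def pod_veth_mtu(nic_mtu: int, encapsulation: str) -> int:
    """Return the largest pod-side MTU that still fits in one underlay packet."""
    try:
        overhead = ENCAP_OVERHEAD[encapsulation.lower()]
    except KeyError:
        raise ValueError(f"unknown encapsulation: {encapsulation!r}") from None
    return nic_mtu - overhead


if __name__ == "__main__":
    # Hypothetical NIC MTU of 1500; substitute the MTU of your node interface.
    for encap in ("none", "ipip", "vxlan"):
        print(encap, pod_veth_mtu(1500, encap))
```

If the node interface itself has a smaller MTU (common on cloud or tunnelled networks), the same subtraction applies to that smaller value.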

I had a similar situation.
I had 3 worker nodes with Calico (no changes to the defaults).
My cluster runs in OpenStack, and workloads on different workers couldn't communicate with each other over IPIP. I created a network rule in OpenStack that allows IP protocol 4 (IPIP) traffic between the nodes. After a few seconds the cluster started working again.

Related

kubernetes: can't ping pods

I created a k8s cluster whose pod network is configured as podSubnet: 172.168.0.0/12.
Then I found that I can't ping the pods' IPs.
For example, the metrics-server deployment:
# on k8s-master01 node:
$ kubectl get po -n kube-system -o wide
metrics-server-545b8b99c6-r2ql5 1/1 Running 0 5d1h 172.171.14.193 k8s-node02 <none> <none>
# ping 172.171.14.193 -c 2
PING 172.171.14.193 (172.171.14.193) 56(84) bytes of data.
^C
--- 172.171.14.193 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1016ms
# this is route table
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.180.104.1 0.0.0.0 UG 0 0 0 eth0
10.180.104.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.161.125.0 10.180.104.110 255.255.255.192 UG 0 0 0 tunl0
172.162.195.0 10.180.104.109 255.255.255.192 UG 0 0 0 tunl0
172.169.92.64 10.180.104.108 255.255.255.192 UG 0 0 0 tunl0
172.169.244.192 0.0.0.0 255.255.255.255 UH 0 0 0 cali06e1673851f
172.169.244.192 0.0.0.0 255.255.255.192 U 0 0 0 *
172.171.14.192 10.180.104.111 255.255.255.192 UG 0 0 0 tunl0
This shows that the metrics pod is hosted on k8s-node02. This is k8s-node02's route table:
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.180.104.1 0.0.0.0 UG 0 0 0 eth0
10.180.104.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
169.254.169.254 10.180.104.11 255.255.255.255 UGH 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.161.125.0 10.180.104.110 255.255.255.192 UG 0 0 0 tunl0
172.162.195.0 10.180.104.109 255.255.255.192 UG 0 0 0 tunl0
172.169.92.64 10.180.104.108 255.255.255.192 UG 0 0 0 tunl0
172.169.244.192 10.180.104.107 255.255.255.192 UG 0 0 0 tunl0
172.171.14.192 0.0.0.0 255.255.255.192 U 0 0 0 *
172.171.14.193 0.0.0.0 255.255.255.255 UH 0 0 0 cali872eed170f4
172.171.14.194 0.0.0.0 255.255.255.255 UH 0 0 0 cali7d7625dd37e
172.171.14.203 0.0.0.0 255.255.255.255 UH 0 0 0 calid4e258f95f6
172.171.14.204 0.0.0.0 255.255.255.255 UH 0 0 0 cali5cf96eb1028
In fact, none of the pods are reachable. I created a service based on the example deployment.
# kubectl describe svc my-service
Name: my-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=demo-nginx
Type: ClusterIP
IP Families: <none>
IP: 10.100.75.139
IPs: 10.100.75.139
Port: http 80/TCP
TargetPort: 80/TCP
Endpoints: 172.161.125.14:80,172.161.125.15:80,172.171.14.203:80
Session Affinity: None
Events: <none>
# ping 10.100.75.139 -c 1
PING 10.100.75.139 (10.100.75.139) 56(84) bytes of data.
64 bytes from 10.100.75.139: icmp_seq=1 ttl=64 time=0.077 ms
# nc -vz 10.100.75.139 80
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.
I suspect the root cause is the route table, but I'm not sure. Could you please help me fix this issue?
Please feel free to let me know if you need more information.
Thanks a lot in advance.
BR//
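A side note on the podSubnet value in the question, sketched with Python's standard ipaddress module. It only inspects the address range; it is not claimed to be the cause of the timeout. The pod IP is the metrics-server address from the question.

```python
import ipaddress

# podSubnet exactly as written in the question; strict=False lets the module
# normalise a network address that has host bits set.
pod_subnet = ipaddress.ip_network("172.168.0.0/12", strict=False)

# The RFC 1918 private block that looks similar at first glance.
rfc1918_block = ipaddress.ip_network("172.16.0.0/12")

# The metrics-server pod IP from the kubectl output.
pod_ip = ipaddress.ip_address("172.171.14.193")

print("normalised podSubnet:", pod_subnet)
print("range:", pod_subnet[0], "-", pod_subnet[-1])
print("pod IP inside podSubnet:", pod_ip in pod_subnet)
print("overlaps 172.16.0.0/12:", pod_subnet.overlaps(rfc1918_block))
```

The /12 mask makes the effective network start below 172.168.0.0, and the resulting range does not overlap the private 172.16.0.0/12 block, so these pod addresses shadow publicly routable addresses.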
In connect mode, Ncat initiates a connection (or sends UDP data) to a service that is listening somewhere. For those familiar with socket programming, connect mode is like using the connect function. In listen mode, Ncat waits for an incoming connection (or data receipt), like using the bind and listen functions.
A "Connection timed out" response means the connection attempt got no answer, which could mean a firewall is blocking the port. Test this by adding a rule that accepts connections on the required port.
There might also be issues with traffic rules: try adding outbound traffic rules in the security groups if you see the error again. If the error persists even after adding the traffic rules, there is probably a restrictive rule on the node that blocks access to the port.
Refer to Testing Network services for more information.
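The difference between a timeout and a refusal is easier to see with a small connect test. This is a rough sketch in the spirit of nc -vz, using only the standard socket module; the function name and the result labels are made up for illustration.

```python
import errno
import socket


def check_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are dropped somewhere (firewall, routing).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered with a reset: reachable, but nothing is listening.
        return "refused"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return f"error: {exc}"
```

Calling check_tcp("10.100.75.139", 80) from the master would be expected to report "timeout" in the situation described above. Running the same check directly against the endpoint IPs (for example 172.171.14.203) from different nodes helps narrow down whether the drop happens between nodes or in the service layer.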

Trying phppgadmin docker container to view postgres database on host: it says login failed

I want to view the host's PostgreSQL database with a phppgadmin Docker container.
My host is Arch Linux, with a PostgreSQL server running on it.
In /var/lib/postgres/data/postgresql.conf I have
listen_addresses = "*"
and
/var/lib/postgres/data/pg_hba.conf
host all all 172.17.0.0/16 password
I want to view the PostgreSQL tables, so I am running phppgadmin in Docker with the following command:
docker run --name='phppgadmin' --rm \
--publish=8888:80 \
-e PHP_PG_ADMIN_SERVER_HOST="127.0.0.1" \
dockage/phppgadmin:latest
Now I can open phppgadmin at 127.0.0.1:8888/phppgadmin.
But when I try to log in, it says login failed.
I have a Django project on my host that uses the host's PostgreSQL. It works well with these settings:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': '<db_name>',
        'USER': '<db_username>',
        'PASSWORD': '<password>',
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}
Also, here is my netstat output on the host:
$ netstat -nrv
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 192.168.1.1 0.0.0.0 UG 0 0 0 wlp3s0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.18.0.0 0.0.0.0 255.255.0.0 U 0 0 0 br-1c7e732767f4
172.20.0.0 0.0.0.0 255.255.0.0 U 0 0 0 br-17604ffc4858
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 wlp3s0
and in my Docker container:
$ netstat -nrv
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
From netstat -nrv on my phppgadmin docker container
$ netstat -nrv
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
So the IP address of the host, as seen from the container, is 172.17.0.1 (the container's default gateway).
Change PHP_PG_ADMIN_SERVER_HOST="127.0.0.1" to PHP_PG_ADMIN_SERVER_HOST="172.17.0.1":
docker run --name='phppgadmin' --rm \
--publish=8888:80 \
-e PHP_PG_ADMIN_SERVER_HOST="172.17.0.1" \
dockage/phppgadmin:latest
with the following in /var/lib/postgres/data/postgresql.conf:
listen_addresses = 'localhost,127.0.0.1,172.17.0.1'
and in /var/lib/postgres/data/pg_hba.conf:
host all all 172.17.0.0/16 md5
Open 127.0.0.1:8888/phppgadmin and log in as a non-superuser.
For a superuser it still does not work.
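The two checks this answer relies on can be sketched in a few lines of Python: find the default gateway in the container's routing table, and confirm that a container address falls inside the pg_hba.conf range. The helper names and the container address 172.17.0.2 are assumptions for illustration; the routing table is the one shown above.

```python
import ipaddress

# `netstat -nrv` output from inside the phppgadmin container.
CONTAINER_ROUTES = """\
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
"""


def default_gateway(route_table: str):
    """Return the gateway of the 0.0.0.0/0 route, or None if there is none."""
    for line in route_table.splitlines():
        fields = line.split()
        if (
            len(fields) >= 4
            and fields[0] == "0.0.0.0"   # destination
            and fields[2] == "0.0.0.0"   # genmask
            and "G" in fields[3]         # route uses a gateway
        ):
            return fields[1]
    return None


def allowed_by_hba(client_ip: str, hba_cidr: str) -> bool:
    """Check whether a client address falls inside a pg_hba.conf CIDR entry."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(hba_cidr)


print(default_gateway(CONTAINER_ROUTES))
# 172.17.0.2 is a hypothetical container address on the docker0 bridge.
print(allowed_by_hba("172.17.0.2", "172.17.0.0/16"))
```

Inside the container, 127.0.0.1 refers to the container itself, which is why the gateway address has to be used to reach a server running on the host.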

VPN to access cluster services / pods : cannot ping anything except openvpn server

I'm trying to setup a VPN to access my cluster's workloads without setting public endpoints.
Service is deployed using the OpenVPN helm chart, and kubernetes using Rancher v2.3.2
replacing the L4 load balancer with a simple service discovery
editing the configMap to allow TCP to go through the load balancer and reach the VPN
What does / doesn't work:
OpenVPN client can connect successfully
Cannot ping public servers
Cannot ping Kubernetes services or pods
Can ping openvpn cluster IP "10.42.2.11"
My files
vars.yml
---
replicaCount: 1
nodeSelector:
  openvpn: "true"
openvpn:
  OVPN_K8S_POD_NETWORK: "10.42.0.0"
  OVPN_K8S_POD_SUBNET: "255.255.0.0"
  OVPN_K8S_SVC_NETWORK: "10.43.0.0"
  OVPN_K8S_SVC_SUBNET: "255.255.0.0"
persistence:
  storageClass: "local-path"
service:
  externalPort: 444
Connection works, but I'm not able to hit any ip inside my cluster.
The only ip I'm able to reach is the openvpn cluster ip.
openvpn.conf:
server 10.240.0.0 255.255.0.0
verb 3
key /etc/openvpn/certs/pki/private/server.key
ca /etc/openvpn/certs/pki/ca.crt
cert /etc/openvpn/certs/pki/issued/server.crt
dh /etc/openvpn/certs/pki/dh.pem
key-direction 0
keepalive 10 60
persist-key
persist-tun
proto tcp
port 443
dev tun0
status /tmp/openvpn-status.log
user nobody
group nogroup
push "route 10.42.2.11 255.255.255.255"
push "route 10.42.0.0 255.255.0.0"
push "route 10.43.0.0 255.255.0.0"
push "dhcp-option DOMAIN-SEARCH openvpn.svc.cluster.local"
push "dhcp-option DOMAIN-SEARCH svc.cluster.local"
push "dhcp-option DOMAIN-SEARCH cluster.local"
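As a sanity check on the pushed routes, the sketch below parses the push "route ..." lines and tests whether a given cluster address is covered. The helper names and the service address 10.43.0.10 are hypothetical; only the three route lines come from the config above.

```python
import ipaddress
import re

# The route-related lines from openvpn.conf.
OPENVPN_CONF = '''\
push "route 10.42.2.11 255.255.255.255"
push "route 10.42.0.0 255.255.0.0"
push "route 10.43.0.0 255.255.0.0"
'''

# Matches: push "route <network> <netmask>"
PUSH_ROUTE = re.compile(r'^push\s+"route\s+(\S+)\s+(\S+)"', re.MULTILINE)


def pushed_networks(conf: str):
    """Return every pushed route as an ipaddress network object."""
    return [
        ipaddress.ip_network(f"{addr}/{mask}")
        for addr, mask in PUSH_ROUTE.findall(conf)
    ]


def is_pushed(ip: str, conf: str = OPENVPN_CONF) -> bool:
    """True if at least one pushed route covers the address."""
    target = ipaddress.ip_address(ip)
    return any(target in net for net in pushed_networks(conf))


print(is_pushed("10.42.2.11"))   # the OpenVPN pod itself
print(is_pushed("10.43.0.10"))   # hypothetical service IP
print(is_pushed("8.8.8.8"))      # a public address, not pushed
```

If the pod and service networks are covered, as they appear to be here, the client side knows where to send the traffic, and the remaining question is whether the OpenVPN pod forwards it.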
client.ovpn
client
nobind
dev tun
remote xxxx xxx tcp
CERTS CERTS
dhcp-option DOMAIN openvpn.svc.cluster.local
dhcp-option DOMAIN svc.cluster.local
dhcp-option DOMAIN cluster.local
dhcp-option DOMAIN online.net
I don't really know how to debug this.
I'm using windows
route command from client
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 livebox.home 255.255.255.255 U 0 0 0 eth0
192.168.1.0 0.0.0.0 255.255.255.0 U 256 0 0 eth0
192.168.1.17 0.0.0.0 255.255.255.255 U 256 0 0 eth0
192.168.1.255 0.0.0.0 255.255.255.255 U 256 0 0 eth0
224.0.0.0 0.0.0.0 240.0.0.0 U 256 0 0 eth0
255.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 eth0
224.0.0.0 0.0.0.0 240.0.0.0 U 256 0 0 eth1
255.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 eth1
0.0.0.0 10.240.0.5 255.255.255.255 U 0 0 0 eth1
10.42.2.11 10.240.0.5 255.255.255.255 U 0 0 0 eth1
10.42.0.0 10.240.0.5 255.255.0.0 U 0 0 0 eth1
10.43.0.0 10.240.0.5 255.255.0.0 U 0 0 0 eth1
10.240.0.1 10.240.0.5 255.255.255.255 U 0 0 0 eth1
127.0.0.0 0.0.0.0 255.0.0.0 U 256 0 0 lo
127.0.0.1 0.0.0.0 255.255.255.255 U 256 0 0 lo
127.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 lo
224.0.0.0 0.0.0.0 240.0.0.0 U 256 0 0 lo
255.255.255.255 0.0.0.0 255.255.255.255 U 256 0 0 lo
And finally ifconfig
inet 192.168.1.17 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 2a01:cb00:90c:5300:603c:f8:703e:a876 prefixlen 64 scopeid 0x0<global>
inet6 2a01:cb00:90c:5300:d84b:668b:85f3:3ba2 prefixlen 128 scopeid 0x0<global>
inet6 fe80::603c:f8:703e:a876 prefixlen 64 scopeid 0xfd<compat,link,site,host>
ether 00:d8:61:31:22:32 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.240.0.6 netmask 255.255.255.252 broadcast 10.240.0.7
inet6 fe80::b9cf:39cc:f60a:9db2 prefixlen 64 scopeid 0xfd<compat,link,site,host>
ether 00:ff:42:04:53:4d (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 1500
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0xfe<compat,link,site,host>
loop (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
For anybody looking for a working sample, this goes into your openvpn deployment alongside your container definition:
initContainers:
- args:
  - -w
  - net.ipv4.ip_forward=1
  command:
  - sysctl
  image: busybox
  name: openvpn-sidecar
  securityContext:
    privileged: true
I don't know if it is the right answer, but I got it to work by adding a sidecar to my pods that runs sysctl -w net.ipv4.ip_forward=1, which solved the issue.
You can set ipForwardInitContainer option to "true" in values.yaml
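To confirm whether forwarding is actually enabled inside the OpenVPN pod, a small check can read the kernel setting that net.ipv4.ip_forward maps to. This is a minimal sketch; the function name is made up for illustration, and the path is the standard procfs location on Linux.

```python
from pathlib import Path

# procfs file backing the net.ipv4.ip_forward sysctl on Linux.
IP_FORWARD_PATH = Path("/proc/sys/net/ipv4/ip_forward")


def ip_forward_enabled(path: Path = IP_FORWARD_PATH) -> bool:
    """Return True if IPv4 forwarding is on in the current network namespace."""
    return path.read_text().strip() == "1"
```

The value is per network namespace, so it has to be read from inside the pod (for example via kubectl exec), not on the node.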

Ping other pods on the same & different nodes

I would like to ping pod B (node 1) from pod A (node 0), but it's unreachable. Pinging a pod on the same node fails as well.
I am setting up a new cluster to try out Kubernetes from Kelsey.
I have used this link as my reference: Kubernetes: Can't ping pods across nodes
Node - IP Private - IP Pod
worker-0 - 10.240.0.20 - 10.200.0.0/24
worker-1 - 10.240.0.21 - 10.200.1.0/24
worker-2 - 10.240.0.22 - 10.200.2.0/24
route -n
worker-0
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.240.0.1 0.0.0.0 UG 100 0 0 ens4
10.200.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cnio0
10.240.0.1 0.0.0.0 255.255.255.255 UH 100 0 0 ens4
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
worker-1
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.240.0.1 0.0.0.0 UG 100 0 0 ens4
10.200.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cnio0
10.240.0.1 0.0.0.0 255.255.255.255 UH 100 0 0 ens4
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
worker-2
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.240.0.1 0.0.0.0 UG 100 0 0 ens4
10.200.2.0 0.0.0.0 255.255.255.0 U 0 0 0 cnio0
10.240.0.1 0.0.0.0 255.255.255.255 UH 100 0 0 ens4
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
I have set up the VPC network routes as described in this link.
After that I followed this reference: Kubernetes: Can't ping pods across nodes
route add -net 10.200.1.0 netmask 255.255.255.0 gw 10.240.0.21 in worker-0
The result is
SIOCADDRT: Network is unreachable
I tried it on worker-0, worker-1 and worker-2 and got the same result,
even though worker-0 can ping worker-1 (10.240.0.21).
My expectation: from Pod A (worker-0, pod IP 10.200.0.3) I can ping Pod B (worker-1, pod IP 10.200.1.3), and I can also ping Pod C, which is on worker-0 like Pod A.
Does this require Calico or Flannel, or should pods on different nodes be able to ping each other without Calico or Flannel (CNI settings only)?
Additional Information
I am using Docker, not runc & containerd.
So I installed Docker manually from this link.
In kubelet.service, --container-runtime=remote become --container-runtime=docker
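One plausible reading of the SIOCADDRT error, offered as a sketch rather than a diagnosis: Linux generally accepts a gateway in route add only if that gateway is reachable through a directly connected route. The check below applies that rule to worker-0's routing table as shown in the question; the function name is made up for illustration.

```python
import ipaddress

# (destination, genmask, gateway) rows copied from worker-0's `route -n` output.
WORKER0_ROUTES = [
    ("0.0.0.0", "0.0.0.0", "10.240.0.1"),
    ("10.200.0.0", "255.255.255.0", "0.0.0.0"),
    ("10.240.0.1", "255.255.255.255", "0.0.0.0"),
    ("172.17.0.0", "255.255.0.0", "0.0.0.0"),
]


def is_on_link(address: str, routes) -> bool:
    """True if a directly connected route (gateway 0.0.0.0) covers the address."""
    target = ipaddress.ip_address(address)
    return any(
        gateway == "0.0.0.0" and target in ipaddress.ip_network(f"{dest}/{mask}")
        for dest, mask, gateway in routes
    )


print(is_on_link("10.240.0.1", WORKER0_ROUTES))   # the default gateway
print(is_on_link("10.240.0.21", WORKER0_ROUTES))  # worker-1, used as gw in `route add`
```

In this table the only connected route on ens4 is the host route to 10.240.0.1, so 10.240.0.21 is reachable only through the default gateway. That would explain why ping works while the kernel still rejects 10.240.0.21 as a next hop.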
Try adding the routes like this:
Worker-0:
$ sudo route add -net 10.200.1.0 netmask 255.255.255.0 gw 10.240.0.21
$ sudo route add -net 10.200.2.0 netmask 255.255.255.0 gw 10.240.0.22
Worker-1:
$ sudo route add -net 10.200.0.0 netmask 255.255.255.0 gw 10.240.0.20
$ sudo route add -net 10.200.2.0 netmask 255.255.255.0 gw 10.240.0.22
Worker-2:
$ sudo route add -net 10.200.0.0 netmask 255.255.255.0 gw 10.240.0.20
$ sudo route add -net 10.200.1.0 netmask 255.255.255.0 gw 10.240.0.21
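The pattern in these commands is "on every worker, route each other worker's pod CIDR via that worker's node IP". A small sketch that generates them from the node table in the question, which may help if more workers are added; the function name is made up for illustration.

```python
import ipaddress

# Node table from the question: name -> (node IP, pod CIDR).
WORKERS = {
    "worker-0": ("10.240.0.20", "10.200.0.0/24"),
    "worker-1": ("10.240.0.21", "10.200.1.0/24"),
    "worker-2": ("10.240.0.22", "10.200.2.0/24"),
}


def route_commands(node: str, workers=WORKERS):
    """Return the `route add` commands to run on `node` for all other workers."""
    commands = []
    for other, (node_ip, pod_cidr) in workers.items():
        if other == node:
            continue  # the local pod CIDR is already handled by the bridge
        net = ipaddress.ip_network(pod_cidr)
        commands.append(
            f"sudo route add -net {net.network_address} "
            f"netmask {net.netmask} gw {node_ip}"
        )
    return commands


for name in WORKERS:
    print(f"{name}:")
    for command in route_commands(name):
        print("  " + command)
```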

kubernetes : unable to join a node

I have created a master and am trying to join a node to create a cluster. When I run the join command I get the error below. Both nodes are on the same network. The error message indicates that no route to the host exists. I'm not sure how to establish a route to the host. Any help is appreciated.
sudo kubeadm join --token d23afe.14fde99cd03def7e 192.168.178.24:6443 --discovery-token-ca-cert-hash sha256:6a5e2674825e683bbdfe9bab512b03c556bcf89d8648317a64372bb44746bb39
[preflight] Running pre-flight checks.
[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.02.0-ce. Max validated version: 17.03
[WARNING FileExisting-crictl]: crictl not found in system path
[discovery] Trying to connect to API Server "192.168.178.24:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.178.24:6443"
[discovery] Failed to request cluster info, will try again: [Get https://192.168.178.24:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 192.168.178.24:6443: getsockopt: no route to host]
Here's the output of sudo route. Unfortunately, I don't know enough to troubleshoot from this output.
`sudo route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 192.168.178.1 0.0.0.0 UG 202 0 0 eth0
10.32.0.0 0.0.0.0 255.240.0.0 U 0 0 0 weave
link-local 0.0.0.0 255.255.0.0 U 205 0 0 datapath
link-local 0.0.0.0 255.255.0.0 U 210 0 0 vethwe-datapath
link-local 0.0.0.0 255.255.0.0 U 211 0 0 vethwe-bridge
link-local 0.0.0.0 255.255.0.0 U 212 0 0 vxlan-6784
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.178.0 0.0.0.0 255.255.255.0 U 202 0 0 eth0
`
I managed to identify the issue. The issue was with the weave net plugin. I did a tear down and reinstalled the plugin. I was then able to join the node. Thanks all for your suggestions.
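For readers unsure how to read the routing table above, here is a sketch of the longest-prefix lookup the kernel performs, applied to the API server address. It assumes that the symbolic names default and link-local stand for 0.0.0.0 and 169.254.0.0 (the usual mapping when route is run without -n); only the first link-local row is kept for brevity, and the function name is made up for illustration.

```python
import ipaddress

# (destination, genmask, gateway, interface) from the `sudo route` output.
# Assumption: "default" -> 0.0.0.0 and "link-local" -> 169.254.0.0.
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "192.168.178.1", "eth0"),
    ("10.32.0.0", "255.240.0.0", "0.0.0.0", "weave"),
    ("169.254.0.0", "255.255.0.0", "0.0.0.0", "datapath"),
    ("172.17.0.0", "255.255.0.0", "0.0.0.0", "docker0"),
    ("192.168.178.0", "255.255.255.0", "0.0.0.0", "eth0"),
]


def lookup(address: str, routes=ROUTES):
    """Return (gateway, interface) of the most specific matching route, or None."""
    target = ipaddress.ip_address(address)
    best = None
    for dest, mask, gateway, iface in routes:
        net = ipaddress.ip_network(f"{dest}/{mask}")
        if target in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, gateway, iface)
    return None if best is None else best[1:]


print(lookup("192.168.178.24"))  # the API server from the join command
```

The lookup finds a directly connected route on eth0 for 192.168.178.24, so the table itself does contain a route to the master. That is consistent with the cause lying elsewhere, such as the network plugin that was eventually reinstalled.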