Kubernetes pods cannot communicate outbound

Kubernetes pods cannot communicate outbound - kubernetes

I have installed Kubernetes v1.13.10 on a group of VMs running CentOS 7. When I deploy pods, they can connect to one another but cannot connect to anything outside of the cluster. The CoreDNS pods have these errors in the log:
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. A: unreachable backend: read udp 172.21.0.33:48105->10.20.10.52:53: i/o timeout
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. AAAA: unreachable backend: read udp 172.21.0.33:49098->10.20.10.51:53: i/o timeout
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. AAAA: unreachable backend: read udp 172.21.0.33:53113->10.20.10.51:53: i/o timeout
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. A: unreachable backend: read udp 172.21.0.33:39648->10.20.10.51:53: i/o timeout
The IPs 10.20.10.51 and 10.20.10.52 are the internal DNS servers and are reachable from the nodes. I did a Wireshark capture from the DNS servers, and I see the traffic is coming in from the CoreDNS pod IP address 172.21.0.33. There would be no route for the DNS servers to get back to that IP as it isn't routable outside of the Kubernetes cluster.
My understanding is that an iptables rule should be implemented to nat the pod IPs to the address of the node when a pod is trying to communicate outbound (correct?). Below is the POSTROUTING chain in iptables:
[root#kube-aci-1 ~]# iptables -t nat -L POSTROUTING -v --line-number
Chain POSTROUTING (policy ACCEPT 23 packets, 2324 bytes)
num pkts bytes target prot opt in out source destination
1 1990 166K KUBE-POSTROUTING all -- any any anywhere anywhere /* kubernetes postrouting rules */
2 0 0 MASQUERADE all -- any ens192.152 172.21.0.0/16 anywhere
Line 1 was added by kube-proxy and line 2 was a line I manually added to try to nat anything coming from the pod subnet 172.21.0.0/16 to the node interface ens192.152, but that didn't work.
Here's the kube-proxy logs:
[root#kube-aci-1 ~]# kubectl logs kube-proxy-llq22 -n kube-system
W1117 16:31:59.225870 1 proxier.go:498] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.232006 1 proxier.go:498] Failed to load kernel module ip_vs_rr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.233727 1 proxier.go:498] Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.235700 1 proxier.go:498] Failed to load kernel module ip_vs_sh with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.255278 1 server_others.go:296] Flag proxy-mode="" unknown, assuming iptables proxy
I1117 16:31:59.289360 1 server_others.go:148] Using iptables Proxier.
I1117 16:31:59.296021 1 server_others.go:178] Tearing down inactive rules.
I1117 16:31:59.324352 1 server.go:484] Version: v1.13.10
I1117 16:31:59.335846 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1117 16:31:59.336443 1 config.go:102] Starting endpoints config controller
I1117 16:31:59.336466 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I1117 16:31:59.336493 1 config.go:202] Starting service config controller
I1117 16:31:59.336499 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I1117 16:31:59.436617 1 controller_utils.go:1034] Caches are synced for service config controller
I1117 16:31:59.436739 1 controller_utils.go:1034] Caches are synced for endpoints config controller
I have tried flushing the iptables nat table as well as restarted kube-proxy on all nodes, but the problem still persisted. Any clues in the output above, or thoughts on further troubleshooting?
Output of kubectl get nodes:
[root#kube-aci-1 ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube-aci-1 Ready master 85d v1.13.10 10.10.52.217 <none> CentOS Linux 7 (Core) 3.10.0-957.el7.x86_64 docker://1.13.1
kube-aci-2 Ready <none> 85d v1.13.10 10.10.52.218 <none> CentOS Linux 7 (Core) 3.10.0-957.el7.x86_64 docker://1.13.1

Turns out it is necessary to use a subnet that is routable on the network with the CNI in use if outbound communication from pods is necessary. I made the subnet routable on the external network and the pods can now communicate outbound.

Related

Can we setup a k8s bare matal server to run Bind DNS server (named) and have an access to it from the outside on port 53?

I have setup a k8s cluster using 2 bare metal servers (1 master and 1 worker) using kubespray with default settings (kube_proxy_mode: iptables and dns_mode: coredns) and I would like to run a BIND DNS server inside to manage a couple of domain names.
I deployed with helm 3 an helloworld web app for testing. Everything works like a charm (HTTP, HTTPs, Let's Encrypt thought cert-manager).
kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T21:03:42Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.7", GitCommit:"be3d344ed06bff7a4fc60656200a93c74f31f9a4", GitTreeState:"clean", BuildDate:"2020-02-11T19:24:46Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8smaster Ready master 22d v1.16.7
k8sslave Ready <none> 21d v1.16.7
I deployed with an Helm 3 chart an image of my BIND DNS Server (named) in default namespace; with a service exposing the port 53 of the bind app container.
I have tested the DNS resolution with a pod and the bind service; it works well. Here is the test of the bind k8s service from the master node:
kubectl -n default get svc bind -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
bind ClusterIP 10.233.31.255 <none> 53/TCP,53/UDP 4m5s app=bind,release=bind
kubectl get endpoints bind
NAME ENDPOINTS AGE
bind 10.233.75.239:53,10.233.93.245:53,10.233.75.239:53 + 1 more... 4m12s
export SERVICE_IP=`kubectl get services bind -o go-template='{{.spec.clusterIP}}{{"\n"}}'`
nslookup www.example.com ${SERVICE_IP}
Server: 10.233.31.255
Address: 10.233.31.255#53
Name: www.example.com
Address: 176.31.XXX.XXX
So the bind DNS app is deployed and is working fine through the bind k8s service.
For the next step; I followed the https://kubernetes.github.io/ingress-nginx/user-guide/exposing-tcp-udp-services/ documentation to setup the Nginx Ingress Controller (both configmap and service) to handle tcp/udp requests on port 53 and to redirect them to the bind DNS app.
When I test the name resolution from an external computer it does not work:
nslookup www.example.com <IP of the k8s master>
;; connection timed out; no servers could be reached
I digg into k8s configuration, logs, etc. and I found a warning message in kube-proxy logs:
ps auxw | grep kube-proxy
root 19984 0.0 0.2 141160 41848 ? Ssl Mar26 19:39 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=k8smaster
journalctl --since "2 days ago" | grep kube-proxy
<NOTHING RETURNED>
KUBEPROXY_FIRST_POD=`kubectl get pods -n kube-system -l k8s-app=kube-proxy -o go-template='{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | head -n 1`
kubectl logs -n kube-system ${KUBEPROXY_FIRST_POD}
I0326 22:26:03.491900 1 node.go:135] Successfully retrieved node IP: 91.121.XXX.XXX
I0326 22:26:03.491957 1 server_others.go:150] Using iptables Proxier.
I0326 22:26:03.492453 1 server.go:529] Version: v1.16.7
I0326 22:26:03.493179 1 conntrack.go:52] Setting nf_conntrack_max to 262144
I0326 22:26:03.493647 1 config.go:131] Starting endpoints config controller
I0326 22:26:03.493663 1 config.go:313] Starting service config controller
I0326 22:26:03.493669 1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I0326 22:26:03.493679 1 shared_informer.go:197] Waiting for caches to sync for service config
I0326 22:26:03.593986 1 shared_informer.go:204] Caches are synced for endpoints config
I0326 22:26:03.593992 1 shared_informer.go:204] Caches are synced for service config
E0411 17:02:48.113935 1 proxier.go:927] can't open "externalIP for ingress-nginx/ingress-nginx:bind-udp" (91.121.XXX.XXX:53/udp), skipping this externalIP: listen udp 91.121.XXX.XXX:53: bind: address already in use
E0411 17:02:48.119378 1 proxier.go:927] can't open "externalIP for ingress-nginx/ingress-nginx:bind-tcp" (91.121.XXX.XXX:53/tcp), skipping this externalIP: listen tcp 91.121.XXX.XXX:53: bind: address already in use
Then I look for who was already using the port 53...
netstat -lpnt | grep 53
tcp 0 0 0.0.0.0:5355 0.0.0.0:* LISTEN 1682/systemd-resolv
tcp 0 0 87.98.XXX.XXX:53 0.0.0.0:* LISTEN 19984/kube-proxy
tcp 0 0 169.254.25.10:53 0.0.0.0:* LISTEN 14448/node-cache
tcp6 0 0 :::9253 :::* LISTEN 14448/node-cache
tcp6 0 0 :::9353 :::* LISTEN 14448/node-cache
A look on the proc 14448/node-cache:
cat /proc/14448/cmdline
/node-cache-localip169.254.25.10-conf/etc/coredns/Corefile-upstreamsvccoredns
So coredns is already handling the port 53 which is normal cos it's the k8s internal DNS service.
In coredns documentation (https://github.com/coredns/coredns/blob/master/README.md) they talk about a -dns.port option to use a distinct port... but when I look into kubespray (which has 3 jinja templates https://github.com/kubernetes-sigs/kubespray/tree/release-2.12/roles/kubernetes-apps/ansible/templates for creating the coredns configmap, services etc. similar to https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#coredns) everything is hardcoded with port 53.
So my question is : Is there a k8s cluster configuration/workaround so I can run my own DNS Server and exposed it to port 53?
Maybe?
Setup the coredns to use a a different port than 53 ? Seems hard and I'm really not sure this makes sense!
I can setup my bind k8s service to expose port 5353 and configure the nginx ingress controller to handle this 5353 port and redirect to the app 53 port. But this would require to setup iptables to route external DSN requests* received on port 53 to my bind k8s service on port 5353 ? What would be the iptables config (INPUT / PREROUTING or FORWARD)? Does this kind of network configuration would breakes coredns?
Regards,
Chris

I suppose Your nginx-ingress doesn't work as expected. You need Load Balancer provider, such as MetalLB, to Your bare metal k8s cluster to receive external connections on ports like 53. And You don't need nginx-ingress to use with bind, just change bind Service type from ClusterIP to LoadBalancer and ensure you got an external IP on this Service. Your helm chart manual may help to switch to LoadBalancer.

No route to host from some Kubernetes containers to other containers in same cluster

This is a Kubespray deployment using calico. All the defaults are were left as-is except for the fact that there is a proxy. Kubespray ran to the end without issues.
Access to Kubernetes services started failing and after investigation, there was no route to host to the coredns service. Accessing a K8S service by IP worked. Everything else seems to be correct, so I am left with a cluster that works, but without DNS.
Here is some background information:
Starting up a busybox container:
# nslookup kubernetes.default
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find kubernetes.default: NXDOMAIN
*** Can't find kubernetes.default: No answer
Now the output while explicitly defining the IP of one of the CoreDNS pods:
# nslookup kubernetes.default 10.233.0.3
;; connection timed out; no servers could be reached
Notice that telnet to the Kubernetes API works:
# telnet 10.233.0.1 443
Connected to 10.233.0.1
kube-proxy logs:
10.233.0.3 is the service IP for coredns. The last line looks concerning, even though it is INFO.
$ kubectl logs kube-proxy-45v8n -nkube-system
I1114 14:19:29.657685 1 node.go:135] Successfully retrieved node IP: X.59.172.20
I1114 14:19:29.657769 1 server_others.go:176] Using ipvs Proxier.
I1114 14:19:29.664959 1 server.go:529] Version: v1.16.0
I1114 14:19:29.665427 1 conntrack.go:52] Setting nf_conntrack_max to 262144
I1114 14:19:29.669508 1 config.go:313] Starting service config controller
I1114 14:19:29.669566 1 shared_informer.go:197] Waiting for caches to sync for service config
I1114 14:19:29.669602 1 config.go:131] Starting endpoints config controller
I1114 14:19:29.669612 1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I1114 14:19:29.769705 1 shared_informer.go:204] Caches are synced for service config
I1114 14:19:29.769756 1 shared_informer.go:204] Caches are synced for endpoints config
I1114 14:21:29.666256 1 graceful_termination.go:93] lw: remote out of the list: 10.233.0.3:53/TCP/10.233.124.23:53
I1114 14:21:29.666380 1 graceful_termination.go:93] lw: remote out of the list: 10.233.0.3:53/TCP/10.233.122.11:53
All pods are running without crashing/restarts etc. and otherwise services behave correctly.
IPVS looks correct. CoreDNS service is defined there:
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.233.0.1:443 rr
-> x.59.172.19:6443 Masq 1 0 0
-> x.59.172.20:6443 Masq 1 1 0
TCP 10.233.0.3:53 rr
-> 10.233.122.12:53 Masq 1 0 0
-> 10.233.124.24:53 Masq 1 0 0
TCP 10.233.0.3:9153 rr
-> 10.233.122.12:9153 Masq 1 0 0
-> 10.233.124.24:9153 Masq 1 0 0
TCP 10.233.51.168:3306 rr
-> x.59.172.23:6446 Masq 1 0 0
TCP 10.233.53.155:44134 rr
-> 10.233.89.20:44134 Masq 1 0 0
UDP 10.233.0.3:53 rr
-> 10.233.122.12:53 Masq 1 0 314
-> 10.233.124.24:53 Masq 1 0 312
Host routing also looks correct.
# ip r
default via x.59.172.17 dev ens3 proto dhcp src x.59.172.22 metric 100
10.233.87.0/24 via x.59.172.21 dev tunl0 proto bird onlink
blackhole 10.233.89.0/24 proto bird
10.233.89.20 dev calib88cf6925c2 scope link
10.233.89.21 dev califdffa38ed52 scope link
10.233.122.0/24 via x.59.172.19 dev tunl0 proto bird onlink
10.233.124.0/24 via x.59.172.20 dev tunl0 proto bird onlink
x.59.172.16/28 dev ens3 proto kernel scope link src x.59.172.22
x.59.172.17 dev ens3 proto dhcp scope link src x.59.172.22 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
I have redeployed this same cluster in separate environments with flannel and calico with iptables instead of ipvs. I have also disabled the docker http proxy after deploy temporarily. None of which makes any difference.
Also:
kube_service_addresses: 10.233.0.0/18
kube_pods_subnet: 10.233.64.0/18
(They do not overlap)
What is the next step in debugging this issue?

I highly recommend you to avoid using latest busybox image to troubleshoot DNS. There are few issues reported regarding dnslookup on versions newer than 1.28.
v 1.28.4
user#node1:~$ kubectl exec -ti busybox busybox | head -1
BusyBox v1.28.4 (2018-05-22 17:00:17 UTC) multi-call binary.
user#node1:~$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 169.254.25.10
Address 1: 169.254.25.10
Name: kubernetes.default
Address 1: 10.233.0.1 kubernetes.default.svc.cluster.local
v 1.31.1
user#node1:~$ kubectl exec -ti busyboxlatest busybox | head -1
BusyBox v1.31.1 (2019-10-28 18:40:01 UTC) multi-call binary.
user#node1:~$ kubectl exec -ti busyboxlatest -- nslookup kubernetes.default
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find kubernetes.default: NXDOMAIN
*** Can't find kubernetes.default: No answer
command terminated with exit code 1
Going deeper and exploring more possibilities, I've reproduced your problem on GCP and after some digging I was able to figure out what is causing this communication problem.
GCE (Google Compute Engine) blocks traffic between hosts by default; we have to allow Calico traffic to flow between containers on different hosts.
According to calico documentation, you can do it by creating a firewall allowing this communication rule:
gcloud compute firewall-rules create calico-ipip --allow 4 --network "default" --source-ranges "10.128.0.0/9"
You can verify the rule with this command:
gcloud compute firewall-rules list
This is not present on the most recent calico documentation but it's still true and necessary.
Before creating firewall rule:
user#node1:~$ kubectl exec -ti busybox2 -- nslookup kubernetes.default
Server: 10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1
After creating firewall rule:
user#node1:~$ kubectl exec -ti busybox2 -- nslookup kubernetes.default
Server: 10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.233.0.1 kubernetes.default.svc.cluster.local
It doesn't matter if you bootstrap your cluster using kubespray or kubeadm, this problem will happen because calico needs to communicate between nodes and GCE is blocking it as default.

This is what works for me, I tried to install my k8s cluster using kubespray configured with calico as CNI and containerd as container runtime
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -F
[delete coredns pod]

flannel restart very often

Flannel on node restarts always.
Log as follows:
root#debian:~# docker logs faa668852544
I0425 07:14:37.721766 1 main.go:514] Determining IP address of default interface
I0425 07:14:37.724855 1 main.go:527] Using interface with name eth0 and address 192.168.50.19
I0425 07:14:37.815135 1 main.go:544] Defaulting external address to interface address (192.168.50.19)
E0425 07:15:07.825910 1 main.go:241] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-arm-bg9rn': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-arm-bg9rn: dial tcp 10.96.0.1:443: i/o timeout
master configuration:
ubuntu: 16.04
node:
embedded system with debian rootfs(linux4.9).
kubernetes version:v1.14.1
docker version：18.09
flannel version：v0.11.0
I hope flannel run normal on node.

First, for flannel to work correctly, you must pass --pod-network-cidr=10.244.0.0/16 to kubeadm init.
kubeadm init --pod-network-cidr=10.244.0.0/16
Set /proc/sys/net/bridge/bridge-nf-call-iptables to 1 by running
sysctl net.bridge.bridge-nf-call-iptables=1
Next is to create the clusterrole and clusterrolebinding
as follows:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml

Why doesn't kube-proxy route traffic to another worker node?

I've deployed several different services and always get the same error.
The service is reachable on the node port from the machine where the pod is running. On the two other nodes I get timeouts.
The kube-proxy is running on all worker nodes and I can see in the logfiles from kube-proxy that the service port was added and the node port was opened.
In this case I've deployed the stars demo from calico
Kube-proxy log output:
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.229458 659 service.go:309] Adding new service port "management-ui/management-ui:" at 10.32.0.133:9001/TCP
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.257483 659 proxier.go:1427] Opened local port "nodePort for management-ui/management-ui:" (:30002/tcp)
The kube-proxy is listening on the port 30002
root#kuben1:/tmp# netstat -lanp | grep 30002
tcp6 0 0 :::30002 :::* LISTEN 659/kube-proxy
There are also some iptable rules defined:
root#kuben1:/tmp# iptables -L -t nat | grep management-ui
KUBE-MARK-MASQ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-MARK-MASQ tcp -- !10.200.0.0/16 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
The interesting part is that I can reach the service IP from any worker node
root#kubem1:/tmp# kubectl get svc -n management-ui
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
management-ui NodePort 10.32.0.133 <none> 9001:30002/TCP 52m
The service IP/port can be accessed from any worker node if I do a "curl http://10.32.0.133:9001"
I don't understand why kube-proxy does not "route" this properly...
Has anyone a hint where I can find the error?
Here some cluster specs:
This is a hand build cluster inspired by Kelsey Hightower's "kubernetes the hard way" guide.
6 Nodes (3 master: 3 worker) local vms
OS: Ubuntu 18.04
K8s: v1.13.0
Docker: 18.9.3
Cni: calico
Component status on the master nodes looks okay
root#kubem1:/tmp# kubectl get componentstatus
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
The worker nodes are looking okay if I trust kubectl
root#kubem1:/tmp# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kuben1 Ready <none> 39d v1.13.0 192.168.178.77 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben2 Ready <none> 39d v1.13.0 192.168.178.78 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben3 Ready <none> 39d v1.13.0 192.168.178.79 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
As asked by P Ekambaram:
root#kubem1:/tmp# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-bgjdg 1/1 Running 5 40d
calico-node-nwkqw 1/1 Running 5 40d
calico-node-vrwn4 1/1 Running 5 40d
coredns-69cbb76ff8-fpssw 1/1 Running 5 40d
coredns-69cbb76ff8-tm6r8 1/1 Running 5 40d
kubernetes-dashboard-57df4db6b-2xrmb 1/1 Running 5 40d

I've found a solution for my "Problem".
This behavior was caused by a change in Docker v1.13.x and the issue was fixed in kubernetes with version 1.8.
The easy solution was to change the forward rules via iptables.
Run the following cmd on all worker nodes: "iptables -A FORWARD -j ACCEPT"
To fix it the right way i had to tell the kube-proxy the cidr for the pods.
Theoretical that could be solved in two ways:
Add "--cluster-cidr=10.0.0.0/16" as argument to the kube-proxy command line(in my case in the systemd service file)
Add 'clusterCIDR: "10.0.0.0/16"' to the kubeconfig file for kube-proxy
In my case the cmd line argument doesn't had any effect.
As i've added the line to my kubeconfig file and restarted the kube-proxy on all worker nodes everything works well.
Here is the github merge request for this "FORWARD" issue: link

kubernetes service IPs not reachable

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.
$ kubectl get no
NAME STATUS AGE
coreos-master-1 Ready,SchedulingDisabled 1h
coreos-worker-1 Ready 54m
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default curl-2421989462-h0dr7 1/1 Running 1 53m 10.2.26.4 coreos-worker-1
kube-system busybox 1/1 Running 0 55m 10.2.26.3 coreos-worker-1
kube-system kube-apiserver-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-controller-manager-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-worker-1 1/1 Running 0 58m 192.168.0.204 coreos-worker-1
kube-system kube-scheduler-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
$ kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.3.0.1 <none> 443/TCP 1h
As with the guide, I've setup a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. Pod network seems fine as busybox and curl containers get IPs. But the services network has problems. Originally, I've encountered this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all containers and DNS was ultimately not working.
From within the curl pod, I can reproduce the issue:
[ root#curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host
[ root#curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0 src 10.2.26.4
It seems ok that there's only a default route in the container. As I understood it, the request (to default route) should be intercepted by the kube-proxy on the worker node, forwarded to the the proxy on the master node where the IP is translated via iptables to the masters public IP.
There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:
core#coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
I'm having a real hard time to troubleshoot, as I lack the understanding of what the service IP is used for, how the service network is supposed to work in terms of traffic flow and how to best debug this.
So here're the questions I have:
What is the 1st IP of the service network (10.3.0.1 in this case) used for?
Is above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
What are the best ways to debug each step in the traffic flow? (I can't get any idea what's wrong from the logs)
Thanks!

The Sevice network provides fixed IPs for Services. It is not a routeable network (so don't expect ip ro to show anything nor will ping work) but a collection iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not Pods). These virtual IPs (see the pics!) act as load balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) with a specific set of labels as defined in the Service.
The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
Troubleshooting is different on each network/cluster setup. I would generally check:
Is kube-proxy running on each node? On some setups it's run via systemd and on others there is a DeamonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets thrmselves from /etc/kubernetes/manifests/kube-proxy.yaml
Locate logs for kube-proxy and find clues (can you post some?)
Change kube-proxy into userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node
Is the overlay (pod) network functional?
If you leave comments I will get back to you..

I had this same problem, and the ultimate solution that worked for me was enabling IP forwarding on all nodes in the cluster, which I had neglected to do.
$ sudo sysctl net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
Service IPs and DNS started working immediately afterwards.

I had the same issue, turned out to be a configuration issue in kube-proxy.yaml For the "master" parameter I had the ip address as in - --master=192.168.3.240 but it actually required to be a url like - --master=https://192.168.3.240
FYI my kube-proxy sucessfully uses --proxy-mode=iptables (v1.6.x)

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Kubernetes pods cannot communicate outbound - kubernetes

Turns out it is necessary to use a subnet that is routable on the network with the CNI in use if outbound communication from pods is necessary. I made the subnet routable on the external network and the pods can now communicate outbound.

Related

Can we setup a k8s bare matal server to run Bind DNS server (named) and have an access to it from the outside on port 53?

No route to host from some Kubernetes containers to other containers in same cluster

flannel restart very often

Why doesn't kube-proxy route traffic to another worker node?

kubernetes service IPs not reachable

Categories

Resources