Why doesn't kube-proxy route traffic to another worker node?

I've deployed several different services and always get the same error.
The service is reachable on the NodePort from the machine where the pod is running. On the two other nodes I get timeouts.
kube-proxy is running on all worker nodes, and I can see in the kube-proxy log files that the service port was added and the node port was opened.
In this case I've deployed the stars demo from Calico.
Kube-proxy log output:
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.229458 659 service.go:309] Adding new service port "management-ui/management-ui:" at 10.32.0.133:9001/TCP
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.257483 659 proxier.go:1427] Opened local port "nodePort for management-ui/management-ui:" (:30002/tcp)
kube-proxy is listening on port 30002:
root@kuben1:/tmp# netstat -lanp | grep 30002
tcp6 0 0 :::30002 :::* LISTEN 659/kube-proxy
There are also some iptables rules defined:
root@kuben1:/tmp# iptables -L -t nat | grep management-ui
KUBE-MARK-MASQ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-MARK-MASQ tcp -- !10.200.0.0/16 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
The interesting part is that I can reach the service IP from any worker node:
root@kubem1:/tmp# kubectl get svc -n management-ui
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
management-ui NodePort 10.32.0.133 <none> 9001:30002/TCP 52m
The service IP/port can be accessed from any worker node if I do a curl http://10.32.0.133:9001.
I don't understand why kube-proxy does not "route" this properly...
Does anyone have a hint where I can find the error?
Here are some cluster specs:
This is a hand-built cluster inspired by Kelsey Hightower's "Kubernetes the hard way" guide.
6 nodes (3 masters, 3 workers), local VMs
OS: Ubuntu 18.04
K8s: v1.13.0
Docker: 18.9.3
CNI: Calico
Component status on the master nodes looks okay:
root@kubem1:/tmp# kubectl get componentstatus
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
The worker nodes look okay if I trust kubectl:
root@kubem1:/tmp# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kuben1 Ready <none> 39d v1.13.0 192.168.178.77 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben2 Ready <none> 39d v1.13.0 192.168.178.78 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben3 Ready <none> 39d v1.13.0 192.168.178.79 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
As asked by P Ekambaram:
root@kubem1:/tmp# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-bgjdg 1/1 Running 5 40d
calico-node-nwkqw 1/1 Running 5 40d
calico-node-vrwn4 1/1 Running 5 40d
coredns-69cbb76ff8-fpssw 1/1 Running 5 40d
coredns-69cbb76ff8-tm6r8 1/1 Running 5 40d
kubernetes-dashboard-57df4db6b-2xrmb 1/1 Running 5 40d

I've found a solution for my problem.
This behavior was caused by a change in Docker v1.13.x, which sets the default policy of the iptables FORWARD chain to DROP; the issue was fixed in Kubernetes with version 1.8.
The easy solution was to change the forward rules via iptables, by running the following command on all worker nodes:
iptables -A FORWARD -j ACCEPT
To fix it the right way, I had to tell kube-proxy the CIDR for the pods. In theory that can be solved in two ways:
Add --cluster-cidr=10.0.0.0/16 as an argument to the kube-proxy command line (in my case in the systemd service file)
Add clusterCIDR: "10.0.0.0/16" to the kube-proxy configuration file
In my case the command-line argument didn't have any effect, but after I added the line to the config file and restarted kube-proxy on all worker nodes, everything worked.
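For reference, here is a minimal sketch of the config-file variant. The file path and layout are assumptions based on a "hard way"-style setup (kube-proxy run via systemd, reading a KubeProxyConfiguration file); adjust them to your cluster:
# Sketch only: write clusterCIDR into the kube-proxy config file and restart.
# The path is an assumption; the CIDR must match your pod network.
cat <<'EOF' | sudo tee /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: "/var/lib/kube-proxy/kubeconfig"
mode: "iptables"
clusterCIDR: "10.0.0.0/16"
EOF
sudo systemctl restart kube-proxy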
Here is the GitHub pull request for this "FORWARD" issue: link

Related

unable to upgrade connection: pod does not exist

I get an error when I want to access my pod:
error: unable to upgrade connection: pod does not exist
It's a cluster with 3 nodes; some details below. Thanks in advance.
root@kubm:~/deploy/nginx# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kubm Ready master 37h v1.17.0 10.0.2.15 <none> Ubuntu 16.04.6 LTS 4.4.0-150-generic docker://19.3.5
kubnode Ready <none> 37h v1.17.0 10.0.2.15 <none> Ubuntu 16.04.6 LTS 4.4.0-150-generic docker://19.3.5
kubnode2 Ready <none> 37h v1.17.0 10.0.2.15 <none> Ubuntu 16.04.6 LTS 4.4.0-150-generic docker://19.3.5
root@kubm:~/deploy/nginx# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-59c9f8dff-v7dvg 1/1 Running 0 16h 10.244.2.3 kubnode2 <none> <none>
root@kubm:~/deploy/nginx# kubectl exec -it nginx-59c9f8dff-v7dvg -- /bin/bash
error: unable to upgrade connection: pod does not exist
I had the same issue the first time I ran a cluster with Vagrant and VirtualBox.
Adding KUBELET_EXTRA_ARGS=--node-ip=x.x.x.x (where x.x.x.x is your VM's IP) to /etc/default/kubelet (this can be part of the provisioning script, for example) and then restarting kubelet (systemctl restart kubelet) fixes the issue.
This is the recommended way to pass extra runtime arguments to kubelet, as you can see in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf. Alternatively you can also edit the kubelet config file under /etc/kubernetes/kubelet.conf
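For example, on an Ubuntu node it could look like this (the IP is a placeholder; use each node's reachable address):
# Placeholder IP: substitute this node's host-only/bridged address.
echo 'KUBELET_EXTRA_ARGS=--node-ip=192.168.56.11' | sudo tee /etc/default/kubelet
sudo systemctl restart kubelet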
The 10.0.2.15 IP address is the default for VirtualBox NAT.
If you deploy a VM using a Vagrantfile, your eth0 adapter gets the 10.0.2.15 address and the eth1 adapter is assigned another IP address; that's why all three nodes above report the same INTERNAL-IP.
Kubernetes uses the eth0 adapter to route packets between pods.
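A quick way to confirm this inside each VM (interface names assume the typical Vagrant/VirtualBox layout):
ip -4 addr show eth0   # NAT adapter: expect the shared 10.0.2.15
ip -4 addr show eth1   # host-only/bridged adapter: the address to pass via --node-ip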
I had the same issue, and the problem was the pod status "ImagePullBackOff". Because of this, it was throwing the error:
error: unable to upgrade connection: container not found ("nginx")
[mayur@mayur_cloudtest ~]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-598b589c46-zcr5d 0/1 ImagePullBackOff 0 116s
[mayur@mayur_cloudtest ~]$ kubectl exec -it nginx-598b589c46-zcr5d -- /bin/bash
error: unable to upgrade connection: container not found ("nginx")
I use the command below to get into a pod:
kubectl exec -i -t <pod-name> -- /bin/bash
Note that -i and -t are two separate flags, separated by a space.
If you have a multi-container pod you should pass the container name with the -c flag, as shown below; otherwise it will by default connect to the first container in the pod.
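For example (pod and container names here are hypothetical):
kubectl exec -i -t my-pod -c my-container -- /bin/bash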
ubuntu@cluster-master:~$ kubectl exec -i -t nginx -- /bin/bash
root@nginx:/# whoami
root
root@nginx:/# date
Tue Jan 7 14:12:29 UTC 2020
root@nginx:/#
Refer to the help section of the command: kubectl exec --help

Kube-state-metrics error: Failed to create client: ... i/o timeout

I'm running Kubernetes in virtual machines and going through the basic tutorials, currently "Add logging and metrics to the PHP / Redis Guestbook" example. I'm trying to install kube-state-metrics:
git clone https://github.com/kubernetes/kube-state-metrics.git kube-state-metrics
kubectl create -f kube-state-metrics/kubernetes
but it fails.
kubectl describe pod --namespace kube-system kube-state-metrics-7d84474f4d-d5dg7
...
Warning Unhealthy 28m (x8 over 30m) kubelet, kubernetes-node1 Readiness probe failed: Get http://192.168.129.102:8080/healthz: dial tcp 192.168.129.102:8080: connect: connection refused
kubectl logs --namespace kube-system kube-state-metrics-7d84474f4d-d5dg7 -c kube-state-metrics
I0514 17:29:26.980707 1 main.go:85] Using default collectors
I0514 17:29:26.980774 1 main.go:93] Using all namespace
I0514 17:29:26.980780 1 main.go:129] metric white-blacklisting: blacklisting the following items:
W0514 17:29:26.980800 1 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0514 17:29:26.983504 1 main.go:169] Testing communication with server
F0514 17:29:56.984025 1 main.go:137] Failed to create client: ERROR communicating with apiserver: Get https://10.96.0.1:443/version?timeout=32s: dial tcp 10.96.0.1:443: i/o timeout
I'm unsure if this 10.96.0.1 IP is correct. My virtual machines are in a bridged network 10.10.10.0/24 and a host-only network 192.168.59.0/24. When initializing Kubernetes I used the argument --pod-network-cidr=192.168.0.0/16 so that's one more IP range that I'd expect. But 10.96.0.1 looks unfamiliar.
I'm new to Kubernetes, just doing the basic tutorials, so I don't know what to do now. How to fix it or investigate further?
EDIT - additional info:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kubernetes-master Ready master 15d v1.14.1 10.10.10.11 <none> Ubuntu 18.04.2 LTS 4.15.0-48-generic docker://18.9.2
kubernetes-node1 Ready <none> 15d v1.14.1 10.10.10.5 <none> Ubuntu 18.04.2 LTS 4.15.0-48-generic docker://18.9.2
kubernetes-node2 Ready <none> 15d v1.14.1 10.10.10.98 <none> Ubuntu 18.04.2 LTS 4.15.0-48-generic docker://18.9.2
The command I used to initialize the cluster:
sudo kubeadm init --apiserver-advertise-address=192.168.59.20 --pod-network-cidr=192.168.0.0/16
First, the 10.96.0.1 address is expected: it is the ClusterIP of the kubernetes Service (the API server), i.e. the first address of kubeadm's default service CIDR 10.96.0.0/12.
The timeout itself is probably caused by an overlap of the pod network with the node network: you set the pod network CIDR to 192.168.0.0/16, which includes your host-only network 192.168.59.0/24.
To solve this you can change the pod network CIDR to 192.168.0.0/24 (not recommended, as this gives you only about 256 addresses for your pod networking).
You can also use a different range for Calico. If you want to do it on a running cluster, here is an instruction.
Another way I tried: initialize with a different range, e.g. sudo kubeadm init --apiserver-advertise-address=192.168.59.20 --pod-network-cidr=10.0.0.0/8, edit the Calico manifest to the same range, and apply it after the init.
Another way would be using a different CNI like Flannel, which uses 10.244.0.0/16 by default.
You can find more information about the ranges used by CNI plugins here.
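To see which ranges your cluster actually uses, something like this should work (the manifest paths assume a kubeadm setup):
kubectl get svc kubernetes   # its CLUSTER-IP (here 10.96.0.1) is the first address of the service CIDR
sudo grep -E 'service-cluster-ip-range|cluster-cidr' /etc/kubernetes/manifests/kube-apiserver.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml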

cilium configuration in IPv6 mode

I am using Cilium with Kubernetes 1.12 in direct routing mode. It works fine in IPv4 mode. We are using the cilium/cilium:no-routes image and cloudnativelabs/kube-router to advertise the routes through BGP.
Now I would like to configure the same in an IPv6-only Kubernetes cluster, but I found that the kube-router pod is crashing and not creating the route entries for the --pod-network-cidr.
Following are the lab details:
Master node: IPv6 private IP fd0c:6493:12bf:2942::ac18:1164
Worker node: IPv6 private IP fd0c:6493:12bf:2942::ac18:1165
The public IPs of both nodes are IPv4, as I don't have public IPv6 addresses.
The IPv6-only K8s cluster is created as follows:
master:
sudo kubeadm init --kubernetes-version v1.13.2 --pod-network-cidr=2001:2::/64 --apiserver-advertise-address=fd0c:6493:12bf:2942::ac18:1164 --token-ttl 0
worker:
sudo kubeadm join [fd0c:6493:12bf:2942::ac18:1164]:6443 --token 9k9sdq.el298rka0sjqy0ha --discovery-token-ca-cert-hash sha256:b830c22dc21561c9e9287275ecc675ec6de012662fabde3bd1aba03be66562eb
kubectl get nodes -o wide
NAME      STATUS     ROLES    AGE   VERSION   INTERNAL-IP                      EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION      CONTAINER-RUNTIME
master    NotReady   master   38h   v1.13.2   fd0c:6493:12bf:2942::ac18:1164   <none>        Ubuntu 18.10   4.18.0-13-generic   docker://18.6.0
worker1   Ready      <none>   38h   v1.13.2   fd0c:6493:12bf:2942::ac18:1165   <none>        Ubuntu 18.10   4.18.0-10-generic   docker://18.6.0
The master node is NotReady since the CNI is not configured yet and the CoreDNS pods are not up yet.
Now install Cilium in IPv6 mode.
1. Run etcd on the master node.
sudo docker run -d --network=host \
--name "cilium-etcd" k8s.gcr.io/etcd:3.2.24 \
etcd -name etcd0 \
-advertise-client-urls http://[fd0c:6493:12bf:2942::ac18:1164]:4001 \
-listen-client-urls http://[fd0c:6493:12bf:2942::ac18:1164]:4001 \
-initial-advertise-peer-urls http://[fd0c:6493:12bf:2942::ac18:1164]:2382 \
-listen-peer-urls http://[fd0c:6493:12bf:2942::ac18:1164]:2382 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster etcd0=http://[fd0c:6493:12bf:2942::ac18:1164]:2382 \
-initial-cluster-state new
Here [fd0c:6493:12bf:2942::ac18:1164] is the master node's IPv6 address.
2. sudo mount bpffs /sys/fs/bpf -t bpf
3. Run kube-router.
Expected Result:
Kube-router adds a routing entry for the pod CIDR of each of the other nodes in the cluster, with the node's public IP as gateway. For IPv4 the following result is obtained: a routing entry is created on node-1 for node-2 (public IP 10.40.139.196, pod CIDR 10.244.1.0/24). The device is the interface the public IP is bound to.
$ ip route show
10.244.1.0/24 via 10.40.139.196 dev ens4f0.116 proto 17
Note: For IPv6 only Kubernetes, --pod-network-cidr=2001:2::/64
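For the IPv6 cluster the analogous check would be the following; the expected route is an assumption derived from the CIDR passed to kubeadm above:
ip -6 route show
# expected (sketch): 2001:2::/64 via <worker-ip> dev <iface> proto 17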
Actual result:
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-86c58d9df4-g7nvf 0/1 ContainerCreating 0 22h
coredns-86c58d9df4-rrtgp 0/1 ContainerCreating 0 38h
etcd-master 1/1 Running 0 38h
kube-apiserver-master 1/1 Running 0 38h
kube-controller-manager-master 1/1 Running 0 38h
kube-proxy-9xb2c 1/1 Running 0 38h
kube-proxy-jfv2m 1/1 Running 0 38h
kube-router-5xjv4 0/1 CrashLoopBackOff 15 73m
kube-scheduler-master 1/1 Running 0 38h
Question:
Can kube-router use the private IPv6 addresses used by the Kubernetes cluster instead of the public IPs, which in our case are IPv4?

ipvsadm not showing any entry in kubeadm cluster

I have installed a cluster with kubeadm and created a service and pods:
packet@test:~$ kubectl get pod
NAME READY STATUS RESTARTS AGE
udp-server-deployment-6f87f5c9-466ft 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-5j9rt 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-g9wrr 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-ntbkc 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-xlbjq 1/1 Running 0 5m
packet@test:~$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 1h
udp-server-service NodePort 10.102.67.0 <none> 10001:30001/UDP 6m
but I am still not able to access the udp-server pod:
packet@test:~$ curl http://192.168.43.161:30001
curl: (7) Failed to connect to 192.168.43.161 port 30001: Connection refused
While debugging, I could see that kube-proxy is running, but there are no entries in IPVS:
root@test:~# ps auxw | grep kube-proxy
root 4050 0.5 0.7 44340 29952 ? Ssl 14:33 0:25 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
root 6094 0.0 0.0 14224 968 pts/1 S+ 15:48 0:00 grep --color=auto kube-proxy
root@test:~# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
It seems the missing entries in ipvsadm are causing the connection timeout.
From this issue (putting aside the load balancer part):
Both externalIPs and status.loadBalancer.ingress[].ip seem to be ignored by kube-proxy in IPVS mode, so external traffic is completely unrouteable.
In contrast, kube-proxy in iptables mode creates DNAT/SNAT rules for external and loadbalancer IPs.
So check whether adding a network plugin (Flannel, Calico, ...) would improve the situation.
Or check out cloudnativelabs/kube-router, which is also IPVS-based:
A lean yet powerful alternative to several network components used in typical Kubernetes clusters.
All this from a single DaemonSet/Binary. It doesn't get any easier.
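If you want to test whether iptables mode behaves differently on a kubeadm cluster, one way is roughly this (a sketch; inspect the ConfigMap before editing):
kubectl -n kube-system edit configmap kube-proxy         # set mode: "iptables"
kubectl -n kube-system delete pod -l k8s-app=kube-proxy  # recreate the kube-proxy pods to pick up the change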
Since curl uses TCP, while 30001 is a UDP port, they won't work together; try a UDP probe tool like nmap.
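For example (UDP scanning generally requires root; the IP and port are taken from the question):
sudo nmap -sU -p 30001 192.168.43.161   # "open" or "open|filtered" means the UDP port responds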
Initially I created a Linux VM using VirtualBox (running on Windows), where I hit this type of issue.
Then I created the Linux VM using virt-manager (running on Linux); in this setup there is no issue and everything works fine.
It would be great if anyone could tell whether there is any restriction on the VirtualBox side.

kubernetes service IPs not reachable

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.
$ kubectl get no
NAME STATUS AGE
coreos-master-1 Ready,SchedulingDisabled 1h
coreos-worker-1 Ready 54m
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default curl-2421989462-h0dr7 1/1 Running 1 53m 10.2.26.4 coreos-worker-1
kube-system busybox 1/1 Running 0 55m 10.2.26.3 coreos-worker-1
kube-system kube-apiserver-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-controller-manager-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-worker-1 1/1 Running 0 58m 192.168.0.204 coreos-worker-1
kube-system kube-scheduler-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
$ kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.3.0.1 <none> 443/TCP 1h
As with the guide, I've set up a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. The pod network seems fine, as the busybox and curl containers get IPs. But the service network has problems. I originally encountered this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all containers and DNS ultimately wasn't working.
From within the curl pod, I can reproduce the issue:
[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host
[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0 src 10.2.26.4
It seems okay that there's only a default route in the container. As I understood it, the request (to the default route) should be intercepted by kube-proxy on the worker node and forwarded to the proxy on the master node, where the IP is translated via iptables to the master's public IP.
There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:
core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
I'm having a really hard time troubleshooting, as I lack an understanding of what the service IP is used for, how the service network is supposed to work in terms of traffic flow, and how best to debug this.
So here're the questions I have:
What is the 1st IP of the service network (10.3.0.1 in this case) used for?
Is above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
What are the best ways to debug each step in the traffic flow? (I can't get any idea what's wrong from the logs)
Thanks!
The Service network provides fixed IPs for Services. It is not a routeable network (so don't expect ip ro to show anything, nor will ping work) but a collection of iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not the Pods). These virtual IPs (see the pics!) act as a load-balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) with a specific set of labels as defined in the Service.
The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
Troubleshooting is different on each network/cluster setup. I would generally check:
Is kube-proxy running on each node? On some setups it's run via systemd and on others there is a DaemonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets themselves from /etc/kubernetes/manifests/kube-proxy.yaml
Locate logs for kube-proxy and find clues (can you post some?)
Change kube-proxy into userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node
Is the overlay (pod) network functional?
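A sketch for checking the first, second and fourth points from a node shell (the chain name assumes kube-proxy's iptables mode; the pod IP is a placeholder):
iptables -t nat -L KUBE-SERVICES -n | grep 10.3.0.1   # is there a rule for the service IP?
kubectl get ep kubernetes                             # are there endpoints behind the service?
ping <pod-ip-on-another-node>                         # does the overlay network carry traffic?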
If you leave comments I will get back to you..
I had this same problem, and the ultimate solution that worked for me was enabling IP forwarding on all nodes in the cluster, which I had neglected to do.
$ sudo sysctl net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
Service IPs and DNS started working immediately afterwards.
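To make the setting survive reboots, something like this is typical (the file name is an example):
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-ip-forward.conf
sudo sysctl --system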
I had the same issue; it turned out to be a configuration issue in kube-proxy.yaml. For the "master" parameter I had the IP address, as in --master=192.168.3.240, but it actually needs to be a URL, like --master=https://192.168.3.240
FYI, my kube-proxy successfully uses --proxy-mode=iptables (v1.6.x)