VPC-native clusters in GKE 1.14 can't communicate

I have created two separate GKE clusters on Kubernetes 1.14.10.
I have followed "VPN access to in-house network not working after GKE cluster upgrade to 1.14.6" and the IP masquerading agent documentation.
I have tried to test this using a client pod and a server pod to exchange messages.
I'm using the internal node IP to send the message and created a ClusterIP service to expose the pods.
I have allowed requests from every instance in the firewall rules for ingress and egress, i.e. 0.0.0.0/0.
[Image: description of the cluster I have created]
The config map of the IP masquerading agent stays the same as in the documentation.
I'm able to ping the other node from within the pod, but a curl request returns "connection refused" and tcpdump shows no data.
Problem:
I need to communicate from cluster A to cluster B on GKE 1.14 with IP masquerading set to true. I either get "connection refused" or an i/o timeout. I have tried using internal and external node IPs, as well as a load balancer.

You have provided quite general information, and without details I cannot give an answer for your specific scenario. It might be related to how you created the clusters or to other firewall settings. I will therefore provide the correct steps to create and configure two clusters with firewall rules and masquerading; maybe you will be able to find which step you missed or misconfigured.
The cluster configurations (nodes, pods, svc) are at the bottom of the answer.
1. Create VPC and 2 clusters
The docs talk about two different projects, but you can do it in one project.
A good example of VPC creation and two clusters can be found in the GKE docs: Create VPC and Create 2 clusters. In the tier-1 cluster you can enable NetworkPolicy now instead of enabling it later.
After that you will need to create firewall rules. You will also need to add the ICMP protocol to the firewall rule; an illustrative command is sketched below.
At this point you should be able to ping between nodes of the two clusters.
For additional firewall rules (allowing connections between pods, svc, etc.) please check these docs.
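As a sketch, a single rule allowing ICMP, TCP, and UDP between the clusters could look like this (the rule name, VPC name, and source ranges are assumptions; adapt them to cover the node and pod CIDRs of both clusters):
$ gcloud compute firewall-rules create allow-cluster-to-cluster \
    --network my-vpc \
    --allow tcp,udp,icmp \
    --source-ranges 10.0.0.0/8,172.16.0.0/12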
2. Enable IP masquerade agent
As mentioned in the docs, to run IP masquerading:
The ip-masq-agent DaemonSet is automatically installed as an add-on with --nomasq-all-reserved-ranges argument in a GKE cluster, if one or more of the following is true:
The cluster has a network policy.
OR
The Pod's CIDR range is not within 10.0.0.0/8.
This means that tier-2-cluster already has ip-masq-agent in the kube-system namespace (because the Pod CIDR range is not within 10.0.0.0/8). If you enabled NetworkPolicy during the creation of tier-1-cluster, it should be installed there as well. If not, you will need to enable it using the command:
$ gcloud container clusters update tier-1-cluster --update-addons=NetworkPolicy=ENABLED --zone=us-central1-a
To verify that everything is OK, check whether the ip-masq-agent DaemonSet pods were created (one pod per node).
$ kubectl get ds ip-masq-agent -n kube-system
NAME            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                  AGE
ip-masq-agent   3         3         3       3            3           beta.kubernetes.io/masq-agent-ds-ready=true   168m
If you SSH to any of your nodes, you will be able to see the default iptables entries.
$ sudo iptables -t nat -L IP-MASQ
Chain IP-MASQ (1 references)
target prot opt source destination
RETURN all -- anywhere 169.254.0.0/16 /* ip-masq: local traffic is not subject to MASQUERADE */
RETURN all -- anywhere 10.0.0.0/8 /* ip-masq: RFC 1918 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 172.16.0.0/12 /* ip-masq: RFC 1918 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.168.0.0/16 /* ip-masq: RFC 1918 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 240.0.0.0/4 /* ip-masq: RFC 5735 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.0.2.0/24 /* ip-masq: RFC 5737 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 198.51.100.0/24 /* ip-masq: RFC 5737 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 203.0.113.0/24 /* ip-masq: RFC 5737 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 100.64.0.0/10 /* ip-masq: RFC 6598 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 198.18.0.0/15 /* ip-masq: RFC 6815 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.0.0.0/24 /* ip-masq: RFC 6890 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.88.99.0/24 /* ip-masq: RFC 7526 reserved range is not subject to MASQUERADE */
MASQUERADE all -- anywhere anywhere /* ip-masq: outbound traffic is subject to MASQUERADE (must be last in chain) */
3. Deploy test application
I used the Hello application from the GKE docs and deployed it on both clusters. In addition, I also deployed an ubuntu image for tests (a sketch of both is below).
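A minimal sketch approximating that setup follows; the image, port numbers, and labels are assumptions inferred from the outputs shown further down, not the exact manifests from the docs:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      greeting: hello
      department: world
  template:
    metadata:
      labels:
        greeting: hello
        department: world
    spec:
      containers:
      - name: hello
        # assumed sample image; it serves "Hello, world!" on port 8080
        image: gcr.io/google-samples/hello-app:2.0
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  type: NodePort
  selector:
    greeting: hello
    department: world
  ports:
  - port: 60000
    targetPort: 8080
The ubuntu test pod can then be created with:
$ kubectl run ubuntu --image=ubuntu --restart=Never -- sleep infinity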
4. Apply the proper configuration for IP masquerading
This config needs to be applied on the source cluster.
In short, if the destination CIDR is in nonMasqueradeCIDRs, the traffic keeps its internal IP as source; otherwise the node IP is shown as source.
Save the text below to a file named config:
nonMasqueradeCIDRs:
- 10.0.0.0/8
resyncInterval: 2s
masqLinkLocal: true
Create the IP masquerade ConfigMap:
$ kubectl create configmap ip-masq-agent --from-file config --namespace kube-system
This will overwrite the iptables configuration:
$ sudo iptables -t nat -L IP-MASQ
Chain IP-MASQ (2 references)
target prot opt source destination
RETURN all -- anywhere 10.0.0.0/8 /* ip-masq-agent: local traffic is not subject to MASQUERADE */
MASQUERADE all -- anywhere anywhere /* ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */
5. Tests:
When the IP is masqueraded
SSH to a node from the Tier 2 cluster and run:
sudo toolbox bash
apt-get update
apt install -y tcpdump
Now listen using the command below. Port 32502 is the NodePort of the service in the Tier 2 cluster.
tcpdump -i eth0 -nn -s0 -v port 32502
In the Tier 1 cluster you need to enter the ubuntu pod and curl NodeIP:NodePort:
$ kubectl exec -ti ubuntu -- bin/bash
You will need to install curl: apt-get install curl.
curl NodeIP:NodePort (the node which is listening, and the NodePort of the service from the Tier 2 cluster).
CLI:
root@ubuntu:/# curl 172.16.4.3:32502
Hello, world!
Version: 2.0.0
Hostname: hello-world-deployment-7f67f479f5-h4wdm
On the node you can see an entry like:
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:53:30.321641 IP (tos 0x0, ttl 63, id 25373, offset 0, flags [DF], proto TCP (6), length 60)
10.0.4.4.56018 > 172.16.4.3.32502: Flags [S], cksum 0x8648 (correct), seq 3001889856
10.0.4.4 is the IP of the node where the Ubuntu pod is located.
When the IP was not masqueraded
Remove the ConfigMap from the Tier 1 cluster:
$ kubectl delete cm ip-masq-agent -n kube-system
In the config file, change the CIDR to 172.16.4.0/22 (the Tier 2 node pool) and reapply the ConfigMap:
$ kubectl create configmap ip-masq-agent --from-file config --namespace kube-system
SSH to any node from Tier 1 to check that the iptables rules were changed:
sudo iptables -t nat -L IP-MASQ
Chain IP-MASQ (2 references)
target prot opt source destination
RETURN all -- anywhere 172.16.4.0/22 /* ip-masq-agent: local traffic is not subject to MASQUERADE */
MASQUERADE all -- anywhere anywhere /* ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */
For the test I again used the Ubuntu pod and curled the same IP as before.
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:16:50.316234 IP (tos 0x0, ttl 63, id 53160, offset 0, flags [DF], proto TCP (6), length 60)
10.4.2.8.57876 > 172.16.4.3.32502
10.4.2.8 is the internal IP of the Ubuntu pod.
Configuration for Tests:
TIER1
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/hello-world-deployment-7f67f479f5-b2qqz 1/1 Running 0 15m 10.4.1.8 gke-tier-1-cluster-default-pool-e006097b-5tnj <none> <none>
pod/hello-world-deployment-7f67f479f5-shqrt 1/1 Running 0 15m 10.4.2.5 gke-tier-1-cluster-default-pool-e006097b-lfvh <none> <none>
pod/hello-world-deployment-7f67f479f5-x7jvr 1/1 Running 0 15m 10.4.0.8 gke-tier-1-cluster-default-pool-e006097b-1wbf <none> <none>
ubuntu 1/1 Running 0 91s 10.4.2.8 gke-tier-1-cluster-default-pool-e006097b-lfvh <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/hello-world NodePort 10.0.36.46 <none> 60000:31694/TCP 14m department=world,greeting=hello
service/kubernetes ClusterIP 10.0.32.1 <none> 443/TCP 115m <none>
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/gke-tier-1-cluster-default-pool-e006097b-1wbf Ready <none> 115m v1.14.10-gke.36 10.0.4.2 35.184.38.21 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-1-cluster-default-pool-e006097b-5tnj Ready <none> 115m v1.14.10-gke.36 10.0.4.3 35.184.207.20 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-1-cluster-default-pool-e006097b-lfvh Ready <none> 115m v1.14.10-gke.36 10.0.4.4 35.226.105.31 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
TIER2
$ kubectl get pods,svc,nodes -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/hello-world-deployment-7f67f479f5-92zvk 1/1 Running 0 12m 172.20.1.5 gke-tier-2-cluster-default-pool-57b1cc66-xqt5 <none> <none>
pod/hello-world-deployment-7f67f479f5-h4wdm 1/1 Running 0 12m 172.20.1.6 gke-tier-2-cluster-default-pool-57b1cc66-xqt5 <none> <none>
pod/hello-world-deployment-7f67f479f5-m85jn 1/1 Running 0 12m 172.20.1.7 gke-tier-2-cluster-default-pool-57b1cc66-xqt5 <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/hello-world NodePort 172.16.24.206 <none> 60000:32502/TCP 12m department=world,greeting=hello
service/kubernetes ClusterIP 172.16.16.1 <none> 443/TCP 113m <none>
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/gke-tier-2-cluster-default-pool-57b1cc66-84ng Ready <none> 112m v1.14.10-gke.36 172.16.4.2 35.184.118.151 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-2-cluster-default-pool-57b1cc66-mlmn Ready <none> 112m v1.14.10-gke.36 172.16.4.3 35.238.231.160 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-2-cluster-default-pool-57b1cc66-xqt5 Ready <none> 112m v1.14.10-gke.36 172.16.4.4 35.202.94.194 Container-Optimized OS from Google 4.14.138+ docker://18.9.7

Related

Why is kube-proxy not working? No DNS resolution [closed]

I've just set up a fresh cluster with kubeadm and Kubernetes 1.21. All pods are marked ready, but I can't access any of them. After digging into the problem, it appears that no DNS resolution is possible. It seems that kube-proxy does not work.
This is a log from one of the kube-proxy pods:
I0712 05:50:46.511967 1 node.go:172] Successfully retrieved node IP: x.x.x.x
I0712 05:50:46.512039 1 server_others.go:140] Detected node IP x.x.x.x
W0712 05:50:46.512077 1 server_others.go:598] Unknown proxy mode "", assuming iptables proxy
I0712 05:50:46.545626 1 server_others.go:206] kube-proxy running in dual-stack mode, IPv4-primary
I0712 05:50:46.545672 1 server_others.go:212] Using iptables Proxier.
I0712 05:50:46.545692 1 server_others.go:219] creating dualStackProxier for iptables.
W0712 05:50:46.545715 1 server_others.go:512] detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6
I0712 05:50:46.546089 1 server.go:643] Version: v1.21.2
I0712 05:50:46.549861 1 conntrack.go:52] Setting nf_conntrack_max to 196608
I0712 05:50:46.550300 1 config.go:224] Starting endpoint slice config controller
I0712 05:50:46.550338 1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0712 05:50:46.550332 1 config.go:315] Starting service config controller
I0712 05:50:46.550354 1 shared_informer.go:240] Waiting for caches to sync for service config
W0712 05:50:46.553020 1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
W0712 05:50:46.555115 1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
I0712 05:50:46.650614 1 shared_informer.go:247] Caches are synced for service config
I0712 05:50:46.650634 1 shared_informer.go:247] Caches are synced for endpoint slice config
W0712 05:57:14.556916 1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
W0712 06:06:34.558550 1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
And these are my running pods:
kube-system pod/coredns-558bd4d5db-qpf5m 1/1 Running 1 8h
kube-system pod/coredns-558bd4d5db-r5jwz 1/1 Running 0 8h
kube-system pod/etcd-master2 1/1 Running 3 20h
kube-system pod/kube-apiserver-master2 1/1 Running 3 20h
kube-system pod/kube-controller-manager-master2 1/1 Running 3 8h
kube-system pod/kube-flannel-ds-b7xrm 1/1 Running 0 8h
kube-system pod/kube-flannel-ds-hcn7f 1/1 Running 0 8h
kube-system pod/kube-flannel-ds-rx8j6 1/1 Running 1 8h
kube-system pod/kube-flannel-ds-wc2jc 1/1 Running 0 8h
kube-system pod/kube-proxy-48wmr 1/1 Running 0 25m
kube-system pod/kube-proxy-4gw8t 1/1 Running 0 25m
kube-system pod/kube-proxy-h9djp 1/1 Running 0 25m
kube-system pod/kube-proxy-r4k9t 1/1 Running 0 24m
kube-system pod/kube-scheduler-master2 1/1 Running 3 20h
The command kubectl run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox nslookup kubernetes.default gives me:
Address 1: x.x.x.x
nslookup: can't resolve 'kubernetes.default'
pod "busybox" deleted
pod default/busybox terminated (Error)
My iptables rules:
Chain INPUT (policy ACCEPT)
target prot opt source destination
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes health check service ports */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
KUBE-FORWARD all -- anywhere anywhere /* kubernetes forwarding rules */
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
DOCKER-USER all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- 10.244.0.0/16 anywhere
ACCEPT all -- anywhere 10.244.0.0/16
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-EXTERNAL-SERVICES (2 references)
target prot opt source destination
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- anywhere anywhere /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP all -- !127.0.0.0/8 127.0.0.0/8 /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT
Chain KUBE-FORWARD (1 references)
target prot opt source destination
DROP all -- anywhere anywhere ctstate INVALID
ACCEPT all -- anywhere anywhere /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT all -- anywhere anywhere /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
Chain KUBE-PROXY-CANARY (0 references)
target prot opt source destination
Chain KUBE-SERVICES (2 references)
target prot opt source destination
Any idea?
[Edit]
# kubectl edit cm -n kube-system kubelet-config-1.21
apiVersion: v1
data:
  kubelet: |
    apiVersion: kubelet.config.k8s.io/v1beta1
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 0s
        enabled: true
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.crt
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 0s
        cacheUnauthorizedTTL: 0s
    cgroupDriver: systemd
    clusterDNS:
    - 10.96.0.10
    clusterDomain: cluster.local
# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 22h
Kube-proxy is a network service; it is the DNS provider that is responsible for DNS resolution. As I can see, you already have CoreDNS installed.
Check your kubelet configuration. It should point to the correct service, and this service should be accessible from within your pods.
Also please check whether the firewalld or iptables service is disabled on all nodes (a sketch of how to check is given after the example below).
Like this:
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/var/lib/kubernetes/ca.pem"
authorization:
  mode: Webhook
clusterDomain: "cluster.local"
clusterDNS:
  - "10.33.0.10"
kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.33.0.10 <none> 53/UDP,53/TCP,9153/TCP 35h
And then:
kubectl exec -ti net-diag-86589fd8f5-r28qq -- nslookup kubernetes.default
Server: 10.33.0.10
Address: 10.33.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.33.0.1
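To check the firewall services mentioned above, something like this on each node should do (a sketch, assuming systemd-based hosts; the service names may differ on your distribution):
$ sudo systemctl status firewalld
$ sudo systemctl disable --now firewalld   # only if you decide to turn it off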
UPD.
I just noticed that you have Docker as the container runtime and Flannel as the network provider. My understanding is that the problem may be Docker messing with your iptables rules; try setting all Docker rules to permissive and see if it works.
I'm not a big expert in iptables configuration, but something like this may help:
https://unrouted.io/2017/08/15/docker-firewall/
Also, if you are using Flannel, make sure that you are using the correct iface option. It may be critical if you are running a non-cloud installation.
https://github.com/flannel-io/flannel/blob/master/Documentation/configuration.md#key-command-line-options
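For illustration, the interface is typically passed as an argument to the flannel container in the kube-flannel DaemonSet (a sketch; the interface name eth1 is an assumption, pick the one your nodes actually use to reach each other):
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth1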

Unable to access nginx pod across nodes using ClusterIP

I have created an nginx deployment and an nginx service (ClusterIP) to access the nginx pod, but I am not able to access the pod through the cluster IP from nodes other than the one where the pod is scheduled.
I also looked at iptables, but I do not see a DNAT entry there.
root@kdm-master-1:~# k get all -A -o wide | grep nginx
default pod/nginx-6db489d4b7-pfkm9 1/1 Running 0 3h16m 10.244.1.3 kdm-worker-1 <none> <none>
default service/nginx ClusterIP 10.102.239.131 <none> 80/TCP 3h20m run=nginx
default deployment.apps/nginx 1/1 1 1 3h32m nginx nginx run=nginx
default replicaset.apps/nginx-6db489d4b7 1 1 1 3h32m nginx nginx pod-template-hash=6db489d4b7,run=nginx
iptables:
root@kdm-master-1:~# iptables -L -t nat | grep nginx
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.102.239.131 /* default/nginx:80-80 cluster IP */ tcp dpt:http
KUBE-SVC-OVTWZ4GROBJZO4C5 tcp -- anywhere 10.102.239.131 /* default/nginx:80-80 cluster IP */ tcp dpt:http
# Warning: iptables-legacy tables present, use iptables-legacy to see them
Please advise how I can resolve it.
Set net.ipv4.ip_forward=1 in /etc/sysctl.conf,
then run sysctl --system.
This will resolve the issue, and you will be able to access the pod from any node.
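A sketch of the exact steps on each node (assuming you want the setting to persist across reboots):
$ echo 'net.ipv4.ip_forward=1' | sudo tee -a /etc/sysctl.conf
$ sudo sysctl --system
$ sysctl net.ipv4.ip_forward   # verify: should print net.ipv4.ip_forward = 1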

Why doesn't kube-proxy route traffic to another worker node?

I've deployed several different services and always get the same error.
The service is reachable on the node port from the machine where the pod is running. On the two other nodes I get timeouts.
The kube-proxy is running on all worker nodes and I can see in the logfiles from kube-proxy that the service port was added and the node port was opened.
In this case I've deployed the Stars demo from Calico.
Kube-proxy log output:
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.229458 659 service.go:309] Adding new service port "management-ui/management-ui:" at 10.32.0.133:9001/TCP
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.257483 659 proxier.go:1427] Opened local port "nodePort for management-ui/management-ui:" (:30002/tcp)
kube-proxy is listening on port 30002:
root@kuben1:/tmp# netstat -lanp | grep 30002
tcp6 0 0 :::30002 :::* LISTEN 659/kube-proxy
There are also some iptables rules defined:
root@kuben1:/tmp# iptables -L -t nat | grep management-ui
KUBE-MARK-MASQ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-MARK-MASQ tcp -- !10.200.0.0/16 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
The interesting part is that I can reach the service IP from any worker node:
root@kubem1:/tmp# kubectl get svc -n management-ui
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
management-ui NodePort 10.32.0.133 <none> 9001:30002/TCP 52m
The service IP/port can be accessed from any worker node if I do a curl http://10.32.0.133:9001.
I don't understand why kube-proxy does not "route" this properly...
Does anyone have a hint where I can find the error?
Here are some cluster specs:
This is a hand-built cluster inspired by Kelsey Hightower's "Kubernetes the hard way" guide.
6 nodes (3 masters, 3 workers), local VMs
OS: Ubuntu 18.04
K8s: v1.13.0
Docker: 18.9.3
CNI: Calico
Component status on the master nodes looks okay:
root@kubem1:/tmp# kubectl get componentstatus
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
The worker nodes look okay, if I trust kubectl:
root@kubem1:/tmp# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kuben1 Ready <none> 39d v1.13.0 192.168.178.77 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben2 Ready <none> 39d v1.13.0 192.168.178.78 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben3 Ready <none> 39d v1.13.0 192.168.178.79 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
As asked by P Ekambaram:
root@kubem1:/tmp# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-bgjdg 1/1 Running 5 40d
calico-node-nwkqw 1/1 Running 5 40d
calico-node-vrwn4 1/1 Running 5 40d
coredns-69cbb76ff8-fpssw 1/1 Running 5 40d
coredns-69cbb76ff8-tm6r8 1/1 Running 5 40d
kubernetes-dashboard-57df4db6b-2xrmb 1/1 Running 5 40d
I've found a solution for my "problem".
This behavior was caused by a change in Docker v1.13.x, and the issue was fixed in Kubernetes 1.8.
The easy solution was to change the forward rules via iptables.
Run the following command on all worker nodes: iptables -A FORWARD -j ACCEPT
To fix it the right way, I had to tell kube-proxy the CIDR for the pods.
In theory, that can be done in two ways:
Add "--cluster-cidr=10.0.0.0/16" as an argument to the kube-proxy command line (in my case in the systemd service file)
Add 'clusterCIDR: "10.0.0.0/16"' to the kubeconfig file for kube-proxy
In my case the command-line argument didn't have any effect.
Once I added the line to my kubeconfig file and restarted kube-proxy on all worker nodes, everything worked well (a sketch of that file is below).
Here is the github merge request for this "FORWARD" issue: link
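For reference, a minimal sketch of what that option can look like when kube-proxy is driven by a KubeProxyConfiguration file, as in "the hard way" setups (only the clusterCIDR value comes from the answer above; the surrounding fields assume kube-proxy is started with --config pointing at this file):
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
clusterCIDR: "10.0.0.0/16"
With the cluster CIDR known, kube-proxy can program the forwarding accept rules for pod-sourced traffic itself, so the manual iptables -A FORWARD -j ACCEPT workaround should no longer be needed.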

ipvsadm not showing any entry in kubeadm cluster

I have installed kubeadm and created a service and pods:
packet@test:~$ kubectl get pod
NAME READY STATUS RESTARTS AGE
udp-server-deployment-6f87f5c9-466ft 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-5j9rt 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-g9wrr 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-ntbkc 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-xlbjq 1/1 Running 0 5m
packet@test:~$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 1h
udp-server-service NodePort 10.102.67.0 <none> 10001:30001/UDP 6m
But still I am not able to access the udp-server pod:
packet@test:~$ curl http://192.168.43.161:30001
curl: (7) Failed to connect to 192.168.43.161 port 30001: Connection refused
While debugging, I could see that kube-proxy is running, but there is no entry in IPVS:
root@test:~# ps auxw | grep kube-proxy
root 4050 0.5 0.7 44340 29952 ? Ssl 14:33 0:25 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
root 6094 0.0 0.0 14224 968 pts/1 S+ 15:48 0:00 grep --color=auto kube-proxy
root@test:~# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
It seems the missing ipvsadm entries are causing the connection timeout.
Regards, Ranjith
From this issue (putting aside the load balancer part),
Both externalIPs and status.loadBalancer.ingress[].ip seem to be ignored by kube-proxy in IPVS mode, so external traffic is completely unrouteable.
In contrast, kube-proxy in iptables mode creates DNAT/SNAT rules for external and loadbalancer IPs.
So check if adding a network plugin (flannel, Calico, ...) would improve the situation.
Or check out cloudnativelabs/kube-router, which is also ipvs-based.
A lean yet powerful alternative to several network components used in typical Kubernetes clusters.
All this from a single DaemonSet/Binary. It doesn't get any easier.
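Before swapping components, it may help to confirm which proxy mode kube-proxy is actually running in (a sketch, assuming a kubeadm cluster where kube-proxy reads its configuration from a ConfigMap):
$ kubectl -n kube-system get configmap kube-proxy -o yaml | grep 'mode:'
An empty value means the iptables default; 'ipvs' means the behavior quoted above applies.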
Since curl uses a TCP connection, while 30001 is a UDP port, they don't work together; try a UDP probe tool, like nmap.
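For example (a sketch; UDP scans with nmap need root, and netcat is an alternative for a quick echo test):
$ sudo nmap -sU -p 30001 192.168.43.161
$ echo hello | nc -u 192.168.43.161 30001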
Initially I had created the Linux VM using VirtualBox (running on Windows), where I found this type of issue.
Now I have created the Linux VM using Virtual Machine Manager (running on Linux); in this setup there is no issue and everything works fine.
It would be great if anyone could tell whether there is any restriction in VirtualBox.

kubernetes service IPs not reachable

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.
$ kubectl get no
NAME STATUS AGE
coreos-master-1 Ready,SchedulingDisabled 1h
coreos-worker-1 Ready 54m
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default curl-2421989462-h0dr7 1/1 Running 1 53m 10.2.26.4 coreos-worker-1
kube-system busybox 1/1 Running 0 55m 10.2.26.3 coreos-worker-1
kube-system kube-apiserver-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-controller-manager-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-worker-1 1/1 Running 0 58m 192.168.0.204 coreos-worker-1
kube-system kube-scheduler-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
$ kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.3.0.1 <none> 443/TCP 1h
As with the guide, I've set up a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. The pod network seems fine, as the busybox and curl containers get IPs, but the service network has problems. Originally, I encountered this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all containers and DNS was ultimately not working.
From within the curl pod, I can reproduce the issue:
[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host
[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0 src 10.2.26.4
It seems OK that there's only a default route in the container. As I understood it, the request (to the default route) should be intercepted by the kube-proxy on the worker node and forwarded to the proxy on the master node, where the IP is translated via iptables to the master's public IP.
There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:
core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
I'm having a really hard time troubleshooting this, as I lack an understanding of what the service IP is used for, how the service network is supposed to work in terms of traffic flow, and how best to debug it.
So here are the questions I have:
What is the first IP of the service network (10.3.0.1 in this case) used for?
Is the above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
What are the best ways to debug each step in the traffic flow? (I can't get any idea what's wrong from the logs)
Thanks!
The Service network provides fixed IPs for Services. It is not a routable network (so don't expect ip ro to show anything, nor will ping work) but a collection of iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not the Pods). These virtual IPs (see the pics!) act as a load-balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) with a specific set of labels, as defined in the Service.
The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
Troubleshooting is different on each network/cluster setup. I would generally check:
Is kube-proxy running on each node? On some setups it's run via systemd, and on others there is a DaemonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets themselves from /etc/kubernetes/manifests/kube-proxy.yaml (a quick rule check is sketched after this list)
Locate the logs for kube-proxy and look for clues (can you post some?)
Change kube-proxy into userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node
Is the overlay (pod) network functional?
If you leave comments I will get back to you..
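For example, to verify that kube-proxy has actually programmed rules for the service IP in question (a sketch; KUBE-SERVICES is the default kube-proxy chain, and 10.3.0.1 is the IP from the question):
$ sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.3.0.1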
I had this same problem, and the ultimate solution that worked for me was enabling IP forwarding on all nodes in the cluster, which I had neglected to do.
$ sudo sysctl net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
Service IPs and DNS started working immediately afterwards.
I had the same issue; it turned out to be a configuration issue in kube-proxy.yaml. For the "master" parameter I had the IP address, as in "- --master=192.168.3.240", but it actually needs to be a URL, like "- --master=https://192.168.3.240"
FYI, my kube-proxy successfully uses --proxy-mode=iptables (v1.6.x)