iptables rules for kube-dns - kubernetes

I looked into the iptables rules used by kube-dns, and I'm a little confused by the sub-chain "KUBE-SEP-V7KWRXXOBQHQVWAT". The contents of this sub-chain are below.
My question is: why do we need the KUBE-MARK-MASQ target when the source IP address (172.18.1.5) is the kube-dns pod's IP address? Per my understanding, the rule should match the kube-dns pod's address 172.18.1.5 as the destination, not as the source, because all the DNS queries come from other addresses (services); the DNS queries cannot originate from the pod itself.
# iptables -t nat -L KUBE-SEP-V7KWRXXOBQHQVWAT
Chain KUBE-SEP-V7KWRXXOBQHQVWAT (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 172.18.1.5 anywhere /* kube-system/kube-dns:dns-tcp */
DNAT tcp -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */ tcp to:172.18.1.5:53
Here is the full chain information:
# iptables -t nat -L KUBE-SERVICES
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !172.18.1.0/24 10.0.62.222 /* kube-system/metrics-server cluster IP */ tcp dpt:https
KUBE-SVC-QMWWTXBG7KFJQKLO tcp -- anywhere 10.0.62.222 /* kube-system/metrics-server cluster IP */ tcp dpt:https
KUBE-MARK-MASQ tcp -- !172.18.1.0/24 10.0.213.2 /* kube-system/healthmodel-replicaset-service cluster IP */ tcp dpt:25227
KUBE-SVC-WT3SFWJ44Q74XUPR tcp -- anywhere 10.0.213.2 /* kube-system/healthmodel-replicaset-service cluster IP */ tcp dpt:25227
KUBE-MARK-MASQ tcp -- !172.18.1.0/24 10.0.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-SVC-NPX46M4PTMTKRN6Y tcp -- anywhere 10.0.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-MARK-MASQ udp -- !172.18.1.0/24 10.0.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-TCOU7JCQXEZGVUNU udp -- anywhere 10.0.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-MARK-MASQ tcp -- !172.18.1.0/24 10.0.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-ERIFXISQEP7F7OF4 tcp -- anywhere 10.0.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
# iptables -t nat -L KUBE-SVC-ERIFXISQEP7F7OF4
Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
target prot opt source destination
KUBE-SEP-V7KWRXXOBQHQVWAT all -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */ statistic mode random probability 0.50000000000
KUBE-SEP-BWCLCJLZ5KI6FXBW all -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */
# iptables -t nat -L KUBE-SEP-V7KWRXXOBQHQVWAT
Chain KUBE-SEP-V7KWRXXOBQHQVWAT (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 172.18.1.5 anywhere /* kube-system/kube-dns:dns-tcp */
DNAT tcp -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */ tcp to:172.18.1.5:53

You can think of Kubernetes service routing in iptables as the following steps:
1. Loop through the chain holding all Kubernetes services.
2. If you hit a matching service IP and port, jump to that service's chain.
3. The service chain randomly selects an endpoint from the list of endpoints (using probabilities).
4. If the selected endpoint has the same IP as the source address of the traffic, mark the packet for MASQUERADE later (this is the KUBE-MARK-MASQ you are asking about). In other words, if a pod talks to a service IP and that service IP "resolves" to the pod itself, the packet must be marked for MASQUERADE (the actual MASQUERADE target sits in the POSTROUTING chain because it is only allowed to happen there).
5. DNAT to the selected endpoint and port. This happens regardless of whether step 4 occurred.
If you look at iptables -t nat -L POSTROUTING, you will see a rule that matches marked packets; that is where the MASQUERADE actually happens.
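The "randomly select an endpoint" part maps to the cascading statistic mode random rules visible in KUBE-SVC-ERIFXISQEP7F7OF4 above: the first rule matches with probability 0.5 and the last rule is unconditional. Here is a plain-Python sketch (not kube-proxy code) of why that cascade yields a uniform choice over the endpoints:

```python
import random
from collections import Counter

def select_endpoint(endpoints):
    """Mimic kube-proxy's cascading iptables rules: with n endpoints,
    rule i matches with probability 1/(n - i); the last rule always matches."""
    n = len(endpoints)
    for i, ep in enumerate(endpoints[:-1]):
        if random.random() < 1.0 / (n - i):
            return ep
    return endpoints[-1]  # unconditional final rule

counts = Counter(select_endpoint(["KUBE-SEP-a", "KUBE-SEP-b"])
                 for _ in range(100_000))
# Both endpoints end up chosen roughly half the time each,
# matching the probability 0.50000000000 rule in the dump above.
```

With three endpoints the rule probabilities become 1/3, then 1/2, then unconditional, which again works out to 1/3 each.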
The reason the KUBE-MARK-MASQ rule has to exist is hairpin NAT. The details are somewhat involved, but here's my best attempt:
If MASQUERADE didn't happen, traffic would leave the pod's network namespace as (pod IP, source port -> virtual IP, virtual port), be DNAT'd to (pod IP, source port -> pod IP, service port), and immediately be sent back to the pod. The traffic would therefore arrive at the backend with its source still (pod IP, source port), so when the backend replies, it replies to (pod IP, source port). But the pod (the kernel, really) expects the reply to come from the same IP and port it originally sent traffic to, namely (virtual IP, virtual port), so the reply would get dropped on the way back.
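A toy model of that tuple bookkeeping (plain Python, not real conntrack; the addresses are taken from the kube-dns example above):

```python
def reply_source(pod_ip, vip, vport, backend_port, masquerade):
    """Return the (ip, port) the client pod sees as the source of the reply.

    The pod sent traffic to (vip, vport) and will only accept replies
    that appear to come from that same tuple.
    """
    if masquerade:
        # SNAT forces the reply back through the host, where conntrack
        # reverses both the SNAT and the DNAT, restoring (vip, vport).
        return (vip, vport)
    # Without SNAT the backend (the same pod) replies directly from its
    # own address -- a tuple the client never sent anything to.
    return (pod_ip, backend_port)

expected = ("10.0.0.10", 53)  # what the client pod expects replies from
assert reply_source("172.18.1.5", "10.0.0.10", 53, 53, True) == expected   # accepted
assert reply_source("172.18.1.5", "10.0.0.10", 53, 53, False) != expected  # dropped
```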

Related

How do GCP load balancers route traffic to GKE services?

I'm relatively new (< 1 year) to GCP, and I'm still in the process of mapping its various services onto my existing networking mental model.
One knowledge gap I'm struggling to fill is how HTTP requests are load-balanced to services running in our GKE clusters.
On a test cluster, I created a service in front of pods that serve HTTP:
apiVersion: v1
kind: Service
metadata:
  name: contour
spec:
  ports:
  - port: 80
    name: http
    protocol: TCP
    targetPort: 8080
  - port: 443
    name: https
    protocol: TCP
    targetPort: 8443
  selector:
    app: contour
  type: LoadBalancer
The service is listening on node ports 30472 and 30816:
$ kubectl get svc contour
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
contour LoadBalancer 10.63.241.69 35.x.y.z 80:30472/TCP,443:30816/TCP 41m
A GCP network load balancer is automatically created for me. It has its own public IP at 35.x.y.z and is listening on ports 80-443.
Curling the load balancer IP works:
$ curl -q -v 35.x.y.z
* TCP_NODELAY set
* Connected to 35.x.y.z (35.x.y.z) port 80 (#0)
> GET / HTTP/1.1
> Host: 35.x.y.z
> User-Agent: curl/7.62.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Mon, 07 Jan 2019 05:33:44 GMT
< server: envoy
< content-length: 0
<
If I ssh into the GKE node, I can see that kube-proxy is listening on the service nodePorts (30472 and 30816) and that nothing has a socket listening on ports 80 or 443:
# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:20256 0.0.0.0:* LISTEN 1022/node-problem-d
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 1221/kubelet
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 1369/kube-proxy
tcp 0 0 0.0.0.0:5355 0.0.0.0:* LISTEN 297/systemd-resolve
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 330/sshd
tcp6 0 0 :::30816 :::* LISTEN 1369/kube-proxy
tcp6 0 0 :::4194 :::* LISTEN 1221/kubelet
tcp6 0 0 :::30472 :::* LISTEN 1369/kube-proxy
tcp6 0 0 :::10250 :::* LISTEN 1221/kubelet
tcp6 0 0 :::5355 :::* LISTEN 297/systemd-resolve
tcp6 0 0 :::10255 :::* LISTEN 1221/kubelet
tcp6 0 0 :::10256 :::* LISTEN 1369/kube-proxy
Two questions:
Given nothing on the node is listening on ports 80 or 443, is the load balancer directing traffic to ports 30472 and 30816?
If the load balancer is accepting traffic on 80/443 and forwarding to 30472/30816, where can I see that configuration? Clicking around the load balancer screens I can't see any mention of ports 30472 and 30816.
I think I found the answer to my own question - can anyone confirm I'm on the right track?
The network load balancer redirects the traffic to a node in the cluster without modifying the packet - packets for port 80/443 still have port 80/443 when they reach the node.
There's nothing listening on ports 80/443 on the nodes. However, kube-proxy has written iptables rules that match packets destined for the load balancer IP and rewrite them with the appropriate ClusterIP and port:
You can see the iptables config on the node:
$ iptables-save | grep KUBE-SERVICES | grep loadbalancer
-A KUBE-SERVICES -d 35.x.y.z/32 -p tcp -m comment --comment "default/contour:http loadbalancer IP" -m tcp --dport 80 -j KUBE-FW-D53V3CDHSZT2BLQV
-A KUBE-SERVICES -d 35.x.y.z/32 -p tcp -m comment --comment "default/contour:https loadbalancer IP" -m tcp --dport 443 -j KUBE-FW-J3VGAQUVMYYL5VK6
$ iptables-save | grep KUBE-SEP-ZAA234GWNBHH7FD4
:KUBE-SEP-ZAA234GWNBHH7FD4 - [0:0]
-A KUBE-SEP-ZAA234GWNBHH7FD4 -s 10.60.0.30/32 -m comment --comment "default/contour:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZAA234GWNBHH7FD4 -p tcp -m comment --comment "default/contour:http" -m tcp -j DNAT --to-destination 10.60.0.30:8080
$ iptables-save | grep KUBE-SEP-CXQOVJCC5AE7U6UC
:KUBE-SEP-CXQOVJCC5AE7U6UC - [0:0]
-A KUBE-SEP-CXQOVJCC5AE7U6UC -s 10.60.0.30/32 -m comment --comment "default/contour:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CXQOVJCC5AE7U6UC -p tcp -m comment --comment "default/contour:https" -m tcp -j DNAT --to-destination 10.60.0.30:8443
An interesting implication is that the nodePort is created but doesn't appear to be used. That matches this comment in the kube docs:
Google Compute Engine does not need to allocate a NodePort to make LoadBalancer work
It also explains why GKE creates an automatic firewall rule that allows traffic from 0.0.0.0/0 towards ports 80/443 on the nodes. The load balancer isn't rewriting the packets, so the firewall needs to allow traffic from anywhere to reach iptables on the node, and it's rewritten there.
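For completeness, the KUBE-FW-* chain that those loadbalancer rules jump to typically looks like the following (a sketch based on kube-proxy's usual output, not captured from this cluster): it marks the packet for MASQUERADE, jumps into the normal service chain, and finally marks anything that falls through for DROP (that last rule matters when loadBalancerSourceRanges restricts who may reach the service).

```
-A KUBE-FW-D53V3CDHSZT2BLQV -m comment --comment "default/contour:http loadbalancer IP" -j KUBE-MARK-MASQ
-A KUBE-FW-D53V3CDHSZT2BLQV -m comment --comment "default/contour:http loadbalancer IP" -j KUBE-SVC-D53V3CDHSZT2BLQV
-A KUBE-FW-D53V3CDHSZT2BLQV -m comment --comment "default/contour:http loadbalancer IP" -j KUBE-MARK-DROP
```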
To understand LoadBalancer services, you first have to grok NodePort services. The way those work is that there is a proxy on every node in your cluster (usually implemented in iptables or ipvs these days for performance, but that's an implementation detail), and when you create a NodePort service, it picks an unused port and configures every one of those proxies to forward packets to your Kubernetes pod. A LoadBalancer service builds on top of that: on GCP/GKE it creates a GCLB forwarding rule mapping the requested port to the rotation of all those node-level proxies. So the GCLB listens on port 80, which proxies to some random port on a random node, which in turn proxies to the internal port on your pod.
The process is a bit more customizable than that, but those are the basic defaults.

Kubernetes service cluster ip not accessible but endpoints ip is accessible from within the node

I set up a single-node Kubernetes cluster following the kubernetes-the-hard-way guide, except that I'm running on CentOS 7 and I deploy one master and one worker on the same node. I have already turned off the firewalld service.
After the installation, I deployed a mongodb service; however, the cluster IP is not accessible, while the endpoint IP is. The service details are as follows:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 2m
mongodb ClusterIP 10.254.0.117 <none> 27017/TCP 55s
$ kubectl describe svc mongodb
Name: mongodb
Namespace: default
Labels: io.kompose.service=mongodb
Annotations: kompose.cmd=kompose convert -f docker-compose.yml
kompose.version=1.11.0 (39ad614)
kubectl.kubernetes.io/last-applied-configuration=
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":
{"kompose.cmd":"kompose convert -f docker-compose.yml","kompose.version":"1.11.0
(39ad614...
Selector: io.kompose.service=mongodb
Type: ClusterIP
IP: 10.254.0.117
Port: 27017 27017/TCP
TargetPort: 27017/TCP
Endpoints: 10.254.0.2:27017
Session Affinity: None
Events: <none>
Running mongo 10.254.0.2 on the host works, but mongo 10.254.0.117 does not. By the way, if I start another mongo pod, for example
kubectl run mongo-shell -ti --image=mongo --restart=Never bash
then from inside it neither mongo 10.254.0.2 nor mongo 10.254.0.117 works at all.
The kubernetes version I use is 1.10.0.
I think this is a kube-proxy issue, the kube-proxy is configured as follows:
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://kubernetes.io/docs/concepts/overview/components/#kube-proxy https://kubernetes.io/docs/reference/generated/kube-proxy/
After=network.target

[Service]
ExecStart=/usr/local/bin/kube-proxy \
  --config=/var/lib/kubelet/kube-proxy-config.yaml \
  --logtostderr=true \
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
and the config file is
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: "/var/lib/kubelet/kube-proxy.kubeconfig"
mode: "iptables"
clusterCIDR: "10.254.0.0/16"
These are the iptables rules I get:
sudo iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
DOCKER all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
DOCKER all -- 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
MASQUERADE all -- 172.17.0.0/16 0.0.0.0/0
CNI-0f56c935ec75c77eb189a5fe all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "a54a2f20dbe5d24ec4fb6b059f23aae392cc26853cf2b474a56dff2a2f2d6bb6" */
CNI-d2a650ff06e253010ea31f3d all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "f3252d60a15faa5ff6c4b2aabebdb47aa5652e12c9d874f538b33d6c5913ba47" */
CNI-34b02c799f7bc4e979c15266 all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "5a87d86a62dd299e1d36b2ccd631d58896f2724ad9b4e14a93b9dfaa162b09e3" */
CNI-eb80e2736e1009010a27b4b4 all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "1891a61e27b764e4a36717166a2b83ce7d2baa5258e54f0ea183c4433b04de38" */
CNI-4d1b80b0072ade1be68c43d1 all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "2b90e720350fa78bf6e6756b941526bf181e0b48c6b87207bbc8f097933e67ba" */
CNI-7699fcd0ab82a702bac28bc9 all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "3feed2ec479bd17f82cac60adfd1c79c81d4c53d536daa74a46e05f462e2d895" */
CNI-871343dd2a1a9738c94b4dba all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "1a3a7b27889e54494d1e9699efb158dc8f3bb85b147b80db84038c07fd4c9910" */
CNI-3c0d02d02e5aa29b38ada7ba all -- 10.254.0.0/24 0.0.0.0/0 /* name: "bridge" id: "cdd5d6cf1a772b2acd37471046f53d0aa635733f0d5447a11d76dbb2ee216378" */
Chain CNI-0f56c935ec75c77eb189a5fe (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "a54a2f20dbe5d24ec4fb6b059f23aae392cc26853cf2b474a56dff2a2f2d6bb6" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "a54a2f20dbe5d24ec4fb6b059f23aae392cc26853cf2b474a56dff2a2f2d6bb6" */
Chain CNI-34b02c799f7bc4e979c15266 (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "5a87d86a62dd299e1d36b2ccd631d58896f2724ad9b4e14a93b9dfaa162b09e3" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "5a87d86a62dd299e1d36b2ccd631d58896f2724ad9b4e14a93b9dfaa162b09e3" */
Chain CNI-3c0d02d02e5aa29b38ada7ba (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "cdd5d6cf1a772b2acd37471046f53d0aa635733f0d5447a11d76dbb2ee216378" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "cdd5d6cf1a772b2acd37471046f53d0aa635733f0d5447a11d76dbb2ee216378" */
Chain CNI-4d1b80b0072ade1be68c43d1 (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "2b90e720350fa78bf6e6756b941526bf181e0b48c6b87207bbc8f097933e67ba" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "2b90e720350fa78bf6e6756b941526bf181e0b48c6b87207bbc8f097933e67ba" */
Chain CNI-7699fcd0ab82a702bac28bc9 (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "3feed2ec479bd17f82cac60adfd1c79c81d4c53d536daa74a46e05f462e2d895" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "3feed2ec479bd17f82cac60adfd1c79c81d4c53d536daa74a46e05f462e2d895" */
Chain CNI-871343dd2a1a9738c94b4dba (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "1a3a7b27889e54494d1e9699efb158dc8f3bb85b147b80db84038c07fd4c9910" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "1a3a7b27889e54494d1e9699efb158dc8f3bb85b147b80db84038c07fd4c9910" */
Chain CNI-d2a650ff06e253010ea31f3d (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "f3252d60a15faa5ff6c4b2aabebdb47aa5652e12c9d874f538b33d6c5913ba47" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "f3252d60a15faa5ff6c4b2aabebdb47aa5652e12c9d874f538b33d6c5913ba47" */
Chain CNI-eb80e2736e1009010a27b4b4 (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 10.254.0.0/24 /* name: "bridge" id: "1891a61e27b764e4a36717166a2b83ce7d2baa5258e54f0ea183c4433b04de38" */
MASQUERADE all -- 0.0.0.0/0 !224.0.0.0/4 /* name: "bridge" id: "1891a61e27b764e4a36717166a2b83ce7d2baa5258e54f0ea183c4433b04de38" */
Chain DOCKER (2 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain KUBE-MARK-DROP (0 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
Chain KUBE-MARK-MASQ (4 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
Chain KUBE-SEP-G5V522HWZT6RKRAC (2 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 192.168.56.3 0.0.0.0/0 /* default/kubernetes:https */
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/kubernetes:https */ recent: SET name: KUBE-SEP-G5V522HWZT6RKRAC side: source mask: 255.255.255.255 tcp to:192.168.56.3:6443
Chain KUBE-SEP-O34O4OGFBAADOMEG (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.254.0.2 0.0.0.0/0 /* default/mongodb:27017 */
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/mongodb:27017 */ tcp to:10.254.0.2:27017
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !10.254.0.0/16 10.254.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:443
KUBE-SVC-NPX46M4PTMTKRN6Y tcp -- 0.0.0.0/0 10.254.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:443
KUBE-MARK-MASQ tcp -- !10.254.0.0/16 10.254.0.117 /* default/mongodb:27017 cluster IP */ tcp dpt:27017
KUBE-SVC-ZDG6MRTNE2LQFT34 tcp -- 0.0.0.0/0 10.254.0.117 /* default/mongodb:27017 cluster IP */ tcp dpt:27017
KUBE-NODEPORTS all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
target prot opt source destination
KUBE-SEP-G5V522HWZT6RKRAC all -- 0.0.0.0/0 0.0.0.0/0 /* default/kubernetes:https */ recent: CHECK seconds: 10800 reap name: KUBE-SEP-G5V522HWZT6RKRAC side: source mask: 255.255.255.255
KUBE-SEP-G5V522HWZT6RKRAC all -- 0.0.0.0/0 0.0.0.0/0 /* default/kubernetes:https */
Chain KUBE-SVC-ZDG6MRTNE2LQFT34 (1 references)
target prot opt source destination
KUBE-SEP-O34O4OGFBAADOMEG all -- 0.0.0.0/0 0.0.0.0/0 /* default/mongodb:27017 */
Removing the --network-plugin=cni flag from the kubelet service and upgrading Kubernetes to 1.13.0 solved the problem.

kube-dns gets kube-proxy Failed to list *core.Endpoints

Fresh Kubernetes (1.10.0) cluster stood up using a kubeadm (1.10.0) install on RHEL7 bare-metal VMs:
Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Dec 28 14:23:39 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
kubeadm.x86_64 1.10.0-0 installed
kubectl.x86_64 1.10.0-0 installed
kubelet.x86_64 1.10.0-0 installed
kubernetes-cni.x86_64 0.6.0-0 installed
and 1.12 docker
docker-engine.x86_64 1.12.6-1.el7.centos installed
docker-engine-selinux.noarch 1.12.6-1.el7.centos installed
With Flannel v0.9.1 pod network installed
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
The kubeadm init command I ran is
kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version stable-1.10
which completes successfully, and kubeadm join on the worker node is also successful. I can deploy a busybox pod on the master and nslookups are successful, but as soon as I deploy anything to the worker node, I get failed API calls from the worker node to the master:
E0331 03:28:44.368253 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://172.30.0.85:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.30.0.85:6443: getsockopt: connection refused
E0331 03:28:44.368987 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://172.30.0.85:6443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 172.30.0.85:6443: getsockopt: connection refused
E0331 03:28:44.735886 1 event.go:209] Unable to write event: 'Post https://172.30.0.85:6443/api/v1/namespaces/default/events: dial tcp 172.30.0.85:6443: getsockopt: connection refused' (may retry after sleeping)
E0331 03:28:51.980131 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:kube-proxy" cannot list endpoints at the cluster scope
I0331 03:28:52.048995 1 controller_utils.go:1026] Caches are synced for service config controller
I0331 03:28:53.049005 1 controller_utils.go:1026] Caches are synced for endpoints config controller
and nslookup times out
kubectl exec -it busybox -- nslookup kubernetes
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'kubernetes'
command terminated with exit code 1
I have looked at many similar posts on Stack Overflow and GitHub, and all seem to be resolved by setting iptables -A FORWARD -j ACCEPT, but not this time. I have also included the iptables rules from the worker node:
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere /* kubernetes service portals */
DOCKER all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere /* kubernetes service portals */
DOCKER all -- anywhere !loopback/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- anywhere anywhere /* kubernetes postrouting rules */
MASQUERADE all -- 172.17.0.0/16 anywhere
RETURN all -- 10.244.0.0/16 10.244.0.0/16
MASQUERADE all -- 10.244.0.0/16 !base-address.mcast.net/4
RETURN all -- !10.244.0.0/16 box2.ara.ac.nz/24
MASQUERADE all -- !10.244.0.0/16 10.244.0.0/16
Chain DOCKER (2 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-MARK-DROP (0 references)
target prot opt source destination
MARK all -- anywhere anywhere MARK or 0x8000
Chain KUBE-MARK-MASQ (6 references)
target prot opt source destination
MARK all -- anywhere anywhere MARK or 0x4000
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- anywhere anywhere /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
Chain KUBE-SEP-HZC4RESJCS322LXV (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.244.0.18 anywhere /* kube-system/kube-dns:dns-tcp */
DNAT tcp -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */ tcp to:10.244.0.18:53
Chain KUBE-SEP-JNNVSHBUREKVBFWD (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.244.0.18 anywhere /* kube-system/kube-dns:dns */
DNAT udp -- anywhere anywhere /* kube-system/kube-dns:dns */ udp to:10.244.0.18:53
Chain KUBE-SEP-U3UDAUPXUG5BP2NG (2 references)
target prot opt source destination
KUBE-MARK-MASQ all -- box1.ara.ac.nz anywhere /* default/kubernetes:https */
DNAT tcp -- anywhere anywhere /* default/kubernetes:https */ recent: SET name: KUBE-SEP-U3UDAUPXUG5BP2NG side: source mask: 255.255.255.255 tcp to:172.30.0.85:6443
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.96.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-SVC-NPX46M4PTMTKRN6Y tcp -- anywhere 10.96.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-MARK-MASQ udp -- !10.244.0.0/16 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-TCOU7JCQXEZGVUNU udp -- anywhere 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.96.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-ERIFXISQEP7F7OF4 tcp -- anywhere 10.96.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
target prot opt source destination
KUBE-SEP-HZC4RESJCS322LXV all -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */
Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
target prot opt source destination
KUBE-SEP-U3UDAUPXUG5BP2NG all -- anywhere anywhere /* default/kubernetes:https */ recent: CHECK seconds: 10800 reap name: KUBE-SEP-U3UDAUPXUG5BP2NG side: source mask: 255.255.255.255
KUBE-SEP-U3UDAUPXUG5BP2NG all -- anywhere anywhere /* default/kubernetes:https */
Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
target prot opt source destination
KUBE-SEP-JNNVSHBUREKVBFWD all -- anywhere anywhere /* kube-system/kube-dns:dns */
Chain WEAVE (0 references)
target prot opt source destination
Chain cali-OUTPUT (0 references)
target prot opt source destination
Chain cali-POSTROUTING (0 references)
target prot opt source destination
Chain cali-PREROUTING (0 references)
target prot opt source destination
Chain cali-fip-dnat (0 references)
target prot opt source destination
Chain cali-fip-snat (0 references)
target prot opt source destination
Chain cali-nat-outgoing (0 references)
target prot opt source destination
Also, I can see packets getting dropped on the flannel interface:
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.1.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::a096:47ff:fe58:e438 prefixlen 64 scopeid 0x20<link>
ether a2:96:47:58:e4:38 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 198 bytes 14747 (14.4 KiB)
TX errors 0 dropped 27 overruns 0 carrier 0 collisions 0
I have installed the same versions of Kubernetes/Docker and Flannel on other VMs and it works, so I am not sure why I am getting these failed API calls to the master from the worker nodes in this install. I have done several fresh installs and tried the weave and calico pod networks as well, with the same result.
Right, so I got this going by changing from flannel to weave container networking, plus a kubeadm reset and a restart of the VMs.
Not sure what the problem was between flannel and my VMs, but I'm happy to have got it going.

Kubernetes Service Proxy Concept: PublicIP

I am trying to launch the guestbook-go example from the Kubernetes docs:
https://github.com/GoogleCloudPlatform/kubernetes/tree/master/examples/guestbook-go
I have modified the guestbook-service.json from the above link to include publicIPs:
{
  "apiVersion": "v1beta1",
  "kind": "Service",
  "id": "guestbook",
  "port": 3000,
  "containerPort": "http-server",
  "publicIPs": ["10.39.67.97", "10.39.66.113"],
  "selector": { "name": "guestbook" }
}
I have one master and two minions as shown below:
centos-minion -> 10.39.67.97
centos-minion2 -> 10.39.66.113
I am using my minions' eth0 IPs as the publicIPs. But the iptables rules get created on only one of the minions:
[root@ip-10-39-67-97 ~]# iptables -L -n -t nat
Chain KUBE-PORTALS-HOST (1 references)
target prot opt source destination
DNAT tcp -- 0.0.0.0/0 10.254.208.93 /* redis-slave */ tcp dpt:6379 to:10.39.67.240:56872
DNAT tcp -- 0.0.0.0/0 10.254.223.192 /* guestbook */ tcp dpt:3000 to:10.39.67.240:58746
DNAT tcp -- 0.0.0.0/0 10.39.67.97 /* guestbook */ tcp dpt:3000 to:10.39.67.240:58746
DNAT tcp -- 0.0.0.0/0 10.39.66.113 /* guestbook */ tcp dpt:3000 to:10.39.67.240:58746
DNAT tcp -- 0.0.0.0/0 10.254.0.2 /* kubernetes */ tcp dpt:443 to:10.39.67.240:33003
DNAT tcp -- 0.0.0.0/0 10.254.0.1 /* kubernetes-ro */ tcp dpt:80 to:10.39.67.240:58434
DNAT tcp -- 0.0.0.0/0 10.254.131.70 /* redis-master */ tcp dpt:6379 to:10.39.67.240:50754
So even though I have redundancy with my pods, if I bring down the minion holding those iptables rules, my external publicIP entry point dies. I am sure I have a conceptual misunderstanding. Can anyone help?
(sorry for the delay in answering your question)
PublicIPs are defined on every host node, so if you make sure that you round-robin packets into the host nodes from your external load balancer (e.g. an edge router), then you can have redundancy even if one of the host machines fails.
--brendan
You could use kubectl proxy --accept-hosts='^*$' --address='0.0.0.0' --port=8001 to expose your kube-proxy port on the public IP or an internal IP.

iptables blocking local connection to mongodb

I have a VM (Ubuntu 12.04.4 LTS) with mongodb (2.0.4) that I want to restrict with iptables to accept only SSH (in/out) and nothing else.
This is the script I use to set up the rules:
#!/bin/sh
# DROP everything
iptables -F
iptables -X
iptables -P FORWARD DROP
iptables -P INPUT DROP
iptables -P OUTPUT DROP
# input
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A INPUT -s 127.0.0.1 -j ACCEPT # accept all ports for local conns
# output
iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp -m tcp --dport 22 -j ACCEPT # ssh
But with these rules activated, I can't connect to mongodb locally.
ubuntu ~ $ mongo
MongoDB shell version: 2.0.4
connecting to: test
Fri Mar 28 09:40:40 Error: couldn't connect to server 127.0.0.1 shell/mongo.js:84
exception: connect failed
Without them, it works fine. Is there any special firewall case one needs to consider when deploying mongodb?
I tried installing mysql, and it works perfectly for local connections.
SSH works as expected (I can connect from outside and inside).
The iptables rules look like this once set:
ubuntu ~ $ sudo iptables -nvL
Chain INPUT (policy DROP 8 packets, 1015 bytes)
pkts bytes target prot opt in out source destination
449 108K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
0 0 ACCEPT all -- * * 127.0.0.1 0.0.0.0/0
0 0 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:22
0 0 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80
32 2048 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:443
Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy DROP 27 packets, 6712 bytes)
pkts bytes target prot opt in out source destination
379 175K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
0 0 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:22
Outbound traffic must be accepted for the loopback (127.0.0.1) as well.
Adding this made it work:
iptables -A OUTPUT -o lo -j ACCEPT
You might want to try substituting the line
iptables -A INPUT -s 127.0.0.1 -j ACCEPT
with
iptables -A INPUT -i lo -j ACCEPT
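Putting the two answers together, a corrected version of the setup script might look like this (a sketch: the same policy as before, but matching loopback traffic by interface in both directions instead of by address):

```
#!/bin/sh
# Flush and default-DROP everything
iptables -F
iptables -X
iptables -P FORWARD DROP
iptables -P INPUT DROP
iptables -P OUTPUT DROP
# loopback: match the lo interface, not 127.0.0.1, and allow both directions
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT
# input
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
# output
iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp -m tcp --dport 22 -j ACCEPT # ssh
```

The original script's `-s 127.0.0.1` INPUT rule never matched the mongo connection because the corresponding outbound packet to 127.0.0.1 was dropped first by the OUTPUT policy; accepting on `-o lo` (or `-i lo` for the inbound leg) fixes that.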