Unable to access nginx pod across nodes using ClusterIP - Kubernetes

I have created an nginx deployment and an nginx service (ClusterIP) to access the nginx pod, but I am not able to reach the pod through the cluster IP from any node other than the one where the pod is scheduled.
I checked the iptables rules too, but I don't see a DNAT entry there.
root@kdm-master-1:~# k get all -A -o wide |grep nginx
default pod/nginx-6db489d4b7-pfkm9 1/1 Running 0 3h16m 10.244.1.3 kdm-worker-1 <none> <none>
default service/nginx ClusterIP 10.102.239.131 <none> 80/TCP 3h20m run=nginx
default deployment.apps/nginx 1/1 1 1 3h32m nginx nginx run=nginx
default replicaset.apps/nginx-6db489d4b7 1 1 1 3h32m nginx nginx pod-template-hash=6db489d4b7,run=nginx
iptables (nat table):
root@kdm-master-1:~# iptables -L -t nat|grep nginx
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.102.239.131 /* default/nginx:80-80 cluster IP */ tcp dpt:http
KUBE-SVC-OVTWZ4GROBJZO4C5 tcp -- anywhere 10.102.239.131 /* default/nginx:80-80 cluster IP */ tcp dpt:http
# Warning: iptables-legacy tables present, use iptables-legacy to see them
Please advise how I can resolve this.

set net.ipv4.ip_forward=1 in /etc/sysctl.conf
run sysctl --system
This will resolve the issue and you will be able to access the pod from any node.
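A minimal sketch of those two steps, assuming you run them on every node in the cluster:

echo 'net.ipv4.ip_forward=1' | sudo tee -a /etc/sysctl.conf   # persist the setting
sudo sysctl --system                                          # reload all sysctl configuration files
sysctl net.ipv4.ip_forward                                    # should now print: net.ipv4.ip_forward = 1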

Not able to access Nginx from an external IP even after k8s nodeport service exposed

I am not able to access the nginx server using http://:30602 and also http://:30602
OS: Ubuntu 22
I also checked if any firewall is blocking it.
Using ufw
admin@tst-server:~$ sudo ufw status verbose
Status: inactive
Using netstat
admin@tst-server:~$ netstat -an | grep 22 | grep -i listen
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp6 0 0 :::22 :::* LISTEN
unix 2 [ ACC ] STREAM LISTENING 354787 /run/containerd/s/9a866c6ea3a4fe1976aaed0884400cd59228d43776774cc3fad2d0b9a7c2ed7b
unix 2 [ ACC ] STREAM LISTENING 21722 /run/systemd/private
admin@tst-server:~$ netstat -an | grep 30602 | grep -i listen
Commands used for nginx deployment
Create Deployment
kubectl create deployment nginx --image=nginx
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
myapp 2/2 2 2 8d
nginx 1/1 1 1 9m50s
Create Service
kubectl create service nodeport nginx --tcp=80:80
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8d
nginx NodePort 10.109.112.116 <none> 80:30602/TCP 10m
Test it out
admin@tst-server:~$ hostname
tst-server.com
admin@tst-server:~$ curl tst-server.com:30602
curl: (7) Failed to connect to tst-server.com port 30602 after 10 ms: Connection refused
Got it working by getting the node IP address for Minikube using the following command
$ kubectl cluster-info
and then
curl http://<node_ip>:30008
When curling tst-server.com:30602, why does it redirect to tst-server.kanaaritech.com?
To check whether the NodePort is working, try hitting it with the node's IP address and port 30602.
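For example, a quick check (a sketch; the placeholder is whatever INTERNAL-IP kubectl reports for your node):

kubectl get nodes -o wide              # note the INTERNAL-IP column
curl http://<node-internal-ip>:30602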

Can't resolve dns in kubernetes

I use the following commands to check the DNS issue in my k8s cluster:
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
The nslookup result is:
;; connection timed out; no servers could be reached
command terminated with exit code 1
dnsutils.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
NOTE: this machine disables all ports by default, so I already asked our IT admin to open the ports listed in check-required-ports; I'm not sure if this matters.
With the following command I can get the pod IPs of CoreDNS:
kubectl get pods -n kube-system -o wide | grep core
coredns-7877db9d45-swb6c 1/1 Running 0 2m58s 10.244.1.8 node2 <none> <none>
coredns-7877db9d45-zwc8v 1/1 Running 0 2m57s 10.244.0.6 node1 <none> <none>
Here, 10.244.0.6 is on my master node while 10.244.1.8 is on my worker node.
Then if I directly specify the CoreDNS pod IP:
master node is OK:
kubectl exec -i -t dnsutils -- nslookup kubernetes.default 10.244.0.6
Server: 10.244.0.6
Address: 10.244.0.6#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
worker node is not OK:
# kubectl exec -i -t dnsutils -- nslookup kubernetes.default 10.244.1.8
;; connection timed out; no servers could be reached
command terminated with exit code 1
So the question narrows down to: why does CoreDNS on the worker node not work? Is there anything I need to pay attention to?
Environment:
OS: ubuntu18.04
K8S: v1.21.0
Cluster boot command:
kubeadm init --pod-network-cidr=10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Finally, I found the root cause: this is a hardware firewall issue. See this:
Firewalls
When using udp backend, flannel uses UDP port 8285 for sending encapsulated packets.
When using vxlan backend, kernel uses UDP port 8472 for sending encapsulated packets.
Make sure that your firewall rules allow this traffic for all hosts participating in the overlay network.
Make sure that your firewall rules allow traffic from the pod network CIDR to your Kubernetes master node.
When the nslookup client is on the same node as the DNS server, the traffic does not cross the firewall, so everything is OK.
When the nslookup client is not on the same node as the DNS server, the traffic does hit the firewall, so we can't reach the DNS server.
So, after opening the ports, everything is OK now.
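As a sketch, if the firewall were a host firewall such as ufw on the nodes (an assumption; my issue was a hardware firewall), the equivalent rules would be the flannel ports quoted above:

sudo ufw allow 8285/udp   # flannel udp backend
sudo ufw allow 8472/udp   # flannel vxlan backend
sudo ufw status verbose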

VPC native Clusters in GKE can't communicate in GKE 1.14

I have created two separate GKE clusters on K8s 1.14.10.
VPN access to in-house network not working after GKE cluster upgrade to 1.14.6
I have followed this and the IP masquerading agent documentation.
I have tried to test this using a client pod and server pod to exchange messages.
I'm using the internal node IP to send messages, and I created a ClusterIP service to expose the pods.
I have allowed requests for every instance in the firewall rules for ingress and egress, i.e. 0.0.0.0/0.
(Screenshot: the description of the cluster I have created.)
The config map of the IP masquerading agent stays the same as in the documentation.
I'm able to ping the other node from within the pod, but the curl request says connection refused and tcpdump shows no data.
Problem:
I need to communicate from cluster A to cluster B in GKE 1.14 with IP masquerading set to true. I either get connection refused or an i/o timeout. I have tried using internal and external node IPs as well as using a LoadBalancer.
You have provided quite general information, and without details I cannot give a scenario-specific answer. It might be related to how you created the clusters or to other firewall settings. Because of that, I will go through the correct steps to create and configure 2 clusters with firewall rules and masquerading; maybe you will be able to find which step you missed or misconfigured.
The cluster configurations (nodes, pods, svc) are at the bottom of the answer.
1. Create VPC and 2 clusters
The docs talk about 2 different projects, but you can do it in one project.
A good example of creating the VPC and the 2 clusters can be found in the GKE docs: create the VPC and create the 2 clusters. In the Tier1 cluster you can enable NetworkPolicy now instead of enabling it later.
After that you will need to create firewall rules. You will also need to add the ICMP protocol to the firewall rule.
At this point you should be able to ping between nodes from the 2 clusters.
For additional firewall rules (allowing connections between pods, svc, etc.) please check these docs.
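As a sketch, such a rule created with gcloud might look like this (the VPC name is an assumption, and the source ranges are the node/pod ranges used later in this answer; adjust them to your own clusters):

gcloud compute firewall-rules create allow-cluster-to-cluster \
    --network my-vpc \
    --allow tcp,udp,icmp \
    --source-ranges 10.0.0.0/8,172.16.0.0/12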
2. Enable IP masquerade agent
As mentioned in docs, to run IPMasquerade:
The ip-masq-agent DaemonSet is automatically installed as an add-on with --nomasq-all-reserved-ranges argument in a GKE cluster, if one or more of the following is true:
The cluster has a network policy.
OR
The Pod's CIDR range is not within 10.0.0.0/8.
It means that tier-2-cluster already has ip-masq-agent in the kube-system namespace (because the Pod's CIDR range is not within 10.0.0.0/8). If you enabled NetworkPolicy during creation of tier-1-cluster, it should also be installed there. If not, you will need to enable it using the command:
$ gcloud container clusters update tier-1-cluster --update-addons=NetworkPolicy=ENABLED --zone=us-central1-a
To verify that everything is OK, check whether the ip-masq-agent DaemonSet pods were created (one pod per node).
$ kubectl get ds ip-masq-agent -n kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
ip-masq-agent 3 3 3 3 3 beta.kubernetes.io/masq-agent-ds-ready=true 168m
If you SSH to any of your nodes, you will be able to see the default iptables entries.
$ sudo iptables -t nat -L IP-MASQ
Chain IP-MASQ (1 references)
target prot opt source destination
RETURN all -- anywhere 169.254.0.0/16 /* ip-masq: local traffic is not subject to MASQUERADE */
RETURN all -- anywhere 10.0.0.0/8 /* ip-masq: RFC 1918 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 172.16.0.0/12 /* ip-masq: RFC 1918 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.168.0.0/16 /* ip-masq: RFC 1918 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 240.0.0.0/4 /* ip-masq: RFC 5735 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.0.2.0/24 /* ip-masq: RFC 5737 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 198.51.100.0/24 /* ip-masq: RFC 5737 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 203.0.113.0/24 /* ip-masq: RFC 5737 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 100.64.0.0/10 /* ip-masq: RFC 6598 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 198.18.0.0/15 /* ip-masq: RFC 6815 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.0.0.0/24 /* ip-masq: RFC 6890 reserved range is not subject to MASQUERADE */
RETURN all -- anywhere 192.88.99.0/24 /* ip-masq: RFC 7526 reserved range is not subject to MASQUERADE */
MASQUERADE all -- anywhere anywhere /* ip-masq: outbound traffic is subject to MASQUERADE (must be last in chain) */
3. Deploy test application
I used the Hello application from the GKE docs and deployed it on both clusters. In addition, I also deployed an ubuntu image for tests.
4. Apply proper configuration for IPMasquerade
This config needs to be applied on the source cluster.
In short, if the destination CIDR is in nonMasqueradeCIDRs:, the traffic keeps the internal (pod) IP as its source; otherwise the node IP is used as the source.
Save the text below to a file named config:
nonMasqueradeCIDRs:
- 10.0.0.0/8
resyncInterval: 2s
masqLinkLocal: true
Create the IPMasquerade ConfigMap
$ kubectl create configmap ip-masq-agent --from-file config --namespace kube-system
It will overwrite the iptables configuration
$ sudo iptables -t nat -L IP-MASQ
Chain IP-MASQ (2 references)
target prot opt source destination
RETURN all -- anywhere 10.0.0.0/8 /* ip-masq-agent: local traffic is not subject to MASQUERADE */
MASQUERADE all -- anywhere anywhere /* ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */
5. Tests:
When IP is Masqueraded
SSH to a node from the Tier2 cluster and run:
sudo toolbox bash
apt-get update
apt install -y tcpdump
Now you should listen using the command below. Port 32502 is the NodePort of the service in the Tier 2 cluster.
tcpdump -i eth0 -nn -s0 -v port 32502
In the Tier1 cluster you need to enter the ubuntu pod and curl NodeIP:NodePort
$ kubectl exec -ti ubuntu -- bin/bash
You will need to install curl with apt-get install curl.
curl NodeIP:NodePort (the node which is listening, and the NodePort of the service from the Tier 2 cluster).
CLI:
root@ubuntu:/# curl 172.16.4.3:32502
Hello, world!
Version: 2.0.0
Hostname: hello-world-deployment-7f67f479f5-h4wdm
On the node you can see an entry like:
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:53:30.321641 IP (tos 0x0, ttl 63, id 25373, offset 0, flags [DF], proto TCP (6), length 60)
10.0.4.4.56018 > 172.16.4.3.32502: Flags [S], cksum 0x8648 (correct), seq 3001889856
10.0.4.4 is the node IP where the Ubuntu pod is located.
When IP was not Masqueraded
Remove the ConfigMap from the Tier 1 cluster
$ kubectl delete cm ip-masq-agent -n kube-system
In the config file, change the CIDR to 172.16.4.0/22 (the Tier 2 node pool) and reapply the ConfigMap
$ kubectl create configmap ip-masq-agent --from-file config --namespace kube-system
SSH to any node from Tier 1 to check whether the iptables rules were changed.
sudo iptables -t nat -L IP-MASQ
Chain IP-MASQ (2 references)
target prot opt source destination
RETURN all -- anywhere 172.16.4.0/22 /* ip-masq-agent: local traffic is not subject to MASQUERADE */
MASQUERADE all -- anywhere anywhere /* ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */
Now, for the test, I again used the Ubuntu pod and curled the same IP as before.
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:16:50.316234 IP (tos 0x0, ttl 63, id 53160, offset 0, flags [DF], proto TCP (6), length 60)
10.4.2.8.57876 > 172.16.4.3.32502
10.4.2.8 is the internal IP of the Ubuntu pod.
Configuration for Tests:
TIER1
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/hello-world-deployment-7f67f479f5-b2qqz 1/1 Running 0 15m 10.4.1.8 gke-tier-1-cluster-default-pool-e006097b-5tnj <none> <none>
pod/hello-world-deployment-7f67f479f5-shqrt 1/1 Running 0 15m 10.4.2.5 gke-tier-1-cluster-default-pool-e006097b-lfvh <none> <none>
pod/hello-world-deployment-7f67f479f5-x7jvr 1/1 Running 0 15m 10.4.0.8 gke-tier-1-cluster-default-pool-e006097b-1wbf <none> <none>
ubuntu 1/1 Running 0 91s 10.4.2.8 gke-tier-1-cluster-default-pool-e006097b-lfvh <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/hello-world NodePort 10.0.36.46 <none> 60000:31694/TCP 14m department=world,greeting=hello
service/kubernetes ClusterIP 10.0.32.1 <none> 443/TCP 115m <none>
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/gke-tier-1-cluster-default-pool-e006097b-1wbf Ready <none> 115m v1.14.10-gke.36 10.0.4.2 35.184.38.21 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-1-cluster-default-pool-e006097b-5tnj Ready <none> 115m v1.14.10-gke.36 10.0.4.3 35.184.207.20 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-1-cluster-default-pool-e006097b-lfvh Ready <none> 115m v1.14.10-gke.36 10.0.4.4 35.226.105.31 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
TIER2
$ kubectl get pods,svc,nodes -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/hello-world-deployment-7f67f479f5-92zvk 1/1 Running 0 12m 172.20.1.5 gke-tier-2-cluster-default-pool-57b1cc66-xqt5 <none> <none>
pod/hello-world-deployment-7f67f479f5-h4wdm 1/1 Running 0 12m 172.20.1.6 gke-tier-2-cluster-default-pool-57b1cc66-xqt5 <none> <none>
pod/hello-world-deployment-7f67f479f5-m85jn 1/1 Running 0 12m 172.20.1.7 gke-tier-2-cluster-default-pool-57b1cc66-xqt5 <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/hello-world NodePort 172.16.24.206 <none> 60000:32502/TCP 12m department=world,greeting=hello
service/kubernetes ClusterIP 172.16.16.1 <none> 443/TCP 113m <none>
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/gke-tier-2-cluster-default-pool-57b1cc66-84ng Ready <none> 112m v1.14.10-gke.36 172.16.4.2 35.184.118.151 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-2-cluster-default-pool-57b1cc66-mlmn Ready <none> 112m v1.14.10-gke.36 172.16.4.3 35.238.231.160 Container-Optimized OS from Google 4.14.138+ docker://18.9.7
node/gke-tier-2-cluster-default-pool-57b1cc66-xqt5 Ready <none> 112m v1.14.10-gke.36 172.16.4.4 35.202.94.194 Container-Optimized OS from Google 4.14.138+ docker://18.9.7

Kubernetes pods can't ping each other using ClusterIP

I'm trying to ping the kube-dns service from a dnstools pod using the cluster IP assigned to the kube-dns service. The ping request times out. From the same dnstools pod, I tried to curl the kube-dns service using the exposed port, but that timed out as well.
Following is the output of kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default pod/busybox 1/1 Running 62 2d14h 192.168.1.37 kubenode <none>
default pod/dnstools 1/1 Running 0 2d13h 192.168.1.45 kubenode <none>
default pod/nginx-deploy-7c45b84548-ckqzb 1/1 Running 0 6d11h 192.168.1.5 kubenode <none>
default pod/nginx-deploy-7c45b84548-vl4kh 1/1 Running 0 6d11h 192.168.1.4 kubenode <none>
dmi pod/elastic-deploy-5d7c85b8c-btptq 1/1 Running 0 2d14h 192.168.1.39 kubenode <none>
kube-system pod/calico-node-68lc7 2/2 Running 0 6d11h 10.62.194.5 kubenode <none>
kube-system pod/calico-node-9c2jz 2/2 Running 0 6d12h 10.62.194.4 kubemaster <none>
kube-system pod/coredns-5c98db65d4-5nprd 1/1 Running 0 6d12h 192.168.0.2 kubemaster <none>
kube-system pod/coredns-5c98db65d4-5vw95 1/1 Running 0 6d12h 192.168.0.3 kubemaster <none>
kube-system pod/etcd-kubemaster 1/1 Running 0 6d12h 10.62.194.4 kubemaster <none>
kube-system pod/kube-apiserver-kubemaster 1/1 Running 0 6d12h 10.62.194.4 kubemaster <none>
kube-system pod/kube-controller-manager-kubemaster 1/1 Running 1 6d12h 10.62.194.4 kubemaster <none>
kube-system pod/kube-proxy-9hcgv 1/1 Running 0 6d11h 10.62.194.5 kubenode <none>
kube-system pod/kube-proxy-bxw9s 1/1 Running 0 6d12h 10.62.194.4 kubemaster <none>
kube-system pod/kube-scheduler-kubemaster 1/1 Running 1 6d12h 10.62.194.4 kubemaster <none>
kube-system pod/tiller-deploy-767d9b9584-5k95j 1/1 Running 0 3d9h 192.168.1.8 kubenode <none>
nginx-ingress pod/nginx-ingress-66wts 1/1 Running 0 5d17h 192.168.1.6 kubenode <none>
In the above output, why do some pods have an IP assigned in the 192.168.0.0/24 subnet whereas others have an IP that is equal to the IP address of my node/master? (10.62.194.4 is the IP of my master, 10.62.194.5 is the IP of my node)
This is the config.yml I used to initialize the cluster using kubeadm init --config=config.yml
apiServer:
  certSANs:
  - 10.62.194.4
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: dev-cluster
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.15.1
networking:
  dnsDomain: cluster.local
  podSubnet: 192.168.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Result of kubectl get svc --all-namespaces -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6d12h <none>
default service/nginx-deploy ClusterIP 10.97.5.194 <none> 80/TCP 5d17h run=nginx
dmi service/elasticsearch ClusterIP 10.107.84.159 <none> 9200/TCP,9300/TCP 2d14h app=dmi,component=elasticse
dmi service/metric-server ClusterIP 10.106.117.2 <none> 8098/TCP 2d14h app=dmi,component=metric-se
kube-system service/calico-typha ClusterIP 10.97.201.232 <none> 5473/TCP 6d12h k8s-app=calico-typha
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 6d12h k8s-app=kube-dns
kube-system service/tiller-deploy ClusterIP 10.98.133.94 <none> 44134/TCP 3d9h app=helm,name=tiller
The command I ran was kubectl exec -ti dnstools -- curl 10.96.0.10:53
EDIT:
I raised this question because I got this error when trying to resolve service names from within the cluster. I was under the impression that I got this error because I cannot ping the DNS server from a pod.
Output of kubectl exec -ti dnstools -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
Output of kubectl exec dnstools cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local reddog.microsoft.com
options ndots:5
Result of kubectl get ep kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 192.168.0.2:53,192.168.0.3:53,192.168.0.2:53 + 3 more... 6d13h
EDIT:
Pinging the CoreDNS pod directly using its pod IP times out as well:
/ # ping 192.168.0.2
PING 192.168.0.2 (192.168.0.2): 56 data bytes
^C
--- 192.168.0.2 ping statistics ---
24 packets transmitted, 0 packets received, 100% packet loss
EDIT:
I think something went wrong when I was setting up the cluster. Below are the steps I took when setting up the cluster:
Edit host files on master and worker to include the IP's and hostnames of the nodes
Disabled swap using swapoff -a and disabled swap permanently by editing /etc/fstab
Install docker prerequisites using apt-get install apt-transport-https ca-certificates curl software-properties-common -y
Added Docker GPG key using curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
Added Docker repo using add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
Install Docker using apt-get update -y; apt-get install docker-ce -y
Install Kubernetes prerequisites using curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
Added Kubernetes repo using echo 'deb http://apt.kubernetes.io/ kubernetes-xenial main' | sudo tee /etc/apt/sources.list.d/kubernetes.list
Update repo and install Kubernetes components using apt-get update -y; apt-get install kubelet kubeadm kubectl -y
Configure master node:
kubeadm init --apiserver-advertise-address=10.62.194.4 --apiserver-cert-extra-sans=10.62.194.4 --pod-network-cidr=192.168.0.0/16
Copy Kube config to $HOME: mkdir -p $HOME/.kube; sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config; sudo chown $(id -u):$(id -g) $HOME/.kube/config
Installed Calico using kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml; kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
On node:
On the node I did the kubeadm join command using the command printed out from kubeadm token create --print-join-command on the master
The Kubernetes system pods get assigned the host IP since they provide low-level services that are not dependent on an overlay network (or, in the case of Calico, even provide the overlay network themselves). They have the IP of the node where they run.
An ordinary pod uses the overlay network and gets assigned an IP from the Calico range, not from the node it runs on.
You can't access DNS (port 53) with HTTP using curl. You can use dig to query a DNS resolver.
A service IP is not reachable by ping since it is a virtual IP, just used as a routing handle for the iptables rules set up by kube-proxy; therefore a TCP connection works, but ICMP does not.
You can ping a pod IP though, since it is assigned from the overlay network.
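For example, a minimal check from the dnstools pod in the question, assuming dig is available in that image (10.96.0.10 is the kube-dns ClusterIP shown above):

kubectl exec -ti dnstools -- dig @10.96.0.10 kubernetes.default.svc.cluster.local
kubectl exec -ti dnstools -- nslookup kubernetes.default 10.96.0.10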
You should check within the same namespace.
Currently you are in the default namespace and curling a service in the kube-system namespace.
If you check against a service in the same namespace, I think it will work.
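For instance, a minimal sketch using the nginx-deploy service that lives in the same default namespace as dnstools (its ClusterIP 10.97.5.194 is taken from the service listing above):

kubectl exec -ti dnstools -- curl 10.97.5.194                              # by ClusterIP
kubectl exec -ti dnstools -- curl nginx-deploy.default.svc.cluster.local   # by DNS name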
In some cases the local host that Elasticsearch publishes is not routable/accessible from other hosts. In those cases you will have to configure network.publish_host in the yml config file, so that Elasticsearch uses and publishes the right address.
Try configuring network.publish_host to the right public address.
See more here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html#advanced-network-settings
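A sketch of that setting, assuming a file-based elasticsearch.yml config (the path and the _eth0_ interface value are illustrative; network.publish_host itself is the documented setting):

cat >> /usr/share/elasticsearch/config/elasticsearch.yml <<'EOF'
network.publish_host: _eth0_
EOF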
Note that control plane components like the API server and etcd that run on the master node are bound to the host network, and hence you see the IP address of the master server.
On the other hand, the apps that you deployed get their IPs from the pod subnet range; those differ from the cluster node IPs.
Try the steps below to test whether DNS is working or not.
Deploy nginx.yaml:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
  labels:
    app: nginx
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
      volumes:
      - name: www
        emptyDir: {}
kubectl create -f nginx.yaml
master $ kubectl get po
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 1m
web-1 1/1 Running 0 1m
master $ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 35m
nginx ClusterIP None <none> 80/TCP 2m
master $ kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm
If you don't see a command prompt, try pressing enter.
/ # nslookup nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: nginx
Address 1: 10.40.0.1 web-0.nginx.default.svc.cluster.local
Address 2: 10.40.0.2 web-1.nginx.default.svc.cluster.local
/ #
/ # nslookup web-0.nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx
Address 1: 10.40.0.1 web-0.nginx.default.svc.cluster.local
/ # nslookup web-0.nginx.default.svc.cluster.local
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx.default.svc.cluster.local
Address 1: 10.40.0.1 web-0.nginx.default.svc.cluster.local

NodePort services not available on all nodes

I'm attempting to run a 3-node Kubernetes cluster. I have the cluster up and running sufficiently that I have services running on different nodes. Unfortunately, I don't seem to be able to get NodePort based services to work correctly (as I understand correctness anyway...). My issue is that any NodePort services I define are available externally only on the node where their pod is running, and my understanding is that they should be available externally on any node in the cluster.
One example is a local Jira service, which should be running on port 8082 (internally) and on 32760 externally. Here is the service definition (just the service part):
apiVersion: v1
kind: Service
metadata:
  name: jira
  namespace: wittlesouth
spec:
  ports:
  - port: 8082
  selector:
    app: jira
  type: NodePort
Here's the output of kubectl get service --namespace wittlesouth
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins NodePort 10.100.119.22 <none> 8081:31377/TCP 3d
jira NodePort 10.105.148.66 <none> 8082:32760/TCP 9h
ws-mysql ExternalName <none> mysql.default.svc.cluster.local 3306/TCP 1d
The pod for this service has a HostPort set for 8082. The three nodes in the cluster are nuc1, nuc2, nuc3:
Eric:~ eric$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
nuc1 Ready master 3d v1.9.2
nuc2 Ready <none> 2d v1.9.2
nuc3 Ready <none> 2d v1.9.2
Here are the results of trying to access the Jira instance via both the host and node ports:
Eric:~ eric$ curl https://nuc1.wittlesouth.com:8082/
curl: (7) Failed to connect to nuc1.wittlesouth.com port 8082: Connection refused
Eric:~ eric$ curl https://nuc2.wittlesouth.com:8082/
curl: (7) Failed to connect to nuc2.wittlesouth.com port 8082: Connection refused
Eric:~ eric$ curl https://nuc3.wittlesouth.com:8082/
curl: (51) SSL: no alternative certificate subject name matches target host name 'nuc3.wittlesouth.com'
Eric:~ eric$ curl https://nuc3.wittlesouth.com:32760/
curl: (51) SSL: no alternative certificate subject name matches target host name 'nuc3.wittlesouth.com'
Eric:~ eric$ curl https://nuc2.wittlesouth.com:32760/
^C
Eric:~ eric$ curl https://nuc1.wittlesouth.com:32760/
curl: (7) Failed to connect to nuc1.wittlesouth.com port 32760: Operation timed out
Based on my reading, it appears that kube-proxy is not doing what it is supposed to. I tried reading through the documentation for troubleshooting kube-proxy, but it appears to be slightly out of date (when I grep for hostname in iptables-save, it finds nothing). Here is the Kubernetes version information:
Eric:~ eric$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.1", GitCommit:"3a1c9449a956b6026f075fa3134ff92f7d55f812", GitTreeState:"clean", BuildDate:"2018-01-04T11:52:23Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
It appears that kube-proxy is running:
eric@nuc2:~$ ps waux | grep kube-proxy
root 1963 0.5 0.1 54992 37556 ? Ssl 21:43 0:02 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
eric 3654 0.0 0.0 14224 1028 pts/0 S+ 21:52 0:00 grep --color=auto kube-proxy
and
Eric:~ eric$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-6vspc 1/1 Running 3 2d
calico-kube-controllers-d669cc78f-b67rc 1/1 Running 5 3d
calico-node-526md 2/2 Running 9 3d
calico-node-5trgt 2/2 Running 3 2d
calico-node-r9ww4 2/2 Running 3 2d
etcd-nuc1 1/1 Running 6 3d
kube-apiserver-nuc1 1/1 Running 7 3d
kube-controller-manager-nuc1 1/1 Running 6 3d
kube-dns-6f4fd4bdf-dt5fp 3/3 Running 12 3d
kube-proxy-8xf4r 1/1 Running 1 2d
kube-proxy-tq4wk 1/1 Running 4 3d
kube-proxy-wcsxt 1/1 Running 1 2d
kube-registry-proxy-cv8x9 1/1 Running 4 3d
kube-registry-proxy-khpdx 1/1 Running 1 2d
kube-registry-proxy-r5qcv 1/1 Running 1 2d
kube-registry-v0-wcs5w 1/1 Running 2 3d
kube-scheduler-nuc1 1/1 Running 6 3d
kubernetes-dashboard-845747bdd4-dp7gg 1/1 Running 4 3d
It appears that kube-proxy is creating iptables entries for my service:
eric@nuc1:/var/lib$ sudo iptables-save | grep hostnames
eric@nuc1:/var/lib$ sudo iptables-save | grep jira
-A KUBE-NODEPORTS -p tcp -m comment --comment "wittlesouth/jira:" -m tcp --dport 32760 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "wittlesouth/jira:" -m tcp --dport 32760 -j KUBE-SVC-MO7XZ6ASHGM5BOPI
-A KUBE-SEP-LP4GHTW6PY2HYMO6 -s 192.168.124.202/32 -m comment --comment "wittlesouth/jira:" -j KUBE-MARK-MASQ
-A KUBE-SEP-LP4GHTW6PY2HYMO6 -p tcp -m comment --comment "wittlesouth/jira:" -m tcp -j DNAT --to-destination 192.168.124.202:8082
-A KUBE-SERVICES ! -s 10.5.0.0/16 -d 10.105.148.66/32 -p tcp -m comment --comment "wittlesouth/jira: cluster IP" -m tcp --dport 8082 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.105.148.66/32 -p tcp -m comment --comment "wittlesouth/jira: cluster IP" -m tcp --dport 8082 -j KUBE-SVC-MO7XZ6ASHGM5BOPI
-A KUBE-SVC-MO7XZ6ASHGM5BOPI -m comment --comment "wittlesouth/jira:" -j KUBE-SEP-LP4GHTW6PY2HYMO6
Unfortunately, I know nothing about iptables at this point, so I don't know if those entries look correct or not. I'm suspicious that my non-default network setting during kubeadm init may be related to this, as I was trying to set up Kubernetes to not use the same IP address range as my network (which is 192.168 based). The kubeadm init statement I used was:
kubeadm init --pod-network-cidr=10.5.0.0/16 --apiserver-cert-extra-sans ['kubemaster.wittlesouth.com','192.168.5.10']
You may have noticed that I'm using Calico, which defaults to a pod network pool of 192.168.0.0; I modified the pod network pool setting for Calico when I created the Calico service (not sure if that is related or not).
At this point, I'm concluding either I don't understand how NodePort services are supposed to work, or there is something wrong with my cluster configuration. Any suggestions on next steps to diagnose would be greatly appreciated!
When you define a NodePort service there are actually three ports in play:
The container port: this is the port your pod is actually listening on, and it's only available when directly hitting your container from within the cluster, pod to pod (JIRA's default port would be 8080). You set the targetPort in your service to this port.
The service port: this is the load balanced port the service itself exposes internally in the cluster. With a single pod there's no load balancing at play, but it's still the entry point to your service. The port in your service definition defines this. If you don't specify a targetPort then it assumes port and targetPort are the same.
The node port: the port exposed on each worker node that routes to your service. This is a port typically in the 30000-33000 range (depending on how your cluster is configured). This is the only port that you would be able to access from outside the cluster. This is defined with nodePort.
Assuming that you are running JIRA on the standard port, you would want a service definition something like:
apiVersion: v1
kind: Service
metadata:
  name: jira
  namespace: wittlesouth
spec:
  ports:
  - port: 80         # this is the service port, can be anything
    targetPort: 8080 # this is the container port (must match the port your pod is listening on)
    nodePort: 32000  # if you don't specify this it randomly picks an available port in your NodePort range
  selector:
    app: jira
  type: NodePort
So, if you use that configuration, an incoming request to your NodePort service goes: NodePort (32000) -> service (80) -> pod (8080). (Internally it might actually bypass the service, I'm not 100% sure about that, but you can conceptually think about it this way.)
It also appears that you're trying to hit JIRA directly with HTTPS. Did you configure a certificate in your JIRA pod? If so, you need to make sure it's a valid cert for nuc1.wittlesouth.com, or tell curl to ignore certificate validation errors with curl -k.
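For example (a sketch using the NodePort from your service listing):

curl -k https://nuc3.wittlesouth.com:32760/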
For the first part, with HostPort it is pretty much exactly as expected: it should work only on the host the pod is running on, and here it does. The fact that NodePort works only on one of the nodes is a problem, as you correctly assume it should work on all the nodes.
As it works on one of them, it looks like your API server and kube-proxy are doing their job, and the issue is unlikely to be caused by either of them.
The first thing to check is whether your Calico works fine and whether you can connect from all the nodes to the actual pod running your Jira. If not, then that is your problem. I suggest running tcpdump both on the node you curl to and on the node that has the pod running, to see if packets are reaching the nodes and how they leave them (specifically the receiving node that does not respond to curl).
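A sketch of that tcpdump check, assuming tcpdump is installed on the nodes and using the pod IP and ports visible in your iptables output:

# On the node you curl against, watch for the incoming NodePort traffic:
sudo tcpdump -i any -nn tcp port 32760

# On the node running the Jira pod, watch for the traffic DNAT-ed to the pod:
sudo tcpdump -i any -nn host 192.168.124.202 and tcp port 8082

# Then, from outside the cluster:
curl -k https://nuc2.wittlesouth.com:32760/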