K8s NodePort service is “unreachable by IP” only on 2/4 slaves in the cluster - kubernetes

I created a K8s cluster of 5 VMs (1 master and 4 slaves running Ubuntu 16.04.3 LTS) using kubeadm. I used flannel to set up networking in the cluster. I was able to successfully deploy an application. I, then, exposed it via NodePort service. From here things got complicated for me.
Before I started, I disabled the default firewalld service on master and the nodes.
As I understand from the K8s Services doc, the type NodePort exposes the service on all nodes in the cluster. However, when I created it, the service was exposed only on 2 nodes out of 4 in the cluster. I am guessing that's not the expected behavior (right?)
For troubleshooting, here are some resource specs:
root#vm-vivekse-003:~# kubectl get nodes
NAME STATUS AGE VERSION
vm-deepejai-00b Ready 5m v1.7.3
vm-plashkar-006 Ready 4d v1.7.3
vm-rosnthom-00f Ready 4d v1.7.3
vm-vivekse-003 Ready 4d v1.7.3 //the master
vm-vivekse-004 Ready 16h v1.7.3
root#vm-vivekse-003:~# kubectl get pods -o wide -n playground
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-bootcamp-2457653786-9qk80 1/1 Running 0 2d 10.244.3.6 vm-rosnthom-00f
springboot-helloworld-2842952983-rw0gc 1/1 Running 0 1d 10.244.3.7 vm-rosnthom-00f
root#vm-vivekse-003:~# kubectl get svc -o wide -n playground
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
sb-hw-svc 10.101.180.19 <nodes> 9000:30847/TCP 5h run=springboot-helloworld
root#vm-vivekse-003:~# kubectl describe svc sb-hw-svc -n playground
Name: sb-hw-svc
Namespace: playground
Labels: <none>
Annotations: <none>
Selector: run=springboot-helloworld
Type: NodePort
IP: 10.101.180.19
Port: <unset> 9000/TCP
NodePort: <unset> 30847/TCP
Endpoints: 10.244.3.7:9000
Session Affinity: None
Events: <none>
root#vm-vivekse-003:~# kubectl get endpoints sb-hw-svc -n playground -o yaml
apiVersion: v1
kind: Endpoints
metadata:
creationTimestamp: 2017-08-09T06:28:06Z
name: sb-hw-svc
namespace: playground
resourceVersion: "588958"
selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc
uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b
subsets:
- addresses:
- ip: 10.244.3.7
nodeName: vm-rosnthom-00f
targetRef:
kind: Pod
name: springboot-helloworld-2842952983-rw0gc
namespace: playground
resourceVersion: "473859"
uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b
ports:
- port: 9000
protocol: TCP
After some tinkering I realized that on those 2 "faulty" nodes, those services were not available from within those hosts itself.
Node01 (working):
root#vm-vivekse-004:~# curl 127.0.0.1:30847 //<localhost>:<nodeport>
Hello Docker World!!
root#vm-vivekse-004:~# curl 10.101.180.19:9000 //<cluster-ip>:<port>
Hello Docker World!!
root#vm-vivekse-004:~# curl 10.244.3.7:9000 //<pod-ip>:<port>
Hello Docker World!!
Node02 (working):
root#vm-rosnthom-00f:~# curl 127.0.0.1:30847
Hello Docker World!!
root#vm-rosnthom-00f:~# curl 10.101.180.19:9000
Hello Docker World!!
root#vm-rosnthom-00f:~# curl 10.244.3.7:9000
Hello Docker World!!
Node03 (not working):
root#vm-plashkar-006:~# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root#vm-plashkar-006:~# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root#vm-plashkar-006:~# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out
Node04 (not working):
root#vm-deepejai-00b:/# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root#vm-deepejai-00b:/# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root#vm-deepejai-00b:/# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out
Tried netstat and telnet on all 4 slaves. Here's the output:
Node01 (the working host):
root#vm-vivekse-004:~# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 27808/kube-proxy
root#vm-vivekse-004:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Node02 (the working host):
root#vm-rosnthom-00f:~# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 11842/kube-proxy
root#vm-rosnthom-00f:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Node03 (the not-working host):
root#vm-plashkar-006:~# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 7791/kube-proxy
root#vm-plashkar-006:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out
Node04 (the not-working host):
root#vm-deepejai-00b:/# netstat -tulpn | grep 30847
tcp6 0 0 :::30847 :::* LISTEN 689/kube-proxy
root#vm-deepejai-00b:/# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out
Addition info:
From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I am able to ping this host from all the 5 VMs and curl vm-rosnthom-00f:30847 also works from all the VMs.
I can clearly see that the internal cluster networking is messed up, but I am unsure how to resolve it! iptables -L for all the slaves are identical, and even the Local Loopback (ifconfig lo) is up and running for all the slaves. I'm completely clueless as to how to fix it!

Use a service type NodePort and access the NodePort if the Ipadress of your Master node.
The Service obviously knows on which node a Pod is running and redirect the traffic to one of the pods if you have several instances.
Label your pods and use the corrispondent selectors in the service.
If you get still into issues please post your service and deployment.
To check the connectivity i would suggest to use netcat.
nc -zv ip/service port
if network is ok it responds: open
inside the cluster access the containers like so:
nc -zv servicename.namespace.svc.cluster.local port
Consider always that you have 3 kinds of ports.
Port on which your software is running in side your container.
Port on which you expose that port to the pod. (a pod has one ipaddress, the clusterIp address, which is use by a container on a specific port)
NodePort wich allows you to access the pods ipaddress ports from outside the clusters network.

Either your firewall blocks some connections between nodes or your kube-proxy is not working properly. I guess your services work only on nodes where pods are running on.

If you want to reach the service from any node in the cluster you need fine service type as ClusterIP. Since you defined service type as NodePort, you can connect from the node where service is running.
my above answer was not correct, based on documentation we should be able to connect from any NodeIP:Nodeport. but its not working in my cluster also.
https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types
NodePort: Exposes the service on each Node’s IP at a static port (the
NodePort). A ClusterIP service, to which the NodePort service will
route, is automatically created. You’ll be able to contact the
NodePort service, from outside the cluster, by requesting
:.
One of my node ip forward not set. I was able to connect my service using NodeIP:nodePort
sysctl -w net.ipv4.ip_forward=1

Related

Not able to access Nginx from an external IP even after k8s nodeport service exposed

I am not able to access the nginx server using http://:30602 and also http://:30602
OS: Ubuntu 22
I also checked if any firewall is blocking it.
Using ufw
admin#tst-server:~$ sudo ufw status verbose
Status: inactive
Using netstat
admin#tst-server:~$ netstat -an | grep 22 | grep -i listen
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp6 0 0 :::22 :::* LISTEN
unix 2 [ ACC ] STREAM LISTENING 354787 /run/containerd/s/9a866c6ea3a4fe1976aaed0884400cd59228d43776774cc3fad2d0b9a7c2ed7b
unix 2 [ ACC ] STREAM LISTENING 21722 /run/systemd/private
admin#tst-server:~$ netstat -an | grep 30602 | grep -i listen
Commands used for nginx deployment
Create Deployment
kubectl create deployment nginx --image=nginx
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
myapp 2/2 2 2 8d
nginx 1/1 1 1 9m50s
Create Service
kubectl create service nodeport nginx --tcp=80:80
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8d
nginx NodePort 10.109.112.116 <none> 80:30602/TCP 10m
Test it out
admin#tst-server:~$ hostname
tst-server.com
admin#tst-server:~$ curl tst-server.com:30602
curl: (7) Failed to connect to tst-server.com port 30602 after 10 ms: Connection refused
Got it working by getting the Node IP address for Minikube using following command
$ kubectl cluster-info
and then
curl http://<node_ip>:30008
Upon curl test-server.com:30602 why it redirects to tst-server.kanaaritech.com?
To check whether the node port is working or not you can check once with the node's IP with port 30602.

Unable to curl pod IP using containerPort

Sorry if it's a naive question. Please correct me if my understanding is wrong.
Created POD using this command:
kubectl run nginx --image=nginx --port=8888
My understand of this command, nginx (application) container will be exposed/available at port 8888
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx 1/1 Running 0 10m 10.244.1.2 node01 <none> <none>
curl -v 10.244.1.2:8888 ===> i am wondering why this failed ?
* Trying 10.244.1.2:8888...
* TCP_NODELAY set
* connect to 10.244.1.2 port 8888 failed: Connection refused
* Failed to connect to 10.244.1.2 port 8888: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 10.244.1.2 port 8888: Connection refused
curl -v 10.244.1.2 ===> to my surprise this returned 200 success response
* Trying 10.244.1.2:80...
* TCP_NODELAY set
* Connected to 10.244.1.2 (10.244.1.2) port 80 (#0)
> GET / HTTP/1.1
> Host: 10.244.1.2
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
If the application is still referring default 80 port, I am wondering about the significance of container port 8888 ?
OK, so it may be used to expose the POD to the outside world.
Let's see that, I went ahead and created service for the POD:
kubectl expose pod nginx --port=80 --target-port=8888
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx ClusterIP 10.96.214.161 <none> 80/TCP 13m
$ curl -v 10.96.214.161 ==> here default port (80) didn't work
* Trying 10.96.214.161:80...
* TCP_NODELAY set
* connect to 10.96.214.161 port 80 failed: Connection refused
* Failed to connect to 10.96.214.161 port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 10.96.214.161 port 80: Connection refused
$ curl -v 10.96.214.161:8888 ==> target port didn't work either
* Trying 10.96.214.161:8888...
* TCP_NODELAY set
....waiting forever
Which port do I need to use to make it work? Am I missing anything?
By default, nginx server listen to the port 80. You can see it in their docker image ref.
With kubectl run nginx --image=nginx --port=8888 what you have done here is you have expose another port along with 80. But the server is still listening on the 80 port.
So, try with target port 80. For this reason when you tried with other than port 80 it's not working. Try with set --target-port=8888 to --target-port=80.
Or, If you want to change the server port you need to use configmap along with pod to pass custom config to the server.

Can we setup a k8s bare matal server to run Bind DNS server (named) and have an access to it from the outside on port 53?

I have setup a k8s cluster using 2 bare metal servers (1 master and 1 worker) using kubespray with default settings (kube_proxy_mode: iptables and dns_mode: coredns) and I would like to run a BIND DNS server inside to manage a couple of domain names.
I deployed with helm 3 an helloworld web app for testing. Everything works like a charm (HTTP, HTTPs, Let's Encrypt thought cert-manager).
kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T21:03:42Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.7", GitCommit:"be3d344ed06bff7a4fc60656200a93c74f31f9a4", GitTreeState:"clean", BuildDate:"2020-02-11T19:24:46Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8smaster Ready master 22d v1.16.7
k8sslave Ready <none> 21d v1.16.7
I deployed with an Helm 3 chart an image of my BIND DNS Server (named) in default namespace; with a service exposing the port 53 of the bind app container.
I have tested the DNS resolution with a pod and the bind service; it works well. Here is the test of the bind k8s service from the master node:
kubectl -n default get svc bind -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
bind ClusterIP 10.233.31.255 <none> 53/TCP,53/UDP 4m5s app=bind,release=bind
kubectl get endpoints bind
NAME ENDPOINTS AGE
bind 10.233.75.239:53,10.233.93.245:53,10.233.75.239:53 + 1 more... 4m12s
export SERVICE_IP=`kubectl get services bind -o go-template='{{.spec.clusterIP}}{{"\n"}}'`
nslookup www.example.com ${SERVICE_IP}
Server: 10.233.31.255
Address: 10.233.31.255#53
Name: www.example.com
Address: 176.31.XXX.XXX
So the bind DNS app is deployed and is working fine through the bind k8s service.
For the next step; I followed the https://kubernetes.github.io/ingress-nginx/user-guide/exposing-tcp-udp-services/ documentation to setup the Nginx Ingress Controller (both configmap and service) to handle tcp/udp requests on port 53 and to redirect them to the bind DNS app.
When I test the name resolution from an external computer it does not work:
nslookup www.example.com <IP of the k8s master>
;; connection timed out; no servers could be reached
I digg into k8s configuration, logs, etc. and I found a warning message in kube-proxy logs:
ps auxw | grep kube-proxy
root 19984 0.0 0.2 141160 41848 ? Ssl Mar26 19:39 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=k8smaster
journalctl --since "2 days ago" | grep kube-proxy
<NOTHING RETURNED>
KUBEPROXY_FIRST_POD=`kubectl get pods -n kube-system -l k8s-app=kube-proxy -o go-template='{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | head -n 1`
kubectl logs -n kube-system ${KUBEPROXY_FIRST_POD}
I0326 22:26:03.491900 1 node.go:135] Successfully retrieved node IP: 91.121.XXX.XXX
I0326 22:26:03.491957 1 server_others.go:150] Using iptables Proxier.
I0326 22:26:03.492453 1 server.go:529] Version: v1.16.7
I0326 22:26:03.493179 1 conntrack.go:52] Setting nf_conntrack_max to 262144
I0326 22:26:03.493647 1 config.go:131] Starting endpoints config controller
I0326 22:26:03.493663 1 config.go:313] Starting service config controller
I0326 22:26:03.493669 1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I0326 22:26:03.493679 1 shared_informer.go:197] Waiting for caches to sync for service config
I0326 22:26:03.593986 1 shared_informer.go:204] Caches are synced for endpoints config
I0326 22:26:03.593992 1 shared_informer.go:204] Caches are synced for service config
E0411 17:02:48.113935 1 proxier.go:927] can't open "externalIP for ingress-nginx/ingress-nginx:bind-udp" (91.121.XXX.XXX:53/udp), skipping this externalIP: listen udp 91.121.XXX.XXX:53: bind: address already in use
E0411 17:02:48.119378 1 proxier.go:927] can't open "externalIP for ingress-nginx/ingress-nginx:bind-tcp" (91.121.XXX.XXX:53/tcp), skipping this externalIP: listen tcp 91.121.XXX.XXX:53: bind: address already in use
Then I look for who was already using the port 53...
netstat -lpnt | grep 53
tcp 0 0 0.0.0.0:5355 0.0.0.0:* LISTEN 1682/systemd-resolv
tcp 0 0 87.98.XXX.XXX:53 0.0.0.0:* LISTEN 19984/kube-proxy
tcp 0 0 169.254.25.10:53 0.0.0.0:* LISTEN 14448/node-cache
tcp6 0 0 :::9253 :::* LISTEN 14448/node-cache
tcp6 0 0 :::9353 :::* LISTEN 14448/node-cache
A look on the proc 14448/node-cache:
cat /proc/14448/cmdline
/node-cache-localip169.254.25.10-conf/etc/coredns/Corefile-upstreamsvccoredns
So coredns is already handling the port 53 which is normal cos it's the k8s internal DNS service.
In coredns documentation (https://github.com/coredns/coredns/blob/master/README.md) they talk about a -dns.port option to use a distinct port... but when I look into kubespray (which has 3 jinja templates https://github.com/kubernetes-sigs/kubespray/tree/release-2.12/roles/kubernetes-apps/ansible/templates for creating the coredns configmap, services etc. similar to https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#coredns) everything is hardcoded with port 53.
So my question is : Is there a k8s cluster configuration/workaround so I can run my own DNS Server and exposed it to port 53?
Maybe?
Setup the coredns to use a a different port than 53 ? Seems hard and I'm really not sure this makes sense!
I can setup my bind k8s service to expose port 5353 and configure the nginx ingress controller to handle this 5353 port and redirect to the app 53 port. But this would require to setup iptables to route external DSN requests* received on port 53 to my bind k8s service on port 5353 ? What would be the iptables config (INPUT / PREROUTING or FORWARD)? Does this kind of network configuration would breakes coredns?
Regards,
Chris
I suppose Your nginx-ingress doesn't work as expected. You need Load Balancer provider, such as MetalLB, to Your bare metal k8s cluster to receive external connections on ports like 53. And You don't need nginx-ingress to use with bind, just change bind Service type from ClusterIP to LoadBalancer and ensure you got an external IP on this Service. Your helm chart manual may help to switch to LoadBalancer.

Can't Connect to Kubernetes Service from Inside Service Pod?

I create a one-replica zookeeper + kafka cluster with the official kafka chart from the official incubator repo:
helm install --name mykafka -f kafka.yaml incubator/kafka
This gives me two pods:
kubectl get pods
NAME READY STATUS
mykafka-kafka-0 1/1 Running
mykafka-zookeeper-0 1/1 Running
And four services (in addition to the default kubernetes service)
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP
mykafka-kafka ClusterIP 10.108.143.59 <none> 9092/TCP
mykafka-kafka-headless ClusterIP None <none> 9092/TCP
mykafka-zookeeper ClusterIP 10.109.43.48 <none> 2181/TCP
mykafka-zookeeper-headless ClusterIP None <none> 2888/TCP,3888/TCP
If I shell into the zookeeper pod:
> kubectl exec -it mykafka-zookeeper-0 -- /bin/bash
I use the curl tool to test TCP connectivity. I expect a communications error as the server isn't using HTTP, but if curl can't even connect and I have to ctrl-C out, then the TCP connection isn't working.
I can access the local pod through curl localhost:2181:
root#mykafka-zookeeper-0:/# curl localhost:2181
curl: (52) Empty reply from server
I can access other pod through curl mykafka-kafka:9092:
root#mykafka-zookeeper-0:/# curl mykafka-kafka:9092
curl: (56) Recv failure: Connection reset by peer
But I can't access mykafka-zookeeper:2181. That name resolves to the cluster IP, but the attempt to TCP connect hangs until I ctrl-C:
root#mykafka-zookeeper-0:/# curl -v mykafka-zookeeper:2181
* Rebuilt URL to: mykafka-zookeeper:2181/
* Trying 10.109.43.48...
^C
Similarly, I can shell into the kafka pod:
> kubectl exec -it mykafka-kafka-0 -- /bin/bash
Connecting to the Zookeeper pod by the service name works fine:
root#mykafka-kafka-0:/# curl mykafka-zookeeper:2181
curl: (52) Empty reply from server
Connecting to localhost kafka works fine:
root#mykafka-kafka-0:/# curl localhost:9092
curl: (56) Recv failure: Connection reset by peer
But connecting to the Kafka pod by the service name doesn't work and I must ctrl-C the curl attempt:
curl -v mykafka-kafka:9092
* Rebuilt URL to: mykafka-kafka:9092/
* Hostname was NOT found in DNS cache
* Trying 10.108.143.59...
^C
Can anyone explain why using I can only connect to a Kubernetes service from outside the service and not from within the service?
I believe what you're experiencing can be resolved by looking at how your kubelet is set up to run. There is a setting you can toggle when starting up the kubelet called --hairpin-mode. By default this setting is set to the string promiscuous, where a pod can't connect to its own service, but you can change it to be hairpin-veth, which would allow a pod to connect to its own service.
There are a few issues on the topic, but this seems to be referenced the most:
https://github.com/kubernetes/kubernetes/issues/45790

NodePort services not available on all nodes

I'm attempting to run a 3-node Kubernetes cluster. I have the cluster up and running sufficiently that I have services running on different nodes. Unfortunately, I don't seem to be able to get NodePort based services to work correctly (as I understand correctness anyway...). My issue is that any NodePort services I define are available externally only on the node where their pod is running, and my understanding is that they should be available externally on any node in the cluster.
One example is a local Jira service, which should be running on port 8082 (internally) and on 32760 externally. Here is the service definition (just the service part):
apiVersion: v1
kind: Service
metadata:
name: jira
namespace: wittlesouth
spec:
ports:
- port: 8082
selector:
app: jira
type: NodePort
Here's the output of kubectl get service --namespace wittle south
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins NodePort 10.100.119.22 <none> 8081:31377/TCP 3d
jira NodePort 10.105.148.66 <none> 8082:32760/TCP 9h
ws-mysql ExternalName <none> mysql.default.svc.cluster.local 3306/TCP 1d
The pod for this service has a HostPort set for 8082. The three nodes in the cluster are nuc1, nuc2, nuc3:
Eric:~ eric$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
nuc1 Ready master 3d v1.9.2
nuc2 Ready <none> 2d v1.9.2
nuc3 Ready <none> 2d v1.9.2
Here are the results of trying to access the Jira instance via both the host and node ports:
Eric:~ eric$ curl https://nuc1.wittlesouth.com:8082/
curl: (7) Failed to connect to nuc1.wittlesouth.com port 8082: Connection refused
Eric:~ eric$ curl https://nuc2.wittlesouth.com:8082/
curl: (7) Failed to connect to nuc2.wittlesouth.com port 8082: Connection refused
Eric:~ eric$ curl https://nuc3.wittlesouth.com:8082/
curl: (51) SSL: no alternative certificate subject name matches target host name 'nuc3.wittlesouth.com'
Eric:~ eric$ curl https://nuc3.wittlesouth.com:32760/
curl: (51) SSL: no alternative certificate subject name matches target host name 'nuc3.wittlesouth.com'
Eric:~ eric$ curl https://nuc2.wittlesouth.com:32760/
^C
Eric:~ eric$ curl https://nuc1.wittlesouth.com:32760/
curl: (7) Failed to connect to nuc1.wittlesouth.com port 32760: Operation timed out
Based on my reading, it appears that cube-proxy is not doing what it is supposed to. I tried reading through the documentation for troubleshooting cube-proxy, it appears to be slightly out of date (when I grep for hostname in iptables-save, it finds nothing). Here is the kubernetes version information:
Eric:~ eric$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.1", GitCommit:"3a1c9449a956b6026f075fa3134ff92f7d55f812", GitTreeState:"clean", BuildDate:"2018-01-04T11:52:23Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
It appears that kube-proxy is running:
eric#nuc2:~$ ps waux | grep kube-proxy
root 1963 0.5 0.1 54992 37556 ? Ssl 21:43 0:02 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
eric 3654 0.0 0.0 14224 1028 pts/0 S+ 21:52 0:00 grep --color=auto kube-proxy
and
Eric:~ eric$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-6vspc 1/1 Running 3 2d
calico-kube-controllers-d669cc78f-b67rc 1/1 Running 5 3d
calico-node-526md 2/2 Running 9 3d
calico-node-5trgt 2/2 Running 3 2d
calico-node-r9ww4 2/2 Running 3 2d
etcd-nuc1 1/1 Running 6 3d
kube-apiserver-nuc1 1/1 Running 7 3d
kube-controller-manager-nuc1 1/1 Running 6 3d
kube-dns-6f4fd4bdf-dt5fp 3/3 Running 12 3d
kube-proxy-8xf4r 1/1 Running 1 2d
kube-proxy-tq4wk 1/1 Running 4 3d
kube-proxy-wcsxt 1/1 Running 1 2d
kube-registry-proxy-cv8x9 1/1 Running 4 3d
kube-registry-proxy-khpdx 1/1 Running 1 2d
kube-registry-proxy-r5qcv 1/1 Running 1 2d
kube-registry-v0-wcs5w 1/1 Running 2 3d
kube-scheduler-nuc1 1/1 Running 6 3d
kubernetes-dashboard-845747bdd4-dp7gg 1/1 Running 4 3d
It appears that cube-proxy is creating iptables entries for my service:
eric#nuc1:/var/lib$ sudo iptables-save | grep hostnames
eric#nuc1:/var/lib$ sudo iptables-save | grep jira
-A KUBE-NODEPORTS -p tcp -m comment --comment "wittlesouth/jira:" -m tcp --dport 32760 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "wittlesouth/jira:" -m tcp --dport 32760 -j KUBE-SVC-MO7XZ6ASHGM5BOPI
-A KUBE-SEP-LP4GHTW6PY2HYMO6 -s 192.168.124.202/32 -m comment --comment "wittlesouth/jira:" -j KUBE-MARK-MASQ
-A KUBE-SEP-LP4GHTW6PY2HYMO6 -p tcp -m comment --comment "wittlesouth/jira:" -m tcp -j DNAT --to-destination 192.168.124.202:8082
-A KUBE-SERVICES ! -s 10.5.0.0/16 -d 10.105.148.66/32 -p tcp -m comment --comment "wittlesouth/jira: cluster IP" -m tcp --dport 8082 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.105.148.66/32 -p tcp -m comment --comment "wittlesouth/jira: cluster IP" -m tcp --dport 8082 -j KUBE-SVC-MO7XZ6ASHGM5BOPI
-A KUBE-SVC-MO7XZ6ASHGM5BOPI -m comment --comment "wittlesouth/jira:" -j KUBE-SEP-LP4GHTW6PY2HYMO6
Unfortunately, I know nothing about iptables at this point, so I don't know if those entries look correct or not. I'm suspicious that my non-default network setting during kubeadm init may be related to this, as I was trying to set up Kubernetes to not use the same IP address range of my network (which is 192.168 based). The kubeadm init statement I used was:
kubeadm init --pod-network-cidr=10.5.0.0/16 --apiserver-cert-extra-sans ['kubemaster.wittlesouth.com','192.168.5.10'
If you've noticed that I'm using calico which defaults to a pod network pool of 192.168.0.0, I modified the pod network pool setting for calico when I created the calico service (not sure if that is related or not).
At this point, I'm concluding either I don't understand how NodePort services are supposed to work, or there is something wrong with my cluster configuration. Any suggestions on next steps to diagnose would be greatly appreciated!
When you define a NodePort service there are actually three ports in play:
The container port: this is the port your pod is actually listening on, and it's only available when directly hitting your container from within the cluster, pod to pod (JIRA's default port would be 8080). You set the targetPort in your service to this port.
The service port: this is the load balanced port the service itself exposes internally in the cluster. With a single pod there's no load balancing at play, but it's still the entry point to your service. The port in your service definition defines this. If you don't specify a targetPort then it assumes port and targetPort are the same.
The node port: The port exposed on each worker node that routes to your service. This is a port typically in the 30000-33000 range (depending on how your cluster if configured). This is the only port that you would be able to access from outside the cluster. This is defined with nodePort.
Assuming that you are running JIRA on the standard port, you would want a service definition something like:
apiVersion: v1
kind: Service
metadata:
name: jira
namespace: wittlesouth
spec:
ports:
- port: 80 # this is the service port, can be anything
targetPort: 8080 # this is the container port (must match the port your pod is listening on)
nodePort: 32000 # if you don't specify this it randomly picks an available port in your NodePort range
selector:
app: jira
type: NodePort
So, if you use that configuration an incoming request to your NodePort service goes: NodePort (32000) -> service (80) -> pod (8080). (Internally it might actually bypass the service, I'm not 100% sure about that, but you can conceptually think about it in this way).
It also appears that you're trying to hit JIRA directly with HTTPS. Did you configure a certificate in your JIRA pod? If so you need to make sure it's a valid cert for nuc1.wittlesouth.com or tell curl to ignore certificate validation errors with curl -k.
For the first part, with HostPort it is pretty much exactly as expected, it should work only on host it is running on and here it does. The fact that NodePort works only on one of the nodes is a problem , as you correctly assume it should work on all the nodes.
As it works on one of them, it looks that your API server and kube-proxy do their work, and it is unlikely to be cause by any of them.
First thing to check is if your calico works fine and if you can connect from all the nodes to the actual pod running your jira. If not, then that is your problem. I suggest running tcpdump both on the node you curl to and on the node that has the pod running to see if packets are reaching the nodes, and how they leave them (specificaly the recieving node that does not respond to curl)