Kubernetes master starts only on tcp6, how to join a node?

I have a local Kubernetes master that is listening on tcp6 port 6443 but not on tcp. How do I run kubeadm join so that it uses the right port?
tcp6 0 0 :::10250 :::* LISTEN -
tcp6 0 0 :::6443 :::* LISTEN -
tcp6 0 0 :::10251 :::* LISTEN -
Starting Nmap 7.01 ( https://nmap.org ) at 2019-09-25 15:40 CEST
Nmap scan report for 10.0.2.15
Host is up (0.000081s latency).
PORT STATE SERVICE
6443/tcp closed unknown

Run the following command on the master host:
$ kubeadm init --apiserver-advertise-address=<private-ip of master host>
The --apiserver-advertise-address parameter: if the node should host a new control plane instance, this is the IP address the API server will advertise it's listening on. If not set, the default network interface will be used.
Now try to run the join command that was generated in the output of kubeadm init. It should work fine.
Also check whether a firewall is running on your master node. If one is, it may be blocking incoming traffic and should be disabled:
systemctl stop firewalld
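On the tcp6 entries themselves: on Linux, a socket bound to :: normally also accepts IPv4 connections unless IPV6_V6ONLY is set (the net.ipv6.bindv6only sysctl defaults to 0), so a tcp6 line for :::6443 does not by itself mean IPv4 clients are refused. A minimal Python sketch to check this behaviour on a given host, assuming Python 3 is available. The function name is made up for illustration, and it uses an ephemeral port rather than 6443 so it does not touch the API server:

```python
import socket


def dual_stack_accepts_ipv4():
    """Bind a listener to '::' and try to reach it over IPv4.

    Returns True if the IPv4 connection succeeds, False if it fails,
    and None if IPv6 sockets cannot be used on this host at all.
    """
    if not socket.has_ipv6:
        return None
    try:
        server = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return None
    with server:
        try:
            # 0 = dual-stack: the same socket also serves IPv4 clients.
            server.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
            server.bind(("::", 0))  # port 0 = let the kernel pick a free port
        except OSError:
            return None
        server.listen(1)
        port = server.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            try:
                client.connect(("127.0.0.1", port))
                return True
            except OSError:
                return False


if __name__ == "__main__":
    print("IPv4 client reached the '::' listener:", dual_stack_accepts_ipv4())
```

If this reports True, the tcp6-only netstat output is probably not the reason the join fails, and the advertise address or a firewall is the more likely place to look.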

Related

Kubernetes node failed to join master due to "Timeout exceeded while awaiting headers" error

I am trying to set up a k8s cluster with a master and two worker nodes on DigitalOcean.
My Config:
I have created three droplets as follows:
Master: 2cpu, 3GB Mem
Worker Node1: 1cpu, 2GB Mem
Worker Node2: 1cpu, 2GB Mem
I was able to set up the master node successfully:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 139m v1.18.3
I am unable to add a worker to the master.
Command I ran to join:
$ kubeadm join <PUBLIC IP>:6443 --token <token> --discovery-token-ca-cert-hash <hash>
Token had 23h of validity left at the time of executing the above command.
Error that I got:
W0528 14:13:09.920404 25129 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: Get https://PUBLIC_IP:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
To see the stack trace of this error execute with --v=5 or higher
My observations on this issue:
$ netstat -pnltu
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:40389 0.0.0.0:* LISTEN 25074/kubelet
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 25074/kubelet
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 25478/kube-proxy
tcp 0 0 127.0.0.1:9099 0.0.0.0:* LISTEN 29823/calico-node
tcp 0 0 127.0.0.1:10257 0.0.0.0:* LISTEN 24580/kube-controll
tcp 0 0 127.0.0.1:10259 0.0.0.0:* LISTEN 24742/kube-schedule
tcp6 0 0 :::10250 :::* LISTEN 25074/kubelet
tcp6 0 0 :::10251 :::* LISTEN 24742/kube-schedule
tcp6 0 0 :::6443 :::* LISTEN 24725/kube-apiserve
tcp6 0 0 :::10252 :::* LISTEN 24580/kube-controll
tcp6 0 0 :::10256 :::* LISTEN 25478/kube-proxy
Is it because the API server is listening on IPv6 instead of IPv4?
Here is the output of cluster-info:
$ kubectl cluster-info
Kubernetes master is running at https://<PUBLIC_IP>:6443
KubeDNS is running at https://<PUBLIC_IP>:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Any help to fix this issue is much appreciated.
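One way to narrow this down from the worker droplet is to look at how a plain TCP connection to the API server port fails. A rough Python sketch, not part of kubeadm; probe_tcp is a hypothetical helper name. As a rule of thumb, a timeout suggests packets are being dropped on the way (for example by a firewall), while "refused" suggests nothing is accepting connections on that address and port:

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no listener accepted the connection.
        return "refused"
    except socket.timeout:
        # No answer within the timeout; packets are probably being dropped.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "No route to host" or a name resolution error.
        return "error: {}".format(exc.strerror or exc)


if __name__ == "__main__":
    import sys

    # Usage: python3 probe.py <host> <port>
    print(probe_tcp(sys.argv[1], int(sys.argv[2])))
```

Running it on the worker against the master's public IP and port 6443, and again on the master itself, shows whether the two views differ.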

Kube Controller Manager CrashLoopBackOff

My kube-controller-manager keeps staying in CrashLoopBackOff status.
I found this when looking at the logs of the pod:
failed to create listener: failed to listen on 0.0.0.0:10252: listen tcp 0.0.0.0:10252: bind: address already in use
Then I stumbled upon this article, whose author fortunately found a fix: he killed the process using the port and restarted his kube-controller-manager pod. https://medium.com/@deepeshtripathi/kubernetes-controller-pod-crashloopbackoff-resolved-16aaa1c27cfc
So I followed the same steps. But when I logged in to the master node to find which process is using this port, I couldn't see anything using it.
root@ip:/# netstat -tunlp | grep 1025
tcp6 0 0 :::10250 :::* LISTEN 1598/kubelet
tcp6 0 0 :::10251 :::* LISTEN 7472/kube-scheduler
tcp6 0 0 :::10255 :::* LISTEN 1598/kubelet
tcp6 0 0 :::10256 :::* LISTEN 5629/kube-proxy
Does anyone know how to fix this?
failed to create listener: failed to listen on 0.0.0.0:10252: listen tcp 0.0.0.0:10252: bind: address already in use
According to the error message, port 10252 is already in use, so you need to stop whatever is listening on it. You can do that by running:
fuser -k 10252/tcp
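Since netstat showed nothing on 10252 in the question, it can help to reproduce the failing bind directly before and after running fuser. A small Python sketch, assuming Python 3 on the master node; port_in_use is a name chosen here for illustration. It attempts the same kind of bind the error message describes and reports whether the kernel answers with "address already in use":

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding host:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # e.g. permission denied; not what we are testing for
        return False


if __name__ == "__main__":
    print("10252 in use:", port_in_use(10252))
```

The socket is closed again as soon as the function returns, so the check itself does not keep the port occupied.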

Network connectivity/DNS issues on a GKE 1.10 kubernetes cluster

I'm running into DNS issues on a GKE 1.10 kubernetes cluster. Occasionally pods start without any network connectivity. Restarting the pod tends to fix the issue.
Here's the result of the same few commands inside a container without network, and one with.
BROKEN:
kc exec -it -n iotest app1-b67598997-p9lqk -c userapp sh
/app $ nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve
/app $ cat /etc/resolv.conf
nameserver 10.63.240.10
search iotest.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal google.internal
options ndots:5
/app $ curl -I 10.63.240.10
curl: (7) Failed to connect to 10.63.240.10 port 80: Connection refused
/app $ netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:8001 0.0.0.0:* LISTEN 1/python
tcp 0 0 ::1:50051 :::* LISTEN 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 :::* LISTEN 1/python
WORKING:
kc exec -it -n iotest app1-7d985bfd7b-h5dbr -c userapp sh
/app $ nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve
Name: www.google.com
Address 1: 74.125.206.147 wk-in-f147.1e100.net
Address 2: 74.125.206.105 wk-in-f105.1e100.net
Address 3: 74.125.206.99 wk-in-f99.1e100.net
Address 4: 74.125.206.104 wk-in-f104.1e100.net
Address 5: 74.125.206.106 wk-in-f106.1e100.net
Address 6: 74.125.206.103 wk-in-f103.1e100.net
Address 7: 2a00:1450:400c:c04::68 wk-in-x68.1e100.net
/app $ cat /etc/resolv.conf
nameserver 10.63.240.10
search iotest.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal google.internal
options ndots:5
/app $ curl -I 10.63.240.10
HTTP/1.1 404 Not Found
date: Sun, 29 Jul 2018 15:13:47 GMT
server: envoy
content-length: 0
/app $ netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:15000 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:15001 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:8001 0.0.0.0:* LISTEN 1/python
tcp 0 0 10.60.2.6:56508 10.60.48.22:9091 ESTABLISHED -
tcp 0 0 127.0.0.1:57768 127.0.0.1:50051 ESTABLISHED -
tcp 0 0 10.60.2.6:43334 10.63.255.44:15011 ESTABLISHED -
tcp 0 0 10.60.2.6:15001 10.60.45.26:57160 ESTABLISHED -
tcp 0 0 10.60.2.6:48946 10.60.45.28:9091 ESTABLISHED -
tcp 0 0 127.0.0.1:49804 127.0.0.1:50051 ESTABLISHED -
tcp 0 0 ::1:50051 :::* LISTEN 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 :::* LISTEN 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 ::ffff:127.0.0.1:49804 ESTABLISHED 1/python
tcp 0 0 ::ffff:127.0.0.1:50051 ::ffff:127.0.0.1:57768 ESTABLISHED 1/python
These pods are identical, just one was restarted.
Does anyone have advice about how to analyse and fix this issue?
Some steps to try:
1) Run ifconfig eth0 (or whatever the primary interface is).
Is the interface up? Are the tx and rx packet counts increasing?
2) If the interface is up, you can run tcpdump while running the nslookup command that you posted. See if the DNS request packets are getting sent out.
3) See which node the pod is scheduled on when network connectivity gets broken. Maybe it is on the same node every time? If yes, are other pods on that node running into a similar problem?
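Alongside the steps above, it may be worth comparing which ports are in LISTEN state in the two containers. In the netstat outputs posted in the question, the working container has listeners on 15000 and 15001 that the broken one lacks. Whether that is the cause is only a guess, but the comparison is easy to automate. A minimal Python sketch, assuming netstat -antp style output like the above; listening_ports is a hypothetical helper name:

```python
def listening_ports(netstat_output):
    """Collect the local TCP ports in LISTEN state from netstat -antp output."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State [PID/Program]
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "LISTEN":
            # The port is whatever follows the last colon, which also
            # handles IPv6 forms such as ::1:50051 or ::ffff:127.0.0.1:50051.
            ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports


if __name__ == "__main__":
    import sys

    # Usage: python3 ports.py working.txt broken.txt
    with open(sys.argv[1]) as f:
        working = listening_ports(f.read())
    with open(sys.argv[2]) as f:
        broken = listening_ports(f.read())
    print("listening only in the working pod:", sorted(working - broken))
```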
I also faced the same problem, and I simply worked around it for now by switching to the 1.9.x GKE version (after spending many hours trying to debug why my app wasn't working).
Hope this helps!

Cannot curl kubelet read-only port

I have a heapster pod running on one of the nodes in my Kubernetes cluster. It is able to get http://<node-with-heapster-pod>:10255/stats/summary just fine, but whenever it runs the same get request on another node, it cannot. When I run curl from within any given node I can access that port, but when I curl any node from another machine I get the following error:
Failed to connect to 128.180.120.229 port 10255: No route to host
The following is the netstat output for all ports on which the kubelet is listening:
netstat -ap | grep -i "listen" | grep "kubelet"
tcp 0 0 localhost:10248 0.0.0.0:* LISTEN 7562/kubelet
tcp6 0 0 [::]:4194 [::]:* LISTEN 7562/kubelet
tcp6 0 0 [::]:10250 [::]:* LISTEN 7562/kubelet
tcp6 0 0 [::]:10255 [::]:* LISTEN 7562/kubelet
unix 2 [ ACC ] STREAM LISTENING 621349 7562/kubelet /var/run/dockershim.sock
I apologize for the messy last column. Any ideas why this may be? My iptables rules are set up to accept all incoming connections, and any node can ping port 10250 fine, just not 10255.
You may not have ip_forward enabled on your system. Can you check this setting?
sysctl -n net.ipv4.ip_forward
If anybody still cares, port 10255 is the kubelet's read-only port and may or may not be configured. You can confirm this by logging in to the worker node in question and looking at the kubelet's startup command:
systemctl status kubelet-worker.service
Some on-prem Kubernetes solutions set this to 0 (disabled), as described in the kubelet reference below:
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
--read-only-port int32 The read-only port for the Kubelet to serve on with no authentication/authorization (set to 0 to disable) (default 10255) (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
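If you have the kubelet's command line as a string (for example copied from the systemctl status output), a few lines of Python can pull the effective flag value out of it. This is only a sketch: read_only_port is a made-up helper, the default of 10255 is taken from the flag description quoted above, and it cannot see a value set through the file passed with --config, so treat "default" results with caution:

```python
import shlex


def read_only_port(kubelet_cmdline, default=10255):
    """Return the --read-only-port value found in a kubelet command line.

    Handles both '--read-only-port=N' and '--read-only-port N'.
    Falls back to `default` when the flag is absent.
    """
    args = shlex.split(kubelet_cmdline)
    for i, arg in enumerate(args):
        if arg.startswith("--read-only-port="):
            return int(arg.split("=", 1)[1])
        if arg == "--read-only-port" and i + 1 < len(args):
            return int(args[i + 1])
    return default


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    cmd = "/usr/bin/kubelet --read-only-port=0 --v=2"
    port = read_only_port(cmd)
    print("read-only port disabled" if port == 0 else "read-only port: %d" % port)
```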

nmap shows port is closed even though I opened it

My VPS is running CentOS 7.2. I opened a port with firewall-cmd --zone=public --add-port=8006/tcp --permanent and have already run firewall-cmd --reload, but when I check the port with nmap (nmap -p 8006 ip-addressxxx), it still shows as closed. Here is some information that may help:
[root@localhost ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2017-04-07 02:06:50 EDT; 3 days ago
Docs: man:firewalld(1)
Main PID: 663 (firewalld)
CGroup: /system.slice/firewalld.service
└─663 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
Apr 07 02:06:50 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
Apr 07 02:06:50 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
Apr 10 02:03:42 localhost.localdomain firewalld[663]: ERROR: ALREADY_ENABLED: 80:tcp
Apr 10 02:03:49 localhost.localdomain firewalld[663]: ERROR: ALREADY_ENABLED: 8006:tcp
.
.
.
[root@localhost ~]# firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: ens3
sources:
services: dhcpv6-client ssh
ports: 8009/tcp 80/tcp 8080/tcp 8006/tcp
protocols:
masquerade: no
forward-ports:
sourceports:
icmp-blocks:
rich rules:
.
.
.
[root@localhost ~]# firewall-cmd --list-ports
8009/tcp 80/tcp 8080/tcp 8006/tcp
.
.
.
[root@localhost ~]# netstat -plunt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 992/sshd
tcp6 0 0 :::8009 :::* LISTEN 1027/java
tcp6 0 0 :::3306 :::* LISTEN 1383/mysqld
tcp6 0 0 :::80 :::* LISTEN 1027/java
tcp6 0 0 :::22 :::* LISTEN 992/sshd
tcp6 0 0 127.0.0.1:8006 :::* LISTEN 1027/java
Revisited my answer
The process you have on port 8006 is listening only on the loopback interface, 127.0.0.1; it should be listening on 0.0.0.0. Compare the sshd process in your list, which listens on 0.0.0.0:22 and works fine.
Use something like netcat to test. This will open port 8006 on 0.0.0.0, which is reachable from outside because of your firewall rules.
On your VPS try:
nc -l 8006
and then scan with nmap again. You will see the port is open, provided your firewall rules are in place.
You want to see this in the process list:
tcp6 0 0 0.0.0.0:8006 :::* LISTEN 1027/java
and not
tcp6 0 0 127.0.0.1:8006 :::* LISTEN 1027/java
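The difference between the two bind addresses can be reproduced without Java or nmap. A minimal Python sketch, assuming a Linux host where the whole 127.0.0.0/8 range is local; 127.0.0.2 is used here purely as a stand-in for "some address of this machine other than 127.0.0.1", and can_connect is a name made up for the example:

```python
import socket


def can_connect(bind_addr, dest_addr):
    """Listen on bind_addr (ephemeral port) and try to connect via dest_addr."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bind_addr, 0))  # port 0 = let the kernel pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            try:
                client.connect((dest_addr, port))
                return True
            except OSError:
                return False


if __name__ == "__main__":
    # Bound to every address: reachable through the other address too.
    print("0.0.0.0   via 127.0.0.2:", can_connect("0.0.0.0", "127.0.0.2"))
    # Bound to 127.0.0.1 only: connections to any other address are refused.
    print("127.0.0.1 via 127.0.0.2:", can_connect("127.0.0.1", "127.0.0.2"))
```

The same thing happens with the VPS's public address: a listener bound to 127.0.0.1 never sees connections addressed to it, whatever the firewall allows.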