Setting up Kubernetes - API not reachable from Pods - kubernetes

I'm trying to set up a basic Kubernetes cluster on an Ubuntu 16 VM. I've just followed the getting started docs and would expect a working cluster, but unfortunately, no such luck: no pods seem to be able to connect to the Kubernetes API. Since I'm new to Kubernetes, it is very tough for me to find where things are going wrong.
Provision script:
apt-get update && apt-get upgrade -y
apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl docker.io
apt-mark hold kubelet kubeadm kubectl
swapoff -a
sysctl net.bridge.bridge-nf-call-iptables=1
kubeadm init
mkdir -p /home/ubuntu/.kube
cp -i /etc/kubernetes/admin.conf /home/ubuntu/.kube/config
chown -R ubuntu:ubuntu /home/ubuntu/.kube
runuser -l ubuntu -c "kubectl apply -f \"https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')\""
runuser -l ubuntu -c "kubectl taint nodes --all node-role.kubernetes.io/master-"
Installation seems fine.
ubuntu@packer-Ubuntu-16:~$ kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-86c58d9df4-lbp46 0/1 CrashLoopBackOff 7 18m 10.32.0.2 packer-ubuntu-16 <none> <none>
kube-system coredns-86c58d9df4-t8nnn 0/1 CrashLoopBackOff 7 18m 10.32.0.3 packer-ubuntu-16 <none> <none>
kube-system etcd-packer-ubuntu-16 1/1 Running 0 17m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-apiserver-packer-ubuntu-16 1/1 Running 0 18m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-controller-manager-packer-ubuntu-16 1/1 Running 0 17m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-proxy-dwhhf 1/1 Running 0 18m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system kube-scheduler-packer-ubuntu-16 1/1 Running 0 17m 145.100.100.100 packer-ubuntu-16 <none> <none>
kube-system weave-net-sfvz5 2/2 Running 0 18m 145.100.100.100 packer-ubuntu-16 <none> <none>
Question: is it normal that the Kubernetes pods have the IP of the host's eth0 (145.100.100.100) as their IP? That seems weird to me; I would expect them to have a virtual IP.
As you can see the coredns pod is crashing, because, well, it cannot reach the API.
This is, as I understand it, the service:
ubuntu@packer-Ubuntu-16:~$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 22m
CoreDNS is crashing because the API is unreachable:
ubuntu@packer-Ubuntu-16:~$ kubectl logs -n kube-system coredns-86c58d9df4-lbp46
.:53
2018-12-06T12:54:28.481Z [INFO] CoreDNS-1.2.6
2018-12-06T12:54:28.481Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
E1206 12:54:53.482269 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:318: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1206 12:54:53.482363 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:311: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1206 12:54:53.482540 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I tried launching a simple alpine pod/container, and indeed 10.96.0.1 doesn't respond to pings or anything else.
I'm stuck here. I've tried to google a lot but nothing comes up and my understanding is pretty basic. I guess something's up with the networking, but I don't know what (for me it seems suspicious that when doing get pods, the pods show up with the host IP, but perhaps this is normal also?)

I found that the problem is caused by the host's iptables rules.

Related

Kubernetes dashboard: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout

I have a Kubernetes cluster in vagrant (1.14.0) and installed calico.
I have installed the kubernetes dashboard. When I use kubectl proxy to visit the dashboard:
Error: 'dial tcp 192.168.1.4:8443: connect: connection refused'
Trying to reach: 'https://192.168.1.4:8443/'
Here are my pods (dashboard is restarting frequently):
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-cj928 1/1 Running 0 11m
calico-node-4fnb6 1/1 Running 0 18m
calico-node-qjv7t 1/1 Running 0 20m
calico-policy-controller-b9b6749c6-29c44 1/1 Running 1 11m
coredns-fb8b8dccf-jjbhk 1/1 Running 0 20m
coredns-fb8b8dccf-jrc2l 1/1 Running 0 20m
etcd-k8s-master 1/1 Running 0 19m
kube-apiserver-k8s-master 1/1 Running 0 19m
kube-controller-manager-k8s-master 1/1 Running 0 19m
kube-proxy-8mrrr 1/1 Running 0 18m
kube-proxy-cdsr9 1/1 Running 0 20m
kube-scheduler-k8s-master 1/1 Running 0 19m
kubernetes-dashboard-5f7b999d65-nnztw 1/1 Running 3 2m11s
Logs of the dashboard pod:
2019/03/30 14:36:21 Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service account's configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
Refer to our FAQ and wiki pages for more information: https://github.com/kubernetes/dashboard/wiki/FAQ
I can telnet from both master and nodes to 10.96.0.1:443.
What is configured wrongly? The rest of the cluster seems to work fine, although I see this log in kubelet:
failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml"
kubelet seems to run fine on the master.
The cluster was created with this command:
kubeadm init --apiserver-advertise-address="192.168.50.10" --apiserver-cert-extra-sans="192.168.50.10" --node-name k8s-master --pod-network-cidr=192.168.0.0/16
You should define your hostname in /etc/hosts:
#hostname
YOUR_HOSTNAME
#nano /etc/hosts
YOUR_IP HOSTNAME
If you set the hostname on your master but it still did not work, try:
# systemctl stop kubelet
# systemctl stop docker
# iptables --flush
# iptables -t nat --flush
# systemctl start kubelet
# systemctl start docker
You should also install the dashboard before joining the worker nodes,
disable your firewall,
and check your free RAM.
Exclude the --node-name parameter from the kubeadm init command.
Try this command:
kubeadm init --apiserver-advertise-address=$(hostname -i) --apiserver-cert-extra-sans="192.168.50.10" --pod-network-cidr=192.168.0.0/16
For me the issue was that I needed to create a NetworkPolicy that allowed Egress traffic to the Kubernetes API.
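A minimal sketch of what such a policy could look like, written to a manifest file from the shell. Everything specific here is an assumption rather than something from the answer: the policy and file names are hypothetical, the pod label is assumed to be the one on the dashboard pod, the address is the API server address from the kubeadm init command in the question, and port 6443 is assumed (the kubeadm default). Depending on the CNI plugin, the policy may be matched against the API server endpoint rather than the 10.96.0.1 service IP, so check which one your plugin expects.

```shell
# Write a NetworkPolicy manifest that allows egress from the dashboard pod
# to the API server. Names, label, address and port are all assumptions.
cat > allow-apiserver-egress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-apiserver-egress
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 192.168.50.10/32
      ports:
        - protocol: TCP
          port: 6443
EOF

# Apply it on a machine with cluster access:
#   kubectl apply -f allow-apiserver-egress.yaml
```

Keep in mind that once a pod is selected by an Egress policy, any other outbound traffic it needs (DNS, for example) has to be allowed as well.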

no endpoints available for service "kubernetes-dashboard"

I'm trying to follow GitHub - kubernetes/dashboard: General-purpose web UI for Kubernetes clusters.
deploy/access:
# export KUBECONFIG=/etc/kubernetes/admin.conf
# kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
secret/kubernetes-dashboard-certs created
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created
# kubectl proxy
Starting to serve on 127.0.0.1:8001
curl:
# curl http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "no endpoints available for service \"kubernetes-dashboard\"",
"reason": "ServiceUnavailable",
"code": 503
}#
Please advise.
per @VKR
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-576cbf47c7-56vg7 0/1 ContainerCreating 0 57m
kube-system coredns-576cbf47c7-sn2fk 0/1 ContainerCreating 0 57m
kube-system etcd-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kube-apiserver-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kube-controller-manager-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kube-proxy-2hhf7 1/1 Running 0 6m57s
kube-system kube-proxy-lzfcx 1/1 Running 0 7m35s
kube-system kube-proxy-rndhm 1/1 Running 0 57m
kube-system kube-scheduler-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kubernetes-dashboard-77fd78f978-g2hts 0/1 Pending 0 2m38s
$
logs:
$ kubectl logs kubernetes-dashboard-77fd78f978-g2hts -n kube-system
$
describe:
$ kubectl describe pod kubernetes-dashboard-77fd78f978-g2hts -n kube-system
Name: kubernetes-dashboard-77fd78f978-g2hts
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=77fd78f978
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/kubernetes-dashboard-77fd78f978
Containers:
kubernetes-dashboard:
Image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-gp4l7 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
kubernetes-dashboard-token-gp4l7:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-gp4l7
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4m39s (x21689 over 20h) default-scheduler 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
$
It would appear that you are attempting to deploy Kubernetes leveraging kubeadm but have skipped the step of Installing a pod network add-on (CNI). Notice the warning:
The network must be deployed before any applications. Also, CoreDNS will not start up before a network is installed. kubeadm only supports Container Network Interface (CNI) based networks (and does not support kubenet).
Once you do this, the CoreDNS pods should come up healthy. This can be verified with:
kubectl -n kube-system -l=k8s-app=kube-dns get pods
Then the kubernetes-dashboard pod should come up healthy as well.
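To see which taints are actually on the nodes (the scheduler message above only says that there are some), a small helper like the one below may be useful. It is only a sketch: node_taints is a hypothetical name, and it assumes kubectl is already configured for the cluster. On a cluster without a pod network you would probably see a not-ready taint on every node.

```shell
# Print each node together with the keys of its taints.
# Assumes kubectl is on the PATH and configured for the cluster.
node_taints() {
  kubectl get nodes \
    -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Usage, on a machine with cluster access:
#   node_taints
```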
you could refer to https://github.com/kubernetes/dashboard#getting-started
Also, I see "https" in your link
Please try this link instead
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
I had the same problem. In the end it turned out to be a Calico network configuration problem. But step by step...
First I checked if the Dashboard Pod was running:
kubectl get pods --all-namespaces
The result for me was:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-bcc6f659f-j57l9 1/1 Running 2 19h
kube-system calico-node-hdxp6 0/1 CrashLoopBackOff 13 15h
kube-system calico-node-z6l56 0/1 Running 68 19h
kube-system coredns-74ff55c5b-8l6m6 1/1 Running 2 19h
kube-system coredns-74ff55c5b-v7pkc 1/1 Running 2 19h
kube-system etcd-got-virtualbox 1/1 Running 3 19h
kube-system kube-apiserver-got-virtualbox 1/1 Running 3 19h
kube-system kube-controller-manager-got-virtualbox 1/1 Running 3 19h
kube-system kube-proxy-q99s5 1/1 Running 2 19h
kube-system kube-proxy-vrpcd 1/1 Running 1 15h
kube-system kube-scheduler-got-virtualbox 1/1 Running 2 19h
kubernetes-dashboard dashboard-metrics-scraper-7b59f7d4df-qc9ms 1/1 Running 0 28m
kubernetes-dashboard kubernetes-dashboard-74d688b6bc-zrdk4 0/1 CrashLoopBackOff 9 28m
The last line indicates that the dashboard pod could not be started (status=CrashLoopBackOff).
The 2nd line shows that the calico node has problems. Most likely the root cause is Calico.
Next step is to have a look at the pod log (change namespace / name as listed in YOUR pods list):
kubectl logs kubernetes-dashboard-74d688b6bc-zrdk4 -n kubernetes-dashboard
The result for me was:
2021/03/05 13:01:12 Starting overwatch
2021/03/05 13:01:12 Using namespace: kubernetes-dashboard
2021/03/05 13:01:12 Using in-cluster config to connect to apiserver
2021/03/05 13:01:12 Using secret token for csrf signing
2021/03/05 13:01:12 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
Hm - not really helpful. After searching for "dial tcp 10.96.0.1:443: i/o timeout" I found this information, where it says ...
If you follow the kubeadm instructions to the letter ... Which means install docker, kubernetes (kubeadm, kubectl, & kubelet), and calico with the Kubeadm hosted instructions ... and your computer nodes have a physical ip address in the range of 192.168.X.X then you will end up with the above mentioned non-working dashboard. This is because the node ip addresses clash with the internal calico ip addresses.
https://github.com/kubernetes/dashboard/issues/1578#issuecomment-329904648
Yes, indeed I do have a physical IP in the range of 192.168.x.x, like many others might have as well. I wish Calico would check this during setup.
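A check of this kind can be sketched in plain shell. This is not something Calico or kubeadm provides; ip_to_int and ip_in_cidr are hypothetical helper names, and the addresses are the node IP and pod ranges used in this answer. It only handles IPv4 addresses written without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell, so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR block $2.
ip_in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  fi
  [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
)

# Node IP against Calico's default pool, then against the replacement range.
if ip_in_cidr 192.168.100.122 192.168.0.0/16; then
  echo "clash: node IP is inside the pod CIDR"
fi
if ! ip_in_cidr 192.168.100.122 172.16.0.0/12; then
  echo "ok: node IP is outside the pod CIDR"
fi
```

With these values the first check reports a clash and the second does not, which is why moving the pod network to 172.16.0.0/12 avoids the overlap.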
So let's move the pod network to a different IP range:
You should use one of the IP ranges reserved for private networks:
10.0.0.0/8 (16,777,216 addresses)
172.16.0.0/12 (1,048,576 addresses)
192.168.0.0/16 (65,536 addresses)
Otherwise Calico will terminate with an error saying "Invalid CIDR specified in CALICO_IPV4POOL_CIDR" ...
sudo kubeadm reset
sudo rm /etc/cni/net.d/10-calico.conflist
sudo rm /etc/cni/net.d/calico-kubeconfig
export CALICO_IPV4POOL_CIDR=172.16.0.0
export MASTER_IP=192.168.100.122
sudo kubeadm init --pod-network-cidr=$CALICO_IPV4POOL_CIDR/12 --apiserver-advertise-address=$MASTER_IP --apiserver-cert-extra-sans=$MASTER_IP
mkdir -p $HOME/.kube
sudo rm -f $HOME/.kube/config
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
sudo chown $(id -u):$(id -g) /etc/kubernetes/kubelet.conf
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml -O calico.yaml
sudo sed -i "s/192.168.0.0\/16/$CALICO_IPV4POOL_CIDR\/12/g" calico.yaml
sudo sed -i "s/192.168.0.0/$CALICO_IPV4POOL_CIDR/g" calico.yaml
kubectl apply -f calico.yaml
Now we test if all calico pods are running:
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-bcc6f659f-ns7kz 1/1 Running 0 15m
kube-system calico-node-htvdv 1/1 Running 6 15m
kube-system coredns-74ff55c5b-lqwpd 1/1 Running 0 17m
kube-system coredns-74ff55c5b-qzc87 1/1 Running 0 17m
kube-system etcd-got-virtualbox 1/1 Running 0 17m
kube-system kube-apiserver-got-virtualbox 1/1 Running 0 17m
kube-system kube-controller-manager-got-virtualbox 1/1 Running 0 18m
kube-system kube-proxy-6xr5j 1/1 Running 0 17m
kube-system kube-scheduler-got-virtualbox 1/1 Running 0 17m
Looks good. If not, check CALICO_IPV4POOL_CIDR by editing the node config: KUBE_EDITOR="nano" kubectl edit -n kube-system ds calico-node
Let's apply the kubernetes-dashboard and start the proxy:
export KUBECONFIG=$HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
kubectl proxy
Now I can load http://127.0.0.1:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
If you are using helm, check that kubectl proxy is running, then go to:
http://localhost:8001/api/v1/namespaces/default/services/https:kubernetes-dashboard:https/proxy
Two tips about the above link:
When you install with helm, the namespace will be /default (not /kubernetes-dashboard).
You need to add https after /https:kubernetes-dashboard:
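The two tips amount to filling in the namespace and the port name in the proxy URL. As a small sketch (dashboard_proxy_url is a hypothetical helper, and it assumes kubectl proxy is listening on its default localhost:8001):

```shell
# Build the kubectl-proxy URL for an HTTPS service.
# Arguments: namespace, service name, port name (may be empty).
dashboard_proxy_url() {
  printf 'http://localhost:8001/api/v1/namespaces/%s/services/https:%s:%s/proxy/\n' \
    "$1" "$2" "$3"
}

# Helm install into the default namespace, service port named "https":
dashboard_proxy_url default kubernetes-dashboard https

# Install from recommended.yaml, namespace kubernetes-dashboard, unnamed port:
dashboard_proxy_url kubernetes-dashboard kubernetes-dashboard ''
```

If the URL still returns "no endpoints available", the namespace or port name is probably not the issue and the pod itself is worth checking.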
A better way is:
helm delete kubernetes-dashboard
kubectl create namespace kubernetes-dashboard
helm install -n kubernetes-dashboard kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard
Then go to:
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:https/proxy
Then you can easily follow creating-sample-user to get a token to log in.
I was facing the same issue, so I followed the official docs and then went to https://github.com/kubernetes/dashboard. There is another way, using helm, at this link: https://artifacthub.io/packages/helm/k8s-dashboard/kubernetes-dashboard
After installing helm, run these two commands:
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard
It worked, but in the default namespace, at this link:
http://localhost:8001/api/v1/namespaces/default/services/https:kubernetes-dashboard:https/proxy/#/workloads?namespace=default

how do i get the minikube nodes in a local cluster

I'm trying to set up a local cluster using a VM and minikube. From what I've been reading, it's only possible to use it for local purposes, but I'd like to join a secondary machine, and I'm looking for a way to create the join command and hash.
You can easily do it in case your minikube machine is using VirtualBox.
Start the minikube:
$ minikube start --vm-driver="virtualbox"
Check the versions of kubeadm, kubelet and kubectl in minikube and print join command:
$ kubectl version
$ minikube ssh
$ kubelet --version
$ kubeadm token create --print-join-command
Create a new VM in VirtualBox. I've used Vagrant to create an Ubuntu 16.04 LTS VM for this test. Check that minikube and the new VM are in the same host-only VM network.
You can use anything that suits you best, but the packages installation procedure would be different for different Linux distributions.
(On the new VM.) Add repository with Kubernetes:
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update
(On the new VM.) Install the same version of kubelet, kubeadm and the other tools on the new VM (1.10.0 in my case):
$ apt-get -y install ebtables ethtool docker.io apt-transport-https kubelet=1.10.0-00 kubeadm=1.10.0-00
(On the new VM.) Use your join command from step 2. The IP address should be from the VM host-only network. Having only NAT networks didn't work well in my case.
$ kubeadm join 192.168.xx.yy:8443 --token asdfasf.laskjflakflsfla --discovery-token-ca-cert-hash sha256:shfkjshkfjhskjfskjdfhksfh...shdfk
(On the main host) Add network solution to the cluster:
$ kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
(On the main host) Check your nodes and pods using kubectl:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready master 1h v1.10.0
ubuntu-xenial Ready <none> 36m v1.10.0
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system calico-etcd-982l8 1/1 Running 0 10m 10.0.2.15 minikube
kube-system calico-kube-controllers-79dccdc4cc-66zxm 1/1 Running 0 10m 10.0.2.15 minikube
kube-system calico-node-9sgt5 1/2 Running 13 10m 10.0.2.15 ubuntu-xenial
kube-system calico-node-qtpg2 2/2 Running 0 10m 10.0.2.15 minikube
kube-system etcd-minikube 1/1 Running 0 1h 10.0.2.15 minikube
kube-system heapster-6hmhs 1/1 Running 0 1h 172.17.0.4 minikube
kube-system influxdb-grafana-69s5s 2/2 Running 0 1h 172.17.0.5 minikube
kube-system kube-addon-manager-minikube 1/1 Running 0 1h 10.0.2.15 minikube
kube-system kube-apiserver-minikube 1/1 Running 0 1h 10.0.2.15 minikube
kube-system kube-controller-manager-minikube 1/1 Running 0 1h 10.0.2.15 minikube
kube-system kube-dns-86f4d74b45-tzc4r 3/3 Running 0 1h 172.17.0.2 minikube
kube-system kube-proxy-vl5mq 1/1 Running 0 1h 10.0.2.15 minikube
kube-system kube-proxy-xhv8s 1/1 Running 2 35m 10.0.2.15 ubuntu-xenial
kube-system kube-scheduler-minikube 1/1 Running 0 1h 10.0.2.15 minikube
kube-system kubernetes-dashboard-5498ccf677-7gf4j 1/1 Running 0 1h 172.17.0.3 minikube
kube-system storage-provisioner 1/1 Running 0 1h 10.0.2.15 minikube
This isn't possible with minikube. With minikube, the operating domain is a single laptop or local machine. You can't join an additional node; you'll need to build a whole cluster using something like kubeadm.

Cannot connect to kubernetes pod from master: i/o timeout

I configured a Kubernetes cluster with one master and one node; the machines that run the master and the node aren't in the same network. For networking I installed Calico, and all the pods are running. To test the cluster I used the get-shell example, and when I run the following command from the master machine:
kubectl exec -it shell-demo -- /bin/bash
I received the error:
Error from server: error dialing backend: dial tcp 10.138.0.2:10250: i/o timeout
The ip 10.138.0.2 is on eth0 interface on the node machine.
What configuration do I need to make to access the pod from master?
EDIT
kubectl get all --all-namespaces -o wide output:
default shell-demo 1/1 Running 0 10s 192.168.4.2 node-1
kube-system calico-node-7wlqw 2/2 Running 0 49m 10.156.0.2 instance-1
kube-system calico-node-lnk6d 2/2 Running 0 35s 10.132.0.2 node-1
kube-system coredns-78fcdf6894-cxgc2 1/1 Running 0 50m 192.168.0.5 instance-1
kube-system coredns-78fcdf6894-gwwjp 1/1 Running 0 50m 192.168.0.4 instance-1
kube-system etcd-instance-1 1/1 Running 0 49m 10.156.0.2 instance-1
kube-system kube-apiserver-instance-1 1/1 Running 0 49m 10.156.0.2 instance-1
kube-system kube-controller-manager-instance-1 1/1 Running 0 49m 10.156.0.2 instance-1
kube-system kube-proxy-b64b5 1/1 Running 0 50m 10.156.0.2 instance-1
kube-system kube-proxy-xxkn4 1/1 Running 0 35s 10.132.0.2 node-1
kube-system kube-scheduler-instance-1 1/1 Running 0 49m 10.156.0.2 instance-1
Thanks!
Before checking the status on the master, please verify the following.
Run the commands below to set SELinux to permissive mode, open the required ports, and load br_netfilter:
setenforce 0
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=10251/tcp
firewall-cmd --permanent --add-port=10252/tcp
firewall-cmd --permanent --add-port=10255/tcp
firewall-cmd --reload
modprobe br_netfilter
echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
Run the above commands on both the master and the worker node.
Then run the command below to check the node status:
kubectl get nodes
I had this issue too. Don't know if you're on Azure, but I am, and I solved this by deleting the tunnelfront pod and letting Kubernetes restart it:
kubectl -n kube-system delete po -l component=tunnel
which is a solution I got from here
We had the same problem, and in the end we found that each host had two NICs with two different IPs, and the routing was also messed up. So when this timeout happens, check your networking setup and make sure your network is healthy; that should give you a good clue.

How to fix weave-net CrashLoopBackOff for the second node?

I have got 2 VMs nodes. Both see each other either by hostname (through /etc/hosts) or by ip address. One has been provisioned with kubeadm as a master. Another as a worker node. Following the instructions (http://kubernetes.io/docs/getting-started-guides/kubeadm/) I have added weave-net. The list of pods looks like the following:
vagrant@vm-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-vm-master 1/1 Running 0 3m
kube-system kube-apiserver-vm-master 1/1 Running 0 5m
kube-system kube-controller-manager-vm-master 1/1 Running 0 4m
kube-system kube-discovery-982812725-x2j8y 1/1 Running 0 4m
kube-system kube-dns-2247936740-5pu0l 3/3 Running 0 4m
kube-system kube-proxy-amd64-ail86 1/1 Running 0 4m
kube-system kube-proxy-amd64-oxxnc 1/1 Running 0 2m
kube-system kube-scheduler-vm-master 1/1 Running 0 4m
kube-system kubernetes-dashboard-1655269645-0swts 1/1 Running 0 4m
kube-system weave-net-7euqt 2/2 Running 0 4m
kube-system weave-net-baao6 1/2 CrashLoopBackOff 2 2m
CrashLoopBackOff appears for each worker node connected. I have spent several hours playing with network interfaces, but it seems the network is fine. I have found a similar question, where the answer advised looking into the logs, with no follow-up. So, here are the logs:
vagrant@vm-master:~$ kubectl logs weave-net-baao6 -c weave --namespace=kube-system
2016-10-05 10:48:01.350290 I | error contacting APIServer: Get https://100.64.0.1:443/api/v1/nodes: dial tcp 100.64.0.1:443: getsockopt: connection refused; trying with blank env vars
2016-10-05 10:48:01.351122 I | error contacting APIServer: Get http://localhost:8080/api: dial tcp [::1]:8080: getsockopt: connection refused
Failed to get peers
What am I doing wrong? Where to go from there?
I ran into the same issue too. It seems weaver wants to connect to the Kubernetes cluster IP address, which is virtual. Just run kubectl get svc to find the cluster IP.
It should give you something like this:
$ kubectl get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 100.64.0.1 <none> 443/TCP 2d
Weaver picks up this IP and tries to connect to it, but the worker nodes do not know anything about it. A simple route will solve this issue. On all your worker nodes, execute:
route add 100.64.0.1 gw <your real master IP>
This happens with a single-node setup, too. I tried several things, like reapplying the configuration and recreating it, but the most stable way at the moment is to perform a full teardown (as described in the docs) and bring the cluster up again.
I use these scripts for relaunching the cluster:
down.sh
#!/bin/bash
systemctl stop kubelet;
docker rm -f -v $(docker ps -q);
find /var/lib/kubelet | xargs -n 1 findmnt -n -t tmpfs -o TARGET -T | uniq | xargs -r umount -v;
rm -r -f /etc/kubernetes /var/lib/kubelet /var/lib/etcd;
up.sh
#!/bin/bash
systemctl start kubelet
kubeadm init
# kubectl taint nodes --all dedicated- # single node!
kubectl create -f https://git.io/weave-kube
edit: I would also give other Pod networks a try, like Calico, if this is a weave related issue
The most common causes for this may be:
- presence of a firewall (e.g. firewalld on CentOS)
- network configuration (e.g. default NAT interface on VirtualBox)
Currently kubeadm is still alpha, and this is one of the issues that has already been reported by many of the alpha testers. We are looking into fixing this by documenting the most common problems; that documentation is going to be ready closer to the beta version.
Right now there is a VirtualBox+Vagrant+Ansible reference implementation for Ubuntu and CentOS that provides solutions for firewall, SELinux and VirtualBox NAT issues.
/usr/local/bin/weave reset
was the fix for me. Hope it's useful. And yes, make sure SELinux is set to disabled
and firewalld is not running (on Red Hat / CentOS releases).
kube-system weave-net-2vlvj 2/2 Running 3 11d
kube-system weave-net-42k6p 1/2 Running 3 11d
kube-system weave-net-wvsk5 2/2 Running 3 11d