Kubectl connectivity issue - kubernetes

I installed first ectd, kubeapiserver and kubelet using systemd service. The services are running fine and listening to all required ports.
When I run kubectl cluster-info , I get below output
Kubernetes master is running at http://localhost:8080
When I run kubectl get componentstatuses, then I get below output
etcd-0 Healthy {"health": "true"}
But running kubectl get nodes , I get below error
Error from server (ServerTimeout): the server cannot complete the requested operation at this time, try again later (get nodes)
Can anybody help me out on this.

For the message:
:~# k get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0 Healthy {"health":"true"}
--------
Modify the following files on all master nodes:
$ sudo vim /etc/kubernetes/manifests/kube-scheduler.yaml
Comment or delete the line:
- --port=0
in (spec->containers->command->kube-scheduler)
$ sudo vim /etc/kubernetes/manifests/kube-controller-manager.yaml
Comment or delete the line:
- --port=0
in (spec->containers->command->kube-controller-manager)
Then restart kubelet service:
$ sudo systemctl restart kubelet.service

Your missing kubeconfig file. kubectl looks config file in this location $HOME/.kube/config
Part of install you can copy config file like this on master node.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

What is the status of controller manager and scheduler. Do you see them listed as Healthy when you run the below command
kubectl get cs

Related

Flannel is crashing for Slave node

I am getting this result for flannel service on my slave node. Flannel is running fine on master node.
kube-system kube-flannel-ds-amd64-xbtrf 0/1 CrashLoopBackOff 4 3m5s
Kube-proxy running on the slave is fine but not the flannel pod.
I have a master and a slave node only. At first its say running, then it goes to error and finally, crashloopbackoff.
godfrey#master:~$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-amd64-jszwx 0/1 CrashLoopBackOff 4 2m17s 192.168.152.104 slave3 <none> <none>
kube-system kube-proxy-hxs6m 1/1 Running 0 18m 192.168.152.104 slave3 <none> <none>
I am also getting this from the logs:
I0515 05:14:53.975822 1 main.go:390] Found network config - Backend type: vxlan
I0515 05:14:53.975856 1 vxlan.go:121] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
E0515 05:14:53.976072 1 main.go:291] Error registering network: failed to acquire lease: node "slave3" pod cidr not assigned
I0515 05:14:53.976154 1 main.go:370] Stopping shutdownHandler...
I could not find a solution so far. Help appreciated.
As solution came from OP, I'm posting answer as community wiki.
As reported by OP in the comments, he didn't passed the podCIDR during kubeadm init.
The following command was used to see that the flannel pod was in "CrashLoopBackoff" state:
sudo kubectl get pods --all-namespaces -o wide
To confirm that podCIDR was not passed to flannel pod kube-flannel-ds-amd64-ksmmh that was in CrashLoopBackoff state.
$ kubectl logs kube-flannel-ds-amd64-ksmmh
kubeadm init --pod-network-cidr=172.168.10.0/24 didn't pass the podCIDR to the slave nodes as expected.
Hence to solve the problem, kubectl patch node slave1 -p '{"spec":{"podCIDR":"172.168.10.0/24"}}' command had to be used to pass podCIDR to each slave node.
Please see this link: coreos.com/flannel/docs/latest/troubleshooting.html and section "Kubernetes Specific"
The described cluster configuration doesn't look correct in two aspects:
First of all, PodCIDR reasonable minimum subnet size is /16. Each Kubernetes node usually gets /24 subnet because it can run up to 100 pods.
PodCIDR and ServicesCIDR (default: "10.96.0.0/12") must not interfere with your existing LAN network and with each other.
So, correct kubeadm command would look like:
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
In your case PodCIDR subnet is only /24 and it was assigned to master node. Slave node didn't get its own /24 subnet, so Flannel Pod showed the error in the logs:
Error registering network: failed to acquire lease: node "slave3" pod cidr not assigned
Assigning the same subnet to several nodes manually will lead to the other connectivity problems.
You can find more details on Kubernetes IP subnets in GKE documentation.
The second problem is the IP subnet number.
Recent Calico network addon versions are able to detect the correct Pod subnet based on kubeadm parameter --pod-network-cidr. Older version was using predefined subnet 192.168.0.0/16 and you had to adjust it in its YAML file in the Deaemonset specification :
- name: CALICO_IPV4POOL_CIDR
value: "192.168.0.0/16"
Flannel is still requires default subnet ( 10.244.0.0/16 ) to be specified for kubeadm init.
To use custom subnet for your cluster, Flannel "installation" YAML file should be adjusted before applying to the cluster.
...
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
...
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
...
So the following should work for any version of Kubernetes and Calico:
$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Latest Calico version
$ kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
# or specific version, v3.14 in this case, which is also latest at the moment
# kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
Same for Flannel:
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# For Kubernetes v1.7+
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# For older versions of Kubernetes:
# For RBAC enabled clusters:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-legacy.yml
$
There are many other network addons. You can find the list in the documentation:
Cluster Networking
Installing Addons

Can not login to kubernetes dashboard dial tcp 172.17.0.6:8443: connect: connection refused

I deployed Kubernetes v1.15.2 dashboard successfully. Checking the cluster info:
$ kubectl cluster-info
Kubernetes master is running at http://172.19.104.231:8080
kubernetes-dashboard is running at http://172.19.104.231:8080/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
When I access the dashboard, the result is:
[root#ops001 ~]# curl -L http://172.19.104.231:8080/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Error: 'dial tcp 172.17.0.6:8443: connect: connection refused'
Trying to reach: 'https://172.17.0.6:8443/'
This is dashboard status:
[root#ops001 ~]# kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-74d7cc788-mk9c7 1/1 Running 0 92m
What should I do to access the dashboard? When I using proxy to access dashboard UI:
$ kubectl proxy --address='localhost' --port=8086 --accept-hosts='^*$'
Starting to serve on 127.0.0.1:8086
the result is:
[root#ops001 ~]# curl -L http://127.0.0.1:8086/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Error: 'dial tcp 172.17.0.6:8443: connect: connection refused'
Trying to reach: 'https://172.17.0.6:8443/'
What should I do to fix this problem?
I finally find the problem is kubernetes dashboard pod container does not communicate with the proxy nginx container. Because the proxy container is deployed before kubernetes flannel,and not in the same network.Trying to add proxy nginx container to flannel network would solve the problem.Check current flannel network:
[root#ops001 conf.d]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.30.0.0/16
FLANNEL_SUBNET=172.30.224.1/21
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
generate start parameter of docker:
./mk-docker-opts.sh -d /run/docker_opts.env -c
check the parameter:
[root#ops001 conf.d] cat /run/docker_opts.env
DOCKER_OPTS=" --bip=172.30.224.1/21 --ip-masq=false --mtu=1450"
add paramter to docker service:
# vim /lib/systemd/system/docker.service
EnvironmentFile=/run/docker_opts.env
ExecStart=/usr/bin/dockerd $DOCKER_OPTS -H fd://
restart docker and the container would join the flannel network,could communicate with each other:
systemctl daemon-reload
systemctl restart docker
hope this may help you!

Failed to install rook on k8s cluster

I am trying to create a rook cluster inside k8s cluster.
Set up - 1 master node, 1 worker node
These are the steps I have followed
Master node:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
sudo sysctl net.bridge.bridge-nf-call-iptables=1
sudo sysctl net.bridge.bridge-nf-call-ip6tables=1
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/32a765fd19ba45b387fdc5e3812c41fff47cfd55/Documentation/kube-flannel.yml
kubeadm token create --print-join-command
Worker node:
kubeadm join {master_ip_address}:6443 --token {token} --discovery-token-ca-cert-hash {hash} --apiserver-advertise-address={worker_private_ip}
Master node - Install rook - (reference - https://rook.github.io/docs/rook/master/ceph-quickstart.html):
kubectl create -f ceph/common.yaml
kubectl create -f ceph/operator.yaml
kubectl create -f ceph/cluster-test.yaml
Error while creating rook-ceph-operator pod:
(combined from similar events): Failed create pod sandbox: rpc error: code =
Unknown desc = failed to set up sandbox container "4a901f12e5af5340f2cc48a976e10e5c310c01a05a4a47371f766a1a166c304f"
network for pod "rook-ceph-operator-fdfbcc5c5-jccc9": networkPlugin cni failed to
set up pod "rook-ceph-operator-fdfbcc5c5-jccc9_rook-ceph" network: failed to set bridge addr:
"cni0" already has an IP address different from 10.244.1.1/24
Can anybody help me with this issue?
This issue start if you did kubeadm reset and after that kubeadm init reinitialize Kubernetes.
kubeadm reset
systemctl stop kubelet
systemctl stop docker
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
After this start docker and kubelet and kubeadm again.
Work around
You can also try this way as simple easy solution
ip link delete cni0
ip link delete flannel.1
that depends on which network you are using inside k8s.

About : CreateContainerError

i installed K8S cluster in my laptop, it was running fine in the beginning but when i restarted my laptop then some services were not running.
kube-system coredns-5c98db65d4-9nm6m 0/1 Error 594 12d
kube-system coredns-5c98db65d4-qwkk9 0/1 CreateContainerError
kube-system kube-scheduler-kubemaster 0/1 CreateContainerError
I searched online for solution but could not get appropriate answer ,
please help me resolve this issue
I encourage you to look for official kubernetes documentation. Remember that your kubemaster should have at least fallowing resources: 2CPUs or more, 2GB or more of RAM.
Firstly install docker and kubeadm (as a root user) on each machine.
Initialize kubeadm (on master):
kubeadm init <args>
For example for Calico to work correctly, you need to pass --pod-network-cidr=192.168.0.0/16 to kubeadm init:
kubeadm init --pod-network-cidr=192.168.0.0/16
Install a pod network add-on (depends on what you would like to use). You can install a pod network add-on with the following command:
kubectl apply -f <add-on.yaml>
e.g. for Calico:
kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml
To start using your cluster, you need to run on master the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You can now join any number of machines by running the following on each node as root:
kubeadm join <master-ip>:<master-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>
By default, tokens expire after 24 hours. If you are joining a node to the cluster after the current token has expired, you can create a new token by running the following command on the control-plane node:
kubeadm token create
Please, let me know if it works for you.
Did you check the status of docker and kubelet services.? if not, please run below commands and verify that services are up and running.
systemctl status docker kubelet

Unable to start MiniKube: Stuck on Starting Cluster Components

I am new to minikube. I followed the below steps to install minikube on oracle linux 7.5 (kernel 3.10.0-327.28.3.el7.x86_64)
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && sudo install minikube-linux-amd64 /usr/local/bin/minikube
After installing i ran the minikube
sudo minikube start --vm-driver=none
Minikube is falied on
sudo minikube start --vm-driver=none
Starting local Kubernetes v1.12.4 cluster...
Starting VM...
Waiting for SSH to be available...
Detecting the provisioner...
Setting Docker configuration on the remote daemon...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Stopping extra container runtimes...
Starting cluster components...
E0105 13:00:41.436961 19330 start.go:343] Error starting cluster: timed out waiting to elevate kube-system RBAC privileges: Temporary Error: creating clusterrolebinding: Post https://192.168.99.100:8443/apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings?timeout=1m0s: net/http: TLS handshake timeout
Temporary Error: creating clusterrolebinding: Post https://192.168.99.100:8443/apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings?timeout=1m0s: net/http: TLS handshake timeout
Temporary Error: creating clusterrolebinding: Post https://192.168.99.100:8443/apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings?timeout=1m0s: net/http: TLS handshake timeout
Temporary Error: creating clusterrolebinding: Post https://192.168.99.100:8443/apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings?timeout=1m0s: net/http: TLS handshake timeout
Temporary Error: creating clusterrolebinding: Post https://192.168.99.100:8443/apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings?timeout=1m0s: net/http: TLS handshake timeout
I checked logs also and found that it is stuck in some loop and retrying but i am unable to understand
I0105 12:24:24.522907 19330 utils.go:224] > Your Kubernetes master has initialized successfully!
I0105 12:24:24.522916 19330 utils.go:224] > To start using your cluster, you need to run the following as a regular user:
I0105 12:24:24.522919 19330 utils.go:224] > mkdir -p $HOME/.kube
I0105 12:24:24.522925 19330 utils.go:224] > sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
I0105 12:24:24.522929 19330 utils.go:224] > sudo chown $(id -u):$(id -g) $HOME/.kube/config
I0105 12:24:24.522934 19330 utils.go:224] > You should now deploy a pod network to the cluster.
I0105 12:24:24.522944 19330 utils.go:224] > Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
I0105 12:24:24.522950 19330 utils.go:224] > https://kubernetes.io/docs/concepts/cluster-administration/addons/
I0105 12:24:24.522957 19330 utils.go:224] > You can now join any number of machines by running the following on each node
I0105 12:24:24.522959 19330 utils.go:224] > as root:
I0105 12:24:24.522972 19330 utils.go:224] > kubeadm join localhost:8443 --token 5apexw.uv7nfpirz4on2e33 --discovery-token-ca-cert-hash sha256:6dcf73220b8bc229269bb8c6a350592fe6b0cd068ef8f336163cc5b3a384990e
I0105 12:24:45.792762 19330 utils.go:117] sleeping 500ms
I0105 12:24:46.292883 19330 utils.go:106] retry loop 1
I0105 12:25:07.561761 19330 utils.go:117] sleeping 500ms
I0105 12:25:08.062042 19330 utils.go:106] retry loop 2
I0105 12:25:29.330434 19330 utils.go:117] sleeping 500ms
I0105 12:25:29.830625 19330 utils.go:106] retry loop 3
I0105 12:25:51.096835 19330 utils.go:117] sleeping 500ms
I0105 12:25:51.597099 19330 utils.go:106] retry loop 4
I suppose you did it in AWS.
Remove all and recreate from the scratch. Just reproduced and all works fine.
Remove all:
minikube delete
rm -rf ~/.minikube
My steps from the begging(under root):
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && sudo install minikube-linux-amd64 /usr/local/bin/minikube
sudo yum install docker-engine
systemctl enable docker.service
systemctl start docker.service
minikube start --vm-driver=none
Result:
======================================== Starting local Kubernetes v1.12.4 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Stopping extra container runtimes...
Starting cluster components...
Verifying kubelet health ...
Verifying apiserver health ...Kubectl is now configured to use the
cluster.
===================
WARNING: IT IS RECOMMENDED NOT TO RUN THE NONE DRIVER ON PERSONAL
WORKSTATIONS The 'none' driver will run an insecure kubernetes
apiserver as root that may leave the host vulnerable to CSRF attacks
When using the none driver, the kubectl config and credentials
generated will be root owned and will appear in the root home
directory. You will need to move the files to the appropriate location
and then set the correct permissions. An example of this is below:
sudo mv /root/.kube $HOME/.kube # this will write over any previous configuration
sudo chown -R $USER $HOME/.kube
sudo chgrp -R $USER $HOME/.kube
sudo mv /root/.minikube $HOME/.minikube # this will write over any previous configuration
sudo chown -R $USER $HOME/.minikube
sudo chgrp -R $USER $HOME/.minikube
This can also be done automatically by setting the env var
CHANGE_MINIKUBE_NONE_USER=true Loading cached images from config file.
Everything looks great. Please enjoy minikube!