Kubernetes v1.6.2 flannel not running on master node - kubernetes

I am running a cluster set up with Vagrant and configured with the kubeadm command (one master and two worker nodes). When I set up the cluster, flannel was running on all three nodes. Now I don't see flannel running on the master node, and because of this the overlay network is not working from the master node.
I used these YAML files to configure flannel:
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# kubectl get pods --all-namespaces -o wide | grep fla
kube-system   kube-flannel-ds-0d3bn   2/2   Running   0   20m   192.168.15.102   node-01
kube-system   kube-flannel-ds-86bzs   2/2   Running   0   20m   192.168.15.103   node-02
# k get nodes -o wide
NAME       STATUS   AGE   VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION
master01   Ready    26d   v1.6.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.10.2.el7.x86_64
node-01    Ready    26d   v1.6.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.10.2.el7.x86_64
node-02    Ready    26d   v1.6.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.10.2.el7.x86_64
How can I start the flannel pod in my master node?

I see you are using RBAC; maybe the node does not have enough rights.
Try creating a clusterrolebinding with the necessary rights:
$ kubectl create clusterrolebinding nodeName --clusterrole=system:node --user=nodeName
Or you can use cluster-admin for testing.
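For the testing route, a binding to cluster-admin might look like this (a sketch; the binding name and the kube-system:flannel service account are assumptions, so match whatever service account your flannel manifest actually creates):
$ kubectl create clusterrolebinding flannel-admin-test --clusterrole=cluster-admin --serviceaccount=kube-system:flannel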

Related

There are two networking components installed on the master node, Weave and Calico. How can I completely remove Calico from my Kubernetes cluster?

Weave overlaps with the host's IP address and its pod is stuck in the CrashLoopBackOff state. Calico needs to be removed first, as I have no idea how two networking modules can work on the master!
emo@master:~$ sudo kubectl get pod -A
NAMESPACE         NAME                              READY   STATUS              RESTARTS        AGE
kube-system       coredns-64897985d-dw6ch           0/1     ContainerCreating   0
kube-system       coredns-64897985d-xr6br           0/1     ContainerCreating   0
kube-system       etcd-master                       1/1     Running             26 (14m ago)
kube-system       kube-apiserver-master             1/1     Running             26 (12m ago)
kube-system       kube-controller-manager-master    1/1     Running             4 (20m ago)
kube-system       kube-proxy-g98ph                  1/1     Running             3 (20m ago)
kube-system       kube-scheduler-master             1/1     Running             4 (20m ago)
kube-system       weave-net-56n8k                   1/2     CrashLoopBackOff    76 (54s ago)
tigera-operator   tigera-operator-b876f5799-sqzf9   1/1     Running             6 (5m57s ago)
master:
emo@master:~$ kubectl get node -o wide
NAME     STATUS   ROLES                  AGE     VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
master   Ready    control-plane,master   6d19h   v1.23.5   192.168.71.132   <none>        Ubuntu 20.04.3 LTS   5.4.0-81-generic   containerd://1.5.5
You may need to rebuild your cluster after cleaning it up.
First, run kubectl delete for all the manifests you applied to configure Calico and Weave (e.g. kubectl delete -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml).
Then run kubeadm reset and delete everything under /etc/cni/net.d/ to remove all of your CNI configurations. After that, you also need to reboot the server to clear some stale ip link records, or remove them manually with ip link delete {name}.
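A minimal cleanup sequence might look like this (a sketch; the interface name is an example, list yours with ip link first):
$ sudo kubeadm reset
$ sudo rm -rf /etc/cni/net.d/
$ ip link show                 # look for leftover cali*, weave or flannel interfaces
$ sudo ip link delete weave    # example name; repeat for each leftover interface
$ sudo reboot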
After this, the new installation should go well.

Rook Ceph broken on Kubernetes?

Using Ceph v1.14.10, Rook v1.3.8 on k8s 1.16 on-premise. After 10 days without any trouble, we decided to drain some nodes. Since then, all the moved pods cannot attach to their PVs any more; it looks like the Ceph cluster is broken:
My ConfigMap rook-ceph-mon-endpoints is referencing 2 missing mon pod IPs:
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.115.0.129:6789","10.115.0.4:6789","10.115.0.132:6789"]}]'
But
kubectl -n rook-ceph get pod -l app=rook-ceph-mon -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP             NODE   NOMINATED NODE   READINESS GATES
rook-ceph-mon-e-56b849775-4g5wg    1/1     Running   0          6h42m   10.115.0.2     XXXX   <none>           <none>
rook-ceph-mon-h-fc486fb5c-8mvng    1/1     Running   0          6h42m   10.115.0.134   XXXX   <none>           <none>
rook-ceph-mon-i-65666fcff4-4ft49   1/1     Running   0          30h     10.115.0.132   XXXX   <none>           <none>
Is this normal, or do I have to run some kind of "reconciliation" task to update the ConfigMap with the new mon pod IPs?
(could be related to https://github.com/rook/rook/issues/2262)
I had to manually update:
secret rook-ceph-config
cm rook-ceph-mon-endpoints
cm rook-ceph-csi-config
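For reference, these can be edited in place like so (assuming the default rook-ceph namespace):
$ kubectl -n rook-ceph edit secret rook-ceph-config
$ kubectl -n rook-ceph edit cm rook-ceph-mon-endpoints
$ kubectl -n rook-ceph edit cm rook-ceph-csi-config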
As @travisn said:
The operator owns updating that configmap and secret. It's not expected to update them manually unless there is some disaster recovery situation as described at https://rook.github.io/docs/rook/v1.4/ceph-disaster-recovery.html.

Inquiring pod and service subnets from inside a Kubernetes cluster

How can one inquire the Kubernetes pod and service subnets in use (e.g. 10.244.0.0/16 and 10.96.0.0/12 respectively) from inside a Kubernetes cluster in a portable and simple way?
For instance, kubectl get cm -n kube-system kubeadm-config -o yaml reports podSubnet and serviceSubnet. But this is not fully portable because a cluster may have been set up by another means than kubeadm.
kubectl get cm -n kube-system kube-proxy -o yaml reports clusterCIDR (i.e. the pod subnet), and kubectl get pod -n kube-system kube-apiserver-master1 -o yaml reports the value passed via the command-line option --service-cluster-ip-range to kube-apiserver (i.e. the service subnet), where master1 stands for the name of any control plane node. But this seems a bit complex.
Is there a better way available e.g. with the Kubernetes 1.17 API?
I don't think it's possible to obtain what you want in a portable and simple way.
If you don't specify the CIDR parameters, defaults will be assigned.
Since there are many ways to run Kubernetes, either unmanaged (kubeadm, minikube, k3s, microk8s) or managed by cloud providers (GKE, Azure, AWS), it's hard to find one way to list all CIDRs in all environments. Kubernetes or CNI versions can be another obstacle.
In the Kubernetes 1.17 release notes you can find the following information:
Deprecate the default service IP CIDR. The previous default was 10.0.0.0/24 which will be removed in 6 months/2 releases. Cluster admins must specify their own desired value, by using --service-cluster-ip-range on kube-apiserver.
As an example with kubeadm: $ kubeadm init --pod-network-cidr 10.100.0.0/12 --service-cidr 10.99.0.0/12
There are a few ways to get this pod and service CIDR:
$ kubectl cluster-info dump | grep -E '(service-cluster-ip-range|cluster-cidr)'
"--service-cluster-ip-range=10.99.0.0/12",
"--cluster-cidr=10.100.0.0/12",
$ kubeadm config view | grep Subnet
podSubnet: 10.100.0.0/12
serviceSubnet: 10.99.0.0/12
But if you check all the pods in this cluster, some pods have addresses starting with 192.168.190.X or 192.168.137.X:
$ kubectl get pods -A -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE    IP                NODE             NOMINATED NODE   READINESS GATES
default       nginx                                      1/1     Running   0          62m    192.168.190.129   kubeadm-worker   <none>           <none>
kube-system   calico-kube-controllers-77c5fc8d7f-9n6m5   1/1     Running   0          118m   192.168.137.66    kubeadm-master   <none>           <none>
kube-system   calico-node-2kx2v                          1/1     Running   0          117m   10.128.0.4        kubeadm-worker   <none>           <none>
kube-system   calico-node-8xqd9                          1/1     Running   0          118m   10.128.0.3        kubeadm-master   <none>           <none>
kube-system   coredns-66bff467f8-sgmkw                   1/1     Running   0          120m   192.168.137.65    kubeadm-master   <none>           <none>
kube-system   coredns-66bff467f8-t84ht                   1/1     Running   0          120m   192.168.137.67    kubeadm-master   <none>           <none>
If you describe any of the CNI pods, you can find yet another CIDR:
CALICO_IPV4POOL_CIDR: 192.168.0.0/16
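For instance, that value can be pulled from the container environment shown by kubectl describe (the pod name is taken from the listing above):
$ kubectl -n kube-system describe pod calico-node-2kx2v | grep CALICO_IPV4POOL_CIDR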
For a GKE example you will have:
Node CIDRs:
$ kubectl describe node | grep CIDRs
PodCIDRs: 10.52.1.0/24
PodCIDRs: 10.52.0.0/24
PodCIDRs: 10.52.2.0/24
$ gcloud container clusters describe cluster-2 --zone=europe-west2-b | grep Cidr
clusterIpv4Cidr: 10.52.0.0/14
clusterIpv4Cidr: 10.52.0.0/14
clusterIpv4CidrBlock: 10.52.0.0/14
servicesIpv4Cidr: 10.116.0.0/20
servicesIpv4CidrBlock: 10.116.0.0/20
podIpv4CidrSize: 24
servicesIpv4Cidr: 10.116.0.0/20
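One trick worth mentioning: the API server rejects a Service whose clusterIP lies outside the configured range and names the valid range in the error message, so the service CIDR can be probed from inside any cluster (this abuses validation rather than a real API, and the exact error wording may vary between versions):
$ echo '{"apiVersion":"v1","kind":"Service","metadata":{"name":"probe"},"spec":{"clusterIP":"1.1.1.1","ports":[{"port":443}]}}' | kubectl apply -f - 2>&1
The Service "probe" is invalid: spec.clusterIP: Invalid value: "1.1.1.1": provided IP is not in the valid range. The range of valid IPs is 10.96.0.0/12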
Honestly I don't think there is an easy and portable way to list all podCidrs and serviceCidrs in one simple command.

unable to upgrade connection: pod does not exist

I get an error when I try to access my pod:
error: unable to upgrade connection: pod does not exist
It's a cluster with 3 nodes; some details below. Thanks in advance.
root@kubm:~/deploy/nginx# kubectl get nodes -o wide
NAME       STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
kubm       Ready    master   37h   v1.17.0   10.0.2.15     <none>        Ubuntu 16.04.6 LTS   4.4.0-150-generic   docker://19.3.5
kubnode    Ready    <none>   37h   v1.17.0   10.0.2.15     <none>        Ubuntu 16.04.6 LTS   4.4.0-150-generic   docker://19.3.5
kubnode2   Ready    <none>   37h   v1.17.0   10.0.2.15     <none>        Ubuntu 16.04.6 LTS   4.4.0-150-generic   docker://19.3.5
root@kubm:~/deploy/nginx# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
nginx-59c9f8dff-v7dvg   1/1     Running   0          16h   10.244.2.3   kubnode2   <none>           <none>
root@kubm:~/deploy/nginx# kubectl exec -it nginx-59c9f8dff-v7dvg -- /bin/bash
error: unable to upgrade connection: pod does not exist
I had the same issue the first time I ran a cluster with Vagrant and VirtualBox.
Adding KUBELET_EXTRA_ARGS=--node-ip=x.x.x.x, where x.x.x.x is your VM's IP, to /etc/default/kubelet (this can be part of the provisioning script, for example) and then restarting kubelet (systemctl restart kubelet) fixes the issue.
This is the recommended way to add extra runtime arguments to kubelet, as you can see in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf. Alternatively, you can also edit the kubelet config file under /etc/kubernetes/kubelet.conf.
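For example, on each VM (the IP below is illustrative; use the VM's host-only address):
$ echo 'KUBELET_EXTRA_ARGS=--node-ip=192.168.56.10' | sudo tee /etc/default/kubelet
$ sudo systemctl restart kubelet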
The 10.0.2.15 IP address is the default for VirtualBox NAT.
If you deploy a VM using a Vagrantfile, your eth0 adapter will use the 10.0.2.15 IP address and the eth1 adapter will be assigned another IP address.
K8s uses the eth0 adapter to route packets between pods.
I had the same issue, and the problem was the pod status "ImagePullBackOff". Because of this, it was throwing the error:
error: unable to upgrade connection: container not found ("nginx")
[mayur@mayur_cloudtest ~]$ kubectl get pods
NAME                     READY   STATUS             RESTARTS   AGE
nginx-598b589c46-zcr5d   0/1     ImagePullBackOff   0          116s
[mayur@mayur_cloudtest ~]$ kubectl exec -it nginx-598b589c46-zcr5d -- /bin/bash
error: unable to upgrade connection: container not found ("nginx")
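To see why the pull fails, check the pod events (the pod name is taken from the listing above; the Events section at the end of the describe output contains the pull error):
$ kubectl describe pod nginx-598b589c46-zcr5d | tail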
I use the command below to get into a pod:
kubectl exec -i -t <pod-name> -- /bin/bash
Note that the -i and -t flags are written separately, with a space between them.
If you have a multi-container pod, you should pass the container name with the -c flag, or it will connect to the first container in the pod by default.
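For example, with a hypothetical pod that has an app container and a sidecar container:
$ kubectl exec -i -t mypod -c sidecar -- /bin/bash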
ubuntu@cluster-master:~$ kubectl exec -i -t nginx -- /bin/bash
root@nginx:/# whoami
root
root@nginx:/# date
Tue Jan 7 14:12:29 UTC 2020
root@nginx:/#
Refer to the help for the command: kubectl exec --help

How do I get the minikube nodes in a local cluster

I'm trying to set up a local cluster using a VM and minikube. As I'd been reading, it's only possible to use it for local purposes, but I'd like to join a secondary machine, and I'm searching for a way to create the join command and hash.
You can easily do it if your minikube machine is using VirtualBox.
Start the minikube:
$ minikube start --vm-driver="virtualbox"
Check the versions of kubeadm, kubelet and kubectl in minikube and print the join command:
$ kubectl version
$ minikube ssh
$ kubelet --version
$ kubeadm token create --print-join-command
Create a new VM in VirtualBox. I've used Vagrant to create an Ubuntu 16.04 LTS VM for this test. Check that the minikube VM and the new VM are on the same host-only network.
You can use anything that suits you best, but the package installation procedure will differ between Linux distributions.
(On the new VM.) Add repository with Kubernetes:
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update
(On the new VM.) Install the same versions of kubelet, kubeadm and the other tools on the new VM (1.10.0 in my case):
$ apt-get -y install ebtables ethtool docker.io apt-transport-https kubelet=1.10.0-00 kubeadm=1.10.0-00
(On the new VM.) Use your join command from step 2. The IP address should be from the VM host-only network. Having only NAT networks didn't work well in my case.
$ kubeadm join 192.168.xx.yy:8443 --token asdfasf.laskjflakflsfla --discovery-token-ca-cert-hash sha256:shfkjshkfjhskjfskjdfhksfh...shdfk
(On the main host.) Add a network solution to the cluster:
$ kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
(On the main host.) Check your nodes and pods using kubectl:
$ kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
minikube        Ready    master   1h    v1.10.0
ubuntu-xenial   Ready    <none>   36m   v1.10.0
$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE   IP           NODE
kube-system   calico-etcd-982l8                          1/1     Running   0          10m   10.0.2.15    minikube
kube-system   calico-kube-controllers-79dccdc4cc-66zxm   1/1     Running   0          10m   10.0.2.15    minikube
kube-system   calico-node-9sgt5                          1/2     Running   13         10m   10.0.2.15    ubuntu-xenial
kube-system   calico-node-qtpg2                          2/2     Running   0          10m   10.0.2.15    minikube
kube-system   etcd-minikube                              1/1     Running   0          1h    10.0.2.15    minikube
kube-system   heapster-6hmhs                             1/1     Running   0          1h    172.17.0.4   minikube
kube-system   influxdb-grafana-69s5s                     2/2     Running   0          1h    172.17.0.5   minikube
kube-system   kube-addon-manager-minikube                1/1     Running   0          1h    10.0.2.15    minikube
kube-system   kube-apiserver-minikube                    1/1     Running   0          1h    10.0.2.15    minikube
kube-system   kube-controller-manager-minikube           1/1     Running   0          1h    10.0.2.15    minikube
kube-system   kube-dns-86f4d74b45-tzc4r                  3/3     Running   0          1h    172.17.0.2   minikube
kube-system   kube-proxy-vl5mq                           1/1     Running   0          1h    10.0.2.15    minikube
kube-system   kube-proxy-xhv8s                           1/1     Running   2          35m   10.0.2.15    ubuntu-xenial
kube-system   kube-scheduler-minikube                    1/1     Running   0          1h    10.0.2.15    minikube
kube-system   kubernetes-dashboard-5498ccf677-7gf4j      1/1     Running   0          1h    172.17.0.3   minikube
kube-system   storage-provisioner                        1/1     Running   0          1h    10.0.2.15    minikube
This isn't possible with minikube. With minikube, the operating domain is a single laptop or local machine. You can't join an additional node; you'll need to build a whole cluster using something like kubeadm.
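If you do need a real multi-node cluster, a minimal kubeadm bootstrap looks roughly like this (a sketch; the CIDR, addresses and token are placeholders):
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16    # on the control-plane machine
$ kubectl apply -f <your-cni-manifest>.yaml             # install a CNI plugin of your choice
$ sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>    # on each worker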