Unable to connect to k8s cluster using master/worker IP - kubernetes

I am trying to install a Kubernetes cluster with one master node and two worker nodes.
I acquired 3 VMs for this purpose, all running Ubuntu 21.10. On the master node, I installed kubeadm 1.21.4, kubectl 1.21.4, kubelet 1.21.4, and docker-ce 20.4.
I followed this guide to install the cluster. The only difference was that my init command did not include --control-plane-endpoint. I used the Calico CNI v3.19.1 and Docker as the CRI runtime.
After I installed the cluster, I deployed a minio pod and exposed it as a NodePort service.
The pod got deployed on the worker node (10.72.12.52); my master node IP is 10.72.12.51.
For the first two hours, I was able to access the login page via all three IPs (10.72.12.51:30981, 10.72.12.52:30981, 10.72.13.53:30981). However, after two hours, I lost access to the service via 10.72.12.51:30981 and 10.72.13.53:30981. Now I am only able to access the service from the node on which the pod is running (10.72.12.52).
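For reference, the pod was exposed and checked roughly like this (a sketch; the deployment name and container port are illustrative, and 30981 is the NodePort the cluster assigned):
kubectl expose deployment minio --type=NodePort --port=9000   # MinIO API port
kubectl get svc minio        # confirm the assigned NodePort (30981 here)
kubectl get pods -o wide     # confirm which node the pod landed on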
I have disabled the firewall and added a calico.conf file inside /etc/NetworkManager/conf.d with the following content:
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico
What am I missing in the setup that might cause this issue?

This is a community wiki answer posted for better visibility. Feel free to expand it.
As mentioned by @AbhinavSharma, the problem was solved by switching from Calico to the Flannel CNI.
More information regarding Flannel itself can be found here.
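For example, the switch itself boils down to removing the Calico manifest and applying the Flannel one (a sketch; use whatever file or URL Calico was originally installed from, and check the Flannel README for the current manifest URL — Flannel also assumes the cluster was initialized with --pod-network-cidr=10.244.0.0/16 by default):
kubectl delete -f calico.yaml
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml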

Related

Kubernetes - Calico CrashLoopBack on Containers

I just started experimenting with K8s a few days back, trying to learn K8s with a specific emphasis on networking, service mesh, etc.
I am running 2 worker nodes and 1 master on VMs with CentOS 7 and K8s, installed with kubeadm.
The default CNI is Flannel. The install was OK and everything except the networking was working. I could deploy containers etc., so a lot of the control plane was working.
However, networking was not working correctly, even container to container on the same worker node. I checked all the usual suspects on a single worker: the veths, IPs, MACs, bridges, and everything seemed to check out, e.g. the MACs were on the correct bridges (cni0), IP address assignments, etc. Even when pinging from busybox to busybox, I would see the ARP caches being populated but the pings still did not work. I disabled all firewalls, enabled IP forwarding, etc. I am not an expert on iptables but the rules looked OK. Also, when logged into the worker node shell I could ping the busybox containers, but they could not ping each other.
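For context, the checks described above were roughly of this kind (a sketch of typical commands; pod names and addresses are illustrative):
ip addr show cni0                                # bridge created by the CNI config
bridge link                                      # which veths are attached to which bridge
ip route                                         # routes for the pod subnets
kubectl exec busybox-1 -- ping -c 3 10.244.1.5   # pod-to-pod ping
kubectl exec busybox-1 -- arp -a                 # ARP cache inside the pod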
One question I have at this point: why is the docker0 bridge still present even when Flannel is installed? Can I delete it, or are there some dependencies associated with it? I did notice the veths for the containers appeared to be connected to the docker0 bridge, but docker0 was down... however, I followed this website and it shows a different way of validating, with the veths connected to cni0, which is very confusing and frustrating.
I gave up on Flannel, as I was only using it to experiment, and decided to try out Calico.
I followed the install procedures from the Calico site... I was not entirely clear on the tidy-up procedure for Flannel (not sure where this is documented?), and this is where it went from bad to worse.
I started getting crash loops on the Calico containers, and coredns was stuck in ContainerCreating, reporting liveness issues on Calico, and this is where I am stuck and would like some help.
I have read and tried many things on the web and may have fixed some issues, as there may be many in play, but I would really appreciate any help.
=== install info and some output...
[screenshots: install info and command output]
Some questions...
The ContainerCreating state for coredns: is this dependent on a successful install of Calico? Are the issues related, or should the coredns install work independently of the CNI install?
Yes, it is. You need to install a CNI to have coredns working.
When you set up your cluster with kubeadm, there is a flag called --pod-network-cidr; depending on which CNI you intend to use, you need to specify the range with this flag.
For example, by default Calico uses the range 192.168.0.0/16 and Flannel uses the range 10.244.0.0/16.
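A minimal init for each case might look like this (a sketch; choose the CIDR that matches the CNI you plan to install):
kubeadm init --pod-network-cidr=192.168.0.0/16   # matches Calico's default pool
kubeadm init --pod-network-cidr=10.244.0.0/16    # matches Flannel's default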
I have a guide on how to set up a cluster using kubeadm; maybe it helps you.
Please note, if you want to replace the CNI without deleting the entire cluster, extra steps need to be taken in order to clean up the firewall rules from the older CNI.
See here how to replace Flannel with Calico, for example.
And here how to migrate from Flannel to Calico.

Facing issue while trying to install Kubernetes [ubuntu18 image on openstack]

Since yesterday, I have been struggling with this strange issue: node "kmaster" not found.
I tried multiple combinations of installing Kubernetes on a Jetstream instance:
using Calico on Ubuntu
using Flannel on CentOS
and a few other ways
I looked online and found many people have the same issue:
https://github.com/kubernetes/kubernetes/issues/61277
If someone has run into a similar issue, please let me know what steps need to be taken to resolve it.
Thanks.
I would recommend bootstrapping the Kubernetes cluster from scratch, and I will share some helpful links with steps on how to proceed:
Kubernetes cluster install on Ubuntu with Calico CNI
Kubernetes cluster install on Centos with Flannel CNI
Keep in mind to fulfill the system requirements before you start the kubeadm installation procedure.
You can also take a look at the general kubeadm installation or runtime troubleshooting guide.
I have found my solution for this. I was having issues running the Kubernetes cluster because the Kubernetes components were distributed across multiple servers. Once I created the master node and the slave (worker) node on the same machine, the issue got resolved.
The steps that I took to resolve the issue (a command sketch follows below):
1. On the slave/worker machine, run this command: kubeadm reset
2. On the master node, generate a join token by running this command: kubeadm token create
3. Use the token generated on the master machine on the slave node, so that the worker machine can join the Kubernetes cluster.
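A sketch of those steps as actual commands (the master address, token, and hash are placeholders; kubeadm token create --print-join-command prints the exact join command to run):
# on the worker
sudo kubeadm reset
# on the master
kubeadm token create --print-join-command
# on the worker, run the printed command, e.g.
sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>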
Cheers!!

Kubernetes double IP in different nodes

After using K8s on GKE for a couple of months, I've decided to install my own cluster. Now I have two Ubuntu VMs; one of them is the kube-master and the second one is a node. The cluster runs as I expect and I can see the nodes (kube-master is also a node) when I run kubectl get nodes. I've launched one pod on each VM, but I'm experiencing an issue: both pods have the same IP. Can anybody help me resolve this issue? I'm using Flannel as the network plugin atm.
Thanks in advance
Update
I've found the solution, thanks to the Kubernetes group on Slack. I hadn't installed the CNI plugin, so the kubelet didn't know the subnetwork status. I installed the plugin using this guide and made a configuration file following it. After restarting the kubelet service, I saw the cluster working as I expected.
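Roughly what that amounted to, for anyone hitting the same thing (a sketch; these are the conventional CNI paths, and the exact config file depends on the guide followed):
ls /opt/cni/bin         # CNI plugin binaries should be here
ls /etc/cni/net.d       # the network configuration file the kubelet reads
sudo systemctl restart kubelet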

kubeadm init on CentOS 7 using AWS as cloud provider enters a deadlock state

I am trying to install Kubernetes 1.4 on a CentOS 7 cluster on AWS (the same happens with Ubuntu 16.04, though) using the new kubeadm tool.
Here's the output of the command kubeadm init --cloud-provider aws on the master node:
# kubeadm init --cloud-provider aws
<cmd/init> cloud provider "aws" initialized for the control plane. Remember to set the same cloud provider flag on the kubelet.
<master/tokens> generated token: "980532.888de26b1ef9caa3"
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
The issue is that the control plane does not become ready and the command seems to enter a deadlock state. I also noticed that if the --cloud-provider flag is not provided, pulling images from Amazon EC2 Container Registry does not work, and when creating a service with type LoadBalancer an Elastic Load Balancer is not created.
Has anyone run kubeadm using aws as cloud provider?
Let me know if any further information is needed.
Thanks!
I launched a cluster with kubeadm on AWS recently (Kubernetes 1.5.1), and it got stuck on the same step as yours does. To solve it I had to add "--api-advertise-addresses=LOCAL-EC2-IP"; it didn't work with the external IP (which kubeadm probably fetches itself when no other IP is specified). So it's either a network connectivity issue (try temporarily a 0.0.0.0/0 security group rule on that master instance), or something else... In my case it was a network issue: it wasn't able to connect to itself using its own external IP :)
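In other words, something like this on the master (a sketch; the flag name is from the kubeadm of that era — newer kubeadm calls it --apiserver-advertise-address — and the address is the instance's private EC2 IP):
kubeadm init --cloud-provider aws --api-advertise-addresses=172.31.x.x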
Regarding the PV and ELB integrations, I actually did launch a PersistentVolumeClaim with my MongoDB cluster and it works (it created the volume and attached it to one of the slave nodes).
Here it is, for example:
[screenshot: PV created and attached to a slave node]
So the latest version of kubeadm that ships with Kubernetes 1.5.1 should work for you too!
One thing to note: you must have the proper IAM role permissions to create resources (assign your master node an IAM role with something like "EC2 full access" during testing; you can tune it later to allow only the few needed actions).
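For example, during testing the broad policy can be attached with the AWS CLI (a sketch; the role name is a placeholder and should be narrowed down later):
aws iam attach-role-policy --role-name k8s-master-role --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess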
Hope it helps.
The documentation (as of now) clearly states the following in the limitations:
The cluster created here doesn’t have cloud-provider integrations, so for example won’t work with (for example) Load Balancers (LBs) or Persistent Volumes (PVs). To easily obtain a cluster which works with LBs and PVs Kubernetes, try the “hello world” GKE tutorial or one of the other cloud-specific installation tutorials.
http://kubernetes.io/docs/getting-started-guides/kubeadm/
There are a couple of possibilities I am aware of here:
1) In older kubeadm versions SELinux blocks access at this point.
2) If you are behind a proxy you will need to add the usual variables to the kubeadm environment:
HTTP_PROXY
HTTPS_PROXY
NO_PROXY
Plus, which I have not seen documented anywhere:
KUBERNETES_HTTP_PROXY
KUBERNETES_HTTPS_PROXY
KUBERNETES_NO_PROXY
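For example, one way to pass these through before running kubeadm (a sketch; the proxy address is a placeholder, and the NO_PROXY list usually needs to include the node, service, and pod CIDRs):
export HTTP_PROXY=http://proxy.example.com:3128
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=127.0.0.1,localhost,10.96.0.0/12,10.244.0.0/16
export KUBERNETES_HTTP_PROXY=$HTTP_PROXY
export KUBERNETES_HTTPS_PROXY=$HTTPS_PROXY
export KUBERNETES_NO_PROXY=$NO_PROXY
kubeadm init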

Orchestrating containers

I'm trying to use Kubernetes to deploy Docker containers and I found this tutorial.
So according to this tutorial, what are the prerequisites?
It says that "services that are typically on a separate Kubernetes master system and two or more Kubernetes node systems are all running on a single system."
But I don't understand how we run both the master and the nodes on a single system (for example, I have one EC2 instance with IP address 52.192.x.x).
That is a guide about running Kubernetes specifically on RedHat Atomic nodes. There are lots of guides about running Kubernetes on other types of nodes; see the Creating a Kubernetes Cluster page on docs.k8s.io.
One of the guides on the Kubernetes site shows how to run a local docker-based cluster, which should also work for you on a single node in the cloud.