Steps of Kubernetes CNI when using Flannel

I have been setting up Kubernetes with kubeadm and used Flannel to set up the pod network. The setup basically worked, but I have been running into all kinds of problems (and bugs), so now I am trying to gain a better understanding of the different steps involved in the network setup process (e.g. CNI and flannel).
From an end-user/admin perspective I simply pass --pod-network-cidr with some network range to kubeadm and later apply the flannel manifest using kubectl. Kubernetes then starts a flannel pod on each of my nodes. Assuming everything worked, flannel should then use the Container Network Interface (CNI) of Kubernetes to set up the pod network.
As a result of this process I should get a pod network which includes the following (a quick way to check them on a node is sketched after the list):
A cni0 bridge.
A flannel.x interface.
iptables entries to route between the host and the pod network.
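For reference, these artifacts can be inspected directly on a node once flannel is running; a rough sketch (interface names vary, e.g. flannel.1 for the VXLAN backend):

ip -d link show cni0          # the bridge the pod veths are attached to
ip -d link show flannel.1     # the flannel interface (name depends on the backend)
ip route                      # per-node pod subnets routed via flannel.1 / cni0
iptables -t nat -S | grep -i -E 'flannel|cni'   # NAT rules for pod traffic (chain names vary by version)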
The following files and binaries seem to be involved in the setup:
The kubelet (via the container runtime) reads a CNI configuration such as /etc/cni/net.d/10-flannel.conflist and invokes the CNI plugin described in the config file (a typical example is shown below).
Somehow a folder /var/lib/cni is created, which seems to contain configuration files for the network setup.
A CNI plugin such as /opt/cni/bin/flannel is run; I don't yet understand what it does.
What am I missing from this list, and how does (2.) fit into these steps? How does /var/lib/cni get created, and which program is responsible for it?
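For reference, the flannel CNI config under /etc/cni/net.d typically looks roughly like the following (illustrative; exact fields vary between flannel versions):

cat /etc/cni/net.d/10-flannel.conflist
{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

The "type" fields are what map to binaries under /opt/cni/bin (flannel, portmap, and the bridge/host-local plugins that the flannel plugin delegates to).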

As can be seen in the CNI source code:
var (
    CacheDir = "/var/lib/cni"
)
This folder is used as a cache directory by CNI and appears to be created by the CNI plugin.
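On a flannel node the layout usually looks something like the sketch below (illustrative; the exact subdirectories depend on the CNI version and the plugins in use):

ls /var/lib/cni
# typically shows: flannel/  networks/
#
# /var/lib/cni/flannel/<container-id>   - per-container delegate config written by the flannel plugin
# /var/lib/cni/networks/cbr0/<pod-ip>   - IP reservations written by the host-local IPAM plugin
# newer CNI versions may also keep cache/ or results/ entries here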
Here you can find detailed documentation about CNI.
What is CNI?
CNI (Container Network Interface), a Cloud Native Computing Foundation project, consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of supported plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted. Because of this focus, CNI has a wide range of support and the specification is simple to implement.
Projects like Calico and Flannel use CNI as a base; that's why they are called CNI plugins.
Here you can find documentation about how Kubernetes interacts with CNI.

Related

Static IP address for Kubernetes pods with Calico CNI

I'm currently using the 10.222.0.0/16 network for my pods in a single-node test cluster.
When I reboot the machine or redeploy pods, they get the first IP address that has not been used previously. I want to prevent this by assigning static IPs to pods with Calico.
How can I achieve this?
Generally that approach would be against the dynamic nature of Kubernetes' IP layer. However, there is a solution found in the Project Calico docs:
Choose the IP address for a pod instead of allowing Calico to choose
automatically.
Bear in mind that:
You must be using the Calico IPAM.
If you are not sure, ssh to one of your Kubernetes nodes and examine
the CNI configuration.
cat /etc/cni/net.d/10-calico.conflist
Look for the entry:
"ipam": {
"type": "calico-ipam"
},
If it is present, you are using the Calico IPAM. If the IPAM is set to
something else, or the 10-calico.conflist file does not exist, you
cannot use these features in your cluster.
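If the Calico IPAM is in use, the documented way to pin a specific address is the cni.projectcalico.org/ipAddrs pod annotation. A minimal sketch (the address must fall inside one of your Calico IP pools, e.g. the 10.222.0.0/16 range above; the pod name and image are just placeholders):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: static-ip-demo                                   # placeholder name
  annotations:
    cni.projectcalico.org/ipAddrs: "[\"10.222.0.50\"]"   # must lie inside a Calico IP pool
spec:
  containers:
  - name: app
    image: nginx
EOF

For pods managed by a Deployment the annotation goes on the pod template; note that two pods cannot claim the same address at the same time.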

Unable to connect to k8s cluster using master/worker IP

I am trying to install a Kubernetes cluster with one master node and two worker nodes.
I acquired 3 VMs for this purpose, running Ubuntu 21.10. On the master node, I installed kubeadm:1.21.4, kubectl:1.21.4, kubelet:1.21.4 and docker-ce:20.4.
I followed this guide to install the cluster. The only difference was in my init command, where I did not set --control-plane-endpoint. I used Calico CNI v3.19.1 and Docker as the CRI runtime.
After I installed the cluster, I deployed a MinIO pod and exposed it as a NodePort service.
The pod got deployed on the worker node (10.72.12.52), and my master node IP is 10.72.12.51.
For the first two hours, I was able to access the login page via all three IPs (10.72.12.51:30981, 10.72.12.52:30981, 10.72.13.53:30981). However, after two hours, I lost access via 10.72.12.51:30981 and 10.72.13.53:30981. Now I am only able to access the service from the node on which it is running (10.72.12.52).
I have disabled the firewall and added a calico.conf file inside /etc/NetworkManager/conf.d with the following content:
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico
What am I missing in the setup that might cause this issue?
This is a community wiki answer posted for better visibility. Feel free to expand it.
As mentioned by @AbhinavSharma, the problem was solved by switching from Calico to the Flannel CNI.
More information regarding Flannel itself can be found here.

Kubernetes - Calico CrashLoopBack on Containers

I have just started experimenting with K8s a few days back, trying to learn it with specific emphasis on networking, service mesh, etc.
I am running 2 worker nodes and 1 master on VMs with CentOS 7 and K8s, installed with kubeadm.
The default CNI was Flannel. The install was OK and everything except the networking was working. I could deploy containers, etc., so much of the control plane was working.
However, networking was not working correctly, not even container to container on the same worker node. I checked all the usual suspects on a single worker: the veths, IPs, MACs, and bridges, and everything seemed to check out, e.g. the MACs were on the correct bridge (cni0), IP address assignments looked right, etc. Even when pinging from busybox to busybox, I would see the ARP caches being populated, but the pings still did not work. I disabled all firewalls, enabled IP forwarding, etc. I am not an expert on iptables, but the rules looked OK. Also, when logged into the worker node shell I could ping the busybox containers, but they could not ping each other.
One question I have at this point: why is the docker0 bridge still present even when flannel is installed? Can I delete it, or are there some dependencies associated with it? I noticed the veths for the containers showing as connected to the docker0 bridge, although docker0 was down; however, I followed this website, which shows a different way of validating and shows the veths connected to cni0, which is very confusing and frustrating.
I gave up on Flannel, as I was just using it to experiment, and decided to try out Calico.
I followed the install procedure from the Calico site. I was not entirely clear on the tidy-up procedure for Flannel (not sure where this is documented), and this is where it went from bad to worse.
I started getting crash loops on the Calico containers, and coredns was stuck in ContainerCreating, with liveness issues reported for Calico. This is where I am stuck and would like some help.
I have read and tried many things on the web and may have fixed some issues (there may be many in play), but I would really appreciate any help.
=== install info and some output...
[screenshots of the install output were attached to the original post]
Some questions...
The ContainerCreating state for coredns: is this dependent on a successful install of Calico? Are the issues related, or should the coredns install work independently of the CNI install?
Yes, it is. You need to install a CNI to have coredns working.
When you set up your cluster with kubeadm, there is a flag called --pod-network-cidr; depending on which CNI you intend to use, you need to specify the range with this flag.
For example, by default Calico uses the range 192.168.0.0/16 and Flannel uses the range 10.244.0.0/16.
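For example, a minimal sketch of the two init variants for those defaults (the CIDR can be changed, as long as the CNI manifest is adjusted to match):

# Calico with its default pool
kubeadm init --pod-network-cidr=192.168.0.0/16

# Flannel with its default network
kubeadm init --pod-network-cidr=10.244.0.0/16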
I have a guide on how to set up a cluster using kubeadm; maybe it will help you.
Please note that if you want to replace the CNI without deleting the entire cluster, extra steps need to be taken in order to clean up the firewall rules from the old CNI.
See here how to replace flannel with calico, for example.
And here how to migrate from flannel to calico.

What combination of firewall rules is suitable for Kubernetes with flannel as the CNI?

I have been trying to find the right firewall rules to apply on a kubeadm-created Kubernetes cluster with flannel as the CNI.
I opened these ports:
6443/tcp, 2379/tcp, 2380/tcp, 8285/udp, 8472/udp, 10250/tcp, 10251/tcp, 10252/tcp, 10255/tcp, 30000-32767/tcp.
But I always end up with a service that cannot reach other services, or I am unable to reach the dashboard unless I disable the firewall. I always start with a fresh cluster.
Kubernetes version: 1.15.4.
Is there any source that lists suitable rules to apply on a cluster created by kubeadm, with flannel running inside containers?
As stated in the kubeadm system requirements:
Full network connectivity between all machines in the cluster (public or private network is fine)
It's a very common practice to put all custom rules on the gateway (ADC) or into cloud security groups, to prevent conflicting rules.
Then you have to ensure the iptables tooling does not use the nftables backend.
The nftables backend is not compatible with the current kubeadm packages: it causes duplicated firewall rules and breaks kube-proxy.
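On Debian-based systems this is typically done by switching the iptables tooling to legacy mode, roughly as below (this mirrors the kubeadm install docs of that era; paths differ per distribution):

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set arptables /usr/sbin/arptables-legacy
update-alternatives --set ebtables /usr/sbin/ebtables-legacy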
And ensure the required ports are open between all machines of the cluster.
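As a rough firewalld sketch for a control-plane node running flannel with the VXLAN backend (worker nodes need at least 10250/tcp, 8472/udp and the NodePort range; adjust for your backend and environment):

firewall-cmd --permanent --add-port=6443/tcp          # Kubernetes API server
firewall-cmd --permanent --add-port=2379-2380/tcp     # etcd
firewall-cmd --permanent --add-port=10250-10252/tcp   # kubelet, kube-scheduler, kube-controller-manager
firewall-cmd --permanent --add-port=8472/udp          # flannel VXLAN (8285/udp for the udp backend)
firewall-cmd --permanent --add-port=30000-32767/tcp   # NodePort services
firewall-cmd --permanent --add-masquerade             # often needed so forwarded pod traffic is NATed
firewall-cmd --reload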
Other security measures should be deployed through other components, like:
Network Policy (depending on the network provider)
Ingress
RBAC
And others.
Also check the articles about Securing a Cluster and Kubernetes Security - Best Practice Guide.

How does Kubernetes decide which network plugin to call for IPAM?

I am trying to understand how Kubernetes knows whom to call to get an IP address for a pod. Is it mentioned in a ConfigMap?
Can you share any pointers to learn more on this?
I think it has been explained pretty well in this article:
Automating Kubernetes Networking with CNI
Kubernetes uses CNI plug-ins to orchestrate networking. Every time a
POD is initialized or removed, the default CNI plug-in is called with
the default configuration.
The CNI plugin will create a pseudo-interface, attach it to the relevant underlay network, and set up the IP and routes, which are mapped to the Pod namespace. When it gets the information about the deployed container, it becomes responsible for the IP address, iptables rules, and routing on the node.
The process itself varies between CNIs, in areas such as how iptables rules are created and how routing information is exchanged between nodes.
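In practice the selection is file-based rather than ConfigMap-based: the kubelet (or the container runtime, depending on the version) picks the alphabetically first config file in the CNI conf directory and executes the plugin binaries named in it from the CNI bin directory. A rough way to inspect this (default paths shown; they can be overridden):

ls /etc/cni/net.d/             # the first file (alphabetically) is the network config that gets used
grep '"type"' /etc/cni/net.d/* # the "type" fields name the plugin and IPAM binaries to be executed
ls /opt/cni/bin/               # the binaries the runtime will actually call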
It is a lot of writing and it has already been written, so I will just link the pointers you requested:
Calico IPAM
Calico:
How do I configure the Pod IP range?
When using Calico IPAM, IP addresses are assigned from IP Pools.
By default, all enabled IP Pools are used. However, you can specify
which IP Pools to use for IP address management in the CNI network
config, or on a per-Pod basis using Kubernetes annotations.
Flannel networking with IPAM section
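As a small sketch of the Calico side of this, an extra IP Pool can be created with calicoctl and then requested per pod via an annotation (assuming calicoctl is configured against your datastore; the names and CIDRs below are illustrative):

calicoctl apply -f - <<'EOF'
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: example-pool            # illustrative name
spec:
  cidr: 192.168.100.0/24        # must not overlap existing pools
  ipipMode: Always
  natOutgoing: true
EOF

# pods can then request this pool via an annotation such as:
#   cni.projectcalico.org/ipv4pools: '["192.168.100.0/24"]'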