I am implementing a cluster-api controller using Kubernetes as the infrastructure provider - that is, I am trying to run Kubernetes Nodes as Kubernetes Pods and form a cluster within a cluster.
I have this working apart from network connectivity between Pods of the inner cluster (running on Pods of the infrastructure cluster), but I am stuck as to what the issue is.
I am running on GKE, using their default CNI implementation. I am then attempting to use Calico for an overlay implementation of the inner cluster, using IP-in-IP encapsulation so the Nodes of the infrastructure cluster do not need to know how to route inner cluster Pod IPs.
I am creating the infrastructure cluster as follows (the UBUNTU image type is needed for the ipip kernel module required by Calico's IP-in-IP encapsulation):
gcloud container clusters create management-cluster --image-type=UBUNTU
I then deploy a number of nginx Pods to the inner cluster. If they land on the same inner cluster Node, they can connect to each other. If they land on separate inner cluster Nodes, they cannot, so I assume this means the IP-in-IP tunnel isn't working properly, but I am not sure why. This fails even if the inner cluster Nodes (Pods) land on the same infrastructure (outer cluster) Node. The Pod and Service CIDR ranges of the two clusters do not overlap.
I realise this is not a supported use case for Calico, but I cannot see a reason why it is not possible and would like to get it working. Do the outer cluster Nodes need to support forwarding IP-in-IP packets? They are configured to forward IPv4 packets, but I am not sure if that is enough.
I guess more information is required to give a concrete reason for why this isn't working, but I am not too sure what that would be at this point and would be grateful for any direction.
It was necessary to allow the ipencap protocol on the GKE nodes:
iptables -C FORWARD -p ipencap -j ACCEPT || iptables -A FORWARD -p ipencap -j ACCEPT
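The -C check makes this idempotent: the ACCEPT rule for ipencap (IP-in-IP, protocol 4) traffic is only appended if it is not already there. Since GKE nodes can be recreated at any time, one way to keep the rule applied everywhere is a privileged DaemonSet that runs it on every node. A rough sketch, with purely illustrative names and image:
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: allow-ipencap
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: allow-ipencap
  template:
    metadata:
      labels:
        app: allow-ipencap
    spec:
      hostNetwork: true
      containers:
      - name: allow-ipencap
        image: alpine:3.18
        securityContext:
          privileged: true
        command:
        - /bin/sh
        - -c
        # Install iptables, add the FORWARD rule for protocol 4 (ipencap)
        # on the host network namespace if it is missing, then keep the pod alive.
        - |
          apk add --no-cache iptables &&
          (iptables -C FORWARD -p 4 -j ACCEPT ||
           iptables -A FORWARD -p 4 -j ACCEPT) &&
          sleep 365d
EOF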
Related
Why do we need point-to-point connections between pods when we have workload abstractions and networking mechanisms (Service/kube-proxy/Ingress etc.) on top of them?
What is the default CNI?
REDACTED: I was confused about this question because I felt like I hadn't installed any of the popular CNI plugins when I was installing Kubernetes. It turns out Kubernetes defaults to kubenet.
Btw, I see a lot of overlapping features between Istio and container networks. IMO they could achieve identical objectives. Is the only difference that Istio is high-level while CNI is low-level and more efficient?
REDACTED: Interestingly, Istio has its own CNI.
Kubernetes networking has some requirements:
pods on a node can communicate with all pods on all nodes without NAT
agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node
pods in the host network of a node can communicate with all pods on all nodes without NAT
CNI (Container Network Interface) defines a standard interface for this, and every implementation (Calico, Flannel, etc.) has to follow it.
Its purpose is to satisfy the Kubernetes networking model.
A Service is different: it provides a stable virtual address that proxies to the Pods, since Pods are ephemeral and their IPs change while the Service's address stays the same.
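As a minimal illustration (the names are only examples), a Service like the following gives a set of Pods one stable virtual IP, however often the Pods behind it are replaced:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx        # any Pod carrying this label becomes a backend
  ports:
  - port: 80          # stable port on the Service's virtual (cluster) IP
    targetPort: 80    # port the nginx containers actually listen on
EOF
# The CLUSTER-IP shown here stays the same while the backend Pods come and go.
kubectl get service nginx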
Istio is another thing entirely: it turns the connections between microservices into infrastructure and pulls that concern out of the business code (think of Spring Cloud).
Why do we need point-to-point connections between pods when we have workload abstractions and networking mechanisms (Service/kube-proxy/Ingress etc.) on top of them?
In general, you will find everything about networking in a cluster in this documentation. You can find more information about pod networking:
Every Pod gets its own IP address. This means you do not need to explicitly create links between Pods and you almost never need to deal with mapping container ports to host ports. This creates a clean, backwards-compatible model where Pods can be treated much like VMs or physical hosts from the perspectives of port allocation, naming, service discovery, load balancing, application configuration, and migration.
Kubernetes imposes the following fundamental requirements on any networking implementation (barring any intentional network segmentation policies):
pods on a node can communicate with all pods on all nodes without NAT
agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node
Note: For those platforms that support Pods running in the host network (e.g. Linux):
pods in the host network of a node can communicate with all pods on all nodes without NAT
Then you are asking:
What is the default CNI?
There is no single default CNI plugin in a Kubernetes cluster. It depends on which distribution you use and where and how you set the cluster up. As you can see from this doc about implementing the networking model, there are many CNI plugins available for Kubernetes.
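If you are unsure which CNI plugin a given cluster runs, you can usually tell from the CNI configuration on a node and from the networking Pods in kube-system (the exact output depends on the cluster, of course):
# On a node: the CNI configuration files name the plugin in use.
ls /etc/cni/net.d/
cat /etc/cni/net.d/*.conf* 2>/dev/null

# From kubectl: most CNI plugins run as DaemonSets in kube-system
# (calico-node, kube-flannel-ds, weave-net, cilium, ...).
kubectl get pods -n kube-system -o wide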
Istio is a completely different tool for something else. You can't compare them like that. Istio is a service mesh tool.
Istio extends Kubernetes to establish a programmable, application-aware network using the powerful Envoy service proxy. Working with both Kubernetes and traditional workloads, Istio brings standard, universal traffic management, telemetry, and security to complex deployments.
Let's say that a Service in a Kubernetes cluster is mapped to a group of cloned containers that will fulfill requests made for that service from the outside world.
What are the steps in the journey that a request from the outside world will make into the Kubernetes cluster, then through the cluster to the designated container, and then back through the Kubernetes cluster out to the original requestor in the outside world?
The documentation indicates that kube-controller-manager includes the Endpoints controller, which joins services to Pods. But I have not found specific documentation illustrating the steps in the journey that each request makes through a Kubernetes cluster.
This is important because it affects how one might design security for services, including the configuration of routing around the control plane.
Assuming you are using mostly the defaults:
Packet comes in to your cloud load balancer of choice.
It gets forwarded to a random node in the cluster.
It is received by the kernel and run through iptables.
Iptables defines a mapping rule to forward the packet to a container IP.
Unless it happens to land on the same box, it then goes through your CNI network, usually some kind of overlay, possibly with encapsulation and decapsulation along the way.
It eventually gets to the container IP, and then is delivered to whatever the process inside the container is.
The Services and Endpoints system is what creates and manages the iptables rules and the cloud load balancers so that the LB knows the right node IPs and the iptables rules know the right container IPs.
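You can watch that machinery directly. Assuming kube-proxy is in its default iptables mode, something along these lines (the Service name here is just an example) shows the Endpoints behind a Service and the NAT rules written for it:
# The Endpoints object lists the container (Pod) IPs behind the Service.
kubectl get endpoints my-service -o wide

# On a node: kube-proxy's NAT rules. KUBE-SERVICES matches the Service's
# cluster IP and jumps to per-endpoint KUBE-SEP-* chains that DNAT to Pod IPs.
sudo iptables -t nat -L KUBE-SERVICES -n | grep my-service
sudo iptables -t nat -L -n | grep KUBE-SEP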
I have a Kubernetes cluster on GKE that is configured to use a CE instance as an external NAT. However, I only want to route specific pods in the GKE cluster through the external NAT. Is this possible and how would I go about configuring this?
There isn't really an easy way to do this, but it is possible.
1) You need to make sure to use VPC-Native
2) Make sure all the pods that should use the NAT land on the same node by taking advantage of advanced scheduling in k8s
3) Find the pod CIDR of that node using 'kubectl describe no [node_name] | grep PodCIDR'
4) Create a custom route that sends all traffic from that CIDR through the NAT
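Very roughly, and with every name below a placeholder, steps 2-4 could look like this. For step 4 I am sketching the common variant of a tagged, higher-priority default route via the NAT instance, since static GCP routes match on the destination rather than the source CIDR:
# 2) Keep the NAT-bound pods on one node with a label + nodeSelector.
kubectl label nodes gke-my-cluster-default-pool-abcd1234-efgh nat=external
#    ...and add `nodeSelector: { nat: external }` to those pod specs.

# 3) Find that node's pod CIDR.
kubectl describe no gke-my-cluster-default-pool-abcd1234-efgh | grep PodCIDR

# 4) Tag the node's VM and send its outbound traffic through the NAT instance.
gcloud compute instances add-tags gke-my-cluster-default-pool-abcd1234-efgh \
  --zone=us-central1-a --tags=use-nat
gcloud compute routes create route-via-nat \
  --network=my-vpc \
  --destination-range=0.0.0.0/0 \
  --next-hop-instance=my-nat-instance \
  --next-hop-instance-zone=us-central1-a \
  --tags=use-nat \
  --priority=800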
This is not proven to work, to be honest. I know from dealing with some Cloud NAT issues that even with VPC-Native pods, the pod IP still sometimes goes through SNAT on the node and thus takes the node's IP.
You could use the node's internal IP instead of the pod CIDR of the node; this could possibly cause communication issues with the master, but not necessarily.
Finally, keep in mind that this is not ideal since the node could end up being recreated and the IP or the pod CIDR may change.
I am trying to install Kubernetes on my on-premises Ubuntu 16.04 server, referring to the following documentation:
https://medium.com/#Grigorkh/install-kubernetes-on-ubuntu-1ac2ef522a36
After installing kubelet, kubeadm, and kubernetes-cni, I found that kubeadm is initialized with the following command:
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.133.15.28 --kubernetes-version stable-1.8
Here I am totally confused about why we are setting the CIDR and the API server advertise address. I am listing my points of confusion here:
Why are we specifying the CIDR and --apiserver-advertise-address here?
How can I find these two addresses for my server?
And why is Flannel used in the Kubernetes installation?
I am new to this containerization and Kubernetes world.
Why are we specifying the CIDR and --apiserver-advertise-address here?
And why is Flannel used in the Kubernetes installation?
Kubernetes uses the Container Network Interface (CNI) to create a virtual network inside your cluster for communication between Pods.
Here is some explanation of the "why" from the documentation:
Kubernetes imposes the following fundamental requirements on any networking implementation (barring any intentional network segmentation policies):
all containers can communicate with all other containers without NAT
all nodes can communicate with all containers (and vice-versa) without NAT
the IP that a container sees itself as is the same IP that others see it as
Kubernetes applies IP addresses at the Pod scope - containers within a Pod share their network namespaces - including their IP address. This means that containers within a Pod can all reach each other’s ports on localhost. This does imply that containers within a Pod must coordinate port usage, but this is no different than processes in a VM. This is called the “IP-per-pod” model.
So, Flannel is one of the CNI plugins that can be used to create the network connecting all your Pods, and the CIDR option defines the subnet for that network. There are many alternative CNI plugins with similar functionality.
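For example, Flannel's stock manifest expects the Pod network to be 10.244.0.0/16, which is exactly why that CIDR is passed to kubeadm init. A typical sequence looks roughly like this (the manifest URL may differ for your Flannel version):
# The --pod-network-cidr value must match the Network entry in Flannel's
# net-conf.json (10.244.0.0/16 in the stock manifest).
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.133.15.28

# Install Flannel itself; it then hands each node a /24 slice of that /16
# and builds the overlay between nodes.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml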
If you want more details about how networking works in Kubernetes, you can read the documentation linked above or, for example, here.
How can I find these two addresses for my server?
The API server advertise address has to be a single, static address. It is used by all components to communicate with the API server. Unfortunately, Kubernetes does not support multiple API server advertise addresses per master.
You can still have as many addresses on your server as you want, but only one of them can be set as --apiserver-advertise-address. The only requirement is that it has to be reachable from all the nodes in your cluster.
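To pick the value in practice, list the addresses configured on the master and choose the one every node can reach:
# Show the master's IPv4 addresses; pick the one on the interface the other
# nodes use to reach this machine (not a loopback or docker bridge address).
ip -4 addr show
hostname -I

# Quick sanity check from another node (using the address from the question):
ping -c 3 10.133.15.28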
I am working on a POC, and I found some strange behavior after setting up my Kubernetes cluster.
I am working with a topology of one master and two minions.
When I deployed 2 pods, one on each minion, and exposed a Service for them, it turned out that when I request the Service from the master, nothing is returned (no response from either pod), and when I request the Service from a minion, only the pod deployed on that minion responds, never the other one.
This can heavily depend on how your cluster is provisioned.
For starters, you need to validate how networking is set up and whether it works as Kubernetes expects. In short, if you launch two pods (on separate nodes), they should get IPs from their dedicated per-node ranges and be able to reach each other across nodes. You can use some small(ish) base image (alpine/debian/ubuntu etc.) running something like sleep 1d, exec into them interactively with a shell, and simply ping one from the other. If that does not work, your network setup is broken.
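A rough version of that test (image and names are only examples; substitute your real node names) could look like this:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: net-test-a
spec:
  nodeName: minion-1          # pin to the first minion
  containers:
  - name: shell
    image: alpine:3.18
    command: ["sleep", "365d"]
---
apiVersion: v1
kind: Pod
metadata:
  name: net-test-b
spec:
  nodeName: minion-2          # pin to the second minion
  containers:
  - name: shell
    image: alpine:3.18
    command: ["sleep", "365d"]
EOF

# Ping the pod on minion-2 from the pod on minion-1 using its pod IP.
kubectl exec -it net-test-a -- \
  ping -c 3 "$(kubectl get pod net-test-b -o jsonpath='{.status.podIP}')"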
Make sure you test between pods, not directly from the node host OS. In some configurations the node is unable to reach service IPs due to routing concerns while pod-to-pod traffic works fine (I have seen this in some Flannel configurations).
Also, your networking is probably provided by some overlay network solution like Flannel, Weave, or Calico, so check its logs for signs of problems.