I am trying to understand Kubernetes and how it works under the hood. As I understand it each pod gets its own IP address. What I am not sure about is what kind of IP address that is.
Is it something that the network admins at my company need to pass out? Or is an internal kind of IP address that is not addressable on the full network?
I have read about network overlays (like Project Calico) and I assume they play a role in this, but I can't seem to find a page that explains the connection. (I think my question is too remedial for the internet.)
Is the IP address of a Pod a full IP address on my network (just like a Virtual Machine would have)?
Kubernetes clusters
Is the IP address of a Pod a full IP address on my network (just like a Virtual Machine would have)?
The thing with Kubernetes is that it is not a service like e.g. a Virtual Machine, but a cluster that has it's own networking functionality and management, including IP address allocation and network routing.
Your nodes may be virtual or physical machines, but they are registered in the NodeController, e.g. for health check and most commonly for IP address management.
The node controller is a Kubernetes master component which manages various aspects of nodes.
The node controller has multiple roles in a node’s life. The first is assigning a CIDR block to the node when it is registered (if CIDR assignment is turned on).
Cluster Architecture - Nodes
IP address management
Kubernetes Networking depends on the Container Network Interface (CNI) plugin your cluster is using.
A CNI plugin is responsible for ... It should then assign the IP to the interface and setup the routes consistent with the IP Address Management section by invoking appropriate IPAM plugin.
It is common that each node is assigned an CIDR range of IP-addresses that the nodes then assign to pods that is scheduled on the node.
GKE network overview describes it well on how it work on GKE.
Each node has an IP address assigned from the cluster's Virtual Private Cloud (VPC) network.
Each node has a pool of IP addresses that GKE assigns Pods running on that node (a /24 CIDR block by default).
Each Pod has a single IP address assigned from the Pod CIDR range of its node. This IP address is shared by all containers running within the Pod, and connects them to other Pods running in the cluster.
Each Service has an IP address, called the ClusterIP, assigned from the cluster's VPC network.
Kubernetes Pods are going to receive a real IP address like how's happening with Docker ones due to the brdige network interface: the real hard stuff to understand is basically the Pod to Pod connection between different nodes and that's a black magic performed via kube-proxy with the help of iptables/nftables/IPVS (according to which component you're running in the node).
A different story regarding IP addresses assigned to a Service of ClusterIP kind: in fact, it's a Virtual IP used to transparently redirect to endpoints as needed.
Kubernetes networking could look difficult to understand but we're lucky because Tim Hockin provided a really good talk named Life of a Packet that will provide you a clear overview of how it works.
Related
I'm aware that what I'm about to ask is probably an extremely basic networking question, but I'm not even sure the best way to word this to do a Google search that returns anything useful.
I'm working through a Kubernetes tutorial which suggests to get the IP address of one of my services with this command:
minikube service web --url
This returns to me http://192.168.64.2:30169. The part that confuses me is that my computer's IP address is 192.168.64.1. It seems like there are actually two IP addresses on my local network that "live" on my laptop, but when I try to access the service address from another machine on my home network the service does not load.
Question
What is really going on when each service is assigned a different IP address? What network is that on and why can I only access those services from my laptop?
I think "http://192.168.64.2" could be the IP address of the VM that minikube is running. You can check it using minikube ip. 30169 could refer to the port where the service is accessible if the service is of NodePort type. Also, these IPs are locally exposed on your computer so they may not be accessible from another computer.
Each service is assigned a virtual IP address inside the kubernetes cluster. They can be accessed from outside the cluster if they are of NodePort type. You can access them on any node of the cluster (hence multiple different IP addresses) or you could loadbalance across those node by adding a Loadbalancer or Ingress.
Consider a microservice X which is containerized and deployed in a kubernetes cluster. X communicates with a Payment Gateway PG. However, the payment gateway requires a static IP for services contacting it as it maintains a whitelist of IP addresses which are authorized to access the payment gateway. One way for X to contact PG is through a third party proxy server like QuotaGuard which will provide a static IP address to service X which can be whitelisted by the Payment Gateway.
However, is there an inbuilt mechanism in kubernetes which can enable a service deployed in a kube-cluster to obtain a static IP address?
there's no mechanism in Kubernetes for this yet.
other possible solutions:
if nodes of the cluster are in a private network behind a NAT then just add your network's default gateway to the PG's whitelist.
if whitelist can accept a cidr apart from single IPs (like 86.34.0.0/24 for example) then add your cluster's network cidr to the whitelist
If every node of the cluster has a public IP and you can't add a cidr to the whitelist then it gets more complicated:
a naive way would be to add ever node's IP to the whitelist, but it doesn't scale above tiny clusters few just few nodes.
if you have access to administrating your network, then even though nodes have pubic IPs, you can setup a NAT for the network anyway that targets only packets with PG's IP as a destination.
if you don't have administrative access to the network, then another way is to allocate a machine with a static IP somewhere and make it act as a proxy using iptables NAT similarly like above again. This introduces a single point of failure though. In order to make it highly available, you could deploy it on a kubernetes cluster again with few (2-3) replicas (this can be the same cluster where X is running: see below). The replicas instead of using their node's IP to communicate with PG would share a VIP using keepalived that would be added to PG's whitelist. (you can have a look at easy-keepalived and either try to use it directly or learn from it how it does things). This requires high privileges on the cluster: you need be able to grant to pods of your proxy NET_ADMIN and NET_RAW capabilities in order for them to be able to add iptables rules and setup a VIP.
update:
While waiting for builds and deployments during last few days, I've polished my old VIP-iptables scripts that I used to use as a replacement for external load-balancers on bare-metal clusters, so now they can be used as well to provide egress VIP as described in the last point of my original answer. You can give them a try: https://github.com/morgwai/kevip
There are two answers to this question: for the pod IP itself, it depends on your CNI plugin. Some allow it with special pod annotations. However most CNI plugins also involve a NAT when talking to the internet so the pod IP being static on the internal network is kind of moot, what you care about is the public IP the connection ends up coming from. So the second answer is "it depends on how your node networking and NAT is set up". This is usually up to the tool you used to deploy Kubernetes (or OpenShift in your case I guess). With Kops it's pretty easy to tweak the VPC routing table.
I have deployed a Kubernetes cluster to GCP. For this cluster, I added some deployments. Those deployments are using external resources that protected with security policy to reject connection from unallow IP address.
So, in order to pod to connect the external resource, I need manually allow the node (who hosting the pod) IP address.
It's also possible to me to allow range of IP address, where one of my nodes are expected to be running.
Untill now, I just find their internal IP addresses range. It looks like this:
Pod address range 10.16.0.0/14
The question is how to find the range of external IP addresses for my nodes?
Let's begin with the IPs that are assigned to Nodes:
When we create a Kubernetes cluster, GCP in the backend creates compute engines machines with a specific internal and external IP address.
In your case, just go to the compute engine section of the Google Cloud Console and capture all the external IPs of the VM whose initials starts with gke-(*) and whitelist it.
Talking about the range, as such in GCP only the internal IP ranges are known and external IP address are randomly assigned from a pool of IPs hence you need to whitelist it one at a time.
To get the pod description and IPs run kubectl describe pods.
If you go to the compute engine instance page it shows the instances which make the cluster. it shows the external ips on the right side. For the the ip of the actual pods use the Kubectl command.
I am trying to install Kubernetes in my on-premise server Ubuntu 16.04. And referring following documentation ,
https://medium.com/#Grigorkh/install-kubernetes-on-ubuntu-1ac2ef522a36
After installing kubelete kubeadm and kubernetes-cni I found that to initiate kubeadm with following command,
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.133.15.28 --kubernetes-version stable-1.8
Here I am totally confused about why we are setting cidr and api server advertise address. I am adding few confusion from Kubernetes here,
Why we are specifying CIDR and --apiserver-advertise-address here?
How I can find these two address for my server?
And why flannel is using in Kubernetes installation?
I am new to this containerization and Kubernetes world.
Why we are specifying CIDR and --apiserver-advertise-address here?
And why flannel is using in kubernetes installation?
Kubernetes using Container Network Interface for creating a special virtual network inside your cluster for communication between pods.
Here is some explanation "why" from documentation:
Kubernetes imposes the following fundamental requirements on any networking implementation (barring any intentional network segmentation policies):
all containers can communicate with all other containers without NAT
all nodes can communicate with all containers (and vice-versa) without NAT
the IP that a container sees itself as is the same IP that others see it as
Kubernetes applies IP addresses at the Pod scope - containers within a Pod share their network namespaces - including their IP address. This means that containers within a Pod can all reach each other’s ports on localhost. This does imply that containers within a Pod must coordinate port usage, but this is no different than processes in a VM. This is called the “IP-per-pod” model.
So, Flannel is one of the CNI which can be used for create network which will connect all your pods and CIDR option define a subnet for that network. There are many alternative CNI with similar functions.
If you want to get more details about how network working in Kubernetes you can read by link above or, as example, here.
How I can find these two address for my server?
API server advertise address has to be only one and static. That address using by all components to communicate with API server. Unfortunately, Kubernetes has no support of multiple API server addresses per master.
But, you can still use as many addresses on your server as you want, but only one of them you can define as --apiserver-advertise-address. The only one request for it - it has to be accessible from all your nodes in cluster.
If I'm running processes in 2 pods that communicate with each other over tcp (addressing each other through Kubernetes services) and the pods are scheduled to the same node will the communication take place over the network or will Kubernetes know to use the loopback device?
In a kubernetes cluster, a pod could be scheduled in any node in the cluster. The another pod which wants to access it should not ideally know where this pod is running or its POD IP address. Kubernetes provides a basic service discovery mechanism by providing DNS names to the kubernetes services (which are associated with pods). When a pod wants to talk to another pod, it should use the DNS name (e.g. svc1.namespace1.svc.cluster.local)
loopback is not mentioned in "community/contributors/design-proposals/network/networking"
Because every pod gets a "real" (not machine-private) IP address, pods can communicate without proxies or translations. The pod can use well-known port numbers and can avoid the use of higher-level service discovery systems like DNS-SD, Consul, or Etcd.
When any container calls ioctl(SIOCGIFADDR) (get the address of an interface), it sees the same IP that any peer container would see them coming from — each pod has its own IP address that other pods can know.
By making IP addresses and ports the same both inside and outside the pods, we create a NAT-less, flat address space. Running "ip addr show" should work as expected. This would enable all existing naming/discovery mechanisms to work out of the box, including self-registration mechanisms and applications that distribute IP addresses.
We should be optimizing for inter-pod network communication.
Using IP was already mentioned last year in "Kubernetes - container communication within a pod using names instead of 'localhost'?"