Inquiring pod and service subnets from inside a Kubernetes cluster

How can one inquire the Kubernetes pod and service subnets in use (e.g. 10.244.0.0/16 and 10.96.0.0/12 respectively) from inside a Kubernetes cluster in a portable and simple way?
For instance, kubectl get cm -n kube-system kubeadm-config -o yaml reports podSubnet and serviceSubnet. But this is not fully portable because a cluster may have been set up by another means than kubeadm.
kubectl get cm -n kube-system kube-proxy -o yaml reports clusterCIDR (i.e. the pod subnet), and kubectl get pod -n kube-system kube-apiserver-master1 -o yaml reports the value passed as the command-line option --service-cluster-ip-range to kube-apiserver (i.e. the service subnet). Here master1 stands for the name of any control plane node. But this seems a bit complex.
Is there a better way available e.g. with the Kubernetes 1.17 API?

I don't think it would be possible to obtain what you want in a portable and simple way.
If you don't specify the CIDR parameters, default values are assigned.
There are many ways to run Kubernetes, either as unmanaged clusters (kubeadm, minikube, k3s, microk8s) or as managed ones from cloud providers (GKE, Azure, AWS), so it is hard to find one way to list all CIDRs in all environments. Other obstacles can be the version of Kubernetes or the CNI in use.
The Kubernetes 1.17 release notes state:
Deprecate the default service IP CIDR. The previous default was 10.0.0.0/24 which will be removed in 6 months/2 releases. Cluster admins must specify their own desired value, by using --service-cluster-ip-range on kube-apiserver.
As an example with kubeadm: $ kubeadm init --pod-network-cidr 10.100.0.0/12 --service-cidr 10.99.0.0/12
There are a few ways to get the pod and service CIDRs:
$ kubectl cluster-info dump | grep -E '(service-cluster-ip-range|cluster-cidr)'
"--service-cluster-ip-range=10.99.0.0/12",
"--cluster-cidr=10.100.0.0/12",
$ kubeadm config view | grep Subnet
podSubnet: 10.100.0.0/12
serviceSubnet: 10.99.0.0/12
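If you want to do the same extraction from a script rather than with grep, here is a minimal sketch in Python. It assumes the dump text contains the kube-apiserver and kube-controller-manager command lines with these flags, which is not true on every distribution. The function name extract_cidr_flags is made up for this example, and only the first CIDR of a comma-separated dual-stack value is captured.

```python
import re

# Matches the two flags shown in the grep above and captures their value.
FLAG_PATTERN = re.compile(
    r"--(service-cluster-ip-range|cluster-cidr)=([0-9a-fA-F.:/]+)"
)


def extract_cidr_flags(dump_text):
    """Return {flag_name: value} for the CIDR flags found in the text."""
    found = {}
    for name, value in FLAG_PATTERN.findall(dump_text):
        # Keep the first occurrence only; replicas repeat the same flags.
        found.setdefault(name, value)
    return found


# Sample lines copied from the `kubectl cluster-info dump` output above.
sample = '''
"--service-cluster-ip-range=10.99.0.0/12",
"--cluster-cidr=10.100.0.0/12",
'''
print(extract_cidr_flags(sample))
```

An empty result simply means the flags were not visible in the dump, which is expected on managed clusters where the control plane pods are hidden.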
But if you check all pods in this cluster, some pod IPs start with 192.168.190.X or 192.168.137.X:
$ kubectl get pods -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default nginx 1/1 Running 0 62m 192.168.190.129 kubeadm-worker <none> <none>
kube-system calico-kube-controllers-77c5fc8d7f-9n6m5 1/1 Running 0 118m 192.168.137.66 kubeadm-master <none> <none>
kube-system calico-node-2kx2v 1/1 Running 0 117m 10.128.0.4 kubeadm-worker <none> <none>
kube-system calico-node-8xqd9 1/1 Running 0 118m 10.128.0.3 kubeadm-master <none> <none>
kube-system coredns-66bff467f8-sgmkw 1/1 Running 0 120m 192.168.137.65 kubeadm-master <none> <none>
kube-system coredns-66bff467f8-t84ht 1/1 Running 0 120m 192.168.137.67 kubeadm-master <none> <none>
If you describe any of the CNI pods, you will find yet another CIDR:
CALICO_IPV4POOL_CIDR: 192.168.0.0/16
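To see which of these ranges actually contains a given pod IP, a simple membership check is enough. Below is a minimal sketch using Python's standard ipaddress module, with the values copied from the outputs above. The labels and the helper matching_ranges are made up for illustration. Note that strict=False is needed because 10.100.0.0/12 and 10.99.0.0/12 have host bits set; both normalize to 10.96.0.0/12.

```python
import ipaddress

# Ranges reported by the commands above (labels are only for display).
RANGES = {
    "kubeadm podSubnet": "10.100.0.0/12",
    "kubeadm serviceSubnet": "10.99.0.0/12",
    "CALICO_IPV4POOL_CIDR": "192.168.0.0/16",
}


def matching_ranges(ip, ranges=RANGES):
    """Return the labels of all ranges that contain the given IP."""
    addr = ipaddress.ip_address(ip)
    return [
        label
        for label, cidr in ranges.items()
        # strict=False masks off host bits instead of raising ValueError.
        if addr in ipaddress.ip_network(cidr, strict=False)
    ]


for ip in ("192.168.190.129", "192.168.137.66", "10.128.0.4"):
    print(ip, "->", matching_ranges(ip) or "no configured range")
```

In this example the pod IPs fall only into the Calico pool, not into the range given to kubeadm, and 10.128.0.4 matches none of them.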
On GKE, for example, you get the following node CIDRs:
$ kubectl describe node | grep CIDRs
PodCIDRs: 10.52.1.0/24
PodCIDRs: 10.52.0.0/24
PodCIDRs: 10.52.2.0/24
$ gcloud container clusters describe cluster-2 --zone=europe-west2-b | grep Cidr
clusterIpv4Cidr: 10.52.0.0/14
clusterIpv4Cidr: 10.52.0.0/14
clusterIpv4CidrBlock: 10.52.0.0/14
servicesIpv4Cidr: 10.116.0.0/20
servicesIpv4CidrBlock: 10.116.0.0/20
podIpv4CidrSize: 24
servicesIpv4Cidr: 10.116.0.0/20
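The relationship between these GKE values can be checked with a few lines of Python. This is only a sketch using the numbers from the output above and the standard ipaddress module; the variable names are mine.

```python
import ipaddress

cluster_cidr = ipaddress.ip_network("10.52.0.0/14")    # clusterIpv4Cidr
services_cidr = ipaddress.ip_network("10.116.0.0/20")  # servicesIpv4Cidr
node_pod_cidrs = ["10.52.1.0/24", "10.52.0.0/24", "10.52.2.0/24"]  # PodCIDRs
pod_cidr_size = 24                                      # podIpv4CidrSize

# Every per-node range should be carved out of the cluster-wide pod range.
for cidr in node_pod_cidrs:
    net = ipaddress.ip_network(cidr)
    print(cidr, "inside cluster CIDR:", net.subnet_of(cluster_cidr))

# Number of /24 node ranges that fit into the /14 cluster range.
max_node_ranges = 2 ** (pod_cidr_size - cluster_cidr.prefixlen)
print("per-node ranges available:", max_node_ranges)

# Pod and service ranges are expected to be disjoint.
print("pod and service ranges overlap:", cluster_cidr.overlaps(services_cidr))
```

So the per-node PodCIDRs are slices of clusterIpv4Cidr, while servicesIpv4Cidr is a separate range.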
Honestly I don't think there is an easy and portable way to list all podCidrs and serviceCidrs in one simple command.

Related

how pods manage the IP address?

I would like to know how exactly pods get an IP address, and how pods are distributed across the agent and master nodes.
I have 1 master node and 2 agent nodes. My pods are all running well, but I am curious how the pods get an IP address.
Some pods have a cluster IP, while others have the node's ethernet IP address. I run Nginx and MetalLB for load balancing, with Traefik and Klipper disabled.
As shown below, pods on agent-03 run with two different kinds of IP address:
root:/# kubectl get pods -A -o wide
ingress nginx-dep-fdcd8sdfs-gj5gff 1/1 Running 0 46h 10.42.0.80 master <none> <none>
ingress nginx-dep-fdcd8sdfs-dn80n 1/1 Running 0 46h 10.42.0.79 master <none> <none>
ingress nginx-doc-7cc85c5899-sdh55 1/1 Running 0 44h 10.42.0.82 master <none> <none>
ingress nginx-doc-7cc85c5899-gjghs 1/1 Running 0 44h 10.42.0.83 master <none> <none>
prometheus prometheus-node-exporter-6tl8t 1/1 Running 0 47h 192.168.1.3 agent-03 <none> <none>
ingress ingress-controller-nginx-ingress-controller-rqs8n 1/1 Running 5 47h 192.168.1.3 agent-03 <none> <none>
prometheus prometheus-kube-prometheus-operator-68fbcb6d67-8qsnf 1/1 Running 1 46h 10.42.2.52 agent-03 <none> <none>
ingress nginx-doc-7cc85c5899-b77j6 1/1 Running 0 43h 10.42.2.57 agent-03 <none> <none>
metallb-system speaker-sk4pz 1/1 Running 1 47h 192.168.1.3 agent-03 <none> <none>
In my pod list, agent-03 runs nginx-doc with a cluster IP while the MetalLB speaker uses the ethernet IP. Or does it depend on which service is running in the pod?
ingress nginx-doc-7cc85c5899-b77j6 1/1 Running 0 43h 10.42.2.57 agent-03 <none> <none>
metallb-system speaker-sk4pz 1/1 Running 1 47h 192.168.1.3 agent-03 <none> <none>
I can also see that the master has 2 nginx-doc pods running, which means that when I deploy 3 nginx-doc pods, one agent does not get any because they have been taken by the master. They are not divided equally.
If I have misconfigured something, which part do I need to fix?
Your pods get their IPs from the network plugin in use. These are mostly internal IPs.
There are different network plugins, and you can choose a CNI plugin as needed: https://kubernetes.io/docs/concepts/cluster-administration/networking/
Pods are exposed by a Service. There are different Service types: ClusterIP, NodePort, LoadBalancer. https://kubernetes.io/docs/concepts/services-networking/service/
in my pod's shows agent-03 run Nginx-doc use IP cluster while metal
use IP ethernet, or it depends on what service are running in pods?
This could be due to the Service type you are using, which would explain why that IP is different and is the ethernet one.
If your Service type is LoadBalancer backed by MetalLB, the Service is exposed using that IP, unlike the internal IPs that pods mostly have.
Run kubectl get svc -n <namespace name> to check.
and I can see master has 2 Nginx-doc pods running, which means when I
deploy 3 Nginx-doc one agent will not get any Nginx-doc because it has
been taken by the master. and it is not divided equally.
There is no guarantee of that; Kubernetes assigns pods to nodes based on a score.
You can read more about scoring here: https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/
If you want to pin your pod to a specific node (for example, a node with a GPU that the pod must run on), you can use:
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
A pod's IP address is provided by the CNI plugin from the range that was specified with --pod-network-cidr when the cluster was created.
Some CNI implementations can implement additional behavior.
In your particular case I believe that the pods in question are started with hostNetwork: true in their PodSpec, which gives them access to the host network, so they show the node's IP address.
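One way to verify this is to look at spec.hostNetwork for each pod. Here is a minimal sketch in Python that filters the JSON you would get from kubectl get pods -A -o json. The sample below is a hand-written, trimmed stand-in and not output from the asker's cluster, and the helper name host_network_pods is made up.

```python
import json


def host_network_pods(pod_list):
    """Return (namespace, name) for pods that set spec.hostNetwork to true."""
    return [
        (item["metadata"]["namespace"], item["metadata"]["name"])
        for item in pod_list.get("items", [])
        # The field is simply absent when a pod uses the pod network.
        if item.get("spec", {}).get("hostNetwork", False)
    ]


# Illustrative stand-in for `kubectl get pods -A -o json` (assumed values).
sample = json.loads('''
{"items": [
  {"metadata": {"namespace": "metallb-system", "name": "speaker-sk4pz"},
   "spec": {"hostNetwork": true}},
  {"metadata": {"namespace": "ingress", "name": "nginx-doc-7cc85c5899-b77j6"},
   "spec": {}}
]}
''')
print(host_network_pods(sample))
```

Pods returned by this filter share the node's network namespace, which would explain why they report the node's ethernet IP instead of an address from the pod range.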

Rook ceph broken on kubernetes?

Using Ceph v1.14.10, Rook v1.3.8 on k8s 1.16 on-premise. After 10 days without any trouble, we decided to drain some nodes. After that, none of the moved pods could attach to their PVs any more, and it looks like the Ceph cluster is broken:
My ConfigMap rook-ceph-mon-endpoints is referencing 2 missing mon pod IPs:
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.115.0.129:6789","10.115.0.4:6789","10.115.0.132:6789"]}]'
But
kubectl -n rook-ceph get pod -l app=rook-ceph-mon -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
rook-ceph-mon-e-56b849775-4g5wg 1/1 Running 0 6h42m 10.115.0.2 XXXX <none> <none>
rook-ceph-mon-h-fc486fb5c-8mvng 1/1 Running 0 6h42m 10.115.0.134 XXXX <none> <none>
rook-ceph-mon-i-65666fcff4-4ft49 1/1 Running 0 30h 10.115.0.132 XXXX <none> <none>
Is this normal, or must I run some kind of "reconciliation" task to update the ConfigMap with the new mon pod IPs?
(could be related to https://github.com/rook/rook/issues/2262)
I had to manually update:
secret rook-ceph-config
cm rook-ceph-mon-endpoints
cm rook-ceph-csi-config
As @travisn said:
The operator owns updating that configmap and secret. It's not expected to update them manually unless there is some disaster recovery situation as described at https://rook.github.io/docs/rook/v1.4/ceph-disaster-recovery.html.
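To spot this kind of drift before touching anything, the comparison made in the question can be scripted. This is a rough sketch in Python using the values from the question. The function name mon_endpoint_drift is made up, and it only mirrors the manual comparison of ConfigMap monitors against mon pod IPs; it is not something Rook provides.

```python
import json

# Value of csi-cluster-config-json as shown in the question.
csi_cluster_config_json = (
    '[{"clusterID":"rook-ceph","monitors":'
    '["10.115.0.129:6789","10.115.0.4:6789","10.115.0.132:6789"]}]'
)
# IP column of `kubectl -n rook-ceph get pod -l app=rook-ceph-mon -o wide`.
running_mon_ips = {"10.115.0.2", "10.115.0.134", "10.115.0.132"}


def mon_endpoint_drift(config_json, pod_ips):
    """Return (stale, missing): configured-but-gone and running-but-unlisted IPs."""
    configured = set()
    for cluster in json.loads(config_json):
        for endpoint in cluster["monitors"]:
            # Drop the ":6789" port suffix (IPv4 endpoints assumed).
            configured.add(endpoint.rsplit(":", 1)[0])
    return sorted(configured - pod_ips), sorted(pod_ips - configured)


stale, missing = mon_endpoint_drift(csi_cluster_config_json, running_mon_ips)
print("stale endpoints in ConfigMap:", stale)
print("running mons not in ConfigMap:", missing)
```

With the data above, two configured endpoints no longer correspond to a running mon pod, which matches what was observed.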

Interpod Communication

We have the following in GoogleCloud Kubernetes:
3 REST API Pods which take POSTs and send them to clients that are connected via Websocket.
If one of those Pods is posted on, we want to send this post to all other pods.
The question is: How / Where can we find the IPs of the other Pods?
You can find the pod IPs with the command below and try hitting them directly, but I suggest exposing a Service for this instead.
kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod1 1/1 Running 0 98m 10.42.0.16 worker3 <none> <none>
pod6 1/1 Running 0 87m 10.44.0.26 worker1 <none> <none>
To expose a service:
kubectl expose pod/pod1 -n test --name=svc --port=80 --target-port=80
kubectl run bb --image=busybox -n test -it --rm -- sh
If you don't see a command prompt, try pressing enter.
wget -O- svc:80
Connecting to svc:80 (10.101.174.245:80)
writing to stdout
<html><body><h1>It works!</h1></body></html>

Kubernetes v1.6.2 flannel not running on master node

I am running a 2-node cluster using Vagrant, configured with the kubeadm command. When I set up the cluster, flannel was running on all three nodes. Now I don't see flannel running on the master node, and because of this the overlay network is not working from the master node.
I used these YAML files to configure flannel:
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# kubectl get pods --all-namespaces -o wide |grep fla
kube-system kube-flannel-ds-0d3bn 2/2 Running 0 20m 192.168.15.102 node-01
kube-system kube-flannel-ds-86bzs 2/2 Running 0 20m 192.168.15.103 node-02
# k get nodes -o wide
NAME STATUS AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION
master01 Ready 26d v1.6.2 <none> CentOS Linux 7 (Core) 3.10.0-514.10.2.el7.x86_64
node-01 Ready 26d v1.6.2 <none> CentOS Linux 7 (Core) 3.10.0-514.10.2.el7.x86_64
node-02 Ready 26d v1.6.2 <none> CentOS Linux 7 (Core) 3.10.0-514.10.2.el7.x86_64
How can I start the flannel pod in my master node?
I see you are using RBAC; maybe the node does not have sufficient rights.
Try creating a clusterrolebinding with the necessary rights:
$ kubectl create clusterrolebinding nodeName --clusterrole=system:node --user=nodeName
Or you can use cluster-admin for testing.

How to fix weave-net CrashLoopBackOff for the second node?

I have 2 VM nodes. Both see each other either by hostname (through /etc/hosts) or by IP address. One has been provisioned with kubeadm as a master, the other as a worker node. Following the instructions (http://kubernetes.io/docs/getting-started-guides/kubeadm/), I have added weave-net. The list of pods looks like the following:
vagrant@vm-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-vm-master 1/1 Running 0 3m
kube-system kube-apiserver-vm-master 1/1 Running 0 5m
kube-system kube-controller-manager-vm-master 1/1 Running 0 4m
kube-system kube-discovery-982812725-x2j8y 1/1 Running 0 4m
kube-system kube-dns-2247936740-5pu0l 3/3 Running 0 4m
kube-system kube-proxy-amd64-ail86 1/1 Running 0 4m
kube-system kube-proxy-amd64-oxxnc 1/1 Running 0 2m
kube-system kube-scheduler-vm-master 1/1 Running 0 4m
kube-system kubernetes-dashboard-1655269645-0swts 1/1 Running 0 4m
kube-system weave-net-7euqt 2/2 Running 0 4m
kube-system weave-net-baao6 1/2 CrashLoopBackOff 2 2m
CrashLoopBackOff appears for each worker node connected. I have spent several hours playing with network interfaces, but it seems the network is fine. I have found a similar question, where the answer advised looking into the logs, with no follow-up. So, here are the logs:
vagrant@vm-master:~$ kubectl logs weave-net-baao6 -c weave --namespace=kube-system
2016-10-05 10:48:01.350290 I | error contacting APIServer: Get https://100.64.0.1:443/api/v1/nodes: dial tcp 100.64.0.1:443: getsockopt: connection refused; trying with blank env vars
2016-10-05 10:48:01.351122 I | error contacting APIServer: Get http://localhost:8080/api: dial tcp [::1]:8080: getsockopt: connection refused
Failed to get peers
What am I doing wrong? Where do I go from here?
I ran into the same issue. It seems Weave wants to connect to the Kubernetes cluster IP address, which is virtual. Run kubectl get svc to find the cluster IP. It should give you something like this:
$ kubectl get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 100.64.0.1 <none> 443/TCP 2d
Weave picks up this IP and tries to connect to it, but the worker nodes do not know anything about it. A simple route will solve this issue. On all your worker nodes, execute:
route add 100.64.0.1 gw <your real master IP>
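To confirm from a worker node whether the cluster IP is reachable before and after adding the route, a plain TCP connect test is enough. Here is a minimal sketch in Python. The helper name can_connect is made up, and 100.64.0.1:443 is simply the address from the log and the kubectl get svc output above.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", "no route to host" and timeouts.
        return False


if __name__ == "__main__":
    # Cluster IP of the kubernetes Service as shown above.
    reachable = can_connect("100.64.0.1", 443, timeout=1.0)
    print("API server cluster IP reachable:", reachable)
```

A False result on a worker matches the "connection refused" error in the weave log. It only tests TCP reachability, not whether the API server accepts the request.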
This happens with a single-node setup, too. I tried several things like reapplying the configuration and recreating it, but the most stable way at the moment is to perform a full teardown (as described in the docs) and bring the cluster up again.
I use these scripts for relaunching the cluster:
down.sh
#!/bin/bash
systemctl stop kubelet;
docker rm -f -v $(docker ps -q);
find /var/lib/kubelet | xargs -n 1 findmnt -n -t tmpfs -o TARGET -T | uniq | xargs -r umount -v;
rm -r -f /etc/kubernetes /var/lib/kubelet /var/lib/etcd;
up.sh
#!/bin/bash
systemctl start kubelet
kubeadm init
# kubectl taint nodes --all dedicated- # single node!
kubectl create -f https://git.io/weave-kube
edit: I would also give other Pod networks a try, like Calico, if this is a weave related issue
The most common causes for this may be:
- presence of a firewall (e.g. firewalld on CentOS)
- network configuration (e.g. default NAT interface on VirtualBox)
Currently kubeadm is still alpha, and this is one of the issues that has already been reported by many of the alpha testers. We are looking into fixing this by documenting the most common problems; this documentation is going to be ready closer to the beta version.
Right now there is a VirtualBox+Vagrant+Ansible reference implementation for Ubuntu and CentOS that provides solutions for firewall, SELinux and VirtualBox NAT issues.
/usr/local/bin/weave reset
was the fix for me. Hope it's useful. And yes, make sure SELinux is set to disabled and firewalld is not running (on Red Hat / CentOS releases).
kube-system weave-net-2vlvj 2/2 Running 3 11d
kube-system weave-net-42k6p 1/2 Running 3 11d
kube-system weave-net-wvsk5 2/2 Running 3 11d