How to restore an accidentally deleted kube-proxy DaemonSet in a Kubernetes cluster? - kubernetes

I accidentally deleted the kube-proxy DaemonSet by running: kubectl delete -n kube-system daemonset kube-proxy. It is supposed to run the kube-proxy pods in my cluster. What is the best way to restore it?

Kubernetes allows you to reinstall kube-proxy by running the following command, which installs the kube-proxy addon components via the API server.
$ kubeadm init phase addon kube-proxy --kubeconfig ~/.kube/config --apiserver-advertise-address <api-server-ip>
This will generate output like:
[addons] Applied essential addon: kube-proxy
Here --apiserver-advertise-address is the IP address the API server will advertise it's listening on; if not set, the default network interface will be used.
Hence kube-proxy will be reinstalled in the cluster by creating a DaemonSet and launching the pods.
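To confirm the addon is back, you can check that the DaemonSet exists and that a kube-proxy pod is running on each node; the k8s-app=kube-proxy label below is the one kubeadm's default manifest uses:
$ kubectl -n kube-system get daemonset kube-proxy
$ kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide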

The kube-proxy DaemonSet was created at cluster-creation time, so unless you have a backup to restore it from, you will need to write your own DaemonSet manifest.
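If you have access to another cluster bootstrapped the same way (or simply want a backup for next time), one approach is to export the live objects as YAML; the file names here are just examples, and note that kube-proxy also relies on its ConfigMap and ServiceAccount in kube-system:
$ kubectl -n kube-system get daemonset kube-proxy -o yaml > kube-proxy-daemonset.yaml
$ kubectl -n kube-system get configmap kube-proxy -o yaml > kube-proxy-configmap.yaml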

Related

Kubernetes Nginx Ingress controller Readiness Probe failed

I am trying to set up my very first Kubernetes cluster, and everything seemed fine until the nginx-ingress controller.
Here is my cluster information:
Nodes: three RHEL7 nodes and one RHEL8 node
Master is running on RHEL7
Kubernetes server version: 1.19.1
Networking used: flannel
coredns is running fine.
selinux and firewall are disabled on all nodes
Here are all my pods running in kube-system:
I then followed instructions on following page to install nginx ingress controller: https://docs.nginx.com/nginx-ingress-controller/installation/installation-with-manifests/
Instead of a Deployment, I decided to use a DaemonSet, since I am going to have only a few nodes in my Kubernetes cluster.
After following the instructions, pod on my RHEL8 is constantly failing with the following error:
Readiness probe failed: Get "http://10.244.3.2:8081/nginx-ready": dial tcp 10.244.3.2:8081: connect: connection refused
Back-off restarting failed container
Here is a screenshot showing that the RHEL7 pods are working just fine while the RHEL8 pod is failing:
All nodes are setup exactly the same way and there is no difference.
I am very new to Kubernetes and don't know much about its internals. Can someone please point me to how I can debug and fix this issue? I am really willing to learn from issues like this.
This is how I provisioned RHEL7 and RHEL8 nodes
Installed docker version: 19.03.12, build 48a66213fe
Disabled firewalld
Disabled swap
Disabled SELinux
Set net.bridge.bridge-nf-call-ip6tables = 1 and net.bridge.bridge-nf-call-iptables = 1 so that iptables can see bridged traffic (see the sysctl sketch after these steps)
Added hosts entry for all the nodes involved in Kubernetes cluster so that they can find each other without hitting DNS
Added the IP addresses of all nodes in the Kubernetes cluster to no_proxy in /etc/environment so traffic between them doesn't hit the corporate proxy
Verified the Docker cgroup driver to be "systemd" and NOT "cgroupfs"
Rebooted the server
Installed kubectl, kubeadm and kubelet as per the Kubernetes guide here: https://kubernetes.io/docs/tasks/tools/install-kubectl/
Started and enabled the kubelet service
Initialized the master by executing the following:
kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12
Applied the node-selector patch for mixed-OS scheduling
wget https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/flannel/l2bridge/manifests/node-selector-patch.yml
kubectl patch ds/kube-proxy --patch "$(cat node-selector-patch.yml)" -n=kube-system
Applied the flannel CNI
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Modified the net-conf.json section of kube-flannel.yml to use backend type "host-gw"
kubectl apply -f kube-flannel.yml
Applied the node-selector patch
kubectl patch ds/kube-flannel-ds-amd64 --patch "$(cat node-selector-patch.yml)" -n=kube-system
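For reference, the bridged-traffic sysctls from the step above are usually applied roughly like this (the file name under /etc/sysctl.d/ is an arbitrary choice):
$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sudo sysctl --system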
Thanks
According to the Kubernetes documentation, the list of supported host operating systems is as follows:
Ubuntu 16.04+
Debian 9+
CentOS 7
Red Hat Enterprise Linux (RHEL) 7
Fedora 25+
HypriotOS v1.0.1+
Flatcar Container Linux (tested with 2512.3.0)
This article mentioned that there are network issues on RHEL 8:
2020/02/11 Update: After installation, I keep facing a pod network issue: a deployed pod is unable to reach the external network, or pods deployed on different workers are unable to ping each other, even though all nodes (master, worker1 and worker2) show as Ready via kubectl get nodes. After checking through the kubernetes.io official website, I observed that the nftables backend is not compatible with the current kubeadm packages. Please refer to the following link in "Ensure iptables tooling does not use the nftables backend".
The simplest solution here is to reinstall the node with a supported operating system.
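If reinstalling is not an option right away, you can at least confirm whether the node's iptables tooling is using the nftables backend; the version string shows it (output below is illustrative):
$ iptables --version
iptables v1.8.4 (nf_tables)   <- nftables backend, known to break kube-proxy/kubeadm at this time
iptables v1.8.4 (legacy)      <- legacy backend, the one these Kubernetes versions expect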

Kubernetes: Replace Flannel with Calico

I am new to Kubernetes and I would like to try a different CNI.
In my current cluster I am using Flannel.
Now I would like to use Calico, but I cannot find a proper guide to clean up Flannel and install Calico.
Could you please point out the correct procedure?
Thanks
Calico provides a migration tool that performs a rolling update of the nodes in the cluster. At the end, you will have a fully-functional Calico cluster using VXLAN networking between pods.
From the documentation we have:
Procedure
1 - First, install Calico.
kubectl apply -f https://docs.projectcalico.org/v3.10/manifests/flannel-migration/calico.yaml
Then, install the migration controller to initiate the migration.
kubectl apply -f https://docs.projectcalico.org/v3.10/manifests/flannel-migration/migration-job.yaml
Once applied, you will see nodes begin to update one at a time.
2 - To monitor the migration, run the following command.
kubectl get jobs -n kube-system flannel-migration
The migration controller may be rescheduled several times during the migration when the node hosting it is upgraded. The installation is complete when the output of the above command shows 1/1 completions. For example:
NAME                COMPLETIONS   DURATION   AGE
flannel-migration   1/1           2m59s      5m9s
3 - After completion, delete the migration controller with the following command.
kubectl delete -f https://docs.projectcalico.org/v3.10/manifests/flannel-migration/migration-job.yaml
To know more about it: Migrating a cluster from flannel to Calico
This article describes how to migrate an existing Kubernetes cluster with flannel networking to use Calico networking.
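After the migration job completes, a quick sanity check is to verify that a calico-node pod is running on every node and that the flannel DaemonSet is gone; the k8s-app=calico-node label comes from the standard Calico manifest:
$ kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
$ kubectl -n kube-system get daemonset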

Something seems to be catching TCP traffic to pods

I'm trying to deploy Kubernetes with Calico (IPIP) using kubeadm. After the deployment is done, I'm deploying Calico using these manifests:
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
Before applying it, I'm editing CALICO_IPV4POOL_CIDR and setting it to 10.250.0.0/17, as well as using the command kubeadm init --pod-cidr 10.250.0.0/17.
After a few seconds the CoreDNS pods (for example, one with address 10.250.2.2) start restarting with the error: 10.250.2.2:8080 connection refused.
Now a bit of digging:
from any node in the cluster, ping 10.250.2.2 works and reaches the pod (tcpdump in the pod's net namespace shows it).
from a different pod (on a different node), curl 10.250.2.2:8080 works well
from any node, curl 10.250.2.2:8080 fails with connection refused
Because it's a CoreDNS pod, it listens on port 53 on both UDP and TCP, so I've tried netcat from the nodes:
nc 10.250.2.2 53 - connection refused
nc -u 10.250.2.2 53 - works
Now I've tcpdumped each interface on the source node for port 8080, and the curl to the CoreDNS pod doesn't even seem to leave the node... so, iptables?
I've also tried Weave, Canal and Flannel; all seem to have the same issue.
I've run out of ideas by now... any pointers, please?
This seems to be a problem with the Calico implementation; CoreDNS Pods depend on the CNI network Pods functioning correctly.
For a proper CNI network plugin implementation, you have to pass the --pod-network-cidr flag to the kubeadm init command and afterwards set the same value in the CALICO_IPV4POOL_CIDR parameter inside calico.yaml.
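As a sketch using the CIDR from the question, the two values simply have to match, for example:
$ kubeadm init --pod-network-cidr=10.250.0.0/17
and in calico.yaml, in the calico-node container's environment:
- name: CALICO_IPV4POOL_CIDR
  value: "10.250.0.0/17"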
Moreover, for a successful Pod network installation you have to apply some RBAC rules in order to grant sufficient permissions in line with general cluster security restrictions, as described in the official Kubernetes documentation:
For Calico to work correctly, you need to pass --pod-network-cidr=192.168.0.0/16 to kubeadm init or update the calico.yml file to match your Pod network. Note that Calico works on amd64 only.
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
In your case I would switch to the latest Calico version, or at least something newer than the v3.3 given in the example.
If you've verified that you ran the Pod network plugin installation properly, please update the question with your current environment setup and Kubernetes component versions along with their health statuses.

How to set kube-proxy settings using kubectl on AKS

I keep reading documentation that gives parameters for kube-proxy, but does not explain how or where these parameters are supposed to be set. I create my cluster using az aks create with the azure-cli program, then I get credentials and use kubectl. So far everything I've done has involved YAML for services, deployments and such, but I can't figure out where all this kube-proxy stuff fits in.
I've googled for days. I've opened question issues on github with AKS. I've asked on the kubernetes slack channel, but nobody has responded.
The kube-proxy on all your Kubernetes nodes runs as a Kubernetes DaemonSet and its configuration is stored in a Kubernetes ConfigMap. To make any changes or add/remove options, you will have to edit the kube-proxy DaemonSet or ConfigMap in the kube-system namespace.
$ kubectl -n kube-system edit daemonset kube-proxy
or
$ kubectl -n kube-system edit configmap kube-proxy
For a reference on the kube-proxy command line options, see the kube-proxy documentation here.
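Keep in mind that kube-proxy only reads this configuration at startup, so after editing the ConfigMap the pods usually have to be recreated for the change to take effect; one way to do that (assuming kubectl 1.15+ for rollout restart):
$ kubectl -n kube-system rollout restart daemonset kube-proxy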

How to restart kube-proxy in Kubernetes 1.2 (GKE)

As of Kubernetes 1.2, kube-proxy is now a pod running in the kube-system namespace.
The old init script /etc/init.d/kube-proxy has been removed.
Aside from simply resetting the GCE instance, is there a good way to restart kube-proxy?
I just added an annotation to change the proxy mode, and I need to restart kube-proxy for my change to take effect.
The kube-proxy is run as an addon pod, meaning the Kubelet will automatically restart it if it goes away. This means you can restart the kube-proxy pod by simply deleting it:
$ kubectl delete pod --namespace=kube-system kube-proxy-${NODE_NAME}
Where ${NODE_NAME} is the node you want to restart the proxy on (this assumes a default configuration; otherwise kubectl get pods --namespace=kube-system should include the list of kube-proxy pods).
If the restarted kube-proxy is missing your annotation change, you may need to update the manifest file, usually found in /etc/kubernetes/manifests on the node.
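If the exact pod name for a node isn't obvious, listing the pods with -o wide shows which node each kube-proxy pod is scheduled on, for example:
$ kubectl get pods --namespace=kube-system -o wide | grep kube-proxy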