Kubernetes Multi Master setup failed - kubernetes

I am currently trying to setup multi master setup on enterprise environment.
OS: RHEL 7.5
Kubernetes Versions: 1.22.1
Documentation followed to setup master node & LB nodes:
https://www.lisenet.com/2021/install-and-configure-a-multi-master-ha-kubernetes-cluster-with-kubeadm-haproxy-and-keepalived-on-centos-7/
While running kubeadm init command we are getting the error and stopped installation.
Can you please help us to fix the issues
enter code hereI0917 11:05:08.005343 26417 waitcontrolplane.go:89] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
couldn't initialize a Kubernetes cluster

Related

Failed to start minikube on Monterey

I tried to install minikube via below steps on my Monterey:
1.brew install docker
2.download virtualbox 6.1 from https://www.virtualbox.org/ and install
3.curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64 sudo install minikube-darwin-amd64 /usr/local/bin/minikube
(it seems homebrew no longer supports virtualbox and minikube?)
then, I tried to start minikube:
minikube start --driver=virtualbox
but got below errors:
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID'
stderr:
W0707 14:15:11.219566 7860 initconfiguration.go:120] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/var/run/cri-dockerd.sock". Please update your configuration!
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
💡 Suggestion: Check output of 'journalctl -xeu kubelet', try passing --extra-config=kubelet.cgroup-driver=systemd to minikube start
🍿 Related issue: https://github.com/kubernetes/minikube/issues/4172
any advice?
Below worked for me today
minikube delete --all
minikube start --force-systemd
Source: https://github.com/kubernetes/minikube/issues/14561

While creating kubernetes cluster with kubeadm with loadbalancer dns and port in aws getting Error

I'm trying to create high availability cluster for that I did following procedure
created Ec2 instance and added that to loadbalancer
By using that loadbalancer port and ip I gave following command.
sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --upload-certs
but master node is not created
getting this error
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
when I give systemctl status kubelet
I'm getting ip not found error
can someone please help

After K8s Master reboot, apiserver throws error “x509: certificate has expired or is not yet valid”

I have multimaster kubernetes cluster setup using Kubespray.
I ran an application using helm on it which increased the load on master drastically. this made master almost inaccessible. After that I shutdown masters one by one and increased RAM and CPU on them. But after rebooting, both apiserver and scheduler pods are failing to start. They are in "CreateContainerError" state.
APIserver is logging lot of errors with the message x509: certificate has expired or is not yet valid.
There are other threads for this error and most of them suggest to fix apiserver or cluster certificates. But this is newly setup cluster and certificates are valid till 2020.
Here are some details of my cluster.
CentOS Linux release: 7.6.1810 (Core)
Docker version: 18.06.1-ce, build e68fc7a
Kubernetes Version
Client Version: v1.13.2
Server Version: v1.13.2
It is very much possible that during shutdown/reboot, docker containers for apiserver and scheduler got exited with non zero exit status like 255. I sugggest you to first remove all containers with non zero exit status using docker rm command. Do this on all masters, rather on worker nodes as well.
By default, kubernetes starts new pods for all services (apiserver, shceduler, controller-manager, dns, pod network etc.) after a reboot. You can see newly started containers for these services using docker commands
example:
docker ps -a | grep "kube-apiserver" OR
docker ps -a | grep "kube-scheduler"
after removing exited containers, I believe, new pods for apiserver and scheduler should run properly in the cluster and should be in "Running" status.

Unable to see join nodes in Kubernetes master

This is my worker node:
root#ivu:~# kubeadm join 10.16.70.174:6443 --token hl36mu.0uptj0rp3x1lfw6n --discovery-token-ca-cert-hash sha256:daac28160d160f938b82b8c720cfc91dd9e6988d743306f3aecb42e4fb114f19 --ignore-preflight-errors=swap
[preflight] Running pre-flight checks.
[WARNING Swap]: running with swap on is not supported. Please disable swap
[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[discovery] Trying to connect to API Server "10.16.70.174:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.16.70.174:6443"
[discovery] Requesting info from "https://10.16.70.174:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.16.70.174:6443"
[discovery] Successfully established connection with API Server "10.16.70.174:6443"
This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
While checking in master nodes using command kubectl get nodes, I can only able to see master:
ivum01#ivum01-HP-Pro-3330-SFF:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ivum01-hp-pro-3330-sff Ready master 36m v1.10.0
For question answer:
docker kubelet kubeadm kubectl installed fine
kubectl get node can not see the current added node; of cause kubectl get pods --all-namespaces has no result for this node;
docker which in current node has no report for kubeadm command(means no k8s images pull, no running container for that)
must import is kubelet not running in work node
run kubelet output:
Failed to get system container stats for "/user.slice/user-1000.slice/session-1.scope": failed to get cgroup stats for "/user.slice/user-1000.slice/session-1.scope": failed to get container info for "/user.slice/user-1000.slice/session-1.scope": unknown container "/user.slice/user-1000.slice/session-1.scope"
same as this issue said
tear down and reset cluster(kubeadm reset) and redo that has no problem in my case.
I had this problem and it was solved by ensuring that cgroup driver on the worker nodes also were set properly.
check with:
docker info | grep -i cgroup
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
set it with:
sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
then restart kubelet service and rejoin the cluster:
systemctl daemon-reload
systemctl restart kubelet
kubeadm reset
kubeadm join ...
Info from docs: https://kubernetes.io/docs/tasks/tools/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node

"Mirror pod not available" errors - kubernetes HA setup

I have a multi-master kubernetes v1.1.3 setup. I use the hyperkube binary to run kubelet on the master and have the scheduler and controller manager running in pods with the podmaster acting as the elector. The master's are loadbalanced using HAproxy, and the kubelet on the minions uses that loadbalanced IP as their apiserver endpoint. I am using docker v1.8.2.
I am not having any errors in use, I can deploy pods that are running without issue across the minions and everything seems happy.
However, I am getting constant error spam (every 10 seconds) from the kubelet on the master:
[err] [kubelet] E0201 21:10:54 kubelet.go:1361] Mirror pod not available
[err] [kubelet] E0201 21:10:54 kubelet.go:1361] Mirror pod not available
[err] [kubelet] E0201 21:10:54 kubelet.go:1361] Mirror pod not available
[err] [kubelet] E0201 21:10:54 kubelet.go:1361] Mirror pod not available
[err] [kubelet] E0201 21:11:04 kubelet.go:1361] Mirror pod not available
I am concerned that this is symptomatic of something set up improperly that I am missing. Is this anything to be concerned of? If not, is there anything I could configure to silence the error spam?
edit: here are the options I am running kubelet with:
kubelet on master (mostly standard):
hyperkube kubelet --v=2 --config=/etc/kubernetes/manifests --kubeconfig=<path-to-config> --port=10250 --address=0.0.0.0 --cadvisor-port=4194 --allow-privileged=true
all pod manifests: http://pastebin.com/MJfQz78r
It seems to be something related to deprecated feature static pod - http://kubernetes.io/v1.1/docs/admin/static-pods.html
Source code - https://github.com/kubernetes/kubernetes/blob/v1.1.4/pkg/kubelet/kubelet.go#L1361
IMHO you can ignore it. I didn't find this feature in 1.2alpha7 source code. Maybe you can decrease your log level --v=0 https://github.com/golang/glog
Thank you for your manifests.