rke --debug up --config cluster.yml
fails with health checks on etcd hosts with error:
DEBU[0281] [etcd] failed to check health for etcd host [x.x.x.x]: failed to get /health for host [x.x.x.x]: Get "https://x.x.x.x:2379/health": remote error: tls: bad certificate
Checking etcd healthchecks
for endpoint in $(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5"); do
echo "Validating connection to ${endpoint}/health";
curl -w "\n" --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) "${endpoint}/health";
done
Running on that master node
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
Validating connection to https://x.x.x.x:2379/health
{"health":"true"}
you can run it manually and see if it responds correctly
curl -w "\n" --cacert /etc/kubernetes/ssl/kube-ca.pem --cert /etc/kubernetes/ssl/kube-etcd-x-x-x-x.pem --key /etc/kubernetes/ssl/kube-etcd-x-x-x-x-key.pem https://x.x.x.x:2379/health
Checking my self signed certificates hashes
# md5sum /etc/kubernetes/ssl/kube-ca.pem
f5b358e771f8ae8495c703d09578eb3b /etc/kubernetes/ssl/kube-ca.pem
# for key in $(cat /home/kube/cluster.rkestate | jq -r '.desiredState.certificatesBundle | keys[]'); do echo $(cat /home/kube/cluster.rkestate | jq -r --arg key $key '.desiredState.certificatesBundle[$key].certificatePEM' | sed '$ d' | md5sum) $key; done | grep kube-ca
f5b358e771f8ae8495c703d09578eb3b - kube-ca
versions on my master node
Debian GNU/Linux 10
rke version v1.3.1
docker version Version: 20.10.8
kubectl v1.21.5
v1.21.5-rancher1-1
I think my cluster.rkestate gone bad, are there any other locations where rke tool checks for certificates?
Currently I cannot do anything with this production cluster, and want to avoid downtime. I experimented on testing cluster different scenarios, I could do as last resort to recreate the cluster from scratch, but maybe I can still fix it...
rke remove && rke up
rke util get-state-file helped me to reconstruct bad cluster.rkestate file
and I was able to successfully rke up and add new master node to fix whole situation.
The problem can be solved by doing the following steps:
Remove kube_config_cluster.yml file where you run rke up command. (Since some data are missing in your K8s nodes)
Remove cluster.rkestate file.
Re-run rke up command.
I have a k8s cluster mounted in a Amazon EC2 instance, and i want configure the CI with GitLab. To do that, GitLab requested me the Kubernetes API URL.
I ran kubectl cluster-info to get the requested information and i can see 3 rows:
Kubernetes master https://10.10.1.253:6443
coredns https://10.10.1.253:6443
kubernetes-dashboard https://10.10.1.253:6443
I suppose that need the Kubernetes master URL but, is a private IP. How i can expose the API correctly ?
Any ideas ?
For better security keep the IPs of the kubernetes master nodes private and use LoadBalancer provided by AWS to expose the Kubernetes API Server. You could also configure TLS termination at the LoadBalancer.
use the kubectl config view to get the server address, it will looks like server: https://172.26.2.101:6443.
First you need to define your public ip of the master node or the load balancer if any as a DNS Alternative. You can do this by,
remove current apiserver certificates
sudo rm /etc/kubernetes/pki/apiserver.*
generate new certificates
sudo kubeadm init phase certs apiserver --apiserver-cert-extra-sans=<public_ip>
Then, you have to capture your admin key, cert and the ca cert from the .kube/config file
client-key-data:
echo -n "LS0...Cg==" | base64 -d > admin.key
client-certificate-data:
echo -n "LS0...Cg==" | base64 -d > admin.crt
certificate-authority-data:
echo -n "LS0...Cg==" | base64 -d > ca.crt
Now you can request your api through curl, example below to request pods info
curl https://<public_ip>:6443/api/v1/pods \
--key admin.key \
--cert admin.crt \
--cacert ca.crt
And of course make sure you allowed required ports
I want to setup an etcd cluster runnin on multiple nodes. I have running 2 unbuntu 18.04 machines running on a Hyper-V terminal.
I followed this guide on the official kubernetes site:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/setup-ha-etcd-with-kubeadm/
Therefore, I changed the several scripts and executed this scripts on HOST0 and HOST1
export HOST0=192.168.101.90
export HOST1=192.168.101.91
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/
ETCDHOSTS=(${HOST0} ${HOST1} ${HOST2})
NAMES=("infra0" "infra1")
for i in "${!ETCDHOSTS[#]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
apiVersion: "kubeadm.k8s.io/v1beta2"
kind: ClusterConfiguration
etcd:
local:
serverCertSANs:
- "${HOST}"
peerCertSANs:
- "${HOST}"
extraArgs:
initial-cluster: ${NAMES[0]}=https://${ETCDHOSTS[0]}:2380,${NAMES[1]}=https://${ETCDHOSTS[1]}:2380
initial-cluster-state: new
name: ${NAME}
listen-peer-urls: https://${HOST}:2380
listen-client-urls: https://${HOST}:2379
advertise-client-urls: https://${HOST}:2379
initial-advertise-peer-urls: https://${HOST}:2380
EOF
done
After that, I executed this command on HOST0
kubeadm init phase certs etcd-ca
I created all the nessecary on HOST0
# cleanup non-reusable certificates
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.k
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
# No need to move the certs because they are for HOST0
# clean up certs that should not be copied off this host
find /tmp/${HOST1} -name ca.key -type f -delete
After that, I copied the files to the second ETCTD node (HOST1). Before that I created a root user mbesystem
USER=mbesystem
HOST=${HOST1}
scp -r /tmp/${HOST}/* ${USER}#${HOST}:
ssh ${USER}#${HOST}
USER#HOST $ sudo -Es
root#HOST $ chown -R root:root pki
root#HOST $ mv pki /etc/kubernetes/
I'll check all the files were there on HOST0 and HOST1.
On HOST0 I started the etcd cluster using:
kubeadm init phase etcd local --config=/tmp/192.168.101.90/kubeadmcfg.yaml
On Host1 I started using:
kubeadm init phase etcd local --config=/home/mbesystem/kubeadmcfg.yaml
After I executed:
docker run --rm -it \
--net host \
-v /etc/kubernetes:/etc/kubernetes k8s.gcr.io/etcd:3.4.3-0 etcdctl \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://192.168.101.90:2379 endpoint health --cluster
I discovered my cluster is not healty, I'll received a connection refused.
I can't figure out what went wrong. Any help will be appreciated.
I've looked into it, reproduced what was in the link that you provided: Kubernetes.io: Setup ha etcd with kubeadm and managed to make it work.
Here is some explanation/troubleshooting steps/tips etc.
First of all etcd should be configured with odd number of nodes. What I mean by that is it should be created as 3 or 5 nodes cluster.
Why an odd number of cluster members?
An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state. For a cluster with n members, quorum is (n/2)+1. For any odd-sized cluster, adding one node will always increase the number of nodes necessary for quorum. Although adding a node to an odd-sized cluster appears better since there are more machines, the fault tolerance is worse since exactly the same number of nodes may fail without losing quorum but there are more nodes that can fail. If the cluster is in a state where it can't tolerate any more failures, adding a node before removing nodes is dangerous because if the new node fails to register with the cluster (e.g., the address is misconfigured), quorum will be permanently lost.
-- Github.com: etcd documentation
Additionally here are some troubleshooting steps:
Check if docker is running You can check it by running command (on systemd installed os):
$ systemctl show --property ActiveState docker
Check if etcd container is running properly with:
$ sudo docker ps
Check logs of etcd container if it's running with:
$ sudo docker logs ID_OF_CONTAINER
How I've managed to make it work:
Assuming 2 Ubuntu 18.04 servers with IP addresses of:
10.156.0.15 and name: etcd-1
10.156.0.16 and name: etcd-2
Additionally:
SSH keys configured for root access
DNS resolution working for both of the machines ($ ping etcd-1)
Steps:
Pre-configuration before the official guide.
I did all of the below configuration with the usage of root account
Configure the kubelet to be a service manager for etcd.
Create configuration files for kubeadm.
Generate the certificate authority.
Create certificates for each member
Copy certificates and kubeadm configs.
Create the static pod manifests.
Check the cluster health.
Pre-configuration before the official guide
Pre-configuration of this machines was done with this StackOverflow post with Ansible playbooks:
Stackoverflow.com: 3 kubernetes clusters 1 base on local machine
You can also follow official documentation: Kubernetes.io: Install kubeadm
Configure the kubelet to be a service manager for etcd.
Run below commands on etcd-1 and etcd-2 with root account.
cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
# Replace "systemd" with the cgroup driver of your container runtime. The default value in the kubelet is "cgroupfs".
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd
Restart=always
EOF
$ systemctl daemon-reload
$ systemctl restart kubelet
Create configuration files for kubeadm.
Create your configuration file on your etcd-1 node.
Here is modified script that will create kubeadmcfg.yaml for only 2 nodes:
export HOST0=10.156.0.15
export HOST1=10.156.0.16
# Create temp directories to store files that will end up on other hosts.
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/
ETCDHOSTS=(${HOST0} ${HOST1})
NAMES=("etcd-1" "etcd-2")
for i in "${!ETCDHOSTS[#]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
apiVersion: "kubeadm.k8s.io/v1beta2"
kind: ClusterConfiguration
etcd:
local:
serverCertSANs:
- "${HOST}"
peerCertSANs:
- "${HOST}"
extraArgs:
initial-cluster: ${NAMES[0]}=https://${ETCDHOSTS[0]}:2380,${NAMES[1]}=https://${ETCDHOSTS[1]}:2380
initial-cluster-state: new
name: ${NAME}
listen-peer-urls: https://${HOST}:2380
listen-client-urls: https://${HOST}:2379
advertise-client-urls: https://${HOST}:2379
initial-advertise-peer-urls: https://${HOST}:2380
EOF
done
Take a special look on:
export HOSTX on the top of the script. Paste the IP addresses of your machines there.
NAMES=("etcd-1" "etcd-2"). Paste the names of your machines (hostname) there.
Run this script from root account and check if it created files in /tmp/IP_ADDRESS directory.
Generate the certificate authority
Run below command from root account on your etcd-1 node:
$ kubeadm init phase certs etcd-ca
Create certificates for each member
Below is a part of the script which is responsible for creating certificates for each member of etcd cluster. Please modify the HOST0 and HOST1 variables.
#!/bin/bash
HOST0=10.156.0.15
HOST1=10.156.0.16
kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
# No need to move the certs because they are for HOST0
Run above script from root account and check if there is pki directory inside /tmp/10.156.0.16/.
There shouldn't be any pki directory inside /tmp/10.156.0.15/ as it's already in place.
Copy certificates and kubeadm configs.
Copy your kubeadmcfg.yaml of etcd-1 from /tmp/10.156.0.15 to root directory with:
$ mv /tmp/10.156.0.15/kubeadmcfg.yaml /root/
Copy the content of /tmp/10.156.0.16 from your etcd-1 to your etcd-2 node to /root/ directory:
$ scp -r /tmp/10.156.0.16/* root#10.156.0.16:
After that check if files copied correctly, have correct permissions and copy pki folder to /etc/kubernetes/ with command on etcd-2:
$ mv /root/pki /etc/kubernetes/
Create the static pod manifests.
Run below command on etcd-1 and etcd-2:
$ kubeadm init phase etcd local --config=/root/kubeadmcfg.yaml
All should be running now.
Check the cluster health.
Run below command to check cluster health on etcd-1.
docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes k8s.gcr.io/etcd:3.4.3-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints https://10.156.0.15:2379 endpoint health --cluster
Modify:
--endpoints https://10.156.0.15:2379 with correct IP address of etcd-1
It should give you a message like this:
https://10.156.0.15:2379 is healthy: successfully committed proposal: took = 26.308693ms
https://10.156.0.16:2379 is healthy: successfully committed proposal: took = 26.614373ms
Above message concludes that the etcd is working correctly but please be aware of an even number of nodes.
Please let me know if you have any questions to that.
My kubernetes PKI expired (API server to be exact) and I can't find a way to renew it. The error I get is
May 27 08:43:51 node1 kubelet[8751]: I0527 08:43:51.922595 8751 server.go:417] Version: v1.14.2
May 27 08:43:51 node1 kubelet[8751]: I0527 08:43:51.922784 8751 plugins.go:103] No cloud provider specified.
May 27 08:43:51 node1 kubelet[8751]: I0527 08:43:51.922800 8751 server.go:754] Client rotation is on, will bootstrap in background
May 27 08:43:51 node1 kubelet[8751]: E0527 08:43:51.925859 8751 bootstrap.go:264] Part of the existing bootstrap client certificate is expired: 2019-05-24 13:24:42 +0000 UTC
May 27 08:43:51 node1 kubelet[8751]: F0527 08:43:51.925894 8751 server.go:265] failed to run Kubelet: unable to load bootstrap
kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
The documentation on https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/ describes how to renew but it only works if the API server is not expired. I have tried to do a
kubeadm alpha cert renew all
and do a reboot but that just made the entire cluster fail so I did a rollback to a snapshot (my cluster is running on VMware).
The cluster is running and all containers seem to work but I can't access it via kubectl so I can't really deploy or query.
My kubernetes version is 1.14.2.
So the solution was to (first a backup)
$ cd /etc/kubernetes/pki/
$ mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} ~/
$ kubeadm init phase certs all --apiserver-advertise-address <IP>
$ cd /etc/kubernetes/
$ mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} ~/
$ kubeadm init phase kubeconfig all
$ reboot
then
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
that did the job for me and thanks for your hints :)
This topic is also discussed in:
https://github.com/kubernetes/kubeadm/issues/581
after 1.15 kubeadm upgrade automatically will renewal the certificates for you!
also 1.15 added a command to check cert expiration in kubeadm
Kubernetes: expired certificate
Kubernetes v1.15 provides docs for "Certificate Management with kubeadm":
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
Check certificate expiration:
kubeadm alpha certs check-expiration
Automatic certificate renewal:
kubeadm renews all the certificates during control plane upgrade.
Manual certificate renewal:
You can renew your certificates manually at any time with the kubeadm alpha certs renew command.
This command performs the renewal using CA (or front-proxy-CA) certificate and key stored in /etc/kubernetes/pki.
Overall for Kubernetes v1.14 I find this procedure the most helpful:
https://stackoverflow.com/a/56334732/1147487
I was using Kubernetes v15.1 and updated my certificates as explained above, but I still got the same error.
The /etc/kubernetes/kubelet.conf was still referring to the expired/old "client-certificate-data".
After some research I found out that kubeadm is not updating the /etc/kubernetes/kubelet.conf file if the certificate-renew was not set to true. So please be aware of a bug of kubeadm below version 1.17 (https://github.com/kubernetes/kubeadm/issues/1753).
kubeadm only upgrades if the cluster upgrade was done with certificate-renewal=true. So I manually had to delete the /etc/kubernetes/kubelet.conf and regenerated it with kubeadm init phase kubeconfig kubelet which finally fixed my problem.
Try to do cert renewal via kubeadm init phase certs command.
You can check certs expiration via the following command:
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text
openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -text
First, ensure that you have most recent backup of k8s certificates inventory /etc/kubernetes/pki/*.
Delete apiserver.* and apiserver-kubelet-client.* cert files in /etc/kubernetes/pki/ directory.
Spawn a new certificates via kubeadm init phase certs command:
sudo kubeadm init phase certs apiserver
sudo kubeadm init phase certs apiserver-kubelet-client
Restart kubelet and docker daemons:
sudo systemctl restart docker; sudo systemctl restart kubelet
You can find more related information in the official K8s documentation.
This will update all certs under /etc/kubernetes/ssl
kubeadm alpha certs renew all --config=/etc/kubernetes/kubeadm-config.yaml
and do this to restart server commpenont:
kill -s SIGHUP $(pidof kube-apiserver)
kill -s SIGHUP $(pidof kube-controller-manager)
kill -s SIGHUP $(pidof kube-scheduler)
[root#nrchbs-slp4115 ~]# kubectl get apiservices |egrep metrics
v1beta1.metrics.k8s.io kube-system/metrics-server True 125m
[root#nrchbs-slp4115 ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 20d
metrics-server ClusterIP 10.99.2.11 <none> 443/TCP 125m
[root#nrchbs-slp4115 ~]# kubectl get ep -n kube-system
NAME ENDPOINTS AGE
kube-controller-manager <none> 20d
kube-dns 10.244.0.5:53,10.244.0.6:53,10.244.0.5:53 + 3 more... 20d
kube-scheduler <none> 20d
metrics-server 10.244.2.97:443 125m
[root#nrchbs-slp4115 ~]#
To help anyone else with Multi-Master setup as I was searching for the answer after the first master has been updated on the second master I did this I found this from another question:
kubeadm only upgrades if the cluster upgrade was done with certificate-renewal=true. So I manually had to delete the /etc/kubernetes/kubelet.conf and regenerated it with kubeadm init phase kubeconfig kubelet which finally fixed my problem.
I use a config.yaml to configure the Masters so for me, the answer was:
sudo -i
mkdir -p ~/k8s_backup/etcd
cd /etc/kubernetes/pki/
mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} ~/k8s_backup
cd /etc/kubernetes/pki/etcd
mv {healthcheck-client.crt,healthcheck-client.key,peer.crt,peer.key,server.crt,server.key} ~/k8s_backup/etcd/
kubeadm init phase certs all --ignore-preflight-errors=all --config /etc/kubernetes/config.yaml
cd /etc/kubernetes
mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} ~/k8s_backup
kubeadm init phase kubeconfig all --config /etc/kubernetes/config.yaml --ignore-preflight-errors=all
For good measure I reboot
shutdown now -r
The best solution is as Kim Nielsen wrote.
$ cd /etc/kubernetes/pki/
$ mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} ~/
$ kubeadm init phase certs all --apiserver-advertise-address <IP>
$ cd /etc/kubernetes/
$ mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} ~/
$ kubeadm init phase kubeconfig all
$ reboot
With the next command, you can check when the new certifications will be expired:
$ kubeadm alpha certs check-expiration --config=/etc/kubernetes/kubeadm-config.yaml
or
$ kubeadm certs check-expiration --config=/etc/kubernetes/kubeadm-config.yaml
However, if you have more than one master, you need to copy these files on them.
Log in to the second master and do (backup):
$ cd /etc/kubernetes/pki/
$ mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} ~/
$ cd /etc/kubernetes/
$ mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} ~/
Then log in to the first master (where you created new certificates) and do the next commands (node2 should be changed with IP address of second master machine):
$ rsync /etc/kubernetes/pki/*.crt -e ssh root#node2:/etc/kubernetes/pki/
$ rsync /etc/kubernetes/pki/*.key -e ssh root#node2:/etc/kubernetes/pki/
$ rsync /etc/kubernetes/*.conf -e ssh root#node2:/etc/kubernetes/
In my case, I've got same issue after certificate renewal. My cluster was built using Kubespray. My kubelet stopped working and it was saying that I do not have /etc/kubernetes/bootstrap-kubelet.conf file so I looked what this config does.
--bootstrap-kubeconfig string
| Path to a kubeconfig file that will be used to get client certificate for kubelet. If the file specified by --kubeconfig does not exist, the bootstrap kubeconfig is used to request a client certificate from the API server. On success, a kubeconfig file referencing the generated client certificate and key is written to the path specified by --kubeconfig. The client certificate and key file will be stored in the directory pointed by --cert-dir.
I understood that this file might not be needed.
A note that I renewed k8s 1.19 with:
kubeadm alpha certs renew apiserver-kubelet-client
kubeadm alpha certs renew apiserver
kubeadm alpha certs renew front-proxy-client
... and that was not sufficient.
The solution was
cp -r /etc/kubernetes /etc/kubernetes.backup
kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > /etc/kubernetes/controller-manager.conf
kubeadm alpha kubeconfig user --client-name system:kube-scheduler > /etc/kubernetes/scheduler.conf
kubeadm alpha kubeconfig user --client-name system:node:YOUR_MASTER_HOSTNAME_IS_HERE --org system:nodes > /etc/kubernetes/kubelet.conf
kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
I am able to solve this issue with below steps:
openssl req -new -key <existing.key> -sub "/CN=system:node:<HOST_NAME>/O=system:nodes" -out new.csr
<existing.key> - use it from kubelet.conf
<HOST_NAME> - hostname in lower case ( refer expired certificate with command openssl ex: openssl x509 -in old.pem -text -noout)
openssl x509 -req -in new.csr -CA <ca.crt> -CAkey <ca.key> -CAcreateserial -out new.crt -days 365
<ca.crt> - from master node
<ca.key> - from
new.crt - is the renewed certificate replace with expired certificate.
Once replaced , restart kubelet and docker or any other container service.
kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) >/etc/kubernetes/kubelet.conf
Renew the kubelet.conf helps me solve this issue.
Running under Ubuntu I used kubeadm init to setup my cluster (master node) and copied over the /etc/kubernetes/admin.conf $HOME/.kube/config and all was well when using kubectl.
However after a reboot my master node has had an IP address change which is not the same as what is in $HOME/.kube/config so now I can no longer connect kubectl
So how do I regenerate the admin.conf now that I have a new IP address? Running kubeadm init will just kill everything which is not what I want.
I found this solution on the internet and it works for me:
systemctl stop kubelet docker
cd /etc/
mv kubernetes kubernetes-backup
mv /var/lib/kubelet /var/lib/kubelet-backup
mkdir -p kubernetes
cp -r kubernetes-backup/pki kubernetes
rm kubernetes/pki/{apiserver.*,etcd/peer.*}
systemctl start docker
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd
#Run "kubeadm reset" on all nodes if was this error "error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists"
cp kubernetes/admin.conf ~/.kube/config
kubectl get nodes --sort-by=.metadata.creationTimestamp
kubectl delete node $(kubectl get nodes -o jsonpath='{.items[(#.status.conditions[0].status=="Unknown")].metadata.name}')
kubectl get pods --all-namespaces
After These, Join your Slaves to Master.
Reference: https://medium.com/#juniarto.samsudin/ip-address-changes-in-kubernetes-master-node-11527b867e88
The following command can be used to regenerate admin.conf
kubeadm alpha phase kubeconfig admin --apiserver-advertise-address <new_ip>
However, if you use an IP instead of a hostname, your API-server certificate will be invalid. So, either regenerate your certs ( kubeadm alpha phase certs renew apiserver ), use hostnames instead of IPs or add the insecure --insecure-skip-tls-verify flag when using kubectl
You do not want to use kubeadm reset. That will reset everything and you would have to start configuring your cluster again.
Well, in your scenario, please have a look on the steps below:
nano /etc/hosts (update your new IP against YOUR_HOSTNAME)
nano /etc/kubernetes/config (configuration settings related to your cluster) here in this file look for the following params and update accordingly
KUBE_MASTER="--master=http://YOUR_HOSTNAME:8080"
KUBE_ETCD_SERVERS="--etcd-servers=http://YOUR_HOSTNAME:2379" #2379 is default port
nano /etc/etcd/etcd.conf (conf related to etcd)
KUBE_ETCD_SERVERS="--etcd-servers=http://YOUR_HOSTNAME/WHERE_EVER_ETCD_HOSTED:2379"
2379 is default port for etcd. and you can have multiple etcd servers defined here comma separated
Restart kubelet, apiserver, etcd services.
It is good to use hostname instead of IP to avoid such scenarios.
Hope it helps!