I've just installed a kubernetes testinstallation directly on my fedora laptop using this guide.
After starting kube2sky I've noticed I can't connect to the kubernetes api since the certificates are required. kubernetes-ro is deprecated and no longer available on my machine, so I get the following errors:
E0627 15:58:07.145720 1 reflector.go:133] Failed to list *api.Service: Get https://10.254.0.1:443/api/v1beta3/services: x509: failed to load system roots and no roots provided
E0627 15:58:07.146844 1 reflector.go:133] Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1beta3/endpoints: x509: failed to load system roots and no roots provided
How can I setup the certificates?
This has been a common problem for folks that aren't running on setups that use salt to automatically configure system secrets on the master node (as GCE does). This has been fixed at head and should be fixed in the next release.
In the mean time, you can manually create a secret for the DNS service that contains a kubeconfig file for kube2sky to connect to the master. You can see how this is done on GCE by looking at the create-kubeconfig-secret function in kube-addons.sh (when called with the username "system:dns"). The name of the resulting secret should be token-system-dns.
Related
I have deployed DigitalOcean Chatwoot. Now I need to add the certificate created in the cluster config so that it can be accessed using HTTPS.
I am following this documentation, to manage TLS Certificates in the Kubernetes Cluster.
After following the steps, mentioned in the documentation, I got these
But, as mentioned in the last point of the documentation, I need to add the certificate keys in the Kubernetes controller manager. To do that I used this command
kube-controller-manager --cluster-signing-cert-file ca.pem --cluster-signing-key-file ca-key.pem
But I am getting the below error :
I0422 16:02:37.871541 1032467 serving.go:348] Generated self-signed cert in-memory
W0422 16:02:37.871592 1032467 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
W0422 16:02:37.871598 1032467 client_config.go:622] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
Can anyone please tell me how can I configure the cluster to provide security (HTTPS), to the cluster?
I am trying to implement CI / CD using GitLab + Terraform to K8S Cluster and K8S Control Plane (Master node) was setup on CentOS
However, Pipeline job fails with the following error
Error: Failed to get existing workspaces: Get "https://192.xx.xx.xx/api/v1/namespaces/default/secrets?labelSelector=tfstate%3Dtrue": dial tcp 192.xx.xx.xx:443: i/o timeout
From the error mentioned above (default/secrets?labelSelector=tfstate%3Dtrue), I assume the error is related to missing 'terraform secret' on default namespace
Example (Terraform secret taken from my Windows)
PS C:\> kubectl get secret
NAME TYPE DATA AGE
default-token-7mzv6 kubernetes.io/service-account-token 3 27d
tfstate-default-state Opaque 1 15h
However, I am not sure which process would create 'tfsecret' or should we create it manually ?
Kindly let me know if I my understanding is wrong and had I missed anything else
EDIT
The issue mentioned above occurred because existing Gitlab-runner was on a different subnet (eg 172.xx.xx.xx instead of 192.xx.xx.xx)
I was asked to use a different Gitlab-runner which runs on the same subnet and now it throws the following error
Error: Failed to get existing workspaces: Get "https://192.xx.xx.xx:6443/api/v1/namespaces/default/secrets?labelSelector=tfstate%3Dtrue": x509: certificate signed by unknown authority
Now, I am bit confused whether the certificate-issue is between GitLab-Runner and Gitlab-Server or Gitlab-Server and K8S Cluster or something else
You have configured Kubernetes as the remote state backend for your Terraform configuration. The error is, that the backend is trying to query existing secrets to determine what workspaces are configured. The x509: certificate signed by unknown authority indicates, that the KUBECONFIG the remote state backend uses does not match the CA of the API server you're connecting to.
If the runners are K8s pods themselves, make sure you provide a KUBECONFIG that matches your target cluster and that the remote state does not configure itself as in-cluster by reading the service account token every K8s pod has - which in most cases will only work for the cluster the pod is running on.
You don't provide enough information to be more specific. But big picture, you have to configure the state backend, and any provider that connect to K8s. Theoretically, the state backend secrets and the K8s resources do not have to be on the same cluster. Meaning, you may have to have different configuration for state backend and K8s providers.
I have the following scenario in the lab and would like to see if its possible to recover. The cluster is broken but very expected since I was testing how far I could go with breaking the cluster and still be able to recover.
Env:
Kubernetes 1.16.3
Kubespray
I was experimenting a bit and don't have any data on this cluster but I am still very curious if it's possible to recover. I have a healthy 3 node etcd cluster with the original configuration (all namespaces, workloads, configmaps etc). I don't have the original SSL certs for the control plane.
I removed all nodes from the cluster (kubeadm reset). I have original manifests and kubelet config and try to re-init master nodes. It is quite more successful than I thought it would be but not where I want it to be.
After successful kubeadm init, the kubelet and control plane containers start successfully but the corresponding pods are not created. I am able to use the kube API with kubectl and see the nodes, namespaces, deployments, etc.
In the kube-system namespace all daemonsets still exist but the pods won't start with the following message:
49m Warning FailedCreate daemonset/kube-proxy Error creating: Timeout: request did not complete within requested timeout
The kubelet logs the following re control plane pods
Jul 21 22:30:02 k8s-master-4 kubelet[13791]: E0721 22:30:02.088787 13791 kubelet.go:1664] Failed creating a mirror pod for "kube-scheduler-k8s-master-4_kube-system(3e128801ef687b022f6c8ae175c9c56d)": Timeout: request did not complete within requested timeout
Jul 21 22:30:53 k8s-master-4 kubelet[13791]: E0721 22:30:53.089517 13791 kubelet.go:1664] Failed creating a mirror pod for "kube-controller-manager-k8s-master-4_kube-system(da5cfae13814fa171a320ce0605de98f)": Timeout: request did not complete within requested timeout
During kubeadm reset/init process I already have some steps so I can get to where I am now (delete serviceaccounts to reset the tokens, delete some configmaps (kuebadm etc))
My question is - is it possible to recover the control plane without the certs. And if its complicated but still possible process I would still like to know.
All help appreciated
Henro
is it possible to recover the control plane without the certs.
Yes, should be able to. The certs 🔏 are required but they don't have to be the very same ones that you created the cluster initially with. All the certificates including the CA can be rotated across the board. The kubelet even supports certificate auto-rotation. The configurations need to match everywhere though. Meaning the CA needs to be the same that created the CSRs and cert keys/certs need to be created from the same CSRs. 🔑
Also, all the components need to use the same CA and be able to authenticate with the API server (kube-controller-manager, kube-scheduler, etc) 🔐. I'm not entirely sure about the logs that you are seeing but it looks like the kube-controller-manager and kube-scheduler are not able to authenticate and join the cluster. So I would take a look at their cert configurations:
/etc/kubernetes/kube-controller-manager.conf
/etc/kubernetes/kube-scheduler.conf
Also, you would find every PKI component that you need to verify under /etc/kubernetes/pki
✌️
I'm trying to install Ceph using Helm on Kunbernetes following this tutorial
install ceph
Probably the problem is that I installed trow registry before because as soon as I run the helm step
helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml
I get this error in ceph namespace
Error creating: Internal error occurred: failed calling webhook "validator.trow.io": Post https://trow.kube-public.svc:443/validate-image?timeout=30s: dial tcp 10.102.137.73:443: connect: connection refused
How can I solve this?
Apparently you are right with the presumption, I have a few concerns about this issue.
Trow registry manager controls the images that run in the cluster via implementing Admission webhooks that validate every request before pulling image, and as far as I can see Docker Hub images are not accepted by default.
The default policy will allow all images local to the Trow registry to
be used, plus Kubernetes system images and the Trow images themselves.
All other images are denied by default, including Docker Hub images.
Due to the fact that during Trow installation procedure you might require to distribute and approve certificate in order to establish secure HTTPS connection from target node to Trow server, I would suggest to check certificate presence on the node where you run ceph-helm chart as described in Trow documentation.
The other option you can run Trow registry manager with disabled TLS over HTTP, as was guided in the installation instruction.
This command should help to get it cleaned.
kubectl delete ValidatingWebhookConfiguration -n rook-ceph rook-ceph-webhook
I'm a bit confused by this, because it was working for days without issue.
I use to be able to join nodes to my cluster withoout issue. I would run the below on the master node:
kubeadm init .....
After that, it would generate a join command and token to issue to the other nodes I want to join. Something like this:
kubeadm join --token 99385f.7b6e7e515416a041 192.168.122.100
I would run this on the nodes, and they would join without issue. The next morning, all of a sudden this stopped working. This is what I see when I run the command now:
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for
production clusters.
[preflight] Running pre-flight checks
[tokens] Validating provided token
[discovery] Created cluster info discovery client, requesting info from "http://192.168.122.100:9898/cluster-info/v1/?token-id=99385f"
[discovery] Cluster info object received, verifying signature using given token
[discovery] Cluster info signature and contents are valid, will use API endpoints [https://192.168.122.100:6443]
[bootstrap] Trying to connect to endpoint https://192.168.122.100:6443
[bootstrap] Detected server version: v1.6.0-rc.1
[bootstrap] Successfully established connection with endpoint "https://192.168.122.100:6443"
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
failed to request signed certificate from the API server [cannot create certificate signing request: the server could not find the requested resource]
It seems like the node I'm trying to join does successfully connect to the API server on the master node, but for some reason, it now fails to request a certificate.
Any thoughts?
To me
sudo service kubelet restart
didn't work.
What I did was the following:
Copied from master node contents of /etc/kubernetes/* into slave nodes at same location /etc/kubernetes
I tried again "kubeadm join ..." command. This time the nodes joined the cluster without any complaint.
I think this is a temporary hack, but worked!
ok, I just stop and started kubelet on the master node as shown below, and things started working again:
sudo service kubelet stop
sudo service kubelet start
EDIT:
This only seemed to work on time for me.