Kube-apiserver complains about "remote error: bad certificate" - kubernetes

I reinstalled some nodes and a master. Now on the master I am getting:
Sep 15 04:53:58 master kube-apiserver[803]: I0915 04:53:58.413581 803 logs.go:41] http: TLS handshake error from $ip:54337: remote error: bad certificate
Where $ip is the IP of one of the nodes.
So I likely need to delete or recreate the certificates. Where are they located? Are there any recommended commands to recreate or remove them, or to copy them from node to master (or vice versa)? Whatever gets me past this error message...

Take a look through the Creating Certificates section of authentication.md. It walks you through the certificates that you need to create and how to pass them to the system components, and you should be able to use that to re-generate certificates for your cluster.
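For illustration, here is a rough openssl sketch of the kind of steps that guide walks through: create a cluster CA, then issue a serving certificate whose SAN covers the master's address. MASTER_IP, the validity period and all file names below are placeholders, not your cluster's actual layout:
MASTER_IP=192.168.1.10                               # placeholder: your apiserver address
openssl genrsa -out ca.key 2048                      # new cluster CA key
openssl req -x509 -new -nodes -key ca.key -subj "/CN=kube-ca" -days 365 -out ca.crt
openssl genrsa -out server.key 2048                  # apiserver serving key
openssl req -new -key server.key -subj "/CN=kube-apiserver" -out server.csr
printf "subjectAltName=IP:%s" "$MASTER_IP" > san.cnf # address the clients connect to
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -extfile san.cnf -days 365 -out server.crt
A "bad certificate" handshake error on the master's side usually means a node is presenting a client certificate that the CA the apiserver trusts (its --client-ca-file) no longer recognizes, so however you regenerate, the node credentials and the master's trusted CA have to come from the same place.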

Related

Hyperledger fabric v1.4 certificate renewal gives peer.blocksprovider warning at peer

Entities: Org1 with 2 peers (peer0 & peer1), 1 orderer, 1 intermediate CA.
Both peers have joined a single channel.
I won't be able to add files/logs or code, as that isn't allowed. I hope that's understood.
The network was initially built with peer0 + CA + orderer, and peer1 was later added to Org1.
We recently renewed the certificates before their expiry date. peer0 and peer1 both accept transactions, but peer1 also throws a warning/error:
[peer.blocksprovider] func1 -> WARN 4c87 Encountered an error reading from deliver stream: rpc error: code = Canceled desc = context canceled channel=mobileid orderer-address=orderer.xyz.com
What could be the cause of this error (peer.blocksprovider)? Could there be a mistake in the certificate renewal? If so, which part could it be?
This issue was due to the gossip protocol being disabled within the org. Both peers were acting as leaders in this case, and peer1 was failing to disseminate blocks to peer0.
The same does not happen on peer0, which makes me second-guess this explanation.
I still have no idea why this issue was triggered after the certificate renewal.
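For context, intra-org block dissemination depends on the peer's gossip leader settings; a minimal sketch of the two knobs involved, shown as environment-variable overrides of core.yaml (the bootstrap endpoint is a hypothetical example, not taken from the question):
CORE_PEER_GOSSIP_USELEADERELECTION=true              # let the org's peers elect one leader dynamically
CORE_PEER_GOSSIP_ORGLEADER=false                     # do not pin this peer as a static leader
CORE_PEER_GOSSIP_BOOTSTRAP=peer0.org1.xyz.com:7051   # hypothetical: another Org1 peer to gossip with
If both peers are instead configured as static leaders (useLeaderElection false, orgLeader true on each), each one pulls blocks from the orderer itself and nothing is disseminated between them, which matches the behaviour described above.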

What does "certificate is valid for 10.96.0.1, 10.198.74.71, not 127.0.0.1" mean?

What does this error mean? I have Argo workflows working on my development computer, but when I deploy it, this is what I see. Where do I need to start reading to fix it?
ERROR
Post https://127.0.0.1:6443/apis/argoproj.io/v1alpha1/namespaces/argo/workflows: x509: certificate is valid for 10.96.0.1, 10.198.74.71, not 127.0.0.1
For anyone who comes across a strange error like this, it is (again) an RBAC problem. To fix this error, I updated my kubeconfig to reflect the current clusters and roles.
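Concretely, the message says the client is reaching the API server at 127.0.0.1, an address that is not among the IPs the server certificate was issued for. A hedged sketch of pointing the kubeconfig at an address the certificate does cover (the cluster/context/user names and the CA path are placeholders; 10.198.74.71 and port 6443 come from the error above):
kubectl config set-cluster my-cluster --server=https://10.198.74.71:6443 \
  --certificate-authority=/path/to/ca.crt --embed-certs=true
kubectl config set-context my-context --cluster=my-cluster --user=my-user --namespace=argo
kubectl config use-context my-context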

kubeadm join failing on Pi 3 model B. cannot create certificate signing request: the server could not find the requested resource

I have set up a cluster of four Raspberry Pi 3 Model B boards and installed HypriotOS.
The master node is running fine; however, when I issue the join command on one of the other Pi 3 nodes, it fails with the following error:
HypriotOS/armv7: root@black-pearl_1 in ~
$ kubeadm join --token=f5ffb9.0fefbf6e0f289a61 192.168.1.20 --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[tokens] Validating provided token
[discovery] Created cluster info discovery client, requesting info from "http://192.168.1.20:9898/cluster-info/v1/?token-id=f5ffb9"
[discovery] Cluster info object received, verifying signature using given token
[discovery] Cluster info signature and contents are valid, will use API endpoints [https://192.168.1.20:6443]
[bootstrap] Trying to connect to endpoint https://192.168.1.20:6443
[bootstrap] Detected server version: v1.6.0
[bootstrap] Successfully established connection with endpoint "https://192.168.1.20:6443"
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
failed to request signed certificate from the API server [cannot create certificate signing request: the server could not find the requested resource]
I have tried the solutions from "kubeadm join failing. Unable to request signed cert" and neither of them works; I get the same error on any of the other three Pi nodes when running the join.
To replicate this, just follow this setup procedure:
https://blog.hypriot.com/post/setup-kubernetes-raspberry-pi-cluster/
Can anyone help me understand what is going wrong and how to fix it? I cannot find a solution in the Kubernetes troubleshooting docs.
To respond to Karun's question about using the --skip-preflight-checks flag: when running the join command without it, it simply fails with the following:
HypriotOS/armv7: root@black-pearl_2 in ~
$ kubeadm join --token=f5ffb9.0fefbf6e0f289a61 192.168.1.20
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] The system verification failed. Printing the output from the verification:
OS: Linux
KERNEL_VERSION: 4.4.50-hypriotos-v7+
CONFIG_NAMESPACES: enabled
CONFIG_NET_NS: enabled
CONFIG_PID_NS: enabled
CONFIG_IPC_NS: enabled
CONFIG_UTS_NS: enabled
CONFIG_CGROUPS: enabled
CONFIG_CGROUP_CPUACCT: enabled
CONFIG_CGROUP_DEVICE: enabled
CONFIG_CGROUP_FREEZER: enabled
CONFIG_CGROUP_SCHED: enabled
CONFIG_CPUSETS: enabled
CONFIG_MEMCG: enabled
CONFIG_INET: enabled
CONFIG_EXT4_FS: enabled
CONFIG_PROC_FS: enabled
CONFIG_NETFILTER_XT_TARGET_REDIRECT: enabled (as module)
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled (as module)
CONFIG_OVERLAY_FS: enabled (as module)
CONFIG_AUFS_FS: not set - Required for aufs.
CONFIG_BLK_DEV_DM: enabled (as module)
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
DOCKER_VERSION: 17.03.0-ce
[preflight] Some fatal errors occurred:
unsupported docker version: 17.03.0-ce
hostname "black-pearl_2" must match the regex [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)* (e.g. 'example.com')
Port 10250 is in use
/etc/kubernetes/manifests is not empty
/var/lib/kubelet is not empty
/etc/kubernetes/kubelet.conf already exists
[preflight] If you know what you are doing, you can skip pre-flight checks with `--skip-preflight-checks`
The --skip-preflight-checks flag was also required to get the master node running. I therefore assume this issue must be ARM-specific, since the suggested fix of copying
/etc/kubernetes/*
from the master also fails to solve the issue.
Is there any reason why you are passing --skip-preflight-checks as a flag in your kubeadm join command? Is anything failing there? Maybe that can lead to some clues.
I have not worked on a Raspberry Pi, but I faced this sort of issue in an Ubuntu environment. For some reason, on the slave node (where I was firing the kubeadm join command), the /etc/kubernetes folder wasn't there. I just copied the contents of /etc/kubernetes/* from the master node into the same location, /etc/kubernetes, on the slave nodes.
After this, run kubeadm join with --skip-preflight-checks explicitly.
This time kubeadm got the required resources and the join went through.
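A hedged sketch of that copy step, run on one of the Pi nodes and assuming root SSH access to the master (192.168.1.20 and the token are the values from the question):
mkdir -p /etc/kubernetes
scp -r root@192.168.1.20:/etc/kubernetes/* /etc/kubernetes/
kubeadm join --token=f5ffb9.0fefbf6e0f289a61 192.168.1.20 --skip-preflight-checks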

generated serviceaccount token is rejected by kube-apiserver

I have one cluster working successfully, without any problems, and I've tried to make a copy of it. It's basically working, except for one issue: the token generated by the apiserver is not valid, with this error message:
6 handlers.go:37] Unable to authenticate the request due to an error: crypto/rsa: verification error
I have the API server started with the following parameters:
kube-apiserver --address=0.0.0.0 \
  --admission_control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota \
  --service-cluster-ip-range=10.116.0.0/23 \
  --client_ca_file=/srv/kubernetes/ca.crt \
  --basic_auth_file=/srv/kubernetes/basic_auth.csv \
  --authorization-mode=AlwaysAllow \
  --tls_cert_file=/srv/kubernetes/server.cert \
  --tls_private_key_file=/srv/kubernetes/server.key \
  --secure_port=6443 \
  --token_auth_file=/srv/kubernetes/known_tokens.csv \
  --v=2 \
  --cors_allowed_origins=.* \
  --etcd-config=/etc/kubernetes/etcd.config \
  --allow_privileged=False
I think I'm missing something but can't find what exactly. Any help will be appreciated!
So, apparently the wrong server.key was being used by the controller manager.
According to the Kubernetes documentation, the token is generated by the controller manager.
While copying all of my configuration, I had to change the IP address, and because of that I also had to change the certificate. But the controller-manager had started with the "old" certificate, so after the change it was signing tokens with the wrong server.key, which the apiserver could not verify.
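A hedged check if you suspect the certificate and key drifted apart during the copy: compare the RSA moduli of the pair the apiserver is configured with (the paths are the ones from the question). The two hashes must match, and the controller manager has to be restarted so it re-reads the same server.key it signs tokens with:
openssl rsa -noout -modulus -in /srv/kubernetes/server.key | openssl md5
openssl x509 -noout -modulus -in /srv/kubernetes/server.cert | openssl md5
If they differ, tokens signed with the old key will fail verification against the new certificate, which is consistent with the crypto/rsa error above.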
You can see below the flags I use for the API server; this works for me. Check this:
--insecure-bind-address=${OS_PRIVATE_IPV4}
--bind-address=${OS_PRIVATE_IPV4}
--tls-cert-file=/srv/kubernetes/server.cert
--tls-private-key-file=/srv/kubernetes/server.key
--client-ca-file=/srv/kubernetes/ca.crt
--admission_control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ResourceQuota
--token-auth-file=/srv/kubernetes/known_tokens.csv
--basic-auth-file=/srv/kubernetes/basic_auth.csv
--etcd_servers=http://${OS_PRIVATE_IPV4}:4001
--service-cluster-ip-range=10.10.0.0/16
--logtostderr=true
--v=5

Not able to connect to cluster. Facing Certificate signed by unknown authority

I am not sure whether what I am trying to do is possible or whether this is the correct way to do it.
One of my colleagues spun up a Kubernetes GCE cluster (with 1 master and 4 minions) in a project that is shared with me with owner access.
After setup, he shared his ~/.kubernetes_auth keys along with .kubecfg.crt, .kubecfg.ca.crt and .kubecfg.key. I copied all of them into my home folder and set up the Kubernetes workspace.
I also set the project name as the default project in gcloud config, and now I can connect to the master and slaves using 'gcutil ssh --zone us-central1-b kubernetes-master'.
But when I try to list the existing pods using 'cluster/kubecfg.sh list pods',
I see
"F1017 21:05:31.037148 18021 kubecfg.go:422] Got request error: Get https://107.178.208.109/api/v1beta1/pods?namespace=default: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "ChangeMe")
I tried to debug it on my side but failed to come to any conclusion. Any sort of clue will be helpful.
You can also copy the cert files off of the master again. They are located in /usr/share/nginx on the master.
It is probably due to a feature that has not been implemented yet; see this issue:
https://github.com/GoogleCloudPlatform/kubernetes/issues/1886
You can copy the files from /usr/share/nginx/... on the master
into your home dir and try again.
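A hedged sketch of that copy, run from your workstation. It assumes you can scp from the master (you may need to go through gcutil ssh and adjust permissions first), and the file names under /usr/share/nginx are illustrative; 107.178.208.109 is the master address from the error above:
scp "root@107.178.208.109:/usr/share/nginx/*.crt" ~/
scp "root@107.178.208.109:/usr/share/nginx/*.key" ~/
Then copy them over the ~/.kubecfg.* files mentioned in the question and run cluster/kubecfg.sh list pods again.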
I figured out a workaround: set the -insecure_skip_tls_verify option
In kubecfg.sh, change the code near the bottom to
else
  auth_config=(
    "-insecure_skip_tls_verify"
  )
fi
Obviously this is insecure, and you are putting yourself at risk of a man-in-the-middle attack, etc.