Istio Pods Not Coming Up

I installed Istio 1.8.3 via the Rancher UI a while back; the Istio pods and ingress gateway pods have been up and running, and my application is served by Istio.
Recently, we upgraded the Kubernetes cluster from 1.21 to 1.22, and then from 1.22 to 1.23.
Previously, once we restarted the kubelet, the Istio pods used to come up with no issues.
Now, because of a few issues, we rebooted the node and the Istio pods got restarted. They are in the Running state, but the readiness probe is failing.
The error I was able to find is:
failed to list CRDs: the server could not find the requested resource
Below are the full logs of the istiod pod.
stream logs failed container "discovery" in pod "istiod-5fbc9568cd-qgqkk" is waiting to start: ContainerCreating for istio-system/istiod-5fbc9568cd-qgqkk (discovery)
2022-06-27T05:35:32.772949Z info FLAG: --log_rotate_max_age="30"
2022-06-27T05:35:32.772952Z info FLAG: --log_rotate_max_backups="1000"
2022-06-27T05:35:32.772955Z info FLAG: --log_rotate_max_size="104857600"
2022-06-27T05:35:32.772958Z info FLAG: --log_stacktrace_level="default:none"
2022-06-27T05:35:32.772963Z info FLAG: --log_target="[stdout]"
2022-06-27T05:35:32.772971Z info FLAG: --mcpInitialConnWindowSize="1048576"
2022-06-27T05:35:32.772974Z info FLAG: --mcpInitialWindowSize="1048576"
2022-06-27T05:35:32.772977Z info FLAG: --mcpMaxMsgSize="4194304"
2022-06-27T05:35:32.772980Z info FLAG: --meshConfig="./etc/istio/config/mesh"
2022-06-27T05:35:32.772982Z info FLAG: --monitoringAddr=":15014"
2022-06-27T05:35:32.772985Z info FLAG: --namespace="istio-system"
2022-06-27T05:35:32.772988Z info FLAG: --networksConfig="/etc/istio/config/meshNetworks"
2022-06-27T05:35:32.772999Z info FLAG: --plugins="[authn,authz,health]"
2022-06-27T05:35:32.773002Z info FLAG: --profile="true"
2022-06-27T05:35:32.773005Z info FLAG: --registries="[Kubernetes]"
2022-06-27T05:35:32.773008Z info FLAG: --resync="1m0s"
2022-06-27T05:35:32.773011Z info FLAG: --secureGRPCAddr=":15012"
2022-06-27T05:35:32.773013Z info FLAG: --tlsCertFile=""
2022-06-27T05:35:32.773016Z info FLAG: --tlsKeyFile=""
2022-06-27T05:35:32.773018Z info FLAG: --trust-domain=""
2022-06-27T05:35:32.801976Z info klog Config not found: /var/run/secrets/remote/config[]
2022-06-27T05:35:32.803516Z info initializing mesh configuration ./etc/istio/config/mesh
2022-06-27T05:35:32.804499Z info mesh configuration: {
    "proxyListenPort": 15001,
    "connectTimeout": "10s",
    "protocolDetectionTimeout": "0s",
    "ingressClass": "istio",
    "ingressService": "istio-ingressgateway",
    "ingressControllerMode": "STRICT",
    "enableTracing": true,
    "defaultConfig": {
        "configPath": "./etc/istio/proxy",
        "binaryPath": "/usr/local/bin/envoy",
        "serviceCluster": "istio-proxy",
        "drainDuration": "45s",
        "parentShutdownDuration": "60s",
        "discoveryAddress": "istiod.istio-system.svc:15012",
        "proxyAdminPort": 15000,
        "controlPlaneAuthPolicy": "MUTUAL_TLS",
        "statNameLength": 189,
        "concurrency": 2,
        "tracing": {
            "zipkin": {
                "address": "zipkin.istio-system:9411"
            }
        },
        "envoyAccessLogService": {},
        "envoyMetricsService": {},
        "proxyMetadata": {
            "DNS_AGENT": ""
        },
        "statusPort": 15020,
        "terminationDrainDuration": "5s"
    },
    "outboundTrafficPolicy": {
        "mode": "ALLOW_ANY"
    },
    "enableAutoMtls": true,
    "trustDomain": "cluster.local",
    "trustDomainAliases": [],
    "defaultServiceExportTo": ["*"],
    "defaultVirtualServiceExportTo": ["*"],
    "defaultDestinationRuleExportTo": ["*"],
    "rootNamespace": "istio-system",
    "localityLbSetting": {
        "enabled": true
    },
    "dnsRefreshRate": "5s",
    "certificates": [],
    "thriftConfig": {},
    "serviceSettings": [],
    "enablePrometheusMerge": true
}
2022-06-27T05:35:32.804516Z info version: 1.8.3-e282a1f927086cc046b967f0171840e238a9aa8c-Clean
2022-06-27T05:35:32.804699Z info flags:
2022-06-27T05:35:32.804706Z info initializing mesh networks
2022-06-27T05:35:32.804877Z info mesh networks configuration: {
    "networks": {}
}
2022-06-27T05:35:32.804938Z info initializing mesh handlers
2022-06-27T05:35:32.804949Z info initializing controllers
2022-06-27T05:35:32.804952Z info No certificates specified, skipping K8S DNS certificate controller
2022-06-27T05:35:32.814002Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:35:33.816596Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:35:35.819157Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:35:39.821510Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:35:47.823675Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:36:03.827023Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:36:35.829441Z error kube failed to list CRDs: the server could not find the requested resource
2022-06-27T05:37:35.831758Z error kube failed to list CRDs: the server could not find the requested resource

Upgrading Istio Pilot and Istio Ingress Gateway from 1.8.3 to 1.10.2 will work.
https://github.com/istio/istio/issues/34665

Istio 1.8.x is too old for Kubernetes 1.23. Kubernetes 1.22 removed the apiextensions.k8s.io/v1beta1 CRD API that Istio 1.8's istiod uses to list CRDs, which is why it logs failed to list CRDs: the server could not find the requested resource. Refer to the Istio documentation for supported Kubernetes/Istio version combinations and upgrade Istio.
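Two concrete steps follow from this; a minimal sketch for an istioctl-managed install (upgrading the Rancher chart is the equivalent path if Istio was installed that way):
# On Kubernetes 1.22+ only apiextensions.k8s.io/v1 is served;
# the v1beta1 CRD API that Istio 1.8 queries is gone.
kubectl api-versions | grep apiextensions
# Upgrade the control plane one minor version at a time,
# using the istioctl binary that matches each target version.
istioctl upgrade   # 1.8.3 -> 1.9.x, then repeat for 1.9.x -> 1.10.2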

Related

How to install Eclipse-che in azure Kubernetes cluster

I'm trying to install Eclipse Che by following this blog: https://che.eclipseprojects.io/2022/07/25/#karatkep-installing-eclipse-che-on-aks.html,
yet after following all the steps I'm not able to successfully install Eclipse Che.
1) After running this command:
kubectl logs -l app.kubernetes.io/component=che-operator -n eclipse-che -f
these are the errors I'm facing:
logs: Waited for 1.034843163s due to client-side throttling, not priority and fairness, request: GET:https://10.1.0.1:443/apis/discovery.k8s.io/v1?timeout=32s
time="2022-09-12T14:08:29Z" level=info msg="Successfully reconciled."
2) The che-gateway pod is failing:
che-gateway-7d54ccdd59-bblw6 3/4 CrashLoopBackOff 18 (2m51s ago) 70m
Description: the oauth-proxy container keeps crashing (CrashLoopBackOff).
Logs of the oauth-proxy container:
#invalid configuration:
missing setting: login-url
missing setting: redeem-url
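As a first debugging step, it can help to inspect which flags the oauth-proxy container was actually started with; a sketch (the component label and container name are assumptions based on the pod listing above and may differ in your install):
# Show the args the oauth-proxy container was started with
kubectl get pod -n eclipse-che -l app.kubernetes.io/component=che-gateway \
  -o jsonpath='{.items[0].spec.containers[?(@.name=="oauth-proxy")].args}'
# Logs of just the crashed container's previous run
kubectl logs -n eclipse-che deploy/che-gateway -c oauth-proxy --previous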

Linkerd inbound port annotation leads to "Failed to bind inbound listener"

We are using Linkerd 2.11.1 on Azure AKS Kubernetes. Amongst others, there is a Deployment using an Alpine Linux image containing Apache/mod_php/PHP8 serving an API. HTTPS is terminated by Traefik v2 with cert-manager, so incoming traffic to the APIs is on port 80. The Linkerd proxy container is injected as a sidecar.
Recently I saw that the API containers return 504 errors during a short period of time when doing a rolling deployment. In the sidecar's log, I found the following:
[ 0.000590s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[ 0.001062s] INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[ 0.001078s] INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[ 0.001081s] INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[ 0.001083s] INFO ThreadId(01) linkerd2_proxy: Tap interface on 0.0.0.0:4190
[ 0.001085s] INFO ThreadId(01) linkerd2_proxy: Local identity is default.my-api.serviceaccount.identity.linkerd.cluster.local
[ 0.001088s] INFO ThreadId(01) linkerd2_proxy: Identity verified via linkerd-identity-headless.linkerd.svc.cluster.local:8080 (linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.001090s] INFO ThreadId(01) linkerd2_proxy: Destinations resolved via linkerd-dst-headless.linkerd.svc.cluster.local:8086 (linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.014676s] INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity: default.my-api.serviceaccount.identity.linkerd.cluster.local
[ 3674.769855s] INFO ThreadId(01) inbound:server{port=80}: linkerd_app_inbound::detect: Handling connection as opaque timeout=linkerd_proxy_http::version::Version protocol detection timed out after 10s
My guess is that this detection leads to the 504 errors somehow. However, if I add the linkerd inbound port annotation to the pod template (terraform syntax):
resource "kubernetes_deployment" "my_api" {
metadata {
name = "my-api"
namespace = "my-api"
labels = {
app = "my-api"
}
}
spec {
replicas = 20
selector {
match_labels = {
app = "my-api"
}
}
template {
metadata {
labels = {
app = "my-api"
}
annotations = {
"config.linkerd.io/inbound-port" = "80"
}
}
I get the following:
time="2022-03-01T14:56:44Z" level=info msg="Found pre-existing key: /var/run/linkerd/identity/end-entity/key.p8"
time="2022-03-01T14:56:44Z" level=info msg="Found pre-existing CSR: /var/run/linkerd/identity/end-entity/csr.der"
[ 0.000547s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
thread 'main' panicked at 'Failed to bind inbound listener: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }', /github/workspace/linkerd/app/src/lib.rs:195:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Can somebody tell me why it fails to bind the inbound listener?
Any help is much appreciated,
thanks,
Pascal
Found it: Kubernetes sends asynchronous requests to shut down the pods and to stop routing traffic to them. If the pod shuts down faster than it is removed from the endpoint lists, it can receive requests while already dead.
To fix this, I added a preStop lifecycle hook to the application container:
lifecycle {
  pre_stop {
    exec {
      command = ["/bin/sh", "-c", "sleep 5"]
    }
  }
}
and the following annotation to the pod template:
annotations = {
  "config.alpha.linkerd.io/proxy-wait-before-exit-seconds" = "10"
}
Documented here :
https://linkerd.io/2.11/tasks/graceful-shutdown/
and here :
https://blog.gruntwork.io/delaying-shutdown-to-wait-for-pod-deletion-propagation-445f779a8304
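For readers not using Terraform, the same two changes in plain Kubernetes YAML look roughly like this (a sketch; the Deployment name and image are placeholders, the annotation and hook are the ones above):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
      annotations:
        # keep the Linkerd proxy alive while the app finishes draining
        config.alpha.linkerd.io/proxy-wait-before-exit-seconds: "10"
    spec:
      containers:
        - name: my-api
          image: my-api:latest  # placeholder
          lifecycle:
            preStop:
              exec:
                # give endpoint removal time to propagate before shutdown
                command: ["/bin/sh", "-c", "sleep 5"]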
annotations = {
  "config.linkerd.io/inbound-port" = "80"
}
I don't think you want this setting. Linkerd will transparently proxy connections without you setting anything.
This setting configures Linkerd's proxy to try to listen on port 80. This would likely conflict with your web server's port configuration; but the specific error you're hitting is that the Linkerd proxy does not run as root and so it does not have permission to bind port 80.
I'd expect it all to work if you removed that annotation :)
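One way to see the permission problem directly (a sketch; the pod name is a placeholder): the injected proxy runs as a dedicated non-root user by default:
# linkerd-proxy runs as a non-root UID (2102 by default),
# so it cannot bind privileged ports like 80
kubectl get pod my-api-xxxxx -n my-api \
  -o jsonpath='{.spec.containers[?(@.name=="linkerd-proxy")].securityContext}'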

Consul agent on kubernetes, on node or pod?

I deployed an AWS EKS cluster via Terraform. I also deployed Consul following HashiCorp's tutorial, and I see the nodes in Consul's UI.
Now I'm wondering: how will the Consul agents know about the pods I deploy? I deploy something and it's not shown anywhere in Consul.
I can't find any documentation on how to register pods (services) in Consul via the node's Consul agent. Do I need to configure that somewhere? Should I skip the node's agent and register the service straight from the pod? HashiCorp discourages this, since it may increase resource utilization depending on how many pods you deploy on a given node. But then how does the node's agent know about my services deployed on that node?
Moreover, when I deploy a pod on a node, ssh into the node, and install Consul, Consul's agent can't find the Consul server (as opposed to the node, which can find it).
EDIT:
Bottom line is I can't find WHERE to add the configuration. If I execute ON THE POD:
consul members
It works properly and I get:
Node Address Status Type Build Protocol DC Segment
consul-consul-server-0 10.0.103.23:8301 alive server 1.10.0 2 full <all>
consul-consul-server-1 10.0.101.151:8301 alive server 1.10.0 2 full <all>
consul-consul-server-2 10.0.102.112:8301 alive server 1.10.0 2 full <all>
ip-10-0-101-129.ec2.internal 10.0.101.70:8301 alive client 1.10.0 2 full <default>
ip-10-0-102-175.ec2.internal 10.0.102.244:8301 alive client 1.10.0 2 full <default>
ip-10-0-103-240.ec2.internal 10.0.103.245:8301 alive client 1.10.0 2 full <default>
ip-10-0-3-223.ec2.internal 10.0.3.249:8301 alive client 1.10.0 2 full <default>
But if I execute:
# consul agent -datacenter=voip-full -config-dir=/etc/consul.d/ -log-file=log-file -advertise=$(wget -q -O - http://169.254.169.254/latest/meta-data/local-ipv4)
I get the following error:
==> Starting Consul agent...
Version: '1.10.1'
Node ID: 'f10070e7-9910-06c7-0e12-6edb6cc4c9b9'
Node name: 'ip-10-0-3-223.ec2.internal'
Datacenter: 'voip-full' (Segment: '')
Server: false (Bootstrap: false)
Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: 8600)
Cluster Addr: 10.0.3.223 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false
==> Log data will now stream in as it occurs:
2021-08-16T18:23:06.936Z [WARN] agent: skipping file /etc/consul.d/consul.env, extension must be .hcl or .json, or config format must be set
2021-08-16T18:23:06.936Z [WARN] agent: Node name "ip-10-0-3-223.ec2.internal" will not be discoverable via DNS due to invalid characters. Valid characters include all alpha-numerics and dashes.
2021-08-16T18:23:06.946Z [WARN] agent.auto_config: skipping file /etc/consul.d/consul.env, extension must be .hcl or .json, or config format must be set
2021-08-16T18:23:06.947Z [WARN] agent.auto_config: Node name "ip-10-0-3-223.ec2.internal" will not be discoverable via DNS due to invalid characters. Valid characters include all alpha-numerics and dashes.
2021-08-16T18:23:06.948Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: ip-10-0-3-223.ec2.internal 10.0.3.223
2021-08-16T18:23:06.948Z [INFO] agent.router: Initializing LAN area manager
2021-08-16T18:23:06.950Z [INFO] agent: Started DNS server: address=127.0.0.1:8600 network=udp
2021-08-16T18:23:06.950Z [WARN] agent.client.serf.lan: serf: Failed to re-join any previously known node
2021-08-16T18:23:06.950Z [INFO] agent: Started DNS server: address=127.0.0.1:8600 network=tcp
2021-08-16T18:23:06.951Z [INFO] agent: Starting server: address=127.0.0.1:8500 network=tcp protocol=http
2021-08-16T18:23:06.951Z [WARN] agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set `telemetry { disable_compat_1.9 = true }` to disable them.
2021-08-16T18:23:06.953Z [INFO] agent: started state syncer
2021-08-16T18:23:06.953Z [INFO] agent: Consul agent running!
2021-08-16T18:23:06.953Z [WARN] agent.router.manager: No servers available
2021-08-16T18:23:06.954Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
2021-08-16T18:23:34.169Z [WARN] agent.router.manager: No servers available
2021-08-16T18:23:34.169Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
So where to add the config?
I also tried adding a service in k8s pointing to the pod, but the service doesn't come up on consul's UI...
What do you guys recommend?
Thanks
Consul knows where these services are located because each service registers with its local Consul client. Operators can register services manually, configuration management tools can register services when they are deployed, or container orchestration platforms can register services automatically via integrations.
If you're planning to use the manual option, you have to register the service with Consul yourself.
Something like:
echo '{
  "service": {
    "name": "web",
    "tags": [
      "rails"
    ],
    "port": 80
  }
}' > ./consul.d/web.json
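Not part of the original answer, but after writing the file the agent still has to pick it up, either by reloading it or by registering the definition explicitly (both are standard Consul CLI commands):
# option 1: reload the agent so it re-reads its -config-dir
consul reload
# option 2: register the service definition directly
consul services register ./consul.d/web.json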
You can find a good example at: https://thenewstack.io/implementing-service-discovery-of-microservices-with-consul/
Also, this is a very nice document covering detailed configuration of health checks and service discovery: https://cloud.spring.io/spring-cloud-consul/multi/multi_spring-cloud-consul-discovery.html
Official document : https://learn.hashicorp.com/tutorials/consul/get-started-service-discovery
BTW, I was finally able to figure out the issue:
consul-dns is not deployed by default. I had to deploy it manually and then forward all .consul requests from CoreDNS to consul-dns.
All is working now. Thanks!
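For reference (a sketch, not from the original answer): the CoreDNS-to-consul-dns forwarding described above is typically a stanza like this added to the coredns ConfigMap in kube-system, with the ClusterIP of your consul-dns service substituted in:
consul:53 {
    errors
    cache 30
    forward . 10.0.0.10   # placeholder: ClusterIP of the consul-dns service
}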

why admission webhook is not working in the CronJob example from kubebuilder book

I am following the CronJob example in KubeBuilder book: https://book.kubebuilder.io/cronjob-tutorial/cronjob-tutorial.html
I directly use the code from https://github.com/kubernetes-sigs/kubebuilder/tree/master/docs/book/src/cronjob-tutorial/testdata/project
After running make run, logs like these were shown:
INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
INFO controller-runtime.builder Registering a mutating webhook {"GVK": "batch.tutorial.kubebuilder.io/v1, Kind=CronJob", "path": "/mutate-batch-tutorial-kubebuilder-io-v1-cronjob"}
INFO controller-runtime.webhook registering webhook {"path": "/mutate-batch-tutorial-kubebuilder-io-v1-cronjob"}
INFO controller-runtime.builder Registering a validating webhook {"GVK": "batch.tutorial.kubebuilder.io/v1, Kind=CronJob", "path": "/validate-batch-tutorial-kubebuilder-io-v1-cronjob"}
INFO controller-runtime.webhook registering webhook {"path": "/validate-batch-tutorial-kubebuilder-io-v1-cronjob"}
INFO setup starting manager
INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
INFO controller-runtime.webhook.webhooks starting webhook server
INFO controller-runtime.controller Starting EventSource {"controller": "cronjob", "source": "kind source: /, Kind="}
INFO controller-runtime.certwatcher Updated current TLS certificate
INFO controller-runtime.webhook serving webhook server {"host": "", "port": 9443}
INFO controller-runtime.certwatcher Starting certificate watcher
INFO controller-runtime.controller Starting EventSource {"controller": "cronjob", "source": "kind source: /, Kind="}
INFO controller-runtime.controller Starting Controller {"controller": "cronjob"}
INFO controller-runtime.controller Starting workers {"controller": "cronjob", "worker count": 1}
From the log, it is easy to tell that both the controller and the admission webhook started successfully, as expected.
In order to test whether the admission webhook is working, I made the CronJob schedule invalid like this:
-*- * * * *,
After applying the config with kubectl apply -f config/samples/batch_v1_cronjob.yaml,
no logs from the webhook were shown; the only log indicating that the cronjob schedule is invalid comes from the controller code:
2020-02-22T15:45:17.665+0800 ERROR controllers.Captain unable to figure out CronJob schedule {"cronjob": "default/cronjob-sample", "error": "Unparseable schedule \"-*- * * * *\": Failed to parse int from : strconv.Atoi: parsing \"\": invalid syntax"}
github.com/go-logr/zapr.(*zapLogger).Error
/Users/my-name/.go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
tutorial.kubebuilder.io/project/controllers.(*CronJobReconciler).Reconcile
/Users/my-name/tmp/kubebuilder/docs/book/src/cronjob-tutorial/testdata/project/controllers/cronjob_controller.go:380
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/Users/my-name/.go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/Users/my-name/.go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/Users/my-name/.go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/Users/my-name/.go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/Users/my-name/.go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/Users/my-name/.go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88
So why is the webhook not working?
You must create a ValidatingWebhookConfiguration in order to configure the apiserver to forward requests for validation to your webhook. You can find it here: https://github.com/kubernetes-sigs/kubebuilder/blob/master/docs/book/src/cronjob-tutorial/testdata/project/config/webhook/manifests.yaml
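For orientation, here is a trimmed sketch of what that manifest registers (the real file is generated by kubebuilder and gets its CA bundle injected by cert-manager through the kustomize config; names here follow the tutorial but are simplified):
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: validating-webhook-configuration
webhooks:
  - name: vcronjob.kb.io
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail
    clientConfig:
      service:
        name: webhook-service
        namespace: system
        # must match the path the webhook server logged at startup
        path: /validate-batch-tutorial-kubebuilder-io-v1-cronjob
    rules:
      - apiGroups: ["batch.tutorial.kubebuilder.io"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["cronjobs"]
Note also that make run starts the manager locally, outside the cluster, so the apiserver has no route or trusted certificate to reach the webhook; the tutorial's make deploy path with cert-manager installed is the usual way to exercise webhooks.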

Unable to fully collect metrics, when installing metric-server

I have installed the metrics server on Kubernetes, but it's not working, and it logs:
unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:xxx: unable to fetch metrics from Kubelet ... (X.X): Get https:....: x509: cannot validate certificate for 1x.x.
x509: certificate signed by unknown authority
I was able to get metrics after modifying the deployment YAML and adding:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
This now collects metrics, and kubectl top node returns results...
but the logs still show:
E1120 11:58:45.624974 1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-6z6qz: no metrics known for pod
E1120 11:58:45.625289 1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-rzvfj: no metrics known for pod
E1120 12:00:06.462505 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-1x.x.x.eu-west-1.compute.internal: unable to get CPU for container ...discarding data: missing cpu usage metric, unable to fully scrape metrics from source
So, my questions:
1) All this works on minikube, but not on my dev cluster. Why would that be?
2) In production I don't want to use insecure-tls, so can someone please explain why this issue arises, or point me to a relevant resource?
Kubeadm generates the kubelet certificates at /var/lib/kubelet/pki, and those certificates (kubelet.crt and kubelet.key) are signed by a different CA from the one used to generate all the other certificates at /etc/kubernetes/pki. That is why metrics-server reports x509: certificate signed by unknown authority.
You need to regenerate the kubelet certificates so they are signed by your root CA (/etc/kubernetes/pki/ca.crt).
You can use openssl or cfssl to generate the new certificates (I am using cfssl):
$ mkdir certs; cd certs
$ cp /etc/kubernetes/pki/ca.crt ca.pem
$ cp /etc/kubernetes/pki/ca.key ca-key.pem
Create a file kubelet-csr.json:
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "<node_name>",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "US",
    "ST": "NY",
    "L": "City",
    "O": "Org",
    "OU": "Unit"
  }]
}
Create a ca-config.json file:
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}
Now generate the new certificates using the above files:
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
--config=ca-config.json -profile=kubernetes \
kubelet-csr.json | cfssljson -bare kubelet
Replace the old certificates with the newly generated ones:
$ scp kubelet.pem <nodeip>:/var/lib/kubelet/pki/kubelet.crt
$ scp kubelet-key.pem <nodeip>:/var/lib/kubelet/pki/kubelet.key
Now restart the kubelet so that the new certificates take effect on your node:
$ systemctl restart kubelet
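Not in the original answer, but a quick way to verify the new serving certificate and that metrics-server recovered (the node IP and resource names will differ):
# the kubelet should now present a cert that chains to the cluster CA
openssl s_client -connect <nodeip>:10250 -CAfile /etc/kubernetes/pki/ca.crt </dev/null
# metrics-server should stop logging x509 errors and top should work
kubectl -n kube-system logs deploy/metrics-server | tail
kubectl top nodes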
For more context on this issue, see the following ticket:
https://github.com/kubernetes-incubator/metrics-server/issues/146
Hope this helps.