Readiness probe failed: Get http://*.*.*.*:8080/***/healthCheck: net/http: request canceled (Client.Timeout exceeded while awaiting headers) - kubernetes

The GCP Error Reporting dashboard is showing the following error for one of our production clusters. Could you please help me with it? Is this an actual issue, or just information that the pod was not ready to take traffic at that moment?
Readiness probe failed: Get http://...:8080/***healthCheck: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
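For context, this message means the kubelet's HTTP GET against the pod's health endpoint did not return response headers within the probe's timeout, so the pod is marked not-ready (and removed from Service endpoints) until the probe succeeds again. Occasional occurrences are informational; frequent ones mean the endpoint is genuinely slow. A minimal sketch of the relevant Deployment fields, using the path and port from the message above (all timing values are assumptions to adjust):

readinessProbe:
  httpGet:
    path: /healthCheck     # endpoint from the error message
    port: 8080
  initialDelaySeconds: 10  # assumed: give the app time to boot
  periodSeconds: 10        # assumed: how often the kubelet probes
  timeoutSeconds: 5        # assumed: raise this if the endpoint responds slowly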

Related

How do I solve a timeout when trying to add a node to my Kubernetes cluster?

I am trying to add a node to my (currently running) Kubernetes cluster.
When I run the kubeadm join command, I get the following error:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker
cgroup driver. The recommended driver is "systemd".
Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: couldn't validate the identity of the API Server:
Get "https://159.65.40.41:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s":
net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
To see the stack trace of this error execute with --v=5 or higher
Here is a snippet from the stack trace:
I0917 16:06:58.162180 2714 token.go:215] [discovery] Failed to request cluster-info, will try again: Get "https://*redacted*:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
What does this mean and how do I solve it?
I forgot that I had a firewall installed on my server.
I added port 6443 per the instructions found here (Kubeadm join failed : Failed to request cluster-info) and all is well!
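For anyone else who lands here, a sketch of the fix, assuming ufw on the control-plane node (substitute the equivalent firewalld or cloud security-group rule; the IP is a placeholder):

# On the control-plane node: allow the API server port through the firewall
sudo ufw allow 6443/tcp
sudo ufw status                # confirm the rule is active
# From the joining node: confirm the API server is now reachable
nc -zv <api-server-ip> 6443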

After starting Ditto services, pods toggle from "OK" to "Liveness probe failed" or "Readiness probe failed"

I managed to get Ditto up and running on minikube, following the instructions provided in the README.txt file. I had to make some minor adjustments to the .yaml files (see Deployment of Ditto and MongoDB using kubectl fails because of unsupported version "extensions/v1beta1").
Now that the Ditto services have been started, the pods toggle from status "OK" to the following errors:
pod connectivity: Liveness probe failed: Get "http://172.17.0.6:8558/alive": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
pod gateway: Readiness probe failed: Get "http://172.17.0.9:8558/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
pod things: Readiness probe failed: Get "http://172.17.0.5:8558/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Back-off restarting failed container
pod things-search: Readiness probe failed: Get "http://172.17.0.8:8558/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Back-off restarting failed container
pod policies: Readiness probe failed: Get "http://172.17.0.7:8558/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Back-off restarting failed container
pod concierge: Readiness probe failed: Get "http://172.17.0.4:8558/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Even when all pods have the status "OK", I can't send POST requests without getting Error 502 (Bad Gateway).
Any help in solving this problem is highly appreciated.
Thank you in advance.
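A quick way to see why the probes fail is to check the pod events and resource usage first; a sketch (pod names are placeholders):

kubectl describe pod <pod-name>      # probe failures and back-off appear under Events
kubectl logs <pod-name> --previous   # output of the last crashed container
kubectl top pods                     # needs the metrics-server addon; shows CPU/memory pressure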
Maybe this is caused by a resource issue with your Minikube VM.
How many CPUs and how much memory does the VM have?
Maybe you can scale up the resources and try again?
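If it does turn out to be resources, a sketch of recreating the Minikube VM with a larger allocation (the numbers are assumptions; size them to your host):

minikube config view    # check the current cpus/memory settings
minikube delete         # an existing VM's CPU/memory cannot be changed in place
minikube start --cpus 4 --memory 8192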
I had several problems with Ditto running in Docker until I changed the CPU allocation in Docker from 4 to 8 (Docker Settings).
Since I am using a 4-core/8-thread machine, I wonder if a setting of 4 leads to the usage of only 2 physical cores (on an old Mac), which seems to be too few for Ditto.
Thomas

kubelet unable to get node status after kube-controller-manager and kube-scheduler restarted

My k8s 1.12.8 cluster (created via kops) has been running fine for 6+ months. Recently, something caused both kube-scheduler and kube-controller-manager on the master node to die and restart:
SyncLoop (PLEG): "kube-controller-manager-ip-x-x-x-x.z.compute.internal_kube-system(abc123)", event: &pleg.PodLifecycleEvent{ID:"abc123", Type:"ContainerDied", Data:"def456"}
hostname for pod:"kube-controller-manager-ip-x-x-x-x.z.compute.internal" was longer than 63. Truncated hostname to :"kube-controller-manager-ip-x-x-x-x.z.compute.inter"
SyncLoop (PLEG): "kube-scheduler-ip-x-x-x-x.z.compute.internal_kube-system(hij678)", event: &pleg.PodLifecycleEvent{ID:"hij678", Type:"ContainerDied", Data:"890klm"}
SyncLoop (PLEG): "kube-controller-manager-ip-x-x-x-x.eu-west-2.compute.internal_kube-system(abc123)", event: &pleg.PodLifecycleEvent{ID:"abc123", Type:"ContainerStarted", Data:"def345"}
SyncLoop (container unhealthy): "kube-scheduler-ip-x-x-x-x.z.compute.internal_kube-system(hjk678)"
SyncLoop (PLEG): "kube-scheduler-ip-x-x-x-x.z.compute.internal_kube-system(ghj567)", event: &pleg.PodLifecycleEvent{ID:"ghj567", Type:"ContainerStarted", Data:"hjk768"}
Ever since kube-scheduler and kube-controller-manager restarted, kubelet is completely unable to get or update any node status:
Error updating node status, will retry: failed to patch status "{"status":{"$setElementOrder/conditions":[{"type":"NetworkUnavailable"},{"type":"OutOfDisk"},{"type":"MemoryPressure"},{"type":"DiskPressure"},{"type":"PIDPressure"},{"type":"Ready"}],"conditions":[{"lastHeartbeatTime":"2020-08-12T09:22:08Z","type":"OutOfDisk"},{"lastHeartbeatTime":"2020-08-12T09:22:08Z","type":"MemoryPressure"},{"lastHeartbeatTime":"2020-08-12T09:22:08Z","type":"DiskPressure"},{"lastHeartbeatTime":"2020-08-12T09:22:08Z","type":"PIDPressure"},{"lastHeartbeatTime":"2020-08-12T09:22:08Z","type":"Ready"}]}}" for node "ip-172-20-60-88.eu-west-2.compute.internal": Patch https://127.0.0.1/api/v1/nodes/ip-172-20-60-88.eu-west-2.compute.internal/status?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Error updating node status, will retry: error getting node "ip-x-x-x-x.z.compute.internal": Get https://127.0.0.1/api/v1/nodes/ip-x-x-x-x.z.compute.internal?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Error updating node status, will retry: error getting node "ip-x-x-x-x.z.compute.internal": Get https://127.0.0.1/api/v1/nodes/ip-x-x-x-x.z.compute.internal?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Error updating node status, will retry: error getting node "ip-x-x-x-x.z.compute.internal": Get https://127.0.0.1/api/v1/nodes/ip-x-x-x-x.z.compute.internal?timeout=10s: context deadline exceeded
Error updating node status, will retry: error getting node "ip-x-x-x-x.z.compute.internal": Get https://127.0.0.1/api/v1/nodes/ip-x-x-x-x.z.compute.internal?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Unable to update node status: update node status exceeds retry count
The cluster is completely unable to perform any updates in this state.
What can cause the master node to lose connectivity to nodes like this?
Is the 2nd line in the first log output ('Truncated hostname...') a potential source of the issue?
How can I further diagnose what is actually causing the get/update node actions to fail?
I recall that Kubernetes limits the hostname to fewer than 64 characters. Is it possible the hostname was changed this time?
If so, it would be good to reconstruct the kubelet configuration using this documentation:
https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/
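Since every failing request goes to https://127.0.0.1 (on a kops master the kubelet talks to the local kube-apiserver), it is also worth checking whether the apiserver itself is healthy; a sketch, assuming SSH access to the master:

docker ps | grep kube-apiserver          # kops 1.12 runs the apiserver under Docker
curl -k https://127.0.0.1/healthz        # may return 401 if anonymous auth is disabled
tail -n 100 /var/log/kube-apiserver.log  # apiserver log location on a kops master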

Grafana clock panel manual installation

I have installed Grafana on Windows Server 2016 and it's running as a service.
Because my server is behind a proxy server and I didn't find where I can set up the proxy in custom.ini, I resorted to a manual installation of the Grafana clock panel. The server itself has internet access, but Grafana does not.
From the command line I tried:
grafana-cli --pluginsDir C:\grafana-6.1.6\data\plugins\grafana-clock-panel plugins install grafana-clock-panel-6fdc3d5
but I get this error:
Failed to send request: Get https://grafana.com/api/plugins/repo/grafana-clock-panel-6fdc3d5: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Error: ✗ Failed to send request. error: Get https://grafana.com/api/plugins/repo/grafana-clock-panel-6fdc3d5: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
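Two options, sketched under assumptions. grafana-cli is a Go binary and should honor the standard proxy environment variables, so you can point it at your proxy for one session; failing that, download the plugin zip on a machine with internet access and unpack it into the plugins directory yourself. For Windows cmd (the proxy host is a placeholder, and note that --pluginsDir should point at the plugins folder itself, not a per-plugin subfolder):

:: Option 1: route grafana-cli through the proxy for this session
set HTTPS_PROXY=http://your-proxy-host:8080
grafana-cli --pluginsDir C:\grafana-6.1.6\data\plugins plugins install grafana-clock-panel

:: Option 2: manual install - unpack the downloaded zip so the plugin lands in
:: C:\grafana-6.1.6\data\plugins\grafana-clock-panel, then restart the service
:: (service name may differ; check services.msc)
net stop Grafana
net start Grafana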

bx cr push command fails

I am following the documentation at
https://console.bluemix.net/docs/services/Registry/index.html#registry_images_pushing
but I am unable to push an image to the IBM container registry.
1. Log in to the container service:
bx cr login
Logging in to 'registry.ng.bluemix.net'...
FAILED
Failed to 'docker login' to 'registry.ng.bluemix.net' with error: WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: Get https://registry.ng.bluemix.net/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
That output, specifically the line Error response from daemon: Get https://registry.ng.bluemix.net/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers), indicates that your Docker daemon is not able to open a connection to the registry.
This is a network problem; you'll need to look into why you can't open this connection. nc -zv registry.ng.bluemix.net 443 should succeed, and curl -v https://registry.ng.bluemix.net/v2/ should return 401 UNAUTHORIZED.
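If those checks pass from your shell but docker login still times out, a common culprit on a proxied network is that the Docker daemon does not inherit your shell's proxy variables; a sketch for a systemd-based Linux host (the proxy address is a placeholder):

# Give the Docker daemon its own proxy configuration
sudo mkdir -p /etc/systemd/system/docker.service.d
printf '%s\n' '[Service]' \
  'Environment="HTTPS_PROXY=http://your-proxy-host:8080"' \
  'Environment="NO_PROXY=localhost,127.0.0.1"' |
  sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf
sudo systemctl daemon-reload
sudo systemctl restart docker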