I have a problem with the Kubernetes Dashboard.
I am currently using the managed Kubernetes service AKS and created a Kubernetes cluster with the following setup:
Kubernetes-Version 1.20.9
1 Worker Node with Size Standard_DS2_v2
It starts successfully with the automatic configuration of coredns, coredns-autoscaler, omsagent-rs, tunnelfront and the metrics-server.
After that, I applied three deployments for my services, all of which deployed successfully.
Now I want to access the Kubernetes Dashboard. I followed the instructions described at https://artifacthub.io/packages/helm/k8s-dashboard/kubernetes-dashboard.
After that, I run kubectl proxy to access the dashboard via the URL http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/.
After I sign in to the Kubernetes Dashboard with my kubeconfig file, I get the following output, and neither CPU nor memory usage is displayed.
When I run kubectl describe on the kubernetes-dashboard pod, I get the following:
And the logs from the pod say the following:
Internal error occurred: No metric client provided. Skipping metrics.
2021/12/11 19:23:04 [2021-12-11T19:23:04Z] Outcoming response to with 200 status code
2021/12/11 19:23:04 Internal error occurred: No metric client provided. Skipping metrics.
2021/12/11 19:23:04 [2021-12-11T19:23:04Z] Outcoming response to with 200 status code
2021/12/11 19:23:04 Internal error occurred: No metric client provided. Skipping metrics.
... I used the instruction which is described on https://artifacthub.io/packages/helm/k8s-dashboard/kubernetes-dashboard.
The dashboard needs a way to "cache" a small window of metrics collected from the metrics server. The instructions provided there do not enable this. You can run the following to install or upgrade kubernetes-dashboard with the metrics scraper enabled:
helm upgrade -i my-release kubernetes-dashboard/kubernetes-dashboard \
I have a GKE cluster with Workload Identity enabled.
Most of our workloads use the Cloud Storage or Cloud Logging GCP packages, which means they rely on Workload Identity for GCP access.
Recently we’ve started adding Secret Manager to the stack and started encountering random errors for the Metadata Server on workload startup. It happens on different frameworks.
File "/venv/lib/python3.8/site-packages/google/auth/compute_engine/credentials.py", line 117, in refresh
    six.raise_from(new_exc, caught_exc)
File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Enginemetadata service. Status: 404 Response:\nb'Not Found\\n'", <google.auth.transport.requests._Response object at 0x7f3a3084dd60>)
failed to initialize. exiting. Error: 16 UNAUTHENTICATED: Failed to retrieve auth metadata with error: Could not refresh access token: network timeout at: at Object
I’m trying to understand why it's happening.
First, 404 Not Found means we are requesting metadata that does not exist or has been deleted. The thing is, it recovers a few seconds later, so I'm not sure how exactly that fits.
Based on the documentation, it sometimes takes a while for the metadata server to become available, hence the error that "recovers" afterwards. The recommendation is to add delays in the app code or to use init containers until the metadata server is operational.
I wonder whether that is really the best approach (adding an init container to all of our workloads), and whether it really matches our use case, since the error code is a bit misleading. I am also not sure why it only started when we added Secret Manager.
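A minimal sketch of the "add delays in the app code" option, assuming the application is written in Python and that polling the default service account's token endpoint is an acceptable readiness signal. The function names, the retry counts and the choice of endpoint are illustrative assumptions, not something taken from the documentation:

```python
import time
import urllib.error
import urllib.request

# Assumed probe target: the metadata endpoint for the default service account token.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def probe_metadata_server(timeout=2.0):
    """Return True if the metadata server answers the token request with HTTP 200."""
    request = urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors such as 404 as well as timeouts and DNS failures.
        return False


def wait_for_metadata_server(probe=probe_metadata_server, attempts=5, delay=1.0, sleep=time.sleep):
    """Call probe until it succeeds, doubling the delay between attempts.

    Returns the number of attempts used, or raises TimeoutError.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
            delay *= 2
    raise TimeoutError("metadata server not ready after %d attempts" % attempts)
```

Calling wait_for_metadata_server() once at startup, before any GCP client is created, would delay the first credential refresh until the server responds. The probe and sleep parameters exist only so the loop can be exercised without network access.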
This sometimes happens due to OOM issues on the metadata server. You can check the status of the pod running the metadata server using:
kubectl -n kube-system describe pods <pod_name>
You can get the pod_name using:
kubectl get pods --namespace kube-system
The pod name will start with the prefix gke-metadata-server-.
If you see something like the following in the output when you describe the pod:
Last State: Terminated
Reason: OOMKilled
then that would indicate an OOM issue.
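The same check can be scripted. The sketch below assumes you have saved the output of kubectl get pod <pod_name> -n kube-system -o json and parsed it into a dictionary; the helper name and the sample data are made up for illustration, and only the status fields it reads come from the Kubernetes pod status:

```python
import json


def oom_killed_containers(pod):
    """Return names of containers whose last termination reason was OOMKilled.

    `pod` is the parsed JSON of `kubectl get pod <name> -o json`.
    """
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return [
        status["name"]
        for status in statuses
        if status.get("lastState", {}).get("terminated", {}).get("reason") == "OOMKilled"
    ]


# Illustrative pod status, not real cluster output.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "gke-metadata-server",
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      },
      {"name": "sidecar", "lastState": {}}
    ]
  }
}
""")

print(oom_killed_containers(sample))
```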
Some mitigations that you can try:
Check whether you have unused ServiceAccounts in your cluster and whether you can remove them.
Check whether you are creating too many clients (a new one for every API request). Sharing clients where possible reduces token refresh calls to the metadata server, which saves memory.
Check whether you can find the metadata server's definition under /etc/kubernetes/addons/. If you can, increase the memory in it and apply the updated config.
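To illustrate the client-sharing point, here is a small sketch in Python. FakeSecretClient is a hypothetical stand-in for whatever GCP client your code constructs; the idea is only that the client is built once per process and reused, rather than created for every request:

```python
import functools


class FakeSecretClient:
    """Hypothetical stand-in for a real API client; counts how often it is built."""

    instances = 0

    def __init__(self):
        FakeSecretClient.instances += 1

    def get(self, name):
        return "value-of-" + name


@functools.lru_cache(maxsize=None)
def get_client():
    """Build the client once per process and return the same object afterwards."""
    return FakeSecretClient()


def handle_request(secret_name):
    # Reuses the cached client instead of constructing a new one per request.
    return get_client().get(secret_name)
```

With a real client, each new instance would typically fetch its own credentials, so reusing one instance should cut the number of token requests that reach the metadata server.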
When I look at my logs in GCP Logging, I see, for instance, a request that returned a 500:
log_message: "Method: some_cloud_goo.Endpoint failed: INTERNAL_SERVER_ERROR"
I would like to quickly go to that pod and run kubectl logs on it, but I did not find a way to do this.
I am fairly new to k8s and GKE. Is there any way to trace back to the pod that handled that request?
You could run kubectl get pods to check the status of all pods, and then run kubectl describe pod <pod-name> for a detailed description of any error.
As mentioned in Neelam's answer, you can get the pod names with the command kubectl get pods -A and log into each of your pods to find the error.
Alternatively, you could deploy a custom monitoring system such as Elastic GKE Logging, available in the GCP click-to-deploy GitHub repository.
See here to install it from the Marketplace with a few clicks.
It is a free alternative that gives you a complete monitoring system, and once it is deployed you can filter your logs in the Kibana dashboard.
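Before checking every pod by hand, it may be worth looking at the log entry itself. For container logs collected from GKE, Cloud Logging normally attaches a k8s_container resource whose labels include the namespace, pod and container names. The sketch below assumes the entry has been exported as JSON and carries those labels; the helper names and the sample entry are hypothetical:

```python
def pod_from_log_entry(entry):
    """Return (namespace, pod, container) from a k8s_container log entry, or None."""
    resource = entry.get("resource", {})
    if resource.get("type") != "k8s_container":
        return None
    labels = resource.get("labels", {})
    return labels.get("namespace_name"), labels.get("pod_name"), labels.get("container_name")


def kubectl_logs_command(entry):
    """Build the kubectl logs argument list for the pod that wrote the entry."""
    located = pod_from_log_entry(entry)
    if located is None:
        return None
    namespace, pod, container = located
    return ["kubectl", "logs", "-n", namespace, pod, "-c", container]


# Hypothetical entry; the pod and container names are invented for illustration.
sample_entry = {
    "resource": {
        "type": "k8s_container",
        "labels": {
            "namespace_name": "default",
            "pod_name": "my-api-5d9c7b6f4-abcde",
            "container_name": "my-api",
        },
    },
}

print(" ".join(kubectl_logs_command(sample_entry)))
```

If the entry has a different resource type, the helper returns None and the manual approach above still applies.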
I am deploying my application with an HPA that scales based on HTTP requests.
I followed this link.
In the "Auto Scaling Based on Custom Metrics" part, I got it working with their application. But when I deploy my own application, I get this error:
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .
Error from server (NotFound): the server could not find the metric http_requests for pods
In this part, they said that "The podinfo app exposes a custom metric named http_requests_total". So, how can my application expose a custom metric like that?
Thank you so much!
You can find more information about this deployment here.
Another example how to build application for custom metrics you can find here and here.
In both cases they are using the Golang client API.
In the example from your tutorial, there is a working application listening on port 9898:
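If your application is not written in Go, the idea carries over: the application keeps a counter and serves it at /metrics in the same text format as the podinfo sample line. Below is a minimal standard-library Python sketch of that idea. It is not how podinfo itself is implemented, and the function names, the status label and the choice of port 9898 are assumptions made to mirror the tutorial:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

_lock = threading.Lock()
_requests_by_status = {}


def record_request(status):
    """Increment the request counter for the given HTTP status code."""
    with _lock:
        _requests_by_status[status] = _requests_by_status.get(status, 0) + 1


def render_metrics():
    """Render the counters as text, one http_requests_total line per status."""
    lines = [
        "# HELP http_requests_total Total number of HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    with _lock:
        for status in sorted(_requests_by_status):
            lines.append(
                'http_requests_total{status="%s"} %d' % (status, _requests_by_status[status])
            )
    return "\n".join(lines) + "\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = render_metrics().encode("utf-8")
            content_type = "text/plain; version=0.0.4"
        else:
            # Any other path counts as an application request.
            record_request(200)
            body = b"ok\n"
            content_type = "text/plain"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9898):
    """Serve the application and its /metrics endpoint until interrupted."""
    HTTPServer(("", port), Handler).serve_forever()
```

Calling serve() starts the server. In a real application a maintained Prometheus client library would usually be a better choice than hand-rendering the format.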
tcp 0 0 :::9898 :::* LISTEN 1/podinfo
If you are not sure whether your application is working properly, then please verify:
kubectl get deploy,pods
-- verify that your new deployment is working properly
kubectl describe pod <your pod>
-- in case of any issues with your application:
kubectl logs <your pod>
-- if the port and the endpoint are the same as in the example
curl <your pod ip (endpoint)>:9898/metrics
you should see the http_requests_total metric: http_requests_total{status="200"} 1926
-- for general issues
kubectl get events
Please share your findings and results.
I'm using Kubernetes 1.11 on DigitalOcean. When I try to use kubectl top node, I get this error:
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
But as stated in the docs, Heapster is deprecated and no longer required as of Kubernetes 1.10.
If you are running a newer version of Kubernetes and still receiving this error, there is probably a problem with your installation.
Please note that to install the metrics server on Kubernetes, you should first clone the repository by typing:
git clone https://github.com/kodekloudhub/kubernetes-metrics-server.git
kubectl create -f kubernetes-metrics-server/
This way, all services and components are installed correctly, and you can run:
kubectl top nodes
kubectl top pods
and get the correct result.
For kubectl top node/pod to work, you need either Heapster or the metrics server installed on your cluster.
As the warning says, Heapster is being deprecated, so the recommended choice now is the metrics server.
So follow the directions here to install the metrics server.
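Once the metrics server is running, the numbers behind kubectl top are served by the metrics.k8s.io API, which you can fetch with kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes". The sketch below shows one way to read such a response in Python. The helper name and the sample payload are illustrative, and the exact usage values depend on your cluster:

```python
import json


def summarize_node_metrics(payload):
    """Map node name -> (cpu, memory) usage strings from a node metrics list."""
    return {
        item["metadata"]["name"]: (item["usage"]["cpu"], item["usage"]["memory"])
        for item in payload.get("items", [])
    }


# Illustrative payload in the shape of a node metrics list, not real output.
sample_payload = json.loads("""
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "items": [
    {"metadata": {"name": "node-1"}, "usage": {"cpu": "125m", "memory": "1024Mi"}},
    {"metadata": {"name": "node-2"}, "usage": {"cpu": "80m", "memory": "768Mi"}}
  ]
}
""")

for name, (cpu, memory) in sorted(summarize_node_metrics(sample_payload).items()):
    print(name, cpu, memory)
```

If the raw request itself returns NotFound, the metrics API is not registered, which points back to the metrics server installation.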
Without using Heapster, is there any way to collect metrics such as CPU or disk usage for a node within a Kubernetes cluster?
How does Heapster even collect those metrics in the first place?
Kubernetes monitoring is detailed in the documentation here, but that mostly covers tools using heapster.
Node-specific information is exposed through the cAdvisor UI which can be accessed on port 4194 (see the commands below to access this through the proxy API).
Heapster queries the kubelet for stats served at <kubelet address>:10255/stats/ (other endpoints can be found in the code here).
Try this:
$ kubectl proxy &
Starting to serve on
$ NODE=$(kubectl get nodes -o=jsonpath="{.items[0].metadata.name}")
$ curl -X "POST" -d '{"containerName":"/","subcontainers":true,"num_stats":1}' localhost:8001/api/v1/proxy/nodes/${NODE}:10255/stats/container
Note that these endpoints are not documented as they are intended for internal use (and debugging), and may change in the future (we eventually want to offer a more stable versioned endpoint).
As of Kubernetes version 1.2, the Kubelet exports a "summary" API that aggregates stats from all Pods:
$ kubectl proxy &
Starting to serve on
$ NODE=$(kubectl get nodes -o=jsonpath="{.items[0].metadata.name}")
$ curl localhost:8001/api/v1/proxy/nodes/${NODE}:10255/stats/summary
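The summary response is JSON, so it is easy to post-process. Below is a minimal Python sketch that pulls node-level CPU and memory out of a saved response. The field names are the ones commonly seen in the kubelet summary output and may differ between versions, so treat them as assumptions; the sample data is made up:

```python
def node_usage(summary):
    """Extract node-level CPU and memory usage from a kubelet summary response."""
    node = summary["node"]
    return {
        "node": node["nodeName"],
        # usageNanoCores is in billionths of a CPU core.
        "cpu_cores": node["cpu"]["usageNanoCores"] / 1e9,
        "memory_working_set_mib": node["memory"]["workingSetBytes"] / (1024 * 1024),
        "pods": [pod["podRef"]["name"] for pod in summary.get("pods", [])],
    }


# Illustrative summary, not real kubelet output.
sample_summary = {
    "node": {
        "nodeName": "node-1",
        "cpu": {"usageNanoCores": 250000000},
        "memory": {"workingSetBytes": 1073741824},
    },
    "pods": [
        {"podRef": {"name": "web-1", "namespace": "default"}},
        {"podRef": {"name": "db-0", "namespace": "default"}},
    ],
}

print(node_usage(sample_summary))
```

To use it on real data, save the curl output to a file and load it with json.load before passing it to node_usage.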
I would recommend using Heapster to collect metrics. It's pretty straightforward. However, in order to access those metrics, you need to add "type: NodePort" in the heapster.yml file. I modified the original Heapster files and you can find them here. See my readme file for how to access metrics. More metrics are available here.
Metrics can be accessed via a web browser at http://heapster-pod-ip:heapster-service-port/api/v1/model/metrics/cpu/usage_rate. The same result can be seen by executing the following command.
$ curl -L http://heapster-pod-ip:heapster-service-port/api/v1/model/metrics/cpu/usage_rate