Kubernetes: How to get disk / CPU metrics of a node

Without using Heapster, is there any way to collect metrics such as CPU or disk usage for a node within a Kubernetes cluster?
How does Heapster even collect those metrics in the first place?

Kubernetes monitoring is detailed in the documentation here, but that mostly covers tools using heapster.
Node-specific information is exposed through the cAdvisor UI which can be accessed on port 4194 (see the commands below to access this through the proxy API).
Heapster queries the kubelet for stats served at <kubelet address>:10255/stats/ (other endpoints can be found in the code here).
Try this:
$ kubectl proxy &
Starting to serve on 127.0.0.1:8001
$ NODE=$(kubectl get nodes -o=jsonpath="{.items[0].metadata.name}")
$ curl -X "POST" -d '{"containerName":"/","subcontainers":true,"num_stats":1}' localhost:8001/api/v1/proxy/nodes/${NODE}:10255/stats/container
...
Note that these endpoints are not documented as they are intended for internal use (and debugging), and may change in the future (we eventually want to offer a more stable versioned endpoint).
Update:
As of Kubernetes version 1.2, the Kubelet exports a "summary" API that aggregates stats from all Pods:
$ kubectl proxy &
Starting to serve on 127.0.0.1:8001
$ NODE=$(kubectl get nodes -o=jsonpath="{.items[0].metadata.name}")
$ curl localhost:8001/api/v1/proxy/nodes/${NODE}:10255/stats/summary
...
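If jq is available, the interesting node-level numbers can be pulled straight out of that response. A minimal sketch, assuming the field names exposed by the summary API on recent kubelets (usageNanoCores, workingSetBytes, usedBytes); adjust if your version differs:
$ curl -s localhost:8001/api/v1/proxy/nodes/${NODE}:10255/stats/summary \
    | jq '{cpu: .node.cpu.usageNanoCores, memory: .node.memory.workingSetBytes, rootfs: .node.fs.usedBytes}'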

I would recommend using Heapster to collect metrics. It's pretty straightforward. However, in order to access those metrics, you need to add "type: NodePort" to the heapster.yml file. I modified the original Heapster files and you can find them here. See my readme file for how to access the metrics. More metrics are available here.
Metrics can be accessed via a web browser at http://heapster-pod-ip:heapster-service-port/api/v1/model/metrics/cpu/usage_rate. The same result can be obtained by executing the following command.
$ curl -L http://heapster-pod-ip:heapster-service-port/api/v1/model/metrics/cpu/usage_rate
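If you would rather not keep modified manifests around, the existing Service can also be switched to NodePort in place. A minimal sketch, assuming the stock Heapster Service named heapster in the kube-system namespace:
$ kubectl -n kube-system patch svc heapster -p '{"spec": {"type": "NodePort"}}'
$ kubectl -n kube-system get svc heapster -o jsonpath='{.spec.ports[0].nodePort}'
The second command prints the allocated node port, which can then be combined with any node IP to reach the model API shown above.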

Related

Prometheus - Monitoring command output in container

I need to monitor a lot of legacy containers in my EKS cluster that have an NFS mount path. To map the NFS directory into the containers I am using the nfs-client Helm chart.
I need to detect when the mount path is lost for some reason, and the only way I have found to do that is to exec a command in the container.
#!/bin/bash
df -h | grep ip_of_my_nfs_server | wc -l
If the output above returns 1, I know that my NFS mount path is OK.
Does anybody know a way to monitor the output of a script executed in a container with Prometheus?
Thanks!
As Matt has pointed out in the comments: the first order of business should be to see if you can simply cover your monitoring requirement with node_exporter.
Below is a more generic answer on collecting metrics from arbitrary shell commands.
Prometheus is a pull-based monitoring system. You configure it with "scrape targets": these are effectively just HTTP endpoints that expose metrics in a specific format. Some target needs to be alive for long enough to allow it to be scraped.
The two most obvious options you have are:
Wrap your logic in a long-running process that exposes this metric on an HTTP endpoint, and configure it as a scrape target
Spin up an instance of Pushgateway, configure it as a scrape target, and have your command push its metrics there
Based on the little information you provided, the latter option seems like the most sane one. Important and relevant note from the README:
The Prometheus Pushgateway exists to allow ephemeral and batch jobs to expose their metrics to Prometheus. Since these kinds of jobs may not exist long enough to be scraped, they can instead push their metrics to a Pushgateway. The Pushgateway then exposes these metrics to Prometheus.
Your command would look something like:
#!/bin/bash
printf "mount_path_up %d" $(df -h | grep ip_of_my_nfs_server | wc -l) | curl --data-binary #- http://pushgateway.example.org:9091/metrics/job/some_job_name

How to get Kubernetes cluster name from K8s API using client-go

How to get Kubernetes cluster name from K8s API mentions that
curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"
(from within the cluster), or
kubectl run curl --rm --restart=Never -it --image=appropriate/curl -- -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/cluster-name
(from outside the cluster), can be used to retrieve the cluster name. That works.
Is there a way to perform the same programmatically using the k8s client-go library? Maybe using the RESTClient()? I've tried but kept getting "the server could not find the requested resource".
UPDATE
What I'm trying to do is to get the cluster-name from an app that runs either on a local computer or within a k8s cluster. The k8s client-go allows initialising the clientset via in-cluster or out-of-cluster authentication.
With the two commands mentioned at the top that is achievable. I was wondering if there was a way from the client-go library to achieve the same, instead of having to do kubectl or curl depending on where the service is run from.
The data that you're looking for (the name of the cluster) is available at the GCP level. The name itself is a resource within GKE, not Kubernetes. This means that this specific information is not available using client-go.
So in order to get this data, you can use the Google Cloud Client Libraries for Go, designed to interact with GCP.
As a starting point, you can consult this document.
First you have to download the container package:
➜ go get google.golang.org/api/container/v1
Before you launch your code, you will have to authenticate to fetch the data.
Google has a very good document describing how to achieve that.
Basically, you have to generate a ServiceAccount key and pass it in the GOOGLE_APPLICATION_CREDENTIALS environment variable:
➜ export GOOGLE_APPLICATION_CREDENTIALS=sakey.json
Regarding the information that you want, you can fetch the cluster information (including name) following this example.
Once you do this, you can launch your application like this:
➜ go run main.go -project <google_project_name> -zone us-central1-a
And the result would be information about your cluster:
Cluster "tom" (RUNNING) master_version: v1.14.10-gke.17 -> Pool "default-pool" (RUNNING) machineType=n1-standard-2 node_version=v1.14.10-gke.17 autoscaling=false%
Also it is worth mentioning that if you run this command:
curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"
You are also interacting with the GCP APIs, and the call can go unauthenticated as long as it is run from within a GCE machine/GKE cluster; the metadata server provides this authentication automatically.
You can read more about it in Google's Storing and retrieving instance metadata document.
Finally, one great advantage of doing this with the Cloud Client Libraries is that it can be launched externally (as long as it's authenticated) or internally within pods in a deployment.
Let me know if it helps.
If you're running inside GKE, you can get the cluster name through the instance attributes: https://pkg.go.dev/cloud.google.com/go/compute/metadata#InstanceAttributeValue
More specifically, the following should give you the cluster name:
metadata.InstanceAttributeValue("cluster-name")
The example shared by Thomas lists all the clusters in your project, which may not be very helpful if you just want to query the name of the GKE cluster hosting your pod.

Kubectl documentation without starting Kubernetes

I have installed a K8s cluster on my laptop using kubeadm and VirtualBox. It seems a bit odd that the cluster has to be up and running to see the documentation, as shown below.
praveensripati@praveen-ubuntu:~$ kubectl explain pods
Unable to connect to the server: dial tcp 192.168.0.31:6443: connect: no route to host
Any workaround for this?
See "kubectl explain — #HeptioProTip"
Behind the scenes, kubectl just made an API request to my Kubernetes cluster, grabbed the current Swagger documentation of the API version running in the cluster, and output the documentation and object types.
Try kubectl help as an offline alternative, but that won't be as complete (it is limited to kubectl itself).
So the rather sobering news is that AFAIK there's no out-of-the-box way to do it, though you could totally write a kubectl plugin (it has become rather trivial now in 1.12). But for now, the best I can offer is the following:
# figure out which endpoint kubectl uses to retrieve docs:
$ kubectl -v9 explain pods
# from above I learn that in my case it's apparently
# https://192.168.64.11:8443/openapi/v2 so let's curl that:
$ curl -k https://192.168.64.11:8443/openapi/v2 > resources-docs.json
From here you can, for example, use jq to query for the descriptions. It's not as nice as a proper explain, but it's a good enough workaround until someone writes an offline docs query kubectl plugin.
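For example, assuming the definition key used by current clusters is io.k8s.api.core.v1.Pod (older releases use different key prefixes), the description and top-level fields of Pod can be pulled out like this:
$ jq -r '.definitions["io.k8s.api.core.v1.Pod"].description' resources-docs.json
$ jq -r '.definitions["io.k8s.api.core.v1.Pod"].properties | keys[]' resources-docs.json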
The 'explain' documentation lives in the kube-apiserver and its resource definitions, hence the need to connect to it through kubectl explain to get any docs. This is different from the standard, very basic CLI help from kubectl, which is embedded in the kubectl Go code.
So there is no workaround really, other than setting up a dummy Kubernetes cluster and having kubectl point to it. Please note that help for CRDs might not be available, since it lives in the deployed CRDs themselves.

Fetching Stackdriver Monitoring TimeSeries data for a pod running on a k8s cluster on GKE using the REST API

My objective is to fetch the time series of a metric for a pod running on a kubernetes cluster on GKE using the Stackdriver TimeSeries REST API.
I have ensured that Stackdriver monitoring and logging are enabled on the kubernetes cluster.
Currently, I am able to fetch the time series of all the resources available in a cluster using the following filter:
metric.type="container.googleapis.com/container/cpu/usage_time" AND resource.labels.cluster_name="<MY_CLUSTER_NAME>"
In order to fetch the time series of a given pod id, I am using the following filter:
metric.type="container.googleapis.com/container/cpu/usage_time" AND resource.labels.cluster_name="<MY_CLUSTER_NAME>" AND resource.labels.pod_id="<POD_ID>"
This filter returns an HTTP 200 OK with an empty response body. I have found the pod ID from the metadata.uid field received in the response of the following kubectl command:
kubectl get deploy -n default <SERVICE_NAME> -o yaml
However, when I use the Pod ID of a background container spawned by GKE/Stackdriver, I do get the time series values.
Since I am able to see Stackdriver metrics of my pod on the GKE UI, I believe I should also get the metric values using the REST API.
My doubts/questions are:
Am I fetching the Pod ID of my pod correctly using kubectl?
Could there be some issue with my cluster setup/service deployment due to which I'm unable to fetch the metrics?
Is there some other way in which I can get the time series of my pod using the REST APIs?
I wouldn't rely on kubectl get deploy for pod ids. I would get them with something like kubectl -n default get pods | grep <prefix-for-your-pod> | awk '{print $1}'
I don't think so, but the best way to find out is opening a support ticket with GCP if you have any doubts.
Not that I'm aware of, Stackdriver is the monitoring solution in GCP. Again, you can check with GCP support. There are other tools that you can use to get metrics from Kubernetes like Prometheus. There are multiple guides on the web on how to set it up with Grafana on k8s. This is one for example.
Hope it helps!
Am I fetching the Pod ID of my pod correctly using kubectl?
You could use JSONpath as output with kubectl, in this case iterating over the Pods and fetching the metadata.name and metadata.uid fields:
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.uid}{"\n"}{end}'
which will output something like this:
nginx-65899c769f-2j775 d4fr5t6-bc2f-11e8-81e8-42010a84011f
nginx2-77b5c9d48c-7qlps 4f5gh6r-bc37-11e8-81e8-42010a84011f
Could there be some issue with my cluster setup/service deployment due to which I'm unable to fetch the metrics?
As @Rico mentioned in his answer, contacting the GCP support could be a way forward if you don't get further with the troubleshooting, see below.
Is there some other way in which I can get the time series of my pod using the REST APIs?
You could use the APIs Explorer or the Metrics Explorer from within the Stackdriver portal. There are some good troubleshooting tips here with a link to the APIs Explorer. In the Stackdriver Metrics Explorer it's fairly easy to reassemble the filter you've used using dropdown lists to choose e.g. a particular pod_id.
Taken from the Troubleshooting the Monitoring guide (linked above) regarding an empty HTTP 200 response on filtered queries:
If your API call returns status code 200 and an empty response, there are several possibilities:
If your call uses a filter, then the filter might not have matched anything. The filter match is case-sensitive. To resolve filter problems, start by specifying only one filter component, such as metric.type, and see if you get results. Add the other filter components one-by-one.
If you are working with a custom metric, you might not have specified the project where your custom metric is defined.
I found this link when reading through the documentation of the Monitoring API. That link will get you to the APIs Explorer with some pre-filled fields, change these accordingly and add your own filter.
I have not tested more using the REST API at the moment but hopefully this could get you forward.
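For reference, here is a rough sketch of the raw REST call that the APIs Explorer issues; <PROJECT_ID> and <POD_ID> are placeholders, the interval timestamps are arbitrary examples (they must be RFC 3339 and cover a window where data exists), and authentication is assumed to come from gcloud:
$ curl -s -G "https://monitoring.googleapis.com/v3/projects/<PROJECT_ID>/timeSeries" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    --data-urlencode 'filter=metric.type="container.googleapis.com/container/cpu/usage_time" AND resource.labels.pod_id="<POD_ID>"' \
    --data-urlencode 'interval.startTime=2018-09-18T00:00:00Z' \
    --data-urlencode 'interval.endTime=2018-09-18T12:00:00Z'
If this also comes back empty, try dropping the pod_id component first, as suggested in the quote above.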

No heapster filesystem metrics from REST API

I run a Kubernetes proxy on my local machine through kubectl proxy.
I deployed Heapster onto my Kubernetes environment, as well as InfluxDB and Grafana.
I can see the filesystem usage metrics in Grafana.
However, I cannot get the filesystem usage through the Heapster REST API:
Please help me check whether there is anything misconfigured, a wrong URL, or some other issue.
Thanks.
For some pods, there is no such metric as filesystem/usage.
Here is an example of the available metrics list for the etcd-minikube pod:
http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/namespaces/kube-system/pods/etcd-minikube/metrics/
$ curl http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/namespaces/kube-system/pods/etcd-minikube/metrics/
[
"network/rx_errors_rate",
"cpu/usage_rate",
"network/rx_errors",
"memory/request",
"memory/page_faults_rate",
"network/rx_rate",
"network/tx_errors_rate",
"memory/limit",
"network/rx",
"memory/major_page_faults_rate",
"uptime",
"memory/rss",
"memory/working_set",
"restart_count",
"network/tx_errors",
"cpu/request",
"cpu/limit",
"network/tx",
"memory/usage",
"network/tx_rate"
]
In this example, there is no filesystem/usage in the list.
If I try to get it, I'll get exactly the same result as the one you’ve posted in the question:
http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/namespaces/kube-system/pods/etcd-minikube/metrics/filesystem/usage
$ curl http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/namespaces/kube-system/pods/etcd-minikube/metrics/filesystem/usage
{
"metrics": [],
"latestTimestamp": "0001-01-01T00:00:00Z"
}
Therefore, check the available options for the pod using a URL similar to the first example.