For example, the kubelet (cAdvisor) metric container_cpu_usage_seconds_total has values with labels such as pod and namespace.
I wonder how to summarize this kind of value at the Service level (for example, CPU usage per Service). I understand that a Service is a set of Pods, so I should just aggregate these per-pod values up to the Service, but I do not know how.
Is there an aggregation method for Services? Or is process_cpu_seconds_total a kind of per-Service aggregation of container_cpu_usage_seconds_total?
Thank you for your help!
What about
sum(rate(container_cpu_usage_seconds_total{job="kubelet", cluster="", namespace="default", pod_name=~"your_pod_name.*"}[3m]))
Taken from kubernetes-mixin
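If you want a per-pod breakdown instead of a single total, you can group the same query by the pod label. A sketch (the label is pod_name on older kubelet/cAdvisor versions and pod on newer ones, so use whichever your metrics actually expose):
sum by (pod_name) (rate(container_cpu_usage_seconds_total{job="kubelet", cluster="", namespace="default", pod_name=~"your_pod_name.*"}[3m]))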
In general, cAdvisor collects metrics about containers and doesn't know anything about Services. If you want to aggregate by Service, you need to manually select the metrics of the Pods that belong to that Service.
For example, if your cAdvisor metrics are in Prometheus, you can use this PromQL query:
sum(rate(container_cpu_usage_seconds_total{pod_name=~"myprefix-.*"}[2m]))
This adds up the CPU usage of all containers of all Pods whose name starts with myprefix-.
Or if you have the Resource Metrics API enabled (i.e. the Metrics Server installed), you can query the CPU usage of a specific Pod (in fractions of a CPU core) with:
kubectl get --raw="/apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods/{pod}"
To get the total usage of a Service, you would need to iterate through all the Pods of the Service, extract the values, and add them together.
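A minimal sketch of that iteration with kubectl top (which reads from the same Resource Metrics API), assuming the Service selects its Pods with a hypothetical label app=myservice and that CPU is reported in millicores, as is usual:
kubectl top pods -n mynamespace -l app=myservice --no-headers \
  | awk '{gsub("m","",$2); total+=$2} END {print total "m CPU in total"}'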
In general, Service is a Kubernetes concept and does not exist in cAdvisor, which is an independent project and just happens to be used in Kubernetes.
We are planning on delivering small k8s clusters to clients with our application on top.
Currently we are struggling to see what resources we actually need. On average we are running 20-30 pods in the system.
While getting the resource requests and limits per deployment is not hard,
it is hard to get a full view of all requests or all limits for all pods that are running in the cluster, at least in an automated way.
Is there a prebuilt dashboard in Grafana or some kind of kubectl command that would collect all of the requests and limits for all pods running in the k8s cluster?
The result should be a "nice" report of all resource requirements.
Since we are delivering a "static" cluster to clients, there are no HPAs in our clusters.
So far we have done a manual check per pod and written it into an Excel table, which is neither time-efficient nor repeatable.
Hi skolko, you can use Prometheus for monitoring your Kubernetes cluster; there are various options available, like monitoring individual deployments, monitoring the entire cluster, and monitoring each pod individually. Follow this document for setting up Prometheus monitoring for Kubernetes, and this document for an overview of the metrics available for monitoring.
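If kube-state-metrics is part of your monitoring stack, the cluster-wide totals can be summed directly in PromQL. A minimal sketch, assuming a kube-state-metrics version that exposes a resource label (older releases use per-resource metric names such as kube_pod_container_resource_requests_cpu_cores instead):
sum(kube_pod_container_resource_requests{resource="cpu"})
sum(kube_pod_container_resource_requests{resource="memory"})
sum(kube_pod_container_resource_limits{resource="cpu"})
sum(kube_pod_container_resource_limits{resource="memory"})
For an automated report without Prometheus, kubectl can also print requests and limits per container, for example:
kubectl get pods --all-namespaces -o custom-columns='NAMESPACE:.metadata.namespace,POD:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory,CPU_LIM:.spec.containers[*].resources.limits.cpu,MEM_LIM:.spec.containers[*].resources.limits.memory'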
I am trying to aggregate the time pods spend in the Running status over another custom business-logic tag, so that I can calculate how much it costs me to run this service.
I have tried to use docker.uptime, but it has not been fruitful, as I imagine multiple containers can run in parallel per node. I saw that the Datadog KSM check provides pods.age and pods.uptime metrics, but they do not appear in the Datadog metric explorer.
I do not want to use Prometheus/Grafana to do this, because I think this should be possible with Datadog alone.
avg:container.uptime{kube_namespace:<your namespace>} by {pod_name}
I am analyzing JVM metrics on a Prometheus dashboard for a service deployed in Kubernetes. There are several pods, each running an instance of the service.
When I do:
jvm_memory_max_bytes{area="heap", app="my-application",job="my-job"}
This fetches all the entries for all the pods.
Now I apply sum function:
sum(jvm_memory_max_bytes{area="heap", app="my-application",job="my-job"})
It sums up all the results from the first query.
My objective is to find average jvm statistics, which may need the number of pods running.
In Grafana, I tried to search the kube_* metrics, but couldn't find a suitable one.
How can I get average jvm metrics for a set of pods?
You can get the number of running pods by using:
sum(kube_pod_status_phase{phase="Running"})
So your final query might look like this:
sum(jvm_memory_max_bytes{area="heap", app="my-application",job="my-job"}) / sum(kube_pod_status_phase{phase="Running"})
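Alternatively, the average can be computed from the JVM metric itself by first collapsing it to one value per scrape target and then averaging. A sketch, assuming each pod is identified by the usual instance label (substitute pod if your relabeling adds that label instead):
avg(sum by (instance) (jvm_memory_max_bytes{area="heap", app="my-application",job="my-job"}))
This avoids counting running pods that are unrelated to the application, which the kube_pod_status_phase query above would include unless you filter it further.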
I am fairly new to Kubernetes and had a question concerning kube-state-metrics. When I simply monitor Kubernetes using Prometheus, I obtain a set of metrics from cAdvisor, the nodes (node exporter), the pods, etc. When I include kube-state-metrics, I seem to obtain more "relevant" metrics. Does kube-state-metrics allow scraping "new" information from Kubernetes, or is it rather "formatted" metrics built from the initial Kubernetes metrics (from the nodes, etc. I mentioned earlier)?
The two are basically unrelated. cAdvisor gives you low-level stats about the containers, like how much RAM and CPU they are using. KSM gives you info from the Kubernetes API, like the Pod object status. Both are useful for different things, and you probably want both.
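To make the difference concrete, here is one typical metric from each source (label sets simplified; the pod name is just a placeholder):
container_memory_working_set_bytes{namespace="default", pod="myapp-abc123"} (a measured resource value from cAdvisor)
kube_pod_status_phase{namespace="default", pod="myapp-abc123", phase="Running"} (a 0/1 fact about the Pod object, from the Kubernetes API via kube-state-metrics)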
I would like to see Kubernetes Service-level metrics in Grafana from the underlying Prometheus server.
For instance:
1) If I have 3 application pods exposed through a Service, I would like to see Service-level metrics for CPU, memory & network I/O pressure, the total # of requests, and the # of failed requests.
2) Also, if I have a group of pods (replicas) related to an application which doesn't have a Service on top of them, I would like to see the aggregated metrics of those pods in a single view in Grafana.
What would be the Prometheus queries to achieve this?
Service level metrics for CPU, memory & network I/O pressure
If you have Prometheus installed on your Kubernetes cluster, all those statistics are already being collected by Prometheus. There are many good articles about how to install and use Kubernetes+Prometheus; try checking that one, as an example.
Here is an example of a request to fetch container memory usage:
container_memory_usage_bytes{image="CONTAINER:VERSION"}
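To turn that into a Service-level number, you can sum over the pods backing the Service, for example by pod-name prefix. A sketch, assuming the label is pod (it is pod_name on older kubelet/cAdvisor versions) and using myservice- as a placeholder prefix:
sum(container_memory_usage_bytes{namespace="default", pod=~"myservice-.*"})
sum(rate(container_cpu_usage_seconds_total{namespace="default", pod=~"myservice-.*"}[5m]))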
Total # of requests, # of failed requests
Those are service-level metrics, and to collect them you need to use a Prometheus exporter created specifically for your service. Check the list of exporters, find the one you need for your service, and follow its instructions.
If you cannot find an exporter for your application, you can write one yourself; here is the official documentation about it.
application which doesn't have a Service on top of them would like to see the aggregated metrics of the pods related to that application in a single view in Grafana
It is possible to combine any graphs in a single view in Grafana using Dashboards and Panels. Check the official documentation; all those topics are covered in detail and are easy to understand.
Aggregation can be done by Prometheus itself with its aggregation operators.
All metrics from Kubernetes have labels, so you can group by them:
sum(http_requests_total) by (application, group), where application and group are labels.
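For the failed-request part of the question, a common pattern is to rate and sum only the error responses. A sketch, assuming your application (or its exporter) exposes http_requests_total with a status label, which is not guaranteed:
sum(rate(http_requests_total{status=~"5.."}[5m])) by (application)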
Also, here is the official Prometheus instruction on how to add Prometheus to Grafana as a data source.