How do I group pod metrics by deployment in Prometheus? - kubernetes

Prometheus has metrics such as container_cpu_usage_seconds_total. However, they are only grouped by pod. How can I group them by deployment/cronjobs/etc?

I was able to handle this with the following query:
(
  (
    label_replace(
      (
        rate(container_cpu_usage_seconds_total{image!=""}[2m])
          * on(pod) group_left(owner_name) (sum without (instance) (kube_pod_owner))
      ),
      "replicaset", "$1", "owner_name", "(.*)"
    )
  )
  * on(replicaset) group_left(owner_name) (sum without (instance) (kube_replicaset_owner{}))
)
Here is the explanation:
Join container_cpu_usage_seconds_total with kube_pod_owner on pod
Copy over the owner_name from kube_pod_owner
Use label_replace to copy kube_pod_owner's owner_name into a new replicaset label
Join that with kube_replicaset_owner on replicaset
Copy over the owner_name from kube_replicaset_owner (this value is your deployment etc)
The without (instance) aggregations remove the instance label from the joined sets; because a single deployment can have multiple instances, keeping it would cause matching issues.
Lastly, rate is applied to container_cpu_usage_seconds_total directly at the innermost position, because otherwise Prometheus complains with parse error: ranges only allowed for vector selectors. rate() must be given a range vector selector, so calling it at the innermost level is the way around that.
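If the goal is a single CPU figure per deployment rather than per container, the same expression can be wrapped in a sum by (...) on the final owner_name (the value copied over from kube_replicaset_owner); a hedged sketch:
# Hedged sketch: the expression above wrapped in sum by (...), giving one CPU
# rate per owner (Deployment, CronJob, ...); add namespace to the by () list
# if workload names can repeat across namespaces.
sum by (owner_name) (
  (
    label_replace(
      (
        rate(container_cpu_usage_seconds_total{image!=""}[2m])
          * on(pod) group_left(owner_name) (sum without (instance) (kube_pod_owner))
      ),
      "replicaset", "$1", "owner_name", "(.*)"
    )
  )
  * on(replicaset) group_left(owner_name) (sum without (instance) (kube_replicaset_owner{}))
)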

Related

PromQL "where" clause

How does one add a where clause in PromQL?
I'm trying to construct a query that displays when an application running in Kubernetes has been up for more than one minute but I want to filter by namespace.
This is what my query looks like at the moment
100 * (count(up == 1) BY (job, namespace, service) ) > 1
This works fine but it gives me additional information that I don't need.
{job="prometheus-grafana", namespace="monitor", service="prometheus-grafana"}
{job="jenkins", namespace="jenkins", service="jenkins"}
{job="kube-state-metrics", namespace="monitor", service="prometheus-kube-state-metrics"}
{job="node-exporter", namespace="monitor", service="prometheus-prometheus-node-exporter"}
{job="kubelet", namespace="kube-system", service="prometheus-kube-prometheus-kubelet"}
{job="apiserver", namespace="default", service="kubernetes"}
What I'm trying to accomplish is to get results for only the jenkins and default namespace.
{job="apiserver", namespace="default", service="kubernetes"}
{job="jenkins", namespace="jenkins", service="jenkins"}
I've tried doing
100 * (count(up == 1) BY (job, namespace, service) ) > 1 and ON {namespace="jenkins"}
But I get an invalid parameter "query": 1:65: parse error: unexpected "{" in grouping opts, expected "(" error.
You would have to filter the up metric by the labels you want (namespaces, in your case). It should look something like this:
100 * count(up{namespace=~"default|jenkins"} == 1) > 1
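If you want to keep the original grouping and use the and operator from your attempt, note that set operators take on(<label list>) rather than a {label="value"} selector; a hedged sketch:
# Hedged sketch: keep the original expression and drop series whose namespace
# has no match on the right-hand side. Set operators (and/or/unless) are not
# subject to the one-to-one matching restriction.
(100 * count(up == 1) by (job, namespace, service) > 1)
  and on(namespace)
count(up{namespace=~"default|jenkins"}) by (namespace)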
You can try this too. In Kubernetes, every workload ultimately runs as pods, so you can take the pod start-time metric and compare it against the current time minus 60 seconds, which gives you the pods that have been running for more than one minute:
time() - 60 > kube_pod_start_time
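To restrict that to the namespaces from the question, a label selector can be added to the kube-state-metrics series (a hedged sketch; kube_pod_start_time carries a namespace label by default):
# Hedged sketch: pods in the default or jenkins namespaces that started more
# than 60 seconds ago.
time() - 60 > kube_pod_start_time{namespace=~"default|jenkins"}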
Prometheus provides the following ways for filtering the data in queries:
Time series selectors. They allow filtering time series by metric name and labels. For example, up{namespace=~"default|jenkins"} is a series selector which returns only time series with the name up whose namespace label matches the regular expression default|jenkins. This is roughly equivalent to the following SQL:
SELECT * FROM table WHERE name = 'up' and namespace ~ '^(default|jenkins)$'
Comparison operators, which allow filtering time series by value. For example, up == 0 returns time series named up whose value is 0. This is roughly equivalent to the following SQL:
SELECT * FROM table WHERE name = 'up' and value = 0
Time series matching via binary operators. This allows performing join-like queries. For example, up * on(instance) group_left(name) node_os_info joins the up metric with the node_os_info metric on the instance label and copies over the additional name label from node_os_info. This is roughly equivalent to the following SQL:
SELECT up.*, node_os_info.name
FROM up LEFT JOIN node_os_info USING (instance)
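Applied back to the question above, the first two mechanisms are usually enough; a hedged sketch that keeps the original grouping while restricting the namespaces:
# Hedged sketch: series selector (mechanism 1) plus comparison (mechanism 2),
# keeping the asker's by (job, namespace, service) grouping.
100 * count(up{namespace=~"default|jenkins"} == 1) by (job, namespace, service) > 1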

How can I filter within a created column in pgsql

I have the following query in pgsql to extract a tabular view that has the 3 columns - node, value and time. The node column is computed, and if I try to filter for a specific node I get the message "node is not a column in the table metrics". So do I need to use a regex, or something like where node like 'xyz'?
Query:
select label->>'machine' as node, value, time
from metrics
where metrics.name='total_bytes' and time<'09-22-2020'
group by node, metrics.value, metrics.time
order by node;
Thanks for any help.
You can write the clause this way:
where label->>'machine' like 'xyz'
To use node directly, you could do this:
with my_metrics as (
select label->>'machine' as node, value, time
from metrics
where metrics.name='total_bytes' and time<'09-22-2020'
)
select node, value, time
from my_metrics
where node like 'xyz'
group by node, value, time
order by node;

Sum query result by name specified by regex

I am using Grafana together with Prometheus to display data of my Pods from Kubernetes Cluster. Here I am displaying Memory Usage for each Pod by name:
sum (container_memory_working_set_bytes{namespace="namespace1", image!="",name=~"^k8s_.*",kubernetes_io_hostname=~"^$Node$"}) by (pod_name)
It gives the correct result for each pod. For example:
namespace1-eventstore-1
namespace1-eventstore-0
avsandbox-X-64ff4d-rl9z6
avsandbox-X-64ff4d-ldfnx
avsandbox-Y-7d9df9ddff-asdf
avsandbox-Y-7d9df9ddff-dfas
avsandbox-Z-5957dbaf58dt-gds24
avsandbox-Z-5957dbaf58dt-g4gd7
Now I want to sum them by their respective names to get the following result, or the closest I can get to it:
namespace1-eventstore
avsandbox-X
avsandbox-Y
avsandbox-Z
So in conclusion, I want to sum everything that has the same name before the second -. How can I achieve that?
Edit: Here's a further example of what I'm looking for (hopefully a practical example helps convey the general idea):
sum (container_memory_working_set_bytes{namespace="namespace1", image!="",name=~"^k8s_.*",kubernetes_io_hostname=~"^$Node$"}) by (pod_name="([a-zA-Z0-9]+-[a-zA-Z0-9])-.*")
But that's not valid syntax.
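There is no regex support inside by (...), but you can derive a new label from pod_name with label_replace and then sum by that derived label; a hedged sketch, with workload as an arbitrary label name:
# Hedged sketch: copy everything before the second "-" of pod_name into a new
# "workload" label, then aggregate on it.
sum by (workload) (
  label_replace(
    container_memory_working_set_bytes{namespace="namespace1", image!="", name=~"^k8s_.*", kubernetes_io_hostname=~"^$Node$"},
    "workload", "$1", "pod_name", "([a-zA-Z0-9]+-[a-zA-Z0-9]+)-.*"
  )
)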

Prometheus many-to-many problem for kube cronjobs

Hi there,
I'm trying to configure monitoring & alerts for Kubernetes CronJobs with Prometheus. I found this helpful guide.
But I always get a many-to-many matching not allowed: matching labels must be unique on one side error.
For example, this is the PromQL query which triggers this error:
max(
kube_job_status_start_time
* ON(job_name) GROUP_RIGHT()
kube_job_labels{label_cronjob!=""}
) BY (job_name, label_cronjob)
The queries by themselves return e.g. these series:
kube_job_status_start_time:
kube_job_status_start_time{app="kube-state-metrics",chart="kube-state-metrics-0.12.1",heritage="Tiller",instance="REDACTED",job="kubernetes-service-endpoints",job_name="test-1546295400",kubernetes_name="kube-state-metrics",kubernetes_namespace="monitoring",kubernetes_node="REDACTED",namespace="test-develop",release="kube-state-metrics"}
kube_job_labels{label_cronjob!=""}:
kube_job_labels{app="kube-state-metrics",chart="kube-state-metrics-0.12.1",heritage="Tiller",instance="REDACTED",job="kubernetes-service-endpoints",job_name="test-1546295400",kubernetes_name="kube-state-metrics",kubernetes_namespace="monitoring",kubernetes_node="REDACTED",label_cronjob="test",label_environment="test-develop",namespace="test-develop",release="kube-state-metrics"}
Is there something I'm missing here? The same many-to-many error happens for every query I tried from the guide.
Even constructing it by myself from ground up resulted in the same error.
Hope you can help me out here :)
In my case I don't get this extra label from Prometheus when installed via helm (stable/prometheus-operator).
You need to configure this in Prometheus; the setting is called honor_labels: false.
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels.
So you have to set honor_labels: false in your prometheus.yaml scrape config.
# Setting honor_labels to "true" is useful for use cases such as federation and
# scraping the Pushgateway, where all labels specified in the target should be
# preserved
Anyway, even when I have it like this (I now get exported_job labels), I still can't run a proper query, but I guess that's still because of my LHS:
Error executing query: found duplicate series for the match group
{exported_job="kube-state-metrics"} on the left hand-side of the operation:
[{__name__=
I ran into the same issue when I followed that article, but in my case it was because I actually get duplicate job names in different namespaces.
For example, when running kube_job_status_start_time:
kube_job_status_start_time{instance="REDACTED",job="kube-state-metrics",job_name="job-abc-123",namespace="us"}
kube_job_status_start_time{instance="REDACTED",job="kube-state-metrics",job_name="job-abc-123",namespace="ca"}
So I had to either add a filter for the namespace or add namespace into the ON/BY clauses to get it to be unique.
e.g. for one of the subqueries I had to do this:
max(
kube_job_status_start_time
* ON(namespace, job_name) GROUP_RIGHT()
kube_job_labels{label_cronjob!=""}
) BY (namespace, label_cronjob)
Essentially I had to apply that principle to all the rest of the queries for it to work for me. Not sure if that applies in your case.
Replacing kube_job_status_start_time with max(kube_job_status_start_time) by (job_name) will aggregate out any duplicates and should resolve the error.
The resulting query will look like this
max(
max(kube_job_status_start_time) by (job_name)
* ON(job_name) GROUP_RIGHT()
kube_job_labels{label_cronjob!=""}
) BY (job_name, label_cronjob)
I dug into this issue a bit more, and I guess the root cause of it is within this one-to-many vector matching expression:
kube_job_status_start_time * ON(job_name) GROUP_RIGHT() kube_job_labels{label_cronjob!=""}
where the group modifier GROUP_RIGHT() indicates that each vector element from the left side (kube_job_status_start_time) can match multiple elements on the right side (kube_job_labels), based on the common label job_name. The thing is that we are really dealing with many-to-many matching here, as each vector element from the right side can also match multiple elements from the left vector.
I think what we are missing here is a way for Prometheus to uniquely identify the exported Job objects from Kubernetes. The author of this blog post mentions this feature in his setup:
...Prometheus resolves this collision of label names by including the
raw metric’s label as an exported_job label...
In my case I don't get this extra label from Prometheus when installed via helm (stable/prometheus-operator).
Regarding the missing labels - make sure that your kube-state-metrics is configured with a --metric-labels-allowlist. This is "new" since kube-state-metrics v2. See https://kubernetes.io/blog/2021/04/13/kube-state-metrics-v-2-0/#what-is-new-in-v2-0
By default, the metric contains only name and namespace labels.
But... the original guide does not work with newer kube-state-metrics anyway. I can recommend this guide, which is a rework and does not need the labels.
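For reference, a hedged example of that flag (assuming your Jobs carry a cronjob label, as the guide's label_cronjob queries expect; adjust the resource and label names to your setup):
# kube-state-metrics container argument (hypothetical label name): exports the
# Jobs' "cronjob" Kubernetes label as label_cronjob on kube_job_labels.
--metric-labels-allowlist=jobs=[cronjob]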

Combine Grafana metrics with mismatched labels

I have two metrics (relating to memory usage in my Kubernetes pods) defined as follows:
kube_pod_container_resource_limits_memory_bytes{app="kube-state-metrics",container="foo",instance="10.244.0.7:8080",job="kubernetes-endpoints",kubernetes_name="kube-state-metrics",kubernetes_namespace="monitoring",namespace="test",node="aks-nodepool1-25518080-0",pod="foo-cb9bc5fb5-2bghz"}
container_memory_working_set_bytes{agentpool="nodepool1",beta_kubernetes_io_arch="amd64",beta_kubernetes_io_instance_type="Standard_A2",beta_kubernetes_io_os="linux",container_name="foo",failure_domain_beta_kubernetes_io_region="westeurope",failure_domain_beta_kubernetes_io_zone="1",id="/kubepods/burstable/pod5b0099a9-eeff-11e8-884b-ca2011a99774/eeb183b21e2b3226a32de41dd85d7a2e9fc8715cf31ea7109bfbb2cae7c00c44",image="#sha256:6d6003ba86a0b7f74f512b08768093b4c098e825bd7850db66d11f66bc384870",instance="aks-nodepool1-25518080-0",job="kubernetes-cadvisor",kubernetes_azure_com_cluster="MC_test.planned.bthbygg.se_bthbygg-test_westeurope",kubernetes_io_hostname="aks-nodepool1-25518080-0",kubernetes_io_role="agent",name="k8s_foo_foo-cb9bc5fb5-2bghz_test_5b0099a9-eeff-11e8-884b-ca2011a99774_0",namespace="test",pod_name="foo-cb9bc5fb5-2bghz",storageprofile="managed",storagetier="Standard_LRS"}
I want to combine these two into a percentage, by doing something like
container_memory_working_set_bytes{namespace="test"}
/ kube_pod_container_resource_limits_memory_bytes{namespace="test"}
but that gives me no data back, presumably because there are no matching labels to join the data sets on. As you can see, I do have matching label values, but the label names don't match.
Is there somehow I can formulate my query to join these on e.g. pod == pod_name, without having to change the metrics at the other end (where they are exported)?
You can use the PromQL label_replace function to create a new matching label from the original labels.
For instance, you can use the expression below to add a container_name="foo" label to the first metric, which can then be used for the join:
label_replace(
  kube_pod_container_resource_limits_memory_bytes,
  "container_name", "$1", "container", "(.*)")
You can use the above pattern to create new labels that can be used for the matching.
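Putting it together, a hedged sketch of the full percentage query: align both mismatched label names on the kube-state-metrics side, aggregate each side to the same label set to guard against duplicate series, and then divide:
# Hedged sketch: working-set memory as a percentage of the configured limit,
# matched per namespace/pod/container once the label names agree. The
# container_name!="" filter drops the pod-level and pause-container series.
100 *
  sum by (namespace, pod_name, container_name) (
    container_memory_working_set_bytes{namespace="test", container_name!=""}
  )
/
  max by (namespace, pod_name, container_name) (
    label_replace(
      label_replace(
        kube_pod_container_resource_limits_memory_bytes{namespace="test"},
        "container_name", "$1", "container", "(.*)"),
      "pod_name", "$1", "pod", "(.*)")
  )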