Show metrics in Grafana from the Kubernetes Pod that was scraped last by Prometheus - kubernetes

Context
We have a Spring Boot application deployed into a K8s cluster (with 2 instances), configured with the Micrometer exporter for Prometheus and visualized in Grafana.
My custom metrics
I've implemented a couple of additional Micrometer metrics that report information about business data in the database (PostgreSQL). I can see those metrics in Grafana, but separately for each pod.
Problem:
For our 2 pods, Grafana shows a separate set of the same metrics, and the most recent value can be found by choosing one of the pods (by label).
However, there is no way to tell which pod reported the most recent values.
Is there a way to always show the metric values from the pod that was scraped last (i.e. the one holding the freshest metric data)?
Right now, to see the freshest metric data, I have to switch pods and guess which one has the latest values.
(The metrics in question relate to the database, so they yield the same values no matter which pod they are requested from.)

In Prometheus, you can obtain the labels of the latest scrape using the topk() aggregation operator and the timestamp() function:
topk(1,timestamp(up{job="micrometer"}))
This can then be used in Grafana to populate a (hidden) variable containing the instance name:
Name: instance
Type: Query
Query: topk(1,timestamp(up{job="micrometer"}))
Regex: /.*instance="([^"]*)".*/
I advise activating refresh on time range change, so that you get the last scrape in your time range.
Then you can use the variable in all your dashboard's queries:
micrometer_metric{instance="${instance}"}
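As a rough illustration of what the Regex field above does, here is a minimal Python sketch. The series text is a hypothetical example of what the variable query might return (the exact string Grafana hands to the regex is an assumption here), and Python's re module stands in for Grafana's JavaScript regex engine.

```python
import re

# Same pattern as the Grafana "Regex" field, without the surrounding slashes.
pattern = re.compile(r'.*instance="([^"]*)".*')

# Hypothetical series text; the real text depends on your labels and Grafana version.
series_text = '{instance="pod-b:8080", job="micrometer"}'

match = pattern.match(series_text)
# The first capture group becomes the value of the dashboard variable.
instance = match.group(1) if match else None
print(instance)
```

The capture group keeps only the text between the quotes after instance=, which is why the variable ends up holding a bare instance name.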
EDIT: requester wants to update it on each data refresh
If you want to update it on each data refresh, the topk() expression has to be included in every query of your dashboard, using the AND logical operator:
micrometer_other_metric AND ON(instance) topk(1,timestamp(up{job="micrometer"}))
vector1 AND vector2 results in a vector consisting of the elements of vector1 for which there are elements in vector2 with exactly matching label sets. Other elements are dropped.
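To make the matching behaviour concrete, here is a small Python sketch that mimics the query above on hand-written data. It is only a simulation under stated assumptions: the pod names, label names and values are hypothetical, and the helper functions are illustrative stand-ins, not part of any Prometheus client library.

```python
def latest_scrape(up_timestamps):
    """Mimic topk(1, timestamp(up{...})): keep the series with the newest scrape time."""
    return [max(up_timestamps, key=lambda sample: sample[1])]

def and_on(left, right, label):
    """Mimic `left AND ON(label) right`: keep left samples whose label value occurs in right."""
    allowed = {labels[label] for labels, _ in right}
    return [(labels, value) for labels, value in left if labels[label] in allowed]

# Hypothetical instant vectors as (labels, value) pairs.
up_timestamps = [
    ({"instance": "pod-a:8080"}, 1700000010.0),
    ({"instance": "pod-b:8080"}, 1700000025.0),  # scraped most recently
]
metric = [
    ({"instance": "pod-a:8080", "kind": "orders"}, 41.0),
    ({"instance": "pod-b:8080", "kind": "orders"}, 42.0),
]

result = and_on(metric, latest_scrape(up_timestamps), "instance")
print(result)
```

Only the sample from the most recently scraped instance survives. The values on the right-hand side are never used, only their instance label.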

Related

I can't find CloudWatch metric in Grafana UI query editor/builder

I'm trying to create a Grafana dashboard that will reflect my AWS RDS cluster metrics.
For simplicity I chose CloudWatch as the data source. It works well for showing the 'direct' metrics from the RDS cluster.
The problem is that we've switched to RDS Proxy due to the high number of connections we are required to support.
Now I'm adjusting my dashboard to include a few metrics that are missing. The most important is the number of actual connections, which the AWS CloudWatch console presents with this query:
SELECT AVG(DatabaseConnections)
FROM SCHEMA("AWS/RDS", ProxyName,Target,TargetGroup)
WHERE Target = 'db:my-db-1'
AND ProxyName = 'my-db-rds-proxy'
AND TargetGroup = 'default'
The problem is that I can't find it anywhere in the CloudWatch Grafana query editor:
The only metric with "connections" is the standard DatabaseConnections which represents the 'direct' connections to the RDS cluster and not the connections to the RDS Proxy.
Any ideas?
That UI editor is generated from a hardcoded list of metrics, which may not contain all metrics and dimensions (especially ones that were added recently), so in that case the UI doesn't offer them in the select box.
But that is not a problem, because that select box is not a standard select box. It is an input where you can type your own metric and dimension name. Just click there, type what you need and press Enter to add it (the same is applicable for:
Pro tip: don't use the UI query builder (that's for beginners); switch to Code and write your queries directly (the UI builder builds that query under the hood anyway):
It would be nice if you created a Grafana PR that adds the metrics and dimensions missing from the UI builder to metrics.go.
For whoever ends up here: you should use ClientConnections, with ProxyName as the dimension (which I didn't set initially).
I was using an old Grafana version (7.3.5), which didn't have it built in.

How to provide label_values in grafana variables with time range for prometheus data source?

I have used a variable in grafana which looks like this:
label_values(some_metric, service)
If the metric is not being emitted by the data source at the current time, the variable values are not available for the charts. The variable in my case is the release name, and all the charts in Grafana depend on this variable.
After the server I was monitoring crashed, this metric is no longer emitted. Even if I set the time range to match the period when the metric was emitted, it has no effect, because the variable query does not take the time range into account.
In Prometheus I can see the values for the metric using the query:
some_metric[24h]
In grafana this is invalid:
label_values(some_metric[24h], service)
Also, per the documentation, it's invalid to use $__range etc. in label_values.
If I have to use query_result instead, how do I write the invalid Grafana query above correctly, so that I get the same result as label_values?
Is there any other way to do this?
The data source is Prometheus.
I'd suggest query_result(count by (somelabel)(count_over_time(some_metric[$__range]))) and then use regular expressions to extract out the label value you want.
That I'm using count here isn't too important, it's more that I'm using an over_time function and then aggregating.
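The idea behind the suggestion above can be sketched in Python: first decide which series had at least one sample inside the range, then group those series by the label of interest. This is a simplified simulation, not how Prometheus evaluates the query internally. The sample data, label names and the closed interval are assumptions made for illustration.

```python
def label_values_in_range(samples, label, start, end):
    """Return {label value: number of series with at least one sample in [start, end]}."""
    series_seen = set()
    for labels, timestamp in samples:
        if start <= timestamp <= end:
            # A series is identified by its full label set.
            series_seen.add(tuple(sorted(labels.items())))
    counts = {}
    for series in series_seen:
        value = dict(series)[label]
        counts[value] = counts.get(value, 0) + 1
    return counts

# Hypothetical samples as (labels, unix timestamp); "batch" stopped reporting at t=50.
samples = [
    ({"service": "api", "instance": "a"}, 100),
    ({"service": "api", "instance": "a"}, 160),
    ({"service": "api", "instance": "b"}, 120),
    ({"service": "batch", "instance": "c"}, 50),
]

wide = label_values_in_range(samples, "service", 0, 200)
narrow = label_values_in_range(samples, "service", 100, 200)
print(wide, narrow)
```

With the wide range the crashed "batch" service still shows up as a key. With the narrow range it disappears, which mirrors what happens when the variable query only looks at the current instant.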
The most straightforward and lightweight solution is to use the last_over_time function. For example, the following Grafana query template would return all the unique service label values for all the some_metric time series that were available during the last 24 hours:
label_values(last_over_time(some_metric[24h]), service)

Dynamic dropdown values grafana with prometheus

This is probably simple and I am missing some piece.
I have a Grafana dashboard backed by Prometheus. Prometheus is running in two different Kubernetes clusters.
What I want is the first dropdown to be the cluster - say A and B. And based on what I select in the first dropdown, I want the values populated in the second dropdown. The second dropdown in my case is label_values.
The first dropdown is defined by a variable named datasource, of type Datasource, with the data source type option set to Prometheus.
For the second dropdown, I have a variable named service, with type=Query.
In the query options, I define the query as label_values(rt), but that returns the values of all labels irrespective of the cluster I chose in the first dropdown.
Any help is appreciated.
You need to use the value of the first template variable in the query for the second. That is, assuming your metric labels for cluster and service are actually named cluster and service respectively, you should define your template variable queries as:
cluster: label_values(up, cluster)
service: label_values(up{cluster="$cluster"}, service)
This will automagically populate the second dropdown whenever you change selection in the first.
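The dependency between the two variables can be pictured with a short Python sketch. It imitates label_values over a hand-written list of label sets. The label_values helper here is a hypothetical stand-in written for this example, not a Grafana or Prometheus API, and the cluster and service names are made up.

```python
def label_values(series, label, **matchers):
    """Imitate label_values(metric{matchers}, label): unique values of `label` among matching series."""
    return sorted({
        labels[label]
        for labels in series
        if label in labels and all(labels.get(key) == value for key, value in matchers.items())
    })

# Hypothetical label sets of the `up` metric across two clusters.
up_series = [
    {"cluster": "A", "service": "web"},
    {"cluster": "A", "service": "db"},
    {"cluster": "B", "service": "cache"},
]

clusters = label_values(up_series, "cluster")
services_in_a = label_values(up_series, "service", cluster="A")
print(clusters, services_in_a)
```

The second call plays the role of the service variable after $cluster has been replaced with the current selection, so its options change whenever the first dropdown changes.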

grafana dashboard for prometheus not working

I am a newbie to Grafana and Prometheus. I set up Prometheus, Grafana, Alertmanager, node exporter and cAdvisor using the docker-compose.yml from this post https://github.com/vegasbrianc/prometheus
And imported grafana dashboard #893 from https://grafana.com/dashboards/893
But the dashboard is not working as I can see N/A in some panels. For example below are the queries used by the panels and I couldn't figure out how to get the values for the template variable in the query. I looked at http://node-exporter:9100/metrics and do not see a value for variable '$server'
Query1: time() - node_boot_time{instance=~"$server:.*"}
Query2: min((node_filesystem_size_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"} - node_filesystem_free_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"})
What should I configure for node-exporter and prometheus to evaluate the template variable $server in the queries?
$server is a Grafana template variable. These usually show up as dropdowns at the top of the Grafana dashboard.
label_values is a Prometheus-specific Grafana function that is applied to a Prometheus query. Your particular example, label_values(node_boot_time, instance) will return all values of the instance label for all node_boot_time metrics collected by Prometheus (i.e. all node exporter targets monitored by Prometheus).
I have no experience with the particular dashboard you are using (or node exporter, for that matter), but usually the cause for some panels displaying "N/A" or no values while other panels work just fine is that the underlying metric names might have changed. You can click on the header of the problematic panel in Grafana, select Edit, then click on the Metrics tab to try different metric names. For "inspiration", check the /metrics endpoint of your node exporter. If you don't know how to get to it, on the Prometheus web interface navigate to Status > Targets and click on the URL of your node exporter.
An old question, but it still didn't work for me.
label_values(...) itself works fine: it returns all the instance names that have a node_boot_time metric.
The problem is in the regex that follows the expression (next line). In my case it was something tricky resembling "/([^:].*):/", which only matches when the instance name contains a colon. My instance names start with "i-" and contain no colon, so nothing was being selected. I just used a ProductCode to figure out the right instances instead.
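The failure described above is easy to reproduce outside Grafana. The sketch below uses Python's re module as a stand-in for Grafana's JavaScript regex engine. The instance names are hypothetical, and lenient_regex is only one possible alternative, not the pattern used by the dashboard.

```python
import re

# Regex resembling the one from the dashboard: requires a literal ':' after the capture.
colon_regex = re.compile(r"([^:].*):")
# Hypothetical alternative: capture everything before an optional ':port' suffix.
lenient_regex = re.compile(r"^([^:]+)")

def extract(regex, instance):
    """Return the first capture group, or None when the regex does not match."""
    match = regex.search(instance)
    return match.group(1) if match else None

for instance in ["10.0.0.5:9100", "i-0abc123"]:
    print(instance, extract(colon_regex, instance), extract(lenient_regex, instance))
```

The host:port style name matches both patterns, while the name without a colon only matches the lenient one, which is why the variable stayed empty with the original regex.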

Varying labels in Prometheus

I annotate my Kubernetes objects with things like version and whom to contact when there are failures. How would I relay this information to Prometheus, knowing that these annotation values will frequently change? I can't capture this information in Prometheus labels, as they serve as the primary key for a target (e.g. if the version changes, it's a new target altogether, which I don't want). Thanks!
I just wrote a blog post about this exact topic! https://www.weave.works/aggregating-pod-resource-cpu-memory-usage-arbitrary-labels-prometheus/
The trick is Kubelet/cAdvisor doesn't expose them directly, so I run a little exporter which does, and join this with the pod name in PromQL. The exporter is: https://github.com/tomwilkie/kube-api-exporter
You can do a join in Prometheus like this:
sum by (namespace, name) (
  sum(rate(container_cpu_usage_seconds_total{image!=""}[5m])) by (pod_name, namespace)
  * on (pod_name) group_left(name)
  k8s_pod_labels{job="monitoring/kube-api-exporter"}
)
Here I'm using a label called "name", but it could be any label.
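For readers less familiar with group_left, the join above can be approximated in Python as a dictionary lookup followed by a grouped sum. This is a sketch under assumptions: the pod names and CPU rates are invented, and the k8s_pod_labels samples are assumed to have the value 1, so that the multiplication only copies the "name" label over.

```python
from collections import defaultdict

def join_and_sum(cpu_by_pod, pod_labels):
    """cpu_by_pod: {(namespace, pod_name): cpu rate}; pod_labels: {pod_name: value of "name"}."""
    totals = defaultdict(float)
    for (namespace, pod_name), rate in cpu_by_pod.items():
        name = pod_labels.get(pod_name)
        if name is None:
            continue  # no matching series on the right side, so the sample is dropped
        totals[(namespace, name)] += rate  # outer sum by (namespace, name)
    return dict(totals)

# Hypothetical per-pod CPU rates (the inner sum by (pod_name, namespace)).
cpu_by_pod = {
    ("default", "web-1"): 0.25,
    ("default", "web-2"): 0.5,
    ("default", "orphan"): 1.0,  # not known to the exporter
}
# Hypothetical pod-to-label mapping as exported by k8s_pod_labels.
pod_labels = {"web-1": "web", "web-2": "web"}

totals = join_and_sum(cpu_by_pod, pod_labels)
print(totals)
```

Both web pods collapse into a single ("default", "web") entry, while the pod without label information is dropped, just as an unmatched sample would be in the PromQL join.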
We use the same trick to get metrics (such as error rate) by version, which we then use to drive our continuous deployment system. kube-api-exporter exports a bunch of useful meta-information about Kubernetes objects to Prometheus.
Hope this helps!