Prometheus Grafana Templating order by count - grafana

I am trying to put a dropdown for each API end point which will show the QPS and Latency of http requests (RED metrics).
I used Grafana's templating and used the following prometheus query.
label_values(http_duration_milliseconds_count, api_path)
But the problem here is the sort order. The dropdown lists every endpoint, including long-tail API requests like /admin/phpMyAdmin.
I want only the top 10 endpoints by count to be shown in this dropdown. How do I achieve this?
Attached an image for reference on my first dashboard.

We can use query_result to achieve this.
https://grafana.com/docs/grafana/latest/datasources/prometheus/template-variables/#use-query-variables
query_result(topk(10, sort_desc(sum(http_tt_ms_count) by (api_path))))
http_tt_ms_count - is my metric timeseries of Prometheus with time taken.
api_path - is my label name
This query_result returns a three-part value (labels, value, timestamp) like this:
{api_path="/search/query"} 25704195 1507641522000
I then used the Regex field of the variable query to keep only the API names:
.*api_path="(.*)".*
This looks like the long way around, but
label_values((topk(10, sort_desc(sum(http_tt_ms_count) by (api_path)))), api_path)
does not work in Grafana, which is why I took this route.
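To see what the Regex field does with a query_result row, here is a minimal Python sketch. It applies the answer's pattern, written here with a leading `.*`, to the sample row shown earlier. Grafana evaluates the pattern with its own regex engine, so this only illustrates the capture group; the second row and the function name are made up for illustration.

```python
import re

# Pattern from the Regex field: capture whatever sits between the quotes
# that follow api_path=.
API_PATH_PATTERN = re.compile(r'.*api_path="(.*)".*')


def extract_api_path(row):
    """Return the api_path label value from one query_result row, or None."""
    match = API_PATH_PATTERN.match(row)
    return match.group(1) if match else None


rows = [
    '{api_path="/search/query"} 25704195 1507641522000',  # row from the answer
    '{api_path="/cart/add"} 1200345 1507641522000',  # hypothetical row
]
api_paths = [extract_api_path(row) for row in rows]
print(api_paths)
```

Only the captured group ends up in the dropdown, which is why the value and timestamp parts of each row disappear.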

Related

Grafana - How to set a default field value for a visualization with a cloudwatch query as the data source

I'm new to grafana, so I might be missing something obvious. But, I have a custom cloudwatch metric that records http response codes into buckets (e.g. 2xx, 3xx, etc.).
My grafana visualization is using a query to pull and group data from cloudwatch and the resulting fields are dynamic: 2xx (us-east-1), 2xx (us-west-1), 3xx (us-east-1), etc.
I then use transformations to aggregate those values for a global view of the data:
The problem is, I can't create the transformation until the data exists. I'd like to have a 5xx field, but since that data is sporadic, it doesn't show up in the UI and I can't find a way to force "5xx (...)" to exist and have it get used when/if those response codes start occurring.
Is there a way to create placeholder fields somehow to achieve this?
You can't create it in the UI, but you can still edit the panel model directly. It is the JSON that represents the whole panel. To edit it manually, click Inspect > Panel JSON in the panel menu, then create and customize another item in the transformation section. It is not a very convenient way to edit a panel, but it will get you there.

How to provide label_values in grafana variables with time range for prometheus data source?

I have used a variable in grafana which looks like this:
label_values(some_metric, service)
If the metric is not emitted by the data source at the current time the variable values are not available for the charts. The variable in my case is the release name and all the charts of grafana are dependent on this variable.
After the server I was monitoring crashed, this metric is not emitted. Even if I set a time range to match the time when metric was emitted, it has no impact as the query for the variable is not taking the time range into account.
In Prometheus I can see the values for the metric using the query:
some_metric[24h]
In grafana this is invalid:
label_values(some_metric[24h], service)
Also, as per the documentation, it's invalid to provide $__range etc. for label_values.
If I have to use the query_result instead how do I write the above invalid grafana query in correct way so that I get the same result as label_values?
Is there any other way to do this?
The data source is Prometheus.
I'd suggest query_result(count by (somelabel)(count_over_time(some_metric[$__range]))) and then use regular expressions to extract out the label value you want.
That I'm using count here isn't too important, it's more that I'm using an over_time function and then aggregating.
The most straightforward and lightweight solution is to use last_over_time function. For example, the following Grafana query template would return all the unique service label values for all the some_metric time series, which were available during the last 24 hours:
label_values(last_over_time(some_metric[24h]), service)

grafana dashboard for prometheus not working

I am a newbie to Grafana and Prometheus. I set up prometheus, grafana, alertmanager, nodeexporter and cadvisor using the docker-compose.yml from this post https://github.com/vegasbrianc/prometheus
And imported grafana dashboard #893 from https://grafana.com/dashboards/893
But the dashboard is not working as I can see N/A in some panels. For example below are the queries used by the panels and I couldn't figure out how to get the values for the template variable in the query. I looked at http://node-exporter:9100/metrics and do not see a value for variable '$server'
Query1: time() - node_boot_time{instance=~"$server:.*"}
Query2:min((node_filesystem_size_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"} - node_filesystem_free_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"} )/ node_filesystem_size_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"})
What should I configure for node-exporter and prometheus to evaluate the template variable $server in the queries?
$server is a Grafana template variable. These usually show up as dropdowns at the top of the Grafana dashboard.
label_values is a Prometheus-specific Grafana function that is applied to a Prometheus query. Your particular example, label_values(node_boot_time, instance) will return all values of the instance label for all node_boot_time metrics collected by Prometheus (i.e. all node exporter targets monitored by Prometheus).
I have no experience with the particular dashboard you are using (or node exporter, for that matter), but usually the cause for some panels displaying "N/A" or no values while other panels work just fine is that the underlying metric names might have changed. You can click on the header of the problematic panel in Grafana, select Edit, then click on the Metrics tab to try different metric names. For "inspiration", check the /metrics endpoint of your node exporter. If you don't know how to get to it, on the Prometheus web interface navigate to Status > Targets and click on the URL of your node exporter.
An old question, but it still didn't work for me.
The reason is that the label_values(...) works fine obtaining all the instance names that have a node_boot_time metric.
The problem is in the regex that follows the expression (next line). In my case it was something tricky resembling "/([^:].*):/". My instance names start with "i-" and contain no colon, so nothing was being selected. I just used a ProductCode to figure out the right instances instead.
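The mismatch described above can be reproduced with a small Python sketch. The pattern is the one quoted in the answer; both instance names are hypothetical, and Grafana applies its own regex engine, so treat this as an illustration only.

```python
import re

# Regex quoted in the answer: capture everything before a colon.
INSTANCE_PATTERN = re.compile(r"([^:].*):")


def server_name(instance):
    """Return the part of an instance label before the colon, or None."""
    match = INSTANCE_PATTERN.search(instance)
    return match.group(1) if match else None


with_port = server_name("node-exporter:9100")  # hypothetical host:port instance
without_port = server_name("i-0abc123")  # hypothetical instance with no colon
print(with_port, without_port)
```

An instance label without a colon yields no match at all, so the variable ends up with no values and every panel that filters on $server shows N/A.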

Tracking events with prometheus and grafana

There's an article "Tracking Every Release" which tells about displaying a vertical line on graphs for every code deployment. They are using Graphite. I would like to do something similar with Prometheus 2.2 and Grafana 5.1. More specifically I want to get an "application start" event displayed on a graph.
Grafana annotations seem to be the appropriate mechanism for this but I can't figure out what type of prometheus metric to use and how to query it.
The simplest way to do this is via the same basic approach as in the article, by having your deployment tool tell Grafana when it performs a deployment.
Grafana has a built-in system for storing annotations, which are displayed on graphs as vertical lines and can have text associated with them. It would be as simple as creating an API key in your Grafana instance and adding a curl call to your deploy script:
curl -H "Authorization: Bearer <apikey>" http://grafana:3000/api/annotations -H "Content-Type: application/json" -d '{"text":"version 1.2.3 deployed","tags":["deploy","production"]}'
For more info on the available options check the documentation:
http://docs.grafana.org/http_api/annotations/
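If the deploy script is written in Python rather than shell, the same call could be sketched with the standard library as below. The URL, token placeholder and payload mirror the curl command; the function name is made up, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_annotation_request(base_url, api_key, text, tags):
    """Build (but do not send) a POST request for Grafana's annotations API."""
    body = json.dumps({"text": text, "tags": tags}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/annotations",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_annotation_request(
    "http://grafana:3000",
    "<apikey>",  # placeholder, as in the curl command
    "version 1.2.3 deployed",
    ["deploy", "production"],
)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the request construction separate from the send makes it easy to check the payload before wiring it into a real deployment.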
Once you have your deployments being added as annotations, you can display those on your dashboard by going to the annotations tab in the dashboard settings and adding a new annotation source:
Then the annotations will be shown on the panels in your dashboard:
You can get the same result purely from Prometheus metrics, no need to push anything into Grafana:
If you wanted to track all restarts your search expression could be something like:
changes(start_time_seconds{job="foo",env="prod"}[5m]) > 0
Or something like this if you only wanted to track version changes (and you had some sort of info metric that provided the version):
alertmanager_build_info unless max_over_time(alertmanager_build_info[1d] offset 5m)
The latter expression should only produce an output for 5 minutes whenever a new alertmanager_build_info metric appears (i.e. one with different labels such as version). You can further tweak it to only produce an output when version changes, e.g. by aggregating away all other labels.
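The `unless` expression keeps a series only when no series with the same label set existed in the earlier window. A rough Python model of that matching rule follows, using made-up label sets and version numbers. It is not how Prometheus evaluates the query, just a sketch of why the output disappears again after about 5 minutes.

```python
def newly_appeared(current, earlier):
    """Return label sets present in `current` but absent from `earlier`.

    Each label set is a frozenset of (label, value) pairs, mirroring how
    `unless` drops left-hand series that have an identical label set on the
    right-hand side.
    """
    return [labels for labels in current if labels not in earlier]


# Hypothetical label sets for alertmanager_build_info.
old = frozenset({("job", "alertmanager"), ("version", "0.14.0")})
new = frozenset({("job", "alertmanager"), ("version", "0.15.0")})

# Right after an upgrade: the new version is exposed now, while the 1d window
# that ends 5 minutes ago has only seen the old one.
fresh = newly_appeared(current=[new], earlier={old})

# More than 5 minutes later the offset window has seen the new version too.
settled = newly_appeared(current=[new], earlier={old, new})
print(fresh, settled)
```

Aggregating away the other labels, as suggested above, would correspond to building each label set from the version label alone.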
A note here as technology has evolved. We get deployment job state information in Prometheus metrics format scraped directly from the community edition of Hashicorp's Nomad and we view this information in Grafana.
In your case, you would just add an additional query to an existing panel to overlay job start events, which is equivalent to a new deployment for us. There are a lot of related metrics "out of the box," such as for a change in job version that can be considered as well. The main point is no additional work is required besides adding a query in Grafana.

Grafana - Graph with metrics on demand

I am using Grafana for my application, where I have metrics being exposed from my data source on demand, and I want to monitor such on-demand metrics in Grafana in a user-friendly graph. For example, until an exception has been hit by my application, the data source does NOT expose the metric named 'Exception'. However, I want to create a graph before hand where I should be able to specify the metric 'Exception' and it should log it in the graph whenever my data source exposes the 'Exception' metric.
When I try to create a graph on Grafana using the web GUI, I'm unable to see these 'on-demand metrics' since they've not yet been exposed by my data source. However, I should be able to configure the graph such that in case these metrics are exposed then show them. If I go ahead and type out the non-exposed metric name in the metrics field, I get an error "Timeseries data request error".
Does Grafana provide a method to do this? If so, what am I missing?
It depends on what data source you are using (Graphite, InfluxDB, OpenTSDB?).
For Graphite you can enter raw query mode (the pen button) and specify whatever query you want; the metric does not need to exist. The same is true for InfluxDB, where you find the raw query mode in the hamburger menu dropdown to the right of each query.
You can also use wildcards in a graphite query (or regex in InfluxDB) to create generic graphs that will add series to the graph as they come in.