Prometheus query equivalent to SQL DISTINCT - Grafana

I have multiple Prometheus instances providing the same metric, such as:
my_metric{app="foo", state="active", instance="server-1"} 20
my_metric{app="foo", state="inactive", instance="server-1"} 30
my_metric{app="foo", state="active", instance="server-2"} 20
my_metric{app="foo", state="inactive", instance="server-2"} 30
Now I want to display this metric in a Grafana singlestat widget. When I use the following query...
sum(my_metric{app="foo", state="active"})
...it, of course, sums up all values and returns 40. So I tell Prometheus to sum it by instance...
sum(my_metric{app="foo", state="active"}) by (instance)
...which results in a "Multiple Series Error" in Grafana. Is there a way to tell Prometheus/Grafana to only use the first of the results?

I don't know of a distinct, but I think this would work too:
topk(1, sum(my_metric{app="foo", state="active"}) by (instance))
Check out the second to last example in here:
https://prometheus.io/docs/prometheus/latest/querying/examples/

One way I just found is to additionally do an average over all values:
avg(sum(my_metric{app="foo", state="active"}) by(instance))

If you need to return an arbitrary time series out of multiple matching time series, then this can be done with topk() or bottomk() functions. For example, the following query returns a single time series with the maximum value out of multiple time series which match my_metric{app="foo", state="active"}:
topk(1, my_metric{app="foo", state="active"})
You need to set the 'Instant' query option in Grafana when using topk(). Otherwise topk(1, ...) may return multiple time series when it is used for building a graph with a range query, because topk(1, ...) selects the single time series with the maximum value individually for each point on the graph, and different points may pick different series. There is a workaround which allows returning a single series out of many on a graph in alternative Prometheus-like systems such as VictoriaMetrics: it provides topk_* and bottomk_* functions for this purpose. See, for example, topk_last or topk_avg.
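If you happen to run VictoriaMetrics, a minimal sketch using its MetricsQL topk_avg function, which ranks series by their average value over the graph range and keeps exactly one series for all points (this function is MetricsQL-specific and not available in plain Prometheus):
topk_avg(1, my_metric{app="foo", state="active"})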
Note that topk() has nothing in common with SQL's DISTINCT. If you need to select distinct label values with PromQL, then you need to use count(...) by (label). It returns the unique values of the given label alongside the number of time series for each label value. For example, count(my_metric) by (app) will return the unique app label values across time series with the name my_metric. This is roughly equivalent to the following SQL with a DISTINCT clause:
SELECT DISTINCT app FROM my_metric
See count() docs for details.
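If you only need the distinct label values themselves, without the per-value counts, one possible sketch uses the group() aggregation operator available in recent Prometheus versions (2.20+); it returns one series per distinct app value, each with the value 1:
group by (app) (my_metric)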

Related

How can I add a line per grouping in grafana without hardcoding multiple queries?

I have data that can be aggregated by the company that produced the data item. There are around 96 such companies. As such I don't want to use 96 queries, as this seems inefficient.
How can I get grafana to do this with time series data please so I can get all the lines on the same graph?
CAVEAT: I get that 96 data streams is a lot on one graph. However I'm interested in boundary breaches and outliers which don't occur very often per supplier.
Grafana creates multiple lines if your query returns three columns named time, metric and value. The metric column has to be a string, and in this case I suppose it is the company id; if it is an integer id then you need to cast it to a string. Also, the query type needs to be set to time series.
For me, this works:
SELECT
date AS 'time',
cast(runDate AS char) as 'metric',
value/1000 as 'value'
FROM forecast
WHERE $__timeFilter(runDate)
ORDER BY date
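Applied to the per-company question above, a sketch that assumes a hypothetical table measurements with columns recorded_at, company_id and reading (substitute your real table and column names):
SELECT
recorded_at AS 'time',
cast(company_id AS char) as 'metric',
reading as 'value'
FROM measurements
WHERE $__timeFilter(recorded_at)
ORDER BY recorded_at
Because company_id is returned as the metric column, Grafana draws one line per company from this single query.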

Find count of active users in the last 29 days in Tableau

Require assistance in calculating the Total Active Users from Feb 16 2020 to March 16 2020.
I have tried using calculated fields, but not getting the correct results. Please advise.
Thank you,
Nirmal
To find the number of unique values that appear in a field, say [user_code], you can use the COUNT DISTINCT function, COUNTD() as in COUNTD([user_code])
To restrict the data to a particular time range, one way is to put your date field on the Filter shelf and choose settings that include only the data rows you want, say the range from 2/16 to 3/16 as you stated.
Alternatively, you can push the filtering condition into the calculation with an IF function call, as in COUNTD(IF <data is relevant> THEN [user_code] END), effectively combining the two techniques. That works because, if there is no ELSE clause and the IF condition is false, the IF statement evaluates to null. Since COUNTD() silently ignores nulls, like other aggregation functions, the expression acts as if the irrelevant data rows were filtered out.
So, for example,
COUNTD(IF [dates] >= #2/16/2020# AND [dates] <= #3/16/2020# THEN [user_code] END)
will tell you the number of unique user codes during the period between 2/16 and 3/16. The DATEDIFF() function will probably be useful in more elaborate tests.
Finally, what if you want more flexibility? You could easily use Parameters or Filter controls to let the user choose the date range interactively.
If you want this calculation repeated for each possible day, showing the unique users in the preceding 30-day period as a rolling calculation, then you'll need to learn about some more advanced features: either multiple calculations as above for different time ranges, Table Calculations, or some data prep and/or data padding with Tableau Prep Builder, Python or another tool. This is mostly because in that scenario each data row contributes to multiple rolling counts, rather than to a single count when partitioning the data by some dimension.
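For the simpler case of counting unique users over the last 30 days relative to today, a sketch of a single calculated field, assuming the [dates] and [user_code] fields from above:
// count distinct users with at least one row dated within the last 30 days
COUNTD(IF DATEDIFF('day', [dates], TODAY()) >= 0 AND DATEDIFF('day', [dates], TODAY()) < 30 THEN [user_code] END)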

Grafana: combining two queries from two prometheus exporters

I have two exporters for feeding data into prometheus - the node exporter and the elasticsearch exporter. I'm trying to combine sources from both exporters into one query, but unfortunately get "No data points" in the graph.
Each of the series successfully shows data:
elasticsearch_jvm_memory_max_bytes{cluster="$cluster", name=~"$node"}
node_memory_MemTotal{name=~"$node"}
This is the result when I try to subtract the two series from one another:
node_memory_MemTotal{name=~"$node"} - elasticsearch_jvm_memory_max_bytes{cluster="$cluster", name=~"$node"}
What am I missing here?
Thanks.
The subtraction you are trying here is more complex than it looks at first glance. On both sides of the - operator are queries that can each result in one or more time series, so the requested operation works as follows: the query on the left-hand side is executed and returns one or more time series (a time series is a unique combination of a metric name and all of its labels and their values). Then the query on the right-hand side is executed, which also returns one or more time series. To calculate the result, only those series whose label sets match on both sides are used.
For your example this means that the metrics from node_exporter and from elasticsearch_exporter have different label names (or the same labels with different values). When no combinations exist on both sides, you will see an empty result. For details on how operators are applied, please see the Prometheus docs.
To solve your problem, you could do the following:
Check the metrics of both left and right side on their own
Evaluate if there are additional labels that could be ignored
See if there is a good label to match on (e.g. instance / node / hostname)
Use the ignoring(a,b,c) on the required side(s) to drop superfluous dimensions, e.g. the job
Try the following query:
node_memory_MemTotal{name=~"$node"}
- on(name)
sum(elasticsearch_jvm_memory_max_bytes{cluster="$cluster", name=~"$node"}) by (name)
It works in the following way:
It selects all the time series matching the node_memory_MemTotal{name=~"$node"} time series selector.
It selects all the time series matching the elasticsearch_jvm_memory_max_bytes{cluster="$cluster", name=~"$node"} selector.
It groups time series found at step 2 by name label value and sums time series in each group with sum() aggregate function. The end result of the sum(...) by (name) is per-name sums.
It finds pairs of time series with identical name label values from step 1 and step 3 and calculates the difference between the first and the second time series in each pair. The on(name) modifier limits the set of labels used for finding time series pairs with matching labels. See more details about this process here.
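If you would rather keep the plain subtraction, an alternative sketch uses ignoring() to drop the labels that differ between the two exporters. The label names cluster and job below are assumptions, so list whatever labels actually differ in your setup; this form also requires at most one Elasticsearch series per name value:
node_memory_MemTotal{name=~"$node"}
- ignoring(cluster, job)
elasticsearch_jvm_memory_max_bytes{cluster="$cluster", name=~"$node"}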

Prometheus query to count unique label values

I want to count number of unique label values. Kind of like
select count (distinct a) from hello_info
For example if my metric 'hello_info' has labels a and b. I want to count number of unique a's. Here the count would be 3 for a = "1", "2", "3".
hello_info{a="1", b="ddd"}
hello_info{a="2", b="eee"}
hello_info{a="1", b="fff"}
hello_info{a="3", b="ggg"}
count(count by (a) (hello_info))
First you want an aggregator with a result per value of a, and then you can count them.
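With the four sample series above, the inner and outer aggregations evaluate like this:
count by (a) (hello_info)          # returns {a="1"} 2, {a="2"} 1, {a="3"} 1
count(count by (a) (hello_info))   # returns 3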
Other example:
If you want to count the number of apps deployed in a Kubernetes cluster based on the different values of a label (e.g. app):
count(count(kube_pod_labels{app=~".*"}) by (app))
The count(count(hello_info) by (a)) is equivalent to the following SQL:
SELECT
time_bucket('5 minutes', timestamp) AS t,
COUNT(DISTINCT a)
FROM hello_info
GROUP BY t
See time_bucket() function description.
For example, at each point it returns the number of distinct values of the label seen over the preceding 5 minutes by default - see the staleness docs for details about the 5-minute interval.
If you need to calculate the number of unique values for a label over a custom interval (for example, over the last day), then the following PromQL query should be used instead:
count(count(last_over_time(hello_info[1d])) by (a))
The custom interval - 1d in the case above - can be changed to an arbitrary value - see these docs for possible values, which can be used there.
This query uses the last_over_time() function for selecting all the time series which were active during the last day. Time series can stop receiving new samples and become inactive at any time; such time series aren't captured by a simple count(...) by (a) after 5 minutes of inactivity. New deployments in Kubernetes and horizontal pod autoscaling are the most frequent sources of large numbers of inactive time series (aka high churn rate).

How to show metrics on a graph if the value is null

In Prometheus there are some custom metrics coming from a DB.
In Grafana I made a Graph dashboard with Prometheus as the data source:
count(custom_metrics_project1<1)
If the condition custom_metrics_project1<1 doesn't match any metrics, Grafana displays "No data points".
How can I change the condition so that 0 is displayed instead?
You can select how to display NULL values in the edit menu for a given graph.
Please follow these steps:
click on your graph
click on edit to bring up the edit menu for the graph
switch to the Display tab
choose one of the options from the Null Value dropdown
The option you want is: null as zero.
Just use count(custom_metrics_project1<1) or on() vector(0). This will substitute all the gaps with zeroes. See docs for or operator and docs for vector() function.
Note that the q or on() vector(0) trick works only when q returns a single time series. If q returns multiple time series, then Prometheus doesn't provide an easy way to fill gaps in every time series with zeroes :(
If you still need to fill gaps in multiple time series, then you can use the default operator in VictoriaMetrics. For example, the following query would fill all the gaps in all the time series returned from q with zeroes:
q default 0
VictoriaMetrics also provides interpolate, keep_last_value and keep_next_value functions, which can be used for filling gaps in more sophisticated ways.
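Applied to the query from this question, a sketch in MetricsQL (assuming VictoriaMetrics is the Grafana data source) that fills empty results with zero:
count(custom_metrics_project1 < 1) default 0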