Grafana with OpenTSDB source - how to disable implicit gauge downsample? - grafana

I have Grafana with Bosun connected as an OpenTSDB source. The problem is that Grafana interprets the data differently than Bosun does: when I run the same query in Bosun and in Grafana, the resulting graphs differ. When I turn on gauge downsample, the graphs are the same, so I guess there is implicit gauging of some sort in Grafana. I would be grateful for a hint on how to disable that gauging.
Bosun:
Grafana:

The os.net.bytes metric includes metadata to indicate that it is a rate. When you use the default "auto" in Bosun's graph page it will convert the raw counter data into a rate calculation. Grafana's OpenTSDB data source does not have an auto mode, so things always default to a gauge unless you check the Rate box at the bottom of the metric.
In your example you should just need to check the Rate box to get the graphs to match. You can also use the Counter option and provide a max or reset value if you need to deal with counter overflows.
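To make the gauge versus rate difference concrete, here is a minimal Python sketch of turning raw counter samples into a per-second rate, with optional handling for counter wrap-around and suspicious spikes. The function name and the parameters counter_max and reset_value are illustrative only; they are modelled loosely on the Counter max / reset value options and are not taken from the OpenTSDB or Grafana source.

```python
def counter_to_rate(samples, counter_max=None, reset_value=None):
    """Convert (timestamp_seconds, counter_value) samples into per-second rates.

    samples must be sorted by timestamp. counter_max and reset_value are
    hypothetical names used only for this illustration.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        delta = v1 - v0
        if delta < 0 and counter_max is not None:
            # The counter wrapped around: count up to the max, then from zero.
            delta = (counter_max - v0) + v1
        rate = delta / (t1 - t0)
        if reset_value is not None and rate > reset_value:
            # Treat an implausibly large rate as a counter reset and drop it to 0.
            rate = 0.0
        rates.append((t1, rate))
    return rates


# A gauge view plots 100, 200, 50 as they are; a rate view plots the slope.
samples = [(0, 100), (10, 200), (20, 50)]
print(counter_to_rate(samples))                   # negative rate at the wrap
print(counter_to_rate(samples, counter_max=255))  # wrap handled
```

Plotting the raw values (gauge) and plotting the computed rates give visibly different graphs for the same data, which is the mismatch described in the question.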
You can also use the Bosun data source if you want to use a Bosun query instead of accessing OpenTSDB directly. In this example we combine two queries to generate a Singlestat panel (displays last value and a line graph in the background)
The __ny-nexus01/02 part comes from using tsdbrelay to denormalize the metric and address high tag cardinality issues.

Share kusto variables between grafana panels

Well, let's say that I have the query from my previous question: How to do multi graph time series on Grafana with Kusto
Then I'd like to consume the tiemposCicloBruto variable from one panel to another in order to avoid repeating queries.
I saw: https://grafana.com/blog/2020/10/14/learn-grafana-share-query-results-between-panels-to-reduce-load-time/
But there isn't any way to share variables at all...
I also tried it as a dashboard variable, but it doesn't seem to support tabular expressions at all...
You can share only input variables across dashboard panels. Variables work as primitive text substitution in one direction (from dashboard to query), and do not take into account any context in your query language.
The blog post you linked is about sharing the results of a query between different panels. If the exact same result set fits the needs of another panel, you can reuse it "for free", without putting extra load on the database. You don't need to save it into a variable; you select the source panel as a pseudo-datasource and get its result immediately.
You can factor this feature into the design of your panels. Examples could be:
time series plus histogram visualizations of the same data;
time-series chart plus a panel with latest readings (or use other Grafana reduce expressions).
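As a rough illustration of the pseudo-datasource approach, the Python sketch below builds the JSON fragment for a panel that reuses another panel's result. The "-- Dashboard --" datasource name and the panelId target field are written as I recall them from exported dashboard JSON, and may differ between Grafana versions; compare with an export from your own instance. The helper name, panel ids, and titles are hypothetical.

```python
import json


def reuse_panel(panel_id, source_panel_id, title, panel_type):
    """Build a panel definition that reads its data from another panel.

    Assumption: the dashboard pseudo-datasource is named "-- Dashboard --"
    and its target points at the source panel through "panelId".
    """
    return {
        "id": panel_id,
        "type": panel_type,
        "title": title,
        "datasource": "-- Dashboard --",
        "targets": [{"refId": "A", "panelId": source_panel_id}],
    }


# Panel 1 (not shown) runs the real query; panel 2 only re-displays its result.
latest = reuse_panel(panel_id=2, source_panel_id=1,
                     title="Latest reading", panel_type="stat")
print(json.dumps(latest, indent=2))
```

In practice you would normally set this up in the panel editor by picking the dashboard data source and the source panel, rather than editing JSON by hand.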

How to make sense of the micrometer metrics using SpringBoot 2, InfluxDB and Grafana?

I'm trying to configure a SpringBoot application to export metrics to InfluxDB to visualise them using a Grafana dashboard. I'm using this dashboard as an example which uses Prometheus as a backend.
For some metrics I have no problem figuring out how to create graphs for them but for some others I don't know how to create the graphs or even if it's possible at all. So I enumerate the things I'm not really sure about in the following points:
Is there any documentation where a value unit is described? The application I'm using as an example doesn't have any load on it so sometimes I don't know whether the value is a bit, a byte, a second, a millisecond, a count, etc.
Some measurements contain the tag 'metric_type = histogram' with fields 'count', 'sum', 'mean' and 'upper'. Again, here I don't know what the value units are, what upper means or how I'm supposed to plot them. Examples of this are 'http_server_requests' or 'jvm_gc_pause'.
From what I see in the Grafana dashboard example, it seems I should use these measurements of type histogram to create both a graph with counts and graphs with duration. For example I see I should be able to create a graph with the number of requests and another one with their duration. Or for the garbage collector, I should be able to provide a graph for the number of minor and major GCs and another for their duration.
As an example of measures I get inserted into InfluxDB:
time                count exception mean     method metric_type outcome status sum      upper    uri
1625579637946000000 1     None      0.892144 GET    histogram   SUCCESS 200    0.892144 0.892144 /actuator/health
or
time                action          cause              count mean metric_type sum upper
1625581132316000000 end of minor GC Allocation Failure 1     2    histogram   2   2
I agree the documentation for Micrometer is not great. I've had to dig through the code to find answers.
Regarding your questions about jvm_gc_pause, it is a Timer and the implementation is AbstractTimer which is a class that wraps a Histogram among other components. This particular metric is registered by the JvmGcMetrics class. The various measurements that are published to InfluxDB are determined by the InfluxMeterRegistry.writeTimer(Timer timer) method:
sum: timer.totalTime(getBaseTimeUnit()) // The total time of recorded events
count: timer.count() // The number of times stop has been called on the timer
mean: timer.mean(getBaseTimeUnit()) // totalTime()/count()
upper: timer.max(getBaseTimeUnit()) // The max time of a single event
The base time unit is milliseconds.
Similarly, http_server_requests appears to be a Timer as well.
I believe you are correct that the sensible thing is to chart on two separate Grafana panels: one panel for GC pause time in milliseconds using sum (or mean or upper), and one panel for GC events using count.
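To show how the four fields relate to each other, here is a small Python sketch that derives them from a list of recorded durations. It only mirrors the definitions listed above (total time, number of events, total/count, longest single event); it is not Micrometer code, and the function name is made up for the illustration.

```python
def timer_fields(durations_ms):
    """Summarise recorded event durations (in milliseconds) the way the
    fields above are described: sum, count, mean = sum / count, upper = max."""
    count = len(durations_ms)
    total = sum(durations_ms)
    return {
        "count": count,
        "sum": total,
        "mean": total / count if count else 0.0,
        "upper": max(durations_ms, default=0.0),
    }


# A single 2 ms minor GC, as in the jvm_gc_pause row from the question.
print(timer_fields([2.0]))
# Three pauses in one reporting interval.
print(timer_fields([2.0, 4.0, 9.0]))
```

With one event in the interval, sum, mean and upper are all equal, which is why the example rows in the question show the same number in all three columns.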

How do we change the "precision:ms" setting in the Grafana Query Inspector?

I have an InfluxDB database with only 11 data points in it. These data are not displaying correctly (or at least not as I would expect) in Grafana when the time between them is shorter than 1 ms.
If I insert data points 1 ms apart, then everything works as expected and I see all 11 points at the correct times, as shown below:
However, if I delete these points and upload new ones but this time one point per 100 μs, then although the data displays correctly in InfluxDB, in Grafana I see only two points in my graph:
It seems like the data is being rounded/binned to the nearest millisecond, and that this is related to the “precision=ms” setting in the query here:
but I cannot find any way to change this setting. What is the correct way to fix this?
You can't configure Grafana to use a different time precision for InfluxDB. It is hardcoded in the source code: https://github.com/grafana/grafana/blob/36fd746c5df1438f27aa33fc74b24be77debc7ff/public/app/plugins/datasource/influxdb/datasource.ts#L364 (It may need to be changed in multiple places in the source, not only in this one.)
So the correct way to fix it is to code it, which is of course not in the scope of this question.
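The effect described in the question can be reproduced with a few lines of Python. This is only a sketch of what millisecond precision does to the timestamps, assuming they are truncated to whole milliseconds and that the first point falls exactly on a millisecond boundary; the starting timestamp is arbitrary.

```python
def to_millis(ts_ns):
    """Truncate a nanosecond timestamp to whole milliseconds."""
    return ts_ns // 1_000_000


def distinct_ms(start_ns, step_ns, n_points):
    """Return the distinct millisecond timestamps left after truncation."""
    points = [start_ns + i * step_ns for i in range(n_points)]
    return sorted({to_millis(t) for t in points})


start = 1_625_579_637_946_000_000  # arbitrary start, exactly on a ms boundary

# 11 points spaced 1 ms apart survive as 11 distinct timestamps.
print(len(distinct_ms(start, 1_000_000, 11)))
# 11 points spaced 100 µs apart collapse onto just two millisecond values.
print(len(distinct_ms(start, 100_000, 11)))
```

The first ten 100 µs points share one millisecond and the eleventh lands on the next, which matches seeing only two points in the graph.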

Query Graphite Metrics for specific data points

I want to query my graphite server to retrieve certain metrics.
I am able to query all data points within a given time period, but my requirement is to query the data points for a specific time of day on previous days.
How can I do this?
The Graphite Render API supports a number of arguments that make a query more specific. In particular, the from / until arguments will be useful to you; you can read about them here: https://graphite.readthedocs.io/en/latest/render_api.html#from-until
Edit: I should add that if you're using Grafana to visualise your data, you can click and drag on the graph to select a specific time range, or use the time picker in the top-right corner to choose Custom and set your range there.
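As a sketch of how from / until could be used for "the same time window on each of the previous days", the Python below builds one render URL per day using the absolute HH:MM_YYYYMMDD time format described in the linked documentation. The host name, metric name and helper function are hypothetical; only the URLs are built, no request is sent.

```python
from datetime import date, datetime, time, timedelta
from urllib.parse import urlencode

GRAPHITE_URL = "http://graphite.example.com"  # hypothetical server


def render_urls(target, start, end, days, today):
    """Build one /render URL per previous day, covering start..end on that day."""
    urls = []
    for offset in range(1, days + 1):
        day = today - timedelta(days=offset)
        params = {
            "target": target,
            # Absolute times in Graphite's HH:MM_YYYYMMDD format.
            "from": datetime.combine(day, start).strftime("%H:%M_%Y%m%d"),
            "until": datetime.combine(day, end).strftime("%H:%M_%Y%m%d"),
            "format": "json",
        }
        urls.append(f"{GRAPHITE_URL}/render?{urlencode(params)}")
    return urls


# 09:00 to 10:00 on each of the two days before 6 July 2021.
for url in render_urls("servers.web01.cpu", time(9, 0), time(10, 0),
                       days=2, today=date(2021, 7, 6)):
    print(url)
```

Each URL can then be fetched with any HTTP client, and the per-day results compared or merged on the client side.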

how to add a custom value in grafana legend?

I have a graph that displays an Elasticsearch index count, see below.
I want to add a value diff = max - min to the legend. How can I implement this?
I'm pretty sure you can't, easily. You can hack your way around it by adding yet another query to your graph, something like
max_over_time(my_metric[[[__range_s]]s]) - min_over_time(my_metric[[[__range_s]]s])
Grafana will replace the [[__range_s]] bit with the length of the time range of the current dashboard, e.g. 3600 for the default 1h, so the query actually sent to Prometheus will be
max_over_time(my_metric[3600s]) - min_over_time(my_metric[3600s])
Meaning Prometheus will compute the difference between the max and min separately from Grafana (which does it on top of the samples returned by Prometheus). (It will also compute this difference for the whole time range, not just the most recent sample, which is what you're interested in.) Then you can tweak the display of said time series in Grafana (e.g. by setting line=0, fill=0) so it will not show up on the graph itself, only in the legend. But the legend will then display the current value of the difference, as well as its min, max, avg, which will be quite the crappy UX.
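The substitution step is plain text replacement, which a short Python sketch can mimic. This is only an illustration of the behaviour described above, not Grafana's actual templating code, and the function name is made up.

```python
from datetime import datetime


def interpolate_range(query, time_from, time_to):
    """Replace [[__range_s]] with the dashboard time range length in seconds."""
    range_s = int((time_to - time_from).total_seconds())
    return query.replace("[[__range_s]]", str(range_s))


query = ("max_over_time(my_metric[[[__range_s]]s]) - "
         "min_over_time(my_metric[[[__range_s]]s])")

# A one-hour dashboard range.
expanded = interpolate_range(query,
                             datetime(2021, 7, 6, 9, 0),
                             datetime(2021, 7, 6, 10, 0))
print(expanded)
```

Note that the outer square brackets belong to the PromQL range selector and the inner double brackets to the variable, which is why the template has three opening brackets in a row.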
Edit: Or you can add said query to a separate panel (e.g. a table panel), to the right of your graph. That may let you better control the UX, although it still won't be part of the actual legend.
Edit 2: One final thing you could try, that would give you exactly what you want, is to tweak Grafana's graph panel to add a "range" value next to "min", "max" and the bunch. The source code is here, I'm pretty sure it's mostly a copy-pasta job. You likely wouldn't even have to rebuild all of Grafana, you could just package the modified panel as "Tweaked Graph Panel" plugin and drop it into your Grafana deployment's plugins folder. Then, in your dashboard, instead of using "Graph Panel", use "Tweaked Graph Panel".