Best-fit Prometheus metric data model for Grafana/Sysdig - Kubernetes

I am using Prometheus metrics in the Grafana UI, emitted from a Sysdig dashboard.
I am implementing a state-change metric, i.e. pod states, and my data model is below:
pod_request_state_duration(id,method="create",demoapi,state=creating-running)
I want to use PromQL to find the changing state and display it in the Grafana UI. Please help.

As the question is not exact, I will try to give the best possible solution.
Try using delta:
delta(pod_request_state_duration{method="create", state="creating-running"}[time_duration])
Note that every label matcher needs a quoted value (extra labels such as id can be added the same way), and time_duration must be a concrete range such as 5m.
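For completeness, delta() on a gauge returns how much the value changed over the window. If the goal is instead to count how often the state value changed (an assumption on my part, not something stated above), PromQL's changes() function takes the same selector, e.g. with a 5-minute window:
changes(pod_request_state_duration{method="create", state="creating-running"}[5m])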

Related

Prometheus Config reload Annotation for Grafana

I want to show an Annotation in Grafana when there is a successful Prometheus config reload.
Grafana v6.3.5 & Prometheus v2.12.0
I imported an existing dashboard for internal Prometheus stats and saw that within this dashboard they use the following statement as an Annotation: sum(changes(prometheus_config_last_reload_success_timestamp_seconds[10m]))
Sadly this does not work and I am not sure how to properly use the metric to create Annotations.
How can I use this Metric to make this work?
Since you are using a recent version of Grafana, you don't need this expression any more. There is a feature to display annotations based on a series value.
If you want annotations for the successful configuration reloads, you can simply use the value of the metric prometheus_config_last_reload_success_timestamp_seconds, multiplied by 1000 to get the timestamp in milliseconds (as expected by Grafana). There is a tick box at the bottom of the annotation panel, Series value as timestamp, to activate.
Save your dashboard and that's all.
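In other words, the annotation query is just the metric scaled to milliseconds:
prometheus_config_last_reload_success_timestamp_seconds * 1000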

How to customize label from Netdata in Grafana

I'm using Netdata to monitor my instance and containers.
I use Prometheus to query the data and InfluxDB to store it.
Netdata creates long chart names and I would like to shorten them to make my dashboard clearer.
I want the label to be the actual name of the containers:
nginx
grafana
netdata
...
But what I get is cgroup_<container_name>.<metric_name>
I can see that there has been a pull request about legend formatting open since 2016, but I was wondering if there is another solution.
Maybe directly from Netdata? Or maybe using another tool such as Graphite instead of Prometheus?
I'm not using Prometheus so I can't help you there, but I'm on Graphite. With Graphite you can use the alias() function to display a shorter name.
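For example, a sketch with a made-up Netdata-exported series path (substitute your actual metric):
alias(netdata.cgroup_nginx.cpu.user, "nginx")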

Grafana and Prometheus: add metrics automatically

I'm using Grafana and Prometheus to monitor our server. We have a lot of database procedures like "select_users" or "insert_task". In order to monitor how many pending database procedure calls there are in the server, we add data points for every procedure call in Prometheus dynamically. Now we have data points like "pending_select_users" and "pending_insert_task" in Prometheus.
However, since there are so many database procedures (and the number will increase during development), it's not very practical for us to add metrics in Grafana for each data point manually. Is there a way we can add metrics dynamically in Grafana? Since all the data points have a common name prefix ("pending_"), can we add metrics in Grafana with a wildcard? Or is there a better way to do this?
Since Grafana uses JSON as the underlying dashboard DSL, you could dynamically create dashboards, every time you add a new metric, and import it (via API) into Grafana.
I'd add an automation on top of your Prometheus targets, scrape the metrics, and if new metrics (with the required prefix) are found without a matching dashboard, the automation would create it and import it into Grafana.
Grafana API: http://docs.grafana.org/http_api/ (specifically for Dashboards).
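A rough sketch of the import step (the host, API key, and dashboard.json file are placeholders; the dashboard model itself would come from your automation):
curl -X POST http://your-grafana:3000/api/dashboards/db \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d @dashboard.json
where dashboard.json wraps the generated model as {"dashboard": {...}, "overwrite": true}.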
The solution described by Eitan is definitely feasible. The same goes for using a library like grafonnet to generate dashboards dynamically.
But the simplest approach in my opinion would be to create a variable in Grafana that contains all the label values you are interested in. Something like
label_values(metric_name{label_name=~"prefix.*"}, label_name)
should work for that. And then use the repeating panels / rows feature of Grafana to repeat a set of panels for every value in the variable. Though this could get out of hand if you have dozens / hundreds of distinct values.
https://grafana.com/docs/grafana/latest/variables/repeat-panels-or-rows/
https://grafana.com/blog/2020/06/09/learn-grafana-how-to-automatically-repeat-rows-and-panels-in-dynamic-dashboards/
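In this particular case, since the common prefix is in the metric name rather than in a label, Grafana's Prometheus data source also has a metrics() variable function that returns metric names matching a regex - a sketch using the prefix from the question:
metrics(pending_.*)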
If you want to generate just a single dashboard from your Prometheus metrics sample, you can use this service:
http://eljah.tatar/micrometer2grafana/

How to properly monitor ELB latency on AWS using Grafana?

I am trying to monitor Latency on an Elastic Beanstalk environment using Grafana.
I got some things to work, but some metrics do not provide any information.
I am using "CloudWatch" data source.
There are both ELB and ApplicationELB namespaces.
ApplicationELB does not offer a Latency metric. In fact, every metric I select there results in "no data".
When I configure monitoring on AWS, I get the following graph (screenshot omitted).
I am able to query for Latency on a region using Grafana and I do get some correlation.
As you can see, around 13:50 some requests timed out. But it is also obvious that Grafana is showing additional information from other environments, which I would like to ignore.
My current query (also omitted here) is too broad, I know, but I do not know how to refine it.
I tried using "InstanceName" as dimension, but it is not clear to me which ELB I should look for, and seems to me like ApplicationELB should be what I am looking for, but that one does not offer Latency and does not provide any data either way.
Using AvailabilityZone does not help, and that's the only other option for dimension (other than InstanceName).
I need a way to refine the query so I see the same result in AWS and Grafana.
A clarification about ApplicationELB and ELB would be great also!
Application ELB vs. ELB: they are just different types of load balancers provided by AWS (https://aws.amazon.com/elasticloadbalancing/) - I'm not sure which one is used by Elastic Beanstalk.
You need to add a dimension to filter your metrics. Some metrics may need multiple dimensions for correct filtering. The available dimensions are listed in the docs. For example, LoadBalancerName is a correct dimension for the AWS/ELB namespace: https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-cloudwatch-metrics.html
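A sketch of how that looks in the Grafana CloudWatch query editor (the load balancer name below is a made-up example; Elastic Beanstalk usually generates one starting with awseb-):
Namespace: AWS/ELB
Metric: Latency
Statistic: Average
Dimensions: LoadBalancerName = awseb-e-example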
I recommend using the existing published AWS dashboards (https://github.com/monitoringartist/grafana-aws-cloudwatch-dashboards - I'm the author) and then just customizing them for your needs.

Grafana with OpenTSDB source - how to disable implicit gauge downsample?

I have Grafana with Bosun connected as an OpenTSDB source. The problem is that Grafana interprets the data differently than Bosun. To be precise, when I set the same query in Bosun and in Grafana, the resulting graphs differ. When I turn on gauge downsampling, the graphs are the same. So I guess there is implicit gauging of some sort in Grafana. I would be grateful for a hint on how to disable that gauging.
Bosun graph vs. Grafana graph (screenshots omitted).
The os.net.bytes metric includes metadata to indicate that it is a rate. When you use the default "auto" in Bosun's graph page, it will convert the raw counter data into a rate calculation. Grafana's OpenTSDB data source does not have an auto mode, so things always default to a gauge unless you check the Rate box at the bottom of the metric.
In your example you should just need to check the Rate box to get the graphs to match. You can also use the Counter option and provide a max or reset value if you need to deal with counter overflows.
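In OpenTSDB query terms, checking the Rate box is roughly the difference between these two forms (the aggregator and tag filter here are only illustrative):
sum:os.net.bytes{host=*} - values treated as a gauge
sum:rate:os.net.bytes{host=*} - counter converted to a per-second rate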
You can also use the Bosun data source if you want to use a Bosun query instead of accessing OpenTSDB directly. In this example we combine two queries to generate a Singlestat panel (which displays the last value with a line graph in the background).
The __ny-nexus01/02 part comes from using tsdbrelay to denormalize the metric and address high tag cardinality issues.