I can't find a CloudWatch metric in the Grafana UI query editor/builder

I'm trying to create a Grafana dashboard that will reflect my AWS RDS cluster metrics.
For simplicity I chose CloudWatch as the data source. It works well for showing the 'direct' metrics from the RDS cluster.
The problem is that we've switched to RDS Proxy because of the high number of connections we are required to support.
Now I'm adjusting my dashboard to show a few metrics that are missing. The most important is the number of actual connections, which the AWS CloudWatch console presents with this query:
SELECT AVG(DatabaseConnections)
FROM SCHEMA("AWS/RDS", ProxyName,Target,TargetGroup)
WHERE Target = 'db:my-db-1'
AND ProxyName = 'my-db-rds-proxy'
AND TargetGroup = 'default'
The problem is that I can't find it anywhere in the Grafana CloudWatch query editor.
The only metric with "connections" is the standard DatabaseConnections which represents the 'direct' connections to the RDS cluster and not the connections to the RDS Proxy.
Any ideas?

That UI editor is generated from a hardcoded list of metrics, which may not contain all metrics and dimensions (especially ones that were added recently), so in that case the UI doesn't offer them in the select box.
But that is not a problem, because that select box is not a standard select box. It is an input where you can type your own metric and dimension names. Just click there, type what you need, and hit Enter to add it (the same is applicable for:
Pro tip: don't use the UI query builder (that's for beginners); switch to Code and write your queries directly (the UI builder builds that query under the hood anyway).
It would be nice if you created a Grafana PR that adds the metrics and dimensions missing from the UI builder to metrics.go.

So, for whoever ends up here: you should use ClientConnections with ProxyName as the dimension (which I didn't set initially).
I was using an old Grafana version (7.3.5) which didn't have it built in.
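For anyone who prefers writing the query by hand, here is a minimal sketch of how the proxy query could be assembled as a Metrics Insights string. It assumes your Grafana version's CloudWatch editor accepts Metrics Insights queries in Code mode; the helper name proxy_connections_query is made up for illustration, and the proxy name is the one from the question.

```python
def proxy_connections_query(proxy_name: str, stat: str = "AVG") -> str:
    """Build a Metrics Insights query for RDS Proxy client connections.

    Hypothetical helper: only the metric name (ClientConnections) and the
    dimension (ProxyName) come from the thread above.
    """
    # Refuse names that would break out of the quoted string literal.
    if "'" in proxy_name:
        raise ValueError("proxy name must not contain a single quote")
    return (
        f"SELECT {stat}(ClientConnections) "
        'FROM SCHEMA("AWS/RDS", ProxyName) '
        f"WHERE ProxyName = '{proxy_name}'"
    )


query = proxy_connections_query("my-db-rds-proxy")
print(query)
```

The resulting string can be pasted into the Code editor or the CloudWatch console, adjusting the statistic as needed.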

Related

How the dead nodes are handled in AWS OpenSearch?

I'm trying to understand the right approach for connecting to AWS OpenSearch (single cluster, multiple data nodes).
To my understanding, as long as the data nodes are behind the load balancer (according to this and other AWS docs: https://aws.amazon.com/blogs/database/set-access-control-for-amazon-elasticsearch-service/), we cannot use:
var pool = new StaticConnectionPool(nodes);
and we probably should not use CloudConnectionPool, as it was originally dedicated to Elasticsearch cloud and was left in the OpenSearch client by mistake?
Hence we use SingleNodeConnectionPool and it works, but I've noticed several exceptions indicating that the node had DeadUntil set to a date one hour in the future. Is that expected behavior, given that from the client's perspective it is the only node it knows about?
What is the correct way to connect to AWS OpenSearch that has multiple nodes, and should I be concerned about the DeadUntil property?

Show metrics in Grafana from the Kubernetes Pod that was scraped last by Prometheus

Context
We have a Spring Boot application deployed into a K8s cluster (with 2 instances), configured with the Micrometer exporter for Prometheus and visualized in Grafana.
My custom metrics
I've implemented a couple of additional Micrometer metrics that report some information about business data in the database (PostgreSQL). I can see those metrics in Grafana, but separately for each pod.
Problem:
For our 2 pods, Grafana shows a separate set of the same metrics, and the most recent value can be found by choosing one of the pods (by label).
However there is no way to tell which pod reported the most recent values.
Is there a way to always show the metric values from the pod that was scraped last (i.e. the one with the freshest metric data)?
Right now, to see the freshest metric data, I have to switch pods and guess which one has the latest values.
(The metrics in question relate to the database, so they yield the same values no matter which pod they are requested from.)
In Prometheus, you can obtain the labels of the latest scrape using the topk() and timestamp() functions:
topk(1,timestamp(up{job="micrometer"}))
This can then be used in Grafana to populate a (hidden) variable containing the instance name:
Name: instance
Type: Query
Query: topk(1,timestamp(up{job="micrometer"}))
Regex: /.*instance="([^"]*)".*/
I advise activating the refresh on time range change so that you get the last scrape in your time range.
Then you can use the variable in all your dashboard's queries:
micrometer_metric{instance="${instance}"}
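To check what the Regex field of the variable extracts, the same pattern can be tried outside Grafana. This is only a sketch: the sample series string is an assumed example of what the query result might look like, and the surrounding slashes of the Grafana regex are dropped because Python does not use them.

```python
import re

# Pattern from the variable definition, without Grafana's /.../ delimiters.
INSTANCE_RE = re.compile(r'.*instance="([^"]*)".*')


def extract_instance(series: str):
    """Return the value of the instance label, or None if it is absent."""
    match = INSTANCE_RE.match(series)
    return match.group(1) if match else None


# Assumed shape of a series returned by the topk(...) query.
sample = 'up{instance="10.0.0.12:8080", job="micrometer"}'
print(extract_instance(sample))
```

The capture group is what ends up as the variable's value, so a series without an instance label yields nothing.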
EDIT: the requester wants to update it on each data refresh.
If you want to update it on each data refresh, it needs to be used in every query of your dashboard with the AND logical operator:
micrometer_other_metric AND ON(instance) topk(1,timestamp(up{job="micrometer"}))
vector1 AND vector2 results in a vector consisting of the elements of vector1 for which there are elements in vector2 with exactly matching label sets. Other elements are dropped.
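To make the matching concrete, here is a rough simulation of AND ON(instance) on plain Python data. It is a simplification, not how Prometheus implements it: vectors are modelled as lists of (labels, value) pairs, series lacking the label are simply dropped, and the pod names and values are invented.

```python
def and_on(vector1, vector2, label):
    """Keep samples of vector1 whose `label` value also occurs in vector2."""
    # Collect the label values present on the right-hand side.
    keys = {labels[label] for labels, _ in vector2 if label in labels}
    # Left-hand samples survive only if their label value is in that set;
    # the values of vector2 are never used.
    return [(labels, value) for labels, value in vector1
            if labels.get(label) in keys]


# Invented samples: one business metric per pod.
metric = [
    ({"instance": "pod-a:8080"}, 41.0),
    ({"instance": "pod-b:8080"}, 42.0),
]
# What topk(1, timestamp(up{...})) might return: the last scraped pod.
latest = [({"instance": "pod-b:8080", "job": "micrometer"}, 1700000000.0)]

result = and_on(metric, latest, "instance")
print(result)
```

Only the series from the pod returned by the topk(...) part survives, which is why the expression has to be appended to every query.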

Grafana dashboard for Prometheus not working

I am a newbie to Grafana and Prometheus. I set up Prometheus, Grafana, Alertmanager, node exporter and cAdvisor using the docker-compose.yml from this post: https://github.com/vegasbrianc/prometheus
I then imported Grafana dashboard #893 from https://grafana.com/dashboards/893
But the dashboard is not working: I see N/A in some panels. For example, below are the queries used by two of the panels; I couldn't figure out how to get values for the template variable in the query. I looked at http://node-exporter:9100/metrics and do not see a value for the variable '$server'.
Query1: time() - node_boot_time{instance=~"$server:.*"}
Query2:min((node_filesystem_size_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"} - node_filesystem_free_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"} )/ node_filesystem_size_bytes{fstype=~"xfs|ext4",instance=~"$server:.*"})
What should I configure in node-exporter and Prometheus so that the template variable $server is evaluated in the queries?
$server is a Grafana template variable. These usually show up as dropdowns at the top of the Grafana dashboard.
label_values is a Prometheus-specific Grafana function that is applied to a Prometheus query. Your particular example, label_values(node_boot_time, instance) will return all values of the instance label for all node_boot_time metrics collected by Prometheus (i.e. all node exporter targets monitored by Prometheus).
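As a rough mental model (not Grafana's actual implementation), label_values(metric, label) behaves like collecting the distinct values of one label across all series of one metric. The series list below is invented for illustration.

```python
def label_values(series, metric, label):
    """Distinct values of `label` over all series named `metric`, sorted."""
    return sorted({labels[label] for name, labels in series
                   if name == metric and label in labels})


# Invented series as (metric name, labels) pairs.
series = [
    ("node_boot_time", {"instance": "node-exporter:9100", "job": "node"}),
    ("node_boot_time", {"instance": "host-b:9100", "job": "node"}),
    ("node_load1", {"instance": "host-c:9100", "job": "node"}),
]

servers = label_values(series, "node_boot_time", "instance")
print(servers)
```

If the metric name no longer exists in your exporter, the set is empty and the dropdown has nothing to offer.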
I have no experience with the particular dashboard you are using (or node exporter, for that matter), but usually the cause for some panels displaying "N/A" or no values while other panels work just fine is that the underlying metric names might have changed. You can click on the header of the problematic panel in Grafana, select Edit, then click on the Metrics tab to try different metric names. For "inspiration", check the /metrics endpoint of your node exporter. If you don't know how to get to it, on the Prometheus web interface navigate to Status > Targets and click on the URL of your node exporter.
An old question, but it still didn't work for me.
The reason is not label_values(...), which works fine at obtaining all the instance names that have a node_boot_time metric.
The problem is in the regex that follows the expression (next line). In my case it was something tricky resembling "/([^:].*):/". My instance names start with "i-" and contain no colon, so nothing was being selected. I just used a ProductCode to figure out the right instances instead.
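The effect is easy to reproduce outside Grafana. This sketch applies the pattern quoted above (which the poster says only resembles the real one) to two made-up instance names, one with a port and one without a colon.

```python
import re

# Pattern quoted in the answer, without Grafana's /.../ delimiters.
SERVER_RE = re.compile(r"([^:].*):")


def apply_variable_regex(value: str):
    """Return the captured group, or None when the regex does not match."""
    match = SERVER_RE.search(value)
    return match.group(1) if match else None


# Made-up instance names.
with_port = apply_variable_regex("node-exporter:9100")
without_colon = apply_variable_regex("i-0abc123")
print(with_port, without_colon)
```

Because the pattern requires a trailing colon, an instance name without one never matches, so the variable ends up with no values.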

Tracking events with prometheus and grafana

There's an article "Tracking Every Release" which tells about displaying a vertical line on graphs for every code deployment. They are using Graphite. I would like to do something similar with Prometheus 2.2 and Grafana 5.1. More specifically I want to get an "application start" event displayed on a graph.
Grafana annotations seem to be the appropriate mechanism for this, but I can't figure out what type of Prometheus metric to use and how to query it.
The simplest way to do this is via the same basic approach as in the article, by having your deployment tool tell Grafana when it performs a deployment.
Grafana has a built-in system for storing annotations, which are displayed on graphs as vertical lines and can have text associated with them. It would be as simple as creating an API key in your Grafana instance and adding a curl call to your deploy script:
curl -H "Authorization: Bearer <apikey>" http://grafana:3000/api/annotations -H "Content-Type: application/json" -d '{"text":"version 1.2.3 deployed","tags":["deploy","production"]}'
For more info on the available options check the documentation:
http://docs.grafana.org/http_api/annotations/
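If your deploy tooling is in Python rather than shell, the same call could look roughly like the sketch below. It builds the request with the standard library only; the helper name build_annotation_request is made up, the URL and payload mirror the curl example, and the actual send is commented out so the snippet runs without a Grafana server.

```python
import json
import urllib.request


def build_annotation_request(base_url, api_key, text, tags):
    """Prepare a POST to Grafana's annotations endpoint (hypothetical helper)."""
    payload = json.dumps({"text": text, "tags": tags}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/annotations",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_annotation_request(
    "http://grafana:3000",          # same host as in the curl example
    "<apikey>",                     # placeholder, replace with a real key
    "version 1.2.3 deployed",
    ["deploy", "production"],
)

# Uncomment to actually send it from the deploy script:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```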
Once you have your deployments being added as annotations, you can display those on your dashboard by going to the annotations tab in the dashboard settings and adding a new annotation source.
The annotations will then be shown on the panels in your dashboard.
You can get the same result purely from Prometheus metrics, with no need to push anything into Grafana.
If you wanted to track all restarts, your search expression could be something like this (changes() needs a range vector, so the [5m] window here is an illustrative choice):
changes(start_time_seconds{job="foo",env="prod"}[5m]) > 0
Or something like this if you only wanted to track version changes (and you had some sort of info metric that provided the version):
alertmanager_build_info unless max_over_time(alertmanager_build_info[1d] offset 5m)
The latter expression should only produce an output for 5 minutes whenever a new alertmanager_build_info metric appears (i.e. one with different labels such as version). You can further tweak it to only produce an output when version changes, e.g. by aggregating away all other labels.
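To see why a changes(...) > 0 expression flags restarts, here is a toy version of the counting it relies on. This only models the idea (count how often consecutive samples in a window differ); the sample values are invented start timestamps.

```python
def changes(samples):
    """Count how many times the value differs from the previous sample."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


# Invented samples of a start-time metric within one window:
# the process restarted once, so the start time jumps once.
window_with_restart = [1000.0, 1000.0, 1000.0, 2000.0, 2000.0]
window_without_restart = [1000.0, 1000.0, 1000.0]

restarted = changes(window_with_restart) > 0
quiet = changes(window_without_restart) > 0
print(restarted, quiet)
```

A start-time metric stays constant while the process runs, so any change inside the window means the process was restarted.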
A note here as technology has evolved. We get deployment job state information in Prometheus metrics format scraped directly from the community edition of Hashicorp's Nomad and we view this information in Grafana.
In your case, you would just add an additional query to an existing panel to overlay job start events, which is equivalent to a new deployment for us. There are a lot of related metrics "out of the box," such as for a change in job version that can be considered as well. The main point is no additional work is required besides adding a query in Grafana.

Grafana - Graph with metrics on demand

I am using Grafana for my application, where metrics are exposed by my data source on demand, and I want to monitor such on-demand metrics in Grafana in a user-friendly graph. For example, until my application hits an exception, the data source does NOT expose the metric named 'Exception'. However, I want to create a graph beforehand in which I can specify the metric 'Exception', so that it is plotted whenever my data source starts exposing it.
When I try to create a graph on Grafana using the web GUI, I'm unable to see these 'on-demand metrics' since they've not yet been exposed by my data source. However, I should be able to configure the graph such that in case these metrics are exposed then show them. If I go ahead and type out the non-exposed metric name in the metrics field, I get an error "Timeseries data request error".
Does Grafana provide a method to do this? If so, what am I missing?
It depends on what data source you are using (Graphite, InfluxDB, OpenTSDB?).
For Graphite you can enter raw query mode (pen button) to specify whatever query you want; the metric does not need to exist. The same is true for InfluxDB: you will find the raw query mode in the hamburger menu dropdown to the right of each query.
You can also use wildcards in a graphite query (or regex in InfluxDB) to create generic graphs that will add series to the graph as they come in.
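As an illustration of how a wildcard picks up series that appear later, here is a simplified matcher. It assumes dot-separated metric paths where * stays within one path segment, and it ignores other Graphite syntax such as {a,b}; the metric names are invented.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, metric: str) -> bool:
    """Match a dotted metric path against a pattern, segment by segment."""
    p_parts = pattern.split(".")
    m_parts = metric.split(".")
    # Same depth required, and each segment must match its pattern segment.
    return len(p_parts) == len(m_parts) and all(
        fnmatchcase(m, p) for p, m in zip(p_parts, m_parts)
    )


# Invented metric names; 'Exception' only shows up after the first error.
known_metrics = ["app.requests.count", "app.errors.Exception", "app.errors.Timeout"]

selected = [m for m in known_metrics if matches("app.errors.*", m)]
print(selected)
```

A graph defined with such a pattern starts out empty and gains a series as soon as the data source exposes a matching metric.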