We have a Grafana dashboard with time on the x-axis and free server memory (in GB) on the y-axis. The dashboard has fields for 100+ servers in a particular data center. The threshold for free memory is 12 GB: a server with less than 12 GB free is critical. We want to create an alert at the data center level. For example, if 75 servers have less than 12 GB of free memory, an alert should trigger. Is it possible to handle this condition with Grafana query functions? The data source for this dashboard is Graphite.
I was able to implement this with the removeAbove and removeBelow functions.
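The filtering idea can be pictured with a minimal Python sketch. This is not the Graphite query itself: it only mimics what a "remove values above the threshold, then count what is left" pipeline does. The server names, the sample readings, and the helper names are all hypothetical.

```python
from typing import Dict, Optional

THRESHOLD_GB = 12.0      # free memory below this is critical (from the question)
ALERT_SERVER_COUNT = 75  # alert when at least this many servers are critical


def remove_above(value: Optional[float], limit: float) -> Optional[float]:
    """Mimic a remove-above filter for one datapoint: values above the limit become None.

    Note: a reading of exactly `limit` is kept, so "below 12 GB" is treated
    as "12 GB or less" in this sketch.
    """
    if value is None or value > limit:
        return None
    return value


def critical_server_count(free_gb: Dict[str, Optional[float]]) -> int:
    """Count servers whose latest free-memory reading survived the filter."""
    return sum(
        1 for v in free_gb.values() if remove_above(v, THRESHOLD_GB) is not None
    )


def should_alert(free_gb: Dict[str, Optional[float]]) -> bool:
    """True when enough servers in the data center are critical."""
    return critical_server_count(free_gb) >= ALERT_SERVER_COUNT


# Hypothetical snapshot: 80 servers with 8 GB free, 30 servers with 40 GB free.
snapshot = {f"server{i:03d}": 8.0 for i in range(80)}
snapshot.update({f"server{i:03d}": 40.0 for i in range(80, 110)})
print(critical_server_count(snapshot), should_alert(snapshot))
```

In Graphite itself, something along the lines of sumSeries(isNonNull(removeAboveValue(<your metric path>, 12))) should produce that count as a single series that an alert can compare against 75; that expression is untested here.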
How can I get the utilization of my rack system? I tried the function PRD.capacity() - PRD.size(), but the value only changes when pallets are stored in my racks. It does not show the percentage of reserved and free capacity, and it does not show the colors either (I don't know how to create different colors for reserved and free racks).
[Images: piechart_simulation, piechart_function]
Welcome to SOF.
How can I get the utilization of my rack system.
Simply use myRackSystem.statsUtilization.xxx where xxx gives you various handles on the statistics collected as below:
For more advanced statistics, you will need to collect those yourself. Lots of example models show you how to use Dataset and Histogram objects to plot stuff in pie charts, check those as well :-)
Suppose a system collects hourly energy consumption (Wh) from clients with power analyzers (i.e. sensors that measure how much energy an electrical appliance consumes).
In detail, each client periodically publishes the energy value consumed by a device in the last hour. On the server side, this data is stored in Graphite (v1.2.0) and visualized in a Grafana (v6.5.2) dashboard.
In the dashboard, I can easily show the hourly consumption of a device as a line/bar graph. However, I need graphs that show total daily and monthly consumption by aggregating the hourly values.
How can I do that using Graphite and/or Grafana without collecting extra metrics? Is it possible at all?
I am monitoring a Windows machine and have installed the WMI exporter on it. I am using Prometheus and Grafana as monitoring tools. Which query should I use to monitor the CPU status of my Windows machine?
This gets you the percentage of CPU use.
100 - (avg by (instance) (irate(wmi_cpu_time_total{mode="idle", instance=~"$server.*"}[1m])) * 100)
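As a rough illustration of the arithmetic behind that query, here is a minimal Python sketch. It assumes the idle counter is in seconds per core, takes the rate from the last two samples of each core (which is roughly what irate does), averages the idle fractions, and subtracts from 100. The sample values and function names are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, idle counter in seconds)


def idle_rate(prev: Sample, curr: Sample) -> float:
    """Per-second increase of the idle counter between two samples."""
    (t0, v0), (t1, v1) = prev, curr
    return (v1 - v0) / (t1 - t0)


def cpu_usage_percent(per_core_samples: List[Tuple[Sample, Sample]]) -> float:
    """Average the idle fraction over all cores, then invert it to get usage."""
    rates = [idle_rate(prev, curr) for prev, curr in per_core_samples]
    avg_idle = sum(rates) / len(rates)
    return 100 - avg_idle * 100


# Hypothetical two-core machine, samples taken 10 seconds apart.
samples = [
    ((0.0, 1000.0), (10.0, 1009.0)),  # core 0: idle for 9 of the 10 seconds
    ((0.0, 2000.0), (10.0, 2007.0)),  # core 1: idle for 7 of the 10 seconds
]
print(round(cpu_usage_percent(samples), 2))
```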
I don't have a WMI exporter running, but according to its documentation something like this should work with a stacked graph:
sum by(mode) (rate(wmi_cpu_time_total[5m]))
You can add labels to the metric to filter by instance / job / whatever and you can tweak the range that you compute the rate over (e.g. 1m for less smoothing; 1h over longer ranges of time; or Grafana's $__interval for dashboard range + screen resolution dependent graphing).
Edit: the query above would give you CPU usage in absolute terms, i.e. if your machine had 4 cores, the stacked graph would add up to (approximately) 4 or 400%. If you want it to instead add up to exactly 100% you should use something like this (not tested):
sum by(mode) (rate(wmi_cpu_time_total[5m]))
/
scalar(sum(rate(wmi_cpu_time_total[5m])))
All it does is divide each per-CPU-mode value by the sum over all modes, so the results will always sum up to 1. All you need to do in Grafana is set the unit of measurement to "percentage (0-1)".
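The normalization step can be sketched in a few lines of Python. The mode names and rate values below are illustrative only, and the function name is hypothetical; the point is just that dividing each value by the total yields shares that add up to 1 regardless of the number of cores.

```python
from typing import Dict


def normalize(rates_by_mode: Dict[str, float]) -> Dict[str, float]:
    """Divide each per-mode rate by the sum of all rates."""
    total = sum(rates_by_mode.values())
    if total == 0:
        # No CPU time recorded at all; avoid dividing by zero.
        return {mode: 0.0 for mode in rates_by_mode}
    return {mode: rate / total for mode, rate in rates_by_mode.items()}


# Hypothetical 4-core machine: the absolute rates add up to about 4.
rates = {"idle": 3.0, "user": 0.6, "privileged": 0.3, "interrupt": 0.1}
shares = normalize(rates)
print(shares)
```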
I am trying to put some metrics up on Grafana from a Postgres DB.
On the Postgres database, I get the following values, and I am not sure what statement is needed to convert them to % (for the CPU and memory usage).
Also, how can the wallclock time value be converted to days, hours, minutes and seconds?
You can use the Axes settings as per the screenshot below. The graph will then effectively show a percentage.
You can set a min and max for memory or any other metric. For example, if you have 4 GB of memory, select Unit: bytes, Min: 0, Max: 4294967296 (4 GB in bytes). If your process uses 345 MB, the graph will then be filled to the corresponding percentage.
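If the conversion has to be done on the values themselves rather than through the axis range, the arithmetic is simple. The Python sketch below assumes memory values in bytes and a wallclock time in whole seconds; neither unit is stated in the question, and the function names are hypothetical.

```python
from typing import Tuple


def as_percent(used: float, maximum: float) -> float:
    """Express `used` as a percentage of `maximum`."""
    return used / maximum * 100


def split_duration(total_seconds: int) -> Tuple[int, int, int, int]:
    """Break a duration in seconds into (days, hours, minutes, seconds)."""
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds


# 345 MB used out of 4 GB, both expressed in bytes.
print(as_percent(345 * 1024**2, 4 * 1024**3))

# 93784 seconds is 1 day, 2 hours, 3 minutes and 4 seconds.
print(split_duration(93784))
```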
I have a % CPU usage graph in Grafana.
The problem is that the source data is collected by collectd as jiffies.
I am using the following formula:
collectd|<ServerName>|cpu-*|cpu-idle|value|nonNegativeDerivative()|asPercent(-6000)|offset(100)
When I increase the time range (to 30 days, for example), Grafana aggregates the data. Because the values are cumulative counters (not percentages or anything else that can simply be averaged), the data in the graph becomes invalid.
Any idea how to create a better formula?
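To make the arithmetic in that formula concrete, here is a minimal Python sketch. It assumes 100 jiffies per second and a 60-second collection interval, which is what the constant 6000 suggests but is not stated in the question. It also shows one plausible reason the result breaks on long ranges: once the spacing between points grows, the counter delta no longer fits the fixed 6000 scale. All function names and sample values are hypothetical.

```python
from typing import List, Optional

JIFFIES_PER_SECOND = 100  # assumption: typical Linux tick rate


def non_negative_derivative(values: List[float]) -> List[Optional[float]]:
    """Delta between consecutive points; negative deltas (counter resets) become None."""
    out: List[Optional[float]] = []
    for prev, curr in zip(values, values[1:]):
        delta = curr - prev
        out.append(delta if delta >= 0 else None)
    return out


def usage_fixed_scale(idle_counter: List[float], assumed_step_s: int = 60) -> List[Optional[float]]:
    """Equivalent of asPercent(-6000) followed by offset(100) for a 60 s step."""
    full_scale = JIFFIES_PER_SECOND * assumed_step_s
    return [
        None if d is None else 100 - 100 * d / full_scale
        for d in non_negative_derivative(idle_counter)
    ]


def usage_step_aware(idle_counter: List[float], step_s: int) -> List[Optional[float]]:
    """Same calculation, but scaled by the actual spacing between points."""
    full_scale = JIFFIES_PER_SECOND * step_s
    return [
        None if d is None else 100 - 100 * d / full_scale
        for d in non_negative_derivative(idle_counter)
    ]


# 80% idle, sampled every 60 s: the fixed scale works.
print(usage_fixed_scale([0, 4800, 9600]))

# Same load, but points are now 300 s apart: the fixed scale goes far off.
print(usage_fixed_scale([0, 24000, 48000]))
print(usage_step_aware([0, 24000, 48000], step_s=300))
```

The step-aware variant corresponds to converting the counter to a per-second rate before turning it into a percentage, so the result does not depend on how far apart the points are.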
Have you looked at the aggregation plugin (read type) to compute averages?
https://collectd.org/wiki/index.php/Plugin:Aggregation/Config
It is very strange that you have to use the nonNegativeDerivative function for a CPU metric. nonNegativeDerivative should only be used for ever-increasing counters, not for a gauge-like metric such as CPU.