Count values by groups - aggregate

I currently send events to graphite like
server.my_value with a count that varies between 1 and 1000
I'd like to plot a graph that will show me how many of these my_value events I received, grouped in buckets of 100 (so 1, 4, 7 and 89 would all be counted toward the 1-100 group, and so on)
so to illustrate, let's say I sent these values within a specific time window:
1
3
45
13
299
455
74
924
the groups will be
1-100: 5
100-200: 0
200-300: 1
300-400: 0
400-500: 1
500-600: 0
600-700: 0
700-800: 0
800-900: 0
900-1000: 1
so I would have 10 lines, each representing a group.
This could be shown as a histogram, but that won't show me the changes over time.
Is it possible to aggregate values by ranges?

If you avoid using downsampled data (by querying only small time intervals and not crossing the aggregation boundary defined in your Graphite storage schema), you can just use the new histogram feature in the Graph panel in Grafana, or the new Heatmap panel, to get a histogram over time. For example, if you have the storage schema 15s:7d,1m:21d,15m:5y, stay within the 7-day boundary. Here is an example from the Grafana demo site.
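For reference, a retention schema like that is defined in Graphite's storage-schemas.conf along these lines (the section name and pattern below are only illustrative; match the pattern to your own metric path):
[server_my_value]
pattern = ^server\.my_value$
retentions = 15s:7d,1m:21d,15m:5y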
I don't think there is a good way to aggregate values by ranges in a Graphite query if you want to aggregate for larger time intervals. However, it seems to be possible to do it by aggregating the data in statsd:
http://dieter.plaetinck.be/post/histogram-statsd-graphing-over-time-with-graphite/
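The rough idea from that post, assuming you send my_value to statsd as a timer, is to declare histogram bins in statsd's configuration so that each flush emits a count per bucket. A minimal sketch with bin boundaries mirroring the 100-wide groups from the question (exact syntax can vary between statsd versions):
histogram: [
  { metric: 'my_value', bins: [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000] }
]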

Related

Want to SUM all values for a specific date within column NOT sum all values in that column

I want to create a graph which shows the total capacity for each week relative to remaining availability across a series of specific dates. Right now when I attempt this in Power BI it calculates this correctly for one of the values (remaining availability) but generates a value much higher than expected by manual calculation for the total capacity, instead showing the total for the entire column rather than for each specific date.
Why is Power BI doing this and how can I solve it?
So far, I have tried generating the graph like this:
(https://i.stack.imgur.com/GV3vk.png)
and as you can see the capacity values are incredibly high; they should be 25 days.
The total availability values are correct (ranging from 0 to 5.5 days).
When I create matrices to see the sum breakdown, the values are correct, but when the two are combined in the graph one of them changes to the total for the whole column.
If anyone could help me with this issue that would be great! Thanks!

Tableau calculated measure using previous months

I have the following data table:
Month Accounts Sales
Jan-19 50 5000
Feb-19 60 6000
Mar-19 70 7000
Apr-19 80 8000
May-19 90 9000
I am trying to create a new measure which will return the sum of 3 months of sales / Sum of 1st month Accounts.
For e.g.
for Mar-19 the value should be (Jan+Feb+Mar'19 Sales)/Jan-19 Accounts i.e. (18000/50)
for Apr-19 the value should be (Feb+Mar+Apr'19 Sales)/Feb-19 Accounts i.e. (21000/60)
for May-19 the value should be (Mar+Apr+May'19 Sales)/Mar-19 Accounts i.e. (25000/70)
.......
and so on...
I was wondering whether DATEDIFF or some table calculation could be used to achieve the above?
Best Regards
Table calculations are well suited for this. They take a little time to understand but are very useful. Start with the online help. Make sure you understand partitioning and addressing.
The functions that will be useful are WINDOW_SUM() and LOOKUP().
An example calculation could be
WINDOW_SUM(SUM([Sales]), -2, 0) / LOOKUP(SUM([Accounts]), -2)
The -2 and 0 are offsets from the current position.
Note, for table calcs, the formula is only part of the definition of the calculation. You need to edit the table calc to set the partitioning and addressing (aka compute using) to tell Tableau how to arrange the data before evaluating the table calc.
Tableau will take a guess for partitioning based on how your viz is arranged, and the guess is often right, but it is usually best to specify the specific dimensions for partitioning. See the help pages on table calcs.
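As a quick sanity check against the sample data above (with the table calc computed along Month): for Mar-19, WINDOW_SUM(SUM([Sales]), -2, 0) = 5000 + 6000 + 7000 = 18000 and LOOKUP(SUM([Accounts]), -2) returns the Jan-19 value of 50, giving 18000 / 50 = 360, which matches the expected result.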

How to do a distinct count of a metric using graphite datasource in grafana?

I have a metric that shows the state of a server. The values are integers; if the value is 0 (zero) the server is stable, otherwise it is unstable. The graph we have is at a minute level. So, I want to show an aggregated value that tells me how many hours the server was unstable in the selected time range.
Let's say I select "Last 7 days" as the time duration; we should get X hours of instability for the server.
And one more thing: I have a line graph (time series graph) that shows the state of the server, but when I select "Last 24 hours or 48 hours" I get the graph at a minute level, and when I increase the duration to a quarter I get a point every 5 minutes or so. I understand it's aggregating the values, but does anybody know how Grafana is doing the aggregation?
I have tried the scaleToSeconds and consolidateBy functions and many more to first get the count of non-zero minutes, but with no success.
Any help would be greatly appreciated.
Thanks in advance.
There are a few different ways to tackle this; there are 2 places where aggregation happens in this situation:
When you query for a time range longer than your raw retention interval and whisper returns aggregated data. The aggregation method used here is defined in your carbon aggregation configuration.
When Grafana sends a query to Graphite it passes maxDataPoints=<width of graph in pixels>, and Graphite will perform aggregation to return at most that many points (because you don't have enough pixels to render more points than that). The method used for this consolidation is controlled by the consolidateBy function.
It is possible for both of these to be used in the same query. For example, if you have a panel that queries 3 days worth of data and you store 2 days at 1-minute and 7 days at 5-minute intervals in whisper, then you'd have 72 * 60 / 5 = 864 points from the 5-minute archive in whisper, but if your graph is only 500px wide then at runtime that would be consolidated down to 10-minute intervals and return 432 points.
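To make that concrete, the request Grafana ends up sending to the Graphite render API looks roughly like this (hostname and target are placeholders):
http://graphite.example.com/render?target=consolidateBy(server.state,'sum')&from=-3d&maxDataPoints=500&format=json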
So, if you want to always have access to the count then you can change your carbon configuration to use sum aggregation for those series (and remove the existing whisper files so new ones are created with the new aggregation config), and pass consolidateBy('sum') in your queries, and you'll always get the sum back for each interval.
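A minimal sketch of that carbon change, in storage-aggregation.conf (the section name and pattern are placeholders; match the pattern to your metric path, and remember that existing whisper files keep their old settings):
[server_state_sum]
pattern = ^server\.state$
xFilesFactor = 0
aggregationMethod = sum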
That said, you can also address this at query time by multiplying the average back out to get a total (assuming that your whisper aggregation config is using average). The simplest way to do that will be to summarize the data with average into buckets that match the longest aggregation interval you'll be querying, then scale those values by that interval to calculate the total number of minutes. Finally, you'll want to use consolidateBy('sum') so that any runtime consolidation will work properly.
consolidateBy(scale(summarize(my.series, '10min', 'avg'), 60), 'sum')
With all of that said, you may want to consider reporting uptime in terms of percentages rather than raw minutes, in which case you can use the raw averages directly.
You say that when the value is zero (0) the server is healthy; what other values are reported while the server is unhealthy/unstable? If you're only reporting zero (healthy) or one (unhealthy), for example, then you could use the sumSeries function to get a count across multiple servers.
Some more information is needed here about the types of values the server is reporting in order to give you a better answer.
Grafana does aggregate - or consolidate - data typically by using the average aggregation function. You can override this using the 'sum' aggregation in the consolidateBy function.
To get a running calculation over time, you would most likely have to use the summarize function (also with the sum aggregation) and define the time period, e.g. 1 hour, 1 day, 1 week, and so on. You could take this a step further by combining this with a time template variable so that as the period grows/shrinks, the summarize period will increase/decrease accordingly.
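For example, if the metric really is just 0 (stable) or 1 (unstable) reported once per minute, something along these lines would give unstable hours per day (the metric name is a placeholder, and 0.0167 is 1/60 to convert the per-day sum of minutes into hours):
scale(summarize(server.state, '1d', 'sum'), 0.0167)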

qlik sense capability api 10000 limit

We've reached the limit for hypercubes and need to extract more than 10000 data points using the Capability API (I use "data points" for lack of a better word for the individual cells the API sends; 10000 is the max you get when you multiply the width and height of your initial fetch). Has anyone been able to get the next page for hypercubes? Note that our requirement is for mashups, not extensions.
We did a workaround, but it required us to break up our dataset and it takes a little longer.
It makes you think: since Qlik is a data analytics tool, there should be a way to get all of your data. In an era where we process millions if not billions of records, 10000 data points (not even records) is minuscule.
I should also mention that the app we are using this for is for stock analysis; users want to see trends and need information on individual points as tooltips. With the number of dimensions and measures we pass (a total of 7 times the number of stocks, about 20, = 140), we are constrained to only about 70 days (10000/140).
we are using qliksenseserver 11.24.4
Qlik Sense November 2017 Patch 2
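For reference, this is roughly what that cell budget looks like in a mashup's createCube call; the 140 x 71 page size just restates the numbers from the question, and whether the returned model then lets you page past the first 10000 cells with the engine's getHyperCubeData method is an assumption to verify against your Qlik version:
app.createCube({
  qDimensions: [ /* 7 dimensions/measures per stock x ~20 stocks = 140 columns, per the question */ ],
  qMeasures: [ /* ... */ ],
  qInitialDataFetch: [{ qTop: 0, qLeft: 0, qWidth: 140, qHeight: 71 }]  // 140 * 71 = 9940 <= 10000 cells
}, function (reply) {
  // reply.qHyperCube.qDataPages[0] holds at most this single 10000-cell page
});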

Prometheus histograms and averaging sets with NaN values included

In my app I have histograms set up for websocket ping times to every country, one histogram per country. In Grafana I have a graph of the average ping time for several countries I'm most interested in, via the following query:
rate(country_ping_sum{country=~"AU|NZ|CA|GB|US",instance="$instance"}[15m]) / rate(country_ping_count{country=~"AU|NZ|CA|GB|US",instance="$instance"}[15m])
This works perfectly well. I get a graph for each country. Now I want to add to the same graph an average of all the other countries combined into one.
avg(rate(country_ping_sum{country!~"AU|NZ|CA|GB|US",instance="$instance"}[15m]) / rate(country_ping_count{country!~"AU|NZ|CA|GB|US",instance="$instance"}[15m]))
This fails. When I try the query in the Prometheus console I get a value of NaN. If I take the same query and remove the avg() function, I get a list of every matching country; some have values and some are NaN. Many of the countries have a rate of 0 for both the sum and the count. Clearly those divisions by 0 are producing NaN for those particular countries.
So my question, how can I filter out NaN values before passing to avg()?
You're effectively taking an average of an average, which is generally not correct.
Instead do a sum of each rate, and then divide to get the overall average.
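A sketch of what that looks like with the metric names and label matchers from the question:
sum(rate(country_ping_sum{country!~"AU|NZ|CA|GB|US",instance="$instance"}[15m])) / sum(rate(country_ping_count{country!~"AU|NZ|CA|GB|US",instance="$instance"}[15m]))
Countries with no traffic contribute 0 to both sums, so the result only becomes NaN if every country in the group is idle.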