Power BI: Finding average of averages and STDEV.P of averages - average

All,
My overall objective is to find outliers within an aggregated data set vs the underlying detail for different date ranges. The issue I am having is that Power BI is averaging the SalesPerDay and finding the STDEV.P at the daily level which is the grain of the raw data. I need to first find the average Sales, then find the average of those averages for that "rolled up" data set. Same with STDEV.P. Need to find the STDEV of the "rolled up" averages. Screenshot below depicting how I need the tool to aggregate.
I have brought the Sales column into my dashboard, dimensionalized by user, and set to AVERAGE to get average SalesPerDay.
Then I created the new measure
newavg = CALCULATE(AVERAGE(SalesPerDay[Sales]),ALLSELECTED())
Which is finding the overall average, but at the daily level vs the aggregated level.
I also tried
newSTDV = CALCULATE(STDEV.P(AVERAGE(SalesPerDay[Sales])),ALLSELECTED())
But you cannot find the STDEV.P of a calculation.
Thank you.

What you are looking for are the iterator functions, which take a table or column of data as a grouping and then apply a calculation to each group.
One example is SUMX. In the example below, it would do a grouping based on Product. Within each product it would get the total of qty and multiply it by the sum of x. It would then sum the results of that calculation into a total.
SUMX( VALUES( table1[Product] ), [Qty] * [x] )
There are also AVERAGEX, MINX and MAXX, and for the statistical functions there are STDEVX.P and STDEVX.S.
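To make the two-step aggregation concrete, here is a minimal Python sketch (not DAX) of what an iterator over users would compute: first one average per user, then the average and population standard deviation of those per-user averages. The rows and user names are made up for illustration. In DAX the same idea would presumably look something like AVERAGEX( VALUES( SalesPerDay[User] ), CALCULATE( AVERAGE( SalesPerDay[Sales] ) ) ), with STDEVX.P in place of AVERAGEX for the spread; the User column name is an assumption, since the question does not show the model.

```python
from statistics import mean, pstdev

# Hypothetical daily rows: (user, sales). Not the asker's data.
rows = [
    ("ann", 10.0), ("ann", 20.0), ("ann", 30.0),
    ("bob", 100.0),
    ("cat", 40.0), ("cat", 60.0),
]

# Step 1: roll up to one average per user.
by_user = {}
for user, sales in rows:
    by_user.setdefault(user, []).append(sales)
user_avgs = {user: mean(values) for user, values in by_user.items()}

# Step 2: aggregate the rolled-up averages, not the daily rows.
avg_of_avgs = mean(list(user_avgs.values()))
std_of_avgs = pstdev(list(user_avgs.values()))

# For comparison: the plain average at the daily grain.
row_level_avg = mean(sales for _, sales in rows)

print(user_avgs)
print(avg_of_avgs, std_of_avgs, row_level_avg)
```

With uneven numbers of rows per user, the average of averages and the row-level average differ, which is the discrepancy described in the question.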

Related

Tableau Weighted Average of Last Value in Date Group over Running Sum Across Extra Level of Detail not in Report

I am an absolute Tableau beginner, so forgive my lack of proper terminology.
Context
To give some context to the problem, think of the dataset as the balances and current interest rates of two different loans for which we are trying to calculate a weighted average cost of funds at any point in time, while retaining the ability to filter on Program (specific loan).
I have a single dataset that looks like:
The Balance field is used as a running sum, i.e. to get the actual balance as of 4/30/2022, you would sum the column across all Date values on or before 4/30/2022.
The Rate field is the opposite: it represents the discrete interest rate as of the Date. Thus, it cannot be summed.
Each data point is specific to a specific loan, or Program.
So to get the interest rate of Program A as of 4/30/2022, you would simply grab the Rate value of the row where Date = 4/30/2022 and Program = A, or 5.30%. Sums are fine here, since the value of Rate is never repeated for a single Program and Date combo, but we cannot use a running sum.
On the other hand, to get the balance of Program A as of 4/30/2022, you would need to add (running sum) the Balance values for all rows where Date <= 4/30/2022 and Program = A, or 10,000 + -2500 + -2500 + -2500 = 2500.
Problem / Need
I need a report (or whatever it's called in Tableau) with the following:
Date as a column
Measures as rows
This report would NOT include Program as a row or column, but would include it as a filter.
In this report, I need a Weighted Average Cost of Funds measure.
This is effectively the weighted average Rate over/weighted by the running sum of Balance across Programs included in the filter, of course for any given Date in the columns.
In other words, by Date: the latest Rate for each Program times that Program's running sum of Balance, divided by the running sum of all Balances for all Programs included in the filter.
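The calculation described above can be sketched outside Tableau in a few lines of Python. This is only an illustration of the arithmetic: the rows, rates and the function name wacf are hypothetical, not taken from the dataset in the question.

```python
from datetime import date

# Hypothetical rows: (program, date, balance_change, rate_as_of_date).
rows = [
    ("A", date(2022, 1, 31), 10000.0, 0.050),
    ("A", date(2022, 2, 28), -2500.0, 0.051),
    ("B", date(2022, 1, 31), 4000.0, 0.030),
    ("B", date(2022, 2, 28), 1000.0, 0.032),
]

def wacf(rows, as_of, programs):
    """Weighted average rate as of a date, for the filtered programs."""
    balances = {}  # program -> running sum of balance changes
    latest = {}    # program -> (date, rate) of the latest row seen
    for program, day, change, rate in rows:
        if program not in programs or day > as_of:
            continue
        balances[program] = balances.get(program, 0.0) + change
        if program not in latest or day >= latest[program][0]:
            latest[program] = (day, rate)
    total = sum(balances.values())
    if total == 0:
        return None  # no balance to weight by
    weighted = sum(balances[p] * latest[p][1] for p in balances)
    return weighted / total

print(wacf(rows, date(2022, 2, 28), {"A", "B"}))
print(wacf(rows, date(2022, 2, 28), {"A"}))
```

Changing the set passed as programs plays the role of the Program filter: the weights are recomputed from only the programs that remain.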
Here's an example in Excel:
Here's an example if we were to exclude Program A:
And here's an example if we were to exclude Program B:
Finally, here's the formulas underneath everything in the Excel example:

Sum of calculated averages PowerBI

I'm fairly new to Power BI and want to calculate a sum of averages as a measure.
The average itself works fine, but I couldn't manage to sum the averages.
average = AVERAGEX(SUMMARIZE(ProductionVolumeData,
ProductionVolumeData[ProductionOrderID],
MDCProductionVolumeData[MachineID],
"sum_volume",
SUM(ProductionVolumeData[Volume])),[sum_volume])
This formula aggregates volume grouped by production order ID and machine ID and then finds the mean.
I checked it in a table: it works for one ProductionOrderID, but whenever I add another ProductionOrderID to the table it averages across both. What I want is to sum up the per-order averages.
How can I do that?
Thanks in advance
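One reading of this request, sketched in Python with made-up rows: total the volume per (order, machine), average those totals within each order, then sum the per-order averages. In DAX this would presumably mean wrapping the existing measure in an outer iterator over orders, along the lines of SUMX( VALUES( ProductionVolumeData[ProductionOrderID] ), [average] ); that wrapper is an assumption about the intended result, not a confirmed answer.

```python
from statistics import mean

# Hypothetical rows: (production_order_id, machine_id, volume).
rows = [
    ("PO1", "M1", 10.0), ("PO1", "M1", 20.0), ("PO1", "M2", 50.0),
    ("PO2", "M1", 5.0), ("PO2", "M3", 15.0),
]

# Total volume per (order, machine), like the SUMMARIZE step.
totals = {}
for order, machine, volume in rows:
    key = (order, machine)
    totals[key] = totals.get(key, 0.0) + volume

# Average of those totals within each order.
per_order = {}
for (order, _machine), total in totals.items():
    per_order.setdefault(order, []).append(total)
order_avgs = {order: mean(values) for order, values in per_order.items()}

# Sum of the per-order averages, versus one average over all groups.
sum_of_avgs = sum(order_avgs.values())
overall_avg = mean(list(totals.values()))

print(order_avgs, sum_of_avgs, overall_avg)
```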

Qliksense: Compute median of grouped data

I'm facing an issue in QlikSense, trying to compute some statistical indicators (Percentiles, Quartiles, StdDev, Median etc.) on a dataset which is already grouped by the source.
I mean that my dataset is something similar to the following, in which I have for each combination of Week and Customer Age the total number of purchases:
I want to show the median of Customer Age, and due to the structure of the dataset I can't use fractile or median built-in functions, since they would come out with something different.
Suppose I want to calculate the median age across all 3 weeks, that is, the age at which 50% of my purchases is reached.
To let you better understand the question, I show you the histogram:
In this case, the median I want to get is 24-26 years, since the 50% of the total population falls under that range.
I found a useful reference here, but I am having trouble writing this formula in QlikSense:
https://mba-lectures.com/statistics/descriptive-statistics/603/relationship-between-quartiles-decile...
Thanks a lot in advance.
[EDIT]: This is my Data Model View:
[EDIT 2]: Here is my qvf with a dataset more similar to the original one I'm using. As you can see, I can't get the correct result using your formula. In addition, I would like to use it in order to plot the trend of the median through weeks, but it doesn't seem to be possible (Even if I use the modified version of the formula I pointed out in the comments).
If you want to calculate the median in such a scenario you need a weighted median: basically, check which dimension value is in the middle:
Aggr(
    If(
        (Rangesum(
            Above([# Purchases],0,RowNo())
        )
        /Sum(TOTAL [# Purchases]))>=0.5
        and
        (Rangesum(
            Above([# Purchases],1,RowNo()-1))
        /Sum(TOTAL [# Purchases]))<0.5
    ,[Customer Age])
,[Customer Age])
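The logic of that expression (cumulative share reaches 50% at this row, but had not yet reached it on the previous row) can be checked outside Qlik. Below is a minimal Python sketch of the same weighted-median rule; the age bands and purchase counts are invented for illustration, and the bands are assumed to be already sorted by age.

```python
# Hypothetical grouped data: (age_band, purchases), sorted by age.
groups = [
    ("18-20", 10),
    ("21-23", 25),
    ("24-26", 30),
    ("27-29", 20),
    ("30-32", 15),
]

def weighted_median(groups):
    """Return the first band whose cumulative share reaches 50%."""
    total = sum(count for _, count in groups)
    if total == 0:
        return None
    running = 0
    for label, count in groups:
        share_before = running / total   # like Above(..., 1, RowNo()-1)
        running += count
        share_with = running / total     # like Above(..., 0, RowNo())
        if share_before < 0.5 <= share_with:
            return label
    return None

print(weighted_median(groups))
```

Running the function once per week on that week's rows would give the per-week trend asked about in the edit, assuming each week's bands are sorted the same way.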

Why window_avg in tableau has to work on aggregated variable?

I am trying to calculate a 30-day rolling average. However, in Tableau, I have to use window_avg(avg(variable), -30, 0). This means it is actually calculating the average of daily averages: it first calculates the average value per day, then averages those values over the past 30 days. Is there a function in Tableau that can calculate a rolling average directly, like pandas.rolling?
In this specific case, you can use the following
window_sum(sum(variable), -30, 0) / window_sum(sum(1), -30, 0)
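A small Python sketch of why the ratio of two window sums gives the row-level rolling average while a window average of daily averages does not. The data and function names are hypothetical; the window covers the current day plus the previous `window` days, mirroring the -30, 0 offsets.

```python
# Hypothetical raw rows per day: day index -> list of values.
daily = {1: [10.0, 20.0], 2: [30.0], 3: [40.0, 50.0, 60.0]}

def rolling_true_avg(daily, day, window):
    """Window sum of daily sums divided by window sum of daily counts."""
    days = [d for d in daily if day - window <= d <= day]
    total = sum(sum(daily[d]) for d in days)
    count = sum(len(daily[d]) for d in days)
    return total / count if count else None

def rolling_avg_of_avgs(daily, day, window):
    """Window average of daily averages (each day weighted equally)."""
    days = [d for d in daily if day - window <= d <= day]
    means = [sum(daily[d]) / len(daily[d]) for d in days]
    return sum(means) / len(means) if means else None

print(rolling_true_avg(daily, 3, 2))     # weights every row equally
print(rolling_avg_of_avgs(daily, 3, 2))  # weights every day equally
```

The two results agree only when every day has the same number of rows.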
A few concepts about table calcs to keep in mind
Table calcs operate on aggregate query results.
This gives you flexibility - you can partition the table of query results in many ways, access multiple values in the result set, order the query results to impact your calculations, nest table calcs in different ways.
This approach can also give you efficiency if you can calculate what you need simply from the aggregate results that you've already fetched.
It also gives you complexity. You have to be aware of how each calculation specifies the addressing and partitioning of the query results. You also have to think about how double aggregation will impact your results.
In most cases, applying back-to-back aggregation functions requires some careful thought about what the results will mean. As you've noted, averages of averages may not mean what people think they mean. Others may be quite reasonable, say averages of daily sales totals.
In some cases, double aggregation can be used without extra thought as the results are the same regardless. Sums of Sums, Mins of Mins, Max of Max yield the same result as calling Sum, min or max on the underlying data rows. These functions are called additive aggregation functions, and obey the associative rule you learned in grade school. Hence, the formula at the start of this answer.
You can also read about the Total() function.
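As a quick check of the point about double aggregation, this Python sketch (with made-up groups standing in for per-day query results) compares aggregating in two stages with aggregating the underlying rows directly.

```python
# Hypothetical groups of underlying rows, e.g. one list per day.
groups = [[3, 7], [1], [4, 4, 10]]
flat = [x for group in groups for x in group]

# Sum, min and max give the same answer in two stages as in one.
sum_of_sums = sum(sum(g) for g in groups)
min_of_mins = min(min(g) for g in groups)
max_of_maxes = max(max(g) for g in groups)

# Average does not, unless every group has the same size.
avg_of_avgs = sum(sum(g) / len(g) for g in groups) / len(groups)
flat_avg = sum(flat) / len(flat)

print(sum_of_sums == sum(flat), min_of_mins == min(flat), max_of_maxes == max(flat))
print(avg_of_avgs, flat_avg)
```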

Tableau calculation: I am trying to calculate the percentage of running sum but am unable to create a calculation

I am trying to calculate number of customers which represent 80% of the profit so that I can use it in a calculated field which I can use in a reference line.
This is what I wrote
IIF(RUNNING_SUM([Profit])= (0.8*SUM([Profit])),
COUNTD([Customer Name]),0)
but it gives me error saying
"All fields must be constant or aggregate when using table calculation functions"
The logic is to "Count distinct number of customers which represent 80% of running total profits"
This is meant for a pareto chart, so the values are already sorted in descending order for it to work.
How do I create such calculated field which would give me number of top customers which will represent 80% of the profits?
Let me know if more clarifications are needed.
I think you are looking for a Pareto Chart. This might help:
http://www.theinformationlab.co.uk/2014/08/27/pareto-charts-tableau/
I would leverage the power of Table Calculations, where you can first do running total of profit and then simply calculate percentage of total.
Here is a link to a step-by-step tutorial in Tableau 10 for Pareto analysis (the 80/20 rule):
https://www.tableau.com/learn/tutorials/on-demand/pareto-charts?signin=15df68b66e703787258911e79db040a7.
Hope this helps.
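The running-total-then-percentage approach described above can be sketched in Python as follows. The customer names, profit figures and the function name customers_for_share are hypothetical, and profits are assumed to be non-negative. Note that the test is "running total has reached 80%" (>=) rather than equality, since the running sum will rarely land exactly on 80%.

```python
# Hypothetical profit per customer.
profit_by_customer = {"a": 500.0, "b": 300.0, "c": 100.0, "d": 60.0, "e": 40.0}

def customers_for_share(profit_by_customer, share=0.8):
    """Count the top customers needed to reach `share` of total profit."""
    ordered = sorted(profit_by_customer.values(), reverse=True)
    total = sum(ordered)
    running = 0.0
    for n, profit in enumerate(ordered, start=1):
        running += profit  # running sum over customers, largest first
        if running >= share * total:
            return n
    return len(ordered)

print(customers_for_share(profit_by_customer))
```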