Average based on distinct values in another column - average

For my project I need to create a bar chart where the category axis is the average of an aggregated sum and the value axis is rowcount().
I can't find a way to calculate the average of an aggregated sum directly in the bar chart.
Here you can find the dxp file with the steps required to calculate the final value (85,26).
I came to a solution using calculated columns (highlighted in the attached dataset), but I need to calculate it directly in the bar chart so that it changes when filters are applied.
Let me try to explain better. My KPI is calculated using the following formula:
Sum([Weight]) OVER ([Period],[ID Player],[Player Type],[ID Game],[Player Role])) OVER ([ID Player],[Player Type],[ID Game],[Player Role])>
The average is done on the following values:
(83,50 + 83,50 + 83,50 + 83,50 + 83,50 + 83,50 + 83,50 + 83,50 + 86,14 + 86,14 + 86,14 + 86,14 + 86,14 + 86,14 + 86,14 + 86,14)/16 = 84,82
So I tried to add DISTINCT, but the result was still incorrect since the average is done on the following values:
(83,50+86,14)/2 = 84,82
The average should be based on distinct values from the "Period" column, so:
(83,50+86,14+86,14)/3 = 85,26
Can anyone help me to find a formula to obtain the last average?
Thanks in advance!
EDIT1: I tried a formula that's close to the suggestion in the comments:
Sum([Weight] * [Score]) OVER ([Period],[ID Player],[Player Type],[ID Game],[Player Role]) / Sum([Weight]) OVER ([Period],[ID Player],[Player Type],[ID Game],[Player Role])
,null)) OVER ([ID Player],[Player Type],[ID Game],[Player Role])
But it computes the average using only the first value of each period: (60+97+97)/3 = 84,67

Not sure how you have set up the visualisation, but something like the following formula might work:
Avg(case when Rank(RowId(),[Period])=1 then [final Score] end)
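To check the arithmetic outside Spotfire, here is a minimal Python sketch of the three averages discussed in the question. The period labels (P1 to P3) and the way the 16 rows are split across periods are assumptions made for illustration; only the score values come from the question.

```python
from statistics import mean

# Hypothetical rows: (period, final_score). The period labels and the split of
# rows across periods are assumptions; only the scores come from the question.
rows = (
    [("P1", 83.50)] * 8
    + [("P2", 86.14)] * 4
    + [("P3", 86.14)] * 4
)

# 1) Plain average over all 16 rows.
avg_all_rows = mean(score for _, score in rows)

# 2) Average over distinct score values (the effect of adding DISTINCT).
avg_distinct_scores = mean({score for _, score in rows})

# 3) Average over one score per period (the result the question asks for).
first_per_period = {}
for period, score in rows:
    first_per_period.setdefault(period, score)  # keep the first row of each period
avg_per_period = mean(first_per_period.values())

print(round(avg_all_rows, 2), round(avg_distinct_scores, 2), round(avg_per_period, 2))
```

Keeping one row per period before averaging is the same idea as the Rank(...)=1 condition in the formula above.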

Related

High Value at Each Granularity

I have a table below which shows some data, and I am trying to get the high value from each row.
Below is the data I have:
Table A
Once calculated, it should look like the table below in Tableau:
Table B
Notice that Total is not the sum of pers and bus; instead it takes the highest total from Table A, and the same concept applies to Grand Total. The numbers you see are balances.
I am able to get the high value up to Total, but the Grand Total is where I am struggling. Below is the calculation I am using:
IF COUNTD([Category]) = 1 THEN
    SUM({ FIXED [Group], [Category]: MAX(
        { FIXED [Group], [Category], [Date]: SUM([Balance]) }) })
ELSE
    SUM({ FIXED [Group]: MAX(
        { FIXED [Group], [Date]: SUM([Balance]) }) })
END
Step 1: For the category maximums, use this calculation:
MAX({Fixed [MVRA Group], [Product Category], [Date Display]: ([Primary Measure])})
Step 2: For the subtotal maximums, use this calculation:
MAX({Fixed [MVRA Group], [Date Display]: ([Primary Measure])})
https://drive.google.com/file/d/1zyjCDdG_QECrkFgY-UFm7WM2yN1xaHmZ/view?usp=sharing
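The logic behind these calculations (sum the balance per date at a given level, then take the maximum over dates) can be sketched in Python. The sample rows and the helper name max_of_sums are made up for illustration, and treating the grand total as the same computation with no grouping fields is an assumption, not something the answer states.

```python
from collections import defaultdict

# Hypothetical rows: (group, category, date, balance). All values are made up.
rows = [
    ("G1", "pers", "2020-01", 100), ("G1", "pers", "2020-02", 150),
    ("G1", "bus",  "2020-01", 300), ("G1", "bus",  "2020-02", 200),
    ("G2", "pers", "2020-01", 50),  ("G2", "pers", "2020-02", 40),
]

def max_of_sums(rows, key_fields):
    """Sum balance per (key, date), then take the max over dates for each key."""
    sums = defaultdict(float)
    for group, category, date, balance in rows:
        record = {"group": group, "category": category}
        key = tuple(record[field] for field in key_fields)
        sums[(key, date)] += balance
    best = {}
    for (key, _date), total in sums.items():
        best[key] = max(best.get(key, total), total)
    return best

per_category = max_of_sums(rows, ["group", "category"])  # category rows
per_group = max_of_sums(rows, ["group"])                  # "Total" rows
grand_total = max_of_sums(rows, [])                       # "Grand Total" (assumed)

print(per_category, per_group, grand_total)
```

With this sample data the G1 total is 400 (its best single date), not 150 + 300 = 450, which is the "highest value, not the sum of the parts" behaviour the question describes.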

Calculate median sales price with using 3 variables Tableau 10

I would like to calculate the median sales price and the median rental price for an apartment in NYC in each of the 5 boroughs: Brooklyn, Bronx, Manhattan, Queens and Staten Island. In Tableau, sales and rentals are groups of ListPrice. The variables are ListPrice (number, decimal), Type (includes sales and rentals) and Borough.
Any help is appreciated
I tried using Tableau's table calculation feature, but that did not work. I tried:
WINDOW_MEDIAN(SUM([ListPrice])-1, -1)
ERROR: WINDOW_MEDIAN is being called with (float, integer), did you mean (float, integer, integer)
Data
Type         Borough        ListPrice
RentalType1  Manhattan      $5,000
RentalType2  Bronx          $3,000
RentalType2  Brooklyn       $3,000
SalesType2   Manhattan      $900,000
SalesType1   Brooklyn       $100,000
SalesType1   Bronx          $500,000
SalesType2   Queens         $800,000
SalesType2   Staten Island  $400,000
WINDOW_MEDIAN takes up to three arguments: the expression, the start offset and the end offset of the window within the partition. Your formula gives a start offset but no end offset.
Compute the table calculation along Type within each Borough, so that the median is calculated per Borough.
So your formula would be:
WINDOW_MEDIAN(SUM(INT([ListPrice])), FIRST(), LAST())
Are you looking to get the values below?
Here, calculation2 is the median value.
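As a cross-check outside Tableau, here is a small Python sketch that groups the sample rows from the question by borough and by sale or rental, then takes the median of each group. Deriving the sale/rental split from the prefix of Type is an assumption based on the sample data, and kind_of is a hypothetical helper.

```python
from collections import defaultdict
from statistics import median

# Rows from the question's sample data: (type, borough, list_price).
listings = [
    ("RentalType1", "Manhattan", 5_000),
    ("RentalType2", "Bronx", 3_000),
    ("RentalType2", "Brooklyn", 3_000),
    ("SalesType2", "Manhattan", 900_000),
    ("SalesType1", "Brooklyn", 100_000),
    ("SalesType1", "Bronx", 500_000),
    ("SalesType2", "Queens", 800_000),
    ("SalesType2", "Staten Island", 400_000),
]

def kind_of(listing_type):
    # Assumption: types starting with "Rental" are rentals, all others are sales.
    return "Rental" if listing_type.startswith("Rental") else "Sales"

# Collect prices per (borough, kind) group.
prices = defaultdict(list)
for listing_type, borough, price in listings:
    prices[(borough, kind_of(listing_type))].append(price)

# One median per group.
medians = {key: median(values) for key, values in prices.items()}

for (borough, kind), value in sorted(medians.items()):
    print(borough, kind, value)
```

Each group in the sample holds a single row, so every median equals that row's price; with real data the groups would contain many listings.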

Tableau Column Difference - as a dimension

I am looking for the difference of two columns in Tableau. I have the following formula:
IF ATTR([Valuation Profile]) = "Base" THEN
LOOKUP(ZN(SUM([Value])), 1) - ZN(LOOKUP(SUM([Value]), 0)) END
But I get it as a separate column in the Columns section. How do I get it in the Rows section? Basically, how do I get the difference as a dimension?
Please see the attached images of what I want and what I have. Apparently, I cannot upload my Excel sheet and Tableau worksheet here, so I have uploaded just the screenshots.
What I have - vs - What I want
Tableau Workbook
First off, there is no way to generate additional rows for your data in Tableau!
In your case, however, you could use the following workaround:
Create one calculated field for BASE and one for CSA. The formulas should be IF [Valuation Profile] = 'BASE' THEN [Value] END and IF [Valuation Profile] = 'CSA' THEN [Value] END, respectively.
Afterwards, drag Measure Names onto your Rows shelf and replace SUM([Value]) with your two newly created calculated fields. That should give you all three measures in different rows in your table.
Reference: https://community.tableau.com/message/627171#627171
Use an LOD expression to calculate the individual values first.
Create calculated fields 'BASE', 'CSA' and 'CSA-BASE' as below.
BASE:
{FIXED [Book Name]: SUM( IF [Valuation Profile] = 'BASE' THEN [Value] ELSE 0 END ) }
CSA:
{FIXED [Book Name]: SUM( IF [Valuation Profile] = 'CSA' THEN [Value] ELSE 0 END ) }
CSA-BASE:
[CSA]-[BASE]
Solution
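The same per-book computation can be sketched in Python: total the values for each valuation profile per book, then subtract. The book names and amounts below are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (book_name, valuation_profile, value).
rows = [
    ("Book A", "BASE", 100.0), ("Book A", "CSA", 130.0),
    ("Book B", "BASE", 80.0),  ("Book B", "CSA", 75.0),
    ("Book B", "CSA", 5.0),
]

# Equivalent of the two FIXED [Book Name] sums: one total per profile per book.
totals = defaultdict(lambda: {"BASE": 0.0, "CSA": 0.0})
for book, profile, value in rows:
    if profile in ("BASE", "CSA"):
        totals[book][profile] += value

# Equivalent of the [CSA]-[BASE] field.
report = {
    book: {**sums, "CSA-BASE": sums["CSA"] - sums["BASE"]}
    for book, sums in totals.items()
}

for book, measures in report.items():
    print(book, measures)
```

Because all three numbers are ordinary measures per book, they can be laid out as separate rows or columns without any table calculation.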

Tableau - Calculating average where date is less than value from another data source

I am trying to calculate the average of a column in Tableau, except the problem is I am trying to use a single date value (based on filter) from another data source to only calculate the average where the exam date is <= the filtered date value from the other source.
Note: Parameters will not work for me here, since new date values are being added constantly to the set.
I have tried many different approaches, but the simplest was trying to use a calculated field that pulls in the filtered exam date from the other data source.
It successfully can pull the filtered date, but the formula does not work as expected. 2 versions of the calculation are below:
IF DATE(ATTR([Exam Date])) <= DATE(ATTR([Averages (Tableau Test Scores)].[Updated])) THEN AVG([Raw Score]) END
IF DATEDIFF('day', DATE(ATTR([Exam Date])), DATE(ATTR([Averages (Tableau Test Scores)].[Updated]))) > 1 THEN AVG([Raw Score]) END
Basically, I am looking for the equivalent of this in SQL Server:
SELECT AVG([Raw Score]) WHERE ExamDate <= (Filtered Exam Date)
Below is a workbook that shows an example of what I am trying to accomplish. Currently it returns all blanks, likely due to the many-to-one comparison I am trying to use in my calculation.
Any feedback is greatly appreciated!
Tableau Test Exam Workbook
I was able to solve this by using Custom SQL to join the tables together and calculate the average based on my conditions, to get the column results I wanted.
Would still be great to have this ability directly in Tableau, but whatever gets the job done.
Edit:
SELECT
[AcademicYear]
,[Discipline]
--Get the number of student takers
,COUNT([Id]) AS [Students (N)]
--Get the average of the Raw Score
,CAST(AVG(RawScore) AS DECIMAL(10,2)) AS [School Mean]
--Get the number of failures based on an "adjusted score" column
,COUNT(CASE WHEN [AdjustedScore] < 70 THEN 1 END) AS [School Failures]
--This is the column used as the cutoff point for including scores
,[Average_Update].[Updated]
FROM [dbo].[Average] [Average]
FULL OUTER JOIN [dbo].[Average_Update] [Average_Update] ON ([Average_Update].[Id] = [Average].UpdateDateId)
--The meat of joining data for accurate calculations
FULL OUTER JOIN (
SELECT DISTINCT S.[Id], S.[LastName], S.[FirstName], S.[ExamDate], S.[RawScoreStandard], S.[RawScorePercent], S.[AdjustedScore], S.[Subject], P.[Id] AS PeriodId
FROM [StudentScore] S
FULL OUTER JOIN
(
--Get only the 1st attempt
SELECT DISTINCT [NBOMEId], S2.[Subject], MIN([ExamDate]) AS ExamDate
FROM [StudentScore] S2
GROUP BY [NBOMEId],S2.[Subject]
) B
ON S.[NBOMEId] = B.[NBOMEId] AND S.[Subject] = B.[Subject] AND S.[ExamDate] = B.[ExamDate]
--Group in "Exam Periods" based on the list of periods w/ start & end dates in another table.
FULL OUTER JOIN [ExamPeriod] P
ON S.[ExamDate] >= P.PeriodStart AND S.[ExamDate] <= P.PeriodEnd
WHERE S.[Subject] = B.[Subject]
GROUP BY P.[Id], S.[Subject], S.[ExamDate], S.[RawScoreStandard], S.[RawScorePercent], S.[AdjustedScore], S.[NBOMEId], S.[NBOMELastName], S.[NBOMEFirstName], S.[SecondYrTake]) [StudentScore]
ON
([StudentScore].PeriodId = [Average_Update].ExamPeriodId
AND [StudentScore].Subject = [Average].Subject
AND [StudentScore].[ExamDate] <= [Average_Update].[Updated])
--End meat
--Joins to pull in relevant data for normalized tables
FULL OUTER JOIN [dbo].[Student] [Student] ON ([StudentScore].[NBOMEId] = [Student].[NBOMEId])
INNER JOIN [dbo].[ExamPeriod] [ExamPeriod] ON ([Average_Update].ExamPeriodId = [ExamPeriod].[Id])
INNER JOIN [dbo].[AcademicYear] [AcademicYear] ON ([ExamPeriod].[AcademicYearId] = [AcademicYear].[Id])
--This will pull only the latest update entry for every academic year.
WHERE [Updated] IN (
SELECT DISTINCT MAX([Updated]) AS MaxDate
FROM [Average_Update]
GROUP BY [ExamPeriodId])
GROUP BY [AcademicYear].[AcademicYearText], [Average].[Subject], [Average_Update].[Updated]
ORDER BY [AcademicYear].[AcademicYearText], [Average_Update].[Updated], [Average].[Subject]
I couldn't download your file to test with your data, but try reversing the order in which you take the average, i.e.
AVG(IF DATE(ATTR([Exam Date])) <= DATE(ATTR([Averages (Tableau Test Scores)].[Updated])) THEN [Raw Score] END)
As written, I believe you are averaging the data before returning it from the IF statement, whereas you want to return the data first and then average it.
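The difference between "filter the rows, then average" and "average everything" is easy to see in a small Python sketch. The dates, scores and cutoff below are hypothetical; cutoff stands in for the filtered [Updated] value from the other data source.

```python
from datetime import date
from statistics import mean

# Hypothetical exam rows: (exam_date, raw_score).
scores = [
    (date(2019, 1, 10), 70.0),
    (date(2019, 2, 15), 80.0),
    (date(2019, 3, 20), 90.0),
]
cutoff = date(2019, 2, 28)  # stands in for the filtered [Updated] value

# Apply the row-level condition first, then aggregate, like
# SELECT AVG(...) WHERE ExamDate <= cutoff in the question's SQL.
eligible = [score for exam_date, score in scores if exam_date <= cutoff]
filtered_avg = mean(eligible) if eligible else None

# Aggregating without the condition gives a different number.
overall_avg = mean(score for _, score in scores)

print(filtered_avg, overall_avg)
```

The condition has to be evaluated per row before the aggregation runs; evaluating it once on already aggregated values cannot exclude individual exams.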

Postgresql upper limit of a calculated field

Is there a way to set an upper limit on a calculation (calculated field) that is already in a CASE clause? I'm calculating percentages and, obviously, don't want the highest value to exceed 100.
If it wasn't in a CASE clause already, I'd write something like 'case when calculation > 100.0 then 100 else calculation end as needed_percent', but I can't do that now.
Thanks for any suggestions.
I think using the LEAST function is the best option:
select least((case when ...), 100) from ...
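To see the capping behaviour without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no LEAST; its multi-argument min() plays the same role here, and the table t with its values is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, pct REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 42.0), (2, 100.0), (3, 137.5)],
)

# SQLite's two-argument min() stands in for PostgreSQL's LEAST() here:
# every row is kept, and values above 100 are replaced by 100.
capped = conn.execute(
    "SELECT id, min(pct, 100.0) FROM t ORDER BY id"
).fetchall()

print(capped)
conn.close()
```

In PostgreSQL the expression inside the function can be the whole existing CASE clause, so the original calculation does not need to be restructured.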
There is a way to set an upper limit on a calculated field by wrapping it in an outer query. Check out my example below. The inner query is the query that you have currently. Then create an outer query on it and use a WHERE clause to limit the result to <= 1. Note that the WHERE clause removes rows above the limit rather than capping their values.
SELECT
z.id,
z.name,
z.percent
FROM(
SELECT
id,
name,
CASE WHEN id = 2 THEN sales / SUM(sales) OVER () ELSE NULL END AS percent
FROM
users_table
) AS z
WHERE z.percent <= 1
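The outer-query pattern can be used either to filter or to cap, and the two give different results. The sketch below uses Python's sqlite3 module with a hypothetical users_table whose percent column is already computed, so it only illustrates the outer query, not the inner calculation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; percent is stored directly to keep the sketch short.
conn.execute("CREATE TABLE users_table (id INTEGER, name TEXT, percent REAL)")
conn.executemany(
    "INSERT INTO users_table VALUES (?, ?, ?)",
    [(1, "a", 0.4), (2, "b", 1.0), (3, "c", 1.3)],
)

inner = "SELECT id, percent FROM users_table"

# Outer WHERE: rows above the limit are dropped from the result.
filtered = conn.execute(
    f"SELECT z.id, z.percent FROM ({inner}) AS z "
    "WHERE z.percent <= 1 ORDER BY z.id"
).fetchall()

# Outer CASE: every row is kept and values above the limit are capped.
capped = conn.execute(
    "SELECT z.id, CASE WHEN z.percent > 1 THEN 1.0 ELSE z.percent END "
    f"FROM ({inner}) AS z ORDER BY z.id"
).fetchall()

print(filtered)
print(capped)
conn.close()
```

If the goal is a true upper limit, the CASE (or LEAST) form in the outer query keeps the row for id 3, while the WHERE form silently loses it.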