Finding median value per category - postgresql

I have a data table with columns "field" and "conc" (short for concentration). I am trying to output each type of field (the categories are cosmos, egs, etc.) along with the median value of conc for that field type.
This is what I have tried:
SELECT field, percentile_cont(0.5)::numeric FROM galaxies GROUP BY conc LIMIT 5;
ERROR: function percentile_cont(numeric) does not exist
LINE 1: SELECT field, percentile_cont(0.5)::numeric FROM galaxies GR...
However, I am getting this error and am not sure how to extract the field name together with the median value of conc for each field type.

As indicated in the documentation for PostgreSQL, WITHIN GROUP (ORDER BY...) is mandatory for the ordered-set aggregate functions like percentile_cont.
If you want the median of the conc, then grouping by conc is surely the wrong thing to do. So you might want something like this:
SELECT field, percentile_cont(0.5) within group (order by conc)::numeric
FROM galaxies GROUP BY field LIMIT 5;
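As a way to sanity-check what the corrected query should return, the same per-category median can be sketched in Python. The rows below are made-up sample data, not values from the galaxies table. statistics.median averages the two middle values when the count is even, which is the same interpolation percentile_cont(0.5) performs.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (field, conc) rows; illustrative only, not real galaxies data.
rows = [
    ("cosmos", 2.0), ("cosmos", 3.0), ("cosmos", 7.0),
    ("egs", 1.0), ("egs", 4.0),
]

def median_per_field(rows):
    """Group conc values by field and return {field: median of conc}."""
    groups = defaultdict(list)
    for field, conc in rows:
        groups[field].append(conc)
    return {field: median(values) for field, values in groups.items()}

result = median_per_field(rows)
print(result)
```

Note that the grouping key is field, not conc, which mirrors the GROUP BY field fix in the query above.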

Related

Tableau KPI prev value depending on variable

I am trying to get the previous Sum (of someField) based on a variable value, which is an Id.
This is not a table; I'm building a KPI.
In Qlik you would do something like:
SUM({<Id={"$(=Max(vVariable),-1))"}>} someField)
But I cannot achieve it in Tableau. Of course this is due to my lack of knowledge; unfortunately time is ticking at work, and I wanted to see if anyone has any input!
Thanks
Assuming you may use a sample input like the Superstore (using sales as metric), this could be what you're looking for:
In red you can see your "variable" which allows you to select a value and in blue you'll find the unique row for the previous value (Order ID sorted).
The first thing you need to do is create a parameter based on all the Order ID values:
Then things start to get a bit complicated if you're not familiar with LOD (Level of Detail) expressions and the order of execution in Tableau, especially for filters.
Assuming that you can get some information on your own (otherwise, feel free to ask), the next thing you need to do is "pre-calculate" the equivalent of a table that has a row for each Order ID and also holds the adjacent Order ID value.
You can achieve this by combining a FIXED (LOD) expression with the LOOKUP function, creating this calculated field, "Lookup Order ID":
LOOKUP( max({ FIXED [Order ID] : MAX([Order ID])}),1)
The inner expression is "fixed" because you need the filter to act after that calculation has been made; LOOKUP then shifts your data by 1 row.
Once you've done that, you just need to create another calculated field to test your parameter value. It could be something like this, "check param":
[Lookup Order ID] = [Order ID param]
Move this calculated field to the filter section and select only the "true" values. You'll get the unique row shown in the initial image: the previous value (blue) relative to the one you select in the parameter drop-down menu (red).
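The shift-then-filter logic described above can be sketched outside Tableau in a few lines of Python. This is only an illustration of the idea, not Tableau code: the order IDs are invented, and the function name previous_order_row is hypothetical.

```python
def previous_order_row(order_ids, selected):
    """Return the order ID that sorts immediately before `selected`, or None."""
    ordered = sorted(set(order_ids))
    # Pair each ID with the one after it, like LOOKUP(..., 1) over sorted rows.
    pairs = zip(ordered, ordered[1:])
    for current, following in pairs:
        # Equivalent of the "check param" test: [Lookup Order ID] = [Order ID param]
        if following == selected:
            return current
    return None  # `selected` is the first ID or is not present

order_ids = ["CA-103", "CA-101", "CA-102", "CA-104"]  # made-up IDs
prev = previous_order_row(order_ids, "CA-103")
first = previous_order_row(order_ids, "CA-101")
print(prev, first)
```

The first ID has no predecessor, so the sketch returns None for it, just as the filter would leave no row in the view.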

level of detail expressions cannot contain table calculations or the attr function in Tableau

I have this Tableau workbook.
Basically, it calculates the difference in days between consecutive transactions for each user_id, using this calculation:
DATEDIFF('day',LOOKUP(MIN([Created At]),-1), MIN([Created At]))
The pull filter restricts which users are included (we can ignore this),
and the date_range filter limits the day-difference calculation to the date range set by the parameters,
using this calculation:
lookup(min(([Created At])),0) >= [START_DATE] and
lookup(min(([Created At])),0) <= [END_DATE]
From the frequency, I want to find the maximum day difference, using this calculation:
MAX({FIXED [User Id]:DATEDIFF('day',LOOKUP(MIN([Created At]),-1), MIN([Created At]))})
but it says
level of detail expressions cannot contain table calculations or the attr function
So I tried this solution: https://kb.tableau.com/articles/howto/finding-the-dimension-member-with-the-highest-measure-value
Following that solution, I changed my code to this:
MAX({FIXED [User Id]:DATEDIFF('day',INT(LOOKUP(MIN([Created At]),-1)), INT(MIN([Created At])))})
but that produces the error "datediff being called with string,integer,integer".
Based on @Anil's solution, I tried to create it, and I don't know why the results look like this:
new picture
As far as I know, Tableau does not currently allow LOD calcs or further aggregations on top of table calcs. To find the transactions where the user took the most time (in days) between consecutive orders, you can use this workaround.
Let's assume your datediff calc field is named CF1. Create another calc field, say CF2, with the following calculation:
rank_unique([CF1])
EDIT:
Set the table calculation options on this field to match those of CF1. Putting a filter on this field will give you the dates with max(time diff), as shown in the screenshot.
table calculation options on first (datediff field)
table calculation options on second field (rank_unique)
I have added third field on colors
(Please note no field used in filters just to highlight)
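To check the numbers the workaround should produce, the target quantity (the largest gap in days between consecutive transactions, per user) can be computed outside Tableau. The sketch below is plain Python rather than the rank_unique approach itself, and the user IDs and dates are invented.

```python
from datetime import date

def max_gap_days(created_at):
    """Largest gap in days between consecutive dates, or None if fewer than two."""
    ordered = sorted(created_at)
    # Difference between each date and the one before it, like
    # DATEDIFF('day', LOOKUP(MIN([Created At]), -1), MIN([Created At])).
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) if gaps else None

transactions = {  # hypothetical user_id -> transaction dates
    1: [date(2021, 1, 1), date(2021, 1, 4), date(2021, 1, 15)],
    2: [date(2021, 2, 1)],
}
max_gaps = {user: max_gap_days(dates) for user, dates in transactions.items()}
print(max_gaps)
```

A user with a single transaction has no gap at all, which is why the sketch returns None there instead of 0.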

Find difference between two calculated groups?

I have dummy HR data, and I want to color format via a map the difference in median salary based on groupings of birth year.
I have a quick calc field to separate them into birth year groups:
IF DATE([Date of Birth]) >=#1976# THEN "Group 1"
ELSE "Group 2"
END
Now I want to find the difference between the median salaries for those two groups, but I want to conditionally format them via a map to see where the median salary remained similar or differed a lot.
For instance: Median(Group 1([salary])-Median(Group 2([salary]) would give me a +/- difference and then I'd like that to be colored via a gradient and then outlines via state level detail.
This is probably so easy, but I can't think of how to do it via those groups. Would this be a LOD calc?
Define a calc to return the salary for rows in the older group (born before 1976), and null otherwise. Call it, say, Old Folks Salary, defined something like if Year([Date of Birth]) < 1976 then [Salary] end. (If the condition in the if statement is not satisfied, and there is no else clause, the expression returns null.) Define a similar field, Young Folks Salary, for the youngsters.
The trick to know is that aggregation functions, like Median, silently ignore null values. It’s as if the null values don’t even exist. So ... You can now express your aggregate calculation as
Median([Old Folks Salary]) - Median([Young Folks Salary])
For extra credit, you can replace the hard coded threshold of 1976 with a parameter, and look for more politically acceptable field names.
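The null-ignoring trick can be illustrated with a small Python sketch. The (birth year, salary) pairs are invented, and threshold_year stands in for the parameter suggested above; None plays the role of Tableau's null.

```python
from statistics import median

def median_ignoring_nulls(values):
    """Median of the non-None values, or None if there are none."""
    present = [v for v in values if v is not None]
    return median(present) if present else None

def median_difference(rows, threshold_year=1976):
    """Median salary of the older group minus median salary of the younger group."""
    # Each list keeps one entry per row, with None where the row is in the other group.
    old = [salary if year < threshold_year else None for year, salary in rows]
    young = [salary if year >= threshold_year else None for year, salary in rows]
    old_median = median_ignoring_nulls(old)
    young_median = median_ignoring_nulls(young)
    if old_median is None or young_median is None:
        return None
    return old_median - young_median

# Hypothetical (birth year, salary) rows.
rows = [(1960, 50000), (1970, 60000), (1980, 40000), (1990, 45000), (1985, 70000)]
diff = median_difference(rows)
print(diff)
```

In a map view this difference would be computed once per state, since the aggregation happens at the level of detail of the view.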

SSRS Grouping Summary - with Max not working

This is the data that comes back from the database
Data Sample for one season (the report returns values for two):
What you can see is groupings, by Season, Theater then Performance number and lastly we have the revenue and ticket columns.
The SSRS report has three levels of grouping: Pkg (another ID that groups the below), venue -- the venue column, and perf_desc -- the description column linked to the perf_no.
Looks like this --
What I need to do is take the revenue column (a unique value) for each Performance and return it in a separate column -- so I use this formula:
sum(Max(Fields!perf_tix.Value, "perf_desc"))
This works great, gives me the total unique value for each performance -- and sums them up by the pkg level.
The catch is when I need to pull the data out by season.
I created a separate column that looks like this:
It's yellow because it's invisible and is referenced elsewhere. The expression says: if the Season value equals the parameter (the passed season value), then sum up the tix values. This also works great on the lower line, the line where the grouping exists for pkg (light blue in my case).
=iif(Fields!season.Value = Parameters!season.Value, Sum(Max(Fields!perf_tix.Value, "perf_desc")), 0)
However, on the line above -- the parent/header line -- it's giving me the sum of the two seasons' values, basically adding it all up. This is not what I want, and why is it doing this? The season value for the second season is not equal to the passed parameter, so why is it being added to the grouped value?
How do I fix this??
Since your aggregate function is inside your IIF function, only the first record in your dataset is being evaluated. If the first one matches the parameter, all records would be included.
This might work:
=IIF(Fields!season.Value = Parameters!season.Value, Sum(Max(Fields!perf_tix.Value, "perf_desc")), 0)
It might be better if your report was also grouping on the Venue; otherwise your count may include all values.

Second Max in Tableau Calculated Field

How can I get the second-highest value from a field in a calculated field? In Excel I would use the LARGE function, but there doesn't seem to be a Tableau equivalent. I would prefer to do the calculation in Tableau instead of using a pass-through function.
Here are two alternatives.
First, if you want the calculation to happen on the data source side, you could write an LOD calculation to find the max of your field; name it myMax:
{fixed [My_Dimension1], [My_Dimension2] : max(myField)}
Whether you use fixed, include or exclude scope for the LOD calc depends on how you want to scope your analysis.
Then write a row-level calc that returns the field value if it is less than the LOD calc, and implicitly null otherwise; name it myFieldExceptMax:
if myField < myMax then myField end
The max of that row level calc would be your answer.
max(myFieldExceptMax)
Alternatively, if you want to operate on the client (Tableau) side to find the penultimate aggregated query result, you can use one of the ranking table calc functions, and then filter to show only the second-ranked result.
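The first alternative (take the max, null out rows equal to it, then take the max of what remains) can be sketched in Python to show how it behaves. The function name second_max and the sample values are illustrative only.

```python
def second_max(values):
    """Largest value strictly below the maximum, or None if there is none."""
    if not values:
        return None
    top = max(values)                           # myMax
    below_top = [v for v in values if v < top]  # myFieldExceptMax (nulls dropped)
    return max(below_top) if below_top else None

runner_up = second_max([3, 9, 9, 5])
print(runner_up)
```

Because the comparison is strictly less than, ties for the maximum are all excluded, so this gives the second-highest distinct value. If tied maximums should count separately, a ranking approach like the second alternative is the better fit.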