Want to visualize a cumulative value of a quotient of two variables from the beginning in Tableau - tableau-api

I'm looking to create a parameter or calculation in Tableau that gives me a kind of running total of a quotient of two variables.
Let's explain this with a simple example using baseball jargon:
Let's consider that a batter starts a season registering both At Bats (AB) and Hits (H). The Batting Average (BA) is defined as H/AB. All these values are available in my data set on a game basis (each row of the dataset is a game).
I need to calculate the cumulative Batting Average of the batter since he started playing.
The following table/image shows how this can be done in Excel adding some other columns, but I want to create that directly in Tableau using a calculation.
Example data in Excel
Both averages (by game and cumulative) in an Excel graph

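You can use RUNNING_SUM on both At Bats and Hits, and then divide the running sum of hits by the running sum of at bats. For example, with fields named [hits] and [at bats]: RUNNING_SUM(SUM([hits])) / RUNNING_SUM(SUM([at bats])).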
See attached: https://dl.dropboxusercontent.com/u/60455118/160716%20stack%20question.twbx
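Outside Tableau, the same arithmetic can be sketched in a few lines of Python. This is only an illustration of what the running-sum quotient computes; the per-game numbers and the function name `cumulative_average` are made up for the example and do not come from the attached workbook.

```python
from itertools import accumulate

# Hypothetical per-game data (one entry per game), not from the question's data set.
at_bats = [4, 3, 5, 4]
hits = [1, 2, 0, 3]

def cumulative_average(hits, at_bats):
    """Return the season-to-date batting average after each game."""
    running_hits = accumulate(hits)
    running_at_bats = accumulate(at_bats)
    # Divide the running totals; yield None while no at bats have been recorded yet.
    return [h / ab if ab else None for h, ab in zip(running_hits, running_at_bats)]

# Per-game average, for comparison with the cumulative one.
per_game = [h / ab for h, ab in zip(hits, at_bats)]
cumulative = cumulative_average(hits, at_bats)
```

Note that the cumulative value is the quotient of the two running sums, not a running average of the per-game averages; the two generally differ when the number of at bats varies from game to game.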

Related

Power BI dynamic Filters on group by

I'm fairly sure what I'm attempting is not the ideal way to do things, due to my lack of knowledge of Power BI, but here goes:
I have two tables in the form of:
One has the actual power against wind and the other is a reference.
I created calculated columns that add a corresponding binned speed to each row (so 1-2, 2-3, 3-4, etc.).
I have filters and slicers applied on the page / visual that will keep changing.
What I want is to create a pivot or a grouped table that changes dynamically based on my filters.
The reason I want this is that the table I currently have shows totals that are averaged (because each individual row is averaged), but I want a sum of an average by category. If I can have this as a calculated table instead of a visual (picture below), I would likely be able to aggregate it again to get what I want.
So in the above table I want the totals to be the sum of the individual rows. I also want to be able to use these totals to carry out other calculations (simple stuff like the total divided by a fixed number, etc.).

Want to SUM all values for a specific date within column NOT sum all values in that column

I want to create a graph which shows the total capacity for each week relative to remaining availability across a series of specific dates. Just now, when I attempt this in Power BI, it calculates one of the values (remaining availability) correctly but generates a value much higher than my manual calculation for the total capacity: it shows the total for the entire column rather than for each specific date.
Why is Power Bi doing this and how can I solve it?
So far, I have tried generating the graph like this:
(https://i.stack.imgur.com/GV3vk.png)
As you can see, the capacity values are incredibly high; they should be 25 days.
The total availability values are correct (ranging from 0 to 5.5 days).
When I create matrices to see the sum breakdown, the values are correct; it is only when they are combined that one of the values changes to the total for the whole column.
If anyone could help me with this issue that would be great! Thanks!

Power BI Clustered Column Chart to show % of row total

I am using Power BI to produce several reports, one of which is the NPS score for support. However, I am coming across an issue with the clustered column chart: it is showing the value against the total number rather than for each row.
What I want to see is the following (as in Excel):
The NPS score is shown as a percentage for each week.
e.g. Week 3 has the Promoter at 95.5% and Detractor at 4.5%
However, when using Power BI, I am shown the following, which is a percentage of the grand total instead of each week.
Using a Matrix, I could see the following as total numbers.
I can copy this Matrix and show it as a Percentage of each Row, which is also correctly showing the results.
I have the dates already set up using a feeder table, which lets me get the week number etc. from a date within the main raw data, so they sort in the correct order.
My Chart is using the following table entries
Cal Week and WeekNo are both from the feeder table (Fiscal)
Net Promoter and Count of Case Num are from the RawData table.
How can I get the chart to show the percentages per week instead of the total?
I am also planning to use slicers to filter down further, for example, Regions (which are in the RawData).
I believe I will need to add an extra column to the RawData, but no idea what to put in it and then how to use that in the chart, and still allow it to slice.
Any help would be greatly appreciated.
Thanks
DD

How to calculate the mean of a dataframe column and find the top 10%

I am very new to Scala and Spark, and am working on some self-made exercises using baseball statistics. I am using a case class to create an RDD and assign a schema to the data, and am then turning it into a DataFrame so I can use SparkSQL to select groups of players via their stats that meet certain criteria.
Once I have the subset of players I am interested in looking at further, I would like to find the mean of a column, e.g. Batting Average or RBIs. From there I would like to break all the players into percentile groups based on their average performance compared to all players: the top 10%, bottom 10%, 40-50%.
I've been able to use the DataFrame.describe() function to return a summary of a desired column (mean, stddev, count, min, and max), but all the values come back as strings. Is there a better way to get just the mean and stddev as Doubles, and what is the best way of breaking the players into groups of 10-percentiles?
So far my thoughts are to find the values that bookend the percentile ranges and writing a function that groups players via comparators, but that feels like it is bordering on reinventing the wheel.
I was able to get the percentiles by using window functions and applying ntile() and cumeDist() over the window. ntile() creates groups based on an input number: if you want things grouped by 10%, use ntile(10); if by 5%, use ntile(20). For a more fine-tuned result, cumeDist() applied over the window will output a new column with the cumulative distribution, and those values can be filtered from there through select(), where(), or a SQL query.
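To make clear what those two window functions compute, here is a minimal plain-Python sketch of the usual SQL-style semantics. It does not use the Spark API; the helper names `ntile` and `cume_dist` and the batting averages are hypothetical, chosen only to mirror the functions named above.

```python
from bisect import bisect_right

def ntile(row_count, n):
    """Bucket numbers 1..n for rows already in window order (SQL NTILE style).

    Bucket sizes differ by at most one, with the larger buckets first.
    """
    base, extra = divmod(row_count, n)
    buckets = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets

def cume_dist(values):
    """Fraction of rows whose value is <= each row's value (ascending order)."""
    ordered = sorted(values)
    total = len(ordered)
    return [bisect_right(ordered, v) / total for v in values]

# Hypothetical batting averages for ten players.
averages = [0.310, 0.255, 0.289, 0.301, 0.240, 0.275, 0.330, 0.262, 0.298, 0.281]

# Rank best-first, then take bucket 1 of 10 as the top 10%.
ranked = sorted(averages, reverse=True)
top_decile = [avg for avg, bucket in zip(ranked, ntile(len(ranked), 10)) if bucket == 1]

# Equivalent filter on the cumulative distribution: keep rows above the 90% mark.
top_by_cume_dist = [avg for avg, cd in zip(averages, cume_dist(averages)) if cd > 0.9]
```

With ten distinct values both approaches select the same single player; with ties or row counts that do not divide evenly, ntile() splits by position while the cumulative distribution keeps tied values together, which is why the latter is the more fine-tuned option.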

Adding Reference Line for Weighted Average in Tableau

I've got a bar chart with three months' worth of data. Each column in the chart is one month's data showing the percentage of rows that met a certain criterion for that month. In the first month, 100% of 2 rows meet the measure. In the second month, 24.2% of 641 rows meet the measure. In the third month, 28.3% of 1004 rows meet the measure. My reference line, which is supposed to show the average across the entire time frame, is showing 50.8%, the simple average (i.e. [100+24.2+28.3]/3), instead of the weighted average (i.e. [100*2+641*24.2+1004*28.3]/[2+641+1004]).
In the rows shelf, I have a measure called "% that meet the criterion", this is defined as SUM([Criterion])/SUM([NUMBER OF RECORDS])
The criterion measure is 1 for any record that qualifies and null for any that do not qualify.
If I go to Analysis >> Totals >> Show Row Grand Totals, a 4th bar is added, and that bar shows the correct weighted average of the other three bars (26.8%), but I really want this to be shown as a reference line instead of having an extra bar on the chart. (Adding the Grand Total bar also drops the reference line down to 44.8%, which is the simple average of the 4 bars now shown on the chart--I can't think of a less useful piece of information than that).
How can I add the weighted average as a reference line?
Instead of using 'Average' as your aggregation, try using 'Total' instead in the Edit Reference Line dialogue window.
I have to say it's a bit counter-intuitive, but this is what the Tableau online help has to say about it:
http://onlinehelp.tableau.com/current/pro/online/mac/en-us/reflines_addlines.html
Total - places a line at the aggregate of all the values in either the cell, pane, or the entire view. This option is particularly useful when computing a weighted average rather than an average of averages. It is also useful when working with a calculation with a custom aggregation. The total is computed using the underlying data and behaves the same as selecting one of the totals option the Analysis menu.
If you are using Tableau 9, you can make a second calculated field using an LOD expression:
{ SUM([Criterion]) / SUM([NUMBER OF RECORDS]) }
This will calculate the ratio for the entire data set after applying context and data source filters, without partitioning the data by any of the other dimensions in your view (such as month in your case).
If you place that new field on the detail shelf then you can use it to create a reference line.
There are other ways to generate a weighted average, but this is probably the simplest in your case.
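As a quick check of the arithmetic behind the two figures in the question, the following Python sketch computes both the simple and the weighted average from the monthly percentages and row counts quoted there. The function names are illustrative only and have nothing to do with Tableau itself.

```python
# (percentage meeting the criterion, number of rows) per month, as quoted in the question.
months = [(100.0, 2), (24.2, 641), (28.3, 1004)]

def simple_average(months):
    """Average of the monthly percentages, ignoring how many rows each month has."""
    return sum(pct for pct, _ in months) / len(months)

def weighted_average(months):
    """Average of the monthly percentages, weighted by each month's row count."""
    total_rows = sum(rows for _, rows in months)
    return sum(pct * rows for pct, rows in months) / total_rows

print(round(simple_average(months), 1))    # 50.8
print(round(weighted_average(months), 1))  # 26.8
```

The weighted result matches the 26.8% shown by the Grand Total bar, since weighting each month's ratio by its row count is the same as dividing the overall sum of qualifying rows by the overall number of rows.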