How to compare two averages for the same column? - tableau-api

I want to take an average of the whole data and then filter that data and take another average and then compare the two. Any help is appreciated.

Translating on sample superstore-
The following expression will give average of sales for entire data
{ Avg([sales])}
across all rows. The following expression will, however, give category wise average sales
{FIXED [Category]: avg([sales])}
across all rows again. If you want to apply filters on these calculations add that filter to context but be cautious that the filters will then filter the data used for calculations in both the expressions. If you just want to filter data for viewing purpose and not the calculations dont add the filters to context.

Related

Sum Filtered Data in Tableau

I have a database of users and each user record has "User ID" and "Group". After filtering out a chunk of the records, I'd like to sum the number of users within each group. Currently I am doing that with the calculation:
{FIXED[Group]:SUM([Number of Records])}
The problem here is this calculation appears to ignore any records that I've filtered out and just gives a total count per group from all of the unfiltered data.
Is there a quick way to sum the number of visible users in each group after applying a filter?
The easiest way of solving this would be to take advantage of the order of operations in Tableau.
The issue you are having at the moment is the LOD calculation is performed prior to a dimension filter.
If you want to calculate a field at a different level of detail then the view than a LOD is still the way to go. All you need to do is force tableau to apply the filters before calculating the fixed calculation.
In order to do this change your filters to a context filter. This is done by right clicking on the filter and selecting "Add to context. You will see the filter change from blue to grey.
Your calculated field should now be sensitive to any context filters.
Find out more here

How to avoid aggregation of measure values in tableau

I'm trying to retrieve and analyze records from SQL server. Whenever I drag the measure values to rows field it gets auto aggregating itself in tableau.
I dont want this aggregation to be done since my values doesn't make sense when it gets agregated.
Is there a solution to remove this aggregation in tableau ?
Thanks
Yes, you can avoid aggregating values. However, your problem isn't that you are aggregating your values, your problem is that you are treating dimensions as measures.
To fix this you can convert Year from a measure to a dimension:
Of course, if you want to disaggregate the measures then you can always do that too:

How to calculate the mean of a dataframe column and find the top 10%

I am very new to Scala and Spark, and am working on some self-made exercises using baseball statistics. I am using a case class create a RDD and assign a schema to the data, and am then turning it into a DataFrame so I can use SparkSQL to select groups of players via their stats that meet certain criteria.
Once I have the subset of players I am interested in looking at further, I would like to find the mean of a column; eg Batting Average or RBIs. From there I would like to break all the players into percentile groups based on their average performance compared to all players; the top 10%, bottom 10%, 40-50%
I've been able to use the DataFrame.describe() function to return a summary of a desired column (mean, stddev, count, min, and max) all as strings though. Is there a better way to get just the mean and stddev as Doubles, and what is the best way of breaking the players into groups of 10-percentiles?
So far my thoughts are to find the values that bookend the percentile ranges and writing a function that groups players via comparators, but that feels like it is bordering on reinventing the wheel.
I was able to get the percentiles by using Windows Functions and apply ntile() and cumeDist() over the window. The ntile() can create grouping based off of an input number. If you want things grouped by 10%, just enter ntile(10), if by 5% then ntile(20). For a more fine-tuned restult, cumeDist() applied over the window will output a new column with the cumulative distribution, and those can be filtered from there through select(), where(), or a SQL query.

Different Aggregation calculations of a measure using two dimensions in Tableau

It is a Tableau 8.3 Desktop Edition question.
I am trying to aggregate data using two different dimensions. So, I want to aggregate twice: first I want to sum over all the rows and then multiply the results in a cummulative manner (so I can build a graph). How do I do that? Ok, too vague, here follow some more details:
I have a set of historical data. The columns are the date, the rows are the categories.
Easy part: I would like to sum all the rows.
Hard part: Given this those summations I want to build a graph that for each date it shows the product of all the summations from the earlier date till this date.
In another words:
Take the sum of all rows, call it x_i, where i is the date.
For each date i find y_i such that y_i = x_0 * x_1 * ... * x_i (if there is missing data, consider it to be one)
Then show a line graph for the y values versus the date.
I have searched for a solution for this and tried to figure it out by myself, but failed.
Thank you very much for your time and help :)
You need n calculated fields (number of columns you have), and manually do the calculation you need:
y_i = sum(field0)*sum(field1)
Basically because you cannot iterate on columns. For tableau, each column represent a different dimension or measure. So it won't consider that there is a logic order among them, meaning, it won't assume that column A comes before column B. It will assume A and B are different things.
Tableau works better with tables organized as databases. So if you have year columns, you should reorganize your data, eliminate all those columns and create a single field called 'Date', which will identify the value of your measure for that date. Yes, you will have less columns but far more rows. But Tableau works better this way (for very good reasons).
Tableau 9.0 allows you to do that directly. I only watched a demo (it was launched yesterday), but I understand that now there is an option to selected those columns (in the Data Connection tab) and convert them to a database format.
With that done, you can use a PREVIOUS_VALUE function to help you. I'm not with Tableau right now. As soon as I get to it I'll update this with the final answer . Unless you take the lead and discover yourself before that ;)

Aggregate bins in Tableau

I want to aggregate bins in tableau.
See the following figure:
I want to aggregate (merge) the NumberM from 6 untill 16 in one category. 5+/(6 and higher) for example and sum the values of 6-16 in that category. I think this can be done with a few simple clicks but I am not able to manage.
Thanks in advance,
Tim
There are several ways to classify data rows into different groups or classes: each with different strengths.
Create a calculated field As emh mentioned, one approach is to create a calculated field to assign a value to a new field indicating which group each data row belongs to. For the effect you want, the calculated field should be discrete (blue). If your calculation doesn't return a value for in one case, e.g. an if statement without an else clause, then the field will be null in that case which is a group in itself. This is a very general approach, and can handle much more complex cases. The only downsides are the need to maintain the calculated field definition and that the cutoff values are hard coded and by itself can't be changed dynamically via a control on the view. BUT those issues can by easily resolved by using a parameter instead of a numeric literal in your calculated field. In fact, that's probably the number one use case for parameters. If you think in SQL, a discrete field on a shelf is like a group by clause.
Use a filter If you only want a subset of the data in your view, e.g. data rows with NumberM in [6, 16] then you can drag the NumberM field onto the filters shelf and select the range you want. Note for continuous (green) numeric fields, filter ranges include their endpoints. Filters are very quick and easy to drop on a view. They can be made dynamically adjustable by right clicking on them and creating a quick filter. Its obvious from the view that a filter is in use, and the caption will include the filter settings in its description. But a filter doesn't let you define multiple bins. If you think in SQL, a filter is like a where clause (or in some cases using the condition tab, like a having clause)
Define histogram bins If you want to create regular sized bins to cover a numeric range, such as values in [1,5], [6,10], [11-15] ..., Tableau can create the bin field for you automatically. Just right click on a numeric field, and select Create Bins.
Define a group Very useful for aggregating discrete values, such as string fields, into categories. Good for rolling up detail or handling multiple spellings or variants in your data. Just right click on a field and select Create Group. Or select some discrete values on an axis or legend and press the paperclip option. If you then edit a group, you'll see what's going on. If you think in SQL, a group is like a SQL case statement.
Define a set Another way to roll up values. The definition of a set can be dynamically computed or a hard coded list of members. Both kinds are useful. You can combine sets with union, intersection, set difference operators, and can test set membership in calculated fields. Sets are useful for binary decisions, rows are divided into those that are members of the set and those that are not.
Filters, sets, groups, calculated fields and parameters can often be combined to accomplish different effects.
Most if not all of these features can be implemented using calculated fields, especially if the business rules get complicated. But if a filter, bin, group or set fits your problem well, then it's often best to start with that, rather than define a calculated field for each and every situation. That said, learning about the 4 kinds of calculated fields really makes a difference in being able to use Tableau well.
You can do this with calculated fields.
Go to: Select Analysis > Create Calculated Field.
Then use this formula:
IF NumberM > 5 THEN "OVER 5"
You can then use that calculated field as a filter on the worksheet in your screenshot.
Answering my own question:
With Tableau 9 this can be easily done with the increased flexibility of the level of detail expressions (LOD). I can really recommend this blog on that subject and many more Tableau functions.