How can I extract the IN count portion of a Tableau set? I can see the IN/OUT counts when I drop the set into Text but can't figure out how to get at the IN value by itself.
Ultimately, I want to create a Pie Chart of three sets with just the IN counts as the measures.
I am using Tableau Public if that is a factor.
You have to be a little careful about specifying what you wish to count.
One way to think of a set is as a Boolean function that gives a value to each data record denoting whether that record is associated with the set.
Another way to think of a set is as a mathematical set whose members are a subset of the values for some discrete field (or tuple of fields).
The difference between the two views is really just a mindset, whether you consider the set as a Boolean function whose domain is a data row in the data source, or whose domain is the field on which the set definition is based.
Say you are looking at Tableau’s Superstore data set where each data record is a line item for a product attached to an order.
If your set is based on the field Region, say it's called [My Favorite Regions] and currently contains {“East”, “Central”}, do you want your count to be 2 (i.e. the number of regions in the set)? Or do you want your count to be in the tens of thousands (i.e. the number of line items on orders from the regions in the set)? Or something in between, maybe the number of distinct orders (i.e. order ids) within the selected regions?
If you want to count data rows that are associated with the set, you can simply filter by the set and calculate SUM([Number of Records]). If you want to count the regions in the set even though the level of detail of the data is at the order line item level, then you'll have to use either a COUNTD to count the distinct regions, or some other approach to specify what it is you want Tableau to count.
For example, put your set on the filter shelf, and show COUNTD(Region) which could be slow for very large data sets. To get the same effect without an explicit filter, you can define a LOD calculation such as:
{ COUNTD(if [My Favorite Regions] then [Region] end) }
Or you could use a table calc with the SIZE() function to do the calculation in the Tableau client instead of by the data source.
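To make the three possible meanings of "count" concrete, here is a minimal Python sketch (not Tableau code). The line items and order ids are made up for illustration; only the region names and the set contents come from the example above.

```python
# Hypothetical line items as (order_id, region); not real Superstore data.
line_items = [
    ("O-1", "East"),
    ("O-1", "East"),
    ("O-2", "Central"),
    ("O-3", "West"),
    ("O-4", "East"),
    ("O-4", "East"),
]

# The set, viewed as a mathematical set of Region values.
my_favorite_regions = {"East", "Central"}

# The set, viewed as a Boolean function applied to each data row.
in_rows = [(order_id, region) for order_id, region in line_items
           if region in my_favorite_regions]

row_count = len(in_rows)                                   # like SUM([Number of Records]) filtered by the set
order_count = len({order_id for order_id, _ in in_rows})   # like COUNTD of order ids
region_count = len({region for _, region in in_rows})      # like COUNTD([Region])

print(row_count, order_count, region_count)
```

The same filter gives three different answers depending on what is being counted, which is why the question has to be pinned down first.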
Not sure what your data looks like but you could set a certain condition when creating a set or split the IN/OUT into two different sets.
Here's a link to sets in Tableau.
You can do this with an IF statement:
IF [set] = TRUE THEN 1 ELSE 0 END
Then you can sum this calculated field to get the IN count.
The most common use of this pattern is when you have a lot of categories and the set is a "Top N" set: you keep the categories that are in the set and lump everything else into an 'Others' category.
To do this:
IF [set] = TRUE THEN [dimension] ELSE 'Others' END
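As a rough illustration of what the two calculated fields do, here is a small Python sketch. The category values and the membership of `top_set` are invented for the example; in Tableau the set would be defined in the data pane.

```python
from collections import Counter

# Hypothetical dimension values, one per data row.
categories = ["A", "A", "B", "C", "D", "A", "B"]

# Hypothetical "Top N" set membership.
top_set = {"A", "B"}

# IF [set] = TRUE THEN 1 ELSE 0 END, then summed: the IN count.
in_flags = [1 if c in top_set else 0 for c in categories]
in_count = sum(in_flags)

# IF [set] = TRUE THEN [dimension] ELSE 'Others' END, then counted per label.
grouped = Counter(c if c in top_set else "Others" for c in categories)

print(in_count, dict(grouped))
```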
I've been tasked to set up a Tableau worksheet of counts of data (ultimately to create percentages) where the contrived incoming data looks like the following.
id fruit
1 apple
1 orange
1 lemon
2 apple
2 orange
3 apple
3 orange
4 lemon
4 orange
The worksheet needs to look something like the following:
Count of ids
2 Lemons
2 No lemons
I've only been using Tableau for about 4 hours, so is this doable? Can anyone point me in the right direction?
The data is coming in from a SQL Server database in a format that I can control if that helps contribute towards a solution.
Alex's solution based on sets is very good for this scenario, but I would like to show that LODs can be more flexible if you need to extend your solution to include more categories.
For the current scenario, create a calculation with the formula below and build a text table using COUNTD(Id):
{FIXED [Id]:IF MAX([Fruit]='lemon') THEN 'Lemon' ELSE 'No Lemon' END}
Now for the extension part: suppose you are considering the list below, where you want to count Ids with Lemon, Apple and others. Since no Id may be counted twice, categorization follows the order of the conditions. (This kind of precedence is a headache without LODs.)
Now you can change your calculation as below:
{FIXED [Id]:IF MAX([Fruit]='lemon') THEN 'Lemon'
ELSEIF MAX([Fruit]='apple') THEN 'Apple'
ELSE 'No Lemon or Apple' END}
Now your visualization automatically changes to include the new category. This can be extended for any number of fruits.
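For readers who want to check the logic outside Tableau, here is a minimal Python sketch of the same per-Id categorization with precedence, using the sample data from the question. The function name `categorize` is just for illustration.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (id, fruit).
rows = [
    (1, "apple"), (1, "orange"), (1, "lemon"),
    (2, "apple"), (2, "orange"),
    (3, "apple"), (3, "orange"),
    (4, "lemon"), (4, "orange"),
]

# Group fruits per Id, the equivalent of FIXED [Id].
fruits_by_id = defaultdict(set)
for id_, fruit in rows:
    fruits_by_id[id_].add(fruit)

def categorize(fruits):
    """Assign one label per Id; earlier conditions take precedence."""
    if "lemon" in fruits:
        return "Lemon"
    elif "apple" in fruits:
        return "Apple"
    return "No Lemon or Apple"

# Each Id gets exactly one label, so this is a distinct count of Ids per label.
counts = Counter(categorize(fruits) for fruits in fruits_by_id.values())
print(dict(counts))
```

Id 1 has both a lemon and an apple but is counted only under 'Lemon', because that condition is tested first.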
This is a good use for a set.
In the data pane on the left sidebar, right click on the Id field and create a set named "Ids that contain at least one lemon" (or use a shorter less precise name)
In the set definition dialog panel, define the set by choosing "Use all" from the General tab, and then on the Condition tab, define the condition by the formula max([Fruit]="lemon")
There are many ways to think of a set, but the most abstract is just as a mathematical set of Ids that satisfy the condition. Remember each Id has many data rows, so the condition is a function of many data rows and uses the aggregation function MAX(). For booleans, True is treated as greater than False, so MAX() will return True if at least one of the data rows satisfies the condition. By contrast, MIN() is True only if ALL (non-null) data rows satisfy the condition.
Once you have a set that separates your ids into Lemon scented Ids and others, then you can use that set in many ways - in calculated fields, in filters, in combination with other sets to make new sets, and of course on shelves to make visualizations.
To get the result your question seeks, you could put your new set on the Rows shelf, and put CNTD(ID) on the Text shelf or Columns shelf. Make sure you understand why you need count distinct (CNTD) instead of SUM([Number of Records]) here.
BTW, the LOD calculation { fixed [Id] : max([Fruit]="lemon") } is effectively the same solution.
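The MAX()/MIN() behaviour on booleans, and the reason a distinct count is needed, can be checked with a short Python sketch over the sample data from the question. This only mimics the logic; it is not how Tableau evaluates the set internally.

```python
# Sample rows from the question: (id, fruit).
rows = [
    (1, "apple"), (1, "orange"), (1, "lemon"),
    (2, "apple"), (2, "orange"),
    (3, "apple"), (3, "orange"),
    (4, "lemon"), (4, "orange"),
]

# One Boolean per data row: [Fruit] = "lemon".
flags_by_id = {}
for id_, fruit in rows:
    flags_by_id.setdefault(id_, []).append(fruit == "lemon")

# True > False, so max() is True if ANY row matches; min() only if ALL rows match.
lemon_ids = {id_ for id_, flags in flags_by_id.items() if max(flags)}
all_lemon_ids = {id_ for id_, flags in flags_by_id.items() if min(flags)}

# Counting rows versus counting distinct Ids gives different answers.
rows_in_set = sum(1 for id_, _ in rows if id_ in lemon_ids)
ids_in_set = len(lemon_ids)
ids_out_of_set = len(flags_by_id) - ids_in_set

print(lemon_ids, all_lemon_ids, rows_in_set, ids_in_set, ids_out_of_set)
```

Five data rows belong to Ids in the set, but only two distinct Ids do, which is the number the worksheet in the question asks for.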
My end goal is to have a box change color when the last 3 records entered into a field in FileMaker (based on the time of input) meet a certain criterion (e.g. variance < 2). I would like to know how to make this happen, or how a calculation/script can be written to look only at the last 3 records.
There are several ways you could approach this. A simple one would be to use a script to:
Show all records in the given table;
Unsort them (assuming they were entered in chronological order; otherwise sort them by creation timestamp);
Omit all records except the last three;
Get the value of a summary field defined as Standard Deviation of your value field;
Set a global variable/field to the square of the returned value.
Then use the global variable/field to conditionally format your "box".
If you don't want to use a script, you will have to define a relationship in order to get the last three values in the table, regardless of the current found set and/or sort order. Or you may use the ExecuteSQL() function for this.
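The arithmetic behind the script steps can be sketched in Python (this is not FileMaker script code). The records and the threshold of 2 are illustrative, and the sketch assumes the sample variance (dividing by n - 1); whether that matches your summary field depends on how its Standard Deviation option is configured.

```python
from statistics import variance

# Hypothetical records as (creation_timestamp, value), deliberately out of order.
records = [(3, 11.0), (1, 10.0), (5, 13.0), (2, 14.0), (4, 12.0)]

def last_three_variance(records):
    """Sort by creation time, keep the last three values, return their sample variance."""
    ordered = sorted(records, key=lambda r: r[0])
    last_values = [value for _, value in ordered[-3:]]
    return variance(last_values)  # assumption: sample variance, i.e. stdev squared

result = last_three_variance(records)

# The value a conditional-formatting rule would test.
highlight_box = result < 2

print(result, highlight_box)
```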
I am trying to use a calculated measure as a way to filter my data, but it's looking more difficult than expected. Let me explain through an example.
I have data of the following type, with two dimensions - one is a unique ID, the other a category - and four measures.
Initial table
My first step is to rank each element by its score, where the ranking is evaluated within the same category. I therefore create a new measure:
=aggr(rank(sum(Score1)), Category, UniqueID)
I do this for all three scores, resulting in three new calculated measures. My final calculated measure is the average of the three rankings. Below the example, the calculated measure of interest is the one in bold. Note that in my real world calculation I directly evaluate 'New Measure', without creating the intermediate columns 'RankingScore'.
Data with newly calculated measure
Note that this measure is tricky, as it changes according to previous selections. Say, for instance, that I select only entries with 'Amount' > 1000. The relative rankings will change and therefore also 'New Measure'.
In my actual app I need to filter my entries by 'New Measure', after I've made some previous selections on fields like 'Amount'. If it were simply a field, I would normally have created a filter pane, or used the qsVariable extension to get a range slider, to select only rows with 'New Measure' above a set threshold. Unfortunately it seems I cannot do that with my calculated measure.
How would you approach the problem? I was wondering, for example, if it were possible to 'convert' my new measure to an actual field, after all previous selections have been done, but perhaps this is nonsense.
Thank you in advance, and apologies for the long post!
If I'm understanding correctly, I believe this solution should work:
Create a variable for your slider: new_measure_slider.
Create a New Sheet Object -> Slider/Calendar Object.
Configure your slider to control your new_measure_slider variable.
Create a calculated dimension in your chart substituting your 'New Measure' formula (the one you stated was an average of the three ranks). It should be a conditional like this:
=if(aggr([your average formula here], Category, UniqueID) >= new_measure_slider, [Category], null())
Basically, compare your formula to the new_measure_slider variable. If true, use the Category (or UniqueID, whichever you need) as the dimension, if false, null().
Check the 'Suppress When Value is Null' checkbox on your new dimension. This is key. This is what will actually filter your chart.
In the chart properties, Presentation tab, click on your new calculated dimension and hit 'Hide Column'. We don't need to see this because we are using it only as a filter.
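To see what the calculated dimension ends up doing, here is a minimal Python sketch of the same idea: rank within each category, average the three ranks, then keep only rows at or above a threshold. The data, the threshold value, and the helper name `rank_within_category` are all made up, and the sketch assumes no tied scores and that the highest score gets rank 1.

```python
from collections import defaultdict

# Hypothetical rows; in the app these would be whatever remains after selections.
rows = [
    {"id": "a", "category": "X", "score1": 10, "score2": 5, "score3": 7},
    {"id": "b", "category": "X", "score1": 20, "score2": 3, "score3": 9},
    {"id": "c", "category": "X", "score1": 15, "score2": 8, "score3": 1},
    {"id": "d", "category": "Y", "score1": 4, "score2": 4, "score3": 4},
]

def rank_within_category(rows, key):
    """Rank rows by `key` inside each category; highest value gets rank 1 (no ties assumed)."""
    by_category = defaultdict(list)
    for row in rows:
        by_category[row["category"]].append(row)
    ranks = {}
    for members in by_category.values():
        ordered = sorted(members, key=lambda r: r[key], reverse=True)
        for position, row in enumerate(ordered, start=1):
            ranks[row["id"]] = position
    return ranks

score_keys = ("score1", "score2", "score3")
rank_tables = [rank_within_category(rows, key) for key in score_keys]

# 'New Measure': the average of the three rankings per id.
new_measure = {row["id"]: sum(t[row["id"]] for t in rank_tables) / len(rank_tables)
               for row in rows}

# Stand-in for the slider variable.
new_measure_slider = 2.0

# Rows that fail the comparison map to null and are suppressed; the rest are kept.
kept_ids = [row["id"] for row in rows if new_measure[row["id"]] >= new_measure_slider]

print(new_measure, kept_ids)
```

Because the ranks are recomputed from whatever rows are passed in, removing rows first (the equivalent of a selection on Amount) changes the measure, which is the behaviour described in the question.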
You can tell QV to ignore your filtering in the field Amount by adding "Amount=" to your set analysis.
I don't know what your average calculation looks like, but maybe:
(aggr(rank(sum({<Amount=>} Score1)), Category, UniqueID) +
aggr(rank(sum({<Amount=>} Score2)), Category, UniqueID) +
aggr(rank(sum({<Amount=>} Score3)), Category, UniqueID)) / 3
I'm currently looking to count the number of times a value appears across multiple dimensions. For example, say I have the following set of data:
And I want to return something like:
But ideally in the form of a bar graph. I want to keep the names associated with the data, so I can filter by, let's say, all "Bobs" or all "Hannahs".
Does anyone have any advice on how to do this in Tableau?
Here are a couple of ways you may be able to do this.
1) Create a calculated field for each food type. This is a bit cumbersome and you would need to add new ones for any new foods added. Your calculations would look like this:
Hamburgers:
SUM(IF [Food1] = 'Hamburgers' OR [Food2] = 'Hamburgers' THEN 1 END)
Then you would make use of the Measure Names and Measure Values built-in fields.
2) You can normalize your data. If you are referencing an Excel or text file, you can do this right in Tableau. Simply go to the Data Source tab, select the Food fields, and choose to Pivot them:
Goes to:
Now you can do:
Finally, both results support creating a bar chart:
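For reference, the pivot in option 2 amounts to the following reshaping, sketched here in Python. The names and foods are placeholders, and the column layout (a name plus Food1 and Food2) is an assumption based on the calculated field in option 1.

```python
from collections import Counter

# Hypothetical wide rows: (name, food1, food2).
people = [
    ("Bob", "Hamburgers", "Pizza"),
    ("Hannah", "Pizza", "Hamburgers"),
    ("Sam", "Tacos", "Pizza"),
]

# Pivot: one (name, food) row per food column, so the name stays attached.
long_rows = [(name, food) for name, food1, food2 in people for food in (food1, food2)]

# Count per food across both original columns (the bar heights).
food_counts = Counter(food for _, food in long_rows)

# Filtering by name still works because each pivoted row keeps its name.
bob_counts = Counter(food for name, food in long_rows if name == "Bob")

print(dict(food_counts), dict(bob_counts))
```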
I'm using Tableau Desktop 9.0 on OSX. I have data (loaded from a local CSV file) that looks like this:
code,org,items
0212000AA,142,10
0212000AA,143,15
0313000AA,142,90
0314000AA,143,85
I want a chart that shows the number of items beginning with 0212 as a percentage of all items, for each organisation. (I mean as a percentage of the organisation's items - for example, in the above, I would like to show 0.1 (10/(10+90)) for organisation 142.)
I have been able to get part way there, by adding org to Columns, and SUM(items) to Rows. Then by adding a Wildcard filter on code, for starts with 0212.
This shows me the number of items starting with 0212, by organisation.
But what I don't know how to do is show this divided by the value of all items for the organisation.
Is this possible in Tableau, or do I need to pre-calculate it before loading my data source?
One way is to define a calculated field called matches_code_prefix as:
left(code, 4) = "0212"
You can also define a parameter called, say, code_prefix to avoid hard coding the prefix string:
left(code, 4) = code_prefix
And then show the parameter control for code_prefix to allow the user to interact with it.
If you use this new field as a dimension to separate SUM(items) according to those that match the prefix and those that don't, you can then use a quick table calculation to get the percent of total.
For example, you can place org on the Rows shelf and matches_code_prefix on the Columns shelf, and SUM(items) on the Text shelf to make a table. Then under the analysis menu, turn on grand totals for both rows and columns to see the behavior. Next, right click on SUM(items) and choose Quick Table Calc->Percent of Total. Tableau will display the percents of total in the table.
If you want the percent of total defined differently than the default, then right click on the measure again and set Compute Using to a different value such as matches_code_prefix in your case. It's usually better to set compute using to a specific field.
If you only want to display the value for the matching case, select the column header you don't want to see and choose hide. You can also turn off the grand totals from the analysis menu when you are done.
When you are confident in the values in your table, you can turn it into a bar chart for example by moving matches_code_prefix to the detail shelf and the measure to the Columns shelf.
--
The above is the drag and drop approach. If you prefer to hard code everything in a single calculated field that is calculated on the database side, you could instead define a calculation such as:
zn(sum(if matches_code_prefix then items end)) / sum(items)
Then set the default number format for that field to display as a percentage.
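If you want to sanity-check the numbers outside Tableau, here is a small Python sketch of the same ratio, using the sample CSV from the question. The function name `share_by_org` is just for illustration.

```python
import csv
import io
from collections import defaultdict

# Sample data from the question.
CSV_TEXT = """code,org,items
0212000AA,142,10
0212000AA,143,15
0313000AA,142,90
0314000AA,143,85
"""

def share_by_org(csv_text, prefix):
    """Return, per org, items whose code starts with `prefix` divided by all items."""
    matched = defaultdict(int)  # orgs with no match default to 0, like zn()
    total = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        items = int(row["items"])
        total[row["org"]] += items
        if row["code"][:len(prefix)] == prefix:  # left(code, 4) = "0212"
            matched[row["org"]] += items
    return {org: matched[org] / total[org] for org in total}

shares = share_by_org(CSV_TEXT, "0212")
print(shares)
```

For organisation 142 this gives 10 / (10 + 90) = 0.1, matching the value asked for in the question.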