How to display 40+ columns in Tableau? - tableau-api

I am trying to build a list report with about 40 columns (dimensions plus measures), but I can't get it right:
the requirement runs into Tableau's limit of 16 columns.
How can I get this done?
I read this
Here is my Tableau workbook with 16+ columns but no column header

Go to Analysis --> Table Layout --> Advanced and change the numbers for Rows and Columns as needed.
You can't set more than 16 there, but do raise both to 16 so the values are easy to identify in the file.
Then save the Tableau workbook with the extension .twb and open that file in Notepad.
Search for the text: attr='row-levels'.
You will find something like:
<format attr='row-levels' value='16' />
<format attr='row-horiz-levels' value='16' />
Change the value 16 to the number of columns you need. Save the file in Notepad, then open it in Tableau.
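If you'd rather not hand-edit the file, the same search-and-replace can be scripted. This is only a rough sketch: the function name is made up, and it assumes the two format lines appear in your .twb with exactly the quoting and spacing shown above (yours may differ). Work on a copy of the workbook, since editing the XML is not something Tableau supports.

```python
import re

def raise_level_limits(twb_xml: str, new_value: int) -> str:
    """Return the workbook XML with row-levels and row-horiz-levels set to new_value.

    Hypothetical helper; assumes the attributes are written exactly as quoted above.
    """
    pattern = re.compile(
        r"(<format attr='(?:row-levels|row-horiz-levels)' value=')\d+(' />)"
    )
    # Keep everything around the number and swap in the new value.
    return pattern.sub(lambda m: f"{m.group(1)}{new_value}{m.group(2)}", twb_xml)

sample = (
    "<format attr='row-levels' value='16' />\n"
    "<format attr='row-horiz-levels' value='16' />\n"
)
print(raise_level_limits(sample, 40))
```

For a real workbook you would read the .twb as text, pass it through the function, and write the result to a new file.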

The special fields Measure Names and Measure Values can help here and cover most use cases. (Using them is likely a better choice than creating 40+ Marks cards as you did in your posted example.)
Put Measure Names on the Columns and Filters shelves and Measure Values on the Text shelf. Then add the measure fields you want to the Measure Values shelf, and put the dimensions you want on the Rows shelf.
A single field+aggregation can only be on the Measure Values shelf once, but a field can repeat with different aggregations -- so you can show the min, avg and max of a measure in 3 different columns.
As you mentioned, you can increase the maximum number of column and row headers up to 16 each via the Analysis->Table Layout->Advanced menu and panel. Beyond that point, adjacent columns will still display, but they are coalesced for display.
Still, you can have an apparently arbitrary number of fields on the Measure Values shelf, so you can display as many columns of measures (data) as you wish, even though adjacent header columns for dimensions (~categories) get coalesced for display once you hit the header limit.
Tableau is optimized for summarizing data for efficient interpretation by humans, so displaying extremely wide tables of data is not the best fit for the tool (or for a human reader, frankly). Importing and exporting large tables is certainly possible.

At the 2015 conference I went to a session called "Use Tableau Like a Sith" and they showed us how to change the XML to work around the limit of 16. The caveat is that this is not supported.
Find the entries in the attached image and change their value to 40. In the screenshot, the Sith presenters were changing them to 36.

Here is a workaround for some data sets:
Convert your fields from Dimension to Measure, and then
display them using Measure Names / Measure Values, as @Alex Blakemore suggested.
For example, Boolean fields can be converted to numeric using INT().
PROS:
It is easier to change which fields to plot using Measure Names / Measure Values.
Faster performance, at least for some data sets.
CONS:
Data sets often have some fields that cannot or should not be converted to measures.
Not as easy or straightforward as changing the Analysis > Table Layout > Advanced settings, or the XML-editing workaround suggested by @Cyndi1976.

There are two ways:
Open the saved .twb file in Notepad and edit the XML below:
<format attr='row-levels' value='16' />
<format attr='row-horiz-levels' value='16' />
Create 3 different worksheets, each holding several of the columns but no more than 16 per worksheet, and place them side by side in a single dashboard. That gives you one view with 40 columns.
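To see why 40 columns come out as 3 worksheets, here is a quick sketch of the split, assuming a cap of 16 columns per sheet. The field names and the helper are made up for illustration.

```python
def split_columns(fields, max_per_sheet=16):
    """Chunk the field list into consecutive groups of at most max_per_sheet."""
    return [fields[i:i + max_per_sheet] for i in range(0, len(fields), max_per_sheet)]

# Hypothetical field names standing in for the 40 report columns.
fields = [f"field_{n}" for n in range(1, 41)]
sheets = split_columns(fields)
print([len(sheet) for sheet in sheets])  # [16, 16, 8]
```

Each group would then become one worksheet on the dashboard.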

A good way to do this is to create groups and filters. I'm sure that, out of 40+ columns, a good number can be converted to one or the other, giving your dashboard a neater look and making your data easier to comprehend.
Let us assume you're creating a dashboard to show the overall split of mobile recharges for a company x.
One option is to have multiple columns, one each for:
the mobile OS
OS version
service provider
recharge rank
Sub-category (Prepaid / Postpaid)
...
The easier and more elegant way to reduce the number of columns is to populate a dropdown list with these values. Not only does this make the dashboard easier to comprehend, it also reduces the number of columns one has to refer to when interpreting the data and eases the technical limit on the number of columns.
To create a group in Tableau:
Include the fields in the result set, i.e. use the column(s) in the SELECT statement.
select os, os_version, service_provider, rank, subcategory ... from schema.recharge_table [where...];
In the sheet view of Tableau, right-click the field to create a group. Let's create a split on subcategory.
Group the sub-categories and give them proper aliases so they are easy to recognise.
Drag the group to Filters and you've successfully and elegantly removed one column.

16 is the maximum number of row/column labels in a Tableau table.

Put 20 columns on one sheet and 20 on the other. Drag and drop both sheets onto your dashboard, and you should have 40 columns.

Related

Is it possible to limit the number of rows in the output of a Dataprep flow?

I'm using Dataprep on GCP to wrangle a large file with a billion rows. I would like to limit the number of rows in the output of the flow, as I am prototyping a Machine Learning model.
Let's say I would like to keep one million rows out of the original billion. Is it possible to do this with Dataprep? I have reviewed the documentation on sampling, but that only applies to the input of the Transformer tool and not to the outcome of the process.
You can do this, but it takes a bit of extra work in your recipe: set up a formula in a new column using something like RANDBETWEEN to give you a random integer between 1 and 1,000 (in this million-out-of-a-billion case). From there, filter the rows on whichever integer between 1 and 1,000 you choose to keep, and your output will contain only that randomized subset. Just have the last step of the recipe remove this temporary column.
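Outside Dataprep, the idea behind that recipe looks roughly like the sketch below. It is plain Python, not Dataprep recipe syntax, and the function name, seed, and row source are made up for illustration. Each row gets a random bucket from 1 to 1,000 and only one bucket is kept, so about 1 row in 1,000 survives (approximately, not exactly).

```python
import random

def sample_rows(rows, buckets=1000, keep=1, seed=0):
    """Keep roughly 1/buckets of the rows by assigning each a random bucket."""
    rng = random.Random(seed)
    kept = []
    for row in rows:
        # Plays the role of the temporary RANDBETWEEN(1, buckets) column.
        bucket = rng.randint(1, buckets)
        if bucket == keep:
            kept.append(row)  # the bucket value itself is discarded
    return kept

# Stand-in for the big table: one million row ids instead of a billion.
subset = sample_rows(range(1_000_000))
print(len(subset))  # close to 1,000, but not exactly
```

The subset size varies from run to run unless the seed is fixed, which is the trade-off of this approach compared with taking the first N rows.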
So indeed there are 2 approaches to this.
As Courtney Grimes said, you can use one of the 2 functions that create random-number out of a range.
randbetween :
rand :
These methods can be used to slice an "even" portion of your data. As suggested, use randbetween(1,1000), then pick a single value x (1 <= x <= 1000) to filter on, because that keeps about 1/1000 of the data (a million out of a billion).
Alternatively, if you just want a million records in your output, but either
don't want to rely on knowing the size of the entire table, or
just want the first million rows, regardless of how many rows there are,
you can use 2 of these 3 row filtering methods (top rows / range).
P.S.
By understanding the $sourcerownumber metadata parameter (see the in-product documentation), you can filter / keep a portion of the data (as per the first scenario) in one step (i.e. without creating an additional column).
BTW, an easy way to discover how-tos in Trifacta is to type what you're looking for in the "search-transformation" pane (accessed via Ctrl-K). By searching "filter", you'll get most of the relevant options for your problem.
Cheers!

How to sum two different group by calculated fields in Tableau?

I have two calculated fields (HomeScore, AwayScore) and I grouped them by different dimensions (Home, Away). Now I have TotalRuns per team for both home games and away games. My problem is that I want the sum of TotalRuns per team overall, not separately for home games and away games, so I want to add these grouped fields together somehow. I attach a screenshot of my work. For example, the first column in both charts is "Arizona Diamondbacks", which has 263 runs in the first chart and 337 in the second. I want to show the 263+337=600 runs. Any ideas?
You'll want to create a LOD expression.
{FIXED [Team Name] : SUM([Total Runs])}
Think of your data as a big table (which it technically always is in Tableau). Every grouping, filter, etc. that you do narrows down the number of columns and rows you have left until you are left with your data set that contributes to your chart. LOD expressions allow you to back out of the filters, etc. in your calculation. In this case, you narrowed down to home or away games, and we are backing out of that to get a bigger picture of the data.
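To make the "backing out" idea concrete, here is a small Python sketch of what a per-team total amounts to: group by team only and ignore the home/away split. Only the Arizona figures (263 and 337) come from the question; the second team and the row layout are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (team, venue, runs). Only the Arizona numbers are from the question.
rows = [
    ("Arizona Diamondbacks", "Home", 263),
    ("Arizona Diamondbacks", "Away", 337),
    ("Hypothetical Team", "Home", 100),
    ("Hypothetical Team", "Away", 120),
]

def total_runs_per_team(rows):
    """Sum runs per team, ignoring whether the game was home or away."""
    totals = defaultdict(int)
    for team, _venue, runs in rows:
        totals[team] += runs
    return dict(totals)

totals = total_runs_per_team(rows)
print(totals["Arizona Diamondbacks"])  # 600
```

This assumes the data can be shaped so that each row carries a single team field, which is also what the [Team Name] in the expression above relies on.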

Tableau: show categories from a calculation even when a category is not visible

I have a calculation and it outputs multiple values. Then I am creating a table on those values. For example, in below data my formula is
if data is 1 then calculation is `one`
if data is 2 then calculation is `two`
if data is 3 then calculation is `three`
As three doesn't actually appear in the output, it is not displayed when I create a table. Is there any way to display it?
I tried Table Layout >> Show Empty Rows and Columns, and it didn't work.
data    calculation
1       one
2       two
Tableau discovers the possible values for a dimension field dynamically from the query results.
If ‘three’ does not appear in your data, then how do you expect Tableau to know to make a column header for that non existent, but potential, value? It can’t read your mind.
This situation does occur often though - perhaps you want row or column headers to remain stable, even when you change filters in a way that causes some to no longer appear in the query results.
There are a few ways you can force Tableau to pad or complete a domain:
One solution is to pad your data to make sure each value for your dimension field appears in at least one data row.
You can often do this easily by using a union to append some extra rows to your original data. You can often add padding rows that don’t impact any results by leaving all your Measure columns null, since nulls are ignored by aggregation functions.
Another common solution that takes a bit more effort is to make what is known as a scaffolding data source, which is not much more than a list of your dimension members. You can then use that data source as the primary data source with data blending, making your original data source secondary.
There are two situations where Tableau can detect the absence of data and leave space for it in the visualization automatically:
For numeric types, you can create a bin field that will automatically pad for missing bins.
Similarly, date fields can show missing values because, like bins, Tableau can tell when a month doesn’t appear in the data and leave room for it in the view.
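As a rough illustration of the first option (padding the data), the Python sketch below appends one row for each missing dimension value and leaves the measure null. The "amount" measure and the helper names are made up; the data/label mapping follows the question's example.

```python
LABELS = {1: "one", 2: "two", 3: "three"}  # mapping from the question's calculation

def pad_rows(rows, domain):
    """Append a row with a null measure for every domain value not in the data."""
    present = {row["data"] for row in rows}
    padding = [{"data": value, "amount": None} for value in domain if value not in present]
    return rows + padding

def sum_by_label(rows):
    """Sum the measure per label, skipping nulls the way an aggregation would."""
    totals = {}
    for row in rows:
        label = LABELS[row["data"]]
        totals.setdefault(label, None)
        if row["amount"] is not None:
            totals[label] = (totals[label] or 0) + row["amount"]
    return totals

# "amount" is a hypothetical measure column; the original data only has values 1 and 2.
rows = [{"data": 1, "amount": 10}, {"data": 2, "amount": 20}]
padded = pad_rows(rows, [1, 2, 3])
print(sum_by_label(padded))  # {'one': 10, 'two': 20, 'three': None}
```

The padded row makes "three" show up as a header while the existing totals stay unchanged.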

Tallying unknown words across columns in Tableau (or from comma separated column)

I have an issue that I have been trying to solve for the better part of a week now. I have a large database (in Google Sheets) representing case studies. I have some columns with multiple categories listed (in this example 'species', 'genera', and 'morphologies'), and I want to be able to tally how many times each category occurs in the data set.
I use Tableau to visualise the data, and the final output will be a large public Tableau dashboard. I know I can do a "find" based on a specific string, but I'd like the dataset to be dynamic and able to handle new data being added without my having to update calculated fields. Is there a way of finding unique terms (either from a single column of comma-separated values, or from multiple columns) and tallying them?
Things I have tried so far:
1 - A pivot table in Tableau. Works well, but messes with all the other data, since it repeats lines.
2 - A pivot table on its own data source in Tableau. Also works well, and avoids the problem of messing with the other data. However, now each figure is disconnected from the others so I can't do a large dashboard where everything is filtered by each other (ie filtering species and genera by country at the same time).
3 - An SQL query() in google sheets, which finds all unique terms and queries them, which can then be plotted in Tableau. Also works well, but similar problem of the data being disconnected from all the other terms in the dataset.
Any ideas of a field calculation that will find, list and tally unique terms in a single comma separated column (or across multiple columns), without changing the data structure?
I have placed a sample data set here (Google Sheets), which is a smaller version of what I'm actually working on. In it I have marked the comma-separated columns in grey, and they're followed by a set of columns with the same values split out one per column. I only need to analyse one of those forms (i.e. either a calculation that splits the comma-separated values or one that works across the multiple columns).
I've also added a sample Tableau workbook here.

Show calculated measure in row?

I'm using Tableau Desktop 9.0 on OSX. I have data (loaded from a local CSV file) that looks like this:
code,org,items
0212000AA,142,10
0212000AA,143,15
0313000AA,142,90
0314000AA,143,85
I want a chart that shows the number of items beginning with 0212 as a percentage of all items, for each organisation. (I mean as a percentage of the organisation's items - for example, in the above, I would like to show 0.1 (10/(10+90)) for organisation 142.)
I have been able to get part way there, by adding org to Columns, and SUM(items) to Rows. Then by adding a Wildcard filter on code, for starts with 0212.
This shows me the number of items starting with 0212, by organisation.
But what I don't know how to do is show this divided by the value of all items for the organisation.
Is this possible in Tableau, or do I need to pre-calculate it before loading my data source?
One way is to define a calculated field called matches_code_prefix as:
left(code, 4) = "0212"
You can also define a parameter called, say, code_prefix to avoid hard coding the prefix string:
left(code, 4) = code_prefix
And then show the parameter control for code_prefix to allow the user to interact with it.
If you use this new field as a dimension to separate SUM(items) according to those that match the prefix and those that don't, you can then use a quick table calculation to get the percent of total.
For example, you can place org on the Rows shelf, matches_code_prefix on the Columns shelf, and SUM(items) on the Text shelf to make a table. Then, under the Analysis menu, turn on grand totals for both rows and columns to see the behavior. Next, right-click on SUM(items) and choose Quick Table Calc->Percent of Total. Tableau will display the percents of total in the table.
If you want the percent of total defined differently than the default, then right click on the measure again and set Compute Using to a different value such as matches_code_prefix in your case. It's usually better to set compute using to a specific field.
If you only want to display the value for the matching case, select the column header you don't want to see and choose hide. You can also turn off the grand totals from the analysis menu when you are done.
When you are confident in the values in your table, you can turn it into a bar chart for example by moving matches_code_prefix to the detail shelf and the measure to the Columns shelf.
--
The above is the drag and drop approach. If you prefer to hard code everything in a single calculated field that is calculated on the database side, you could instead define a calculation such as:
zn(sum(if matches_code_prefix then items end)) / sum(items)
Then set the default number format for that field to display as a percentage.
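As a sanity check on the arithmetic, here is a small Python sketch that computes the same ratio from the sample CSV in the question. The function name is made up, and this is plain Python rather than anything Tableau runs.

```python
import csv
import io

CSV_TEXT = """code,org,items
0212000AA,142,10
0212000AA,143,15
0313000AA,142,90
0314000AA,143,85
"""

def share_by_org(csv_text, prefix="0212"):
    """Per organisation: items whose code starts with prefix, divided by all items."""
    matched, total = {}, {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        org, items = row["org"], int(row["items"])
        total[org] = total.get(org, 0) + items
        if row["code"].startswith(prefix):
            matched[org] = matched.get(org, 0) + items
    # Organisations with no matching rows get 0, like zn() in the calculation above.
    return {org: matched.get(org, 0) / total[org] for org in total}

print(share_by_org(CSV_TEXT))  # {'142': 0.1, '143': 0.15}
```

Organisation 142 comes out as 10/(10+90) = 0.1, matching the figure expected in the question.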