Different Aggregation calculations of a measure using two dimensions in Tableau - tableau-api

This is a Tableau 8.3 Desktop Edition question.
I am trying to aggregate a measure across two different dimensions, in two steps: first sum over all the rows, then multiply the results cumulatively (so I can build a graph). How do I do that? That is probably too vague, so here are some more details:
I have a set of historical data. The columns are the dates and the rows are the categories.
Easy part: I would like to sum all the rows.
Hard part: given those sums, I want to build a graph that shows, for each date, the product of all the sums from the earliest date up to that date.
In other words:
Take the sum of all rows, call it x_i, where i is the date.
For each date i, find y_i such that y_i = x_0 * x_1 * ... * x_i (if data is missing for a date, treat its x as one)
Then show a line graph for the y values versus the date.
I have searched for a solution for this and tried to figure it out by myself, but failed.
Thank you very much for your time and help :)

You need n calculated fields (one per column you have), and you have to write out each calculation by hand:
y_i = sum(field0)*sum(field1)
This is because you cannot iterate over columns. In Tableau, each column represents a different dimension or measure, so it does not assume any logical order among them: it will not assume that column A comes before column B, only that A and B are different things.
Tableau works better with tables organized the way they would be in a database. So if you have one column per year, you should reorganize your data: eliminate all those columns and create a single field called 'Date' that identifies which date each value of your measure belongs to. Yes, you will have fewer columns but far more rows, but Tableau works better this way (for very good reasons).
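To illustrate that reshaping outside Tableau, here is a minimal Python sketch. The category names, the date labels, and the unpivot helper are all hypothetical and only stand in for the asker's real data.

```python
# Hypothetical wide table: one row per category, one column per date.
wide_rows = [
    {"Category": "A", "2015-01": 1.0, "2015-02": 2.0},
    {"Category": "B", "2015-01": 3.0, "2015-02": None},  # None marks a missing value
]


def unpivot(rows, id_field="Category"):
    """Turn each date column into its own (Category, Date, Value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is copied onto every output row instead
            long_rows.append(
                {"Category": row[id_field], "Date": column, "Value": value}
            )
    return long_rows


long_rows = unpivot(wide_rows)
for r in long_rows:
    print(r)
```

Two wide rows with two date columns become four long rows, each carrying a single 'Date' and a single value, which is the shape the answer recommends.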
Tableau 9.0 allows you to do that directly. I have only watched a demo (it was launched yesterday), but I understand there is now an option to select those columns (in the Data Connection tab) and convert them to a database format.
With that done, you can use the PREVIOUS_VALUE function to help you. I don't have Tableau at hand right now; as soon as I do, I'll update this with the final answer. Unless you take the lead and work it out yourself before that ;)
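Until then, the target numbers can be checked outside Tableau. This is a rough Python sketch of the calculation the question describes (sum per date, then a cumulative product, with missing dates counted as one). The rows and date labels are made up, and it is not the Tableau formula itself.

```python
from collections import defaultdict
from itertools import accumulate
from operator import mul

# Hypothetical long-format rows: (date, category, value).
rows = [
    ("2015-01", "A", 1.0), ("2015-01", "B", 1.0),
    ("2015-02", "A", 1.5),
    ("2015-04", "A", 2.0), ("2015-04", "B", 1.0),
]
# Every date on the axis; 2015-03 has no data at all.
dates = ["2015-01", "2015-02", "2015-03", "2015-04"]

# x_i: sum over all categories for each date.
sums = defaultdict(float)
for date, _category, value in rows:
    sums[date] += value

# A date with no data counts as 1, so it leaves the product unchanged.
x = [sums[d] if d in sums else 1.0 for d in dates]

# y_i = x_0 * x_1 * ... * x_i
y = list(accumulate(x, mul))

for d, y_i in zip(dates, y):
    print(d, y_i)
```

Plotting y against dates gives the line graph asked for. A table calculation built on PREVIOUS_VALUE would be expected to produce the same series once the data is in long format.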

Related

How to pass a vector from tableau to R

I need to pass a vector of arguments to Rserve from Tableau. Specifically, I am using IRR calculations in R (on Rserve), and I want to pass a vector of cash flows that are stored as columns in my table (instead of rows/measures). So I want to collect all those cash flows in a vector and pass it on to Rserve. Passing them one at a time slows down IO.
SCRIPT_REAL("r_func(c(.arg1, .arg2, .arg3))",sum(cf1), sum(cf2), sum(cf3))
cf1..cfn are cash flows corresponding to various periods. The code above works well when there are only a few cash flows but takes a long time when I have a few hundred. Further, the time is spent not in the calculation but in IO when communicating with the remote Rserve. With a local Rserve, this calculation finishes in under a few seconds, while on a remote one it takes well over a minute.
I also want to point out that Tableau/Rserve sets one argument after another, and that takes time. My expectation is that once I have a vector, it would take just one transfer to set the arguments, and therefore this should speed things up.
The first step in understanding how Tableau interacts with R or Python is understanding how Tableau's table calcs work.
Tableau's SCRIPT_XXX() functions are table calculations, which means that you invoke them on a vector of aggregate query results, and the corresponding R or Python code needs to return a vector, usually of the same size. (I think you may be able to return a scalar or smaller vector which gets replicated to appear like a vector of the same size as the argument -- but I am not certain.)
You can control how your data is partitioned into vectors, and also the ordering of data in the vectors, by editing the table calc to specify the partitioning and addressing for that calc.
Partitioning determines how your aggregate query results are broken up into vectors for calculation purposes. Addressing determines how the elements of each vector are ordered. You can either do that based on the physical layout of the table structure, or (better) based on the specific dimensions.
See the Tableau online help for table calcs for more info, and look for online training videos from Tableau or blog entries (especially from anyone named Bora).
One way to test your understanding of these concepts is to create a Tableau table (i.e., a viz with a mark type of text) with several dimensions on the row and column shelves. Then create calculated fields for INDEX() and SIZE() and display them on text. Finally, change the partitioning and addressing in different ways by editing those table calcs. Try several different permutations. When you can confidently predict what those functions will produce for different settings, then you're ready to do more complex tasks, such as talking to R.
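If it helps to build intuition first, here is a small Python sketch that mimics what INDEX() and SIZE() report when the partitioning and addressing change. It only models the idea and is not how Tableau is implemented. The region/year marks and the index_and_size helper are hypothetical.

```python
from itertools import groupby

# Hypothetical aggregate results: one row per (region, year) mark.
marks = [
    {"region": "East", "year": 2014},
    {"region": "East", "year": 2013},
    {"region": "West", "year": 2014},
    {"region": "West", "year": 2013},
    {"region": "West", "year": 2012},
]


def index_and_size(marks, partition_by, address_by):
    """Mimic INDEX() and SIZE(): one vector per partition, ordered by the addressing field."""
    ordered = sorted(marks, key=lambda m: (m[partition_by], m[address_by]))
    result = []
    for _key, group in groupby(ordered, key=lambda m: m[partition_by]):
        vector = list(group)  # the vector a table calc would receive
        for position, mark in enumerate(vector, start=1):
            result.append({**mark, "INDEX": position, "SIZE": len(vector)})
    return result


# Partition by region, address along year: vectors of size 2 and 3.
for row in index_and_size(marks, "region", "year"):
    print(row)

# Swap the roles: partition by year, address along region.
for row in index_and_size(marks, "year", "region"):
    print(row)
```

The same five marks produce different INDEX and SIZE values depending only on which field partitions and which one addresses, which is the behaviour worth being able to predict before sending vectors to R.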
It is also instructive to experiment with FIRST(), LAST(), LOOKUP(), WINDOW_SUM() etc -- and finally dig into PREVIOUS_VALUE(). Warning, PREVIOUS_VALUE() is a bit odd, and does not behave the way you probably assume it does. Still, it is a useful technique that can implement a recursive calculation, and is about as close to a for loop as Tableau gets.

How to sort by any measure in a Tableau table

I've built a new worksheet that has two dimensions and several facts. When I try to sort on any column, it only seems to sort within the dimensions. Is it possible to sort based on the column, ignoring dimensions? I find if I concatenate the two dimensions into one... that does work, but is not ideal.
Ah yes, sorting in Tableau. Took me a long time to understand it. It doesn't do sorting the way you would expect from other tools like Excel. This is because it groups dimensions from left to right. Think of each dimension as nested inside the one to the left of it. Another way to think of it is that Tableau doesn't sort measures; it sorts dimensions based on the value of a measure. That's why concatenating dimensions yields the expected result: you have just one calculated dimension, and that dimension gets sorted by the value of a measure. You can right-click on the concatenated dimension in your Rows shelf and choose Show Header. That's probably your best bet.
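The difference between the two behaviours can be sketched in Python. The regions, products, and sales figures below are invented, and the code only imitates the nesting idea described above, not Tableau's actual sort engine.

```python
# Hypothetical rows: two dimensions and one measure.
rows = [
    ("East", "Chairs", 50),
    ("East", "Tables", 10),
    ("West", "Chairs", 30),
    ("West", "Tables", 70),
]

# Nested sort: the outer dimension stays grouped, and the measure only
# orders the inner dimension within each outer group.
nested = sorted(rows, key=lambda r: (r[0], -r[2]))

# Concatenated dimension: one combined key, so the measure can order
# every row globally.
combined = sorted(
    ((f"{region} - {product}", sales) for region, product, sales in rows),
    key=lambda r: -r[1],
)

print(nested)
print(combined)
```

In the nested result the West rows can never move above the East rows, no matter how large their sales are. In the combined result "West - Tables" comes first because there is only one dimension left to sort.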
See this article from The Information Lab on the sorting in Tableau: https://www.theinformationlab.co.uk/2014/11/03/understanding-sorting-tableau/
There are some Tableau Community posts about it too.
https://community.tableau.com/thread/118958
https://community.tableau.com/thread/221956
https://community.tableau.com/thread/164714

Tableau Dual Axis with different filters

I am trying to create a graph with two lines, with two filters from the same dimension.
I have a dimension which has 20+ values. I'd like one line to show data based on just one of the selected values and the other line to show a line excluding that same value.
I've tried the following:
-Creating a duplicate/copy dimension and filtering the original one with the first, and the copy with the 2nd. When I do this, the graphic disappears.
-Creating a calculated field that tries to split the measures up. This isn't letting me track the count.
I want this on the same axis; the best I've been able to do is create two sheets, one with the first filter and one with the 2nd, and stack them in a dashboard.
My end user wants the lines in the same visual, otherwise I'd be happy with the dashboard approach. Right now, though, I'd also like to know how to do this.
It is a little hard to tell exactly what you want to achieve, but the problem with filtering is common.
The principle that is important is that Tableau will filter the whole dataset by row. So duplicating the dimension you want to filter won't help as the filter on the original dimension will also filter the corresponding rows in the second dimension. Any solution has to be clever enough to work around this issue.
One solution is to build two new dimensions that use a calculation rather than a filter to create the new result. Let's say you have a dimension, [size], that has a range of numbers from 1 to 10, and you want to compare the total number of rows including and excluding the number 5. You could create a new field using a formula like:
if [size] <> 5 then 1 else 0 end
Summing the new field gives the number of rows that don't contain a 5, and this can be compared directly to a row count of the original [size] field, which gives the number including the value 5.
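The same flag-and-sum idea, sketched in Python with made-up [size] values, shows why both numbers can sit side by side without any filter:

```python
# Hypothetical values of the [size] dimension, one per data row.
sizes = [1, 5, 3, 5, 7, 10, 5, 2]

# Row-level flag, mirroring: if [size] <> 5 then 1 else 0 end
not_five_flags = [1 if size != 5 else 0 for size in sizes]

count_excluding_5 = sum(not_five_flags)  # rows where size is not 5
count_including_5 = len(sizes)           # every row, nothing filtered out

print(count_excluding_5, count_including_5)
```

Because no rows are removed, both measures are computed from the same data and can be plotted as two lines on one axis.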
This basic principle can be extended to much more complex logic. The essential point is to realise that filters act on every row in your data and can't, by themselves, show comparisons with alternative filter choices on a single visualisation.
Depending on the nature of your problem there may be other solutions worth looking at including sets and groups but you would need to provide more specific details for users here to tell you whether they would be useful.
We can make a set out of the values of the dimension and then place it on the required shelf. You will then have your dimension, which plots as usual, and the set, which shows the data you need. A filter cannot give you that independence, because it restricts everything in the view at once.

Multiply all rows in a Tableau table chart

This seems pretty simple, but I can't seem to find a way to do this.
I need to multiply all rows in a chart, or all columns, whichever is easiest. In other words, I am looking for something like the PRODUCT() function in Excel.
Any ideas on how to accomplish this?
EDIT: Row values may change, so this needs to be a dynamic calculation. Like a function to aggregate all values into a product of the values.
A calculated field can be created with: sum([Sales])*PREVIOUS_VALUE(1)
This gives the running product. Then, that field can be inserted into the table.
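For anyone unsure how that formula unfolds row by row, here is a minimal Python imitation. The sales values and the running_product helper are hypothetical. The seed argument plays the role of the 1 passed to PREVIOUS_VALUE, which is what the calculation uses on the first row.

```python
import math

# Hypothetical SUM([Sales]) result for each row of the table, in table order.
sales_by_row = [2.0, 3.0, 0.5, 4.0]


def running_product(values, seed=1.0):
    """Imitate SUM([Sales]) * PREVIOUS_VALUE(1) along the addressing order."""
    results = []
    previous = seed  # on the first row there is no earlier result, so the seed is used
    for value in values:
        previous = value * previous  # this row's value times the previous row's result
        results.append(previous)
    return results


products = running_product(sales_by_row)
print(products)

# The last running value is the product of every row, like Excel's PRODUCT().
print(products[-1] == math.prod(sales_by_row))
```

If only the overall product is needed rather than the running one, the value on the last row is the one to display.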
You can certainly create a calculated field that will multiply various columns together. Something along the lines of [Column1] * [Column2] * [Column3] will generate a new calculated measure that is the product of all three columns.
That being said, if you're doing that much data manipulation within Tableau, you should probably be giving some hard thought as to why that's necessary. While calculations are certainly possible and new Tableau 9 features such as level of detail functions make doing calcs on measures not present in your viz easier, Tableau is primarily a data presentation layer. Data manipulation apart from simple calcs and pivot/unpivot operations should be done upstream. Doing advanced manipulations within Tableau, while sometimes/often possible, can be very hard to debug and reproduce.

Date calculation - Tableau

The data submitted to us contains only YTD numbers. I'm wondering how I could display the differences between consecutive periods along the Date field.
I.e., if I want to show the MTD movement for March, I have to take March less February.
Now I know I can do this for individual measure fields. But having around 40+ measures seems a bit tedious.
http://kb.tableausoftware.com/articles/knowledgebase/creating-ytd-mtd-calculations
I tried to enter "Measure Values" but that is not a valid measure to put in the calculation.
Is there a way to set up a custom dimension?
Thanks,
Gem
After days of research: this can't be done in Tableau unless you want to labour for a week creating an almost cell-by-cell calculation. Transforming the data in SQL is a more feasible solution.
I had pivoted the data previously in SQL, so that I ended up with 1 measure column instead of 40+. That lets you minimise the number of calculated fields, so you don't have to repeat the calculation for each individual measure.
Works well. Not for ratios though, as you will need to extract the individual measures again so that you can divide them by each other. It has pros and cons. The number of rows in the DB also multiplies.
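As a rough illustration of why the pivoted layout helps, one generic routine can derive the MTD movement for every measure at once. This is a Python sketch rather than the SQL actually used. The measure names and figures are invented, and it assumes all rows belong to a single year.

```python
# Hypothetical long-format YTD rows after pivoting: (measure, month, ytd_value).
ytd_rows = [
    ("Revenue", 1, 100.0), ("Revenue", 2, 250.0), ("Revenue", 3, 400.0),
    ("Costs", 1, 40.0), ("Costs", 2, 90.0), ("Costs", 3, 150.0),
]


def ytd_to_mtd(rows):
    """Subtract the previous month's YTD value within each measure."""
    mtd = {}
    previous = {}  # last YTD value seen per measure
    for measure, month, ytd in sorted(rows, key=lambda r: (r[0], r[1])):
        # The first month of a measure has nothing to subtract, so MTD equals YTD.
        mtd[(measure, month)] = ytd - previous.get(measure, 0.0)
        previous[measure] = ytd
    return mtd


mtd = ytd_to_mtd(ytd_rows)
for key in sorted(mtd):
    print(key, mtd[key])
```

The loop never names an individual measure, so adding a 41st measure means adding rows, not another calculation.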
Another solution that preserves the table structure is to use temp tables and do the calculations across several temp tables.