Running percentage calculation in tableau - tableau-api

I have calculated a running total as shown below, but when I try to calculate the running percentage, it gives the wrong values.
Quarter  Status
         Closed  Closed %  Open  Open %  Total  Total %
Q1       16      21.62     58    78.38   74     100
Q2       29      17.57     119   82.34   148    100
Q3       29                191   100     220    100
The % values displayed are the actual percentages of each cell's own count, not percentages calculated on the running total count.
How do I fix this?
Expected output:
Quarter  Status
         Closed  Closed %  Open  Open %    Total  Total %
Q1       16      21.62162  58    78.37838  74     100
Q2       29      19.59459  119   80.40541  148    100
Q3       29      13.18182  191   86.81818  220    100
I have tried all of the % of total options.

Have you tried Edit Table Calculation and setting the calculation to restart every [Quarter]? Right click on the table calculation in the view to set different "Compute Using" fields.
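For reference, the expected figures in the question are each running count divided by the running total for the same quarter. The following is a minimal Python sketch of that arithmetic only (it is not Tableau syntax, and the function name running_percent is made up for illustration):

```python
def running_percent(closed, open_):
    """Return (closed %, open %) per quarter from running counts."""
    result = []
    for c, o in zip(closed, open_):
        total = c + o  # running total for this quarter
        result.append((round(100 * c / total, 5), round(100 * o / total, 5)))
    return result

# Running counts per quarter, copied from the question's table.
closed = [16, 29, 29]
open_ = [58, 119, 191]

percents = running_percent(closed, open_)
for quarter, (c_pct, o_pct) in enumerate(percents, start=1):
    print(f"Q{quarter}: closed {c_pct}%  open {o_pct}%")
```

In Tableau terms, this suggests dividing one running-total table calculation by another rather than applying a percent-of-total to the per-quarter counts.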


How can I efficiently convert the output of one KDB function into three table columns?

I have a function that takes as input some of the values in a table and returns a tuple if you will - three separate return values, which I want to transpose into the output of a query. Here's a simplified example of what I want to achieve:
multiplier:{(x*2;x*3;x*4)};
select twoX:multiplier[price][0], threeX:multiplier[price][1], fourX:multiplier[price][2] from data;
The above basically works (I think I've got the syntax right for the simplified example - if not then hopefully my intention is clear), but is inefficient because I'm calling the function three times and throwing away most of the output each time. I want to rewrite the query to only call the function once, and I'm struggling.
Update
I think I missed a crucial piece of information in my explanation of the problem which affects the outcome - I need to get other data in the query alongside the output of my function. Here's a hopefully more realistic example:
multiplier:{(x*2;x*3;x*4)};
select average:avg price, total:sum price, twoX:multiplier[sum price][0], threeX:multiplier[sum price][1], fourX:multiplier[sum price][2] by category from data;
I'll have a go at adapting your answers to fit this requirement anyway, and apologies for missing this bit of information. The real function is a proprietary and fairly complex algorithm and the real query has about 30 output columns, hence the attempt at simplifying the example :)
If you're just looking for the results themselves you can extract (exec) as lists, create dictionary and then flip the dictionary into a table:
q)exec flip`twoX`threeX`fourX!multiplier[price] from ([]price:til 10)
twoX threeX fourX
-----------------
0 0 0
2 3 4
4 6 8
6 9 12
8 12 16
10 15 20
12 18 24
14 21 28
16 24 32
18 27 36
If you need other columns from the original table too then it's trickier, but you could join the tables sideways using ,'
q)t:([]price:til 10)
q)t,'exec flip`twoX`threeX`fourX!multiplier[price] from t
An apply @ can also achieve what you want. Here data is just a table with 10 random prices. @ is then used to apply the multiplier function to the price column while also assigning a column name to each of the three resulting lists:
q)data:([] price:10?100)
q)multiplier:{(x*2;x*3;x*3)}
q)@[data;`twoX`threeX`fourX;:;multiplier data`price]
price twoX threeX fourX
-----------------------
80 160 240 240
24 48 72 72
41 82 123 123
0 0 0 0
81 162 243 243
10 20 30 30
36 72 108 108
36 72 108 108
16 32 48 48
17 34 51 51
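The idea shared by both answers is to call the function once and then pair each returned list with a column name. A rough Python analogue of that pattern, offered only as an illustration (the dict-of-lists "table" and this multiplier are stand-ins, not the asker's real function or kdb+ itself):

```python
def multiplier(prices):
    """Hypothetical stand-in: returns three lists from one pass over prices."""
    return ([p * 2 for p in prices],
            [p * 3 for p in prices],
            [p * 4 for p in prices])

prices = list(range(10))

# One call only; zip pairs each returned list with its column name.
new_columns = dict(zip(("twoX", "threeX", "fourX"), multiplier(prices)))

# Keep the original column alongside the new ones (like joining with ,').
table = {"price": prices, **new_columns}

for name, column in table.items():
    print(name, column)
```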

Replace values within matlab matrix using column values from another matrix

I have a big matrix (8656x25960) with some speckle noise within it. I used the findpeaks tool in order to find which columns have peaks above a certain threshold. The output of the findpeaks tool is a matrix containing all of the bad columns, for example -
loc =
Columns 1 through 6
30 51 155 307 333 338
Columns 7 through 12
642 955 1409 1567 1728 1730
Columns 13 through 18
2332 2546 2615 2685 2806 2995
Columns 19 through 24
3002 3122 3124 3164 3690 4176
Columns 25 through 30
4430 4475 4539 5142 5155 5244
Columns 31 through 36
5246 5941 5943 6114 6486 6922
Columns 37 through 42
7165 7169 7460 7587 7647 8944
Columns 43 through 44
12754 13693
How can I use those column numbers with the original matrix to replace the values in these 'bad' columns with the value 0 (for example)?
Hoping I'm clear enough.
For a row vector loc, simply use indexing:
yourmatrix(:,loc) = 0;
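The same column-indexing idea can be sketched outside MATLAB. This plain-Python version is only an illustration: the helper name zero_columns and the tiny matrix are invented, and the main thing it shows is the shift from MATLAB's 1-based column numbers to 0-based positions.

```python
def zero_columns(matrix, cols_one_based):
    """Return a copy of matrix with the given 1-based columns set to 0."""
    bad = {c - 1 for c in cols_one_based}  # convert to 0-based positions
    return [[0 if j in bad else value for j, value in enumerate(row)]
            for row in matrix]

# Small made-up matrix; columns 2 and 4 play the role of the 'bad' columns.
matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
cleaned = zero_columns(matrix, [2, 4])
print(cleaned)
```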

Checking if value exists in a matrix and getting its columns

I have a 500x500 matrix with values ranging from 1-100.
I need to look at 5 rows at a time and see if those 5 rows contain values that are greater than 75. I then need to get the index of the first column where the value is greater than 75 and the index of the last column where the value is greater than 75.
So far, I have the following:
i = 1;
while i < size(data,1)
if (i + 5) <= size(data,1)
if any(envNoClutterscansV(i:i + 5, 1:500) > 75)
% do something
end
end
i = i + 5;
end
The idea here is that I am looking at 5 rows at a time. For every 5 rows, I'm looking through all the columns to see if there are values that meet my criteria. So far, this doesn't find any values, even though I'm sure that my dataset contains the values. Additionally, I am not sure what to do from here.
I think the trouble might be that the result of any in the above code is a vector of 500 true and false values, and an if statement only runs its body when every element of such a vector is true. You should sum them if you want to respond every time there are values larger than 75:
if sum(any(envNoClutterscansV(i:i + 5, 1:500) > 75))
If you want to speed it up, you can avoid the loop and vectorize it, for example like this:
data = [
11 76 25 44 55 75;
11 75 95 44 85 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 90 25 44 55 88;
11 0 25 44 55 0;
91 0 25 44 55 80;
];
% Getting the number of rows
nRows=size(data,1);
% Getting a logical matrix marking all the cells that are above the threshold
cellsOverTreshold=data>75;
% Getting a logical index to all the rows that contain values above
% the threshold
matchingRows=any(cellsOverTreshold,2);
% In the next line of code, "reshape" rearranges the data to put in columns
% the values associated with each group of 5 rows.
% So column 1 holds group one, corresponding to data rows 1,2,3,4,5,
% column 2 holds group two, corresponding to data rows 6,7,8,9,10,
% and so on.
% Now we can get all the row groups that have values above the threshold
matchingRowGroups=find(any(reshape(matchingRows,5,[])));
% Now we put each row in a cell array to be able to operate row-wise
cellRows = num2cell(cellsOverTreshold, 2);
% We now get the first and last column over the threshold for each row
firstColumOfRow = cellfun(@(x)find(x,1,'first'), cellRows,'UniformOutput',false);
lastColumOfRow = cellfun(@(x)find(x,1,'last'), cellRows,'UniformOutput',false);
% We replace the empty cells with NaNs so we can convert them to vectors
% without losing the indexing
firstColumOfRow(~matchingRows)={NaN};
lastColumOfRow(~matchingRows)={NaN};
% We rearrange the data as above and get the minimum of the first columns
% of each group, that is, the first column of the group above the threshold
firstColInGroup=nanmin(reshape([firstColumOfRow{:}]',5,[]));
% With the maximum of the last columns we get the last column of each group
lastColInGroup=nanmax(reshape([lastColumOfRow{:}]',5,[]));
% We finally keep only the data of the groups that have at least one
% element above the threshold
firstColInGroup=firstColInGroup(matchingRowGroups);
lastColInGroup=lastColInGroup(matchingRowGroups);
In this way the variable "matchingRowGroups" holds the indices of each group of 5 rows that matches. The variable "firstColInGroup" holds the first matching column for each group and "lastColInGroup" the last one.
In addition to my previous answer, here is another vectorized option. It avoids converting the data into cell arrays and avoids cellfun too, so it is probably faster. Here it is:
data = [
11 76 25 44 55 75;
11 75 95 44 85 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 90 25 44 55 88;
11 0 25 44 55 0;
91 0 25 44 55 80;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 84 55 0;
11 0 25 44 55 0;
];
% Getting the number of rows and columns
[nRows, nCols]=size(data);
% Getting a logical matrix marking all the cells that are above the threshold
cellsOverTreshold=data>75;
% Getting a logical index to all the rows that contain values above
% the threshold
matchingRows=any(cellsOverTreshold,2);
% In the next line of code, "reshape" rearranges the data to put in columns
% the values associated with each group of 5 rows.
% So column 1 holds group one, corresponding to data rows 1,2,3,4,5,
% column 2 holds group two, corresponding to data rows 6,7,8,9,10,
% and so on.
% Now we can get all the row groups that have values above the threshold
matchingRowGroups=find(any(reshape(matchingRows,5,[])))
% We find the row and column of the first and of the last cell above the
% threshold in each row
[firstRow, firstCol]=find(cumsum(cumsum(cellsOverTreshold,2),2)==1);
[lastRow, lastCol]=find(cumsum(cumsum(cellsOverTreshold,2,'reverse'),2,'reverse')==1);
% Sort this data into vectors with one value per row, leaving NaNs for rows
% with no element above the threshold
firstColumOfRow=NaN(nRows,1);
lastColumOfRow=NaN(nRows,1);
firstColumOfRow(firstRow)=firstCol;
lastColumOfRow(lastRow)=lastCol;
% We rearrange the data as above and get the minimum of the first columns
% of each group, that is, the first column of the group above the threshold
firstColInGroup=nanmin(reshape(firstColumOfRow,5,[]));
% With the maximum of the last columns we get the last column of each group
lastColInGroup=nanmax(reshape(lastColumOfRow,5,[]));
% We finally keep only the data of the groups that have at least one
% element above the threshold
firstColInGroup=firstColInGroup(matchingRowGroups)
lastColInGroup=lastColInGroup(matchingRowGroups)
This code looks at 5 rows at a time. It uses find to locate the values > 75 and ind2sub to convert the indices returned by find to rows (ignored) and columns (cols).
data = [
11 76 25 44 55 78;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 0 25 44 55 88;
11 0 25 44 55 0;
11 0 25 44 55 0;
];
for row = 1:5:size(data, 1)
fprintf('Row %d - %d\n', row, row+4);
indices = find(data(row:row+4,:) > 75);
if ~isempty(indices)
[~, cols] = ind2sub([5 size(data, 2)], indices);
col_min = min(cols);
col_max = max(cols);
fprintf('Column: %d and %d\n', col_min, col_max);
end
end
After thinking a bit more, here is yet another solution that is simpler, faster and more compact. See my first solution for more details on the naming of variables, but they are quite self-explanatory:
data = [
11 76 25 44 55 75;
11 75 95 44 85 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 90 25 44 55 88;
11 0 25 44 55 0;
91 0 25 44 55 80;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 84 55 0;
11 0 25 44 55 0;
];
% Getting the number of rows and columns
[nRows, nCols]=size(data);
% We create arrays with the row and column numbers of each element
[colNum,rowNum]=meshgrid(1:nCols,1:nRows);
% Set to NaN the column numbers that do not exceed the threshold
colNum(data<=75)=NaN;
% Get the group number of each element
groupNum=ceil(rowNum/5);
% The matching groups are those that have at least one non-NaN element
matchingRowGroups = accumarray(groupNum(:),colNum(:),[],@(x)any(~isnan(x)))
% We get the minimum of the column numbers above the threshold in each group
firstColumOfGroup = accumarray(groupNum(:),colNum(:),[],@nanmin)
% We get the maximum of the column numbers above the threshold in each group
lastColumOfGroup = accumarray(groupNum(:),colNum(:),[],@nanmax)
The only difference from the previous solutions is that matchingRowGroups is a logical index, and firstColumOfGroup and lastColumOfGroup have one entry per group, instead of entries only for groups with elements above the threshold. Groups with no entry above the threshold have NaN values.
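As a language-neutral cross-check of what all of these answers compute, here is a minimal plain-Python sketch. It assumes the 10-row sample data from the loop-based answer, a group size of 5 and a threshold of 75; the function name first_last_columns is invented for illustration, and columns are reported 1-based to match MATLAB.

```python
def first_last_columns(data, group_size=5, threshold=75):
    """Map 1-based group number -> (first, last) 1-based column above threshold."""
    result = {}
    for start in range(0, len(data), group_size):
        # Collect the 1-based column of every value above the threshold
        # within this block of rows.
        cols = [j + 1
                for row in data[start:start + group_size]
                for j, value in enumerate(row)
                if value > threshold]
        if cols:  # skip groups with no value above the threshold
            result[start // group_size + 1] = (min(cols), max(cols))
    return result

data = [
    [11, 76, 25, 44, 55, 78],
    [11, 75, 25, 44, 55, 75],
    [11, 75, 25, 44, 55, 75],
    [11, 75, 25, 44, 55, 75],
    [11, 75, 25, 44, 55, 75],
    [11, 0, 25, 44, 55, 0],
    [11, 0, 25, 44, 55, 0],
    [11, 0, 25, 44, 55, 88],
    [11, 0, 25, 44, 55, 0],
    [11, 0, 25, 44, 55, 0],
]
groups = first_last_columns(data)
print(groups)
```

Note that each block here spans exactly 5 rows (start to start + 4), whereas the i:i + 5 range in the question covers 6 rows.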

Need to calculate until a specific date in tableau?

There are three columns, date, x, y
I need to calculate the running sum/total of y up to a specific date (today's date, more specifically). The data is in two data sources and looks like this in the first data source:
DATE X Z
5-Sep
6-Sep 26 101
7-Sep 27 100
8-Sep 28 99
9-Sep 29 98
10-Sep 30 98
11-Sep 30 98
12-Sep 30 97
13-Sep 31 96
14-Sep 32 95
15-Sep 33 94
16-Sep 34 93
17-Sep 35 92
18-Sep 35 92
and like this in the second data source:
DATE Y
5-Sep 166
6-Sep 182
7-Sep 130
8-Sep 93
9-Sep 107
10-Sep 95
11-Sep 128
12-Sep 173
13-Sep 154
14-Sep 136
15-Sep 79
16-Sep 61
17-Sep 156
18-Sep 66
Let's say that today's date is 17th Sep, then I need to calculate the running sum of 'Z' until today and display it next to the 'X' column. Something like this:
17-Sep 35 1499.
How do I do that?
(I tried using sets on the date, limiting the date to today, but then the running sum doesn't work. There are also some errors in the calculated field because the data is in two different sources.)
Please ask if you need more clarification.
Using the Superstore data, I created a date parameter, then created a calculated field as follows:
if [date param] >= [Order Date] then [Sales] end
This will display sales only for dates on or before the date selected in the parameter. I also created a filter calc to only see data on or before the selected date in the param.
[date param]>=[Order Date]
Place this in the filter shelf and select True.
Now place the date field on Rows and your sales calculated field on Text. Right click on it and select Quick Table Calculation > Running Total.
See sample workbook here: https://www.dropbox.com/s/p42tx86v4qidlvn/170327%20stack%20question.twbx?dl=0
EDIT:
If you just want to see the total and the date selected, create a calc field for "last" as last() then filter that for zero.
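The logic of this answer (keep only rows on or before the parameter date, then total them) can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not Tableau syntax. The rows are the first four from the question's second data source; the year 2017 and the cutoff of 7-Sep are assumptions made for the example.

```python
from datetime import date

def running_total_until(rows, cutoff):
    """Sum the values of all rows dated on or before cutoff."""
    return sum(value for day, value in rows if day <= cutoff)

# (date, Y) pairs; the year is assumed, since the question gives only day and month.
rows = [
    (date(2017, 9, 5), 166),
    (date(2017, 9, 6), 182),
    (date(2017, 9, 7), 130),
    (date(2017, 9, 8), 93),
]

cutoff = date(2017, 9, 7)  # stands in for the [date param] parameter
total = running_total_until(rows, cutoff)
print(total)
```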

Calculate Variance of a Group data

I have a table containing heights and frequencies. I want to calculate its variance.
Height 140 150 160 170 180 190
Frequency 3 5 57 63 30 2
I have tried the below code:
height=[140 150 160 170 180 190;3 5 57 63 30 2]
height=height(:)
V = var(height) %Calculate Variance
This gives an answer of 5.7316e+03,
while the formula gives an answer of 81.8594. How can I do this?
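Use the weighted variance, with the frequencies as weights, applied to the original 2-by-6 matrix (before the height=height(:) reshape):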
h=height;
var(h(1,:),h(2,:))
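To see where 81.8594 comes from, the frequency-weighted variance can be written out by hand. This is a minimal Python sketch that assumes normalization by the total frequency (the population form); the function name weighted_variance is made up for illustration.

```python
def weighted_variance(values, weights):
    """Frequency-weighted variance, normalized by the sum of the weights."""
    n = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / n
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / n

heights = [140, 150, 160, 170, 180, 190]
frequencies = [3, 5, 57, 63, 30, 2]

variance = weighted_variance(heights, frequencies)
print(round(variance, 4))
```

With these numbers the weighted mean is 167.375 and the variance matches the value the asker obtained from the formula.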