Calculated field in Tableau - tableau-api

I have a very simple problem, but I am totally new to Tableau, so I need some help solving it.
My data set contains the columns
Year_Track_4,Year_Track_5,Year_Track_6,Year_Track_7,.... N
Each Year_Track column contains 1/0 values: 1 means graduated and 0 means did not graduate (failed).
y4 y5 N
1 8
0 5
1 6
0 1
1 2
1 5
1 7
1 8
1 5
0 7
1 5
1 8
1 6
1 1
So, I want to create a placeholder in Tableau (a calculated field or a parameter) to select one year and count the number of students who graduated or did not graduate.
I need to create the same for OverAll_0 and OverAll_1 as one calculated field that contains the values 1 and 0, so that I can use SUM(N) to calculate the counts.

I used an IIF statement to try to solve this problem:
IIF(Year_Track_4 = 0) then 'graduated in 4 year '
.......
......


Is there a simple way in Pyspark to find out number of promotions it took to convert someone into customer?

I have a date-level promotion data frame that looks something like this:
ID  Date    Promotions  Converted to customer
1   2-Jan   2           0
1   10-Jan  3           1
1   14-Jan  3           0
2   10-Jan  19          1
2   10-Jan  8           0
2   10-Jan  12          0
Now I want to see how many promotions it took to convert someone into a customer.
For example, it took (2+3) promotions to convert ID 1 into a customer and (19) to convert ID 2.
E.g.:
ID  Date
1   5
2   19
I am unable to think of a way to solve it. Can you please help me?
@Corralien and mozway have helped with the solution in Python, but I am unable to implement it in PySpark because of the huge dataframe size (>1 TB).
You can use:
prom = (df.groupby('ID')['Promotions'].cumsum()
          .where(df['Converted to customer'].eq(1))
          .dropna().astype(int))
out = df.loc[prom.index, ['ID', 'Date']].assign(Promotion=prom)
print(out)
# Output
   ID    Date  Promotion
1   1  10-Jan          5
3   2  10-Jan         19
Use one groupby to generate a mask to hide the rows, then one groupby.sum for the sum:
mask = (df.groupby('ID', group_keys=False)['Converted to customer']
          .apply(lambda s: s.eq(1).shift(fill_value=False).cummax())
       )
out = df[~mask].groupby('ID')['Promotions'].sum()
Output:
ID
1 5
2 19
Name: Promotions, dtype: int64
Alternative output:
df[~mask].groupby('ID', as_index=False).agg(**{'Number': ('Promotions', 'sum')})
Output:
ID Number
0 1 5
1 2 19
If you potentially have groups without conversion to customer, you might want to also aggregate the "Converted to customer" column as an indicator:
mask = (df.groupby('ID', group_keys=False)['Converted to customer']
          .apply(lambda s: s.eq(1).shift(fill_value=False).cummax())
       )
out = (df[~mask]
       .groupby('ID', as_index=False)
       .agg(**{'Number': ('Promotions', 'sum'),
               'Converted': ('Converted to customer', 'max')
               })
       )
Output:
ID Number Converted
0 1 5 1
1 2 19 1
2 3 39 0
Alternative input:
ID Date Promotions Converted to customer
0 1 2-Jan 2 0
1 1 10-Jan 3 1
2 1 14-Jan 3 0
3 2 10-Jan 19 1
4 2 10-Jan 8 0
5 2 10-Jan 12 0
6 3 10-Jan 19 0 # this group has
7 3 10-Jan 8 0 # no conversion
8 3 10-Jan 12 0 # to customer
You want to compute something per ID, so a groupby on ID seems appropriate, e.g.
data.groupby("ID").apply(agg_fct)
Now write a separate function agg_fct which computes the result for a
dataframe consisting of only one ID.
Assuming the data are ordered by Date, I guess that
def agg_fct(df):
    # position of the first row where the customer converted
    index_of_conv = df["Converted to customer"].argmax()
    # the slice end is exclusive, so add 1 to include the conversion row itself
    return df.iloc[0:index_of_conv + 1, df.columns.get_loc("Promotions")].sum()
would be fine. You might want to make some adjustments for a customer who has never been converted: argmax returns 0 when no row equals 1, so such a group would be treated as if it converted on its first row.
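As a minimal pandas sketch of the adjustment mentioned above, the helper below returns None for an ID that never converts. The function name promotions_until_conversion and the choice of None as the "never converted" marker are assumptions made for illustration; the data is the alternative input shown earlier, and rows are assumed to be in date order within each ID.

```python
import pandas as pd

# Alternative input from above: ID 3 never converts.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Date": ["2-Jan", "10-Jan", "14-Jan"] + ["10-Jan"] * 6,
    "Promotions": [2, 3, 3, 19, 8, 12, 19, 8, 12],
    "Converted to customer": [0, 1, 0, 1, 0, 0, 0, 0, 0],
})

def promotions_until_conversion(group):
    """Sum promotions up to and including the first conversion row.

    Returns None when the group contains no conversion (hypothetical
    convention chosen for this sketch).
    """
    converted = group["Converted to customer"].to_numpy() == 1
    if not converted.any():
        return None
    first = int(converted.argmax())  # position of the first True
    return int(group["Promotions"].iloc[: first + 1].sum())

# One entry per ID; groups are visited in sorted ID order.
result = {
    int(customer_id): promotions_until_conversion(group)
    for customer_id, group in df.groupby("ID")
}
print(result)
```

This only illustrates the per-group logic on a small frame; it is not a PySpark implementation and does not address the >1 TB size mentioned in the question.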

MATLAB - Frequency of an array element with a condition

I need some help please. I have an array, as shown below, with 6 rows and 5 columns. No element repeats within any one row, and the elements are all single-digit numbers.
For the rows in which a given number, let's say 1, appears, I want to keep track of how often the other numbers appear in those rows. For example, 1 shows up 3 times, in rows one, three and five. In the rows where 1 shows up, 2 shows up one time, 3 shows up two times, 4 shows up two times, 5 shows up one time, 6 shows up two times, 7 shows up one time, 8 shows up three times, and 9 shows up zero times. I want to store this information in a vector that looks like V = [3,1,2,2,1,2,1,3,0], starting from a vector like N = [1,2,3,4,5,6,7,8,9].
ARRAY =
1 5 8 2 6
2 3 4 6 7
3 1 8 7 4
6 5 7 9 4
1 4 3 8 6
5 7 8 9 6
The code I have below does not give the result I am looking for. Can someone help please? Thanks
for i=1:length(ARRAY)
for j=1:length(N)
ARRAY(i,:)==j
V(j) = sum(j)
end
end
Using the values in A as column indices, create a 6 * 9 zero-one matrix whose [i,j]-th element is 1 if the i-th row of A contains j.
Then multiply the transpose of this zero-one matrix by the matrix itself to get the desired result:
A =[...
1 5 8 2 6
2 3 4 6 7
3 1 8 7 4
6 5 7 9 4
1 4 3 8 6
5 7 8 9 6]
% create a matrix the size of A in which each row contains its own row number
rowidx = repmat((1 : size(A,1)).' , 1 , size(A , 2))
% z_o is a 6 * 9 zero-one matrix whose [i,j]-th element is 1 if the i-th row of A contains j
z_o = full(sparse(rowidx , A, 1))
% multiply the transpose by the matrix to get the desired result; row and column j relate to the number N(j)
out = z_o.' * z_o
Result: each column relates to N, and the first row is the vector V requested for the number 1
3 1 2 2 1 2 1 3 0
1 2 1 1 1 2 1 1 0
2 1 3 3 0 2 2 2 0
2 1 3 4 1 3 3 2 1
1 1 0 1 3 3 2 2 2
2 2 2 3 3 5 3 3 2
1 1 2 3 2 3 4 2 2
3 1 2 2 2 3 2 4 1
0 0 0 1 2 2 2 1 2
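The same indicator-matrix idea can be sketched in Python with NumPy, as a rough translation rather than part of the original answer. The names indicator and cooccurrence are chosen here for illustration, and the values are assumed to be the integers 1 to 9 as in the question.

```python
import numpy as np

A = np.array([
    [1, 5, 8, 2, 6],
    [2, 3, 4, 6, 7],
    [3, 1, 8, 7, 4],
    [6, 5, 7, 9, 4],
    [1, 4, 3, 8, 6],
    [5, 7, 8, 9, 6],
])

n_values = 9  # values are assumed to lie in 1..9

# indicator[i, j] is 1 if row i of A contains the value j + 1
indicator = np.zeros((A.shape[0], n_values), dtype=int)
rows = np.repeat(np.arange(A.shape[0]), A.shape[1])  # row index of each element
indicator[rows, A.ravel() - 1] = 1

# cooccurrence[p, q] counts the rows that contain both p + 1 and q + 1
cooccurrence = indicator.T @ indicator

V = cooccurrence[0]  # counts for the rows that contain the value 1
print(V)
```

The diagonal of cooccurrence holds the plain row counts for each value, which is why the first entry of V is 3.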
I don't understand how you are approaching the problem with your sample code, but here is something that should work. It uses find, any and accumarray, and in each iteration of the loop it produces a V corresponding to the i-th element of N.
for i=1:length(N)
    rowIdx = find(any(A == N(i),2)); % Find all the rows that contain N(i)
    A_red = A(rowIdx,:); % Get only those rows
    V = [accumarray(A_red(:),1)]'; % Count occurrences of the 9 numbers
    V(end+1:9) = 0; % If some numbers don't exist, place zeros on their counts
end

Delete adjacent repeated terms

I have the following vector a:
a=[8,8,9,9,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8]
From a I want to delete all "adjacent" repetitions to obtain:
b=[8,9,1,2,3,4,5,6,7,8]
However, when I do:
unique(a,'stable')
ans =
8 9 1 2 3 4 5 6 7
You see, unique only returns the unique elements of a, so the second 8 is lost, whereas what I want is to delete only the adjacent "duplicates"... How do I do this?
It looks like a run-length-encoding problem (check here). You can modify Mohsen's solution to get the desired output. (i.e. I claim no credit for this code, yet the question is not a duplicate in my opinion).
Here is the code:
a =[8,8,9,9,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8]
F=find(diff([a(1)-1, a]));
Since diff(a) returns an array of length (length(a) - 1), we prepend a value (here a(1) - 1) to get a vector the same size as a. We subtract 1 so that, as mentioned by @surgical_tubing, the first difference is non-zero: find looks for non-zero elements, so this makes sure it also picks up the first element.
Hence diff([a(1)-1, a]) looks like this:
Columns 1 through 8
1 0 1 0 -8 0 1 0
Columns 9 through 16
1 0 1 0 1 0 1 0
Columns 17 through 20
1 0 1 0
Now, having found the positions where the value changes (the start of each run), we index back into a with the positions found by find:
newa=a(F)
and output:
newa =
Columns 1 through 8
8 9 1 2 3 4 5 6
Columns 9 through 10
7 8
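For comparison, the same diff-based idea can be sketched in Python with NumPy. This is an illustrative translation, not part of the original answer, and the name keep is chosen here. Instead of prepending a(1) - 1, the sketch marks the first element as kept explicitly.

```python
import numpy as np

a = np.array([8, 8, 9, 9, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8])

# Keep the first element, then every element that differs from its predecessor.
keep = np.concatenate(([True], np.diff(a) != 0))
b = a[keep]
print(b)
```

Only adjacent repeats are collapsed, so the 8 at the end survives even though an 8 already appears at the start.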

how to sort from lowest to largest in matlab [duplicate]

I have this numeric array:
A= [1 2 3 4
1 2 3 1
3 1 1 2
1 2 1 1
2 1 0 6
1 2 1 0]
I want to sort by the 4th column from smallest to largest, with the corresponding rows following to their new positions, something like this:
A =[1 2 1 0
1 2 3 1
1 2 1 1
3 1 1 2
1 2 3 4
2 1 0 6]
So the last row moves to the top, because the zero in its 4th column is the smallest number in that column. How can I do that? Thanks
This will do, with columnNumber = 4 for your example:
sortrows(A,columnNumber);
You can do this:
[~,order] = sort(A(:,4));
A = A(order,:);
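The sort-then-reindex approach above can be sketched in Python with NumPy, as an illustrative translation only. A stable sort is requested so that rows with equal values in the 4th column keep their original relative order, which matches the desired output in the question.

```python
import numpy as np

A = np.array([
    [1, 2, 3, 4],
    [1, 2, 3, 1],
    [3, 1, 1, 2],
    [1, 2, 1, 1],
    [2, 1, 0, 6],
    [1, 2, 1, 0],
])

# Row order that sorts the 4th column (index 3) in ascending order.
order = np.argsort(A[:, 3], kind="stable")
A_sorted = A[order]  # reorder whole rows with that permutation
print(A_sorted)
```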

Matlab: command for counting occurrences in ascending order not cumulatively?

This must have been asked before, but I cannot find it now. I want to count the zeros and put that count in a vector, then count the ones and append that count to the vector, and so on. If a value does not occur, its count should be zero.
Is there some command to do this counting in MATLAB?
Input ---> Output
0 1 1 1 2 3 3 4 7 ---> [1,3,1,2,1,0,0,1]
0 1 1 1 ---> 1 3
2 7 ----> 0 0 1 0 0 0 0 1
To get the total count of occurrences of each number, use histc:
x = [0 1 1 1 2 3 3 4 7]; %// example data
histc(x, 0:max(x))
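A rough Python counterpart, offered as a sketch rather than as part of the original answer, is NumPy's bincount. It counts occurrences of each non-negative integer from 0 up to the maximum value, with zeros for values that do not occur. The helper name count_from_zero is an assumption made for illustration.

```python
import numpy as np

def count_from_zero(values):
    """Count occurrences of 0, 1, ..., max(values) for non-negative integers."""
    return np.bincount(np.asarray(values, dtype=int))

counts_a = count_from_zero([0, 1, 1, 1, 2, 3, 3, 4, 7])
counts_b = count_from_zero([0, 1, 1, 1])
counts_c = count_from_zero([2, 7])
print(counts_a, counts_b, counts_c)
```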