Need to calculate until a specific date in tableau?

There are three columns: date, x and y.
I need to calculate the running total of y up to a specific date (today's date, to be precise). The data is split across two data sources. In the first data source it looks like this:
DATE X Z
5-Sep
6-Sep 26 101
7-Sep 27 100
8-Sep 28 99
9-Sep 29 98
10-Sep 30 98
11-Sep 30 98
12-Sep 30 97
13-Sep 31 96
14-Sep 32 95
15-Sep 33 94
16-Sep 34 93
17-Sep 35 92
18-Sep 35 92
and like this in the second data source:
DATE Y
5-Sep 166
6-Sep 182
7-Sep 130
8-Sep 93
9-Sep 107
10-Sep 95
11-Sep 128
12-Sep 173
13-Sep 154
14-Sep 136
15-Sep 79
16-Sep 61
17-Sep 156
18-Sep 66
Let's say that today's date is 17 Sep. I then need to calculate the running sum of 'Z' until today and display it next to the 'X' column, something like this:
17-Sep 35 1499
How do I do that?
(I tried using sets on the date, limiting the date to today, but then the running sum doesn't work. There are also some errors in the calculated field because the data is in two different sources.)
Please ask if you need more clarification.

Using the Superstore sample data, I created a date parameter and then a calculated field as follows:
if [date param] >= [Order Date] then [Sales] end
This returns sales only for rows dated on or before the selected parameter value. I also created a filter calculation so that only data up to the selected date is shown:
[date param]>=[Order Date]
Place this on the Filters shelf and select True.
Now place the date field on Rows and the sales calculated field on Text. Right-click it and select Quick Table Calculation > Running Total.
See sample workbook here: https://www.dropbox.com/s/p42tx86v4qidlvn/170327%20stack%20question.twbx?dl=0
EDIT:
If you just want to see the total and the selected date, create a calculated field named "last" defined as last(), then filter it to zero so that only the final row remains.
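For readers who want to check the logic outside Tableau, here is a minimal Python sketch of the same idea: keep the rows dated on or before a cutoff, then accumulate. The Y values are the ones from the question; the function name running_total_until and the year 2017 are assumptions made only for illustration.

```python
from datetime import date, timedelta
from itertools import accumulate

# Y values from the question's second data source, 5-Sep through 18-Sep.
Y_VALUES = [166, 182, 130, 93, 107, 95, 128, 173, 154, 136, 79, 61, 156, 66]
START = date(2017, 9, 5)  # the year is an assumption; the question gives none
ROWS = [(START + timedelta(days=i), y) for i, y in enumerate(Y_VALUES)]

def running_total_until(rows, cutoff):
    """Return (date, running total) pairs for rows dated on or before cutoff."""
    kept = sorted((d, v) for d, v in rows if d <= cutoff)
    totals = accumulate(v for _, v in kept)
    return [(d, t) for (d, _), t in zip(kept, totals)]

result = running_total_until(ROWS, date(2017, 9, 17))
# The final pair plays the role of the row kept by the last() = 0 filter.
last_date, last_total = result[-1]
```

Filtering before accumulating mirrors the order used above: the date filter removes later rows first, and the running total is computed over what is left.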

Filling a calendar using Arrayformula or LOOKUP

I've made a calendar sheet and would like to fill it using an Arrayformula or some kind of Lookup.
The problem is that the formula in each cell is different. Do I need them all to be the same, or is it possible to write an Arrayformula that applies a different formula to each line?
I spent ages getting the calendar code working but would now like to simplify the code and I'm not sure what my next step should be:
https://docs.google.com/spreadsheets/d/1u_J7bmOFyDlYXhcL5dW3CHFJ1esySAKK_yPc6nFTdLA/edit?usp=sharing
Any advice would be much appreciated.

I've added a new sheet in your file called 'Aresvik'.
The green cells have new formulas.
Cell B3 can be =date(B1,1,1)
Then each successive month can be =eomonth(B3,0)+1, =eomonth(J3,0)+1 etc.
The date formula in cell B5 is:
=arrayformula(iferror(vlookup(sequence(7,7,1),{array_constrain(sequence(40,1),day(eomonth(B3,0))+weekday(B3,3),1),query({flatten(split(rept(",",day(eomonth(B3,0))-1),",",0,0));sequence(day(eomonth(B3,0)),1,1)},"offset "&day(eomonth(B3,0))-weekday(B3,3)&" ",0)},2,false),))
It can be copied to each other cell below Mo, so B5 will change to J5, R5, Z5 etc.
Notes
The concept revolves around using the SEQUENCE function to generate a grid of numbers, 6 rows, 7 columns:
sequence(6,7)
which looks like this:
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31 32 33 34 35
36 37 38 39 40 41 42
Then using these numbers in a VLOOKUP to get a corresponding date for the calendar. If the first of the month falls on a Thursday (April 2021), the vlookup range needs 3 gaps at the top of the list of dates. player0 has a more elegant solution than my original query using offset, so I've incorporated it below. Cell Z3 is the date 1/4/2021:
=arrayformula(
iferror(
vlookup(sequence(6,7),
{sequence(day(eomonth(Z3,0))+weekday(Z3,2),1,0),
{iferror(sequence(weekday(Z3,2),1)/0,);sequence(day(eomonth(Z3,0)),1,Z3)}},
2,false)
,))
The first column in the vlookup range is:
sequence(day(eomonth(Z3,0))+weekday(Z3,2),1,0)
which is an array of numbers from 0, corresponding with the number of days in the month plus the number of gaps before the 1st day.
The second column in the vlookup range is:
{iferror(sequence(weekday(Z3,2),1)/0,);sequence(day(eomonth(Z3,0)),1,Z3)}
It is a single column built from two arrays in this format: {x;y}, where y sits below x because of the ;.
These are the gaps: iferror(sequence(weekday(Z3,2),1)/0,), followed by the date numbers: sequence(day(eomonth(Z3,0)),1,Z3)
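(The example below is May 2021, where the 1st falls on a Saturday, so rows 0 to 5 are gaps and 44317 is the serial number for 1 May 2021):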
0
1
2
3
4
5
6 44317
7 44318
8 44319
9 44320
10 44321
11 44322
12 44323
13 44324
14 44325
15 44326
16 44327
17 44328
18 44329
19 44330
20 44331
21 44332
22 44333
23 44334
24 44335
25 44336
26 44337
27 44338
28 44339
29 44340
30 44341
31 44342
32 44343
33 44344
34 44345
35 44346
36 44347
The vlookup takes each number in the initial sequence (6x7 layout), and brings back the corresponding date from col2 in the range, based on a match in col1.
When the first day of the month is a Monday, iferror(sequence(weekday(BB1,2),1)/0,) generates a gap in col2 of the vlookup range. This is why col1 in the vlookup range has to start with 0.
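The lookup idea described above can be sketched in Python to make the indexing easier to follow. This is only an illustration under stated assumptions: month_grid is a hypothetical helper, isoweekday() stands in for WEEKDAY(date, 2) (Monday = 1 ... Sunday = 7), and the grid holds day numbers rather than date serials.

```python
from datetime import date
import calendar

def month_grid(year, month):
    """Build a 6x7 Monday-first grid of day numbers, with None for blanks."""
    first = date(year, month, 1)
    gaps = first.isoweekday()                 # same numbering as WEEKDAY(d, 2)
    days = calendar.monthrange(year, month)[1]
    # Keys 0..gaps-1 are the blank rows of the lookup range (simply absent
    # here); day 1 sits at key `gaps`, day 2 at `gaps + 1`, and so on.
    lookup = {gaps + i: i + 1 for i in range(days)}
    # Cells are numbered 1..42 like SEQUENCE(6, 7); misses become None.
    return [[lookup.get(r * 7 + c + 1) for c in range(7)] for r in range(6)]

april_2021 = month_grid(2021, 4)
```

Because the cell numbers start at 1 while the lookup keys start at 0, a month starting on a Monday (one gap, at key 0) still places day 1 in the first cell.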
I've updated the sheet at https://docs.google.com/spreadsheets/d/1u_J7bmOFyDlYXhcL5dW3CHFJ1esySAKK_yPc6nFTdLA/edit#gid=68642071
The values on the calendar are full dates, so the cells need the custom number format d to show only the day of the month.
If you want numbers, then use:
=arrayformula(
iferror(
vlookup(sequence(6,7),
{sequence(day(eomonth(Z3,0))+weekday(Z3,2),1,0),
{iferror(sequence(weekday(Z3,2),1)/0,);sequence(day(eomonth(Z3,0)),1)}},
2,false)
,))

shorter solution:
=INDEX(IFNA(VLOOKUP(SEQUENCE(6, 7), {SEQUENCE(DAY(EOMONTH(B3, ))+WEEKDAY(B3, 2), 1, ),
{IFERROR(ROW(INDIRECT("1:"&WEEKDAY(B3, 2)))/0); SEQUENCE(DAY(EOMONTH(B3, )), 1, B3)}}, 2, )))

Running percentage calculation in tableau

I have calculated a running total as below, but when I try to calculate the running %, it gives the wrong values:
Quarter   Status
          Closed   Closed %   Open   Open %   Total   Total %
Q1        16       21.62      58     78.38    74      100
Q2        29       17.57      119    82.43    148     100
Q3        29                  191    100      220     100
The % values displayed are the percentages of each quarter's own count, not percentages calculated on the running total counts.
How do I fix this?
Expected output:
Quarter   Status
          Closed   Closed %   Open   Open %     Total   Total %
Q1        16       21.62162   58     78.37838   74      100
Q2        29       19.59459   119    80.40541   148     100
Q3        29       13.18182   191    86.81818   220     100
I have tried all of the Percent of Total options.

Have you tried Edit Table Calculation and setting the calculation to restart every [Quarter]? Right click on the table calculation in the view to set different "Compute Using" fields.
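To make the expected numbers concrete, here is a small Python sketch of the arithmetic the question is after: accumulate the counts first, then divide each running count by the running total. The per-quarter counts (16, 13, 0 closed and 58, 61, 72 open) are not stated in the question; they are inferred here as differences of the running totals shown there, and running_percentages is a hypothetical helper name.

```python
from itertools import accumulate

def running_percentages(closed, opened):
    """Return (closed, closed %, open, open %, total) rows on running counts."""
    rows = []
    for c, o in zip(accumulate(closed), accumulate(opened)):
        total = c + o  # assumed non-zero for every quarter
        rows.append((c, 100 * c / total, o, 100 * o / total, total))
    return rows

# Per-quarter counts inferred from the running totals in the question.
rows = running_percentages([16, 13, 0], [58, 61, 72])
```

The percentages in the question's first table come from dividing the per-quarter counts instead, which is why Q2 shows 17.57 rather than the expected 19.59459.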

How can I efficiently convert the output of one KDB function into three table columns?

I have a function that takes as input some of the values in a table and returns a tuple if you will - three separate return values, which I want to transpose into the output of a query. Here's a simplified example of what I want to achieve:
multiplier:{(x*2;x*3;x*4)};
select twoX:multiplier[price][0], threeX:multiplier[price][1], fourX:multiplier[price][2] from data
The above basically works (I think I've got the syntax right for the simplified example - if not then hopefully my intention is clear), but is inefficient because I'm calling the function three times and throwing away most of the output each time. I want to rewrite the query to only call the function once, and I'm struggling.
Update
I think I missed a crucial piece of information in my explanation of the problem which affects the outcome - I need to get other data in the query alongside the output of my function. Here's a hopefully more realistic example:
multiplier:{(x*2;x*3;x*4)};
select average:avg price, total:sum price, twoX:multiplier[sum price][0], threeX:multiplier[sum price][1], fourX:multiplier[sum price][2] by category from data
I'll have a go at adapting your answers to fit this requirement anyway, and apologies for missing this bit of information. The real function is a proprietary and fairly complex algorithm and the real query has about 30 output columns, hence the attempt at simplifying the example :)

If you're just looking for the results themselves you can extract (exec) as lists, create dictionary and then flip the dictionary into a table:
q)exec flip`twoX`threeX`fourX!multiplier[price] from ([]price:til 10)
twoX threeX fourX
-----------------
0 0 0
2 3 4
4 6 8
6 9 12
8 12 16
10 15 20
12 18 24
14 21 28
16 24 32
18 27 36
If you need other columns from the original table too then it's trickier, but you could join the tables sideways using ,'
q)t:([]price:til 10)
q)t,'exec flip`twoX`threeX`fourX!multiplier[price] from t

An amend (@) can also achieve what you want. Here data is just a table with 10 random prices. @ is then used to apply the multiplier function to the price column while also assigning a column name to each of the three resulting lists:
q)data:([] price:10?100)
q)multiplier:{(x*2;x*3;x*4)}
q)@[data;`twoX`threeX`fourX;:;multiplier data`price]
price twoX threeX fourX
-----------------------
80    160  240    320
24    48   72     96
41    82   123    164
0     0    0      0
81    162  243    324
10    20   30     40
36    72   108    144
36    72   108    144
16    32   48     64
17    34   51     68
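For the grouped case in the question's update, the "call once, then unpack" idea can be sketched in Python as follows. This is not q and not a drop-in answer; summarise and the sample rows are hypothetical, and multiplier is the simplified stand-in from the question.

```python
from collections import defaultdict

def multiplier(x):
    """Simplified stand-in from the question: returns three values at once."""
    return (x * 2, x * 3, x * 4)

def summarise(rows):
    """Group (category, price) rows and call multiplier once per group."""
    groups = defaultdict(list)
    for category, price in rows:
        groups[category].append(price)
    out = {}
    for category, prices in groups.items():
        total = sum(prices)
        two_x, three_x, four_x = multiplier(total)  # single call, unpacked
        out[category] = {
            "average": total / len(prices),
            "total": total,
            "twoX": two_x,
            "threeX": three_x,
            "fourX": four_x,
        }
    return out

# Hypothetical sample data, not taken from the question.
summary = summarise([("a", 10), ("a", 20), ("b", 5)])
```

The point of the sketch is only that the expensive function runs once per group and its tuple result is spread across the output columns afterwards.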

Creating a matrix for a given value of a given value - For loop issues

I'm not sure if the title is very clear.
What I am trying to do is analyze a huge amount of data and produce a singular matrix for each ID.
The data comes in the form:
1001 00101 150
1001 00102 146
1001 00103 145
......
1001 19401 178
1001 19402 194
Each line is laid out by character position: ID (1:4), Day (6:8), Half hour within a 24 hour period (9:10), Usage (12:end)
e.g. ID=1001 Day=001 Half Hour=01 Usage=150
The ID, Day and Half Hour values follow a strict pattern, whereas the usage is the measured value.
I am trying to output
ID Value - Average usage per half hour
1001 01 150
1001 02 160
1001 03 173
1001 04 194
.... .. ...
1001 48 150
.... .. ...
1100 48 147
I've broken the data down into its components, but I am having trouble computing the average usage for each half hour; I keep getting trapped in for loops with no end product.
My base code currently examines the first line and extracts each component:
fid = fopen('test.txt');
tline = fgetl(fid);
ID=tline(1:4);
disp(ID);
Day=tline(6:8);
disp(Day);
HalfHour=tline(9:10);
disp(HalfHour);
Usage=tline(12:end);
disp(Usage);
However, I am really struggling to extend this to the entire data set and produce the specified output.
Any help would be much appreciated.
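One way to think about the aggregation, sketched here in Python rather than MATLAB: accumulate a sum and a count per (ID, half hour) key in a single pass, then divide. The helper name average_usage is hypothetical, the column positions follow the layout described in the question, and the sketch assumes every non-blank line is a well-formed record. The last two sample lines (day 002) are invented for illustration.

```python
from collections import defaultdict

def average_usage(lines):
    """Average usage per (ID, half hour) across all days."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue                      # skip blank lines
        meter_id = line[0:4]              # characters 1:4
        half_hour = line[8:10]            # characters 9:10
        usage = float(line[11:])          # characters 12:end
        acc = sums[(meter_id, half_hour)]
        acc[0] += usage
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}

sample = [
    "1001 00101 150",
    "1001 00102 146",
    "1001 00201 170",  # invented line for a second day
    "1001 00202 154",  # invented line for a second day
]
averages = average_usage(sample)
```

With a real file, the same function could be fed the open file object directly, since it only iterates over lines.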

How to count dates in Stata and adding index numbers to every date while setting a certain date to index 0

I want to number/index observations in my Stata dataset by date with the following logic: if the eventdate = observation date --> assign index (or counting number) 0, give n-1 n-2 ... for the previous dates and n+1 , n+2 .... for the following dates.
I checked the help but I couldn't find a convincingly helpful answer.

You do not define n, but using the same guess as @Metrics, the variable you ask for is just
. gen diff = cond(current_date == event_date, 0, current_date)

This is not the most efficient way, but I think it should work:
use http://dss.princeton.edu/training/tsdata.dta
gen date1=substr(date,1,7)
gen datevar=quarterly(date1,"yq")
format datevar %tq
browse date date1 datevar
tsset datevar
gen time=_n
gen index=0
*say your event date is 1996q1
replace index=time if tin(1957q1,1995q4)|tin(1996q2,2005q1)
list datevar time index gdp in 150/160
gdp datevar time index
.5369373 1994q2 150 150
.6064236 1994q3 151 151
-.0578999 1994q4 152 152
.0906607 1995q1 153 153
.3216962 1995q2 154 154
1.020297 1995q3 155 155
.4759386 1995q4 156 156
.5014071 1996q1 157 0
1.115691 1996q2 158 158
.2851675 1996q3 159 159
1.187331 1996q4 160 160
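For comparison, the relative numbering the question describes (0 at the event, negative before, positive after) can be sketched in Python as position minus event position. This is an illustration only, not Stata; event_index is a hypothetical helper and it assumes the dates are unique, sortable and include the event date.

```python
def event_index(dates, event_date):
    """Map each date to its offset from the event date in sorted order."""
    ordered = sorted(dates)
    event_pos = ordered.index(event_date)  # raises ValueError if absent
    return {d: i - event_pos for i, d in enumerate(ordered)}

# Quarter labels in this form sort correctly as plain strings.
quarters = ["1995q3", "1995q4", "1996q1", "1996q2", "1996q3"]
index = event_index(quarters, "1996q1")
```

The offsets count observations rather than calendar time, so gaps in the data would not show up as jumps in the index.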