kdb how to pass a column name into a function - kdb

As a simplifying example, I have
tbl:flip `sym`v1`v2!(`a`b`c`d; 50 280 1200 1800; 40 190 1300 1900)
and I d like to pass a column name into a function like
f:{[t;c];:update v3:2 * c from t;}
In this form it doesnt work. any suggestion how I can make this happen?
Thanks

Another option is to use the functional form of the update statement.
https://code.kx.com/q/ref/funsql/#functional-sql
q)tbl:flip `sym`v1`v2!(`a`b`c`d; 50 280 1200 1800; 40 190 1300 1900)
q)parse"update v3:2*x from t"
!
`t
()
0b
(,`v3)!,(*;2;`x)
q){![x;();0b;enlist[`v3]!enlist(*;2;y)]} [tbl;`v2]
sym v1 v2 v3
------------------
a 50 40 80
b 280 190 380
c 1200 1300 2600
d 1800 1900 3800

One option to achieve this is using # amend:
q){[t;c;n] #[t;n;:;2*t c]}[tbl;`v1;`v3]
sym v1 v2 v3
------------------
a 50 40 100
b 280 190 560
c 1200 1300 2400
d 1800 1900 3600
This updates the column c in table t saving the new value as column n. You could also alter this to allow you to pass in custom functions too:
{[t;c;n;f] #[t;n;:;f t c]}[tbl;`v1;`v3;{2*x}]

Related

SQL translation of IF statement to PowerBi Dax Query

I'm kinda new to DAX and PowerBi and I need to translate my SQL IF statement for whatever syntax is this on PowerBi to achieve the output I want.
Sql code I want to translate:
IF (Payment.payment>0) AND (Account.PV = Account.GV) THEN 1 ELSE 0
I want to make a calculated column on Payment table which will return 1 or 0 so that I can use this to filter all the records that meets my condition
account_id is the relationship of these two tables
Here is a sample data for reference: Account table
account_id
pv
gv
due_date
123
100
200
08/08/2022
124
200
200
08/09/2022
125
300
800
08/10/2022
126
400
670
08/11/2022
127
500
500
08/12/2022
128
600
600
08/13/2022
129
700
1000
08/14/2022
130
800
760
08/15/2022
131
900
900
08/16/2022
132
1000
1000
08/17/2022
133
1100
2300
08/09/2022
Here is a sample data for reference: Payment table
payment_id
payment_number
payment
payment_date
account_id
_test
101
554321
1000
03/01/2022
123
0
102
554322
1200
03/21/2022
124
1
103
554322
1100
04/28/2022
124
1
104
554323
2500
05/04/2022
131
1
105
554324
3000
05/14/2022
133
0
106
554325
3000
05/14/2022
132
1
107
554322
1200
03/21/2022
124
1
108
554323
2500
04/05/2022
131
1
109
554328
3100
04/05/2022
128
0
Codes I tried but I can't help myself to find the correct way to do it correctly and return the output that I need
_test = IF(Payments[payment]>0 && RELATED('Account'[PV])=RELATED('Account'[GV]), 1)
_test = IF(AND(Payments[payment])>0, RELATED('Account'[PV])=RELATED('Account'[GV])),1,0)
Any suggestion is much appreciated. Please recommend what kind of syntax/function should be used in order to achieve the output or what would be the work around to use other than IF statement
The problem that you are facing with RELATED is that RELATED only works from 1 side to many side.
Meaning, that if you bring the axis from 1-side and perform a calculation on the many side the filter works perfectly. Take a look at the direction of the filter below. The direction of the filter tells you on normal circumstances, you should bring your axis from Account and whatever calculation you perform on `Payment table it will work out.
But you are doing exactly the reverse. You are bringing the axis from Payment and hoping for RELATED to work. It won't cause the direction to be as such.
However, DAX is much more dynamic than that. If for whatever reason, you need to bring axis from many side where you need to still filter on 1-side, you can define a reverse filter direction on-the-fly (because DAX is magical) without needing to change anything in the data model by using CROSSFILTER. With CROSSFILTER you are customizing the filter direction as such
CROSSFILTER(<LEFtblcolumnName1>, <RIGHTtblcolumnName2>, <direction>)
This is how, (with your given dataset)
Column =
VAR cond1 =
CALCULATE (
MAX ( Account[Account.pv] ),
CROSSFILTER ( Payment[Payment.account_id], Account[Account.account_id], BOTH )
)
- CALCULATE (
MAX ( Account[Account.gv] ),
CROSSFILTER ( Payment[Payment.account_id], Account[Account.account_id], BOTH )
)
RETURN
IF ( cond1 == 0 && Payment[Payment.payment] > 0, 1, 0 )

How can I get the cumulative sum between a predefined number of days in SAS EG

I would like to find a way to calculate the cumulative sum between a predefined number of days.
Just to clarify, I am looking to calculate the cumulative sum between days. Those can be consecutive (e.g. 03AUG and 04AUG) or not (e.g. 04AUG and 06AUG).
Below the data I have:
data have;
input ID $ DT:date9. Amount;
format DT date9.;
datalines;
A 09JUL2021 3600
A 03AUG2021 456
A 04AUG2021 33
A 06AUG2021 235
A 07AUG2021 100
A 09AUG2021 86
A 12AUG2021 456
A 24AUG2021 22
A 25AUG2021 987
A 26AUG2021 916
A 27AUG2021 81
;
run;
I want to be able to create a new variable that shows the cumulative amount between 2 or more days.
I should be able every time to select if I want the cumulative amount between 2 days, or 3 days and so on.
Below the data I want, when I select to calculate the sum between two days:
data what_I_want;
input ID $ DT:date9. Amount Sum_Between_Days;
format DT date9.;
datalines;
A 09JUL2021 3600 0
A 03AUG2021 456 0
A 04AUG2021 33 489
A 06AUG2021 235 268
A 07AUG2021 100 335
A 09AUG2021 86 186
A 12AUG2021 456 0
A 24AUG2021 22 0
A 25AUG2021 987 0
A 26AUG2021 916 1925
A 27AUG2021 81 1984
;
run;
Below the data I want when I select to calculate the sum between 3 days:
data what_I_want;
input ID $ DT:date9. Amount Sum_Between_Days;
format DT date9.;
datalines;
A 09JUL2021 3600 0
A 03AUG2021 456 0
A 04AUG2021 33 0
A 06AUG2021 235 724
A 07AUG2021 100 0
A 09AUG2021 86 186
A 12AUG2021 456 0
A 24AUG2021 22 0
A 25AUG2021 987 0
A 26AUG2021 916 0
A 27AUG2021 81 2006
;
run;
Hopefully, I make sense, but please let me know if not..
Thanks in advance.

remove description lines and add time to the first column

AWk experts, I have a file as descried below and I wonder if it is possible to easily convert it to the form that I want:
The file containing multiple variables over one month (one observance ONLY in one day, but some days may be missing). The format for each day is the same except the date/value. However there is some description lines (containing words and numbers) at the end of each day, and the number of description lines varies among different days.
KBO BTA Observations at 12Z 01 Feb 2020
-----------------------------------------------------------------------------
PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
hPa m C C % g/kg deg knot K K K
-----------------------------------------------------------------------------
1000.0 92
925.0 765
850.0 1516
754.0 2546 13.0 9.3 78 9.85 150 2 310.2 340.6 312.0
752.0 2569 14.0 9.2 73 9.80 149 2 311.5 342.0 313.4
700.0 3173 -9.20 7.5 89 9.38 120 6 312.6 341.9 314.4
Station information and sounding indices
Station elevation: 2546.0
Lifted index: 1.83
Pres [hPa] of the Lifted Condensation Level: 693.42
1000 hPa to 500 hPa thickness: 5798.00
Precipitable water [mm] for entire sounding: 21.64
8022 KBO BTA Observations at 00Z 02 Feb 2020
-----------------------------------------------------------------------------
PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
hPa m C C % g/kg deg knot K K K
-----------------------------------------------------------------------------
1000.0 97
925.0 758
850.0 1515
753.0 2546 10.8 6.8 76 8.30 190 3 307.9 333.4 309.5
750.0 2580 12.6 7.9 73 8.99 186 3 310.2 338.1 311.9
Here is what I want: remove all the description lines and read the date/time information and put it as the first column.
Time PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
20200201t12Z 754.0 2546 13.0 9.3 78 9.85 150 2 310.2 340.6 312.0
20200201t12Z 752.0 2569 14.0 9.2 73 9.80 149 2 311.5 342.0 313.4
20200201t12Z 700.0 3173 -9.2 7.5 89 9.38 120 6 312.6 341.9 314.4
20200202t00Z 753.0 2546 10.8 6.8 76 8.30 190 3 307.9 333.4 309.5
20200202t00Z 750.0 2580 12.6 7.9 73 8.99 186 3 310.2 338.1 311.9
Any help is appreciated.
Kelly
something like this...
$ awk 'function m(x)
{return sprintf("%02d",int(index("JanFebMarAprMayJunJulAugSepOctNovDec",x)-1)/3+1)}
NR==1 {print "time PRES TEMP WDIR WSPD RELH"}
/^-+$/ {f=!f}
f {date=p[n] m(p[n-1]) p[n-2]}
!f {n=split($0,p)}
NF==11 && !/[^ 0-9.-]/ {print date,$0}' file | column -t
time PRES TEMP WDIR WSPD RELH
20200201 1000 10 230 5 90
20200201 900 9 200 6 85
20200201 800 9 100 6 87
20200202 1000 9.2 233 5 90
20200202 900 9.1 200 4 80
20200202 800 9 176 2 80
Explanation
function just returns the month number from the month string by looking up the index of and converting to formatted number
f keeps track of the dashed lines so that from the previous line we can parse the date,
finally to find the data lines the heuristic is number of fields and no non-number signs (digits, spaces, dots or negative signs).
$ cat tst.awk
/^-+$/ && ( ((++dashCnt) % 2) == 1 ) {
mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec",p[n-1])+2)/3
time = sprintf("%04d%02d%02d", p[n], mthNr, p[n-2])
}
/^[[:upper:][:space:]]+$/ && !doneHdr++ { print "Time", $0 }
/^[0-9.[:space:]]+$/ { print time, $0 }
{ n = split($0,p) }
.
$ awk -f tst.awk file | column -t
Time PRES TEMP WDIR WSPD RELH
20200001 1000 10 230 5 90
20200001 900 9 200 6 85
20200001 800 9 100 6 87
20200002 1000 9.2 233 5 90
20200002 900 9.1 200 4 80
20200002 800 9 176 2 80

Accumulate all values at every point in time by symbol

I have this table:
execs:([]time:til 12;sym:`a`b`c`a`c`c`c`b`b`a`b`c;leavesQty:(1000;900;1300;800;1200;900;600;800;400;300;200;100))
I have different syms and for each at some time a leavesQty. And now I want to extend the table this way that at every row I get the sum of all leavesQty entries by sym at this time.
So I would have to come up to these values for this example:
execs:([]time:til 12;sym:`a`b`c`a`c`c`c`b`b`a`b`c;leavesQty:(1000;900;1300;800;1200;900;600;800;400;300;200;100);accLeavesQty:(1000;1900;3200;3000;2900;2600;2300;2200;1800;1300;1100;600))
You add this column with a single update statement if you use fby:
q)update accLeavesQty:sums (deltas;leavesQty) fby sym from execs
time sym leavesQty accLeavesQty
-------------------------------
0 a 1000 1000
1 b 900 1900
2 c 1300 3200
3 a 800 3000
4 c 1200 2900
5 c 900 2600
6 c 600 2300
7 b 800 2200
8 b 400 1800
9 a 300 1300
10 b 200 1100
11 c 100 600
Firstly you want to get the deltas of the leaves Qunatity for each symbol so can see how the value changes over time. After that you just need to do a cumulative sum of the resulting column.
q)update sums accLeavesQty from update accLeavesQty:deltas leavesQty by sym from execs
time sym leavesQty accLeavesQty
-------------------------------
0 a 1000 1000
1 b 900 1900
2 c 1300 3200
3 a 800 3000
4 c 1200 2900
5 c 900 2600
6 c 600 2300
7 b 800 2200
8 b 400 1800
9 a 300 1300
10 b 200 1100
11 c 100 600
You have a nice case for fby
q)update accLeavesQty:sums (deltas;leavesQty) fby sym from execs
time sym leavesQty accLeavesQty
-------------------------------
0 a 1000 1000
1 b 900 1900
2 c 1300 3200
3 a 800 3000
4 c 1200 2900
5 c 900 2600
6 c 600 2300
7 b 800 2200
8 b 400 1800
9 a 300 1300
10 b 200 1100
11 c 100 600
Another method involves recursion:
update accLeavesQty:sum each #[;;:;]\[()!();sym;leavesQty] from execs
It keeps a running dictionary of the last accLeavesQty for each sym and then calculates the sum of each of them
q)update accLeavesQty:#[;;:;]\[()!();sym;leavesQty] from execs
time sym leavesQty accLeavesQty
---------------------------------------
0 a 1000 (,`a)!,1000
1 b 900 `a`b!1000 900
2 c 1300 `a`b`c!1000 900 1300
3 a 800 `a`b`c!800 900 1300
4 c 1200 `a`b`c!800 900 1200
5 c 900 `a`b`c!800 900 900
6 c 600 `a`b`c!800 900 600
7 b 800 `a`b`c!800 800 600
8 b 400 `a`b`c!800 400 600
9 a 300 `a`b`c!300 400 600
10 b 200 `a`b`c!300 200 600
11 c 100 `a`b`c!300 200 100

kdb voolkup. get value from table that is mapped to smallest val larger than x

Assuming I have a dict
d:flip(100 200 400 800 1600; 1 3 4 6 10)
how can I create a lookup function that returns the value of the smallest key that is larger than x? Given a table
tbl:flip `sym`val!(`a`b`c`d; 50 280 1200 1800)
I would like to do something like
{[x] : update new:fun[x[`val]] from x} each tbl
to end up at a table like this
tbl:flip `sym`val`new!(`a`b`c`d; 50 280 1200 1800; 1 4 10 0N)
sym val new
a 50 1
b 280 4
c 1200 10
d 1800
stepped dictionaries may help
http://code.kx.com/q/cookbook/temporal-data/#stepped-attribute
q)d:`s#-0W 100 200 400 800 1600!1 3 4 6 10 0N
q)d 50 280 1200 1800
1 4 10 0N
I think you will want to use binr to return the next element greater than or equal to x. Note that you should use a sorted list for this to work correctly. For the examples above, converting d to a dictionary with d:(!). flip d I came up with:
q)k:asc key d
q)d k k binr tbl`val
1 4 10 0N
q)update new:d k k binr val from tbl
sym val new
------------
a 50 1
b 280 4
c 1200 10
d 1800
Where you get the dictionary keys to use with: k k binr tbl`val.
Edit: if the value in the table needs to be mapped to a value greater than x but not equal to, you could try:
q)show tbl:update val:100 from tbl where i=0
sym val
--------
a 100
b 280
c 1200
d 1800
q)update new:d k (k-1) binr val from tbl
sym val new
------------
a 100 3
b 280 4
c 1200 10
d 1800