How to update kdb table transversely - kdb

I have these two tables:
tab:([]col1:`abc`def`ghe`abc;val_00:`a`b`c`e;val_01:`d`e`f`t;val_02:`g`h`e`g;val_03:`r`t`y`o)
tab2:([]col1:`abc`abc`abc`abc`def`def`def`ghe`ghe`ghe;col2:0 1 2 3 4 5 6 7 8 9;col3:`Ashley`Peter`John`Molly`Apple`Orange`Banana`Robin`Tony`Bob)
and this is the result I am looking for:
tabResult:([]col1:`abc`def`ghe`abc;val_00:`Ashley`b`c`Ashley;val_01:`Peter`e`f`Peter;val_02:`John`h`e`John;val_03:`Molly`t`y`Molly)
col1 val_00 val_01 val_02 val_03
abc  Ashley Peter  John   Molly
def  b      e      h      t
ghe  c      f      e      y
abc  Ashley Peter  John   Molly
I would like to update tab depending on tab2. If col1=`abc,col2=1 in tab2, I would like to update val_01 to `Peter in tab, and if col1 =`abc,col2=2 in tab2, I would like to update val_02 field with `John in tab etc.
This is what I have so far:
{![tab;enlist(=;`col1;enlist x);0b;(enlist y)!enlist z]} . (`abc;`val_01;)
The function above works if the field is numerical and I use a number as the last arg. However, I am not sure how to update symbols and how to generalise this function for all tables.

If I'm understanding your request correctly, you're trying to update a field that has a long type with values that are of symbol type. This is going to fail with a 'type error as column values are expected to be uniform in type. What you can alternatively do is create new columns for the symbol entries, and after that select the columns you want.
Is something like this what you had in mind? I've assumed that the column name is determined by the col2 value in tab2. Also, it looks like you have two val_01 columns in your tab input; I assumed one of these was supposed to be val_02.
q)(uj/){![tab;enlist(=;`col1;enlist x);0b;(enlist`$"val_0",string[y],"_sym")!enlist enlist z]}.'flip tab2`col1`col2`col3
col1 val_00 val_01 val_02 val_03 val_01_sym val_02_sym val_03_sym val_04_sym val_05_sym val_06_sym val_07_sym val_08_sym val_09_sym
-----------------------------------------------------------------------------------------------------------------------------------
abc  1      2      2      3      Peter
def  2      2      3      2
ghe  3      3      1      1
abc  1      2      2      3                 John
def  2      2      3      2
ghe  3      3      1      1
abc  1      2      2      3                            Molly
def  2      2      3      2
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2                                       Apple
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2                                                  Orange
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2                                                             Banana
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2
ghe  3      3      1      1                                                                        Robin
abc  1      2      2      3
def  2      2      3      2
ghe  3      3      1      1                                                                                   Tony
abc  1      2      2      3
def  2      2      3      2
ghe  3      3      1      1                                                                                              Bob
EDIT:
Based on your comments, I've amended my solution:
q)cols[tab]#{![x;enlist(=;`col1;enlist y`col1);0b;(enlist`$"val_0",string y`col2)!enlist enlist y`col3]}/[tab;tab2]
col1 val_00 val_01 val_02 val_03
--------------------------------
abc  Ashley Peter  John   Molly
def  b      e      h      t
ghe  c      f      e      y
abc  Ashley Peter  John   Molly
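The amended solution works with symbols because of the double enlist in the last argument: in a functional update a bare symbol is read as a column name, so a symbol constant has to be enlisted to be treated as a value. A minimal sketch of that point, comparing the functional form with what should be the equivalent qSQL statement for a single update (r1 and r2 are illustrative names, not part of the original answer):

```q
/ tab as defined in the question
tab:([]col1:`abc`def`ghe`abc;val_00:`a`b`c`e;val_01:`d`e`f`t;val_02:`g`h`e`g;val_03:`r`t`y`o)

/ qSQL form: set val_01 to `Peter on the rows where col1=`abc
r1:update val_01:`Peter from tab where col1=`abc

/ functional form: the inner enlist marks `Peter as a symbol constant,
/ the outer enlist builds the one-item list of values for the dictionary
r2:![tab;enlist(=;`col1;enlist`abc);0b;(enlist`val_01)!enlist enlist`Peter]

/ compare the two results
r1~r2
```

The same rule explains the `enlist x` in the where clause: the value compared against col1 is also a symbol constant.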

Related

Power BI - Counting number of projects in execution per month

I want to make a line graph with the number of running projects by fortnight (could be monthly, whatever is easier).
Given table (ETC used when the project is not finished yet):
project ID
Start date
Finish date
ETC date
Category
1
04/12/2022
08/23/2022
Type A
2
04/14/2022
09/21/2022
Type B
3
05/18/2022
12/17/2022
Type A
4
06/21/2022
09/25/2022
Type C
5
06/28/2022
10/02/2022
Type A
6
07/08/2022
12/23/2022
Type C
7
07/20/2022
12/08/2022
Type C
8
07/29/2022
10/12/2022
Type B
In Excel, I am using the COUNTIFS function to determine how many projects were in progress at the same time. For example, there is 1 project running (ID 1) in the first fortnight of April (1-14) and 2 projects running (IDs 1 and 2) in the second fortnight (14-30).
My table in Excel looks like this:
Fortnight  Type A  Type B  Type C  Total
04/22 F1   1                       1
04/22 F2   1       1               2
05/22 F1   1       1               2
05/22 F2   2       1               3
06/22 F1   2       1               3
06/22 F2   3       1       1       5
07/22 F1   3       1       2       6
07/22 F2   3       2       3       8
08/22 F1   3       2       3       8
08/22 F2   3       2       3       8
09/22 F1   2       2       3       7
09/22 F2   2       2       3       7
10/22 F1   2       1       2       5
10/22 F2   1               2       3
11/22 F1   1               2       3
11/22 F2   1               2       3
12/22 F1   1               2       3
12/22 F2   1               1       2

separating the records in a kdb table

There is a table with a column that I would like to break into multiple records. For example
q)tab:([]a:1 2 3;b:(`a;`$"b c";`d);c:2 3 4)
q)tab
a b   c
-------
1 a   2
2 b c 3
3 d   4
There is a space between b and c in the second entry of column b. I would like the table to become
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4
I tried
" " string vs exec b from tab
but it didn't work.
Any idea?
Since b is the column with multiple entries per row, you can count each value and expand the corresponding row entries accordingly. After that, ungroup, as Terry mentioned, should work.
q)t:([]a:1 2 3;b:(`a;`b`c;`d);c:2 3 4)
q)![t;();0b;{x!(enlist({(count each x)#'y};`b)),/:x}cols t]
a   b    c
------------
,1  ,`a  ,2
2 2 `b`c 3 3
,3  ,`d  ,4
q)ungroup ![t;();0b;{x!(enlist({(count each x)#'y};`b)),/:x}cols t]
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4
EDIT: Realised after your comment that the input is different. I think this is what you want.
q)t:([]a:1 2 3;b:(`a;`$"b c";`d);c:2 3 4)
q)ungroup update`$" "vs'string b from t
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4
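The one-liner above can be read step by step. Note that vs takes the delimiter on the left and the string on the right, which is where the attempt in the question went wrong. A minimal sketch with each intermediate value named (s, p and syms are illustrative names only):

```q
t:([]a:1 2 3;b:(`a;`$"b c";`d);c:2 3 4)

s:string t`b            / one string per row of b
p:" " vs' s             / split each string on the space delimiter
syms:`$p                / cast the pieces back to symbols, one list per row

/ every row of b is now a symbol list, so ungroup can expand the table
ungroup update b:syms from t
```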
You would normally do this using ungroup:
q)ungroup([]a:1 2 3;b:((),`a;`b`c;(),`d);c:2 3 4)
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4

KDB+/Q:Input agnostic function for single and multi row tables

I have tried using the following function to derive a table consisting of 3 columns with one column data holding a list of an arbitrary schema.
fn:{
flip `time`data`id!(x`b;(x`a`b`c`d`e);x`a)
};
which works well on input with multiple rows i.e.:
q)x:flip `a`b`c`d`e!(5#enlist 5?10)
q)fn[`time`data`id!(x`b;(x`a`b`c`d`e);x`a)]
time data      id
-----------------
8    8 5 2 8 6 8
5    8 5 2 8 6 5
2    8 5 2 8 6 2
8    8 5 2 8 6 8
6    8 5 2 8 6 6
However fails when using input with a single row i.e.
q)x:`a`b`c`d`e!5?10
q)fn[`time`data`id!(x`b;(x`a`b`c`d`e);x`a)]
time data id
------------
8    7    7
8    8    7
8    4    7
8    4    7
8    6    7
which is obviously incorrect.
One might fix this by using enlist i.e.
q)x:enlist `a`b`c`d`e!5?10
q)fn[`time`data`id!(x`b;(x`a`b`c`d`e);x`a)]
time| 8
data| 7 8 4 4 6
id | 7
Which is correct, however if one were to apply this in the function i.e.
fn:{
flip enlist `time`data`id!(x`b;(x`a`b`c`d`e);x`a)
};
...
time| 2 5 8 7 9
data| 2 5 8 7 9 2 5 8 7 9 2 5 8 7 9 2 5 8 7 9 2 5 8 7 9
id | 2 5 8 7 9
Which has the wrong format of data values.
My question here is how might one avert this conversion issue and derive the same field values whether the argument is a multi row or single row table.
Or otherwise what is the canonical implementation of this in kdb+/q
Thanks
Edit:
To clarify: my problem isn't necessarily with the data input as one could just apply enlist if it is only one row. My question pertains to how one might use enlist in the fn function to make single row input conform to the logic seen when using multi row tables. i.e. how to replace fn enlist input with fn data (how to make the function input agnostic) Thanks
Do you mean to flip the data so that it runs perpendicular to the rest of the table? Your 5-row example works because there are 5 rows and 5 columns. The single-row case doesn't work because there is 1 row but 5 columns.
Correct me if I'm wrong but I think this is what you want:
fn:{([]time:x`b;data:flip x`a`b`c`d`e;id:x`a)};
--------------------------------------------------
t1:flip `a`b`c`d`e!(5#enlist til 5);
a b c d e
---------
0 0 0 0 0
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
fn[t1]
time data      id
-----------------
0    0 0 0 0 0 0
1    1 1 1 1 1 1
2    2 2 2 2 2 2
3    3 3 3 3 3 3
4    4 4 4 4 4 4
--------------------------------------------------
t2:enlist `a`b`c`d`e!til 5;
a b c d e
---------
0 1 2 3 4
fn[t2]
time data      id
-----------------
1    0 1 2 3 4 0
Note without the flip you get this:
([]time:t1`b;data:t1`a`b`c`d`e;id:t1`a)
time data      id
-----------------
0    0 1 2 3 4 0
1    0 1 2 3 4 1
2    0 1 2 3 4 2
3    0 1 2 3 4 3
4    0 1 2 3 4 4
In this case the time is no longer in line with the data, but it runs without error because there are 5 rows and 5 columns.
Edit - I can't think of a better way to convert a dictionary to a table when needed other than using count first in a conditional. Note that this wouldn't work if the first value is a nested list.
{ $[1 = count first x;enlist x;x] } `a`b`c`d`e!til 5
Note, your provided function doesn't work with this:
{
flip `time`data`id!(x`b;(x`a`b`c`d`e);x`a)
}{$[1 = count first x;enlist x;x]} `a`b`c`d`e!til 5
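One possible alternative to the count-based check is to test the type of the input instead: a dictionary has type 99h and a table has type 98h. The sketch below is only an illustration under that assumption. norm and fn2 are hypothetical names, and keyed tables (which are also type 99h) are assumed never to be passed in.

```q
/ norm: hypothetical helper; turns a dictionary (type 99h) into a one-row table
/ and leaves anything else untouched
norm:{$[99h=type x;enlist x;x]}

/ fn2: the flip-based fn from this answer, applied after normalising the input
fn2:{x:norm x;([]time:x`b;data:flip x`a`b`c`d`e;id:x`a)}

r1:fn2 `a`b`c`d`e!til 5                     / single row given as a dictionary
r2:fn2 enlist `a`b`c`d`e!til 5              / single row given as a table
r3:fn2 flip `a`b`c`d`e!(5#enlist til 5)     / multi-row table
r1~r2
```

With this, the caller does not need to know whether a single row arrives as a dictionary or as a one-row table.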

How to sum consecutive identical numbers in a list in kdb?

I have a list like this:
a:1 1 1 1 2 3 1 1 4 4 4 5 6 4
How can I sum all of the consecutive identical numbers in a, so that it will become:
a:4 2 3 2 12 5 6 4
There are many ways - one method:
q) a:1 1 1 1 2 3 1 1 4 4 4 5 6 4
q) sum each where[differ a] _ a
4 2 3 2 12 5 6 4
Another method to achieve this using prev & <>:
sum each cut[where a<>prev[a]; a]
4 2 3 2 12 5 6 4
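Both methods follow the same idea: find the positions where a new run starts, cut the list there, and sum each piece. A short sketch of the intermediate steps (b, i and runs are illustrative names):

```q
a:1 1 1 1 2 3 1 1 4 4 4 5 6 4

b:differ a          / boolean mask: 1b where an item differs from its predecessor
i:where b           / start index of each run: 0 4 5 6 8 11 12 13
runs:i cut a        / split a at those indices into runs of identical values
sum each runs       / one total per run
```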

How to implement combinations of a list

All
I need to get the combinations and permutations of a list.
A function has been implemented for permutations:
perm:{[N;l]$[N=1;l;raze .z.s[N-1;l]{x,/:y except x}\:l]}
However, I have no idea how to implement combinations, which should behave like this:
l: 1 2 3
comb[2;l]
1 2
1 3
2 3
l: 1 2 3 4
comb[3;l]
1 2 3
1 2 4
1 3 4
2 3 4
Thanks!
From your solution, you can do:
q)comb:{[N;l]$[N=1;l;raze .z.s[N-1;l]{x,/:y where y>max x}\:l]}
q)comb[2;1 2 3]
1 2
1 3
2 3
Another approach using over:
q)perm:{{raze x{x,/:y except x}\:y}[;y]/[x-1;y]}
q)comb:{{raze x{x,/:y where y>max x}\:y}[;y]/[x-1;y]}
One option is to use your permutation function like this:
q) comb:{[N;l] distinct asc each perm[N;l] }
q)l: 1 2 3 4
q) comb[3;l]
output:
1 2 3
1 2 4
1 3 4
2 3 4
Note: This will change the order of elements because of asc. So if your list should have (1 3 2) in the answer, it will give (1 2 3) instead.
To maintain order, replace asc with other logic that filters out duplicate sets (e.g. (1 2 3) and (1 3 2) are duplicates).
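The recursive comb shown earlier compares items with > and max, so it relies on a sorted list of distinct comparable values. One way to keep the original order for any list is to build combinations of positions and then index into the list. This is only a sketch. combAny is a hypothetical name, and comb is repeated here so the block is self-contained.

```q
/ comb as given in the first answer (expects a sorted list of distinct values)
comb:{[N;l]$[N=1;l;raze .z.s[N-1;l]{x,/:y where y>max x}\:l]}

/ combAny: hypothetical wrapper; combines positions 0..n-1, then indexes into l,
/ so the items keep the order they have in l
combAny:{[N;l] l comb[N;til count l]}

combAny[2;`c`a`b]
```

Because til count l is always sorted and distinct, the ordering requirement of comb is met regardless of what l contains.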