separating the records in a kdb table - kdb

I have a table with a column whose entries I would like to split into multiple records. For example:
q)tab:([]a:1 2 3;b:(`a;`$"b c";`d);c:2 3 4)
q)tab
a b   c
-------
1 a   2
2 b c 3
3 d   4
There is a space between b and c in the second entry of column b. I would like the table to become:
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4
I tried
" " string vs exec b from tab
but it didn't work.
Any ideas?

Since b is the column with multiple entries per row, you can count the entries in each value and replicate the other columns of that row to match. Then ungroup, as Terry mentioned, should work.
q)t:([]a:1 2 3;b:(`a;`b`c;`d);c:2 3 4)
q)![t;();0b;{x!(enlist({(count each x)#'y};`b)),/:x}cols t]
a   b    c
------------
,1  ,`a  ,2
2 2 `b`c 3 3
,3  ,`d  ,4
q)ungroup ![t;();0b;{x!(enlist({(count each x)#'y};`b)),/:x}cols t]
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4
EDIT: Realised after your comment that the input is different. I think this is what you want.
q)t:([]a:1 2 3;b:(`a;`$"b c";`d);c:2 3 4)
q)ungroup update`$" "vs'string b from t
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4
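For readers who want to check the split-then-ungroup logic outside q, here is a minimal Python sketch of the same idea. The helper name split_rows and the list-of-dicts layout are assumptions made for illustration; they are not part of either answer.

```python
# Rows mirror the question's table; column b may hold several
# space-separated values in a single entry.
rows = [
    {"a": 1, "b": "a", "c": 2},
    {"a": 2, "b": "b c", "c": 3},
    {"a": 3, "b": "d", "c": 4},
]


def split_rows(rows, column, sep=" "):
    """Return one output row per separated piece found in `column`."""
    out = []
    for row in rows:
        for piece in row[column].split(sep):
            # Copy the other columns unchanged and substitute the single piece.
            out.append({**row, column: piece})
    return out


expanded = split_rows(rows, "b")
for row in expanded:
    print(row["a"], row["b"], row["c"])
```

The inner loop plays the role of " "vs' and the flattening into out plays the role of ungroup.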

You would normally do this using ungroup:
q)ungroup([]a:1 2 3;b:((),`a;`b`c;(),`d);c:2 3 4)
a b c
-----
1 a 2
2 b 3
2 c 3
3 d 4

Power BI - Counting number of projects in execution per month

I want to make a line graph with the number of running projects by fortnight (could be monthly, whatever is easier).
Given table (ETC used when the project is not finished yet):
project ID  Start date  Finish/ETC date  Category
-------------------------------------------------
1           04/12/2022  08/23/2022       Type A
2           04/14/2022  09/21/2022       Type B
3           05/18/2022  12/17/2022       Type A
4           06/21/2022  09/25/2022       Type C
5           06/28/2022  10/02/2022       Type A
6           07/08/2022  12/23/2022       Type C
7           07/20/2022  12/08/2022       Type C
8           07/29/2022  10/12/2022       Type B
In Excel, I am using the COUNTIFS function to determine how many projects were in progress at the same time. For example: There is 1 project running (ID 1) on the first fortnight of April (1-14) and 2 projects running (IDs 1 and 2) on the second fortnight (14-30)
My table in excel looks like this:
Fortnight  Type A  Type B  Type C  Total
----------------------------------------
04/22 F1   1                       1
04/22 F2   1       1               2
05/22 F1   1       1               2
05/22 F2   2       1               3
06/22 F1   2       1               3
06/22 F2   3       1       1       5
07/22 F1   3       1       2       6
07/22 F2   3       2       3       8
08/22 F1   3       2       3       8
08/22 F2   3       2       3       8
09/22 F1   2       2       3       7
09/22 F2   2       2       3       7
10/22 F1   2       1       2       5
10/22 F2   1               2       3
11/22 F1   1               2       3
11/22 F2   1               2       3
12/22 F1   1               2       3
12/22 F2   1               1       2
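The COUNTIFS approach amounts to an interval-overlap test: a project is running in a fortnight if it starts on or before the fortnight's last day and ends on or after its first day. The Python sketch below illustrates that counting logic only; it is not a Power BI solution. It assumes F1 covers days 1 to 13 and F2 covers day 14 to month end, because that boundary reproduces the Excel table (project 2, starting 04/14, is first counted in F2). The function names are hypothetical.

```python
import calendar
from datetime import date

# (project ID, start, finish or ETC, category), copied from the question.
projects = [
    (1, date(2022, 4, 12), date(2022, 8, 23), "Type A"),
    (2, date(2022, 4, 14), date(2022, 9, 21), "Type B"),
    (3, date(2022, 5, 18), date(2022, 12, 17), "Type A"),
    (4, date(2022, 6, 21), date(2022, 9, 25), "Type C"),
    (5, date(2022, 6, 28), date(2022, 10, 2), "Type A"),
    (6, date(2022, 7, 8), date(2022, 12, 23), "Type C"),
    (7, date(2022, 7, 20), date(2022, 12, 8), "Type C"),
    (8, date(2022, 7, 29), date(2022, 10, 12), "Type B"),
]


def fortnights(year, first_month, last_month):
    """Yield (label, first_day, last_day) for each half-month."""
    for month in range(first_month, last_month + 1):
        month_end = calendar.monthrange(year, month)[1]
        label = f"{month:02d}/{year % 100}"
        # Assumption: F1 = days 1-13, F2 = day 14 to month end.
        yield f"{label} F1", date(year, month, 1), date(year, month, 13)
        yield f"{label} F2", date(year, month, 14), date(year, month, month_end)


def running_counts(projects, periods):
    """Count, per category, the projects whose dates overlap each period."""
    table = {}
    for label, p_start, p_end in periods:
        counts = {}
        for _pid, start, end, category in projects:
            if start <= p_end and end >= p_start:  # interval overlap test
                counts[category] = counts.get(category, 0) + 1
        counts["Total"] = sum(counts.values())
        table[label] = counts
    return table


counts = running_counts(projects, fortnights(2022, 4, 12))
for label, row in counts.items():
    print(label, row)
```

The same overlap condition is what a measure in any tool would need to express; only the period boundaries are a modelling choice.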

How to update kdb table transversely

I have these two tables:
tab:([]col1:`abc`def`ghe`abc;val_00:`a`b`c`e;val_01:`d`e`f`t;val_02:`g`h`e`g;val_03:`r`t`y`o)
tab2:([]col1:`abc`abc`abc`abc`def`def`def`ghe`ghe`ghe;col2:0 1 2 3 4 5 6 7 8 9;col3:`Ashley`Peter`John`Molly`Apple`Orange`Banana`Robin`Tony`Bob)
and this is the result I am looking for:
tabResult:([]col1:`abc`def`ghe`abc;val_00:`Ashley`b`c`Ashley;val_01:`Peter`e`f`Peter;val_02:`John`h`e`John;val_03:`Molly`t`y`Molly)
col1 val_00 val_01 val_02 val_03
--------------------------------
abc  Ashley Peter  John   Molly
def  b      e      h      t
ghe  c      f      e      y
abc  Ashley Peter  John   Molly
I would like to update tab based on tab2: if col1=`abc and col2=1 in tab2, val_01 in tab should be updated to `Peter; if col1=`abc and col2=2 in tab2, val_02 in tab should be updated to `John; and so on.
This is what I have so far:
{![tab;enlist(=;`col1;enlist x);0b;(enlist y)!enlist z]} . (`abc;`val_01;)
The function above works if the field is numerical and I use a number as the last arg. However, I am not sure how to update symbols and how to generalise this function for all tables.
If I'm understanding your request correctly, you're trying to update a field that has a long type with values that are of symbol type. This is going to fail with a 'type error as column values are expected to be uniform in type. What you can alternatively do is create new columns for the symbol entries, and after that select the columns you want.
Is something like this what you had in mind? I've assumed that the column name is determined by its col2 value in tab2. Also, it looks like you have two val_01 columns in your tab input; I assumed one of these was supposed to be val_02.
q)(uj/){![tab;enlist(=;`col1;enlist x);0b;(enlist`$"val_0",string[y],"_sym")!enlist enlist z]}.'flip tab2`col1`col2`col3
col1 val_00 val_01 val_02 val_03 val_01_sym val_02_sym val_03_sym val_04_sym val_05_sym val_06_sym val_07_sym val_08_sym val_09_sym
-----------------------------------------------------------------------------------------------------------------------------------
abc  1      2      2      3      Peter
def  2      2      3      2
ghe  3      3      1      1
abc  1      2      2      3                 John
def  2      2      3      2
ghe  3      3      1      1
abc  1      2      2      3                            Molly
def  2      2      3      2
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2                                       Apple
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2                                                  Orange
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2                                                             Banana
ghe  3      3      1      1
abc  1      2      2      3
def  2      2      3      2
ghe  3      3      1      1                                                                        Robin
abc  1      2      2      3
def  2      2      3      2
ghe  3      3      1      1                                                                                   Tony
abc  1      2      2      3
def  2      2      3      2
ghe  3      3      1      1                                                                                              Bob
EDIT:
Based on your comments, I've amended my solution:
q)cols[tab]#{![x;enlist(=;`col1;enlist y`col1);0b;(enlist`$"val_0",string y`col2)!enlist enlist y`col3]}/[tab;tab2]
col1 val_00 val_01 val_02 val_03
--------------------------------
abc  Ashley Peter  John   Molly
def  b      e      h      t
ghe  c      f      e      y
abc  Ashley Peter  John   Molly
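The amended solution folds over the rows of tab2, applying one update per row and then keeping only the original columns. A rough Python sketch of that fold, for checking the logic, is below. The function name apply_updates and the list-of-dicts layout are assumptions for illustration; the column name is built as "val_" plus a two-digit col2, which matches "val_0",string col2 for the values 0 to 9 used here.

```python
tab = [
    {"col1": "abc", "val_00": "a", "val_01": "d", "val_02": "g", "val_03": "r"},
    {"col1": "def", "val_00": "b", "val_01": "e", "val_02": "h", "val_03": "t"},
    {"col1": "ghe", "val_00": "c", "val_01": "f", "val_02": "e", "val_03": "y"},
    {"col1": "abc", "val_00": "e", "val_01": "t", "val_02": "g", "val_03": "o"},
]

# (col1, col2, col3) rows of tab2.
tab2 = [
    ("abc", 0, "Ashley"), ("abc", 1, "Peter"), ("abc", 2, "John"),
    ("abc", 3, "Molly"), ("def", 4, "Apple"), ("def", 5, "Orange"),
    ("def", 6, "Banana"), ("ghe", 7, "Robin"), ("ghe", 8, "Tony"),
    ("ghe", 9, "Bob"),
]


def apply_updates(tab, tab2):
    """Apply each tab2 row as an update, then keep tab's original columns."""
    original_cols = list(tab[0])
    rows = [dict(row) for row in tab]  # work on copies, leave tab untouched
    for key, index, value in tab2:
        target = f"val_{index:02d}"
        for row in rows:
            if row["col1"] == key:
                # May create a new column (e.g. val_04), as the q update does.
                row[target] = value
    # Equivalent of cols[tab]#...: drop any columns the updates introduced.
    return [{col: row[col] for col in original_cols} for row in rows]


result = apply_updates(tab, tab2)
for row in result:
    print(*row.values())
```

Updates for col2 values 4 to 9 land in columns that tab never had, which is why the def and ghe rows come back unchanged.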

SQL aggregate sum produces unexpected output

I don't understand how sum works.
For a PostgreSQL table in dbeaver:
a b c d
-------
1 2 3 2
1 2 4 3
2 1 3 2
2 1 4 2
3 2 4 2
the query
select a, b, c, d, sum(c) as sum_c, sum(d) as sum_d from abc a group by a, b, c, d
produces
a b c d sum_c sum_d
-------------------
1 2 3 2 3     2
1 2 4 3 4     3
2 1 3 2 3     2
2 1 4 2 4     2
3 2 4 2 4     2
and I don't understand why: I expected sum_c would be 18 in each row, which is the sum of values in c, and sum_d would be 11 for the same reason.
Why do sum_c and sum_d just copy the values from c and d in each row?
You can't get the result that you want with group by.
When you aggregate with GROUP BY, the rows are split into one group per distinct combination of values in the columns listed after GROUP BY, and the aggregates are computed separately for each group.
For your sample data, one group is 1,2,3,2, and for this combination of values the sum of c is 3 because that group contains only one row, with c=3.
Use the SUM() window function instead:
SELECT a, b, c, d,
SUM(c) OVER () sum_c,
SUM(d) OVER () sum_d
FROM abc
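To see the two behaviours side by side, here is a small Python sketch that runs both queries against the sample data. It uses an in-memory SQLite database as a stand-in for PostgreSQL, which is an assumption made only so the example is self-contained; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abc (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO abc VALUES (?, ?, ?, ?)",
    [(1, 2, 3, 2), (1, 2, 4, 3), (2, 1, 3, 2), (2, 1, 4, 2), (3, 2, 4, 2)],
)

# Grouping by every column gives one group per distinct row, so each
# SUM only ever sees a single value.
grouped = conn.execute(
    "SELECT a, b, c, d, SUM(c) AS sum_c, SUM(d) AS sum_d "
    "FROM abc GROUP BY a, b, c, d ORDER BY a, b, c, d"
).fetchall()

# An empty OVER () makes the whole result set the window, so every row
# carries the grand totals.
windowed = conn.execute(
    "SELECT a, b, c, d, SUM(c) OVER () AS sum_c, SUM(d) OVER () AS sum_d "
    "FROM abc ORDER BY a, b, c, d"
).fetchall()

for row in grouped:
    print("grouped ", row)
for row in windowed:
    print("windowed", row)
```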

add column to table in kdb based of existing columns?

I want to add a new column to a kdb table. It should be based on the existing columns, populated with whichever value is not null, as below:
q)t:([]a:`a`b`c`d`e`f`g`h;b:1 0n 3 4 0n 6 0n 8;c:0n 2 0n 0n 5 0n 7 0n)
q)t
a b c
-----
a 1
b   2
c 3
d 4
e   5
f 6
g   7
h 8
I want to add a column d that takes whichever value from b or c isn't null,
to produce a table like this:
a b c d
-------
a 1   1
b   2 2
c 3   3
d 4   4
e   5 5
f 6   6
g   7 7
h 8   8
I tried concatenating but then it has the null in it:
q)update d:(b,'c)from t
a b c d
----------
a 1   1
b   2   2
c 3   1
d 4   4
e   5   5
f 6   6
g   7   7
h 8   8
A vector conditional might be what you’re after, something like the below:
update d:?[null b;c;b] from t
You can read more about vector conditionals here. This expects a Boolean list as the first argument and returns values from a list in the second argument where True, or values from a list in the third argument where False.
For example:
q)?[10101b;"abcde";"ABCDE"]
"aBcDe"
When used in conjunction with a select/update statement, columns of the table can be specified as the arguments to the vector conditional as these are simply lists.
As an aside, the null keyword returns a Boolean true where a value is null and is useful as part of your solution.
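As a rough illustration of what the vector conditional does element by element, here is a Python sketch. The function vector_conditional is a hypothetical stand-in for ?[cond;x;y], and None stands in for the q null 0n.

```python
def vector_conditional(cond, if_true, if_false):
    """Pick from if_true where cond holds, otherwise from if_false."""
    return [t if flag else f for flag, t, f in zip(cond, if_true, if_false)]


# The character example from the answer: 10101b against two strings.
mask = [ch == "1" for ch in "10101"]
letters = "".join(vector_conditional(mask, "abcde", "ABCDE"))
print(letters)

# The table columns from the question, with None standing in for 0n.
b = [1.0, None, 3.0, 4.0, None, 6.0, None, 8.0]
c = [None, 2.0, None, None, 5.0, None, 7.0, None]

# Equivalent of ?[null b;c;b]: take c where b is null, otherwise b.
d = vector_conditional([value is None for value in b], c, b)
print(d)
```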
You can use the ^(fill) operator.
q)t:([]a:`a`b`c`d`e`f`g`h;b:1 0n 3 4 0n 6 0n 8;c:0n 2 0n 0n 5 0n 7 0n)
q)update d:b^c from t
a b c d
-------
a 1   1
b   2 2
c 3   3
d 4   4
e   5 5
f 6   6
g   7 7
h 8   8
It is worth noting that if you had a row with non-null values for b and c then the query above would default to the value in c. If you would prefer the value in b to be default then switch the inputs:
q)t:([]a:`a`b`c`d`e`f`g`h;b:1 0n 3 4 0n 6 0n 8;c:0n 2 0n 0n 5 100 7 0n)
q)update d:b^c from t
a b c   d
-----------
a 1     1
b   2   2
c 3     3
d 4     4
e   5   5
f 6 100 100
g   7   7
h 8     8
q)update d:c^b from t
a b c   d
---------
a 1     1
b   2   2
c 3     3
d 4     4
e   5   5
f 6 100 6
g   7   7
h 8     8
You could use the 'or' (|) operator.
q)update d:b|c from t
Concatenation gives you a list with the items from both the 'b' and 'c' columns; it does not remove nulls. 'or' compares each pair of 'b' and 'c' values and returns the maximum of that pair. As null is less than any number, it gives you the non-null value from either the 'b' or 'c' column.
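The fill and 'or' approaches agree when exactly one of b and c is non-null, and differ when both are present. The Python sketch below mimics the two operators to make that visible. The helpers fill and q_or are hypothetical stand-ins for x^y and x|y, and None stands in for 0n.

```python
def fill(x, y):
    """Mimic x^y: keep y where it is non-null, otherwise fall back to x."""
    return [xi if yi is None else yi for xi, yi in zip(x, y)]


def q_or(x, y):
    """Mimic x|y on numbers: pairwise maximum, null ranking lowest."""
    def rank(value):
        # Nulls sort below every number, as in q.
        return (value is not None, 0.0 if value is None else value)
    return [max(xi, yi, key=rank) for xi, yi in zip(x, y)]


# Row f has values in both columns, as in the fill answer's second table.
b = [1.0, None, 3.0, 4.0, None, 6.0, None, 8.0]
c = [None, 2.0, None, None, 5.0, 100.0, 7.0, None]

print(fill(b, c))   # b^c: c wins where both are present
print(fill(c, b))   # c^b: b wins where both are present
print(q_or(b, c))   # b|c: the larger value wins where both are present
```

If rows with two non-null values can occur, fill lets you choose which column takes precedence, whereas the maximum depends on the data.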
Can use fill here - https://code.kx.com/wiki/Reference/Caret
q)t:([]a:`a`b`c`d`e`f`g`h;b:1 0n 3 4 0n 6 0n 8;c:0n 2 0n 0n 5 0n 7 0n)
q)update d:c^b from t
a b c d
-------
a 1   1
b   2 2
c 3   3
...

PostgreSQL, sum data from row of table?

x a b c d
----------
A 1 2 3 4
B 5 6 7 8
C 6 7 8 9
I want the sum for A to be 1 + 2 + 3 + 4, and likewise for B and C. Is there any command that can sum a row of data in PostgreSQL?
There is no such built-in function, but you can simply do the following:
select x, a+b+c+d as column_sum from mytable
This assumes, of course, that the data types of a, b, c and d are numeric.
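A quick way to try the query is the Python sketch below, which runs it against an in-memory SQLite database standing in for PostgreSQL (an assumption made only to keep the example self-contained). The extra row D and the COALESCE variant go beyond the answer: they illustrate that a NULL in any column makes the plain sum NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (x TEXT, a INTEGER, b INTEGER, c INTEGER, d INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [("A", 1, 2, 3, 4), ("B", 5, 6, 7, 8), ("C", 6, 7, 8, 9)],
)

# The query from the answer: add the columns within each row.
row_sums = conn.execute(
    "SELECT x, a + b + c + d AS column_sum FROM mytable ORDER BY x"
).fetchall()
print(row_sums)

# Hypothetical extra row with a missing value, not part of the question.
conn.execute("INSERT INTO mytable VALUES ('D', 1, NULL, 3, 4)")
plain, guarded = conn.execute(
    "SELECT a + b + c + d, "
    "COALESCE(a, 0) + COALESCE(b, 0) + COALESCE(c, 0) + COALESCE(d, 0) "
    "FROM mytable WHERE x = 'D'"
).fetchone()
print(plain, guarded)
```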