How can I delete a column by index from a kdb table?

For example, how would you delete the first column from the following table?
q)t: ([] a: (2018.09.25; 2018.09.25; 2018.09.25); b: `ABC`XYZ`BAC ; c: (10 20 30))
q)t
a b c
-----------------
2018.09.25 ABC 10
2018.09.25 XYZ 20
2018.09.25 BAC 30
The expected result:
b c
---------
ABC 10
XYZ 20
BAC 30
It is possible to use delete a from t, but I would like to be able to delete a column without knowing its exact name beforehand.

You could use a functional delete:
q){[t;index]![t;();0b;enlist cols[t]index]}[t;0]
b c
------
ABC 10
XYZ 20
BAC 30
https://code.kx.com/q/ref/funsql/#delete
Use parse in order to see what the q-sql statement looks like in functional form:
q)parse"delete a from t"
!
`t
()
0b
,,`a
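The same functional form extends to several indices at once. The following is a minimal sketch, assuming a hypothetical helper name dropCols (not part of the original answer); (),cols[t] idx forces a symbol list, so a single index does not need a separate enlist:

```q
/ sample table from the question
t:([] a:3#2018.09.25; b:`ABC`XYZ`BAC; c:10 20 30)

/ dropCols is a hypothetical name: look up the column names at the given
/ indices and pass them to a functional delete
dropCols:{[t;idx] ![t;();0b;(),cols[t] idx]}

dropCols[t;0]      / removes column a
dropCols[t;0 2]    / removes columns a and c
```

Because t is passed by value, the original table is left unchanged.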

You could use
{(_/[cols x;desc y])#x}[t;0 2]
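This takes the columns of your table and the indices you want to drop, and uses drop with over (_/) to remove those column names one at a time. The indices are sorted in descending order so that earlier drops do not shift the later positions. The remaining names are then taken from the table with #. If you wanted to remove only one index, you'd have to enlist it, like so: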
{(_/[cols x;desc y])#x}[t;enlist 0]
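To make the intermediate step visible, here is a small sketch on the table from the question. The names keep and dropIdx are hypothetical; dropIdx is a variant that prepends (), so that a single index does not need an explicit enlist:

```q
t:([] a:3#2018.09.25; b:`ABC`XYZ`BAC; c:10 20 30)

/ drop index 2 first, then index 0, leaving only the name of column b
keep:_/[cols t;desc 0 2]
keep#t

/ hypothetical variant: (),y turns an atom into a one-item list
dropIdx:{(_/[cols x;desc(),y])#x}
dropIdx[t;0]
```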

If your table is not keyed, you can drop the column by name directly, the same way you would drop a key from a dictionary:
q) f:{[t;ind] enlist[cols[t] ind]_t}
q) f[t;0]
b c
------
ABC 10
XYZ 20
BAC 30

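Using flip and drop (flip turns the unkeyed table into a dictionary of columns, 1_ drops its first entry, and the outer flip turns it back into a table):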
q)flip 1_flip 0!t
b c
------
ABC 10
XYZ 20
BAC 30

Related

Replace first n entries in a column in kdb

How can I replace the values in the first n columns of my table?
i.e. mycol:(1 2 3 4) to mycol:(a a 3 4)
Thank you in advance!
If it's the values within mycol that you want updated then they will need to be of the same type as the existing values. See below.
q)t:([]mycol:`$string 1+til 4;mycol2:til 4)
q)update mycol:`a from t where i<2
mycol mycol2
------------
a 0
a 1
3 2
4 3
One way around this, though, is to enlist each value in mycol first, so that updates of any type can be made.
q)t:([]mycol:1+til 4;mycol2:til 4)
q)update mycol:`a from(update enlist each mycol from t)where i<2
mycol mycol2
------------
`a 0
`a 1
,3 2
,4 3
q)meta update mycol:`a from(update enlist each mycol from t)where i<2
c | t f a
------| -----
mycol |
mycol2| j
It's unclear from your question whether you want the column names or the column values changed. If it's the column names, you can use xcol.
q)(2#`a)xcol([]w:3#til 3;x:3#.Q.a;y:`;z:0N)
a a y z
-------
0 a
1 b
2 c

Utility like except for tables in kdb

As we have except function for lists in kdb to find the elements which are present in one list and not in another, similarly do we have any utility to extract the rows present in one table and not in another based on a column?
Eg: I have two tables:
l:([]c1:`a`b`c`d;c2:10 20 30 40)
r:([]c1:`a`a`a`b`b;c3:100 200 300 400 50)
Column c1 in table l contains the values c and d, which are not present in column c1 of table r.
Do we have any utility in kdb which can be used to get output like below?
c1 c2
-----
c 30
d 40
I got the output using -
select from l where c1 in l[`c1] except r`c1
But, I'm searching for better/optimised solution/utility to get the same output.
I don't think there's anything wrong with your current implementation but you could use drop (aka _) on a keyed table for a more succinct approach:
q)#[1#`c1;r]_1!l
c1| c2
--| --
c | 30
d | 40
This also remains pretty neat when the "key" is more than one column:
l0:([]c0:`x`y`z`w;c1:`a`b`c`d;c2:10 20 30 40)
r0:([]c0:`y`x`x`x`y;c1:`a`a`a`b`b;c3:100 200 300 400 50)
q)#[`c0`c1;r0]_2!l0
c0 c1| c2
-----| --
z c | 30
w d | 40
A more functional form would be this:
{cl:cols[x]inter cols y;x where not(cl#x)in cl#y}[l;r]
c1 c2
-----
c 30
d 40
This should work if you don't know the columns to match on because of cols[x] inter cols[y] at the start which obtains common cols between the two tables. It also works without columns being keyed.
Although in this specific case, the following would be a little bit faster:
l where not l[`c1] in r[`c1]
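If the matching column varies, that last expression can be wrapped in a small function. This is only a sketch; exceptBy is a hypothetical name, and it assumes both tables are unkeyed and share the column c:

```q
l:([]c1:`a`b`c`d;c2:10 20 30 40)
r:([]c1:`a`a`a`b`b;c3:100 200 300 400 50)

/ exceptBy is a hypothetical helper: keep the rows of l whose value in
/ column c does not appear in column c of r
exceptBy:{[l;r;c] l where not l[c] in r c}

exceptBy[l;r;`c1]
```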

Functional update - multivariable function with dynamic columns

Any help with the following would be much appreciated!
I have two tables: table1 is a summary table whilst table2 is a list of all data points. I want to be able to summarise the information in table2 for each row in table1.
table1:flip `grp`constraint!(`a`b`c`d; 10 10 20 20);
table2:flip `grp`cat`constraint`val!(`a`a`a`a`a`b`b`b;`cl1`cl1`cl1`cl2`cl2`cl2`cl2`cl1; 10 10 10 10 10 10 20 10; 1 2 3 4 5 6 7 8);
function:{[grpL;constraintL;catL] first exec total: sum val from table2 where constraint=constraintL, grp=grpL,cat=catL};
update cl1:function'[grp;constraint;`cl1], cl2:function'[grp;constraint;`cl2] from table1;
The fourth line of this code achieves what I want for the two categories:cl1 and cl2
In table1 I want to name a new column with the name of the category (cl1, cl2, etc.) and I want the values in that column to be the output from running the function over that column.
However, I have hundreds of different categories, so don't want to have to list them out manually as in the fourth line. How would I pass in a list of categories, e.g. below?
`cl1`cl2`cl3
Sticking to your approach, you would just have to make your update statement functional and then iterate over the columns like so:
{![`table1;();0b;(1#x)!enlist ((';function);`grp;`constraint;1#x)]} each `cl1`cl2
This assumes you can amend table1 in place. If you must retain the original table1, you can pass it by value instead, though this will consume more memory:
{![x;();0b;(1#y)!enlist ((';function);`grp;`constraint;1#y)]}/[table1;`cl1`cl2]
Another approach would be to aggregate, pivot and join, though it's not necessarily a better solution, as you get nulls rather than zeros:
a:select sum val by cat,grp,constraint from table2
p:exec (exec distinct cat from a)#cat!val by grp,constraint from a
table1 lj p
There are several different methods you can look into.
The easiest method would be a functional update - http://code.kx.com/wiki/JB:QforMortals2/queries_q_sql#Functional_update
The approach below, though, should prove more useful, quicker and neater:
Your problem can be split into 2 parts. For the first part, you are looking to create a sum of each category by grp and constraint within table2. As for the second part, you are looking to join these results (the lookups) onto the corresponding records from table1.
You can create the necessary groups using by
q)exec val,cat by grp,constraint from table2
grp constraint| val cat
--------------| ------------------------------
a 10 | 1 2 3 4 5 `cl1`cl1`cl1`cl2`cl2
b 10 | 6 8 `cl2`cl1
b 20 | ,7 ,`cl2
Note though, this will only create nested lists of the columns in your select query
Next is to sum each of the cat groups
q)exec sum each val group cat by grp,constraint from table2
grp constraint|
--------------| ------------
a 10 | `cl1`cl2!6 9
b 10 | `cl2`cl1!6 8
b 20 | (,`cl2)!,7
Then, to create the cat columns, you can use a pivot-like syntax - http://code.kx.com/wiki/Pivot
q)cats:asc exec distinct cat from table2
q)exec cats#sum each val group cat by grp,constraint from table2
grp constraint| cl1 cl2
--------------| -------
a 10 | 6 9
b 10 | 8 6
b 20 | 7
Now you can use this lookup table and index into it with each row from table1 (groups c and d have no match in table2, so their rows come back as nulls)
q)(exec cats#sum each val group cat by grp,constraint from table2)[table1]
cl1 cl2
-------
6 9
8 6
To fill the nulls with zeros, use the caret symbol (^, fill) - http://code.kx.com/wiki/Reference/Caret
q)0^(exec cats#sum each val group cat by grp,constraint from table2)[table1]
cl1 cl2
-------
6 9
8 6
0 0
0 0
And now you can join on each row from table1 to your results using join-each
q)table1,'0^(exec cats#sum each val group cat by grp,constraint from table2)[table1]
grp constraint cl1 cl2
----------------------
a 10 6 9
b 10 8 6
c 20 0 0
d 20 0 0
HTH, Sean
This approach is the easiest way to pass in a list of categories
{table1^flip x!function'[table1`grp;table1`constraint;]each x}`cl1`cl2

KDB selecting first row from each group

Very silly question... Consider the table t1 below which is sorted by sym.
t1:([]sym:(3#`A),(2#`B),(4#`C);val:10 40 12 50 58 75 22 103 108)
sym val
A 10
A 40
A 12
B 50
B 58
C 75
C 22
C 103
C 108
I want to select the first row corresponding to each sym, like this:
(`sym`val)!(`A`B`C;10j, 50j, 75j)
sym val
A 10
B 50
C 75
There's got to be a one-liner to do this. To get the LAST row for each sym, it would be as simple as select by sym from t1. Any hints?
select first val by sym from t1
Or for multiple columns, you can reverse the table and run your query:
select by sym from reverse t1
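The reverse trick works because select by sym keeps the last row of each group, and the last row of the reversed table is the first row of the original. A minimal sketch on the table from the question, with 0! added (an addition here, not part of the answer) to return a plain unkeyed table:

```q
t1:([]sym:(3#`A),(2#`B),(4#`C);val:10 40 12 50 58 75 22 103 108)

/ last row per sym of the reversed table = first row per sym of t1
first1:0!select by sym from reverse t1
first1
```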
You could use fby
q)select from t1 where i=(first;i) fby sym
sym val
-------
A 10
B 50
C 75

kdb: dynamically denormalize a table (convert key values to column names)

I have a table like this:
q)t:([sym:(`EURUSD`EURUSD`AUDUSD`AUDUSD);server:(`S01`S02`S01`S02)];volume:(20;10;30;50))
q)t
sym server| volume
-------------| ------
EURUSD S01 | 20
EURUSD S02 | 10
AUDUSD S01 | 30
AUDUSD S02 | 50
I need to de-normalize it to display the data nicely. The resulting table should look like this:
sym | S01 S02
------| -------
EURUSD| 20 10
AUDUSD| 30 50
How do I dynamically convert the original table using distinct values from server column as column names for the new table?
Thanks!
Basically, you want a 'pivot' table. The following page has a very good solution for your problem:
http://code.kx.com/q/cookbook/pivoting-tables/
Here are the commands to get the required table:
q) P:asc exec distinct server from t
q) exec P#(server!volume) by sym:sym from t
One tricky thing about pivoting a table is that the keys of the dictionary must be of type symbol; otherwise it won't generate the pivot table structure.
E.g. In the following table, we have a column dt with type as date.
t:([sym:(`EURUSD`EURUSD`AUDUSD`AUDUSD);dt:(0 1 0 1+.z.d)];volume:(20;10;30;50))
Now if we want to pivot it with the dates as columns, it will generate a structure like:
q)P:asc exec distinct dt from t
q)exec P#(dt!volume) by sym:sym from t
(`s#flip (enlist `sym)!enlist `s#`AUDUSD`EURUSD)!((`s#2018.06.22 2018.06.23)!30j, 50j;(`s#2018.06.22 2018.06.23)!20j, 10j)
To get the dates as the columns, the dt column has to be cast to symbol:
q)show P:asc exec distinct `$string dt from t
`s#`2018.06.22`2018.06.23
q)exec P#((`$string dt)!volume) by sym:sym from t
sym | 2018.06.22 2018.06.23
------| ---------------------
AUDUSD| 30 50
EURUSD| 20 10