How do I select by month in KDB?

I have a table of the form
t r v
-------------------------------------------------
2016.01.04D09:51:00.000000000 -0.01507338 576
2016.01.04D09:52:00.000000000 -0.001831502 200
2016.01.04D11:37:00.000000000 -0.001100514 583
2016.01.04D12:04:00.000000000 -0.001653045 1000
I want to get the October 2020 values.
I tried doing a query:
select from x where t.month = 2020.10
but this didn't work. I think I might need to cast a date type? What am I doing wrong?

The trailing m lets the interpreter know that the atom is of month type rather than float type.
q)type 2020.10
-9h
q)type 2020.10m
-13h
q)select from x where t.month=2020.10
t
-
q)select from x where t.month=2020.10m
t
-----------------------------
2020.10.20D20:20:00.000000000
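Two equivalent ways to write the same filter, as a minimal sketch. The table x and its values below are made up for illustration; only the timestamp column t comes from the question. The first query casts the timestamp to a month explicitly, the second compares against the first and last timestamp of the month.

```q
/ hypothetical sample table with a timestamp column t
x:([] t:2016.01.04D09:51:00.000000000 2020.10.20D20:20:00.000000000 2020.11.02D10:00:00.000000000; v:576 200 583)

/ explicit cast to month, same effect as t.month
select from x where (`month$t)=2020.10m

/ range check on the raw timestamps (within is inclusive at both ends)
select from x where t within 2020.10.01D00:00:00.000000000 2020.10.31D23:59:59.999999999
```

The range form avoids casting every row, which may matter on large tables, but that is something to measure rather than assume.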

Related

In KDB Q, how can I select the last letter of a symbol?

Say I have a table with an enumerated symbol column with values:
sym
----
AAPL
MSFT
INTC
I'm trying to select just the rows where the last letter of the symbol is C.
I've tried selecting against last string sym and -1#string sym, but I get an incompatible list length error every time.
What am I doing wrong?
The keyword like works on symbols as well as strings, so there is no need to cast to string if you're trying to pattern match:
q)select from ([]sym:`AAPL`MSFT`INTC`ABC) where sym like"*C"
sym
----
INTC
ABC
Does this work?
q)t: ([] sym: `AAPL`MSFT`INTC)
q)t
sym
----
AAPL
MSFT
INTC
q)select last each string sym from t
sym
---
L
T
C
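The query above extracts the last character of each symbol but does not filter on it. A minimal sketch of the filter the question asks for, using the same table t: last each string sym gives one character per row, so it can be compared against "C" directly in the where clause.

```q
t:([] sym:`AAPL`MSFT`INTC)

/ string sym is a list of strings; last each takes the final char of each one
select from t where "C"=last each string sym
```

Without the each, last string sym returns only the final string of the whole column, which is the likely source of the length error in the question.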

Select from a table with Limit expression works, without - fails

For a table t with a custom field c, which is a dictionary, I can use select with a limit expression, but a simple select fails:
q)r1: `n`m`k!111b;
q)r2: `n`m`k!000b;
q)t: ([]a:1 2; b:10 20; c:(r1; r2));
q)t
a b c
----------------
1 10 `n`m`k!111b
2 20 `n`m`k!000b
q)select[2] c[`n] from t
x
-
1
0
q)select c[`n] from t
'type
[0] select c[`n] from t
^
Is it a bug, or am I missing something?
Upd:
Why does select [2] c[`n] from t work here?
Since c is a list, it does not support key indexing, which is why it returned a 'type error.
You need to index into each element instead of trying to index the column:
q)select c[;`n] from t
x
-
1
0
A list of conforming dictionaries outside of this context is equivalent to a table, so you can index it the way you were trying to:
q)c:(r1;r2)
q)type c
98h
q)c[`n]
10b
I would say that the way complex columns are represented in memory is what makes key indexing on the column impossible. I suspect that any modification that creates a copy of a subset of the elements will allow column indexing, as the copy will be formatted as a table.
One example here is serialising and deserialising the column (not recommended). In the case of select[n], it is selecting a subset of 2 elements.
q)type exec c from t
0h
q)type exec -9!-8!c from t
98h
q)exec (-9!-8!c)[`n] from t
10b
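The per-row indexing shown above can also be written with an explicit column name, so the result is not labelled x. This is only a sketch that reuses r1, r2 and t from the question; the output name n is my own choice.

```q
r1:`n`m`k!111b
r2:`n`m`k!000b
t:([] a:1 2; b:10 20; c:(r1;r2))

/ elided first index: apply the key `n to every element of c
select n:c[;`n] from t

/ the same thing spelled out with each
select n:{x`n} each c from t
```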

How do I cast a float to an integer datatype in KDB?

I have a table in KDB(Q) with a size column that is currently in float format. How do I cast the entire column from float to int, while truncating the decimal place?
You can do this by updating your table
q)tab:([]bid:1000?5f;price:1000?5f;size:1000?100f)
q)exec t from meta tab
"fff"
q)update "i"$size from `tab
`tab
q)exec t from meta tab
"ffi"
In the above, the pertinent point is the application of "i"$, which casts the size column from float to int.
Another way of doing this is using @ (amend) to update the column(s):
q)t:([]sym:500?`3;px:500?10f;size:500?100f)
q)3#t
sym px size
---------------------
gdh 7.678514 95.25017
jlb 2.345028 42.09728
nln 5.553286 98.80532
q)t:@[t;`size;"i"$] / can also use `t to update t
q)3#t
sym px size
-----------------
gdh 7.678514 95
jlb 2.345028 42
nln 5.553286 99
I think it's also worth pointing out that the floor/ceiling functions round numbers down/up respectively and work slightly faster than "i"$ in this case. However, these functions cast the column to a long instead of an int:
q)meta @[t;`size;floor]
c | t f a
----| -----
sym | s
px | f
size| j
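One caveat, since the question asks for truncation: as far as I'm aware, "i"$ rounds a float to the nearest integer rather than dropping the fractional part, so 98.80532 would become 99. A minimal sketch comparing the two on a made-up three-row table (the column names rounded and truncated are illustrative only):

```q
/ hypothetical sample values
tab:([] size:95.25017 42.09728 98.80532)

/ cast alone rounds to nearest; floor first to drop the fractional part, then cast long to int
update rounded:"i"$size, truncated:"i"$floor size from tab
```

For negative values floor rounds towards minus infinity, which is not the same as truncating towards zero, so this only matches truncation for non-negative sizes.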

KDB/Q-sql dynamic grouping and concatenating columns in output

I have a table on which I have to group by a dynamic set of columns and perform an aggregation. The result should have a field built by concatenating the group-by values with the aggregation and the column supplied by the user.
For example :
g1 g2 g3 g4 col1 col2
A D F H 10 20
A E G I 11 21
B D G J 12 22
B E F L 13 23
C D F M 14 24
C D G M 15 25
If I need to group by g1, g2, g4 and take the avg of col1, the output should look like this:
filed val
Avg[A-D-H-col1] 10.0
Avg[A-E-I-col1] 11.0
Avg[B-D-J-col1] 12.0
Avg[B-E-L-col1] 13.0
Avg[C-D-M-col1] 14.5
I am able to do this using q-sql if my group-by columns are fixed:
t:([]g1:`A`A`B`B`C`C;g2:`D`E`D`E`D`D;g3:`F`G`G`F`F`G;g4:`H`I`J`L`M`M;col1:10 11 12 13 14 15;col2:20 21 22 23 24 25)
select filed:first ("Avg[",/:(({"-" sv x} each string (g1,'g2,'g4)),\:"-col1]")),val: avg col1 by g1,g2,g4 from t
I want to use a functional query for the same thing, meaning a function that takes the list of group-by columns, the aggregation to perform, the column name and the table name as input, and produces output like the query above. I can group by dynamic columns easily, but I am not able to concatenate them into the field. The function signature would be something like this:
fun:{[glist;agg;col;t] .. ;... }[`g1`g2`g4;avg;`col1;t]
Please help me make the above query dynamic.
You may try the following function:
specialGroup: {[glist;agg;col;table]
  res: ?[table;();{x!x}glist; enlist[`val]!enlist(agg;col)];
  aggname: string agg;
  aggname: upper[1#aggname], 1_aggname;
  res: ![res;();0b;enlist[`filed]!enlist({(y,"["),/:("-"sv/:string flip x),\:"]"};enlist,glist,enlist[enlist col];aggname)];
  res
  };
specialGroup[`g1`g2`g4;avg;`col1;t]
specialGroup first aggregates the values into the val column and populates the filed column after grouping. This avoids generating duplicate filed values and then selecting the first of them.
If you modify Anton's code to this, it will change the output dynamically:
specialGroup: {[glist;agg;col;table]
  res: ?[table;();{x!x}glist; enlist[`val]!enlist(agg;col)];
  res: ![res;();0b;enlist[`filed]!enlist({(@[string[y];0;upper],"["),/:("-"sv/:string flip x),\:"]"}[;agg];enlist,glist,enlist[enlist col])];
  res
  };
As the part of the code that builds that string is inside another function, you need to pass the agg parameter to the inner function.

q/kdb Selecting a variable in query

q)sym:`a`b`c
q)t:([] s:`g`v; p:2?10.)
Selecting the variable sym works fine in the following query :
q)select sym from t
However, it throws an error when I select it together with a table column, and I am not able to figure out the reason:
q)select sym, p from t
You get a 'length error because the lists sym and p (column from t) are different lengths.
q)sym:`a`b
q)select sym,p from t
sym p
------------
a 3.927524
b 5.170911
What is the output you are trying to get to with this?
Assuming you are trying to select as many elements of sym as the table count :
q)select p,(count i)#sym from t
p sym
------------
1.780839 a
3.017723 b
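Note that take (#) is cyclic: if the table has more rows than sym has elements, the elements of sym repeat from the start rather than raising a 'length error. A small sketch with a made-up five-row table:

```q
sym:`a`b`c
/ hypothetical table with more rows than sym has elements
t:([] s:`g`v`w`x`y; p:5?10.)

/ count i is 5, so 5#sym gives `a`b`c`a`b
select p,(count i)#sym from t
```

Whether repeating values is the desired behaviour depends on what the query is meant to produce, so treat this as an illustration of the operator rather than a recommendation.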