Reliably parsing/applying a function to tables vs dictionaries in kdb

I am trying to do a functional select within a function in q as follows:
dosel:{[tab] ?[tab;enlist(>;`scalar;5);0b;()]};
which works perfectly on tabular input i.e.
q)tab:([]time:2#.z.z;tag:0 0;direction:0 0;scalar:5 10)
q)count tab
2
q).Q.s1 tab
"+`time`tag`direction`scalar!(2020.12.23T12:28:08.254 2020.12.23T12:28:08.254;0 0;0 0;5 10)"
q)dosel tab
time tag direction scalar
--------------------------------------------
2020.12.23T12:49:19.885 0 0 10
And, as expected, doesn't work on dictionary tables:
q)tab:`time`tag`direction`scalar!(.z.z;0;0;4)
q)count tab
4
q)dosel tab
'type
[1] dosel:{[tab] ?[tab;enlist(>;`scalar;5);0b;()]}
^
You could fix this by using enlist i.e.
q))dosel enlist tab
time tag direction scalar
-------------------------
However this has obvious edge cases i.e.
q)tab:`time`tag`direction`scalar!(2#.z.z;2#0;2#0;2#4)
q))type tab
99h
q)count tab
4
q).Q.s1 tab
"`time`tag`direction`scalar!(2020.12.23T12:55:48.835 2020.12.23T12:55:48.835;0 0;0 0;4 4)"
q)dosel tab
'type
[1] dosel:{[tab] ?[tab;enlist(>;`scalar;5);0b;()]}
^
q)dosel enlist tab
'type
[4] dosel:{[tab] ?[tab;enlist(>;`scalar;5);0b;()]}
^
q)dosel flip tab
'type
[7] dosel:{[tab] ?[tab;enlist(>;`scalar;5);0b;()]}
^
One can see in the above example that count is not a good indicator of whether the input is tabular.
How does one reliably convert tabular/dictionary data to the appropriate form so that dosel can be applied?
Apologies if this is a newbie question... thanks again.

Your last example works for me, I'm not sure why you're seeing an error:
q)tab:`time`tag`direction`scalar!(2#.z.z;2#0;2#0;2#4)
q)
q)dosel flip tab
time tag direction scalar
-------------------------
Either way - your function is designed to work on a table so you need to ensure your input is always a table. This could be achieved using:
makeTab:{$[98h=type x;x;@[(flip;enlist)0>type first x;x]]};
q)dosel makeTab ([]time:2#.z.z;tag:0 0;direction:0 0;scalar:5 10)
time tag direction scalar
--------------------------------------------
2020.12.23T13:18:58.909 0 0 10
q)
q)dosel makeTab `time`tag`direction`scalar!(.z.z;0;0;4)
time tag direction scalar
-------------------------
q)
q)dosel makeTab `time`tag`direction`scalar!(2#.z.z;2#0;2#0;2#4)
time tag direction scalar
-------------------------
This makeTab function assumes you'll pass in a table or dictionary but it can be generalised further if needed (e.g. keyed tables)
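As one possible generalisation, the sketch below also accepts keyed tables and rejects anything that is neither a table nor a dictionary. The name makeTab2 and the error message are hypothetical, and it assumes dictionaries have symbol keys with values that are either all atoms or all equal-length lists.

```q
/ makeTab2 is a hypothetical name; a sketch, not a definitive implementation
makeTab2:{
  $[98h=type x; x;                          / already a table
    99h<>type x; '"expected table or dict"; / anything else is rejected
    98h=type key x; 0!x;                    / keyed table: unkey it
    0>type first x; enlist x;               / dictionary of atoms: one-row table
    flip x]}                                / dictionary of lists: flip into a table

dosel:{[tab] ?[tab;enlist(>;`scalar;5);0b;()]}
dosel makeTab2 ([k:`a`b]scalar:4 10)        / keyed table input
dosel makeTab2 `tag`scalar!(0;10)           / dictionary of atoms
dosel makeTab2 `tag`scalar!(0 0;4 10)       / dictionary of lists
```

Checking the key with 98h=type key x is what distinguishes a keyed table from a plain dictionary, since both have type 99h.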

Related

KDB: select and round off each row

I created my own function of round off:
.q.rnd:{$[x < 0; -1; 1] * floor abs[x] + 0.5}
I have a table Test with a string column of COL
select "F"$(COL) from Test
24549.18741328
48939.50717263
-274853.33568872
-24549.18741328
298753.62574861
84822.70074144
-7468840.64371524
117944.21228603
-117944.21228603
7468840.64371524
-7468840.64371524
I want to derive a table that would round-off the records in Test
One would think that the statement below would work. But it does not.
select .q.rnd "F"$(COL) from Test
I get the error "type". So how do I round off the records?
The condition of the if-else conditional must be an atomic boolean. When you run .q.rnd on a column, you are operating on a list, so x<0 returns a list of booleans, not an atom, which is why you get the type error. The vector conditional is ?
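The vector conditional can replace the scalar $[...] in the rounding function so that it works on a whole column at once. A minimal sketch, assuming the hypothetical name rndv (deliberately kept out of the .q namespace) and list input:

```q
/ rndv is a hypothetical name: round half away from zero, applied to a whole list
/ ?[cond;a;b] picks a where cond is true and b elsewhere, item by item
rndv:{?[x<0;-1;1]*floor 0.5+abs x}

rndv 24549.18741328 -274853.33568872 2.5 -2.5    / 24549 -274853 3 -3
```

With this definition, a query of the form select rndv "F"$COL from Test should no longer need each.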
Nonetheless, it looks like you want an integer/long result anyway, so just cast to long here:
q)t:([]string (10?-1 1)*10?10000f)
q)select "F"$x from t
x
-------------------
4123.1701336801052
-9877.8444156050682
-3867.3530425876379
7267.8099689073861
4046.5459413826466
-8355.0649625249207
6427.3701561614871
-5830.2619284950197
1424.9352994374931
-9149.8820902779698
q)select "j"$"F"$x from t
x
-----
4123
-9878
-3867
7268
4047
-8355
6427
-5830
1425
-9150
To add to what Sean's said, if you wanted to use your function as well you could use each which will apply .q.rnd to each item in the list.
q)select .q.rnd each "F"$x from t
x
-----
-3928
5171
5160
-4067
-1781
3018
-7850
5347
-7112
-4116
but using select "j"$"F"$x from t is better as it is vectorised.
q)\t:1000 select "j"$"F"$x from t
22
q)\t:1000 select .q.rnd each "F"$x from t
33
Also, it should be noted that the .q namespace isn't necessary and is "reserved for kx use". A lot of the default q functions live in the .q namespace, and there's always a chance that a future kdb update could add a .q.rnd with different behaviour, which would break any code that uses your function.

How to join strings in a table in kdb?

I would like to join strings in kdb, but it didn't work well. This is the data:
tab:([]service:`CS`CS.US`CS.US_ABC;y:1 2 3)
`CS 1
`CS.US 2
`CS.US_ABC 3
I would like to add :0 and :primary depending on the given parameter. 0 is working now
update service:`$(((string[service],'(":"))),'("C"$string 0)) from tab
If I would like the data to become
`CS:primary 1
`CS.US:primary 2
`CS.US_ABC:primary 3
and the primary is either string or symbol, how could I join?
I am parameterizing the 0 and primary.
Currently, 0 works as follows
update service:`$(((string[service],'(":"))),'( "0")) from tab
but "primary" is not working
update service:`$(((string[service],'(":"))),'( "primary")) from tab
Your query gives you a length error:
q)tab:([]service:`CS`CS.US`CS.US_ABC;y:1 2 3)
q)update service:`$(((string[service],'(":"))),'( "primary")) from tab
'length
[0] update service:`$(((string[service],'(":"))),'( "primary")) from tab
^
This happens because ,' (concatenate each) expects vectors of equal length on both sides, but gets a table column size (3) vector on the left and a character vector of length 7 on the right. Notice what happens when you pass 3 characters:
q)update service:`$(((string[service],'(":"))),'( "pri")) from tab
service y
-------------
CS:p 1
CS.US:r 2
CS.US_ABC:i 3
Each row gets a different suffix. What you want is to use ,\: (concatenate each-left):
q)update service:`$(((string[service],'(":"))),\:( "primary")) from tab
service y
-------------------
CS:primary 1
CS.US:primary 2
CS.US_ABC:primary 3
Why does it work for "0"? It works because "0" is not a vector but a character scalar
q)type "0"
-10h
q)type "primary"
10h
and with a scalar on the right, ,' works the same as ,\::
q)"ab",'"0"
"a0"
"b0"
q)"ab",\:"0"
"a0"
"b0"
Finally, your query will run faster if you first prepend ":" to the suffix and then append the result to each service:
q)update service:`$(string[service],\:":","primary") from tab
service y
-------------------
CS:primary 1
CS.US:primary 2
CS.US_ABC:primary 3
If you want primary to be a parameter rather than a fixed string, the following will work (primary is "no" in this example):
q)update {`$string[y],\:":",x}[primary;]service from tab
service y
--------------
CS:no 1
CS.US:no 2
CS.US_ABC:no 3
If primary is a fixed string then you can place it inside the lambda in lieu of "x" and replace "y" with "x", yielding the following:
q)update {`$string[x],\:":","primary"}service from tab
service y
-------------------
CS:primary 1
CS.US:primary 2
CS.US_ABC:primary 3
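Since the question says the suffix may arrive as either a string or a symbol, the each-left approach above can be wrapped in a small function that normalises the suffix first. This is a sketch only; addSuffix and its parameter names are hypothetical, and it assumes the table has a symbol column called service.

```q
/ addSuffix is a hypothetical helper, not part of q
addSuffix:{[t;sfx]
  s:$[10h=type sfx;sfx;string sfx];               / keep char vectors, stringify anything else
  update service:`$(string[service],\:":",s) from t}

tab:([]service:`CS`CS.US`CS.US_ABC;y:1 2 3)
addSuffix[tab;"primary"]     / suffix given as a string
addSuffix[tab;`primary]      / suffix given as a symbol
addSuffix[tab;0]             / suffix given as a number
```

Normalising to a char vector up front means ,\: always sees a vector on the right, which avoids the scalar versus vector difference between "0" and "primary" described above.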
q)update service:`$(((string[service],'(":"))),'(count[i]#enlist "primary")) from tab
service y
-------------------
CS:primary 1
CS.US:primary 2
CS.US_ABC:primary 3

Understanding how to read each-right and each-left combined in kdb

From Q for Mortals, I'm struggling to understand how to read this, and understand it logically.
1 2 3,/:\:10 20
I understand the result is a cross product when in full form: raze 1 2 3,/:\:10 20.
But reading from left to right, I'm currently lost at understanding what this yields (in my head)
\:10 20
combined with 1 2 3,/: ??
Help in understanding how to read this clearly (in words or clear logic) would be appreciated.
I found myself saying the following in my head whilst I program the syntax in q. q works from right to left.
Internal Monologue -> Join the string on the right onto each of the strings on the left
code -> "ABC",\:"-D"
result -> "A-D"
"B-D"
"C-D"
I think that's an easy way to understand it. 'join' can be replaced with whatever...
Internal Monologue -> Does the string on the right match any of the strings on the left
code -> ("Cat";"Dog";"CAT";"dog")~\:"CAT"
result -> 0010b
Each-right is the same concept and combining them is straightforward also;
Internal Monologue -> Does each of the strings on the right match each of the strings on the left
code -> ("Cat";"Dog";"CAT";"dog")~\:/:("CAT";"Dog")
result -> 0010b
0100b
So in your example 1 2 3,/:\:10 20 - you're saying 'Join each of the elements on the right to each of the elements on the left'
Hope this helps!!
EDIT To add a real world example.... - consider the following table
q)show tab:([] upper syms:10?`2; names:10?("Robert";"John";"Peter";"Jenny"); amount:10?til 10)
syms names amount
--------------------
CF "Peter" 8
BP "Robert" 1
IC "John" 9
IN "John" 5
NM "Peter" 4
OJ "Jenny" 6
BJ "Robert" 6
KH "John" 1
HJ "Peter" 8
LH "John" 5
q)
If you want to get all records where the name is Robert, you can do: select from tab where names like "Robert"
But if you want to get the results where the name is either Robert or John, then it is a perfect scenario to use our each-left and each-right.
Consider the names column - it's a list of strings (a list where each element is a list of chars). What we want to ask is 'does any of the strings in the names column match any of the strings we want to find'... that translates to (namesList)~\:/:(list;of;names;to;find). Here's the steps;
q)(tab`names)~\:/:("Robert";"John")
0100001000b
0011000101b
From that result we want a combined list of booleans where each element is true if it is true for Robert OR John. For example, if you look at index 1 of both lists, it's 1b for Robert and 0b for John, so in our result the value at index 1 should be 1b. Index 2 should be 1b, index 3 should be 1b, index 4 should be 0b, etc. To do this, we can apply the any function (or max or sum!). The result is then;
q)any(tab`names)~\:/:("Robert";"John")
0111001101b
Putting it all together, we get;
q)select from tab where any names~\:/:("Robert";"John")
syms names amount
--------------------
BP "Robert" 1
IC "John" 9
IN "John" 5
BJ "Robert" 6
KH "John" 1
LH "John" 5
q)
Firstly, q is executed (and hence generally read) right to left. This means that it's interpreting the \: as a modifier to be applied to the previous function, which itself is a simple join modified by the /: adverb. So the way to read this is "Apply join each-right to each of the left-hand arguments."
In this case, you're applying the two adverbs to the join - \:10 20 on its own has no real meaning here.
I find it helpful to also look at the converse case 1 2 3,\:/:10 20. Running that code produces two items, each a list of three pairs (displayed as two rows of six numbers), which I'd describe more like "apply join each-left to each of the right hand arguments" ... I hope that makes sense.
An alternative syntax which also might help is ,/:\:[1 2 3;10 20] - this might be useful as it makes it very clear what the function you're applying is, and is equivalent to your in-place notation.
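The reading described above can be checked by spelling out the outer adverb with an explicit lambda. A small sketch; the variable names a, b and c are only for illustration:

```q
a:1 2 3,/:\:10 20             / each-left applied to (join each-right)
b:{x,/:10 20} each 1 2 3      / spelled out: for each left item, join it to each right item
a~b                           / 1b
count a                       / 3, one item per left-hand element

c:1 2 3,\:/:10 20             / the converse: each-right applied to (join each-left)
count c                       / 2, one item per right-hand element

raze[a]~1 2 3 cross 10 20     / flattening a should give the cross product
```

The outermost adverb (the one written last) decides which argument is iterated first, which is why a has three items and c has two.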

Can someone explain the TI BASIC ΔList command?

I understand that the command compares and can subtract values, but I don't see exactly how that works. I've used a TI BASIC programming tutorial site (http://tibasicdev.wikidot.com/movement-explanation) and I need clarification on ΔList as a whole.
The portion of the code with ΔList is as follows:
:min(8,max(1,A+sum(ΔList(Ans={25,34→A
:min(16,max(1,B+sum(ΔList(K={24,26→B
and the website explains the code like this:
"This is how this code works. When you press a key, its value is stored to K. We check to see if K equals one of the keys we pressed by comparing it to the lists {24,26 and {25,34. This results in a list {0,1}, {1,0}, or {0,0}. We then take the fancy command Δlist( to see whether to move up, down, left or right. What Δlist( does is quite simple. Δlist( subtracts the first element from the second in the previous list, and stores that as a new one element list, {1}, {-1}, or {0}. We then turn the list into a real number by taking the sum of the one byte list. This 1, -1, or 0 is added to A."
The ΔList( command subtracts each element of a list from the element that follows it. This code uses some trickery with it to compactly return 1 if one key is pressed and -1 if the other is pressed.
ΔList( calculates the differences between consecutive terms of a list, and returns them in a new list.
ΔList({0,1,4,9,16,25,36})
{1 3 5 7 9 11}
That is, ΔList({0,1,4,9,16,25,36}) = {1-0, 4-1, 9-4, 16-9, 25-16, 36-25} = {1 3 5 7 9 11}.
When there are only two elements in a list, ΔList({a,b}) is therefore equal to {b-a}. Then sum(ΔList({a,b})) is equal to b-a, since that's the only term in the list. Let's say that K is 26 in your example; that is, the > key is pressed.
B+sum(ΔList(K={24,26→B          Result of expression:
K                               26
K={24,26                        {0,1}
ΔList(K={24,26                  {1} = {1 - 0}
sum(ΔList(K={24,26              1
B                               [current x-position of player]
B+sum(ΔList(K={24,26→B          [add 1 to current x-pos. of player]
Similarly, B will be decreased if key 24, the left key, is pressed.

Reshape (#) doesn't work with a dynamic argument

To form a matrix consisting of identical rows, one could use
x:1 2 3
2 3#x,x
which produces (1 2 3i;1 2 3i) as expected. However, attempting to generalise this thus:
2 (count x)#x,x
produces a type error although the types are equal:
(type 3) ~ type count x
returns 1b. Why doesn't this work?
The following should work.
q)(2;count x)#x,x
1 2 3
1 2 3
If you look at the parse trees of both your statements you can see that the second is evaluated differently. In the second, only the result of count x is passed as the left argument to #; the 2 is then applied to that result, which is what raises the type error.
q)parse"2 3#x,x"
#
2 3
(,;`x;`x)
q)parse"2 (count x)#x,x"
2
(#;(#:;`x);(,;`x;`x))
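To avoid the juxtaposition problem shown in the parse tree, the shape can be built as a single list before it is handed to #. A brief sketch; dims, m1 and m2 are illustrative names only:

```q
x:1 2 3
dims:2,count x            / join builds the shape vector 2 3
m1:dims#x,x               / reshape with a precomputed shape
m2:(2;count x)#x,x        / same shape written as a parenthesised list
m1~m2                     / 1b
```

Either spelling makes the whole shape the left argument of #, rather than just count x.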
If you're looking to build matrices with identical rows you might be better off using
rownum#enlist x
q)x:100000?100
q)\ts do[100;v1:5 100000#x,x]
157 5767696j
q)\ts do[100;v2:5#enlist x]
0 992j
q)v1~v2
1b
I for one find this more natural (and it's faster!)